[GitHub] [incubator-hudi] n3nash commented on issue #652: Reading Merge_on_read table| Failing SchemaParseException: Empty name

GitBox Thu, 25 Apr 2019 17:35:10 -0700

n3nash commented on issue #652: Reading Merge_on_read table| Failing 
SchemaParseException: Empty name
URL: https://github.com/apache/incubator-hudi/issues/652#issuecomment-486884444
 
 
   @sidnakoppa Something may be wrong with your setup if the hive sync tool 
is not creating the realtime table automatically. If you take a look at the 
quickstart here: 
https://hudi.apache.org/docker_demo.html#step-3-sync-with-hive, it shows how 
both tables should be created. 
   
   Can you create a new table with the MERGE_ON_READ storage type and try 
syncing it again, just to confirm that it still doesn't work? 
   
   On a side note, since you've made the realtime table work anyway, let me 
explain why you might not be seeing the latest records.
   The compaction process only converts the delta files into a columnar 
file format such as parquet. 
   You should be able to query the latest records from the "_rt" table, e.g. 
"select * from your_table_rt", even without running compaction. My suspicion is 
that you are creating new partitions every time you insert new records. You can 
confirm this by running `ls` on the base path. If you see a new folder for 
every new record, this is the case. If new partitions are being created, you 
will need to run the hive sync tool every time to register the new partitions 
with the hive metastore. So try doing the following: 
   
   // insert records
   // call hive sync tool
   // insert records
   // call hive sync tool
   
   Ideally, you can just write a wrapper job that invokes insert and hive sync 
tool one after another to avoid doing this manually. Hope this helps.
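
   As a rough illustration of such a wrapper, here is a minimal Python sketch. 
It assumes both steps can be launched as command-line processes; the function 
name `insert_then_sync` and the two placeholder commands are hypothetical and 
not part of Hudi, so substitute your actual insert job and hive sync tool 
invocations.

```python
import subprocess
import sys
from typing import Sequence


def insert_then_sync(insert_cmd: Sequence[str], sync_cmd: Sequence[str]) -> None:
    """Run the insert job, then the hive sync tool, stopping on the first failure."""
    for name, cmd in (("insert", insert_cmd), ("hive sync", sync_cmd)):
        print(f"running {name} step: {' '.join(cmd)}")
        # check=True raises CalledProcessError on a non-zero exit code,
        # so the sync step never runs after a failed insert.
        subprocess.run(list(cmd), check=True)


if __name__ == "__main__":
    # Placeholder commands (assumptions): replace them with the real
    # insert job and hive sync tool command lines for your setup.
    insert_cmd = [sys.executable, "-c", "print('insert records')"]
    sync_cmd = [sys.executable, "-c", "print('call hive sync tool')"]
    insert_then_sync(insert_cmd, sync_cmd)
```

   Running the sync right after every insert means any newly created 
partitions are registered before the next query against the "_rt" table.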
   
   Also, let's try to resolve this ticket here, but going forward, please ask 
questions on the mailing list for faster responses.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
