n3nash commented on issue #652: Reading Merge_on_read table| Failing SchemaParseException: Empty name
URL: https://github.com/apache/incubator-hudi/issues/652#issuecomment-486884444

@sidnakoppa Something may be wrong with the setup if the hive sync tool did not create the realtime table automatically. If you take a look at the quickstart here: https://hudi.apache.org/docker_demo.html#step-3-sync-with-hive, it shows how both tables should be created. Can you create a new table with the MERGE_ON_READ storage type and try syncing it again, just to confirm that it doesn't work?

On a side note, since you've made the realtime table work anyway, let me explain why you might not be seeing the latest records. The compaction process is only used to convert the delta files into a columnar file format such as parquet. You should be able to query the latest records from the "_rt" table, like "select * from your_table_rt", even without running compaction.

My suspicion is that you are creating new partitions every time you insert new records. You can confirm this by performing a `ls` on the base path. If you see a new folder for every new record, this is the case. If new partitions are being created, you will need to run the hive sync tool every time to register them with the hive metastore. So try doing the following:

// insert records
// call hive sync tool
// insert records
// call hive sync tool

Ideally, you can just write a wrapper job that invokes the insert and the hive sync tool one after another, to avoid doing this manually.

Hope this helps. Also, let's try to resolve this ticket here, but going forward, please ask questions on the mailing list for faster responses.
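
A minimal sketch of the wrapper idea described above, in Python. It is not part of Hudi: the function names are made up for illustration, the insert and sync commands are whatever you already run by hand (passed in as argument lists), and the partition check assumes the base path is on a local or mounted filesystem where each partition is a top-level folder. For a base path on HDFS the directory listing would have to be replaced.

```python
import subprocess
from pathlib import Path


def list_partitions(base_path):
    """Return the sorted names of top-level folders under base_path.

    Hidden folders (names starting with '.') are skipped so that
    metadata directories are not mistaken for partitions.
    """
    return sorted(
        p.name
        for p in Path(base_path).iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def new_partitions(before, after):
    """Return the folder names present in `after` but not in `before`."""
    return sorted(set(after) - set(before))


def insert_then_sync(insert_cmd, sync_cmd, base_path, runner=subprocess.run):
    """Run the insert command, then the hive sync command, in that order.

    insert_cmd and sync_cmd are argument lists for commands you supply
    (hypothetical placeholders here). check=True makes a failing insert
    raise before the sync step is attempted. Returns the partition
    folders that appeared during the insert.
    """
    before = list_partitions(base_path)
    runner(insert_cmd, check=True)
    created = new_partitions(before, list_partitions(base_path))
    # Always sync after an insert, as suggested in the comment above.
    runner(sync_cmd, check=True)
    return created
```

A call could look like `insert_then_sync(["./my_insert_job.sh"], ["./my_hive_sync.sh"], "/path/to/base")`, where both script names and the path are placeholders for your own setup. A non-empty return value means the insert created new partition folders, which is the situation where the "_rt" table would miss records until the sync runs.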
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
