sidnakoppa commented on issue #652: Reading Merge_on_read table| Failing SchemaParseException: Empty name
URL: https://github.com/apache/incubator-hudi/issues/652#issuecomment-487016906

Thank you, I was able to create the tables using the sync tool, and I have posted the follow-up query on the mailing list.

As suggested, I ran the set again. Below are the results.

> ds.withColumn("emp_name", lit("upd1 Emily")).withColumn("ts", current_timestamp).write.format("com.uber.hoodie")
> .option(HoodieWriteConfig.TABLE_NAME, "emp_mor_26")
> .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "emp_id")
> .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY, "MERGE_ON_READ")
> .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "part_by")
> .option("hoodie.upsert.shuffle.parallelism", 4)
> .mode(SaveMode.Append)
> .save("/apps/hive/warehouse/emp_mor_26")

After multiple updates to the same record, the generated log.1 file contains multiple instances of that record. At this point the latest update is not fetched; the (n-1)th version of the record is fetched instead.

```
14:45 /apps/hive/warehouse/emp_mor_26/2019/09/22/.278a46f9--87a_20190426144153.log.1 - has record that was updated in run 1
15:00 /apps/hive/warehouse/emp_mor_26/2019/09/22/.278a46f9--87a_20190426144540.log.1 - has record that was updated in run 2 and run 3
14:41 /apps/hive/warehouse/emp_mor_26/2019/09/22/.hoodie_partition_metadata
14:41 /apps/hive/warehouse/emp_mor_26/2019/09/22/278a46f9--87a_0_20190426144153.parquet
```

PS: I tried the same set while running the sync tool after each run. Same result.
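
To make the expected result concrete, here is a minimal sketch in plain Scala (no Hudi or Spark dependency) of how a reader could resolve several versions of one record key when merging a base file with its log records. It assumes that duplicates are resolved by keeping the version with the greatest ordering value, taken here to be the `ts` column; that rule is an assumption for illustration and not a statement of how Hudi implements the merge. `EmpRecord`, `mergeLatest`, the numeric timestamps, and the `upd2`/`upd3` values are hypothetical.

```scala
// Hypothetical record shape; field names mirror the columns used in the write above.
final case class EmpRecord(empId: String, empName: String, ts: Long)

object LatestRecordSketch {
  // Merge base-file records with log records. For each key, keep the version
  // with the largest ordering value (ts). On a tie the later-seen record wins,
  // so log records override base records with an equal ts.
  def mergeLatest(base: Seq[EmpRecord], logs: Seq[EmpRecord]): Map[String, EmpRecord] =
    (base ++ logs).foldLeft(Map.empty[String, EmpRecord]) { (acc, rec) =>
      acc.get(rec.empId) match {
        case Some(prev) if prev.ts > rec.ts => acc // existing version is newer, keep it
        case _                              => acc.updated(rec.empId, rec)
      }
    }

  def main(args: Array[String]): Unit = {
    // Illustrative data only: one base record followed by three updates (runs 1 to 3).
    val base = Seq(EmpRecord("1", "Emily", 100L))
    val logs = Seq(
      EmpRecord("1", "upd1 Emily", 200L),
      EmpRecord("1", "upd2 Emily", 300L),
      EmpRecord("1", "upd3 Emily", 400L)
    )
    val merged = mergeLatest(base, logs)
    println(merged("1").empName) // prints "upd3 Emily"
  }
}
```

Under that assumption, a read after run 3 should return the run 3 value. The behaviour reported above corresponds to the read returning the run 2 value instead.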
