HariprasadAllaka1612 commented on issue #1556: URL: https://github.com/apache/incubator-hudi/issues/1556#issuecomment-618825491
@vinothchandar No, let me be more clear. Below is the complete process I am following:

1. Read the CDC table from Hive (a hoodie table) to get the latest marker.
2. Read the files from S3 based on the latest marker read in step 1.
3. Process the files, which results in 2 data frames.
4. Write both data frames to S3 in hoodie format and sync them to Hive.
5. Update the marker with the latest end time.

The problem is that writing the data set for the first time works, but when I try to UPSERT the data in the 2nd run, it gives this error

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
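For readers following the thread, the marker-driven flow described in the comment can be sketched roughly as below. This is an illustrative sketch, not the commenter's actual job: `S3File`, `files_after_marker`, `next_marker`, and `hudi_upsert_options` are hypothetical names, the integer timestamps stand in for whatever marker the CDC table stores, and the table and field names in the usage lines are placeholders. The option keys themselves are standard Hudi Spark datasource settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3File:
    """Hypothetical stand-in for an S3 object listing entry."""
    key: str
    modified: int  # assumed marker unit: epoch seconds


def files_after_marker(files, marker):
    """Step 2: keep only files strictly newer than the marker, oldest first."""
    return sorted((f for f in files if f.modified > marker),
                  key=lambda f: f.modified)


def next_marker(batch, marker):
    """Step 5: advance the marker to the newest file in the batch.

    An empty batch leaves the marker unchanged.
    """
    return max((f.modified for f in batch), default=marker)


def hudi_upsert_options(table, record_key, precombine, partition):
    """Step 4: build Hudi datasource options for an upsert with Hive sync."""
    return {
        "hoodie.table.name": table,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine,
        "hoodie.datasource.write.partitionpath.field": partition,
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.table": table,
    }


# Usage with placeholder values (assumptions, not taken from the issue).
listing = [S3File("a.json", 10), S3File("b.json", 30), S3File("c.json", 20)]
batch = files_after_marker(listing, marker=10)
opts = hudi_upsert_options("cdc_table", "id", "updated_at", "dt")
new_marker = next_marker(batch, 10)

# With Spark available, each data frame from step 3 would typically be
# written along these lines (base_path is a placeholder):
#   df.write.format("org.apache.hudi").options(**opts).mode("append").save(base_path)
```

The marker is only advanced after the batch is selected, so a run that finds no new files leaves it where it was.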
