tmac2100 commented on issue #2806: URL: https://github.com/apache/hudi/issues/2806#issuecomment-818477456
> @tmac2100 The commands you have provided to reproduce this issue seem to be related to the internal working of your company. I'm unable to reproduce this issue with the given command.
>
> Can you provide some additional information so I can help debug this issue?
>
> 1. How many records are you ingesting in every batch?
> 2. How many of these are inserts vs updates?
> 3. From what it looks like, you have chosen a partition path but no partitioning strategy, is this a partitioned table or a non-partitioned one?
> 4. Can you describe the amount of time by which each subsequent batch is increasing?

How many records are you ingesting in every batch?
About 300M.

How many of these are inserts vs updates?
80% inserts and 20% updates.

From what it looks like, you have chosen a partition path but no partitioning strategy, is this a partitioned table or a non-partitioned one?
It is a partitioned table, partitioned by the create date column with date format yyyymmdd, e.g. 20210413.

Can you describe the amount of time by which each subsequent batch is increasing?
The first batch took 7 minutes to complete. 5 hours later, the batch ran for 16 hours and was still incomplete.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
