cb149 edited a comment on issue #3161:
URL: https://github.com/apache/hudi/issues/3161#issuecomment-920408212


   @codope `startingOffsets` is set to `earliest`.
   
   I changed my application's schedule from every hour to every two hours to 
allow more new messages to accumulate in the topic, and I haven't really seen 
the issue more than a couple of times since then.
   
   I added a failsafe comparing `numInputRows` with `endOffset - startOffset`, 
but in the last month or so the only time I was alerted to a mismatch between 
the two was when there were 0 new messages in the input topic and one of the 
offsets had increased by 1 for some reason.
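
   A check of this kind can be sketched in Python as follows. This is only a 
minimal sketch under assumptions, not the code running in the application: it 
assumes each source entry is a plain dict shaped like the Kafka source entries 
in Spark's streaming progress report (`startOffset` and `endOffset` as 
`{topic: {partition: offset}}`, plus `numInputRows`). The function names and 
the sample topic `events` are hypothetical.

```python
from typing import Optional


def offset_span(start_offset: Optional[dict], end_offset: dict) -> Optional[int]:
    """Sum (end - start) over all topic partitions.

    Returns None when a start offset is unknown, for example on the first
    batch or for a partition that has no recorded start offset.
    """
    if start_offset is None:
        return None
    total = 0
    for topic, end_by_partition in end_offset.items():
        start_by_partition = start_offset.get(topic, {})
        for partition, end in end_by_partition.items():
            if partition not in start_by_partition:
                return None
            total += end - start_by_partition[partition]
    return total


def row_count_mismatch(source: dict) -> Optional[int]:
    """Return numInputRows minus the offset span, or None if not computable."""
    span = offset_span(source.get("startOffset"), source["endOffset"])
    if span is None:
        return None
    return source["numInputRows"] - span


# Hypothetical progress entry: no rows read, but partition "1" advanced by 1.
sample_source = {
    "startOffset": {"events": {"0": 100, "1": 50}},
    "endOffset": {"events": {"0": 100, "1": 51}},
    "numInputRows": 0,
}

mismatch = row_count_mismatch(sample_source)
if mismatch:
    print(f"row/offset mismatch: {mismatch}")
```

   The sample entry mirrors the case described above: zero input rows while 
one offset moved forward by 1, which the check reports as a nonzero mismatch. 
Returning `None` when a start offset is missing avoids raising a false alarm 
on batches where the span cannot be computed.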
   
   I get the feeling that the issue is unrelated to Hudi and that the cause 
probably lies somewhere in Kafka or Spark, possibly in the relatively 
unbalanced Kafka partitions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

