ssdong commented on issue #2818: URL: https://github.com/apache/hudi/issues/2818#issuecomment-821810190
@garyli1019 Thank you for getting back to me. I've created a [JIRA](https://issues.apache.org/jira/browse/HUDI-1807) for the `NoSuchElementException` issue and will work on it.

As for the incremental pulling concern, the documentation says:

```
Property: hoodie.datasource.read.begin.instanttime, [Required in incremental mode] Instant time to start incrementally pulling data from. The instanttime here need not necessarily correspond to an instant on the timeline. New data written with an instant_time > BEGIN_INSTANTTIME are fetched out. For e.g: ‘20170901080000’ will get all new data written after Sep 1, 2017 08:00AM.
```

It clearly states that `The instanttime here need not necessarily correspond to an instant on the timeline`. That contradicts the behaviour I observed in my experiment: I fed it an instant time in the past, but it fetched only part of the updates; the updates to `199` were _missing_.

Something isn't _correct_. Either our actual implementation deviates from the way we present it in the documentation, or our existing archiving and timeline mechanism affects the integrity of incremental queries to some extent. I need a thorough understanding before making a judgement call on whether we should fetch updates that fall within the archived timeline, since doing so clearly introduces overhead and may affect other things.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
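To make the suspected discrepancy concrete, here is a minimal toy model in plain Python. It is not Hudi code: the function `incremental_pull`, the `visible_instants` parameter, the instant times, and the keys other than `199` are all hypothetical, and whether archiving is really the cause is exactly the open question in the comment above. The sketch only contrasts the documented semantics (return everything with `instant_time > BEGIN_INSTANTTIME`) with a reader that, by assumption, consults only part of the timeline.

```python
# Toy model only: plain Python, not Hudi code. All names and data are hypothetical.

def incremental_pull(commits, begin_instant, visible_instants=None):
    """Return the latest value per key among commits with instant > begin_instant.

    commits: list of (instant_time, key, value) tuples. Instant times are
        'yyyyMMddHHmmss' strings, so plain string comparison orders them.
    visible_instants: optional set of instants the reader can see. It stands in
        for a reader that only consults part of the timeline (an assumption).
    """
    result = {}
    for instant, key, value in sorted(commits):
        if instant <= begin_instant:
            continue  # at or before the begin instant: not part of the pull
        if visible_instants is not None and instant not in visible_instants:
            continue  # assumed failure mode: this instant is not consulted
        result[key] = value
    return result


# Made-up commits; only the key "199" comes from the experiment described above.
commits = [
    ("20210401080000", "199", "v1"),
    ("20210402080000", "200", "v1"),
    ("20210403080000", "201", "v1"),
]

# A begin instant that does not correspond to any instant in `commits`.
begin = "20210331000000"

# Documented semantics: every commit after `begin` is returned.
full = incremental_pull(commits, begin)

# Assumed failure mode: the oldest instant was archived and is skipped.
active_only = {"20210402080000", "20210403080000"}
partial = incremental_pull(commits, begin, visible_instants=active_only)

print(full)
print(partial)
```

Under these assumptions `full` contains the key `199` while `partial` does not, which mirrors the partial result seen in the experiment without claiming to explain it.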
