yihua commented on code in PR #11947:
URL: https://github.com/apache/hudi/pull/11947#discussion_r1802563846
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/streaming/HoodieStreamSource.scala:
##########
@@ -71,19 +69,6 @@ class HoodieStreamSource(
parameters.get(DataSourceReadOptions.QUERY_TYPE.key).contains(DataSourceReadOptions.QUERY_TYPE_INCREMENTAL_OPT_VAL)
&&
parameters.get(DataSourceReadOptions.INCREMENTAL_FORMAT.key).contains(DataSourceReadOptions.INCREMENTAL_FORMAT_CDC_VAL)
- /**
-  * When hollow commits are found while doing streaming read, unlike batch incremental query,
-  * we do not use [[HollowCommitHandling.FAIL]] by default, instead we use [[HollowCommitHandling.BLOCK]]
-  * to block processing data from going beyond the hollow commits to avoid unintentional skip.
-  *
-  * Users can set [[DataSourceReadOptions.INCREMENTAL_READ_HANDLE_HOLLOW_COMMIT]] to
-  * [[HollowCommitHandling.USE_TRANSITION_TIME]] to avoid the blocking behavior.
-  */
- private val hollowCommitHandling: HollowCommitHandling =
Review Comment:
We discussed this issue and agreed that it's simpler to deprecate the hollow
commit handling in Hudi 1.0, since the completion time-based incremental query
can properly support serializability; users should upgrade the table to Hudi
1.0 for the correct behavior.
Regarding this PR, there are many places that need clean-up of the hollow
commit handling; I'll defer that to a separate PR.
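For context, a minimal sketch of the behavior documented in the removed comment: how a user would override the streaming-read default of blocking on hollow commits. This is illustrative only and not part of this PR; the table path, app name, and passing the enum name as the option value are assumptions, while the option constant comes from the removed doc comment.

```scala
// Sketch only: configure hollow-commit handling for a Hudi streaming read.
import org.apache.hudi.DataSourceReadOptions
import org.apache.spark.sql.SparkSession

object HollowCommitHandlingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hudi-streaming-read-sketch") // hypothetical app name
      .getOrCreate()

    val df = spark.readStream
      .format("hudi")
      // Per the removed comment, streaming reads default to BLOCK on hollow
      // commits; USE_TRANSITION_TIME avoids the blocking behavior.
      // Assumes the option accepts the enum name as a string value.
      .option(DataSourceReadOptions.INCREMENTAL_READ_HANDLE_HOLLOW_COMMIT.key(),
        "USE_TRANSITION_TIME")
      .load("/tmp/hudi/example_table") // hypothetical table path

    df.writeStream
      .format("console")
      .start()
      .awaitTermination()
  }
}
```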