nsivabalan commented on code in PR #7627:
URL: https://github.com/apache/hudi/pull/7627#discussion_r1182021390


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/streaming/HoodieStreamSource.scala:
##########
@@ -163,10 +178,7 @@ class HoodieStreamSource(
     startOffset match {
       case INIT_OFFSET => startOffset.commitTime
       case HoodieSourceOffset(commitTime) =>
-        val time = HoodieActiveTimeline.parseDateFromInstantTime(commitTime).getTime

Review Comment:
   Why was this removed? Was there a bug?
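
   For context, a minimal standalone sketch of what the removed line computed. The compact instant-time format `"yyyyMMddHHmmss"` is an assumption here (newer Hudi releases may append milliseconds); in Hudi itself this parsing lives in `HoodieActiveTimeline.parseDateFromInstantTime`:

   ```scala
   import java.text.SimpleDateFormat

   object InstantTimeDemo {
     // Hudi instant times are compact timestamps; "yyyyMMddHHmmss" is assumed
     // here. The removed line converted such an instant into epoch
     // milliseconds via java.util.Date#getTime.
     def parseInstantMillis(commitTime: String): Long =
       new SimpleDateFormat("yyyyMMddHHmmss").parse(commitTime).getTime

     def main(args: Array[String]): Unit = {
       val time = parseInstantMillis("20230101120000") // hypothetical instant
       assert(time > 0L) // epoch millis of the parsed instant
     }
   }
   ```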



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelation.scala:
##########
@@ -205,6 +218,9 @@ class IncrementalRelation(val sqlContext: SQLContext,
      val endInstantArchived = commitTimeline.isBeforeTimelineStarts(endInstantTime)
 
      val scanDf = if (fallbackToFullTableScan && (startInstantArchived || endInstantArchived)) {
+        if (useStateTransitionTime) {
+          throw new HoodieException("Cannot use stateTransitionTime while enables full table scan")

Review Comment:
   I am not sure we can just throw here and move on.
   I see this as similar to how incremental queries work today.
   What we are handling here is:
   if a downstream consumer is lagging or down for some time and then resumes consuming, and the cleaner has already cleaned up the instants it needs to consume, we fall back to a full table scan here.

   When migrating to stateTransitionTime, we should find a way to support the same behavior.
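
   To make the lagging-consumer scenario concrete, here is a standalone Scala model of the decision the diff encodes. The `ScanMode` ADT, `chooseScanMode`, and the `Either` result are hypothetical illustration, not Hudi code: when the requested instants are already archived the read should fall back to a full scan, but the PR instead throws when stateTransitionTime is in use.

   ```scala
   // Standalone model of the fallback decision in the diff above; all names
   // here are hypothetical, not Hudi APIs.
   sealed trait ScanMode
   case object IncrementalScan extends ScanMode
   case object FullTableScanFallback extends ScanMode

   object FallbackDecision {
     // The PR throws when stateTransitionTime meets the archived-instant
     // fallback; this review argues the fallback should be supported instead.
     def chooseScanMode(fallbackToFullTableScan: Boolean,
                        startInstantArchived: Boolean,
                        endInstantArchived: Boolean,
                        useStateTransitionTime: Boolean): Either[String, ScanMode] = {
       val instantsMissing = startInstantArchived || endInstantArchived
       if (fallbackToFullTableScan && instantsMissing) {
         if (useStateTransitionTime)
           // Current PR behavior: fail instead of falling back.
           Left("Cannot use stateTransitionTime while enables full table scan")
         else
           Right(FullTableScanFallback)
       } else Right(IncrementalScan)
     }
   }
   ```

   A resumed consumer whose start instant was cleaned hits the first branch; supporting stateTransitionTime means giving that branch a real answer rather than `Left`.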



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
