HeartSaVioR commented on issue #23747: [SPARK-26848][SQL][SS] Introduce new 
option to Kafka source: offset by timestamp (starting/ending)
URL: https://github.com/apache/spark/pull/23747#issuecomment-531933055
 
 
   > I don't feel that qualified to review this, but see others have generally 
approved.
   
   I see. No problem, and thanks for reviewing even though you're not familiar 
with the patch. I can wait for other reviewers who can decide to merge.
   
   > Is there any impact to users who do not specify these new properties? does 
it overlap with or duplicate any existing "offset" functionality? Those would 
be my key review questions.
   
   No. It provides another way to set "offset", by timestamp. 
   
   For now, end users need to set either an exact offset number or 
latest/earliest. When they want to run the query starting from a specific point 
in time, they need to know the exact offset of the record inserted at that time. 
While end users may be able to retrieve it with a CLI tool (not 100% sure, but 
given they expose an API for it...), it's not convenient to look up the offset 
in Kafka for that point in time and then set it in the Spark option. There's 
another benefit to this change: once they specify the offset in the Spark 
option, the offset number itself doesn't show the intention to run from a 
specific point in time, unless they also leave a comment describing where the 
offset came from. After the patch, that intention can be expressed very clearly.
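
   To make the difference concrete, here is a minimal sketch in Python that only 
builds the option maps and does not start Spark. The option name 
`startingOffsetsByTimestamp` and its JSON shape (topic -> partition -> epoch 
milliseconds) are assumed from how the option is documented in Spark 3.0, so the 
name at this stage of the PR may differ. The topic `events`, the offsets, and the 
helper `offsets_by_timestamp` are made up for illustration.

```python
import json
from datetime import datetime, timezone


def offsets_by_timestamp(topic, partitions, when):
    """Hypothetical helper: map every partition of `topic` to `when` in epoch ms."""
    ts_ms = int(when.timestamp() * 1000)
    return json.dumps({topic: {str(p): ts_ms for p in partitions}})


# Today: exact per-partition offsets. The numbers are illustrative and would
# have to be looked up in Kafka beforehand; they say nothing about intent.
by_offset = {
    "subscribe": "events",
    "startingOffsets": json.dumps({"events": {"0": 4521, "1": 4498}}),
}

# With the patch (assumed option name): the point in time is stated directly.
start = datetime(2019, 9, 16, tzinfo=timezone.utc)
by_time = {
    "subscribe": "events",
    "startingOffsetsByTimestamp": offsets_by_timestamp("events", [0, 1], start),
}

# Either map could then be passed to the Kafka source, for example:
#   spark.readStream.format("kafka") \
#       .option("kafka.bootstrap.servers", "host:9092") \
#       .options(**by_time).load()
```

   Users who set neither timestamp option keep the existing behavior, since 
`startingOffsets` is untouched.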
   
   > Regarding Kafka 0.10 support, yes I think it could be reasonable to drop 
support for < 1.0. ... Would there be any significant upside for Spark, like 
simplifying code or assumptions, making it easier to support, taking advantage 
of newer features?
   
   Maybe we don't need to give guidance on versions for either this (>= 0.10) or 
Kafka header support (>= 0.11). We already use a pretty recent version of the 
Kafka client, so dropping support for old versions would bring no significant 
change (no benefit on the code side).
