HeartSaVioR edited a comment on pull request #35521:
URL: https://github.com/apache/spark/pull/35521#issuecomment-1051552062


   >> Suppose we have an event E2 timestamped as 12:00 as input and there was 
an event E1 timestamped as 11:50.
   
   > For TTL, it is also possible that E1 is evicted before E2 is 
processed. It depends on the TTL value.
   
   No one would set a TTL shorter than 10 minutes, because it would not 
deduplicate events effectively. In practice it will be several hours, or even 
days ([the example in Flink's blog post uses 7 
days](https://flink.apache.org/2019/05/19/state-ttl.html)), and a long TTL does 
little harm, since the state grows with the cardinality of the deduplication 
grouping keys.
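
To make the E1/E2 case concrete, here is a toy simulation of TTL-based deduplication. It is only a sketch under assumptions: `TtlDeduplicator`, `is_duplicate`, and `run` are hypothetical names, not Spark's API or the code in this PR; the calendar date is arbitrary; and it assumes the TTL is measured in event time from the first occurrence of a key and is not refreshed by later duplicates.

```python
from datetime import datetime, timedelta


class TtlDeduplicator:
    """Toy in-memory deduplicator (hypothetical, not Spark's API)."""

    def __init__(self, ttl: timedelta):
        self.ttl = ttl
        self.expires_at = {}  # grouping key -> time at which its entry expires

    def is_duplicate(self, key, event_time: datetime) -> bool:
        # Evict entries whose TTL has elapsed as of this event's timestamp.
        expired = [k for k, t in self.expires_at.items() if t <= event_time]
        for k in expired:
            del self.expires_at[k]
        if key in self.expires_at:
            return True
        # First occurrence (or first since eviction): remember it for `ttl`.
        self.expires_at[key] = event_time + self.ttl
        return False


def run(ttl: timedelta):
    """Feed E1 (11:50) and E2 (12:00) with the same key through the deduplicator."""
    dedup = TtlDeduplicator(ttl)
    e1 = datetime(2022, 1, 1, 11, 50)  # arbitrary date, only the times matter
    e2 = datetime(2022, 1, 1, 12, 0)
    first = dedup.is_duplicate("k", e1)
    second = dedup.is_duplicate("k", e2)
    return first, second, len(dedup.expires_at)


print(run(timedelta(minutes=5)))  # TTL shorter than the 10-minute gap
print(run(timedelta(days=7)))     # TTL of several days
```

With the 5-minute TTL, E1's entry is evicted before E2 arrives, so E2 is not recognized as a duplicate. With the 7-day TTL, E2 is dropped as a duplicate. In both runs the state holds a single entry, because its size follows the number of distinct keys rather than the number of events.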
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


