HeartSaVioR commented on PR #53911:
URL: https://github.com/apache/spark/pull/53911#issuecomment-3797401679

   I'll clarify the intention.
   
   I argue that event time should have been treated as a first-class concept when the state store was first designed. We didn't do that, so operations involving event time are never performant, even though event time is exactly how an operator produces output in append mode and evicts state. I'd rather say I'm trying to fix that.
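
   To make the performance point concrete, here is a minimal sketch of the two layouts using plain Scala collections. It is not the actual StateStore API; `StateRow` and the two evict methods are invented for illustration only. With state ordered by event time, watermark eviction is a prefix range scan instead of a full scan:

```scala
import scala.collection.mutable

// Hypothetical sketch only: plain Scala collections, not Spark's StateStore API.
// StateRow, evictByFullScan and evictByRangeScan are made-up names.
object EvictionSketch {
  case class StateRow(key: String, eventTimeMs: Long, value: String)

  // Layout A: state keyed only by the grouping key. Evicting rows below the
  // watermark has to touch every entry.
  def evictByFullScan(store: mutable.Map[String, StateRow], watermarkMs: Long): Unit =
    store.filterInPlace { case (_, row) => row.eventTimeMs >= watermarkMs }

  // Layout B: state (or a secondary index) ordered by event time first.
  // Eviction becomes a range scan over just the expired prefix.
  def evictByRangeScan(
      index: mutable.TreeMap[(Long, String), StateRow],
      watermarkMs: Long): Unit = {
    val expired = index.rangeUntil((watermarkMs, "")).keys.toList
    expired.foreach(index.remove)
  }
}
```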
   
   The only exception to the above is TWS, where we separate the data and the timers: the data isn't coupled with event time because the timers handle it (see the sketch below). Someone may argue this is a better design since it separates the concerns, but I believe it doesn't perform as well as the proposal.
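
   For comparison, here is an equally rough sketch of that separation (again plain Scala collections, not the real transformWithState or timer API); the extra hop from the timer registry back to the per-key state is the indirection I believe makes it slower than coupling event time into the state layout itself:

```scala
import scala.collection.mutable

// Rough model of the TWS-style split, not the real transformWithState/timer API.
// User state is keyed only by the grouping key; timers live in a separate
// event-time-ordered registry that points back to those keys.
object TimerSplitSketch {
  val userState = mutable.Map.empty[String, String]            // key -> state
  val timers    = mutable.TreeMap.empty[(Long, String), Unit]  // (fireTimeMs, key) -> ()

  def registerTimer(key: String, fireTimeMs: Long): Unit =
    timers.put((fireTimeMs, key), ())

  // Advancing the watermark walks the expired timer prefix, then needs an
  // extra point access per key to reach the actual state it governs.
  def onWatermark(watermarkMs: Long): Unit = {
    val fired = timers.rangeUntil((watermarkMs, "")).keys.toList
    fired.foreach { fireKey =>
      userState.remove(fireKey._2) // indirection the coupled layout avoids
      timers.remove(fireKey)
    }
  }
}
```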
   
   So for me, the attempt at generalization here, removing the concept of event time and replacing it with a long type or something similar, goes against the direction of the proposal. If we had a case where data should be ordered by an integer type, should we do the same and expose an API for that as well? I don't think there is sufficient motivation for that. The motivation for making event time first class is that it is one of the core concepts of the streaming engine, and we have been ignoring it in the state store.

