dmkozh edited a comment on pull request #13739:
URL: https://github.com/apache/beam/pull/13739#issuecomment-775544426


   > Even with the latest changes, this is still not writing the windowing 
information (including timestamps) to the cache.
   
   That's exactly the intent of the change - we don't want to cache trivial 
windowing information.
    
   > Maybe it would be helpful to understand what the objective of this change 
is?
   
   The objective is described in the attached ticket: we don't want to cache 
redundant information at all, because it adds a large overhead of ~500 
bytes/record. That can be reduced somewhat, but it is still hundreds of bytes. 
There may be some terminology confusion: by 'batch' pipelines I initially 
meant pipelines that never care about windowing because they process all 
the data at once.
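
   To make the idea concrete, here is a minimal sketch in plain Python. It does not use Beam's real classes or coders: `Windowed`, `GLOBAL_WINDOW`, `MIN_TIMESTAMP`, `DEFAULT_PANE`, `encode` and `decode` are all hypothetical names chosen for illustration, and JSON stands in for whatever encoding the cache actually uses. It only shows the principle of dropping windowing metadata when it carries nothing beyond the defaults, then restoring the defaults on read.

```python
import json
from dataclasses import dataclass
from typing import Any, Tuple

# Hypothetical stand-ins for default ("trivial") windowing metadata.
# These are NOT Beam's real classes or constants.
GLOBAL_WINDOW = "GlobalWindow"
MIN_TIMESTAMP = -(2 ** 62)
DEFAULT_PANE = "NO_FIRING"


@dataclass(frozen=True)
class Windowed:
    """A value plus its windowing metadata (illustrative only)."""
    value: Any
    timestamp: int = MIN_TIMESTAMP
    windows: Tuple[str, ...] = (GLOBAL_WINDOW,)
    pane: str = DEFAULT_PANE


def is_trivial(record: Windowed) -> bool:
    """True when the metadata holds nothing beyond the defaults."""
    return (
        record.timestamp == MIN_TIMESTAMP
        and record.windows == (GLOBAL_WINDOW,)
        and record.pane == DEFAULT_PANE
    )


def encode(record: Windowed, drop_trivial_windowing: bool) -> bytes:
    """Serialize a record, omitting default metadata when allowed."""
    if drop_trivial_windowing and is_trivial(record):
        payload = {"v": record.value}
    else:
        payload = {
            "v": record.value,
            "t": record.timestamp,
            "w": list(record.windows),
            "p": record.pane,
        }
    return json.dumps(payload).encode("utf-8")


def decode(data: bytes) -> Windowed:
    """Rebuild a record, filling in defaults for omitted metadata."""
    d = json.loads(data.decode("utf-8"))
    return Windowed(
        value=d["v"],
        timestamp=d.get("t", MIN_TIMESTAMP),
        windows=tuple(d.get("w", [GLOBAL_WINDOW])),
        pane=d.get("p", DEFAULT_PANE),
    )
```

   In this sketch a record with non-default windowing is always written in full, so turning the option on cannot lose information; it only shrinks records whose metadata is entirely default.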
   
   If there is a better way to determine whether a pipeline cares about 
windowing, I could use that instead. Also, since this is now an environment 
setting, it should be pretty hard to get unexpected results (though users who 
don't care about windowing won't see an immediate benefit either...)

