ryantam626 commented on PR #27924:
URL: https://github.com/apache/beam/pull/27924#issuecomment-1672082225

   > ... generally object creation of one or two should not make noticeable 
difference, and Timestamp is lightweight.
   
   Yes, I am also surprised that it showed up so prominently in the 
flamegraph we obtained.
   
   > This suggests the involved code path is a hot path, not limited to 
Timestamp creation.
   
   Smells like it!
   
   > Would you mind sharing more about the benchmark / code doing the 
test?
   
   I can try - the pipeline we have is not really a benchmark, but rather a 
production-ish pipeline that we use for analysing our data. The first few steps 
go like this:
   
   1. Read from a BigQuery query which returns a sparse minutely summary from 
data collection devices
   2. Assign each element a timestamp taken from the summary's own timestamp 
(already at minute-level resolution).
   3. Window into SlidingWindows with a 30-day window size and a 1-day increment.
   4. Steps unrelated to this ticket that analyse patterns follow.
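
To make the cost of step 3 concrete, here is a minimal pure-Python sketch of how a sliding-window assignment fans a single element out into several windows. It is a simplified model written for illustration, not Beam's actual implementation; the function name `assign_sliding_windows`, the `DAY` constant, and the use of integer seconds are all assumptions.

```python
from typing import List, Tuple

DAY = 24 * 60 * 60  # seconds in a day (timestamps are plain ints here)


def assign_sliding_windows(
    timestamp: int, size: int, period: int, offset: int = 0
) -> List[Tuple[int, int]]:
    """Return every [start, end) window of length `size`, started every
    `period` seconds, that contains `timestamp`. Hypothetical helper."""
    # Start of the latest window that begins at or before the timestamp.
    last_start = timestamp - ((timestamp - offset) % period)
    windows = []
    start = last_start
    # Walk backwards one period at a time while the window still
    # covers the timestamp (end is exclusive, so start must be > ts - size).
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= period
    return windows


# One minutely summary, 30-day windows sliding by 1 day.
ts = 100 * DAY + 60
assigned = assign_sliding_windows(ts, size=30 * DAY, period=DAY)
print(len(assigned))
```

Under this model every element lands in `size / period` windows, so any per-window work in the assignment path (such as object creation) is repeated that many times per element.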
   
   --------
   
   I am curious to know if you have spotted any inefficiency in these first few 
steps.
   
   We have since made some improvements to our pipeline which incidentally 
side-stepped this slowness [1], but regardless I thought this change would be a 
nice addition, seeing as it's an almost risk-free change which should improve 
performance.
   
   [1] We went for a sliding window specification of 90 days window size, 30 
days increment, trading extra memory usage (and a small amount of correctness 
in our pipeline) for speed - as each minutely summary now only needs to go into 
3 buckets instead of the 30 buckets required by the original specification.
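
The arithmetic behind that trade-off can be sketched as follows. This is only an illustration: the helper `windows_per_element` and the row count are hypothetical, and the calculation assumes the window size is an exact multiple of the increment.

```python
def windows_per_element(size_days: int, period_days: int) -> int:
    """Number of sliding windows each element is assigned to.
    Assumes size is an exact multiple of the period."""
    assert size_days % period_days == 0, "sketch only covers exact multiples"
    return size_days // period_days


before = windows_per_element(size_days=30, period_days=1)   # original spec
after = windows_per_element(size_days=90, period_days=30)   # revised spec

rows = 1_000_000  # hypothetical number of minutely summaries
print(f"assignments before: {before * rows}")
print(f"assignments after:  {after * rows}")
print(f"fan-out reduced by a factor of {before // after}")
```

The flip side is that each window now spans 90 days of data rather than 30, which is where the extra memory usage comes from.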
   
   
   
   
   
   
   
   

