TengHu removed a comment on pull request #8885:
URL: https://github.com/apache/flink/pull/8885#issuecomment-632942543


   > ## What is the purpose of the change
   > Flink triggers all panes belonging to one window at the same time. In 
other words, all panes are aligned and their triggers fire simultaneously, 
which causes spikes in the workload. This pull request adds WindowStagger to 
generate a staggering offset for each window assignment at runtime, so that the 
workload is distributed across time. Each window assignment is therefore based on 
the window size, the window offset, and the staggering offset (generated at runtime).
   > 
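As a rough illustration of the idea described above, the following standalone sketch computes a tumbling window start from a timestamp, a window offset, and a staggering offset. It does not use Flink classes; the class name `StaggerSketch` and the methods `windowStart` and `randomStagger` are hypothetical and are not taken from the PR. It assumes the staggering offset is simply added to the window offset and that a random stagger is drawn uniformly from `[0, size)`.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Standalone sketch; all names here are hypothetical, not Flink's API. */
public class StaggerSketch {

    /** Start of the tumbling window of length {@code size} that contains {@code timestamp}. */
    public static long windowStart(long timestamp, long offset, long size) {
        // floorMod keeps the remainder non-negative even if timestamp < offset.
        return timestamp - Math.floorMod(timestamp - offset, size);
    }

    /** Assumed staggering strategy: a uniform random offset in [0, size). */
    public static long randomStagger(long size) {
        return ThreadLocalRandom.current().nextLong(size);
    }

    public static void main(String[] args) {
        long size = 60_000L;    // one-minute tumbling windows
        long offset = 0L;       // user-configured window offset
        long now = 125_000L;    // example processing time in milliseconds

        // Without staggering, every assigner computes the same window boundaries.
        long aligned = windowStart(now, offset, size);

        // With staggering, each assigner shifts its boundaries by its own offset.
        long stagger = 15_000L; // fixed for illustration; randomStagger(size) would vary
        long staggered = windowStart(now, offset + stagger, size);

        System.out.println(aligned + " " + staggered);
    }
}
```

With these example values the aligned window starts at 120000 and the staggered window at 75000, so the two windows end, and would fire, 15 seconds apart. If each parallel assigner draws its own stagger once, window firings are spread over the window length instead of coinciding.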
   > This change only modifies TumblingProcessingTimeWindows; changes to the 
other window assigners will follow in separate PRs.
   > 
   > ## Brief change log
   > * _Add WindowStagger for generating staggering offsets_
   > * _Enable TumblingProcessingTimeWindows to generate staggering offsets if 
enabled by the user_
   > 
   > ## Verifying this change
   > This change is already covered by existing tests, such as 
_TumblingProcessingTimeWindowsTest_.
   > 
   > This change added tests and can be verified as follows:
   > 
   > * _Added unit tests for WindowStagger_
   > * _Validated the change by running in our clusters with 3500 task managers 
in total on a stateful streaming program using sliding and tumbling windowing. 
Some dashboards are shown below_
   > 
   > 
![](https://camo.githubusercontent.com/298c26fefe598c1b4605a83f84451adfde4fae50/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835362f737461676765725f77696e646f775f64656c61792e706e67)
   > 
![](https://camo.githubusercontent.com/3c4769bd8b7352c8860eaaa4bc6bbffc64104975/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835372f737461676765725f77696e646f775f7468726f7567687075742e706e67)
   > 
![](https://camo.githubusercontent.com/5fd01124c4c21d9388eab78c498c928ff6a651db/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835392f737461676765725f77696e646f772e706e67)
   > 
   > _some system metrics_
   > 
   > 
![buffers_in_queue](https://user-images.githubusercontent.com/10646097/60139232-00f29d80-9762-11e9-84f4-3bfbde28c028.png)
   > 
![buffer_usage](https://user-images.githubusercontent.com/10646097/60139234-03ed8e00-9762-11e9-99d1-de845d02a8c6.png)
   > 
![output_record_rate](https://user-images.githubusercontent.com/10646097/60139237-064fe800-9762-11e9-959d-2db96c7f7bf6.png)
   > 
   > ## Does this pull request potentially affect one of the following parts:
   > * Dependencies (does it add or upgrade a dependency): no
   > * The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes, TumblingProcessingTimeWindows (potentially all 
WindowAssigners)
   > * The serializers: no
   > * The runtime per-record code paths (performance sensitive): not sure, 
probably not
   > * Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
   > * The S3 file system connector: no
   > 
   > ## Documentation
   > * Does this pull request introduce a new feature? yes
   > * If yes, how is the feature documented? JavaDocs
   
   
   
   > This is a very good feature! Sorry for taking so (extremely) long to 
finally review this. I didn't initially see this PR.
   > 
   > Could you please address my comments?
   > 
   > Also, when creating a PR the individual commits should also have summaries 
according to the Jira issue, i.e. `[FLINK-XXXXX] Add ...`.
   
   Thank you for getting back to me. I lost the original fork (it's been a year 
since I made this change), so I created a new PR: 
https://github.com/apache/flink/pull/12297
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

