[GitHub] flink pull request #6255: [FLINK-9681] [table] Make sure difference between ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/6255 ---
[GitHub] flink pull request #6255: [FLINK-9681] [table] Make sure difference between ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/6255#discussion_r200333630 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/HarnessTestBase.scala --- @@ -384,4 +386,12 @@ object HarnessTestBase { value.row.getField(selectorField).asInstanceOf[T] } } + + /** +* Test class used to test min and max retention time. +*/ + class StreamQueryConfigTest(min: Time, max: Time) extends StreamQueryConfig { --- End diff -- I would rename the class to `TestStreamQueryConfig` because the `Test` at the end suggests that this class is testing something instead of being a util for a test. ---
[GitHub] flink pull request #6255: [FLINK-9681] [table] Make sure difference between ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/6255#discussion_r200330683 --- Diff: docs/dev/table/streaming.md --- @@ -591,16 +589,14 @@ qConfig.withIdleStateRetentionTime(Time.hours(12); val qConfig: StreamQueryConfig = ??? -// set idle state retention time: min = 12 hour, max = 16 hours -qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(16)) -// set idle state retention time. min = max = 12 hours -qConfig.withIdleStateRetentionTime(Time.hours(12) +// set idle state retention time: min = 12 hour, max = 24 hours +qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24)) {% endhighlight %} -Configuring different minimum and maximum idle state retention times is more efficient because it reduces the internal book-keeping of a query for when to remove state. +Configuring different minimum and maximum idle state retention times is more efficient because it reduces the internal book-keeping of a query for when to remove state. Difference between minTime and maxTime shoud be at least 5 minutes. --- End diff -- The "... more efficient ..." does not apply anymore. Maybe rephrase to > Cleaning up state requires additional bookkeeping which becomes less expensive for larger differences of `minTime` and `maxTime`. The difference between `minTime` and `maxTime` must be at least 5 minutes. ---
[GitHub] flink pull request #6255: [FLINK-9681] [table] Make sure difference between ...
GitHub user hequn8128 opened a pull request: https://github.com/apache/flink/pull/6255 [FLINK-9681] [table] Make sure difference between minRetentionTime and maxRetentionTime at least 5 minutes ## What is the purpose of the change This PR aims to make sure difference between minRetentionTime and maxRetentionTime at least 5 minutes. Currently, for a group by(or other operators), if minRetentionTime equals to maxRetentionTime, the group by operator will register a timer for each record coming at different time which cause performance problem. The reasoning for having two parameters is that we can avoid to register many timers if we have more freedom when to discard state. As min equals to max cause performance problem it is better to make sure these two parameters are not same. ## Brief change log - Throw exception when difference between minRetentionTime and maxRetentionTime smaller than 5 minutes - Add a `QueryConfigTest` class extends from StreamQueryConfig. `QueryConfigTest` don't have the 5min limitation which makes test more convenient. - Adapt tests. ## Verifying this change This change added tests and can be verified as follows: - Added test that validates that min and max retention time are set correctly. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) You can merge this pull request into a Git repository by running: $ git pull https://github.com/hequn8128/flink retentionTime Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/6255.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6255 commit b41df60dde422bce9317479c8d77304d6ee4857b Author: hequn8128 Date: 2018-07-04T14:11:38Z [FLINK-9681] [table] Make sure difference between minRetentionTime and maxRetentionTime at least 5 minutes ---