[jira] [Commented] (FLINK-7608) LatencyGauge change to histogram metric
[ https://issues.apache.org/jira/browse/FLINK-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342946#comment-16342946 ]

ASF GitHub Bot commented on FLINK-7608:
---------------------------------------

Github user yew1eb commented on the issue:

    https://github.com/apache/flink/pull/5161

    Yes, in our production environment we report and store all metrics in an external time-series database for alerting and visualization. When the job is started, we store the edge structure of the job's logical plan (e.g. ..). For the latency metrics, we would like the tags to include the operatorName instead of the operatorID: the operatorID changes if the job is redeployed, so the job's historical latency data becomes hard to match and no latency statistics show up in our visualization panel.

> LatencyGauge change to histogram metric
> ---------------------------------------
>
>                 Key: FLINK-7608
>                 URL: https://issues.apache.org/jira/browse/FLINK-7608
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics
>            Reporter: Hai Zhou UTC+8
>            Assignee: Hai Zhou UTC+8
>            Priority: Major
>             Fix For: 1.5.0
>
>
> I used slf4jReporter [https://issues.apache.org/jira/browse/FLINK-4831] to export metrics to the log file.
> I found:
> {noformat}
> -- Gauges
> ..
> zhouhai-mbp.taskmanager.f3fd3a269c8c3da4e8319c8f6a201a57.Flink Streaming Job.Map.0.latency:
>   value={LatencySourceDescriptor{vertexID=1, subtaskIndex=-1}={p99=116.0, p50=59.5, min=11.0, max=116.0, p95=116.0, mean=61.836}}
> zhouhai-mbp.taskmanager.f3fd3a269c8c3da4e8319c8f6a201a57.Flink Streaming Job.Sink- Unnamed.0.latency:
>   value={LatencySourceDescriptor{vertexID=1, subtaskIndex=0}={p99=195.0, p50=163.5, min=115.0, max=195.0, p95=195.0, mean=161.0}}
> ..
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] flink issue #5161: [FLINK-7608][metric] Refactor latency statistics metric
Github user yew1eb commented on the issue:

    https://github.com/apache/flink/pull/5161

    Yes, in our production environment we report and store all metrics in an external time-series database for alerting and visualization. When the job is started, we store the edge structure of the job's logical plan (e.g. ..). For the latency metrics, we would like the tags to include the operatorName instead of the operatorID: the operatorID changes if the job is redeployed, so the job's historical latency data becomes hard to match and no latency statistics show up in our visualization panel.

---
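The statistics the latency metric prints above (min, max, mean, p50/p95/p99) can be sketched in plain Java. This is an illustrative stand-alone computation under a nearest-rank percentile definition, not Flink's actual histogram implementation; `LatencyStats` and its sample values are hypothetical.

```java
import java.util.Arrays;

// Minimal sketch of the histogram-style statistics (min, max, mean,
// percentiles) that a histogram metric would report for latency samples.
// Hypothetical helper, not Flink's histogram implementation.
public class LatencyStats {

    // Nearest-rank percentile over a sorted copy of the samples.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        double[] latenciesMs = {11.0, 42.0, 59.5, 60.0, 116.0}; // made-up samples
        double min = Arrays.stream(latenciesMs).min().getAsDouble();
        double max = Arrays.stream(latenciesMs).max().getAsDouble();
        double mean = Arrays.stream(latenciesMs).average().getAsDouble();
        System.out.printf("min=%.1f max=%.1f mean=%.1f p50=%.1f p99=%.1f%n",
                min, max, mean,
                percentile(latenciesMs, 50), percentile(latenciesMs, 99));
    }
}
```

A reporter keyed on operatorName rather than operatorID would then attach these values to tags that survive a redeploy, which is the point of the comment above.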
[jira] [Commented] (FLINK-8515) update RocksDBMapState to replace deprecated remove() with delete()
[ https://issues.apache.org/jira/browse/FLINK-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342802#comment-16342802 ]

ASF GitHub Bot commented on FLINK-8515:
---------------------------------------

Github user bowenli86 commented on the issue:

    https://github.com/apache/flink/pull/5365

    cc @StefanRRichter

> update RocksDBMapState to replace deprecated remove() with delete()
> -------------------------------------------------------------------
>
>                 Key: FLINK-8515
>                 URL: https://issues.apache.org/jira/browse/FLINK-8515
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Minor
>             Fix For: 1.5.0
>
>
> Currently in RocksDBMapState:
> {code:java}
> @Override
> public void remove(UK userKey) throws IOException, RocksDBException {
>     byte[] rawKeyBytes = serializeUserKeyWithCurrentKeyAndNamespace(userKey);
>     backend.db.remove(columnFamily, writeOptions, rawKeyBytes);
> }
> {code}
> remove() is deprecated and should be replaced with delete().

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] flink issue #5365: [FLINK-8515] update RocksDBMapState to replace deprecated...
Github user bowenli86 commented on the issue:

    https://github.com/apache/flink/pull/5365

    cc @StefanRRichter

---
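The migration the ticket asks for is a drop-in rename: RocksDB's Java API deprecated `remove(...)` in favor of `delete(...)` with the same semantics. The sketch below illustrates the change against a stand-in key-value store so it is self-contained; `KvStore` and `InMemoryKvStore` are hypothetical stand-ins, not RocksDB classes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the remove() -> delete() migration. KvStore mimics the relevant
// slice of the RocksDB Java API surface: delete() replaces the deprecated
// remove() with identical behavior.
public class MapStateDeleteSketch {

    interface KvStore {
        @Deprecated
        void remove(byte[] key);   // old, deprecated name
        void delete(byte[] key);   // replacement with the same semantics
        byte[] get(byte[] key);
        void put(byte[] key, byte[] value);
    }

    static class InMemoryKvStore implements KvStore {
        private final Map<String, byte[]> map = new HashMap<>();
        public void remove(byte[] key) { delete(key); }
        public void delete(byte[] key) { map.remove(new String(key)); }
        public byte[] get(byte[] key) { return map.get(new String(key)); }
        public void put(byte[] key, byte[] value) { map.put(new String(key), value); }
    }

    private final KvStore db = new InMemoryKvStore();

    // Updated state method: call delete() instead of the deprecated remove().
    public void removeUserKey(byte[] rawKeyBytes) {
        db.delete(rawKeyBytes);
    }

    public KvStore store() {
        return db;
    }
}
```

In the real `RocksDBMapState.remove(UK)` body, only the `backend.db.remove(...)` call changes to `backend.db.delete(...)`; arguments stay the same.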
[jira] [Issue Comment Deleted] (FLINK-6214) WindowAssigners do not allow negative offsets
[ https://issues.apache.org/jira/browse/FLINK-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jelmer Kuperus updated FLINK-6214:
----------------------------------
    Comment: was deleted

(was: Proposed pull request: https://github.com/apache/flink/pull/5376)

> WindowAssigners do not allow negative offsets
> ---------------------------------------------
>
>                 Key: FLINK-6214
>                 URL: https://issues.apache.org/jira/browse/FLINK-6214
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>    Affects Versions: 1.2.0
>            Reporter: Timo Walther
>            Priority: Major
>
> Both the website and the JavaDoc promote
> ".window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8))) For example, in China you would have to specify an offset of Time.hours(-8)".
> But both the sliding and tumbling event time assigners do not allow the offset to be negative.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (FLINK-6214) WindowAssigners do not allow negative offsets
[ https://issues.apache.org/jira/browse/FLINK-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342761#comment-16342761 ]

Jelmer Kuperus commented on FLINK-6214:
---------------------------------------

Proposed pull request: https://github.com/apache/flink/pull/5376

> WindowAssigners do not allow negative offsets
> ---------------------------------------------
>
>                 Key: FLINK-6214
>                 URL: https://issues.apache.org/jira/browse/FLINK-6214
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>    Affects Versions: 1.2.0
>            Reporter: Timo Walther
>            Priority: Major
>
> Both the website and the JavaDoc promote
> ".window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8))) For example, in China you would have to specify an offset of Time.hours(-8)".
> But both the sliding and tumbling event time assigners do not allow the offset to be negative.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (FLINK-6214) WindowAssigners do not allow negative offsets
[ https://issues.apache.org/jira/browse/FLINK-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342760#comment-16342760 ]

ASF GitHub Bot commented on FLINK-6214:
---------------------------------------

GitHub user jelmerk opened a pull request:

    https://github.com/apache/flink/pull/5376

    [FLINK-6214] WindowAssigners do not allow negative offsets

    ## What is the purpose of the change

    The javadoc of TumblingEventTimeWindows and TumblingProcessingTimeWindows suggests that it is possible to use negative offsets, but in practice this is not supported. This patch remedies that situation.

    ## Brief change log

    - Updated the behavior of TumblingEventTimeWindows & TumblingProcessingTimeWindows

    ## Verifying this change

    This change added tests and can be verified as follows:

    - Added a testWindowAssignmentWithNegativeOffset method in TumblingEventTimeWindowsTest & TumblingProcessingTimeWindowsTest

    ## Does this pull request potentially affect one of the following parts:

    - Dependencies (does it add or upgrade a dependency): no
    - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
    - The serializers: no
    - The runtime per-record code paths (performance sensitive): no
    - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
    - The S3 file system connector: no

    ## Documentation

    - Does this pull request introduce a new feature? no
    - If yes, how is the feature documented? not applicable

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jelmerk/flink negative_offsets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5376.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5376

commit 07f5577d67155c86af72dac85f7c96b4fea06456
Author: Jelmer Kuperus
Date:   2018-01-28T22:22:10Z

    FLINK-6214: allow negative offsets

> WindowAssigners do not allow negative offsets
> ---------------------------------------------
>
>                 Key: FLINK-6214
>                 URL: https://issues.apache.org/jira/browse/FLINK-6214
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>    Affects Versions: 1.2.0
>            Reporter: Timo Walther
>            Priority: Major
>
> Both the website and the JavaDoc promote
> ".window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8))) For example, in China you would have to specify an offset of Time.hours(-8)".
> But both the sliding and tumbling event time assigners do not allow the offset to be negative.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] flink pull request #5376: [FLINK-6214] WindowAssigners do not allow negative...
GitHub user jelmerk opened a pull request:

    https://github.com/apache/flink/pull/5376

    [FLINK-6214] WindowAssigners do not allow negative offsets

    ## What is the purpose of the change

    The javadoc of TumblingEventTimeWindows and TumblingProcessingTimeWindows suggests that it is possible to use negative offsets, but in practice this is not supported. This patch remedies that situation.

    ## Brief change log

    - Updated the behavior of TumblingEventTimeWindows & TumblingProcessingTimeWindows

    ## Verifying this change

    This change added tests and can be verified as follows:

    - Added a testWindowAssignmentWithNegativeOffset method in TumblingEventTimeWindowsTest & TumblingProcessingTimeWindowsTest

    ## Does this pull request potentially affect one of the following parts:

    - Dependencies (does it add or upgrade a dependency): no
    - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
    - The serializers: no
    - The runtime per-record code paths (performance sensitive): no
    - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
    - The S3 file system connector: no

    ## Documentation

    - Does this pull request introduce a new feature? no
    - If yes, how is the feature documented? not applicable

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jelmerk/flink negative_offsets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5376.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5376

commit 07f5577d67155c86af72dac85f7c96b4fea06456
Author: Jelmer Kuperus
Date:   2018-01-28T22:22:10Z

    FLINK-6214: allow negative offsets

---
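Why a negative offset is mathematically unproblematic can be seen from the window-start arithmetic. The sketch below mirrors the formula used by Flink's `TimeWindow.getWindowStartWithOffset()` in plain Java; the class name and sample timestamps are hypothetical, chosen to reproduce the `Time.hours(-8)` example from the javadoc.

```java
// Sketch of tumbling-window start assignment with an offset, mirroring the
// arithmetic in Flink's TimeWindow.getWindowStartWithOffset(). A negative
// offset (e.g. Time.hours(-8) for UTC+8 local days) works fine; the
// assigners only need to stop rejecting it.
public class NegativeOffsetSketch {

    static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
        // Adding windowSize before taking the remainder keeps the result
        // correct when the offset is negative.
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long hour = 60 * 60 * 1000L;
        long day = 24 * hour;
        long offset = -8 * hour; // "Time.hours(-8)" from the javadoc example

        // An event at 10:00 UTC on day 0 falls into the window starting at
        // 16:00 UTC of the previous day, i.e. 00:00 local time in UTC+8.
        long start = getWindowStartWithOffset(10 * hour, offset, day);
        System.out.println(start == -8 * hour); // prints "true"
    }
}
```

With offset 0 the formula reduces to the usual `timestamp - timestamp % windowSize`, so existing behavior is unchanged.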
[jira] [Commented] (FLINK-8484) Kinesis consumer re-reads closed shards on job restart
[ https://issues.apache.org/jira/browse/FLINK-8484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342608#comment-16342608 ]

ASF GitHub Bot commented on FLINK-8484:
---------------------------------------

Github user pluppens commented on the issue:

    https://github.com/apache/flink/pull/5337

    @tzulitai Is there anything more I can do from my side?

> Kinesis consumer re-reads closed shards on job restart
> ------------------------------------------------------
>
>                 Key: FLINK-8484
>                 URL: https://issues.apache.org/jira/browse/FLINK-8484
>             Project: Flink
>          Issue Type: Bug
>          Components: Kinesis Connector
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Philip Luppens
>            Assignee: Philip Luppens
>            Priority: Blocker
>              Labels: bug, flink, kinesis
>             Fix For: 1.3.3, 1.5.0, 1.4.1
>
>
> We're using the connector to subscribe to streams varying from 1 to 100 shards, and used the kinesis-scaling-utils to dynamically scale the Kinesis stream up and down during peak times. What we noticed is that, while we had closed shards, any Flink job restart from a check- or save-point would result in shards being re-read from the event horizon, duplicating our events.
>
> We started checking the checkpoint state, and found that the shards were stored correctly with the proper sequence number (including for closed shards), but that upon restart, the older closed shards would be read from the event horizon, as if their restored state were ignored.
>
> In the end, we believe we found the problem: in the FlinkKinesisConsumer's run() method, we try to match the shard returned from the KinesisDataFetcher against the shards' metadata from the restoration point, but we do this via a containsKey() call, which means we use the StreamShardMetadata's equals() method. However, this compares all properties, including the endingSequenceNumber, which might have changed between the restored state's checkpoint and our data fetch. This fails the equality check, fails the containsKey() check, and results in the shard being re-read from the event horizon, even though it was present in the restored state.
>
> We've created a workaround where we only check the shardId and stream name to restore the state of shards we've already seen, and this seems to work correctly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] flink issue #5337: [FLINK-8484][flink-kinesis-connector] Ensure a Kinesis co...
Github user pluppens commented on the issue:

    https://github.com/apache/flink/pull/5337

    @tzulitai Is there anything more I can do from my side?

---
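The workaround described in FLINK-8484 — match restored shard state on stream name and shard id only, so a changed endingSequenceNumber on a shard closed after the checkpoint still matches — can be sketched as below. `ShardKey` and the map layout are hypothetical illustrations, not the connector's actual classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Sketch of the workaround from the ticket: key restored shard state by
// (streamName, shardId) only, instead of full StreamShardMetadata equality,
// so a shard whose endingSequenceNumber changed (it was closed after the
// checkpoint) still finds its restored sequence number.
public class ShardStateLookup {

    static final class ShardKey {
        final String streamName;
        final String shardId;

        ShardKey(String streamName, String shardId) {
            this.streamName = streamName;
            this.shardId = shardId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ShardKey)) {
                return false;
            }
            ShardKey other = (ShardKey) o;
            // Deliberately ignore every other shard property, in particular
            // the ending sequence number.
            return streamName.equals(other.streamName) && shardId.equals(other.shardId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(streamName, shardId);
        }
    }

    public static void main(String[] args) {
        // Restored state from the checkpoint: shard was still open back then.
        Map<ShardKey, String> restoredSequenceNumbers = new HashMap<>();
        restoredSequenceNumbers.put(new ShardKey("my-stream", "shardId-000000000000"), "restored-seq");

        // On restart the shard is now closed. Full-metadata equality would
        // miss here; the reduced (streamName, shardId) key still matches.
        String seq = restoredSequenceNumbers.get(new ShardKey("my-stream", "shardId-000000000000"));
        System.out.println(seq != null); // prints "true"
    }
}
```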
[jira] [Commented] (FLINK-8434) The new yarn resource manager should take over the running task managers after failover
[ https://issues.apache.org/jira/browse/FLINK-8434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342553#comment-16342553 ]

Gary Yao commented on FLINK-8434:
---------------------------------

[~tiemsn] I think this is a duplicate of FLINK-7805. One of the tickets should be closed.

> The new yarn resource manager should take over the running task managers after failover
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-8434
>                 URL: https://issues.apache.org/jira/browse/FLINK-8434
>             Project: Flink
>          Issue Type: Bug
>          Components: Cluster Management
>    Affects Versions: 1.5.0
>            Reporter: shuai.xu
>            Assignee: shuai.xu
>            Priority: Major
>              Labels: flip-6
>
> The application master, which contains the job master and the yarn resource manager, may fail over while running on yarn. The new resource manager should take over the running task managers after it starts. But currently the YarnResourceManager does not record the running containers in workerNodeMap, so when task managers register with it, it rejects them.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
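The fix direction the ticket describes — record the containers YARN still reports as running into workerNodeMap before accepting task manager registrations — can be sketched with a plain map. Every name here (`FailoverRecoverySketch`, `recoverWorkers`, `registerTaskManager`) is a hypothetical stand-in for YarnResourceManager internals, not Flink's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the failover-recovery idea from FLINK-8434: after the new AM
// starts, seed workerNodeMap with the containers YARN reports as still
// running, so those task managers are accepted instead of rejected when
// they re-register. All names are hypothetical stand-ins.
public class FailoverRecoverySketch {

    // containerId -> host; the new RM starts with this empty, which is the bug.
    private final Map<String, String> workerNodeMap = new HashMap<>();

    // Would be fed from the list of running containers that YARN returns
    // when the new application master registers after failover.
    void recoverWorkers(Map<String, String> runningContainers) {
        workerNodeMap.putAll(runningContainers);
    }

    // A registering task manager is accepted only if its container is known.
    boolean registerTaskManager(String containerId) {
        return workerNodeMap.containsKey(containerId);
    }
}
```

Without the `recoverWorkers` step, every re-registration after failover fails the `containsKey` check, which is exactly the rejection the ticket reports.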