[GitHub] [flink] flinkbot commented on pull request #13272: [FLINK-18695][network] Netty fakes heap buffer allocation with direct buffers
flinkbot commented on pull request #13272: URL: https://github.com/apache/flink/pull/13272#issuecomment-682341723 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 3ef9f728f6a8005a7f91846bf32df96ee718c626 (Fri Aug 28 05:58:42 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] klion26 commented on a change in pull request #13225: [FLINK-18974][docs-zh]Translate the 'User-Defined Functions' page of "Application Development's DataStream API" into Chinese
klion26 commented on a change in pull request #13225: URL: https://github.com/apache/flink/pull/13225#discussion_r478844519 ## File path: docs/dev/user_defined_functions.zh.md ## @@ -147,95 +153,77 @@ data.map (new RichMapFunction[String, Int] { -Rich functions provide, in addition to the user-defined function (map, -reduce, etc), four methods: `open`, `close`, `getRuntimeContext`, and -`setRuntimeContext`. These are useful for parameterizing the function -(see [Passing Parameters to Functions]({{ site.baseurl }}/dev/batch/index.html#passing-parameters-to-functions)), -creating and finalizing local state, accessing broadcast variables (see -[Broadcast Variables]({{ site.baseurl }}/dev/batch/index.html#broadcast-variables)), and for accessing runtime -information such as accumulators and counters (see -[Accumulators and Counters](#accumulators--counters)), and information -on iterations (see [Iterations]({{ site.baseurl }}/dev/batch/iterations.html)). +除了用户自定义的功能(map,reduce 等),Rich functions 还提供了四个方法:`open`、`close`、`getRuntimeContext` 和 +`setRuntimeContext`。这些对于参数化功能很有用 +(参阅 [给函数传递参数]({{ site.baseurl }}/zh/dev/batch/index.html#passing-parameters-to-functions)), +创建和最终确定本地状态,访问广播变量(参阅 +[广播变量]({{ site.baseurl }}/zh/dev/batch/index.html#broadcast-variables)),以及访问运行时信息,例如累加器和计数器(参阅 +[累加器和计数器](#累加器和计数器)),以及迭代器的相关信息(参阅 [迭代器]({{ site.baseurl }}/zh/dev/batch/iterations.html))。 {% top %} -## Accumulators & Counters + -Accumulators are simple constructs with an **add operation** and a **final accumulated result**, -which is available after the job ended. +## 累加器和计数器 -The most straightforward accumulator is a **counter**: You can increment it using the -```Accumulator.add(V value)``` method. At the end of the job Flink will sum up (merge) all partial -results and send the result to the client. Accumulators are useful during debugging or if you -quickly want to find out more about your data. 
+累加器是具有**加法运算**和**最终累加结果**的一种简单结构,可在作业结束后使用。 -Flink currently has the following **built-in accumulators**. Each of them implements the -{% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Accumulator.java "Accumulator" %} -interface. +最简单的累加器就是**计数器**: 你可以使用 +```Accumulator.add(V value)``` 方法将其递增。在作业结束时,Flink 会汇总(合并)所有部分的结果并将其发送给客户端。 +在调试过程中或在你想快速了解有关数据更多信息时,累加器作用很大。 + +Flink 目前有如下**内置累加器**。每个都实现了 +{% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Accumulator.java "累加器" %} +接口。 - {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/IntCounter.java "__IntCounter__" %}, {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/LongCounter.java "__LongCounter__" %} - and {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/DoubleCounter.java "__DoubleCounter__" %}: - See below for an example using a counter. -- {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Histogram.java "__Histogram__" %}: - A histogram implementation for a discrete number of bins. Internally it is just a map from Integer - to Integer. You can use this to compute distributions of values, e.g. the distribution of - words-per-line for a word count program. + 和 {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/DoubleCounter.java "__DoubleCounter__" %}: + 有关使用计数器的示例,请参见下文。 +- {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Histogram.java "__直方图__" %}: + 离散数量的柱状直方图实现。在内部,它只是整形到整形的映射。你可以使用它来计算值的分布,例如,单词计数程序的每行单词的分布情况。 -__How to use accumulators:__ +__如何使用累加器:__ -First you have to create an accumulator object (here a counter) in the user-defined transformation -function where you want to use it. 
+首先,你要在需要使用累加器的用户自定义的转换函数中创建一个累加器对象(此处是计数器)。 Review comment: ```suggestion 首先,在需要使用累加器的用户自定义函数中创建一个累加器对象(此处是计数器)。 ``` ## File path: docs/dev/user_defined_functions.zh.md ## @@ -147,95 +153,77 @@ data.map (new RichMapFunction[String, Int] { -Rich functions provide, in addition to the user-defined function (map, -reduce, etc), four methods: `open`, `close`, `getRuntimeContext`, and -`setRuntimeContext`. These are useful for parameterizing the function -(see [Passing Parameters to Functions]({{ site.baseurl }}/dev/batch/index.html#passing-parameters-to-functions)), -creating and finalizing local state, accessing broadcast variables (see -[Broadcast Variables]({{ site.baseurl }}/dev/batch/index.html#broadcast-variables)), and for accessing runtime -information such as accumulators and counters (see -[Accumulators and Counters](#accumulators--counters)), and information -on iterations (see [Iterations]({{ site.baseurl }}/dev/batch/iterations.html)). +除了用户自定义的功能(map,reduce 等),Rich functions 还提供了四个方法:`open`、`close`、`getRuntimeContext` 和 +`setRuntimeContext`。这些对于参数化功能很有用 +(参阅 [给函数传递参数]({{ site.baseurl
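The quoted docs describe accumulators as an **add operation** plus a **final accumulated result** that Flink merges across parallel tasks at job end. A minimal, dependency-free sketch of that contract (the interface and classes here are hypothetical stand-ins for Flink's `org.apache.flink.api.common.accumulators.Accumulator` and `IntCounter`, not the real API):

```java
import java.util.Arrays;

public class AccumulatorSketch {

    /** Hypothetical stand-in for Flink's Accumulator interface: add + merge. */
    interface Accumulator<V, R> {
        void add(V value);
        R getLocalValue();
        void merge(Accumulator<V, R> other);
    }

    /** A counter, the most straightforward accumulator. */
    static class IntCounter implements Accumulator<Integer, Integer> {
        private int count;

        public void add(Integer value) {
            count += value;
        }

        public Integer getLocalValue() {
            return count;
        }

        public void merge(Accumulator<Integer, Integer> other) {
            count += other.getLocalValue();
        }
    }

    public static void main(String[] args) {
        // Each parallel task keeps its own partial counter...
        IntCounter task1 = new IntCounter();
        for (String line : Arrays.asList("to", "be", "or")) {
            task1.add(1);
        }
        IntCounter task2 = new IntCounter();
        for (String line : Arrays.asList("not", "to be")) {
            task2.add(1);
        }

        // ...and at job end the partial results are summed up (merged)
        // before being sent to the client.
        IntCounter merged = new IntCounter();
        merged.merge(task1);
        merged.merge(task2);
        System.out.println(merged.getLocalValue()); // 5
    }
}
```

In real Flink code the counter is registered in the rich function's `open()` method and incremented inside the transformation, as the page being translated goes on to show.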
[jira] [Updated] (FLINK-18695) Allow NettyBufferPool to allocate heap buffers
[ https://issues.apache.org/jira/browse/FLINK-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-18695: --- Labels: pull-request-available (was: ) > Allow NettyBufferPool to allocate heap buffers > -- > > Key: FLINK-18695 > URL: https://issues.apache.org/jira/browse/FLINK-18695 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Chesnay Schepler >Assignee: Yun Gao >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > > In 4.1.43, Netty made a change to their SslHandler to always use heap buffers > for JDK SSLEngine implementations, to avoid an additional memory copy. > However, our {{NettyBufferPool}} forbids heap buffer allocations. > We will either have to allow heap buffer allocations, or create a custom > SslHandler implementation that does not use heap buffers (although this seems > ill-advised?). > /cc [~sewen] [~uce] [~NicoK] [~zjwang] [~pnowojski] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13109: [FLINK-18808][runtime/metrics] Include side outputs in numRecordsOut metric.
flinkbot edited a comment on pull request #13109: URL: https://github.com/apache/flink/pull/13109#issuecomment-671381707 ## CI report: * 3d854c62355f9049062a7ae6a908dcceecd9c213 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5932) * 87ec25bce258c9fc953084701dd3acf7d96ac9e2 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5951) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii opened a new pull request #13272: [FLINK-18695][network] Netty fakes heap buffer allocation with direct buffers
gaoyunhaii opened a new pull request #13272: URL: https://github.com/apache/flink/pull/13272 ## What is the purpose of the change This PR modifies the `NettyBufferPool` to also allocate direct buffers for heap buffer requests. This keeps the memory footprint unchanged when upgrading Netty to 4.1.50-FINAL. In the future we could further decide how to adjust the Netty memory management. ## Brief change log - 3ef9f728f6a8005a7f91846bf32df96ee718c626 returns direct buffers for all the heap buffer allocation methods. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **no** - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **no** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **no** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **no** - If yes, how is the feature documented? **not applicable** This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
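The change described in this PR can be illustrated with a small, dependency-free sketch. The types below are hypothetical stand-ins: the real `NettyBufferPool` extends Netty's `io.netty.buffer.PooledByteBufAllocator` and overrides its `heapBuffer(...)` overloads, which this sketch only models.

```java
public class NettyBufferPoolSketch {

    /** Hypothetical stand-in for Netty's ByteBuf; we only track where memory lives. */
    interface Buf {
        boolean isDirect();
    }

    /** Hypothetical stand-in for Netty's PooledByteBufAllocator. */
    static class Allocator {
        Buf directBuffer(int capacity) {
            return () -> true;   // models an off-heap (direct) buffer
        }

        Buf heapBuffer(int capacity) {
            return () -> false;  // models an on-heap buffer
        }
    }

    /**
     * The PR's idea: answer every heap-buffer request with a direct buffer,
     * so the memory footprint stays unchanged when Netty internals (e.g. the
     * SslHandler since 4.1.43) start asking for heap buffers.
     */
    static class DirectOnlyPool extends Allocator {
        @Override
        Buf heapBuffer(int capacity) {
            return directBuffer(capacity); // "fake" the heap allocation
        }
    }

    public static void main(String[] args) {
        Allocator pool = new DirectOnlyPool();
        System.out.println(pool.heapBuffer(16).isDirect()); // true
    }
}
```

The trade-off is that callers expecting a backing `byte[]` (which only heap buffers have) must cope with direct memory; the PR accepts that in exchange for an unchanged off-heap footprint.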
[GitHub] [flink] flinkbot edited a comment on pull request #13109: [FLINK-18808][runtime/metrics] Include side outputs in numRecordsOut metric.
flinkbot edited a comment on pull request #13109: URL: https://github.com/apache/flink/pull/13109#issuecomment-671381707 ## CI report: * 3d854c62355f9049062a7ae6a908dcceecd9c213 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5932) * 87ec25bce258c9fc953084701dd3acf7d96ac9e2 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] liming30 commented on a change in pull request #13109: [FLINK-18808][runtime/metrics] Include side outputs in numRecordsOut metric.
liming30 commented on a change in pull request #13109: URL: https://github.com/apache/flink/pull/13109#discussion_r478836972 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/DirectedOutputsCollector.java ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.collector.selector; + +import org.apache.flink.streaming.api.operators.Output; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; + +import java.util.Map; + +/** + * The selected outputs collector will send records to the default output, + * and output matching outputNames. + * + * @param <OUT> The type of the elements that can be emitted.
 + */ +public class DirectedOutputsCollector<OUT> implements SelectedOutputsCollector<OUT> { + + private final Output<StreamRecord<OUT>>[] selectAllOutputs; + private final Map<String, Output<StreamRecord<OUT>>[]> outputMap; + + public DirectedOutputsCollector( + Output<StreamRecord<OUT>>[] selectAllOutputs, + Map<String, Output<StreamRecord<OUT>>[]> outputMap) { + this.selectAllOutputs = selectAllOutputs; + this.outputMap = outputMap; + } + + @Override + public boolean collect(Iterable<String> outputNames, StreamRecord<OUT> record) { + boolean emitted = false; + + if (selectAllOutputs.length > 0) { + collect(selectAllOutputs, record); + emitted = true; + } + + for (String outputName : outputNames) { + Output<StreamRecord<OUT>>[] outputList = outputMap.get(outputName); + if (outputList != null && outputList.length > 0) { + collect(outputList, record); + emitted = true; + } + } Review comment: In the old implementation via `set`, even if the same `output` appears multiple times in `outputNames`, it would only be sent once. Now it will be sent multiple times, and I am not sure if this behavior is correct. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] liming30 commented on a change in pull request #13109: [FLINK-18808][runtime/metrics] Include side outputs in numRecordsOut metric.
liming30 commented on a change in pull request #13109: URL: https://github.com/apache/flink/pull/13109#discussion_r478431177 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/SelectedOutputsCollectorImpl.java ## @@ -0,0 +1,61 @@ +package org.apache.flink.streaming.api.collector.selector; + +import org.apache.flink.streaming.api.operators.Output; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; + +import java.util.Map; + +/** + * The selected outputs collector will send records to the default output, + * and output matching outputNames. + * + * @param <OUT> The type of the elements that can be emitted. + */ +public class SelectedOutputsCollectorImpl<OUT> implements SelectedOutputsCollector<OUT> { + + private final Output<StreamRecord<OUT>>[] selectAllOutputs; + private final Map<String, Output<StreamRecord<OUT>>[]> outputMap; + + private final boolean objectReuse; + + public SelectedOutputsCollectorImpl( + Output<StreamRecord<OUT>>[] selectAllOutputs, + Map<String, Output<StreamRecord<OUT>>[]> outputMap, + boolean objectReuse) { + this.selectAllOutputs = selectAllOutputs; + this.outputMap = outputMap; + this.objectReuse = objectReuse; + } + + @Override + public boolean collect(Iterable<String> outputNames, StreamRecord<OUT> record) { + boolean emitted = false; + + if (selectAllOutputs.length > 0) { + collect(selectAllOutputs, record); + emitted = true; + } + + for (String outputName : outputNames) { + Output<StreamRecord<OUT>>[] outputList = outputMap.get(outputName); + if (outputList != null && outputList.length > 0) { + collect(outputList, record); + emitted = true; + } + } Review comment: In the old implementation via `set`, even if the same `output` appears multiple times in `outputNames`, it would only be sent once. Now it will be sent multiple times, and I am not sure if this behavior is correct. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
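The behavioral difference the reviewer raises (a `Set`-based collector emits to each physical output at most once, an array-based one emits once per matching name occurrence) can be shown with a small, dependency-free sketch. The string "outputs" here are hypothetical stand-ins for Flink's `Output<StreamRecord<T>>` objects:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OnceOnlyEmitSketch {

    /** New (array-based) behavior: emit once per matching name occurrence. */
    static List<String> emitPerOccurrence(Map<String, List<String>> outputMap, List<String> names) {
        List<String> emits = new ArrayList<>();
        for (String name : names) {
            List<String> outputs = outputMap.get(name);
            if (outputs != null) {
                emits.addAll(outputs);
            }
        }
        return emits;
    }

    /** Old (set-based) behavior: collect distinct outputs first, then emit once each. */
    static List<String> emitOncePerOutput(Map<String, List<String>> outputMap, List<String> names) {
        Set<String> distinct = new LinkedHashSet<>();
        for (String name : names) {
            List<String> outputs = outputMap.get(name);
            if (outputs != null) {
                distinct.addAll(outputs);
            }
        }
        return new ArrayList<>(distinct);
    }

    public static void main(String[] args) {
        // The same physical output registered under two selected names.
        Map<String, List<String>> outputMap = new HashMap<>();
        outputMap.put("side", Collections.singletonList("opA"));
        outputMap.put("alias", Collections.singletonList("opA"));

        List<String> selected = Arrays.asList("side", "alias", "side");
        System.out.println(emitPerOccurrence(outputMap, selected)); // [opA, opA, opA]
        System.out.println(emitOncePerOutput(outputMap, selected)); // [opA]
    }
}
```

Whether the once-only semantics must be preserved is exactly the open question in the review thread; the sketch only makes the two candidate behaviors concrete.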
[GitHub] [flink] dianfu commented on pull request #13193: [FLINK-18918][python][docs] Add dedicated connector documentation for Python Table API
dianfu commented on pull request #13193: URL: https://github.com/apache/flink/pull/13193#issuecomment-682331400 @hequn8128 Thanks for the update. LGTM. @sjwiesman @morsapaes could you take a further look at of the latest PR? Thanks a lot! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13271: [FLINK-19043][docs-zh] Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese
flinkbot edited a comment on pull request #13271: URL: https://github.com/apache/flink/pull/13271#issuecomment-682323547 ## CI report: * dcfea8ba4f522e9cd1a87022eadb10438902f22c Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5950) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13271: [FLINK-19043][docs-zh] Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese
flinkbot commented on pull request #13271: URL: https://github.com/apache/flink/pull/13271#issuecomment-682323547 ## CI report: * dcfea8ba4f522e9cd1a87022eadb10438902f22c UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19063) Support join late event from dimension table side in temporal table join
[ https://issues.apache.org/jira/browse/FLINK-19063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186253#comment-17186253 ] Paul Lin commented on FLINK-19063: -- WRT event-time temporal table join, as an initial and naive thought, I think we can leverage the watermarks of both streams (this may require some watermark mechanism changes). We should ensure the build side watermark is greater than that of the probe side; if not, maybe we can keep the unjoined data of the probe side table before the build side watermark (plus allowed lateness) in state. When a new element of the build side shows up, it triggers the unjoined data in state to re-join, produces the join result (if any), and removes the joined data from state. The process would be similar to an event-time interval join. > Support join late event from dimension table side in temporal table join > - > > Key: FLINK-19063 > URL: https://issues.apache.org/jira/browse/FLINK-19063 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Leonard Xu >Priority: Major > > Joining late events from the dimension table side in a temporal table join is a > common use case > from the user-zh mail list[1][2]. > And another similar use case is how to enable the faster stream to wait for the > slower stream in a regular stream join[3]. > I think we can discuss how to support these use cases. > > > [1][http://apache-flink.147419.n8.nabble.com/Flink-join-td6563.html] > [2][http://apache-flink.147419.n8.nabble.com/flinksql-mysql-td3584.html#a3585] > [3][http://apache-flink.147419.n8.nabble.com/Flink-sql-td4435.html#a4436] -- This message was sent by Atlassian Jira (v8.3.4#803005)
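The buffer-and-retry mechanism sketched in Paul Lin's comment can be modeled without any Flink APIs. Everything below is hypothetical (a real implementation would live in the temporal join operator and use Flink state, timers, and watermarks): probe-side records that find no dimension row are kept, and each build-side update retries them and evicts those the build watermark has passed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class LateJoinSketch {

    static class Probe {
        final long ts;
        final String key;

        Probe(long ts, String key) {
            this.ts = ts;
            this.key = key;
        }
    }

    final Map<String, String> dimTable = new HashMap<>(); // build side: key -> value
    final List<Probe> pending = new ArrayList<>();        // unjoined probe-side records
    final List<String> results = new ArrayList<>();
    long buildWatermark = Long.MIN_VALUE;

    /** Probe-side record arrives: join now, or buffer it until the build side catches up. */
    void onProbe(Probe p) {
        if (!tryJoin(p)) {
            pending.add(p);
        }
    }

    /** Build-side update arrives: advance the watermark and re-try buffered records. */
    void onBuild(String key, String value, long watermark) {
        dimTable.put(key, value);
        buildWatermark = watermark;
        Iterator<Probe> it = pending.iterator();
        while (it.hasNext()) {
            Probe p = it.next();
            // Emit a result if it joins now; evict records the watermark has passed.
            if (tryJoin(p) || p.ts <= buildWatermark) {
                it.remove();
            }
        }
    }

    private boolean tryJoin(Probe p) {
        String v = dimTable.get(p.key);
        if (v == null) {
            return false;
        }
        results.add(p.key + "->" + v);
        return true;
    }

    public static void main(String[] args) {
        LateJoinSketch join = new LateJoinSketch();
        join.onProbe(new Probe(5, "k1")); // dimension row not there yet: buffered
        join.onBuild("k1", "v1", 10);     // build side catches up: re-join fires
        System.out.println(join.results); // [k1->v1]
    }
}
```

As the comment notes, the eviction rule (watermark plus allowed lateness) is what bounds the state, making the process resemble an event-time interval join.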
[GitHub] [flink] RocMarshal commented on a change in pull request #13271: [FLINK-19043][docs-zh] Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese
RocMarshal commented on a change in pull request #13271: URL: https://github.com/apache/flink/pull/13271#discussion_r478825509 ## File path: docs/monitoring/logging.zh.md ## @@ -100,15 +106,13 @@ import org.slf4j.Logger Logger LOG = LoggerFactory.getLogger(Foobar.class) {% endhighlight %} -In order to benefit most from slf4j, it is recommended to use its placeholder mechanism. -Using placeholders allows to avoid unnecessary string constructions in case that the logging level is set so high that the message would not be logged. -The syntax of placeholders is the following: +为了最大限度地利用 slf4j,建议使用其占位符机制。使用占位符可以避免不必要的字符串构造,以防日志级别设置得太高而不会记录消息。占位符的语法如下: {% highlight java %} LOG.info("This message contains {} placeholders. {}", 2, "Yippie"); {% endhighlight %} -Placeholders can also be used in conjunction with exceptions which shall be logged. +占位符也可以与应记录的异常一起使用。 Review comment: ```suggestion 占位符也可以和要记录的异常一起使用。 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
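The placeholder point in the quoted docs is about laziness: with `{}` the final string is only built after the level check passes, so disabled log statements cost almost nothing. A dependency-free sketch of that mechanism (the `format` logic here mimics, but is not, slf4j's real `MessageFormatter`):

```java
public class LazyLoggingSketch {
    static boolean infoEnabled = false;
    static int formatCalls = 0;

    /** Mimics slf4j's parameterized logging: format only if the level is enabled. */
    static void info(String template, Object... args) {
        if (!infoEnabled) {
            return; // no string construction at all
        }
        System.out.println(format(template, args));
    }

    /** Substitute each {} with the next argument, left to right. */
    static String format(String template, Object... args) {
        formatCalls++;
        StringBuilder sb = new StringBuilder();
        int argIdx = 0;
        int from = 0;
        int at;
        while ((at = template.indexOf("{}", from)) >= 0 && argIdx < args.length) {
            sb.append(template, from, at).append(args[argIdx++]);
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    public static void main(String[] args) {
        info("This message contains {} placeholders. {}", 2, "Yippie");
        System.out.println(formatCalls); // 0: level disabled, nothing was formatted

        infoEnabled = true;
        info("This message contains {} placeholders. {}", 2, "Yippie");
    }
}
```

Compare with `LOG.info("..." + expensiveToString())`, where the concatenation runs regardless of the log level; that is the "unnecessary string construction" the docs warn about.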
[GitHub] [flink] flinkbot commented on pull request #13271: [FLINK-19043][docs-zh] Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese
flinkbot commented on pull request #13271: URL: https://github.com/apache/flink/pull/13271#issuecomment-682319220 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit a50c1399395cbc32008f02ffdb0380c897f522a4 (Fri Aug 28 04:33:59 UTC 2020) ✅ no warnings Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19043) Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese
[ https://issues.apache.org/jira/browse/FLINK-19043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-19043: --- Labels: Documentation Translation pull-request-available translation-zh (was: Documentation Translation translation-zh) > Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese > - > > Key: FLINK-19043 > URL: https://issues.apache.org/jira/browse/FLINK-19043 > Project: Flink > Issue Type: Improvement > Components: chinese-translation, Documentation >Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1 >Reporter: Roc Marshal >Assignee: Roc Marshal >Priority: Major > Labels: Documentation, Translation, pull-request-available, > translation-zh > > The page url is : > [Logging|https://ci.apache.org/projects/flink/flink-docs-release-1.11/monitoring/logging.html] > The markdown file location is : flink/docs/monitoring/logging.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] RocMarshal opened a new pull request #13271: [FLINK-19043][docs-zh] Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese
RocMarshal opened a new pull request #13271: URL: https://github.com/apache/flink/pull/13271 ## What is the purpose of the change *Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese* ## Brief change log *Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese* - *The page url is : Logging https://ci.apache.org/projects/flink/flink-docs-release-1.11/monitoring/logging.html* - *The markdown file location is : flink/docs/monitoring/logging.zh.md* ## Verifying this change *Translate the 'Logging' page of 'Debugging & Monitoring' into Chinese* A pure translation work in the `Documentation` module. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable & docs This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13270: [hotfix] [javadocs] fix typo in TaskManagerServices
flinkbot edited a comment on pull request #13270: URL: https://github.com/apache/flink/pull/13270#issuecomment-682310308 ## CI report: * 04b974f56351d188b12bfe7c22a2fb8cb6e8f68e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5948) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13249: [FLINK-14087][datastream] Clone the StreamPartitioner to avoid being shared at runtime.
flinkbot edited a comment on pull request #13249: URL: https://github.com/apache/flink/pull/13249#issuecomment-680775587 ## CI report: * 4eae231a2cd3edbb2ba34f7bef983f64a4c4fefa Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5915) * 07e11ae0c700a4172a854ae4fd108b483bde6003 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5947) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-18900) HiveCatalog should error out when listing partitions with an invalid spec
[ https://issues.apache.org/jira/browse/FLINK-18900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-18900. Resolution: Fixed > HiveCatalog should error out when listing partitions with an invalid spec > - > > Key: FLINK-18900 > URL: https://issues.apache.org/jira/browse/FLINK-18900 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.1 >Reporter: Rui Li >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.2 > > > Take the following case as an example: > {code} > create table tbl (x int) partitioned by (p int); > alter table tbl add partition (p=1); > {code} > If we list partitions with partition spec {{foo=1}}, HiveCatalog returns > partition {{p=1}}, which is wrong. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18900) HiveCatalog should error out when listing partitions with an invalid spec
[ https://issues.apache.org/jira/browse/FLINK-18900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186237#comment-17186237 ] Jingsong Lee commented on FLINK-18900: -- Revert interface modification for 1.11: a5767906916548ca51bf2d4b9e75c833ea6522a6 > HiveCatalog should error out when listing partitions with an invalid spec > - > > Key: FLINK-18900 > URL: https://issues.apache.org/jira/browse/FLINK-18900 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.1 >Reporter: Rui Li >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.2 > > > Take the following case as an example: > {code} > create table tbl (x int) partitioned by (p int); > alter table tbl add partition (p=1); > {code} > If we list partitions with partition spec {{foo=1}}, HiveCatalog returns > partition {{p=1}}, which is wrong. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi merged pull request #13269: [FLINK-18900][table] Revert the modification of Catalog.listPartitions
JingsongLi merged pull request #13269: URL: https://github.com/apache/flink/pull/13269 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13270: [hotfix] [javadocs] fix typo in TaskManagerServices
flinkbot commented on pull request #13270: URL: https://github.com/apache/flink/pull/13270#issuecomment-682310308 ## CI report: * 04b974f56351d188b12bfe7c22a2fb8cb6e8f68e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13249: [FLINK-14087][datastream] Clone the StreamPartitioner to avoid being shared at runtime.
flinkbot edited a comment on pull request #13249: URL: https://github.com/apache/flink/pull/13249#issuecomment-680775587 ## CI report: * 4eae231a2cd3edbb2ba34f7bef983f64a4c4fefa Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5915) * 07e11ae0c700a4172a854ae4fd108b483bde6003 UNKNOWN
[jira] [Updated] (FLINK-19081) Deprecate TemporalTableFunction and Table#createTemporalTableFunction()
[ https://issues.apache.org/jira/browse/FLINK-19081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu updated FLINK-19081: --- Summary: Deprecate TemporalTableFunction and Table#createTemporalTableFunction() (was: Deprecate TemporalTableFunction and Table$createTemporalTableFunction()) > Deprecate TemporalTableFunction and Table#createTemporalTableFunction() > --- > > Key: FLINK-19081 > URL: https://issues.apache.org/jira/browse/FLINK-19081 > Project: Flink > Issue Type: Sub-task >Reporter: Leonard Xu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19082) Add docs for temporal table and temporal table join
Leonard Xu created FLINK-19082: -- Summary: Add docs for temporal table and temporal table join Key: FLINK-19082 URL: https://issues.apache.org/jira/browse/FLINK-19082 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19081) Deprecate TemporalTableFunction and Table$createTemporalTableFunction()
Leonard Xu created FLINK-19081: -- Summary: Deprecate TemporalTableFunction and Table$createTemporalTableFunction() Key: FLINK-19081 URL: https://issues.apache.org/jira/browse/FLINK-19081 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19080) Materialize timeindicator data type in the right input of temporal join
Leonard Xu created FLINK-19080: -- Summary: Materialize timeindicator data type in the right input of temporal join Key: FLINK-19080 URL: https://issues.apache.org/jira/browse/FLINK-19080 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19079) Support row time deduplicate operator
Leonard Xu created FLINK-19079: -- Summary: Support row time deduplicate operator Key: FLINK-19079 URL: https://issues.apache.org/jira/browse/FLINK-19079 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19077) Improve process time temporal join operator
Leonard Xu created FLINK-19077: -- Summary: Improve process time temporal join operator Key: FLINK-19077 URL: https://issues.apache.org/jira/browse/FLINK-19077 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19078) Import rowtime join temporal operator
Leonard Xu created FLINK-19078: -- Summary: Import rowtime join temporal operator Key: FLINK-19078 URL: https://issues.apache.org/jira/browse/FLINK-19078 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19076) Import rule to deal Temporal Join condition
Leonard Xu created FLINK-19076: -- Summary: Import rule to deal Temporal Join condition Key: FLINK-19076 URL: https://issues.apache.org/jira/browse/FLINK-19076 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19075) Infer changelog trait for temporal join node
Leonard Xu created FLINK-19075: -- Summary: Infer changelog trait for temporal join node Key: FLINK-19075 URL: https://issues.apache.org/jira/browse/FLINK-19075 Project: Flink Issue Type: Bug Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19073) Improve streamExecTemporalJoinRule
Leonard Xu created FLINK-19073: -- Summary: Improve streamExecTemporalJoinRule Key: FLINK-19073 URL: https://issues.apache.org/jira/browse/FLINK-19073 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19074) Materialize timeindicator in the right input of temporal join
Leonard Xu created FLINK-19074: -- Summary: Materialize timeindicator in the right input of temporal join Key: FLINK-19074 URL: https://issues.apache.org/jira/browse/FLINK-19074 Project: Flink Issue Type: Bug Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #13270: [hotfix] [javadocs] fix typo in TaskManagerServices
flinkbot commented on pull request #13270: URL: https://github.com/apache/flink/pull/13270#issuecomment-682306916 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 04b974f56351d188b12bfe7c22a2fb8cb6e8f68e (Fri Aug 28 03:38:24 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Created] (FLINK-19072) Import Temporal Table join rule
Leonard Xu created FLINK-19072: -- Summary: Import Temporal Table join rule Key: FLINK-19072 URL: https://issues.apache.org/jira/browse/FLINK-19072 Project: Flink Issue Type: Sub-task Reporter: Leonard Xu -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] sdlcwangsong opened a new pull request #13270: [hotfix] [javadocs] fix typo in TaskManagerServices
sdlcwangsong opened a new pull request #13270: URL: https://github.com/apache/flink/pull/13270 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented?
[GitHub] [flink] flinkbot edited a comment on pull request #13263: [FLINK-17273][runtime] ActiveResourceManager closes task manager connection on worker terminated.
flinkbot edited a comment on pull request #13263: URL: https://github.com/apache/flink/pull/13263#issuecomment-681892463 ## CI report: * 990aa0f33e2fa93d40ea011082fedc6eeb37be44 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5922) * 4f0f42441fadc0961dbdecc9f1cf5ced660387d4 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5946)
[jira] [Created] (FLINK-19070) Hive connector should throw a meaningful exception if user reads/writes ACID tables
Rui Li created FLINK-19070: -- Summary: Hive connector should throw a meaningful exception if user reads/writes ACID tables Key: FLINK-19070 URL: https://issues.apache.org/jira/browse/FLINK-19070 Project: Flink Issue Type: Improvement Components: Connectors / Hive Reporter: Rui Li Fix For: 1.12.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186232#comment-17186232 ] Nicholas Jiang edited comment on FLINK-19064 at 8/28/20, 3:22 AM: -- [~ZhuShang] Please pay attention to the definition of the method openInputFormat, whose javadoc notes that resources should be allocated in this method (e.g. database connections, cache, etc.). was (Author: nicholasjiang): [~ZhuShang] Please pay attention to the definition of the method openInputFormat, whose javadoc notes that resources should be allocated in this method (e.g. database connections, cache, etc.). > HBaseRowDataInputFormat is leaking resources > > > Key: FLINK-19064 > URL: https://issues.apache.org/jira/browse/FLINK-19064 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Critical > > {{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, > which creates a connection to HBase that is not closed again. > A user reported this problem on the user@ list: > https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186232#comment-17186232 ] Nicholas Jiang commented on FLINK-19064: [~ZhuShang] Please pay attention to the definition of the method openInputFormat, whose javadoc notes that resources should be allocated in this method (e.g. database connections, cache, etc.). > HBaseRowDataInputFormat is leaking resources > > > Key: FLINK-19064 > URL: https://issues.apache.org/jira/browse/FLINK-19064 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Critical > > {{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, > which creates a connection to HBase that is not closed again. > A user reported this problem on the user@ list: > https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
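The lifecycle contract debated in this thread can be illustrated with a small, self-contained sketch: expensive resources belong in openInputFormat() and must be released symmetrically in closeInputFormat(), while configure() stays connection-free. The class and the FakeConnection stand-in are hypothetical, not the actual Flink HBase connector code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SketchInputFormat {
    // Stand-in for an HBase connection; purely illustrative.
    static class FakeConnection {
        private final AtomicBoolean open = new AtomicBoolean(true);
        void close() { open.set(false); }
        boolean isOpen() { return open.get(); }
    }

    private FakeConnection connection;

    public void configure() {
        // Lightweight, connection-free setup only: parse options, build the scan spec.
    }

    public void openInputFormat() {
        // Allocate per-task resources here, once per task lifetime.
        connection = new FakeConnection();
    }

    public void closeInputFormat() {
        // Symmetric release; without this, every task leaks a connection.
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }

    public boolean hasOpenConnection() {
        return connection != null && connection.isOpen();
    }

    public static void main(String[] args) {
        SketchInputFormat format = new SketchInputFormat();
        format.configure();
        format.openInputFormat();
        System.out.println(format.hasOpenConnection()); // true
        format.closeInputFormat();
        System.out.println(format.hasOpenConnection()); // false
    }
}
```

Under this contract, moving connectToTable() out of configure() and into openInputFormat() (with the matching close in closeInputFormat()) is what prevents the leak reported in FLINK-19064.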
[jira] [Created] (FLINK-19071) Some Hive window functions are not supported
Rui Li created FLINK-19071: -- Summary: Some Hive window functions are not supported Key: FLINK-19071 URL: https://issues.apache.org/jira/browse/FLINK-19071 Project: Flink Issue Type: Task Components: Connectors / Hive Reporter: Rui Li -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19069) finalizeOnMaster takes too much time and client timeouts
[ https://issues.apache.org/jira/browse/FLINK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186231#comment-17186231 ] Kenneth William Krugler commented on FLINK-19069: - I'd recently posted about a related issue to the dev mailing list, where I asked: {quote}[...] the default behavior of Hadoop’s FileOutputCommitter (with algorithm == 1) is to put files in task-specific sub-dirs. It’s depending on a post-completion “merge paths” action to be taken by what is (for Hadoop) the Application Master. I assume that when running on a real cluster, the HadoopOutputFormat.finalizeGlobal() method’s call to commitJob() would do this, but it doesn’t seem to be happening when I run locally. If I set the algorithm version to 2, then “merge paths” is handled by FileOutputCommitter immediately, and the HadoopOutputFormat code finds files in the expected location. Wondering if Flink should always be using version 2 of the algorithm, as that’s more performant when there are a lot of results (which is why it was added). {quote} > finalizeOnMaster takes too much time and client timeouts > > > Key: FLINK-19069 > URL: https://issues.apache.org/jira/browse/FLINK-19069 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Affects Versions: 1.9.0 >Reporter: Jiayi Liao >Priority: Major > > Currently we execute {{finalizeOnMaster}} in JM's main thread, which may > stuck the JM for a very long time and client timeouts eventually. > For example, we'd like to write data to HDFS and commit files on JM, which > takes more than ten minutes to commit tens of thousands files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
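For reference, the committer algorithm discussed in the quote is selected through the standard Hadoop property `mapreduce.fileoutputcommitter.algorithm.version` (honored by FileOutputCommitter since Hadoop 2.7); whether Flink should default to version 2 is exactly the open question raised above. A configuration fragment, e.g. in mapred-site.xml:

```xml
<!-- Algorithm 2 merges task output into the final location at task commit,
     avoiding the sequential "merge paths" pass at job commit time. -->
<property>
  <name>mapreduce.fileoutputcommitter.algorithm.version</name>
  <value>2</value>
</property>
```

Note the trade-off: version 2 is faster for jobs with many output files, but task commit becomes non-atomic with respect to job failure, which is why Hadoop kept version 1 as the default.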
[GitHub] [flink] flinkbot edited a comment on pull request #13263: [FLINK-17273][runtime] ActiveResourceManager closes task manager connection on worker terminated.
flinkbot edited a comment on pull request #13263: URL: https://github.com/apache/flink/pull/13263#issuecomment-681892463 ## CI report: * 990aa0f33e2fa93d40ea011082fedc6eeb37be44 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5922) * 4f0f42441fadc0961dbdecc9f1cf5ced660387d4 UNKNOWN
[jira] [Updated] (FLINK-16824) FLIP-132 Temporal Table DDL and Temporal Table Join
[ https://issues.apache.org/jira/browse/FLINK-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu updated FLINK-16824: --- Summary: FLIP-132 Temporal Table DDL and Temporal Table Join (was: Creating Temporal Table Function via DDL) > FLIP-132 Temporal Table DDL and Temporal Table Join > --- > > Key: FLINK-16824 > URL: https://issues.apache.org/jira/browse/FLINK-16824 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Konstantin Knauf >Assignee: Leonard Xu >Priority: Major > Fix For: 1.12.0 > > > Currently, a Temporal Table Function can only be created via the Table API or > indirectly via the configuration file of the SQL Client. > It would be great, if this was also possible in pure SQL via a DDL statement. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19063) Support join late event from dimension table side in temporal table join
[ https://issues.apache.org/jira/browse/FLINK-19063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Xu updated FLINK-19063: --- Description: To join late event from dimension table side in temporal table join is a common user case from user-zh mail list[1][2]. And another similar user case is how to enable the faster stream to wait the slower stream in regular stream join[3]. I think we can discuss how to support these user cases. [1][http://apache-flink.147419.n8.nabble.com/Flink-join-td6563.html] [2][http://apache-flink.147419.n8.nabble.com/flinksql-mysql-td3584.html#a3585] [3][http://apache-flink.147419.n8.nabble.com/Flink-sql-td4435.html#a4436] was: To join late event from dimension table side in temporal table join is a common user case from user-zh mail list[1][3]. And another similar user case is how to enable the faster stream to wait the slower stream in regular stream join[3]. I think we can discuss how to support these user cases. [1][http://apache-flink.147419.n8.nabble.com/Flink-join-td6563.html] [2][http://apache-flink.147419.n8.nabble.com/flinksql-mysql-td3584.html#a3585] [3][http://apache-flink.147419.n8.nabble.com/Flink-sql-td4435.html#a4436] > Support join late event from dimension table side in temporal table join > - > > Key: FLINK-19063 > URL: https://issues.apache.org/jira/browse/FLINK-19063 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Leonard Xu >Priority: Major > > To join late event from dimension table side in temporal table join is a > common user case > from user-zh mail list[1][2]. > And another similar user case is how to enable the faster stream to wait the > slower stream in regular stream join[3]. > I think we can discuss how to support these user cases. 
> > > [1][http://apache-flink.147419.n8.nabble.com/Flink-join-td6563.html] > [2][http://apache-flink.147419.n8.nabble.com/flinksql-mysql-td3584.html#a3585] > [3][http://apache-flink.147419.n8.nabble.com/Flink-sql-td4435.html#a4436] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186221#comment-17186221 ] ShenDa edited comment on FLINK-19005 at 8/28/20, 2:57 AM: -- [~chesnay] Thanks for your detailed instruction. But I still think there's maybe something wrong in Flink. I find that the JdbcInputFormat & JdbcOutputFormat are the key reason causing the Metaspace OOM, because java.sql.DriverManager doesn't release the reference to the Driver. The DriverManager is loaded by java.internal.ClassLoader but the driver is loaded by the ChildFirstClassLoader, which means the ChildFirstClassLoader can't be garbage collected, according to analysis of the dump file. The following code is what I used to reproduce the issue, with org.postgresql.Driver as the JDBC driver. {code:java} public static void main(String[] args) throws Exception { EnvironmentSettings envSettings = EnvironmentSettings.newInstance() .useBlinkPlanner() .inBatchMode() .build(); TableEnvironment tEnv = TableEnvironment.create(envSettings); tEnv.executeSql( "CREATE TABLE " + INPUT_TABLE + "(" + "id BIGINT," + "timestamp6_col TIMESTAMP(6)," + "timestamp9_col TIMESTAMP(6)," + "time_col TIME," + "real_col FLOAT," + "decimal_col DECIMAL(10, 4)" + ") WITH (" + " 'connector.type'='jdbc'," + " 'connector.url'='" + DB_URL + "'," + " 'connector.table'='" + INPUT_TABLE + "'," + " 'connector.USERNAME'='" + USERNAME + "'," + " 'connector.PASSWORD'='" + PASSWORD + "'" + ")" ); TableResult tableResult = tEnv.executeSql("SELECT timestamp6_col, decimal_col FROM " + INPUT_TABLE); tableResult.collect(); } {code} The diagram below shows the Metaspace usage constantly growing, until the TaskManager finally goes offline. !origin-jdbc-inputformat.png! Additionally, I tried to fix this issue by appending the following code to the function closeInputFormat(), which finally triggers garbage collection of the Metaspace.
{code:java} try { final Enumeration drivers = DriverManager.getDrivers(); while (drivers.hasMoreElements()) { DriverManager.deregisterDriver(drivers.nextElement()); } } catch (SQLException se) { LOG.info("Inputformat couldn't be closed - " + se.getMessage()); } {code} The following diagram shows that the Metaspace usage decreases. !modified-jdbc-inputformat.png! So, do you think this is a Flink problem, and should we create a new issue to fix it? was (Author: dadashen): [~chesnay] Thanks for your detailed instruction. But I still think there's maybe something wrong in Flink. I find that the JdbcInputFormat & JdbcOutputFormat are the key reason causing the Metaspace OOM, because java.sql.DriverManager doesn't release the reference to the Driver. The DriverManager is loaded by java.internal.ClassLoader but the driver is loaded by the ChildFirstClassLoader, which means the ChildFirstClassLoader can't be garbage collected, according to analysis of the dump file. The following code is what I used to reproduce the issue, with org.postgresql.Driver as the JDBC driver. {code:java} public static void main(String[] args) throws Exception { EnvironmentSettings envSettings = EnvironmentSettings.newInstance() .useBlinkPlanner() .inBatchMode() .build(); TableEnvironment tEnv = TableEnvironment.create(envSettings); tEnv.executeSql( "CREATE TABLE " + INPUT_TABLE + "(" + "id BIGINT," + "timestamp6_col TIMESTAMP(6)," + "timestamp9_col TIMESTAMP(6)," + "time_col TIME," + "real_col FLOAT," + "decimal_col DECIMAL(10, 4)" + ") WITH (" + " 'connector.type'='jdbc'," + " 'connector.url'='" + DB_URL + "'," + " 'connector.table'='" + INPUT_TABLE + "'," +
[GitHub] [flink] xintongsong commented on pull request #13263: [FLINK-17273][runtime] ActiveResourceManager closes task manager connection on worker terminated.
xintongsong commented on pull request #13263: URL: https://github.com/apache/flink/pull/13263#issuecomment-682296119 Thanks for the review, @tillrohrmann
[jira] [Commented] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186224#comment-17186224 ] Nicholas Jiang commented on FLINK-19064: [~jark] I guess [~ZhuShang] means that the javadoc of the method configure() reads "Creates a \{@link Scan} object and opens the \{@link HTable} connection", which suggests the HBase connection is meant to be opened in configure(). > HBaseRowDataInputFormat is leaking resources > > > Key: FLINK-19064 > URL: https://issues.apache.org/jira/browse/FLINK-19064 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Critical > > {{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, > which creates a connection to HBase that is not closed again. > A user reported this problem on the user@ list: > https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186224#comment-17186224 ] Nicholas Jiang edited comment on FLINK-19064 at 8/28/20, 2:51 AM: -- [~jark] I guess [~ZhuShang] means that the javadoc of the method configure() reads "Creates a \{@link Scan} object and opens the \{@link HTable} connection", which suggests the HBase connection is meant to be opened in configure(). was (Author: nicholasjiang): [~jark] I guess [~ZhuShang] means that the javadoc of the method configure() reads "Creates a \{@link Scan} object and opens the \{@link HTable} connection", which suggests the HBase connection is meant to be opened in configure(). > HBaseRowDataInputFormat is leaking resources > > > Key: FLINK-19064 > URL: https://issues.apache.org/jira/browse/FLINK-19064 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Critical > > {{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, > which creates a connection to HBase that is not closed again. > A user reported this problem on the user@ list: > https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186221#comment-17186221 ] ShenDa commented on FLINK-19005: [~chesnay] Thanks for your detailed instruction. But I still think there's maybe something wrong in Flink. I find that the JdbcInputFormat & JdbcOutputFormat are the key reason causing the Metaspace OOM, because java.sql.DriverManager doesn't release the reference to the Driver. The DriverManager is loaded by java.internal.ClassLoader but the driver is loaded by the ChildFirstClassLoader, which means the ChildFirstClassLoader can't be garbage collected, according to analysis of the dump file. The following code is what I used to reproduce the issue, with org.postgresql.Driver as the JDBC driver. {code:java} public static void main(String[] args) throws Exception { EnvironmentSettings envSettings = EnvironmentSettings.newInstance() .useBlinkPlanner() .inBatchMode() .build(); TableEnvironment tEnv = TableEnvironment.create(envSettings); tEnv.executeSql( "CREATE TABLE " + INPUT_TABLE + "(" + "id BIGINT," + "timestamp6_col TIMESTAMP(6)," + "timestamp9_col TIMESTAMP(6)," + "time_col TIME," + "real_col FLOAT," + "decimal_col DECIMAL(10, 4)" + ") WITH (" + " 'connector.type'='jdbc'," + " 'connector.url'='" + DB_URL + "'," + " 'connector.table'='" + INPUT_TABLE + "'," + " 'connector.USERNAME'='" + USERNAME + "'," + " 'connector.PASSWORD'='" + PASSWORD + "'" + ")" ); TableResult tableResult = tEnv.executeSql("SELECT timestamp6_col, decimal_col FROM " + INPUT_TABLE); tableResult.collect(); } {code} And the diagram below shows the Metaspace usage constantly growing, until the TaskManager finally goes offline. !origin-jdbc-inputformat.png! Additionally, I tried to fix this issue by appending the following code to the function closeInputFormat(), which finally triggers garbage collection of the Metaspace.
{code:java} try{ final Enumeration drivers = DriverManager.getDrivers(); while (drivers.hasMoreElements()) { DriverManager.deregisterDriver(drivers.nextElement()); } } catch (SQLException se) { LOG.info("Inputformat couldn't be closed - " + se.getMessage()); } {code} The following diagram shows the usage of Metaspace will be decreased. !modified-jdbc-inputformat.png! > used metaspace grow on every execution > -- > > Key: FLINK-19005 > URL: https://issues.apache.org/jira/browse/FLINK-19005 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission, Runtime / Configuration, > Runtime / Coordination >Affects Versions: 1.11.1 >Reporter: Guillermo Sánchez >Assignee: Chesnay Schepler >Priority: Major > Attachments: heap_dump_after_10_executions.zip, > heap_dump_after_1_execution.zip, heap_dump_echo_lee.tar.xz, > modified-jdbc-inputformat.png, origin-jdbc-inputformat.png > > > Hi ! > Im running a 1.11.1 flink cluster, where I execute batch jobs made with > DataSet API. > I submit these jobs every day to calculate daily data. > In every execution, cluster's used metaspace increase by 7MB and its never > released. > This ends up with an OutOfMemoryError caused by Metaspace every 15 days and i > need to restart the cluster to clean the metaspace > taskmanager.memory.jvm-metaspace.size is set to 512mb > Any idea of what could be causing this metaspace grow and why is it not > released ? > > > === Summary == > > Case 1, reported by [~gestevez]: > * Flink 1.11.1 > * Java 11 > * Maximum Metaspace size set to 512mb > * Custom Batch job, submitted daily > * Requires restart every 15 days after an OOM > Case 2, reported by [~Echo Lee]: > * Flink 1.11.0 > * Java 11 > * G1GC > * WordCount Batch job, submitted every second / every 5 minutes > * eventually fails TaskExecutor with OOM > Case 3, reported by [~DaDaShen] > * Flink 1.11.0 > * Java 11 > * WordCount Batch job, submitted every 5 seconds > * growing Metaspace, eventually OOM > -- This message was sent by Atlassian
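The cleanup proposed in the comment above can be exercised in isolation with only the JDK: DriverManager.getDrivers() returns just the drivers registered by and visible to the calling classloader, so deregistering them removes the GC roots that pin a child-first user-code classloader. A self-contained sketch (class and method names are illustrative, not Flink code):

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Collections;
import java.util.Enumeration;

public class DriverCleanup {
    // Deregister every JDBC driver visible to the calling classloader, so
    // DriverManager no longer holds a strong reference into user code.
    public static int deregisterVisibleDrivers() throws SQLException {
        int count = 0;
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            DriverManager.deregisterDriver(drivers.nextElement());
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws SQLException {
        deregisterVisibleDrivers();
        // After cleanup, DriverManager reports no drivers for this classloader.
        System.out.println(Collections.list(DriverManager.getDrivers()).size()); // 0
    }
}
```

Note that deregisterDriver only succeeds for drivers registered from the caller's own classloader, which is exactly the situation in closeInputFormat(): the format and the driver were both loaded by the same child-first classloader.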
[jira] [Updated] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-19005: --- Attachment: modified-jdbc-inputformat.png > used metaspace grow on every execution > -- > > Key: FLINK-19005 > URL: https://issues.apache.org/jira/browse/FLINK-19005 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission, Runtime / Configuration, > Runtime / Coordination >Affects Versions: 1.11.1 >Reporter: Guillermo Sánchez >Assignee: Chesnay Schepler >Priority: Major > Attachments: heap_dump_after_10_executions.zip, > heap_dump_after_1_execution.zip, heap_dump_echo_lee.tar.xz, > modified-jdbc-inputformat.png, origin-jdbc-inputformat.png > > > Hi ! > Im running a 1.11.1 flink cluster, where I execute batch jobs made with > DataSet API. > I submit these jobs every day to calculate daily data. > In every execution, cluster's used metaspace increase by 7MB and its never > released. > This ends up with an OutOfMemoryError caused by Metaspace every 15 days and i > need to restart the cluster to clean the metaspace > taskmanager.memory.jvm-metaspace.size is set to 512mb > Any idea of what could be causing this metaspace grow and why is it not > released ? > > > === Summary == > > Case 1, reported by [~gestevez]: > * Flink 1.11.1 > * Java 11 > * Maximum Metaspace size set to 512mb > * Custom Batch job, submitted daily > * Requires restart every 15 days after an OOM > Case 2, reported by [~Echo Lee]: > * Flink 1.11.0 > * Java 11 > * G1GC > * WordCount Batch job, submitted every second / every 5 minutes > * eventually fails TaskExecutor with OOM > Case 3, reported by [~DaDaShen] > * Flink 1.11.0 > * Java 11 > * WordCount Batch job, submitted every 5 seconds > * growing Metaspace, eventually OOM > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19069) finalizeOnMaster takes too much time and client timeouts
Jiayi Liao created FLINK-19069: -- Summary: finalizeOnMaster takes too much time and client timeouts Key: FLINK-19069 URL: https://issues.apache.org/jira/browse/FLINK-19069 Project: Flink Issue Type: Improvement Components: Runtime / Task Affects Versions: 1.9.0 Reporter: Jiayi Liao Currently we execute {{finalizeOnMaster}} in the JM's main thread, which can block the JM for a very long time until the client eventually times out. For example, we'd like to write data to HDFS and commit the files on the JM, which takes more than ten minutes to commit tens of thousands of files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
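The fix direction implied by this report — moving a slow finalization hook off the main thread — can be sketched with plain `java.util.concurrent` primitives. This is a hypothetical illustration, not the actual JobMaster code: `finalizeAsync` and the executor wiring are made-up names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncFinalize {

    /** Submits the finalization hook to an I/O executor and returns immediately with a future. */
    public static CompletableFuture<Void> finalizeAsync(Runnable finalizeOnMaster, Executor ioExecutor) {
        return CompletableFuture.runAsync(finalizeOnMaster, ioExecutor);
    }

    public static void main(String[] args) {
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
        // The slow hook (a stand-in for committing many HDFS files) runs on the
        // I/O pool; the caller stays free to serve RPCs and reacts on completion.
        CompletableFuture<Void> done = finalizeAsync(
                () -> System.out.println("committing files on the I/O thread"), ioExecutor);
        done.join();  // demo only; a JobManager would attach a non-blocking callback instead
        System.out.println("finalization finished");
        ioExecutor.shutdown();
    }
}
```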
[GitHub] [flink] flinkbot edited a comment on pull request #13269: [FLINK-18900][table] Revert the modification of Catalog.listPartitions
flinkbot edited a comment on pull request #13269: URL: https://github.com/apache/flink/pull/13269#issuecomment-682288332 ## CI report: * c967e9837780e4ac82c668845bc35d1a38c47ee8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5945) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186215#comment-17186215 ] Jark Wu commented on FLINK-19064: - Hi [~ZhuShang], why should the connection be created in {{configure()}}? > HBaseRowDataInputFormat is leaking resources > > > Key: FLINK-19064 > URL: https://issues.apache.org/jira/browse/FLINK-19064 > Project: Flink > Issue Type: Bug > Components: Connectors / HBase >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Nicholas Jiang >Priority: Critical > > {{HBaseRowDataInputFormat.configure()}} is calling {{connectToTable()}}, > which creates a connection to HBase that is not closed again. > A user reported this problem on the user@ list: > https://lists.apache.org/thread.html/ra04f6996eb50ee83aabd2ad0d50bec9afb6a924bfbb48ada3269c6d8%40%3Cuser.flink.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
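The leak pattern under discussion can be shown without HBase at all. A minimal sketch with hypothetical names (`Connection` and `InputFormat` here are stand-ins, not the real Flink or HBase classes): whatever `configure()` opens, `close()` must release, and a counter makes the leak visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LeakFix {

    /** Counts connections that are currently open, to make a leak observable. */
    public static final AtomicInteger OPEN_CONNECTIONS = new AtomicInteger();

    /** Stand-in for an HBase connection; only tracks open/close. */
    public static class Connection implements AutoCloseable {
        public Connection() { OPEN_CONNECTIONS.incrementAndGet(); }
        @Override public void close() { OPEN_CONNECTIONS.decrementAndGet(); }
    }

    /** Stand-in for the input format: configure() opens, close() must release. */
    public static class InputFormat {
        private Connection connection;

        public void configure() { connection = new Connection(); }  // connectToTable() analogue

        public void close() {  // the piece the bug report says is missing
            if (connection != null) {
                connection.close();
                connection = null;
            }
        }
    }

    public static void main(String[] args) {
        InputFormat format = new InputFormat();
        format.configure();
        format.close();
        System.out.println("open connections after close: " + OPEN_CONNECTIONS.get());
    }
}
```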
[jira] [Assigned] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-19064: --- Assignee: Nicholas Jiang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19005) used metaspace grow on every execution
[ https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShenDa updated FLINK-19005: --- Attachment: origin-jdbc-inputformat.png -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19068) Filter verbose pod events for KubernetesResourceManagerDriver
Xintong Song created FLINK-19068: Summary: Filter verbose pod events for KubernetesResourceManagerDriver Key: FLINK-19068 URL: https://issues.apache.org/jira/browse/FLINK-19068 Project: Flink Issue Type: Improvement Components: Deployment / Kubernetes Reporter: Xintong Song The status of a Kubernetes pod consists of many detailed fields. Currently, Flink receives a pod {{MODIFIED}} event from the {{KubernetesPodsWatcher}} on every single change to these fields, many of which Flink does not care about. The verbose events do not affect the functionality of Flink, but they pollute the logs with repeated messages, because Flink only looks into the fields it is interested in, and those fields are identical. E.g., when a task manager is stopped due to idle timeout, Flink receives 3 events: * MODIFIED: container terminated * MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a Kubernetes-internal status change after containers are gracefully terminated * DELETED: Flink removes the metadata of the terminated pod Among the 3 messages, Flink is only interested in the 1st MODIFIED message, but it will try to process all of them because the container status is terminated. I propose to filter the verbose events in {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}} and only process the status changes Flink is interested in. This probably requires recording the status of all living pods, to compare against incoming events for detecting status changes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
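The proposed bookkeeping — cache the last seen status per living pod and forward an event only when a field of interest actually changed — can be sketched in a few lines. All names here are hypothetical stand-ins, not the real KubernetesResourceManagerDriver API.

```java
import java.util.HashMap;
import java.util.Map;

public class PodEventFilter {

    /** Reduction of a pod status to the single field of interest in this sketch. */
    public static class PodStatus {
        public final String podName;
        public final String containerState;
        public PodStatus(String podName, String containerState) {
            this.podName = podName;
            this.containerState = containerState;
        }
    }

    // Last seen container state per living pod.
    private final Map<String, String> lastContainerState = new HashMap<>();

    /** Returns true only for a pod's first event or a real container-state change. */
    public boolean shouldProcess(PodStatus incoming) {
        String previous = lastContainerState.put(incoming.podName, incoming.containerState);
        return previous == null || !previous.equals(incoming.containerState);
    }

    public static void main(String[] args) {
        PodEventFilter filter = new PodEventFilter();
        // First MODIFIED event (container terminated): processed.
        System.out.println(filter.shouldProcess(new PodStatus("tm-1", "Terminated")));
        // Second MODIFIED event (only an uninteresting field such as
        // deletionGracePeriodSeconds changed): dropped.
        System.out.println(filter.shouldProcess(new PodStatus("tm-1", "Terminated")));
    }
}
```

A real implementation would also have to evict entries for DELETED pods so the cache tracks only living pods, as the ticket suggests.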
[GitHub] [flink] flinkbot commented on pull request #13269: [FLINK-18900][table] Revert the modification of Catalog.listPartitions
flinkbot commented on pull request #13269: URL: https://github.com/apache/flink/pull/13269#issuecomment-682288332 ## CI report: * c967e9837780e4ac82c668845bc35d1a38c47ee8 UNKNOWN This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13268: [FLINK-18988][hotfix][doc] Fix Flink Kafka Connector Dependency Error
flinkbot edited a comment on pull request #13268: URL: https://github.com/apache/flink/pull/13268#issuecomment-682283788 ## CI report: * 5357812df45c42860e95630891367a410f6bc34b Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5944) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-15719) Exceptions when using scala types directly with the State Process API
[ https://issues.apache.org/jira/browse/FLINK-15719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186212#comment-17186212 ] Ying Z commented on FLINK-15719: I opened a pull request here, [https://github.com/apache/flink/pull/13266]. Could someone do a review? Thanks. > Exceptions when using scala types directly with the State Process API > - > > Key: FLINK-15719 > URL: https://issues.apache.org/jira/browse/FLINK-15719 > Project: Flink > Issue Type: Bug > Components: API / State Processor >Affects Versions: 1.9.1 >Reporter: Ying Z >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > > I followed these steps to generate and read states: > # implement the example[1] `CountWindowAverage` in Scala (exactly the same), and run jobA => that works. > # execute `flink cancel -s ${JobID}` => savepoints were generated as expected. > # implement the example[2] `StatefulFunctionWithTime` in Scala (code below), and run jobB => failed; the exception shows "Caused by: org.apache.flink.util.StateMigrationException: The new key serializer must be compatible." > ReaderFunction code as below:
> {code:java}
> class ReaderFunction extends KeyedStateReaderFunction[Long, (Long, Long)] {
>   var countState: ValueState[(Long, Long)] = _
>
>   override def open(parameters: Configuration): Unit = {
>     val stateDescriptor = new ValueStateDescriptor("average", createTypeInformation[(Long, Long)])
>     countState = getRuntimeContext().getState(stateDescriptor)
>   }
>
>   override def readKey(key: Long, ctx: KeyedStateReaderFunction.Context, out: Collector[(Long, Long)]): Unit = {
>     out.collect(countState.value())
>   }
> }
> {code}
> 1: [https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/state/state.html#using-managed-keyed-state]
> 2: [https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/libs/state_processor_api.html#keyed-state]
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-19060) Checkpoint not triggered when use broadcast stream
[ https://issues.apache.org/jira/browse/FLINK-19060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] henvealf closed FLINK-19060. Resolution: Not A Bug > Checkpoint not triggered when use broadcast stream > -- > > Key: FLINK-19060 > URL: https://issues.apache.org/jira/browse/FLINK-19060 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.11.1 >Reporter: henvealf >Priority: Major > Attachments: image-2020-08-27-16-41-23-699.png, image-2020-08-27-16-44-37-442.png, image-2020-08-27-16-45-28-134.png, image-2020-08-27-16-51-10-512.png > > Original Estimate: 1h > Remaining Estimate: 1h > > Code: > !image-2020-08-27-16-51-10-512.png! > KafkaSourceConfig: > consumer.setStartFromGroupOffsets() > Web UI: > !image-2020-08-27-16-45-28-134.png! > The checkpoint is never triggered. Did I write something wrong? > Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #13269: [FLINK-18900][table] Revert the modification of Catalog.listPartitions
flinkbot commented on pull request #13269: URL: https://github.com/apache/flink/pull/13269#issuecomment-682286516 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit c967e9837780e4ac82c668845bc35d1a38c47ee8 (Fri Aug 28 02:17:35 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingsongLi opened a new pull request #13269: [FLINK-18900][table] Revert the modification of Catalog.listPartitions
JingsongLi opened a new pull request #13269: URL: https://github.com/apache/flink/pull/13269 Revert the modification of Catalog.listPartitions for version compatibility. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18900) HiveCatalog should error out when listing partitions with an invalid spec
[ https://issues.apache.org/jira/browse/FLINK-18900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186205#comment-17186205 ] Rui Li commented on FLINK-18900: Perhaps we can revert the change in 1.11, given that 1.11 doesn't support SHOW PARTITIONS. > HiveCatalog should error out when listing partitions with an invalid spec > - > > Key: FLINK-18900 > URL: https://issues.apache.org/jira/browse/FLINK-18900 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.1 >Reporter: Rui Li >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.2 > > > Take the following case as an example: > {code} > create table tbl (x int) partitioned by (p int); > alter table tbl add partition (p=1); > {code} > If we list partitions with partition spec {{foo=1}}, HiveCatalog returns > partition {{p=1}}, which is wrong. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-18900) HiveCatalog should error out when listing partitions with an invalid spec
[ https://issues.apache.org/jira/browse/FLINK-18900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186204#comment-17186204 ] Jingsong Lee edited comment on FLINK-18900 at 8/28/20, 2:08 AM: We should not modify the method to throw an exception for {{Catalog.listPartitions}} in release-1.11. was (Author: lzljs3620320): We should not modify the method to throw an exception for {{Catalog.listPartitions}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18900) HiveCatalog should error out when listing partitions with an invalid spec
[ https://issues.apache.org/jira/browse/FLINK-18900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186204#comment-17186204 ] Jingsong Lee commented on FLINK-18900: -- We should not modify the method to throw an exception for {{Catalog.listPartitions}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (FLINK-18900) HiveCatalog should error out when listing partitions with an invalid spec
[ https://issues.apache.org/jira/browse/FLINK-18900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee reopened FLINK-18900: -- -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #13268: [FLINK-18988][hotfix][doc] Fix Flink Kafka Connector Dependency Error
flinkbot commented on pull request #13268: URL: https://github.com/apache/flink/pull/13268#issuecomment-682283788 ## CI report: * 5357812df45c42860e95630891367a410f6bc34b UNKNOWN This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-19067) FileNotFoundException when run flink examples on standby JobManager
JieFang.He created FLINK-19067: -- Summary: FileNotFoundException when run flink examples on standby JobManager Key: FLINK-19067 URL: https://issues.apache.org/jira/browse/FLINK-19067 Project: Flink Issue Type: Bug Affects Versions: 1.11.1 Reporter: JieFang.He
1. When running examples/batch/WordCount.jar on the standby JobManager, it fails with the exception:
Caused by: java.io.FileNotFoundException: /data2/storageDir/default/blob/job_d29414828f614d5466e239be4d3889ac/blob_p-a2ebe1c5aa160595f214b4bd0f39d80e42ee2e93-f458f1c12dc023e78d25f191de1d7c4b (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.&lt;init&gt;(FileInputStream.java:138)
at org.apache.flink.core.fs.local.LocalDataInputStream.&lt;init&gt;(LocalDataInputStream.java:50)
at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:143)
at org.apache.flink.runtime.blob.FileSystemBlobStore.get(FileSystemBlobStore.java:105)
at org.apache.flink.runtime.blob.FileSystemBlobStore.get(FileSystemBlobStore.java:87)
at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:501)
at org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)
2. Running the examples on other nodes succeeds.
3. After a successful run on another node, the job also succeeds on the standby JobManager, but running it again fails.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #13268: [FLINK-18988][hotfix][doc] Fix Flink Kafka Connector Dependency Error
flinkbot commented on pull request #13268: URL: https://github.com/apache/flink/pull/13268#issuecomment-682280167 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 5357812df45c42860e95630891367a410f6bc34b (Fri Aug 28 01:53:11 UTC 2020) **Warnings:** * Documentation files were touched, but no `.zh.md` files: Update Chinese documentation or file Jira ticket. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-18988) Continuous query with LATERAL and LIMIT produces wrong result
[ https://issues.apache.org/jira/browse/FLINK-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-18988: --- Labels: pull-request-available (was: ) > Continuous query with LATERAL and LIMIT produces wrong result > - > > Key: FLINK-18988 > URL: https://issues.apache.org/jira/browse/FLINK-18988 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.11.1 >Reporter: Fabian Hueske >Assignee: Danny Chen >Priority: Critical > Labels: pull-request-available > > I was trying out the example queries provided in this blog post: [https://materialize.io/lateral-joins-and-demand-driven-queries/] to check if Flink supports the same and found that the queries were translated and executed but produced the wrong result. > I used the SQL Client and Kafka (running at kafka:9092) to store the table data. I executed the following statements:
> {code:java}
> -- create cities table
> CREATE TABLE cities (
>   name STRING NOT NULL,
>   state STRING NOT NULL,
>   pop INT NOT NULL
> ) WITH (
>   'connector' = 'kafka',
>   'topic' = 'cities',
>   'properties.bootstrap.servers' = 'kafka:9092',
>   'properties.group.id' = 'mygroup',
>   'scan.startup.mode' = 'earliest-offset',
>   'format' = 'json'
> );
>
> -- fill cities table
> INSERT INTO cities VALUES
>   ('Los_Angeles', 'CA', 3979576),
>   ('Phoenix', 'AZ', 1680992),
>   ('Houston', 'TX', 2320268),
>   ('San_Diego', 'CA', 1423851),
>   ('San_Francisco', 'CA', 881549),
>   ('New_York', 'NY', 8336817),
>   ('Dallas', 'TX', 1343573),
>   ('San_Antonio', 'TX', 1547253),
>   ('San_Jose', 'CA', 1021795),
>   ('Chicago', 'IL', 2695598),
>   ('Austin', 'TX', 978908);
>
> -- execute query
> SELECT state, name
> FROM
>   (SELECT DISTINCT state FROM cities) states,
>   LATERAL (
>     SELECT name, pop
>     FROM cities
>     WHERE state = states.state
>     ORDER BY pop DESC
>     LIMIT 3
>   );
>
> -- result
> state  name
> CA     Los_Angeles
> NY     New_York
> IL     Chicago
>
> -- expected result
> state | name
> ------+-------------
> TX    | Dallas
> AZ    | Phoenix
> IL    | Chicago
> TX    | Houston
> CA    | San_Jose
> NY    | New_York
> CA    | San_Diego
> CA    | Los_Angeles
> TX    | San_Antonio
> {code}
> As you can see from the query result, Flink computes the top 3 cities over all states, not for each state individually. Hence, I assume that this is a bug in the query optimizer or one of the rewriting rules. > There are two valid ways to solve this issue: > * Fixing the rewriting rules / optimizer (obviously preferred) > * Disabling this feature and throwing an exception -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] Fin-chan opened a new pull request #13268: [FLINK-18988][hotfix][doc] Fix Flink Kafka Connector Dependency Error
Fin-chan opened a new pull request #13268: URL: https://github.com/apache/flink/pull/13268 ## What is the purpose of the change Fix the Flink Kafka connector dependency error in the docs. ## Brief change log Example: `flink-connector-kafka-011{{ site.scala_version_suffix }}` is changed to `flink-connector-kafka-0.11{{ site.scala_version_suffix }}`. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-17274) Maven: Premature end of Content-Length delimited message body
[ https://issues.apache.org/jira/browse/FLINK-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186198#comment-17186198 ] Dian Fu commented on FLINK-17274: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5942=logs=8fd975ef-f478-511d-4997-6f15fe8a1fd3=ac0fa443-5d45-5a6b-3597-0310ecc1d2ab > Maven: Premature end of Content-Length delimited message body > - > > Key: FLINK-17274 > URL: https://issues.apache.org/jira/browse/FLINK-17274 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Critical > Labels: test-stability > Fix For: 1.12.0 > > > CI: > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=7786=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb > {code} > [ERROR] Failed to execute goal on project > flink-connector-elasticsearch7_2.11: Could not resolve dependencies for > project > org.apache.flink:flink-connector-elasticsearch7_2.11:jar:1.11-SNAPSHOT: Could > not transfer artifact org.apache.lucene:lucene-sandbox:jar:8.3.0 from/to > alicloud-mvn-mirror > (http://mavenmirror.alicloud.dak8s.net:/repository/maven-central/): GET > request of: org/apache/lucene/lucene-sandbox/8.3.0/lucene-sandbox-8.3.0.jar > from alicloud-mvn-mirror failed: Premature end of Content-Length delimited > message body (expected: 289920; received: 239832 -> [Help 1] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19066) InnerJoinITCase.testBigForSpill failed with "java.lang.ClassCastException: org.apache.flink.table.runtime.util.ResettableExternalBuffer$BufferIterator cannot be cast to
[ https://issues.apache.org/jira/browse/FLINK-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-19066: Labels: test-stability (was: ) > InnerJoinITCase.testBigForSpill failed with "java.lang.ClassCastException: > org.apache.flink.table.runtime.util.ResettableExternalBuffer$BufferIterator > cannot be cast to org.apache.flink.table.data.binary.BinaryRowData" > --- > > Key: FLINK-19066 > URL: https://issues.apache.org/jira/browse/FLINK-19066 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Dian Fu >Priority: Major > Labels: test-stability > Fix For: 1.12.0 > > > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5942=logs=904e5037-64c0-5f69-f6d5-e21b89cf6484=39857031-7f0c-5fd5-d730-a19c5794f839] > {code} > Caused by: java.lang.ClassCastException: > org.apache.flink.table.runtime.util.ResettableExternalBuffer$BufferIterator > cannot be cast to org.apache.flink.table.data.binary.BinaryRowData > at > org.apache.flink.table.runtime.util.ResettableExternalBuffer$InMemoryBuffer$InMemoryBufferIterator.next(ResettableExternalBuffer.java:678) > at > org.apache.flink.table.runtime.util.ResettableExternalBuffer$InMemoryBuffer$InMemoryBufferIterator.next(ResettableExternalBuffer.java:650) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19066) InnerJoinITCase.testBigForSpill failed with "java.lang.ClassCastException: org.apache.flink.table.runtime.util.ResettableExternalBuffer$BufferIterator cannot be cast to
Dian Fu created FLINK-19066: --- Summary: InnerJoinITCase.testBigForSpill failed with "java.lang.ClassCastException: org.apache.flink.table.runtime.util.ResettableExternalBuffer$BufferIterator cannot be cast to org.apache.flink.table.data.binary.BinaryRowData" Key: FLINK-19066 URL: https://issues.apache.org/jira/browse/FLINK-19066 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.12.0 Reporter: Dian Fu Fix For: 1.12.0 [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5942=logs=904e5037-64c0-5f69-f6d5-e21b89cf6484=39857031-7f0c-5fd5-d730-a19c5794f839] {code} Caused by: java.lang.ClassCastException: org.apache.flink.table.runtime.util.ResettableExternalBuffer$BufferIterator cannot be cast to org.apache.flink.table.data.binary.BinaryRowData at org.apache.flink.table.runtime.util.ResettableExternalBuffer$InMemoryBuffer$InMemoryBufferIterator.next(ResettableExternalBuffer.java:678) at org.apache.flink.table.runtime.util.ResettableExternalBuffer$InMemoryBuffer$InMemoryBufferIterator.next(ResettableExternalBuffer.java:650) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19064) HBaseRowDataInputFormat is leaking resources
[ https://issues.apache.org/jira/browse/FLINK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186197#comment-17186197 ] Nicholas Jiang commented on FLINK-19064: [~jark] Yes of course. Could you assign this to me? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13267: [FLINK-19012][task] Check AsyncCheckpointRunnable status before throwing an exception
flinkbot edited a comment on pull request #13267: URL: https://github.com/apache/flink/pull/13267#issuecomment-682125254 ## CI report: * 9d1f76cbf475a5ea187dc84bd0c54a936c7a87c6 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5941) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13181: [FLINK-18957] Implement logical request bulk tracking in SlotSharingExecutionSlotAllocator
flinkbot edited a comment on pull request #13181: URL: https://github.com/apache/flink/pull/13181#issuecomment-675091412 ## CI report: * e7553689356882de1ffe606400d1255d1d757bc4 UNKNOWN * 18f88af3b438b13e9a240efd2b4979f841d2b978 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5939) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13265: [FLINK-19055] Wait less time for all memory GC in tests (MemoryManager#verifyEmpty)
flinkbot edited a comment on pull request #13265: URL: https://github.com/apache/flink/pull/13265#issuecomment-682059106 ## CI report: * af69156ea6dbc7513cb7dde023cdf8cc454bec31 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5936) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13234: [FLINK-18905][task] Allow SourceOperator chaining with MultipleInputStreamTask
flinkbot edited a comment on pull request #13234: URL: https://github.com/apache/flink/pull/13234#issuecomment-679243226 ## CI report: * e5a866429bf22eb6aeeb26733e5ab1d705780e66 UNKNOWN * 2382ecb7dcb679dddbc39f44717ed1c4c7c061cf Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5935) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13267: [FLINK-19012][task] Check AsyncCheckpointRunnable status before throwing an exception
flinkbot edited a comment on pull request #13267: URL: https://github.com/apache/flink/pull/13267#issuecomment-682125254 ## CI report: * 9d1f76cbf475a5ea187dc84bd0c54a936c7a87c6 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5941) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13267: [FLINK-19012][task] Check AsyncCheckpointRunnable status before throwing an exception
flinkbot commented on pull request #13267: URL: https://github.com/apache/flink/pull/13267#issuecomment-682125254 ## CI report: * 9d1f76cbf475a5ea187dc84bd0c54a936c7a87c6 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Reopened] (FLINK-16789) Retrieve JMXRMI information via REST API or WebUI
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reopened FLINK-16789: --- > Retrieve JMXRMI information via REST API or WebUI > - > > Key: FLINK-16789 > URL: https://issues.apache.org/jira/browse/FLINK-16789 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Currently there are no easy way to assign jmxrmi port to a running Flink job. > The typical tutorial is to add the following to both TM and JM launch env: > {code:java} > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.local.only=false > {code} > However, setting the jmxremote port to is not usually a viable solution > when Flink job is running on a shared environment (YARN / K8s / etc). > setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option > however, there's no easy way to retrieve such port assignment. We proposed to > use JMXConnectorServerFactory to explicitly establish a JMXServer inside > *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*. > With the JMXServer explicitly created, we can return the JMXRMI information > via either REST API or WebUI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
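The flow the ticket proposes can be sketched with plain JDK APIs: find a free port, bind an RMI registry on it, start a `JMXConnectorServer` via `JMXConnectorServerFactory`, and report the resulting address (the piece of information the ticket wants exposed via REST API or WebUI). This is only an illustrative sketch, not Flink's actual implementation, and the free-port probe has an inherent race:

```java
import java.lang.management.ManagementFactory;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

class JmxEphemeralPort {

    /** Builds the RMI-based JMX service URL for a registry bound on the given host/port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi://" + host + ":" + port
                + "/jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    public static void main(String[] args) throws Exception {
        // Probe for a free port (another process could grab it before we re-bind it).
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }
        LocateRegistry.createRegistry(port); // bind the RMI registry on that port
        MBeanServer mbeanServer = ManagementFactory.getPlatformMBeanServer();
        JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                new JMXServiceURL(serviceUrl("localhost", port)), null, mbeanServer);
        server.start();
        // With the server created explicitly, the address is known and can be surfaced.
        System.out.println("JMX server listening at " + server.getAddress());
        server.stop();
    }
}
```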
[jira] [Updated] (FLINK-16789) Retrieve JMXRMI information via REST API or WebUI
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-16789: -- Summary: Retrieve JMXRMI information via REST API or WebUI (was: Support JMX RMI random port assign & retrieval via JMXConnectorServer) > Retrieve JMXRMI information via REST API or WebUI > - > > Key: FLINK-16789 > URL: https://issues.apache.org/jira/browse/FLINK-16789 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Currently there are no easy way to assign jmxrmi port to a running Flink job. > The typical tutorial is to add the following to both TM and JM launch env: > {code:java} > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.local.only=false > {code} > However, setting the jmxremote port to is not usually a viable solution > when Flink job is running on a shared environment (YARN / K8s / etc). > setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option > however, there's no easy way to retrieve such port assignment. We proposed to > use JMXConnectorServerFactory to explicitly establish a JMXServer inside > *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*. > With the JMXServer explicitly created, we can return the JMXRMI information > via either REST API or WebUI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186045#comment-17186045 ] Rong Rong commented on FLINK-16789: --- oops. yes you are right [~chesnay]. the closed PR doesn't contain any WebUI / RestAPI changes. I will prepare another PR for exposing a REST API for JMX url/port retrieval. (FYI: our current approach is directly digging into the container startup log since it is printed there, that's why I forgot in the first place LOL) > Support JMX RMI random port assign & retrieval via JMXConnectorServer > - > > Key: FLINK-16789 > URL: https://issues.apache.org/jira/browse/FLINK-16789 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Currently there are no easy way to assign jmxrmi port to a running Flink job. > The typical tutorial is to add the following to both TM and JM launch env: > {code:java} > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.local.only=false > {code} > However, setting the jmxremote port to is not usually a viable solution > when Flink job is running on a shared environment (YARN / K8s / etc). > setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option > however, there's no easy way to retrieve such port assignment. We proposed to > use JMXConnectorServerFactory to explicitly establish a JMXServer inside > *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*. > With the JMXServer explicitly created, we can return the JMXRMI information > via either REST API or WebUI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13266: [FLINK-15719] fix mistake of reading state in scala lang.
flinkbot edited a comment on pull request #13266: URL: https://github.com/apache/flink/pull/13266#issuecomment-682067939 ## CI report: * e3bd4b8fcc871f1d0b6f6154515dee6268e70fd1 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5938) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13267: [FLINK-19012][task] Check AsyncCheckpointRunnable status before throwing an exception
flinkbot commented on pull request #13267: URL: https://github.com/apache/flink/pull/13267#issuecomment-682120413 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 9d1f76cbf475a5ea187dc84bd0c54a936c7a87c6 (Thu Aug 27 18:34:54 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."
[ https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-19012: --- Labels: pull-request-available test-stability (was: test-stability) > E2E test fails with "Cannot register Closeable, this > subtaskCheckpointCoordinator is already closed. Closing argument." > --- > > Key: FLINK-19012 > URL: https://issues.apache.org/jira/browse/FLINK-19012 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / Task, Tests >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Roman Khachatryan >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > Note: This error occurred in a custom branch with unreviewed changes. I don't > believe my changes affect this error, but I would keep this in mind when > investigating the error: > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307=logs=1f3ed471-1849-5d3c-a34c-19792af4ad16=0d2e35fc-a330-5cf2-a012-7267e2667b1d > > {code} > 2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO > org.apache.flink.runtime.taskmanager.Task[] - Registering > task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING]. > 2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO > org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state > backend has been configured, using default (Memory / JobManager) > MemoryStateBackend (data in heap memory / checkpoints to JobManager) > (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: > 5242880) > 2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO > org.apache.flink.runtime.taskmanager.Task[] - Source: > Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING. 
> 2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO > org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge > [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ... > 2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO > org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge > [] - Elasticsearch RestHighLevelClient is connected to > [http://127.0.0.1:9200] > 2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO > org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl > [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 > drained requests > 2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO > org.apache.flink.runtime.taskmanager.Task[] - Source: > Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED. > 2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO > org.apache.flink.runtime.taskmanager.Task[] - Freeing > task resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0). > 2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - > Un-registering task and sending final execution state FINISHED to JobManager > for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > cbc357ccb763df2852fee8c4fc7d55f2_0_0. > 2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO > org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: > Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of > checkpoint 1 could not be completed. > 2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: > java.io.IOException: Cannot register Closeable, this > subtaskCheckpointCoordinator is already closed. Closing argument. 
> 2020-08-20T20:55:30.2418956Z at > org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468) > ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT] > 2020-08-20T20:55:30.2420100Z at > org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91) > [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT] > 2020-08-20T20:55:30.2420927Z at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_265] > 2020-08-20T20:55:30.2421455Z at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_265] > 2020-08-20T20:55:30.2421879Z at
[GitHub] [flink] rkhachatryan opened a new pull request #13267: [FLINK-19012] Check AsyncCheckpointRunnable status before throwing an exception
rkhachatryan opened a new pull request #13267: URL: https://github.com/apache/flink/pull/13267 ## What is the purpose of the change Currently, `AsyncCheckpointRunnable` throws an exception if `SubtaskCheckpointCoordinatorImpl` is closed. However, it should also check its own status as it might be a normal case. ## Verifying this change The change is covered by existing end-to-end tests which are currently failing. Unit testing would involve concurrency which I think would be overkill for essentially a logging problem. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? no This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
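The fix the PR describes boils down to consulting the runnable's own lifecycle state before escalating the "coordinator already closed" error: if the runnable was itself already closed as part of a normal shutdown, the error is benign and only worth logging. A rough sketch with hypothetical names, not the actual Flink classes:

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Sketch of the status check described in the PR (hypothetical names):
 * the async checkpoint part only escalates a "coordinator closed" failure
 * while it is still RUNNING; after close(), the same failure is treated
 * as the normal shutdown race and merely logged.
 */
class AsyncCheckpointSketch {

    enum State { RUNNING, CLOSED }

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);

    /** Called when the owning coordinator shuts the runnable down. */
    void close() {
        state.compareAndSet(State.RUNNING, State.CLOSED);
    }

    /**
     * Returns true if a "coordinator already closed" failure should be
     * escalated as a real error, false if it should only be logged.
     */
    boolean escalateCoordinatorClosed() {
        return state.get() == State.RUNNING;
    }
}
```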
[jira] [Commented] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."
[ https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186036#comment-17186036 ] Roman Khachatryan commented on FLINK-19012: --- I think the problem is that AsyncCheckpointRunnable throws an exception when it sees that SubtaskCheckpointCoordinator is closed. Upon close, SubtaskCheckpointCoordinator closes its runnables but doesn't stop their threads. This also changes their statuses. So AsyncCheckpointRunnable should check its status before throwing an exception. The tests started to fail after increasing the log level in FLINK-18962. > E2E test fails with "Cannot register Closeable, this > subtaskCheckpointCoordinator is already closed. Closing argument." > --- > > Key: FLINK-19012 > URL: https://issues.apache.org/jira/browse/FLINK-19012 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / Task, Tests >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Roman Khachatryan >Priority: Critical > Labels: test-stability > Fix For: 1.12.0 > > > Note: This error occurred in a custom branch with unreviewed changes. I don't > believe my changes affect this error, but I would keep this in mind when > investigating the error: > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307=logs=1f3ed471-1849-5d3c-a34c-19792af4ad16=0d2e35fc-a330-5cf2-a012-7267e2667b1d > > {code} > 2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO > org.apache.flink.runtime.taskmanager.Task[] - Registering > task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING]. 
> 2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO > org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state > backend has been configured, using default (Memory / JobManager) > MemoryStateBackend (data in heap memory / checkpoints to JobManager) > (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: > 5242880) > 2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO > org.apache.flink.runtime.taskmanager.Task[] - Source: > Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING. > 2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO > org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge > [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ... > 2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO > org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge > [] - Elasticsearch RestHighLevelClient is connected to > [http://127.0.0.1:9200] > 2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO > org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl > [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 > drained requests > 2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO > org.apache.flink.runtime.taskmanager.Task[] - Source: > Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED. > 2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO > org.apache.flink.runtime.taskmanager.Task[] - Freeing > task resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > (cbc357ccb763df2852fee8c4fc7d55f2_0_0). 
> 2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - > Un-registering task and sending final execution state FINISHED to JobManager > for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) > cbc357ccb763df2852fee8c4fc7d55f2_0_0. > 2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO > org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: > Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of > checkpoint 1 could not be completed. > 2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: > java.io.IOException: Cannot register Closeable, this > subtaskCheckpointCoordinator is already closed. Closing argument. > 2020-08-20T20:55:30.2418956Z at > org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468) > ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT] > 2020-08-20T20:55:30.2420100Z at > org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91) >
[jira] [Comment Edited] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186024#comment-17186024 ] Chesnay Schepler edited comment on FLINK-16789 at 8/27/20, 6:12 PM: [~rongr] What about the port retrieval? Random port assignment was only one part of this ticket, no? was (Author: zentol): [~rongr] What about the port retrieval? Random port assignment was only one part of it, no? > Support JMX RMI random port assign & retrieval via JMXConnectorServer > - > > Key: FLINK-16789 > URL: https://issues.apache.org/jira/browse/FLINK-16789 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Currently there are no easy way to assign jmxrmi port to a running Flink job. > The typical tutorial is to add the following to both TM and JM launch env: > {code:java} > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.local.only=false > {code} > However, setting the jmxremote port to is not usually a viable solution > when Flink job is running on a shared environment (YARN / K8s / etc). > setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option > however, there's no easy way to retrieve such port assignment. We proposed to > use JMXConnectorServerFactory to explicitly establish a JMXServer inside > *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*. > With the JMXServer explicitly created, we can return the JMXRMI information > via either REST API or WebUI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186024#comment-17186024 ] Chesnay Schepler commented on FLINK-16789: -- [~rongr] What about the port retrieval? Random port assignment was only one part of it, no? > Support JMX RMI random port assign & retrieval via JMXConnectorServer > - > > Key: FLINK-16789 > URL: https://issues.apache.org/jira/browse/FLINK-16789 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Currently there are no easy way to assign jmxrmi port to a running Flink job. > The typical tutorial is to add the following to both TM and JM launch env: > {code:java} > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.local.only=false > {code} > However, setting the jmxremote port to is not usually a viable solution > when Flink job is running on a shared environment (YARN / K8s / etc). > setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option > however, there's no easy way to retrieve such port assignment. We proposed to > use JMXConnectorServerFactory to explicitly establish a JMXServer inside > *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*. > With the JMXServer explicitly created, we can return the JMXRMI information > via either REST API or WebUI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13181: [FLINK-18957] Implement logical request bulk tracking in SlotSharingExecutionSlotAllocator
flinkbot edited a comment on pull request #13181: URL: https://github.com/apache/flink/pull/13181#issuecomment-675091412 ## CI report: * e7553689356882de1ffe606400d1255d1d757bc4 UNKNOWN * bce753ff7f4da2cffa295bd4007517af4c5697d8 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5926) * 18f88af3b438b13e9a240efd2b4979f841d2b978 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5939) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-19009) wrong way to calculate the "downtime" metric
[ https://issues.apache.org/jira/browse/FLINK-19009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-19009. - Release Note: The downtime metric is now calculated as described in the documentation. Resolution: Fixed Fixed via 05cbf1eaebeba281bf13a4c3bc76f7d05a31ea66 > wrong way to calculate the "downtime" metric > > > Key: FLINK-19009 > URL: https://issues.apache.org/jira/browse/FLINK-19009 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Runtime / Metrics >Affects Versions: 1.7.2, 1.8.0 >Reporter: Zhinan Cheng >Assignee: kevin liu >Priority: Trivial > Labels: pull-request-available > Fix For: 1.12.0 > > Original Estimate: 1h > Remaining Estimate: 1h > > Currently the way to calculate the Flink system metric "downtime" is not > consistent with the description in the doc: right now the downtime is actually the > current timestamp minus the timestamp when the job started. > > But the Flink doc (https://flink.apache.org/gettinghelp.html) obviously describes > the time as the current timestamp minus the timestamp when the job failed. > > I believe we should update the code for this metric as the Flink doc shows. The > easy way to solve this is to use the current timestamp minus the latest > uptime timestamp. -- This message was sent by Atlassian Jira (v8.3.4#803005)
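The corrected semantics fit in a few lines: while the job is down, report the time elapsed since the last failure (the latest transition out of RUNNING), not since job start. A hypothetical sketch of both variants, not the actual `ExecutionGraph` code:

```java
class DowntimeMetric {

    /**
     * Downtime as the Flink docs describe it: for a job that is currently
     * failing/recovering, the time elapsed since the job last failed.
     * While the job is running, downtime is 0.
     */
    static long downtimeMillis(boolean running, long nowMillis, long lastFailureMillis) {
        return running ? 0L : nowMillis - lastFailureMillis;
    }

    /** The buggy variant the ticket describes: measured from job start instead. */
    static long buggyDowntimeMillis(boolean running, long nowMillis, long jobStartMillis) {
        return running ? 0L : nowMillis - jobStartMillis;
    }
}
```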
[GitHub] [flink] tillrohrmann closed pull request #13242: [FLINK-19009][metrics] Fixed the downtime metric issue and updated the comment
tillrohrmann closed pull request #13242: URL: https://github.com/apache/flink/pull/13242 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186002#comment-17186002 ] Rong Rong commented on FLINK-16789: --- After discussion in [#13163|https://github.com/apache/flink/pull/13163], we decided not to support this feature. Although using port 0 is considered standard in the JVM for getting a random port (see: [https://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html]) it is also very convenient for users to just directly configure a large port range after FLINK-5552 has been merged. Closing this ticket as won't fix. > Support JMX RMI random port assign & retrieval via JMXConnectorServer > - > > Key: FLINK-16789 > URL: https://issues.apache.org/jira/browse/FLINK-16789 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.11.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Currently there are no easy way to assign jmxrmi port to a running Flink job. > The typical tutorial is to add the following to both TM and JM launch env: > {code:java} > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.local.only=false > {code} > However, setting the jmxremote port to is not usually a viable solution > when Flink job is running on a shared environment (YARN / K8s / etc). > setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option > however, there's no easy way to retrieve such port assignment. We proposed to > use JMXConnectorServerFactory to explicitly establish a JMXServer inside > *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*. > With the JMXServer explicitly created, we can return the JMXRMI information > via either REST API or WebUI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer
[ https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rong Rong closed FLINK-16789.
-----------------------------
    Resolution: Won't Fix
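As background on the mechanism the ticket references: binding to port 0 asks the OS to assign a free ephemeral port, which can then be read back from the socket. A minimal illustrative sketch in plain Java (not Flink's actual JMXServer code; the class and method names here are invented for illustration):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class RandomPortSketch {

    // Binds to port 0 so the OS picks any free ephemeral port,
    // then returns the port that was actually assigned.
    static int bindToRandomPort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("OS-assigned port: " + bindToRandomPort());
    }
}
```

The retrieval problem described in the ticket is that nothing analogous to `getLocalPort()` is exposed when the port is chosen via `-Dcom.sun.management.jmxremote.port=0`; hence the proposal to create the JMX server programmatically so the assigned port can be read back and reported.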
[GitHub] [flink] flinkbot edited a comment on pull request #13181: [FLINK-18957] Implement logical request bulk tracking in SlotSharingExecutionSlotAllocator
flinkbot edited a comment on pull request #13181: URL: https://github.com/apache/flink/pull/13181#issuecomment-675091412

## CI report:

* e7553689356882de1ffe606400d1255d1d757bc4 UNKNOWN
* bce753ff7f4da2cffa295bd4007517af4c5697d8 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5926)
* 18f88af3b438b13e9a240efd2b4979f841d2b978 UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[jira] [Assigned] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."
[ https://issues.apache.org/jira/browse/FLINK-19012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Khachatryan reassigned FLINK-19012:
-----------------------------------------
    Assignee: Roman Khachatryan

> E2E test fails with "Cannot register Closeable, this
> subtaskCheckpointCoordinator is already closed. Closing argument."
> ---------------------------------------------------------------------
>
>                 Key: FLINK-19012
>                 URL: https://issues.apache.org/jira/browse/FLINK-19012
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / Task, Tests
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
> Note: This error occurred in a custom branch with unreviewed changes. I don't
> believe my changes affect this error, but I would keep this in mind when
> investigating the error:
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307=logs=1f3ed471-1849-5d3c-a34c-19792af4ad16=0d2e35fc-a330-5cf2-a012-7267e2667b1d
>
> {code}
> 2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO  org.apache.flink.runtime.taskmanager.Task [] - Registering task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) (cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING].
> 2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask [] - No state backend has been configured, using default (Memory / JobManager) MemoryStateBackend (data in heap memory / checkpoints to JobManager) (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 5242880)
> 2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO  org.apache.flink.runtime.taskmanager.Task [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING.
> 2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO  org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ...
> 2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO  org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge [] - Elasticsearch RestHighLevelClient is connected to [http://127.0.0.1:9200]
> 2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO  org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 drained requests
> 2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO  org.apache.flink.runtime.taskmanager.Task [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) (cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED.
> 2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO  org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) (cbc357ccb763df2852fee8c4fc7d55f2_0_0).
> 2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FINISHED to JobManager for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) cbc357ccb763df2852fee8c4fc7d55f2_0_0.
> 2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO  org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of checkpoint 1 could not be completed.
> 2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: java.io.IOException: Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument.
> 2020-08-20T20:55:30.2418956Z 	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-08-20T20:55:30.2420100Z 	at org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91) [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-08-20T20:55:30.2420927Z 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
> 2020-08-20T20:55:30.2421455Z 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
> 2020-08-20T20:55:30.2421879Z 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
> {code}
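For readers unfamiliar with the error message in the log: "Cannot register Closeable ... Closing argument." describes a common guard pattern in which a registry, once closed, refuses new resources and immediately closes the offered argument so it cannot leak. A hypothetical sketch of that pattern follows; the class name `CloseGuard` is invented here, and this is not Flink's actual `SubtaskCheckpointCoordinatorImpl` code:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal guard-pattern sketch: a registry that tracks Closeables for later
// cleanup, but once closed rejects new registrations and closes the offered
// argument instead (hence "Closing argument." in the error message).
class CloseGuard implements Closeable {
    private final Deque<Closeable> tracked = new ArrayDeque<>();
    private boolean closed;

    synchronized void registerCloseable(Closeable c) throws IOException {
        if (closed) {
            // Too late to track it: close the argument so it does not leak.
            c.close();
            throw new IOException(
                "Cannot register Closeable, this registry is already closed. Closing argument.");
        }
        tracked.push(c);
    }

    @Override
    public synchronized void close() throws IOException {
        closed = true;
        // Release everything registered before the close.
        while (!tracked.isEmpty()) {
            tracked.pop().close();
        }
    }
}
```

The race implied by the log (the asynchronous checkpoint part running after the task finished and the coordinator closed) is exactly the window this guard is meant to make safe.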