[jira] [Updated] (FLINK-13184) Support launching task executors with multi-thread on YARN.
[ https://issues.apache.org/jira/browse/FLINK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-13184: - Fix Version/s: 1.10.0 > Support launching task executors with multi-thread on YARN. > --- > > Key: FLINK-13184 > URL: https://issues.apache.org/jira/browse/FLINK-13184 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN > Affects Versions: 1.8.1, 1.9.0 > Reporter: Xintong Song > Assignee: Xintong Song > Priority: Major > Fix For: 1.10.0 > > > Currently, YarnResourceManager starts all task executors in the main thread. This > could cause the RM thread to become unresponsive when launching a large number of > TEs (e.g. > 1000), leading to TE registration/heartbeat timeouts. > > In Blink, we have a thread pool through which the RM starts TEs via the YARN NMClient > in separate threads. I think we should add this feature to the Flink master > branch as well. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] xintongsong opened a new pull request #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN.
xintongsong opened a new pull request #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN. URL: https://github.com/apache/flink/pull/9106 ## What is the purpose of the change This pull request supports starting new TaskExecutors with multiple threads in YarnResourceManager, so that starting a large number of TaskExecutors won't block the main thread of the RM from responding to other RPC messages (TE registration, heartbeats, etc.). ## Brief change log - b2ed97d0e6dd506602888aacf6146238fbcdfd48: Introduce a config option for the max number of threads on the RM used for starting new TaskExecutors. - 06c7209af04f49ca69a263f57fc713f3b9c55c87: Introduce a thread pool in YarnResourceManager and start new TaskExecutors with multiple threads. - 25fc95f30720209e19bd010cdd517ff5e3c685d8: Update relevant test cases to wait for the threads starting new TaskExecutors to finish. ## Verifying this change - Updated YarnResourceManagerTest to verify that YarnResourceManager starts new TaskExecutors properly. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
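The change described above can be sketched roughly as follows. This is a hypothetical, simplified model, not Flink's actual `YarnResourceManager` code: the class and method names are illustrative, and the real implementation calls the YARN `NMClient` where this sketch only counts launches. It shows the core idea of moving container launches off the single main thread into a bounded thread pool so RPC handling stays responsive:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: start "task executors" from a bounded thread pool
// instead of the resource manager's single main thread.
public class ContainerLauncherSketch {
    private final ExecutorService startPool;
    private final AtomicInteger launched = new AtomicInteger();

    public ContainerLauncherSketch(int maxThreads) {
        // maxThreads would come from a config option, as in the PR's first commit.
        this.startPool = Executors.newFixedThreadPool(maxThreads);
    }

    public void launchContainer(String containerId) {
        // Each launch runs off the caller's thread, so the caller (the RM main
        // thread in the real system) can keep handling registrations/heartbeats.
        startPool.submit(() -> {
            // The real code would call the YARN NMClient here to start the
            // container; this sketch only records that a launch happened.
            launched.incrementAndGet();
        });
    }

    public int awaitLaunches() {
        // Mirrors what the updated tests in the PR need: wait for the
        // launching threads to finish before asserting anything.
        startPool.shutdown();
        try {
            startPool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return launched.get();
    }

    public static void main(String[] args) {
        ContainerLauncherSketch launcher = new ContainerLauncherSketch(4);
        for (int i = 0; i < 100; i++) {
            launcher.launchContainer("container_" + i);
        }
        System.out.println(launcher.awaitLaunches()); // 100
    }
}
```

This also explains the third commit: once launches are asynchronous, tests must explicitly wait for the pool to drain, which is why the relevant test cases had to be updated.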
[GitHub] [flink] wisgood commented on a change in pull request #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter
wisgood commented on a change in pull request #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter URL: https://github.com/apache/flink/pull/8621#discussion_r303232591 ## File path: flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/StringWriter.java ## @@ -82,7 +103,7 @@ public void open(FileSystem fs, Path path) throws IOException { public void write(T element) throws IOException { FSDataOutputStream outputStream = getStream(); outputStream.write(element.toString().getBytes(charset)); - outputStream.write('\n'); + outputStream.write(rowDelimiterBytes); Review comment: @pnowojski Hi, according to your suggestions, I added a test case!
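The idea in the diff under review — encoding a configurable row delimiter once and writing its bytes after every row, instead of a hard-coded `'\n'` — can be sketched as a minimal standalone model. The class below is illustrative only; the real `StringWriter`'s fields, constructor, and stream handling differ:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a writer with a configurable row delimiter.
public class DelimitedStringWriter {
    private final Charset charset;
    private final byte[] rowDelimiterBytes;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    public DelimitedStringWriter(String rowDelimiter, Charset charset) {
        this.charset = charset;
        // Encode the delimiter once up front, in the configured charset,
        // rather than hard-coding '\n' at every write call.
        this.rowDelimiterBytes = rowDelimiter.getBytes(charset);
    }

    public void write(Object element) {
        // Same shape as the diff: element bytes first, then the delimiter bytes.
        buffer.writeBytes(element.toString().getBytes(charset));
        buffer.writeBytes(rowDelimiterBytes);
    }

    public String contents() {
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) {
        DelimitedStringWriter w = new DelimitedStringWriter("|", StandardCharsets.UTF_8);
        w.write("a");
        w.write("b");
        System.out.println(w.contents()); // a|b|
    }
}
```

Encoding the delimiter with the writer's own charset matters for multi-byte encodings, which is presumably what the requested test case covers.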
[GitHub] [flink] flinkbot commented on issue #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN.
flinkbot commented on issue #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN. URL: https://github.com/apache/flink/pull/9106#issuecomment-511182862 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-13184) Support launching task executors with multi-thread on YARN.
[ https://issues.apache.org/jira/browse/FLINK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-13184: --- Labels: pull-request-available (was: ) > Support launching task executors with multi-thread on YARN. > --- > > Key: FLINK-13184 > URL: https://issues.apache.org/jira/browse/FLINK-13184 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN > Affects Versions: 1.8.1, 1.9.0 > Reporter: Xintong Song > Assignee: Xintong Song > Priority: Major > Labels: pull-request-available > Fix For: 1.10.0
[GitHub] [flink] flinkbot commented on issue #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter
flinkbot commented on issue #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter URL: https://github.com/apache/flink/pull/8621#issuecomment-511183092 ## CI report: * e21d12fa2c9b6305d90502ae05a9d574ce712fd1 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119056511)
[GitHub] [flink] flinkbot commented on issue #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN.
flinkbot commented on issue #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN. URL: https://github.com/apache/flink/pull/9106#issuecomment-511183558 ## CI report: * 25fc95f30720209e19bd010cdd517ff5e3c685d8 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119056693)
[GitHub] [flink] flinkbot edited a comment on issue #8976: [FLINK-12277][table/hive/doc] Add documentation for catalogs
flinkbot edited a comment on issue #8976: [FLINK-12277][table/hive/doc] Add documentation for catalogs URL: https://github.com/apache/flink/pull/8976#issuecomment-511060754 ## CI report: * a6494267595de4d1c63430ec20083b909e50cf9c : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054577)
[GitHub] [flink] flinkbot edited a comment on issue #8827: [FLINK-12928][docs] Remove old Flink ML docs
flinkbot edited a comment on issue #8827: [FLINK-12928][docs] Remove old Flink ML docs URL: https://github.com/apache/flink/pull/8827#issuecomment-510915926 ## CI report: * 4cc5b2c3b960dc661caebf75d4dfdf86c3b1aa18 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054582)
[GitHub] [flink] flinkbot edited a comment on issue #9094: [FLINK-13094][state-processor-api] Provide an easy way to read timers using the State Processor API
flinkbot edited a comment on issue #9094: [FLINK-13094][state-processor-api] Provide an easy way to read timers using the State Processor API URL: https://github.com/apache/flink/pull/9094#issuecomment-510560219 ## CI report: * 68c63247fc6c3b60b0cf7afb483cc04ee8f558d0 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118811503) * 90313e480878fce12d292065691baee026795e70 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054581)
[GitHub] [flink] flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink
flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink URL: https://github.com/apache/flink/pull/9067#issuecomment-510405753 ## CI report: * 0034a70157b871b401cb1f8cd5a223427cf6223a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118885081) * 1f192d01c764dbff8ab884512814cc1a4fd80dba : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054583) * e73d6f0155762b8c4c3fff5c742d2644962329ee : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054722)
[GitHub] [flink] flinkbot edited a comment on issue #9035: [FLINK-13168] [table] clarify isBatch/isStreaming/isBounded flag in flink planner and blink planner
flinkbot edited a comment on issue #9035: [FLINK-13168] [table] clarify isBatch/isStreaming/isBounded flag in flink planner and blink planner URL: https://github.com/apache/flink/pull/9035#issuecomment-510801620 ## CI report: * dac226605ba8018c4c18c27624260c9e3d1f9eaf : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118897652) * a7be6ce6aae6d2229f81923846ac6dd08d3a271e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054584)
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226534 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -309,24 +279,17 @@ similar to this: 4> (KasparBot,-245) {% endhighlight %} -The number in front of each line tells you on which parallel instance of the print sink the output -was produced. +每行数据前面的数字代表着打印接收器在哪个并行实例上产生的输出数据。 -This should get you started with writing your own Flink programs. To learn more -you can check out our guides -about [basic concepts]({{ site.baseurl }}/dev/api_concepts.html) and the -[DataStream API]({{ site.baseurl }}/dev/datastream_api.html). Stick -around for the bonus exercise if you want to learn about setting up a Flink cluster on -your own machine and writing results to [Kafka](http://kafka.apache.org). +这可以让你开始创建你自己的 Flink 项目。你可以查看[基本概念]({{ site.baseurl }}/zh/dev/api_concepts.html)和[DataStream API] +({{ site.baseurl }}/zh/dev/datastream_api.html)指南。如果你想学习了解更多关于 Flink 集群安装以及写入数据到 [Kafka](http://kafka.apache.org), Review comment: `[DataStream API]` and `({{ site. baseurl }}.` have to be on the same line.
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225649 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -26,19 +26,14 @@ under the License. * This will be replaced by the TOC {:toc} -In this guide we will start from scratch and go from setting up a Flink project to running -a streaming analysis program on a Flink cluster. +在本节指南中,我们将从零开始创建一个在 flink 集群上面进行流分析的 Flink 项目。 -Wikipedia provides an IRC channel where all edits to the wiki are logged. We are going to -read this channel in Flink and count the number of bytes that each user edits within -a given window of time. This is easy enough to implement in a few minutes using Flink, but it will -give you a good foundation from which to start building more complex analysis programs on your own. +维基百科提供了一个能够记录所有对 wiki 编辑的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时 Review comment: ```suggestion 维基百科提供了一个记录所有 wiki 编辑历史的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时 ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226207 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 +如果您只是想跳过并在您的编辑器中编辑他们。 -The first step in a Flink program is to create a `StreamExecutionEnvironment` -(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution -parameters and create sources for reading from external systems. So let's go ahead and add -this to the main method: +在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中: {% highlight java %} StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment(); {% endhighlight %} -Next we will create a source that reads from the Wikipedia IRC log: +接下来我们将创建一个读取维基百科 IRC 数据源: Review comment: ```suggestion 接下来我们将创建一个读取维基百科 IRC 数据的源: ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303235400 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -309,24 +279,17 @@ similar to this: 4> (KasparBot,-245) {% endhighlight %} -The number in front of each line tells you on which parallel instance of the print sink the output -was produced. +每行数据前面的数字代表着打印接收器在哪个并行实例上产生的输出数据。 Review comment: `每行数据前面的数字代表着打印接收器运行的并实例`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226045 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -112,13 +103,11 @@ use it in our program. Edit the `dependencies` section of the `pom.xml` so that {% endhighlight %} -Notice the `flink-connector-wikiedits_2.11` dependency that was added. (This example and -the Wikipedia connector were inspired by the *Hello Samza* example of Apache Samza.) +注意加入 `flink-connector-wikiedits_2.11` 依赖。(这个例子和维基百科连接器的灵感来自于 Apache Samza *Hello Samza* 示例。) -## Writing a Flink Program +## 编写 Flink 程序 -It's coding time. Fire up your favorite IDE and import the Maven project or open a text editor and -create the file `src/main/java/wikiedits/WikipediaAnalysis.java`: +现在是编程时间。启动你最喜欢的 IDE 并导入 Maven 项目或打开文本编辑器创建文件 `src/main/java/wikiedits/WikipediaAnalysis.java`: Review comment: ```suggestion 现在是编程时间。启动你最喜欢的 IDE 并导入 Maven 项目或打开文本编辑器,然后创建文件 `src/main/java/wikiedits/WikipediaAnalysis.java`: ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226156 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 +如果您只是想跳过并在您的编辑器中编辑他们。 Review comment: `,如果需要你可以将他们复制到你的编辑器中`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226551 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -309,24 +279,17 @@ similar to this: 4> (KasparBot,-245) {% endhighlight %} -The number in front of each line tells you on which parallel instance of the print sink the output -was produced. +每行数据前面的数字代表着打印接收器在哪个并行实例上产生的输出数据。 -This should get you started with writing your own Flink programs. To learn more -you can check out our guides -about [basic concepts]({{ site.baseurl }}/dev/api_concepts.html) and the -[DataStream API]({{ site.baseurl }}/dev/datastream_api.html). Stick -around for the bonus exercise if you want to learn about setting up a Flink cluster on -your own machine and writing results to [Kafka](http://kafka.apache.org). +这可以让你开始创建你自己的 Flink 项目。你可以查看[基本概念]({{ site.baseurl }}/zh/dev/api_concepts.html)和[DataStream API] +({{ site.baseurl }}/zh/dev/datastream_api.html)指南。如果你想学习了解更多关于 Flink 集群安装以及写入数据到 [Kafka](http://kafka.apache.org), +你可以自己多加以练习尝试。 -## Bonus Exercise: Running on a Cluster and Writing to Kafka +## 额外练习: 集群运行并输出 Kafka -Please follow our [local setup tutorial](local_setup.html) for setting up a Flink distribution -on your machine and refer to the [Kafka quickstart](https://kafka.apache.org/0110/documentation.html#quickstart) -for setting up a Kafka installation before we proceed. +请按照我们的[本地安装教程](local_setup.html)在你的机器上构建一个Flink分布式环境,同时参考[Kafka快速指南](https://kafka.apache.org/0110/documentation.html#quickstart)安装一个我们需要使用的Kafka环境。 Review comment: ```suggestion 请按照我们的[本地安装教程](local_setup.html)在你的机器上构建一个Flink分布式环境,同时参考 [Kafka快速指南](https://kafka.apache.org/0110/documentation.html#quickstart)安装一个我们需要使用的Kafka环境。 ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226098 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -404,27 +361,22 @@ The output of that command should look similar to this, if everything went accor 03/08/2016 15:09:27 Source: Custom Source(1/1) switched to RUNNING {% endhighlight %} -You can see how the individual operators start running. There are only two, because -the operations after the window get folded into one operation for performance reasons. In Flink -we call this *chaining*. +你可以看到每个算子是如何运行的。这里只有两个,出于性能原因的考虑,窗口后的操作会被链接成一个操作。在Flink中我们称它为 *链接*。 -You can observe the output of the program by inspecting the Kafka topic using the Kafka -console consumer: +你可以通过 Kafka 控制台来检查项目输出到 Kafka 主题的情况 {% highlight bash %} bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wiki-result {% endhighlight %} -You can also check out the Flink dashboard which should be running at [http://localhost:8081](http://localhost:8081). -You get an overview of your cluster resources and running jobs: +你还可以查看运行在[http://localhost:8081](http://localhost:8081)上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业: - + -If you click on your running job you will get a view where you can inspect individual operations -and, for example, see the number of processed elements: +如果你点击正在运行的作业,你将会得到一个可以检查各个操作的视图,比如说,你可以查看处理过的元素数量: - + -This concludes our little tour of Flink. If you have any questions, please don't hesitate to ask on our [Mailing Lists](http://flink.apache.org/community.html#mailing-lists). +这就结束了 Flink 项目构建之旅. 如果你有任何问题, 你可以在我们的 [邮件组](http://flink.apache.org/community.html#mailing-lists)提出. Review comment: ```suggestion 这就结束了 Flink 项目构建之旅. 如果你有任何问题, 可以在我们的[邮件组](http://flink.apache.org/community.html#mailing-lists)提出. ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226335 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 +如果您只是想跳过并在您的编辑器中编辑他们。 -The first step in a Flink program is to create a `StreamExecutionEnvironment` -(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution -parameters and create sources for reading from external systems. So let's go ahead and add -this to the main method: +在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中: {% highlight java %} StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment(); {% endhighlight %} -Next we will create a source that reads from the Wikipedia IRC log: +接下来我们将创建一个读取维基百科 IRC 数据源: {% highlight java %} DataStream edits = see.addSource(new WikipediaEditsSource()); {% endhighlight %} -This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For -the purposes of this example we are interested in determining the number of added or removed -bytes that each user causes in a certain time window, let's say five seconds. For this we first -have to specify that we want to key the stream on the user name, that is to say that operations -on this stream should take the user name into account. 
In our case the summation of edited bytes in the windows -should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this: +上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先 Review comment: ```suggestion 上面代码创建了一个 `WikipediaEditEvent` 事件的 `DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如 5 秒一个时间窗口。首先 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225846 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -76,16 +70,13 @@ wiki-edits/ └── log4j.properties {% endhighlight %} -There is our `pom.xml` file that already has the Flink dependencies added in the root directory and -several example Flink programs in `src/main/java`. We can delete the example programs, since -we are going to start from scratch: +项目根目录下的 `pom.xml` 文件已经将 Flink 依赖添加进来,同时在 `src/main/java` 目录下也有几个 Flink 程序实例。由于我们从头开始创建,我们可以删除程序实例: Review comment: "Flink 依赖已经添加到根目录下的 `pom.xml` 文件中"? `Flink 程序实例` -> `Flink 实例程序`? `由于我们将从头开始创建,因此可以删除这些实例程序`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225672 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -26,19 +26,14 @@ under the License. * This will be replaced by the TOC {:toc} -In this guide we will start from scratch and go from setting up a Flink project to running -a streaming analysis program on a Flink cluster. +在本节指南中,我们将从零开始创建一个在 flink 集群上面进行流分析的 Flink 项目。 -Wikipedia provides an IRC channel where all edits to the wiki are logged. We are going to -read this channel in Flink and count the number of bytes that each user edits within -a given window of time. This is easy enough to implement in a few minutes using Flink, but it will -give you a good foundation from which to start building more complex analysis programs on your own. +维基百科提供了一个能够记录所有对 wiki 编辑的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时 +在给定的时间窗口,计算出每个用户在其中编辑的字节数。这使用 Flink 很容易就能实现,但它会为你提供一个良好的基础去开始构建你自己更为复杂的分析程序。 Review comment: `计算出每个用户在给定时间窗口内的编辑字节数`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225918 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -76,16 +70,13 @@ wiki-edits/ └── log4j.properties {% endhighlight %} -There is our `pom.xml` file that already has the Flink dependencies added in the root directory and -several example Flink programs in `src/main/java`. We can delete the example programs, since -we are going to start from scratch: +项目根目录下的 `pom.xml` 文件已经将 Flink 依赖添加进来,同时在 `src/main/java` 目录下也有几个 Flink 程序实例。由于我们从头开始创建,我们可以删除程序实例: {% highlight bash %} $ rm wiki-edits/src/main/java/wikiedits/*.java {% endhighlight %} -As a last step we need to add the Flink Wikipedia connector as a dependency so that we can -use it in our program. Edit the `dependencies` section of the `pom.xml` so that it looks like this: +作为最后一步,我们需要添加 Flink 维基百科连接器作为依赖项,这样就可以在我们的项目中进行使用。编辑 `pom.xml` 的 `dependencies` 部分,使它看起来像这样: Review comment: ```suggestion 作为最后一步,我们需要添加 Flink 维基百科连接器的依赖,从而可以在项目中进行使用。修改 `pom.xml` 的 `dependencies` 部分,使它看起来像这样: ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225808 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -59,8 +54,7 @@ $ mvn archetype:generate \ {% endunless %} -You can edit the `groupId`, `artifactId` and `package` if you like. With the above parameters, -Maven will create a project structure that looks like this: +你可以根据自己需求编辑 `groupId`、`artifactId` 以及 `package`。对于上面的参数,Maven 将会创建一个这样的项目结构: Review comment: "你可以按需修改 `groupId`、`artifactId` 以及 `package`"? `对于上面的参数,Maven 将会创建一个这样的项目结构` seems a little odd to me; do you think we can make it better?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225778 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -404,27 +361,22 @@ The output of that command should look similar to this, if everything went accor 03/08/2016 15:09:27 Source: Custom Source(1/1) switched to RUNNING {% endhighlight %} -You can see how the individual operators start running. There are only two, because -the operations after the window get folded into one operation for performance reasons. In Flink -we call this *chaining*. +你可以看到每个算子是如何运行的。这里只有两个,出于性能原因的考虑,窗口后的操作会被链接成一个操作。在Flink中我们称它为 *链接*。 -You can observe the output of the program by inspecting the Kafka topic using the Kafka -console consumer: +你可以通过 Kafka 控制台来检查项目输出到 Kafka 主题的情况 {% highlight bash %} bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wiki-result {% endhighlight %} -You can also check out the Flink dashboard which should be running at [http://localhost:8081](http://localhost:8081). -You get an overview of your cluster resources and running jobs: +你还可以查看运行在[http://localhost:8081](http://localhost:8081)上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业: - + -If you click on your running job you will get a view where you can inspect individual operations -and, for example, see the number of processed elements: +如果你点击正在运行的作业,你将会得到一个可以检查各个操作的视图,比如说,你可以查看处理过的元素数量: - + Review comment: maybe we should not change the URL of the image?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226343 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 +如果您只是想跳过并在您的编辑器中编辑他们。 -The first step in a Flink program is to create a `StreamExecutionEnvironment` -(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution -parameters and create sources for reading from external systems. So let's go ahead and add -this to the main method: +在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中: {% highlight java %} StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment(); {% endhighlight %} -Next we will create a source that reads from the Wikipedia IRC log: +接下来我们将创建一个读取维基百科 IRC 数据源: {% highlight java %} DataStream edits = see.addSource(new WikipediaEditsSource()); {% endhighlight %} -This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For -the purposes of this example we are interested in determining the number of added or removed -bytes that each user causes in a certain time window, let's say five seconds. For this we first -have to specify that we want to key the stream on the user name, that is to say that operations -on this stream should take the user name into account. In our case the summation of edited bytes in the windows -should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this: +上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先 +我们必须指定用户名来划分我们的数据流,也就是说这个流上的操作应该考虑用户名。 Review comment: `根据用户名来划分`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226389 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 +如果您只是想跳过并在您的编辑器中编辑他们。 -The first step in a Flink program is to create a `StreamExecutionEnvironment` -(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution -parameters and create sources for reading from external systems. So let's go ahead and add -this to the main method: +在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中: {% highlight java %} StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment(); {% endhighlight %} -Next we will create a source that reads from the Wikipedia IRC log: +接下来我们将创建一个读取维基百科 IRC 数据源: {% highlight java %} DataStream edits = see.addSource(new WikipediaEditsSource()); {% endhighlight %} -This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For -the purposes of this example we are interested in determining the number of added or removed -bytes that each user causes in a certain time window, let's say five seconds. For this we first -have to specify that we want to key the stream on the user name, that is to say that operations -on this stream should take the user name into account. In our case the summation of edited bytes in the windows -should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this: +上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先 +我们必须指定用户名来划分我们的数据流,也就是说这个流上的操作应该考虑用户名。 +在我们这个统计窗口编辑的字节数的例子中,每个用户应该唯一的。对于划分一个数据流,我们必须提供一个 `KeySelector`,像这样: Review comment: I think this does not mean "每个用户应该是唯一的"; it means "每个不同的用户每个窗口都应该计算一个结果"
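The passage under review above keys the edit stream by user name so that the byte sums are computed per unique user. As a rough plain-Java sketch of that "key by user" grouping (illustrative names only — `EditEvent` and `groupByUser` are not Flink's API, and a real Flink job would instead supply a `KeySelector` on the `DataStream`):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch, not Flink code: shows the effect of keying a stream of
// edit events by user name, so later aggregations run once per distinct user.
class KeyByUserSketch {
    // Illustrative stand-in for WikipediaEditEvent: a user name and a byte diff.
    record EditEvent(String user, int byteDiff) {}

    // Group events by their key (the user name); in Flink this role is played
    // by a KeySelector that extracts the key from each element.
    static Map<String, List<EditEvent>> groupByUser(List<EditEvent> events) {
        return events.stream().collect(Collectors.groupingBy(EditEvent::user));
    }

    public static void main(String[] args) {
        List<EditEvent> events = List.of(
                new EditEvent("alice", 120),
                new EditEvent("bob", -30),
                new EditEvent("alice", 15));
        // "alice" ends up with both of her edits in one group.
        System.out.println(groupByUser(events).get("alice").size());
    }
}
```

The point the reviewer makes holds here as well: the grouping produces one bucket per distinct user, not a requirement that users be unique in the input.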
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225323 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -26,19 +26,14 @@ under the License. * This will be replaced by the TOC {:toc} -In this guide we will start from scratch and go from setting up a Flink project to running -a streaming analysis program on a Flink cluster. +在本节指南中,我们将从零开始创建一个在 flink 集群上面进行流分析的 Flink 项目。 Review comment: ```suggestion 在本节指南中,我们将在 Flink 集群上从零开始创建一个流分析项目。 ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225686 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -26,19 +26,14 @@ under the License. * This will be replaced by the TOC {:toc} -In this guide we will start from scratch and go from setting up a Flink project to running -a streaming analysis program on a Flink cluster. +在本节指南中,我们将从零开始创建一个在 flink 集群上面进行流分析的 Flink 项目。 -Wikipedia provides an IRC channel where all edits to the wiki are logged. We are going to -read this channel in Flink and count the number of bytes that each user edits within -a given window of time. This is easy enough to implement in a few minutes using Flink, but it will -give you a good foundation from which to start building more complex analysis programs on your own. +维基百科提供了一个能够记录所有对 wiki 编辑的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时 +在给定的时间窗口,计算出每个用户在其中编辑的字节数。这使用 Flink 很容易就能实现,但它会为你提供一个良好的基础去开始构建你自己更为复杂的分析程序。 -## Setting up a Maven Project +## 创建一个 Maven 项目 -We are going to use a Flink Maven Archetype for creating our project structure. Please -see [Java API Quickstart]({{ site.baseurl }}/dev/projectsetup/java_api_quickstart.html) for more details -about this. For our purposes, the command to run is this: +我们准备使用 Flink Maven Archetype 创建项目结构。更多细节请查看[Java API 快速指南]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html)。项目运行命令如下: Review comment: ```suggestion 我们准备使用 Flink Maven Archetype 创建项目结构。更多细节请查看 [Java API 快速指南]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html)。项目运行命令如下: ``` do we need to translate `Maven Archetype` here?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226199 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 +如果您只是想跳过并在您的编辑器中编辑他们。 -The first step in a Flink program is to create a `StreamExecutionEnvironment` -(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution -parameters and create sources for reading from external systems. So let's go ahead and add -this to the main method: +在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中: Review comment: `这可以用来设置程序运行参数、创建从外部系统读取的源`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303225765 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -59,8 +54,7 @@ $ mvn archetype:generate \ Review comment: I think we need to translate the `Note` also
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226491 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -203,26 +180,20 @@ DataStream> result = keyedEdits }); {% endhighlight %} -The first call, `.timeWindow()`, specifies that we want to have tumbling (non-overlapping) windows -of five seconds. The second call specifies a *Aggregate transformation* on each window slice for -each unique key. In our case we start from an initial value of `("", 0L)` and add to it the byte -difference of every edit in that time window for a user. The resulting Stream now contains -a `Tuple2` for every user which gets emitted every five seconds. +首先调用 `.timeWindow()` 方法指定五秒翻滚(非重叠)窗口。第二个调用方法对于每一个唯一关键字指定每个窗口片`聚合转换`。 +在本例中,我们从`("",0L)`初始值开始,并将每个用户编辑的字节添加到该时间窗口中。对于每个用户来说,结果流现在包含的元素为 `Tuple2`,它每5秒发出一次。 -The only thing left to do is print the stream to the console and start execution: +唯一剩下要做的就是将打印流输出到控制台并开始执行: Review comment: ```suggestion 唯一剩下的就是将结果输出到控制台并开始执行: ```
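The tutorial text quoted above describes tumbling (non-overlapping) five-second windows that sum each user's byte difference, starting from the `("", 0L)` initial accumulator. The same aggregation logic can be sketched in plain Java, without Flink (illustrative names only — `TimedEdit` and `sumPerWindow` are not part of any real API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of tumbling-window aggregation: each edit falls into
// exactly one 5-second window, and byte diffs are summed per (window, user).
class TumblingWindowSketch {
    // Illustrative stand-in for a timestamped edit event.
    record TimedEdit(long timestampMillis, String user, long byteDiff) {}

    // Bucket each edit into its window (windowStart = ts - ts % 5000) and sum
    // byteDiff per user within that window, starting from 0L like the
    // tutorial's ("", 0L) initial value.
    static Map<Long, Map<String, Long>> sumPerWindow(List<TimedEdit> edits) {
        Map<Long, Map<String, Long>> result = new TreeMap<>();
        for (TimedEdit e : edits) {
            long windowStart = e.timestampMillis() - e.timestampMillis() % 5000;
            result.computeIfAbsent(windowStart, w -> new HashMap<>())
                  .merge(e.user(), e.byteDiff(), Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<TimedEdit> edits = List.of(
                new TimedEdit(1000, "alice", 100),
                new TimedEdit(4000, "alice", -20),
                new TimedEdit(6000, "bob", 7));
        // alice's two edits both land in window [0, 5000) and sum to 80;
        // bob's edit lands in window [5000, 10000).
        System.out.println(sumPerWindow(edits));
    }
}
```

Because the windows do not overlap, each edit contributes to exactly one sum — which is what "tumbling" means in the passage under review.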
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226123 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -131,32 +120,24 @@ public class WikipediaAnalysis { } {% endhighlight %} -The program is very basic now, but we will fill it in as we go. Note that I'll not give -import statements here since IDEs can add them automatically. At the end of this section I'll show -the complete code with import statements if you simply want to skip ahead and enter that in your -editor. +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码 Review comment: `边做边完善`?
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226693 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -168,12 +149,8 @@ KeyedStream keyedEdits = edits }); {% endhighlight %} -This gives us a Stream of `WikipediaEditEvent` that has a `String` key, the user name. -We can now specify that we want to have windows imposed on this stream and compute a -result based on elements in these windows. A window specifies a slice of a Stream -on which to perform a computation. Windows are required when computing aggregations -on an infinite stream of elements. In our example we will say -that we want to aggregate the sum of edited bytes for every five seconds: +这给了我们一个 `WikipediaEditEvent` 数据流,它有一个 `String` 键,即用户名。 Review comment: maybe we can have a better translation for this paragraph.
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303226595 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -404,27 +361,22 @@ The output of that command should look similar to this, if everything went accor 03/08/2016 15:09:27 Source: Custom Source(1/1) switched to RUNNING {% endhighlight %} -You can see how the individual operators start running. There are only two, because -the operations after the window get folded into one operation for performance reasons. In Flink -we call this *chaining*. +你可以看到每个算子是如何运行的。这里只有两个,出于性能原因的考虑,窗口后的操作会被链接成一个操作。在Flink中我们称它为 *链接*。 -You can observe the output of the program by inspecting the Kafka topic using the Kafka -console consumer: +你可以通过 Kafka 控制台来检查项目输出到 Kafka 主题的情况 {% highlight bash %} bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wiki-result {% endhighlight %} -You can also check out the Flink dashboard which should be running at [http://localhost:8081](http://localhost:8081). -You get an overview of your cluster resources and running jobs: +你还可以查看运行在[http://localhost:8081](http://localhost:8081)上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业: Review comment: ```suggestion 你还可以查看运行在 [http://localhost:8081](http://localhost:8081) 上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业: ```
[GitHub] [flink] klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/9097#discussion_r303235342 ## File path: docs/getting-started/tutorials/datastream_api.zh.md ## @@ -309,24 +279,17 @@ similar to this: 4> (KasparBot,-245) {% endhighlight %} -The number in front of each line tells you on which parallel instance of the print sink the output -was produced. +每行数据前面的数字代表着打印接收器在哪个并行实例上产生的输出数据。 -This should get you started with writing your own Flink programs. To learn more -you can check out our guides -about [basic concepts]({{ site.baseurl }}/dev/api_concepts.html) and the -[DataStream API]({{ site.baseurl }}/dev/datastream_api.html). Stick -around for the bonus exercise if you want to learn about setting up a Flink cluster on -your own machine and writing results to [Kafka](http://kafka.apache.org). +这可以让你开始创建你自己的 Flink 项目。你可以查看[基本概念]({{ site.baseurl }}/zh/dev/api_concepts.html)和[DataStream API] +({{ site.baseurl }}/zh/dev/datastream_api.html)指南。如果你想学习了解更多关于 Flink 集群安装以及写入数据到 [Kafka](http://kafka.apache.org), +你可以自己多加以练习尝试。 Review comment: where is the source of this translation?
[GitHub] [flink] flinkbot edited a comment on issue #8742: [FLINK-11879] Add validators for the uses of InputSelectable, BoundedOneInput and BoundedMultiInput
flinkbot edited a comment on issue #8742: [FLINK-11879] Add validators for the uses of InputSelectable, BoundedOneInput and BoundedMultiInput URL: https://github.com/apache/flink/pull/8742#issuecomment-510731561 ## CI report: * 3f0c15862fc70f35cd58883ca9635bde1a5fb7ee : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118876288) * e9adf752da210ededdcebbd1ba3753c3b689cf3e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054586)
[GitHub] [flink] flinkbot edited a comment on issue #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
flinkbot edited a comment on issue #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager URL: https://github.com/apache/flink/pull/8471#issuecomment-510870797 ## CI report: * 3a6874ec1f1e40441b068868a16570e0b96f083c : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054587)
[GitHub] [flink] flinkbot edited a comment on issue #9039: [FLINK-13170][table-planner] Planner should get table factory from ca…
flinkbot edited a comment on issue #9039: [FLINK-13170][table-planner] Planner should get table factory from ca… URL: https://github.com/apache/flink/pull/9039#issuecomment-510445729 ## CI report: * a14e955ab6082bbd08fcf9b28a654ab771a57fb2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118899030) * 0e58a3aa301da884b3859a7d7e653b964903f32f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054588)
[GitHub] [flink] flinkbot edited a comment on issue #9049: [FLINK-13176][SQL CLI] remember current catalog and database in SQL CLI SessionContext
flinkbot edited a comment on issue #9049: [FLINK-13176][SQL CLI] remember current catalog and database in SQL CLI SessionContext URL: https://github.com/apache/flink/pull/9049#issuecomment-510698190 ## CI report: * 0cdf8186440e2446515c463d1a214ca1505f7a22 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118865431) * 25e8c5e8b7c481aa97976a9b8312e624392abd16 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054595)
[GitHub] [flink] flinkbot edited a comment on issue #8920: [FLINK-13024][table] integrate FunctionCatalog with CatalogManager
flinkbot edited a comment on issue #8920: [FLINK-13024][table] integrate FunctionCatalog with CatalogManager URL: https://github.com/apache/flink/pull/8920#issuecomment-510405859 ## CI report: * 4afedee15460ac0f1f2945ca657581c538ddfc06 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118723073) * f639acfa778cc8e31581107f27e3cf0139e3a98d : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/118956583) * d1aa3f20fd038d7e0177599671bf31c830426982 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054596)
[GitHub] [flink] flinkbot edited a comment on issue #7487: [FLINK-11321] Clarify NPE on fetching nonexistent topic
flinkbot edited a comment on issue #7487: [FLINK-11321] Clarify NPE on fetching nonexistent topic URL: https://github.com/apache/flink/pull/7487#issuecomment-510739413 ## CI report: * c746405bfc3264fec2e9a5a6551360e37aa5688f : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118879011) * be567080699e278af084886da36fb8369fa3fc13 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054600)
[GitHub] [flink] flinkbot edited a comment on issue #9104: [HOXFIX][mvn] upgrade frontend-maven-plugin version to 1.7.5
flinkbot edited a comment on issue #9104: [HOXFIX][mvn] upgrade frontend-maven-plugin version to 1.7.5 URL: https://github.com/apache/flink/pull/9104#issuecomment-510895464 ## CI report: * 1d3eb46ab1b663b59439c32011c7f959b64c18d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054601)
[GitHub] [flink] flinkbot edited a comment on issue #9088: [FLINK-13012][hive] Handle default partition name of Hive table
flinkbot edited a comment on issue #9088: [FLINK-13012][hive] Handle default partition name of Hive table URL: https://github.com/apache/flink/pull/9088#issuecomment-510484364 ## CI report: * eac5f74690ddb0b08cb41b029f5b8ac675e63565 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118891245) * b22f836f5e8f95a9f376f54c68798eeb14cb1644 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054589)
[GitHub] [flink] flinkbot edited a comment on issue #9096: [hotfix][table-planner-blink] Also set batch properties in BatchExecutor
flinkbot edited a comment on issue #9096: [hotfix][table-planner-blink] Also set batch properties in BatchExecutor URL: https://github.com/apache/flink/pull/9096#issuecomment-510724774 ## CI report: * d75ca929b19b054d56332b13374bb3570ccdd179 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118873953) * 2210b1aa33ad28cd919b0ea2603e2059e7b7c859 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054590)
[GitHub] [flink] flinkbot edited a comment on issue #8903: [FLINK-12747][docs] Getting Started - Table API Example Walkthrough
flinkbot edited a comment on issue #8903: [FLINK-12747][docs] Getting Started - Table API Example Walkthrough URL: https://github.com/apache/flink/pull/8903#issuecomment-510464651 ## CI report: * b2821a6ae97fd943f3a66b672e85fbd2374126c4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118909729) * 0699f7e5f2240a4a1bc44c15f08e6a1df47d3b01 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119054579)
[GitHub] [flink] flinkbot edited a comment on issue #9096: [hotfix][table-planner-blink] Also set batch properties in BatchExecutor
flinkbot edited a comment on issue #9096: [hotfix][table-planner-blink] Also set batch properties in BatchExecutor URL: https://github.com/apache/flink/pull/9096#issuecomment-510724774 ## CI report: * d75ca929b19b054d56332b13374bb3570ccdd179 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118873953) * 2210b1aa33ad28cd919b0ea2603e2059e7b7c859 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054590)
[GitHub] [flink] flinkbot edited a comment on issue #8827: [FLINK-12928][docs] Remove old Flink ML docs
flinkbot edited a comment on issue #8827: [FLINK-12928][docs] Remove old Flink ML docs URL: https://github.com/apache/flink/pull/8827#issuecomment-510915926 ## CI report: * 4cc5b2c3b960dc661caebf75d4dfdf86c3b1aa18 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054582)
[GitHub] [flink] flinkbot edited a comment on issue #9094: [FLINK-13094][state-processor-api] Provide an easy way to read timers using the State Processor API
flinkbot edited a comment on issue #9094: [FLINK-13094][state-processor-api] Provide an easy way to read timers using the State Processor API URL: https://github.com/apache/flink/pull/9094#issuecomment-510560219 ## CI report: * 68c63247fc6c3b60b0cf7afb483cc04ee8f558d0 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118811503) * 90313e480878fce12d292065691baee026795e70 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054581) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9049: [FLINK-13176][SQL CLI] remember current catalog and database in SQL CLI SessionContext
flinkbot edited a comment on issue #9049: [FLINK-13176][SQL CLI] remember current catalog and database in SQL CLI SessionContext URL: https://github.com/apache/flink/pull/9049#issuecomment-510698190 ## CI report: * 0cdf8186440e2446515c463d1a214ca1505f7a22 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118865431) * 25e8c5e8b7c481aa97976a9b8312e624392abd16 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054595) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8976: [FLINK-12277][table/hive/doc] Add documentation for catalogs
flinkbot edited a comment on issue #8976: [FLINK-12277][table/hive/doc] Add documentation for catalogs URL: https://github.com/apache/flink/pull/8976#issuecomment-511060754 ## CI report: * a6494267595de4d1c63430ec20083b909e50cf9c : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054577) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8920: [FLINK-13024][table] integrate FunctionCatalog with CatalogManager
flinkbot edited a comment on issue #8920: [FLINK-13024][table] integrate FunctionCatalog with CatalogManager URL: https://github.com/apache/flink/pull/8920#issuecomment-510405859 ## CI report: * 4afedee15460ac0f1f2945ca657581c538ddfc06 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118723073) * f639acfa778cc8e31581107f27e3cf0139e3a98d : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/118956583) * d1aa3f20fd038d7e0177599671bf31c830426982 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054596) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
flinkbot edited a comment on issue #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager URL: https://github.com/apache/flink/pull/8471#issuecomment-510870797 ## CI report: * 3a6874ec1f1e40441b068868a16570e0b96f083c : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054587) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink
flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink URL: https://github.com/apache/flink/pull/9067#issuecomment-510405753 ## CI report: * 0034a70157b871b401cb1f8cd5a223427cf6223a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118885081) * 1f192d01c764dbff8ab884512814cc1a4fd80dba : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054583) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9035: [FLINK-13168] [table] clarify isBatch/isStreaming/isBounded flag in flink planner and blink planner
flinkbot edited a comment on issue #9035: [FLINK-13168] [table] clarify isBatch/isStreaming/isBounded flag in flink planner and blink planner URL: https://github.com/apache/flink/pull/9035#issuecomment-510801620 ## CI report: * dac226605ba8018c4c18c27624260c9e3d1f9eaf : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118897652) * a7be6ce6aae6d2229f81923846ac6dd08d3a271e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054584) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #7487: [FLINK-11321] Clarify NPE on fetching nonexistent topic
flinkbot edited a comment on issue #7487: [FLINK-11321] Clarify NPE on fetching nonexistent topic URL: https://github.com/apache/flink/pull/7487#issuecomment-510739413 ## CI report: * c746405bfc3264fec2e9a5a6551360e37aa5688f : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118879011) * be567080699e278af084886da36fb8369fa3fc13 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054600) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8742: [FLINK-11879] Add validators for the uses of InputSelectable, BoundedOneInput and BoundedMultiInput
flinkbot edited a comment on issue #8742: [FLINK-11879] Add validators for the uses of InputSelectable, BoundedOneInput and BoundedMultiInput URL: https://github.com/apache/flink/pull/8742#issuecomment-510731561 ## CI report: * 3f0c15862fc70f35cd58883ca9635bde1a5fb7ee : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118876288) * e9adf752da210ededdcebbd1ba3753c3b689cf3e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054586) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9104: [HOXFIX][mvn] upgrade frontend-maven-plugin version to 1.7.5
flinkbot edited a comment on issue #9104: [HOXFIX][mvn] upgrade frontend-maven-plugin version to 1.7.5 URL: https://github.com/apache/flink/pull/9104#issuecomment-510895464 ## CI report: * 1d3eb46ab1b663b59439c32011c7f959b64c18d1 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054601) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9089: [FLINK-13225][table-planner-blink] Introduce type inference for hive functions in blink
flinkbot edited a comment on issue #9089: [FLINK-13225][table-planner-blink] Introduce type inference for hive functions in blink URL: https://github.com/apache/flink/pull/9089#issuecomment-510488226 ## CI report: * fb34a0f4245ddac5872ea77aad07887a6ff12d11 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118890132) * ba44069acdbd82261839605b5d363548dae81522 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054606) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8903: [FLINK-12747][docs] Getting Started - Table API Example Walkthrough
flinkbot edited a comment on issue #8903: [FLINK-12747][docs] Getting Started - Table API Example Walkthrough URL: https://github.com/apache/flink/pull/8903#issuecomment-510464651 ## CI report: * b2821a6ae97fd943f3a66b672e85fbd2374126c4 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/118909729) * 0699f7e5f2240a4a1bc44c15f08e6a1df47d3b01 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054579) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9083: [FLINK-13116] [table-planner-blink] Supports catalog statistics in blink planner
flinkbot edited a comment on issue #9083: [FLINK-13116] [table-planner-blink] Supports catalog statistics in blink planner URL: https://github.com/apache/flink/pull/9083#issuecomment-510435070 ## CI report: * 402974d0ffb69d9244c108234c9837f2eacc8d37 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118878495) * c1768f65b4eeb9abef3d7797aa6e1c711012bd39 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054603) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9088: [FLINK-13012][hive] Handle default partition name of Hive table
flinkbot edited a comment on issue #9088: [FLINK-13012][hive] Handle default partition name of Hive table URL: https://github.com/apache/flink/pull/9088#issuecomment-510484364 ## CI report: * eac5f74690ddb0b08cb41b029f5b8ac675e63565 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118891245) * b22f836f5e8f95a9f376f54c68798eeb14cb1644 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054589) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9090: [FLINK-13124] Don't forward exceptions when finishing SourceStreamTask
flinkbot edited a comment on issue #9090: [FLINK-13124] Don't forward exceptions when finishing SourceStreamTask URL: https://github.com/apache/flink/pull/9090#issuecomment-510496617 ## CI report: * 6c229740b17ae4e0df7c8ba9678ada086ae8bf47 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118880603) * 6b32e0614651fa80d5ec14d0d54fe6be219f417c : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054716) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8990: [FLINK-13104][metrics] Updated request callback to log warning on failure
flinkbot edited a comment on issue #8990: [FLINK-13104][metrics] Updated request callback to log warning on failure URL: https://github.com/apache/flink/pull/8990#issuecomment-510684087 ## CI report: * f29b2ae71ed7d7dcd823399dc3427f76534b38d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118907907) * 76d4183e3a6e6e348d8349b4ceb385529e8a2c5e : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054728) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink
flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink URL: https://github.com/apache/flink/pull/9067#issuecomment-510405753 ## CI report: * 0034a70157b871b401cb1f8cd5a223427cf6223a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118885081) * 1f192d01c764dbff8ab884512814cc1a4fd80dba : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054583) * e73d6f0155762b8c4c3fff5c742d2644962329ee : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119054722) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9083: [FLINK-13116] [table-planner-blink] Supports catalog statistics in blink planner
flinkbot edited a comment on issue #9083: [FLINK-13116] [table-planner-blink] Supports catalog statistics in blink planner URL: https://github.com/apache/flink/pull/9083#issuecomment-510435070 ## CI report: * 402974d0ffb69d9244c108234c9837f2eacc8d37 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118878495) * c1768f65b4eeb9abef3d7797aa6e1c711012bd39 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054603) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9089: [FLINK-13225][table-planner-blink] Introduce type inference for hive functions in blink
flinkbot edited a comment on issue #9089: [FLINK-13225][table-planner-blink] Introduce type inference for hive functions in blink URL: https://github.com/apache/flink/pull/9089#issuecomment-510488226 ## CI report: * fb34a0f4245ddac5872ea77aad07887a6ff12d11 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118890132) * ba44069acdbd82261839605b5d363548dae81522 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054606) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9090: [FLINK-13124] Don't forward exceptions when finishing SourceStreamTask
flinkbot edited a comment on issue #9090: [FLINK-13124] Don't forward exceptions when finishing SourceStreamTask URL: https://github.com/apache/flink/pull/9090#issuecomment-510496617 ## CI report: * 6c229740b17ae4e0df7c8ba9678ada086ae8bf47 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118880603) * 6b32e0614651fa80d5ec14d0d54fe6be219f417c : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054716) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink
flinkbot edited a comment on issue #9067: [FLINK-13069][hive] HiveTableSink should implement OverwritableTableSink URL: https://github.com/apache/flink/pull/9067#issuecomment-510405753 ## CI report: * 0034a70157b871b401cb1f8cd5a223427cf6223a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118885081) * 1f192d01c764dbff8ab884512814cc1a4fd80dba : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054583) * e73d6f0155762b8c4c3fff5c742d2644962329ee : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054722) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8990: [FLINK-13104][metrics] Updated request callback to log warning on failure
flinkbot edited a comment on issue #8990: [FLINK-13104][metrics] Updated request callback to log warning on failure URL: https://github.com/apache/flink/pull/8990#issuecomment-510684087 ## CI report: * f29b2ae71ed7d7dcd823399dc3427f76534b38d1 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/118907907) * 76d4183e3a6e6e348d8349b4ceb385529e8a2c5e : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119054728) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter
flinkbot edited a comment on issue #8621: [FLINK-12682][connectors] StringWriter support custom row delimiter URL: https://github.com/apache/flink/pull/8621#issuecomment-511183092 ## CI report: * e21d12fa2c9b6305d90502ae05a9d574ce712fd1 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119056511) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN.
flinkbot edited a comment on issue #9106: [FLINK-13184][yarn] Support launching task executors with multi-thread on YARN. URL: https://github.com/apache/flink/pull/9106#issuecomment-511183558 ## CI report: * 25fc95f30720209e19bd010cdd517ff5e3c685d8 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119056693) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-13252) Common CachedLookupFunction for All connector
[ https://issues.apache.org/jira/browse/FLINK-13252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chance Li updated FLINK-13252: -- Description: In short, it's a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, the only thing to do is construct this CachedLookupFunction with the real LookupFunction instance, so it can be used by any connector. # CachedLookupFunction sends the result directly if the data has been cached; otherwise it invokes the real LookupFunction to get the data and sends it after caching it. # More cache strategies, such as All, will be added. We should add a new module called flink-connector-common. We can also provide a common async LookupFunction using this pattern instead of duplicating the implementation in every connector. was: In short, it's a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, the only thing to do is construct this CachedLookupFunction with the real LookupFunction instance, so it can be used by any connector. # CachedLookupFunction sends the result directly if the data has been cached; otherwise it invokes the real LookupFunction to get the data and sends it after caching it. # More cache strategies, such as All, will be added step by step. We should add a new module called flink-connector-common. We can also provide a common async LookupFunction using this pattern instead of duplicating the implementation in every connector. > Common CachedLookupFunction for All connector > - > > Key: FLINK-13252 > URL: https://issues.apache.org/jira/browse/FLINK-13252 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common > Reporter: Chance Li > Assignee: Chance Li > Priority: Minor > > In short, it's a decorator pattern: > # A CachedLookupFunction extends TableFunction. > # When the cache feature is needed, the only thing to do is construct this CachedLookupFunction with the real LookupFunction instance, so it can be used by any connector.
> # CachedLookupFunction sends the result directly if the data has been cached; otherwise it invokes the real LookupFunction to get the data and sends it after caching it. > # More cache strategies, such as All, will be added. > We should add a new module called flink-connector-common. > We can also provide a common async LookupFunction using this pattern instead of duplicating the implementation in every connector. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (FLINK-13252) Common CachedLookupFunction for All connector
[ https://issues.apache.org/jira/browse/FLINK-13252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chance Li updated FLINK-13252: -- Description: In short, it's a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, the only thing to do is construct this CachedLookupFunction with the real LookupFunction instance, so it can be used by any connector. # CachedLookupFunction sends the result directly if the data has been cached; otherwise it invokes the real LookupFunction to get the data and sends it after caching it. # More cache strategies, such as All, will be added step by step. We should add a new module called flink-connector-common. We can also provide a common async LookupFunction using this pattern instead of duplicating the implementation in every connector. was: In short, it's a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, the only thing to do is construct this CachedLookupFunction with the real LookupFunction instance, so it can be used by any connector. # CachedLookupFunction sends the result directly if the data has been cached; otherwise it invokes the real LookupFunction to get the data and sends it after caching it. We should add a new module called flink-connector-common. We can also provide a common async LookupFunction using this pattern instead of duplicating the implementation in every connector. > Common CachedLookupFunction for All connector > - > > Key: FLINK-13252 > URL: https://issues.apache.org/jira/browse/FLINK-13252 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common > Reporter: Chance Li > Assignee: Chance Li > Priority: Minor > > In short, it's a decorator pattern: > # A CachedLookupFunction extends TableFunction. > # When the cache feature is needed, the only thing to do is construct this CachedLookupFunction with the real LookupFunction instance, so it can be used by any connector.
> # CachedLookupFunction sends the result directly if the data has been cached; otherwise it invokes the real LookupFunction to get the data and sends it after caching it. > # More cache strategies, such as All, will be added step by step. > We should add a new module called flink-connector-common. > We can also provide a common async LookupFunction using this pattern instead of duplicating the implementation in every connector. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
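The decorator described in this issue can be sketched roughly as follows. This is a hypothetical illustration, not the Flink API: `CachedLookup` and `delegateCalls` are made-up names, and a plain `java.util.function.Function` stands in for the connector's real LookupFunction/TableFunction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the issue's decorator pattern. A plain Function
// stands in for the connector's real LookupFunction; the wrapper adds the
// cache without touching the delegate.
class CachedLookup<K, V> implements Function<K, V> {
    private final Function<K, V> delegate; // the real connector lookup
    private final Map<K, V> cache = new HashMap<>();
    int delegateCalls = 0; // exposed only to make the caching observable

    CachedLookup(Function<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public V apply(K key) {
        // Serve from the cache if present; otherwise invoke the real lookup
        // and remember the result before returning it.
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        delegateCalls++;
        V value = delegate.apply(key);
        cache.put(key, value);
        return value;
    }
}
```

Because the wrapper only needs a lookup to delegate to, any connector's lookup function could be wrapped the same way, and an async variant could follow the same shape.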
[jira] [Created] (FLINK-13253) Deadlock may occur in JDBCUpsertOutputFormat
Jingsong Lee created FLINK-13253: Summary: Deadlock may occur in JDBCUpsertOutputFormat Key: FLINK-13253 URL: https://issues.apache.org/jira/browse/FLINK-13253 Project: Flink Issue Type: Bug Components: Connectors / JDBC Reporter: Jingsong Lee Assignee: Jingsong Lee In close(), it awaits termination of the flush scheduler while holding the lock on the JDBCUpsertOutputFormat instance. The async thread may be waiting for that lock in the flush method, so there might be a deadlock here. First, close() should not await scheduler termination: it has already flushed all data to JDBC, so all we need to do is let the async thread quit. Second, we should take the lock around the closed check in the flusher; this ensures the async thread exits safely. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
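The report above can be sketched with a minimal model. All names here (`SafeFlushingFormat`, `flushCount`) are illustrative, not the real JDBCUpsertOutputFormat: the flusher takes the instance lock around both the closed check and the flush, and close() cancels the scheduled task without awaiting scheduler termination while holding that lock.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified model of the output format after the fix.
class SafeFlushingFormat {
    private final ScheduledExecutorService scheduler =
            Executors.newScheduledThreadPool(1);
    private ScheduledFuture<?> scheduledFuture;
    private boolean closed = false;
    final AtomicInteger flushCount = new AtomicInteger(); // observable stand-in

    void open() {
        // Lock around BOTH the closed check and the flush, so the async
        // thread can never flush concurrently with (or after) close().
        scheduledFuture = scheduler.scheduleWithFixedDelay(() -> {
            synchronized (SafeFlushingFormat.this) {
                if (closed) {
                    return;
                }
                flush();
            }
        }, 10, 10, TimeUnit.MILLISECONDS);
    }

    synchronized void flush() {
        flushCount.incrementAndGet(); // stand-in for the real JDBC batch flush
    }

    synchronized void close() {
        closed = true;
        if (scheduledFuture != null) {
            scheduledFuture.cancel(false);
            scheduler.shutdown();
            // Crucially, no awaitTermination() while holding the lock: the
            // flusher either already holds the lock (and finishes its flush),
            // or will acquire it, see closed == true, and return immediately.
        }
        flush(); // final flush of anything still buffered
    }
}
```

The original hazard was exactly the missing piece above: close() blocked in awaitTermination() while holding the lock that the flusher needed to make progress.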
[GitHub] [flink] wuchong commented on a change in pull request #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat
wuchong commented on a change in pull request #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat URL: https://github.com/apache/flink/pull/9107#discussion_r303274281 ## File path: flink-connectors/flink-jdbc/src/main/java/org/apache/flink/api/java/io/jdbc/JDBCUpsertOutputFormat.java ## @@ -116,13 +116,15 @@ public void open(int taskNumber, int numTasks) throws IOException { this.scheduler = Executors.newScheduledThreadPool( 1, new ExecutorThreadFactory("jdbc-upsert-output-format")); this.scheduledFuture = this.scheduler.scheduleWithFixedDelay(() -> { - if (closed) { - return; - } - try { - flush(); - } catch (Exception e) { - flushException = e; + synchronized (JDBCUpsertOutputFormat.this) { Review comment: Why should we add `synchronized`? `flush()` is already `synchronized`, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] wuchong commented on a change in pull request #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat
wuchong commented on a change in pull request #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat URL: https://github.com/apache/flink/pull/9107#discussion_r303274382 ## File path: flink-connectors/flink-jdbc/src/main/java/org/apache/flink/api/java/io/jdbc/JDBCUpsertOutputFormat.java ## @@ -184,15 +186,6 @@ public synchronized void close() throws IOException { if (this.scheduledFuture != null) { scheduledFuture.cancel(false); this.scheduler.shutdown(); Review comment: Should we change this to `shutdownNow()`? I think the async flusher can quit now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
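To illustrate the reviewer's suggestion with a self-contained toy (not the PR's code): shutdown() only stops new runs from being scheduled, while shutdownNow() additionally interrupts the task currently running, which is what lets a blocked flusher quit immediately.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Demonstrates that shutdownNow() interrupts the currently running task,
// whereas shutdown() would let it keep blocking.
class ShutdownNowDemo {

    static boolean interruptsRunningTask() throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        scheduler.scheduleWithFixedDelay(() -> {
            try {
                Thread.sleep(60_000); // stand-in for a long or blocked flush
            } catch (InterruptedException e) {
                interrupted.set(true); // delivered by shutdownNow()
            }
        }, 0, 1, TimeUnit.MILLISECONDS);

        Thread.sleep(100);       // let the task start and block in sleep()
        scheduler.shutdownNow(); // interrupts the sleeping task
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
        return interrupted.get();
    }
}
```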
[GitHub] [flink] flinkbot commented on issue #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat
flinkbot commented on issue #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat URL: https://github.com/apache/flink/pull/9107#issuecomment-511256671 ## CI report: * 65e1326b09a9d8188380597c4050078cf3646f8c : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119089761) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
flinkbot edited a comment on issue #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client URL: https://github.com/apache/flink/pull/8756#issuecomment-511260167 ## CI report: * e290f6b65758fc7d199277e4345a75335de981b2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119092230) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] JingsongLi opened a new pull request #9108: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat
JingsongLi opened a new pull request #9108: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat URL: https://github.com/apache/flink/pull/9108 ## What is the purpose of the change In close(), it awaits termination of the flush scheduler while holding the lock on the JDBCUpsertOutputFormat instance. The async thread may be waiting for that lock in the flush method, so there might be a deadlock here. First, close() should not await scheduler termination: it has already flushed all data to JDBC, so all we need to do is let the async thread quit. Second, we should take the lock around the closed check in the flusher; this ensures the async thread exits safely. ## Verifying this change Unit tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
flinkbot edited a comment on issue #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client URL: https://github.com/apache/flink/pull/8756#issuecomment-511260167 ## CI report: * e290f6b65758fc7d199277e4345a75335de981b2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119094227) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-13252) Common CachedLookupFunction for All connector
[ https://issues.apache.org/jira/browse/FLINK-13252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chance Li updated FLINK-13252: -- Docs Text: coming soon. lets me know what's your opinions. Thanks. (was: coming soon. lets me know what's your opinion. Thanks.) > Common CachedLookupFunction for All connector > - > > Key: FLINK-13252 > URL: https://issues.apache.org/jira/browse/FLINK-13252 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Chance Li >Assignee: Chance Li >Priority: Minor > > In short, it is a decorator pattern: > # A CachedLookupFunction extends TableFunction. > # When the cache feature is needed, simply construct a CachedLookupFunction with the real LookupFunction instance; it can then be used with any connector's LookupFunction. > # CachedLookupFunction emits the result directly if the data has been cached; otherwise it invokes the real LookupFunction, caches the result, and then emits it. > A new module called flink-connector-common should be added. > The same pattern could also provide a common async LookupFunction, avoiding many duplicated implementations. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
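A minimal sketch of the decorator described above, with java.util.function.Function standing in for the real LookupFunction being wrapped. All names here are illustrative, not Flink's actual API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Decorator-pattern sketch: the cache wraps an arbitrary "real" lookup,
// so any connector's lookup function could be reused unchanged.
public class CachedLookupFunctionSketch {

    private final Function<String, String> realLookup; // the wrapped real lookup
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private int lookups = 0; // how often the real lookup was actually invoked

    public CachedLookupFunctionSketch(Function<String, String> realLookup) {
        this.realLookup = realLookup;
    }

    // Emit from the cache when possible; otherwise delegate to the real
    // lookup, cache its result, and emit it.
    public String eval(String key) {
        return cache.computeIfAbsent(key, k -> {
            lookups++;
            return realLookup.apply(k);
        });
    }

    public int lookupCount() {
        return lookups;
    }
}
```

Because the decorator only needs the wrapped function's call interface, adding caching to a new connector is just a constructor call, which is the point of the proposal.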
[jira] [Updated] (FLINK-13252) Common CachedLookupFunction for All connector
[ https://issues.apache.org/jira/browse/FLINK-13252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chance Li updated FLINK-13252: -- Description: In short, it is a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, simply construct a CachedLookupFunction with the real LookupFunction instance; it can then be used with any connector's LookupFunction. # CachedLookupFunction emits the result directly if the data has been cached; otherwise it invokes the real LookupFunction, caches the result, and then emits it. A new module called flink-connector-common should be added. The same pattern could also provide a common async LookupFunction, avoiding many duplicated implementations. was: In short, it is a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, simply construct a CachedLookupFunction with the real LookupFunction instance; it can then be used with any connector's LookupFunction. # CachedLookupFunction emits the result directly if the data has been cached; otherwise it invokes the real LookupFunction, caches the result, and then emits it. A new module called flink-connector-common should be added. The same pattern could also provide a common async LookupFunction. > Common CachedLookupFunction for All connector > - > > Key: FLINK-13252 > URL: https://issues.apache.org/jira/browse/FLINK-13252 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Chance Li >Assignee: Chance Li >Priority: Minor > > In short, it is a decorator pattern: > # A CachedLookupFunction extends TableFunction. > # When the cache feature is needed, simply construct a CachedLookupFunction with the real LookupFunction instance; it can then be used with any connector's LookupFunction. > # CachedLookupFunction emits the result directly if the data has been cached; otherwise it invokes the real LookupFunction, caches the result, and then emits it. > A new module called flink-connector-common should be added. > The same pattern could also provide a common async LookupFunction, avoiding many duplicated implementations. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] flinkbot commented on issue #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
flinkbot commented on issue #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client URL: https://github.com/apache/flink/pull/8756#issuecomment-511260167 ## CI report: * e290f6b65758fc7d199277e4345a75335de981b2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119091103) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-13132) Allow ClusterEntrypoints use user main method to generate job graph
[ https://issues.apache.org/jira/browse/FLINK-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884819#comment-16884819 ] Zhenqiu Huang commented on FLINK-13132: --- [~fly_in_gis] Our concern about the cost is mainly the pipeline downtime. In our current design, downloads always come from the nearest storage, such as HDFS on premise and S3/GCS in the cloud. But we think that is not enough to guarantee our SLA in the worst case. When a job is started through the service, the end-to-end latency includes downloading jars, starting another process, starting a session client, uploading remote resources, starting the job cluster, submitting the job graph to start the job, etc. It usually takes 1-2 minutes at low QPS. If a request burst (1000 requests) arrives due to some unexpected issue, some of the redeployment requests become much slower because of resource competition in each stage of job submission. The optimization we want is to skip some of the steps (such as uploading remote resources and job graph generation) on the service side, and to move job-graph compilation into the ClusterEntrypoints. That way the jar download can be skipped, and job graph generation can be parallelized for each job right after the cluster starts, so that even in the worst case we can guarantee our downtime SLA. > Allow ClusterEntrypoints use user main method to generate job graph > --- > > Key: FLINK-13132 > URL: https://issues.apache.org/jira/browse/FLINK-13132 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Affects Versions: 1.8.0, 1.8.1 >Reporter: Zhenqiu Huang >Assignee: Zhenqiu Huang >Priority: Minor > > We are building a service that can transparently deploy a job to different > cluster management systems, such as YARN and another internal system. It is > very costly to download the jar and generate the JobGraph on the client side. > Thus, I want to propose an improvement to make the YARN entrypoints > configurable to use either FileJobGraphRetriever or ClassPathJobGraphRetriever. > It is actually a long-standing TODO in AbstractYarnClusterDescriptor at line 834. > https://github.com/apache/flink/blob/21468e0050dc5f97de5cfe39885e0d3fd648e399/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L834 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-13132) Allow ClusterEntrypoints use user main method to generate job graph
[ https://issues.apache.org/jira/browse/FLINK-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884822#comment-16884822 ] Zhenqiu Huang commented on FLINK-13132: --- [~maguowei] For your questions, I think you are mainly asking how we guarantee losslessness when moving a job between clusters. 1) We only move a job between clusters in the same region (network latency < 2ms). Thus, most of the time we don't change the data source, so a job can always read from the last committed offset (if there is no checkpoint). For jobs with state, it is a different story. Our storage team built a blob management system on top of internal HDFS, S3 and GCS. It provides data placement policies and a data replication service for us. For example, we define a job with its state stored in a blob folder, and the blob is configured to be replicated to GCS. When we want to restart a stateful on-premise job in a cloud cluster from the latest checkpoint, the checkpoint has already been copied to the cloud within the same namespace. 2) That is a good suggestion. I have been looking into how to utilize the existing HA setting for storing the job graph. I think I can store the job graph in ZooKeeper in HA mode on the first launch, so that when the application master fails and recovers, the job graph can be reused. Let's discuss the details on the coming PR. > Allow ClusterEntrypoints use user main method to generate job graph > --- > > Key: FLINK-13132 > URL: https://issues.apache.org/jira/browse/FLINK-13132 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Affects Versions: 1.8.0, 1.8.1 >Reporter: Zhenqiu Huang >Assignee: Zhenqiu Huang >Priority: Minor > > We are building a service that can transparently deploy a job to different > cluster management systems, such as YARN and another internal system. It is > very costly to download the jar and generate the JobGraph on the client side. > Thus, I want to propose an improvement to make the YARN entrypoints > configurable to use either FileJobGraphRetriever or ClassPathJobGraphRetriever. > It is actually a long-standing TODO in AbstractYarnClusterDescriptor at line 834. > https://github.com/apache/flink/blob/21468e0050dc5f97de5cfe39885e0d3fd648e399/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L834 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] flinkbot edited a comment on issue #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat
flinkbot edited a comment on issue #9107: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat URL: https://github.com/apache/flink/pull/9107#issuecomment-511256671 ## CI report: * 65e1326b09a9d8188380597c4050078cf3646f8c : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/119089761) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-13252) Common CachedLookupFunction for All connector
Chance Li created FLINK-13252: - Summary: Common CachedLookupFunction for All connector Key: FLINK-13252 URL: https://issues.apache.org/jira/browse/FLINK-13252 Project: Flink Issue Type: New Feature Components: Connectors / Common Reporter: Chance Li Assignee: Chance Li In short, it is a decorator pattern: # A CachedLookupFunction extends TableFunction. # When the cache feature is needed, simply construct a CachedLookupFunction with the real LookupFunction instance; it can then be used with any connector's LookupFunction. # CachedLookupFunction emits the result directly if the data has been cached; otherwise it invokes the real LookupFunction, caches the result, and then emits it. A new module called flink-connector-common should be added. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] wuchong merged pull request #9091: [FLINK-13229][table-planner-blink] ExpressionReducer with udf bug in blink
wuchong merged pull request #9091: [FLINK-13229][table-planner-blink] ExpressionReducer with udf bug in blink URL: https://github.com/apache/flink/pull/9091 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-13253) Deadlock may occur in JDBCUpsertOutputFormat
[ https://issues.apache.org/jira/browse/FLINK-13253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-13253: Fix Version/s: 1.10.0 1.9.0 > Deadlock may occur in JDBCUpsertOutputFormat > > > Key: FLINK-13253 > URL: https://issues.apache.org/jira/browse/FLINK-13253 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.9.0 >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0, 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h > > In close(), the format awaits termination of the flush scheduler while > holding the lock on the JDBCUpsertOutputFormat instance; the async flush > thread may be waiting for that same lock inside the flush method, so a > deadlock can occur. > First, close() should not await scheduler termination: all data has already > been flushed to JDBC, so it only needs to let the async thread quit. > Second, the lock should be taken around the closed check in the flusher, so > the async thread can exit safely. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] flinkbot commented on issue #9108: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat
flinkbot commented on issue #9108: [FLINK-13253][jdbc] Deadlock may occur in JDBCUpsertOutputFormat URL: https://github.com/apache/flink/pull/9108#issuecomment-511264067 ## CI report: * 00665c416016d92526255cb3a9d9fc37f6c6a8d2 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/119092795) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-12926) Main thread checking in some tests fails
[ https://issues.apache.org/jira/browse/FLINK-12926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884811#comment-16884811 ] Zhu Zhu commented on FLINK-12926: - Yes. For issue 2 listed above, this fix can help find the currently violating cases and avoid future misuse of ComponentMainThreadExecutorServiceAdapter (TestingComponentMainThreadExecutorServiceAdapter was merged into ComponentMainThreadExecutorServiceAdapter in a later PR). > Main thread checking in some tests fails > > > Key: FLINK-12926 > URL: https://issues.apache.org/jira/browse/FLINK-12926 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Tests >Affects Versions: 1.9.0 >Reporter: Zhu Zhu >Priority: Major > Attachments: Execution#deploy.jpg, mainThreadCheckFailure.log > > > Currently, all JM-side job-changing actions are expected to be taken in the > JobMaster main thread. > In current Flink tests, many cases tend to use the test main thread as the JM > main thread. This can lead to 2 issues: > 1. TestingComponentMainThreadExecutorServiceAdapter is a direct executor, so > if it is invoked from any other thread, it will break the main thread > checking and fail the submitted action (as in the attached log > [^mainThreadCheckFailure.log]) > 2. The test main thread does not support other actions queued in its > executor, as the test will end once the current test thread action (the > currently running test body) is done > > In my observation, most cases that start > ExecutionGraph.scheduleForExecution() will encounter this issue. Cases > include ExecutionGraphRestartTest, FailoverRegionTest, > ConcurrentFailoverStrategyExecutionGraphTest, GlobalModVersionTest, > ExecutionGraphDeploymentTest, etc. > > One solution in my mind is to create a ScheduledExecutorService for those > tests, use it as the main thread, and run the test body in this thread. > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
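The solution proposed in the issue, a dedicated ScheduledExecutorService acting as the JM main thread with main-thread-confined actions submitted to it, might be sketched like this. This is a simplified illustration, not Flink's actual test utilities.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// A dedicated single-thread executor plays the role of the JobMaster main
// thread. An action confined to that thread passes its main-thread check
// only when submitted through the executor, which is exactly what the
// direct-executor misuse described above breaks.
public class MainThreadCheckSketch {

    public static boolean actionRunsOnMainThread() {
        ScheduledExecutorService mainThread = Executors.newSingleThreadScheduledExecutor();
        try {
            // Capture the executor's single worker thread.
            final Thread[] holder = new Thread[1];
            mainThread.submit(() -> { holder[0] = Thread.currentThread(); }).get();

            // The "main thread check": the action verifies it is executing
            // on the designated main thread, as JobMaster actions do.
            return mainThread.submit(() -> Thread.currentThread() == holder[0]).get();
        } catch (Exception e) {
            return false;
        } finally {
            mainThread.shutdownNow();
        }
    }
}
```

Running the test body itself on this executor also fixes issue 2: actions queued behind the test body still execute before the executor is shut down.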
[jira] [Resolved] (FLINK-13236) Fix bug and improve performance in TopNBuffer
[ https://issues.apache.org/jira/browse/FLINK-13236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu resolved FLINK-13236. - Resolution: Fixed Fixed in 1.9.0: a15834a83d9caf100036df385ff041b0ac9a29be 1.10.0: 6d7c1bd94ae817de20f95328c76eda719b896c74 > Fix bug and improve performance in TopNBuffer > - > > Key: FLINK-13236 > URL: https://issues.apache.org/jira/browse/FLINK-13236 > Project: Flink > Issue Type: Improvement >Reporter: Caizhi Weng >Assignee: Caizhi Weng >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0, 1.10.0 > > Time Spent: 20m > Remaining Estimate: 0h > > In {{TopNBuffer}} we have the following method: > {code:java} > /** > * Puts a record list into the buffer under the sortKey. > * Note: if the buffer already contains sortKey, putAll will overwrite the > previous value. > * > * @param sortKey sort key with which the specified values are to be > associated > * @param values record lists to be associated with the specified key > */ > void putAll(BaseRow sortKey, Collection<BaseRow> values) { > treeMap.put(sortKey, values); > currentTopNum += values.size(); > } > {code} > When {{sortKey}} already exists in {{treeMap}}, {{currentTopNum}} should > first be decreased by the old {{values.size()}} stored in {{treeMap}} before > the new size is added (it cannot simply be added). As currently only > {{AppendOnlyTopNFunction}} uses this method in its init procedure, this bug > is not triggered. > {code:java} > /** > * Gets the record whose rank is the given value. > * > * @param rank rank value to search > * @return the record whose rank is the given value > */ > BaseRow getElement(int rank) { > int curRank = 0; > Iterator<Map.Entry<BaseRow, Collection<BaseRow>>> iter = > treeMap.entrySet().iterator(); > while (iter.hasNext()) { > Map.Entry<BaseRow, Collection<BaseRow>> entry = iter.next(); > Collection<BaseRow> list = entry.getValue(); > Iterator<BaseRow> listIter = list.iterator(); > while (listIter.hasNext()) { > BaseRow elem = listIter.next(); > curRank += 1; > if (curRank == rank) { > return elem; > } > } > } > return null; > } > {code} > We can skip the inner loop by advancing {{curRank}} by {{list.size()}} for > each list that cannot contain the requested rank. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
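The two fixes discussed above can be sketched with plain Java collections, using String in place of Flink's BaseRow. This is a simplified illustration, not the actual TopNBuffer code.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

// Sketch of the corrected putAll (counter stays consistent on overwrite)
// and the optimized getElement (skips whole lists by size).
public class TopNBufferSketch {

    private final TreeMap<String, Collection<String>> treeMap = new TreeMap<>();
    private int currentTopNum = 0;

    // Fix 1: subtract the size of any previous list under the same sortKey
    // before adding the new one, so currentTopNum stays consistent.
    void putAll(String sortKey, Collection<String> values) {
        Collection<String> previous = treeMap.put(sortKey, values);
        if (previous != null) {
            currentTopNum -= previous.size();
        }
        currentTopNum += values.size();
    }

    // Fix 2: advance curRank by list.size() for lists that cannot contain
    // the requested rank, entering the inner loop only for the one that does.
    String getElement(int rank) {
        int curRank = 0;
        for (Map.Entry<String, Collection<String>> entry : treeMap.entrySet()) {
            Collection<String> list = entry.getValue();
            if (curRank + list.size() < rank) {
                curRank += list.size();
                continue;
            }
            for (String elem : list) {
                curRank += 1;
                if (curRank == rank) {
                    return elem;
                }
            }
        }
        return null;
    }

    int size() {
        return currentTopNum;
    }
}
```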
[jira] [Commented] (FLINK-10806) Support multiple consuming offsets when discovering a new topic
[ https://issues.apache.org/jira/browse/FLINK-10806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884836#comment-16884836 ] Jiayi Liao commented on FLINK-10806: [~becket_qin] Hi Jiangjie, thank you for your attention to this :). This concerns the addDiscoveredPartitions function in AbstractFetcher.java. We consume a newly discovered KafkaTopicPartition from the earliest offset, which may not always be what we want. Suppose we add a new topic to the topic list of FlinkKafkaConsumer08, which is the source of the application. The topic will be found by the PartitionDiscoverer and consumed from the earliest offset, which is bad if the topic has been in production for a long time (meaning a large amount of historical data). I'm not sure whether this is a common case or not; I'd be glad to hear your opinions. > Support multiple consuming offsets when discovering a new topic > --- > > Key: FLINK-10806 > URL: https://issues.apache.org/jira/browse/FLINK-10806 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kafka >Affects Versions: 1.6.2 >Reporter: Jiayi Liao >Assignee: Jiayi Liao >Priority: Major > > In KafkaConsumerBase, we discover the TopicPartitions and compare them with > the restored state. That is reasonable when a topic's partitions are scaled. > However, if we add a new topic that has a lot of data and restore the Flink > program, the data of the new topic will be consumed from the start, which may > not be what we want. I think this should be an option for developers. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
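The proposed option might look roughly like this; none of these names exist in Flink's actual consumer API, they are purely illustrative. Topics already present in the restored state resume from their checkpointed offsets, while newly discovered topics use a developer-configured start position instead of always the earliest offset.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the option proposed in this issue: a configurable
// start position for partitions of topics that were not in the restored state.
public class NewTopicStartupSketch {

    static Map<String, String> assignStartPositions(
            Set<String> discoveredTopics,
            Set<String> restoredTopics,
            String newTopicPosition) {
        Map<String, String> positions = new HashMap<>();
        for (String topic : discoveredTopics) {
            // Restored topics resume from their checkpointed offsets;
            // only genuinely new topics use the configured start position.
            positions.put(topic,
                    restoredTopics.contains(topic) ? "checkpointed-offset" : newTopicPosition);
        }
        return positions;
    }
}
```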