[jira] [Commented] (FLINK-9272) DataDog API "counter" metric type is deprecated
[ https://issues.apache.org/jira/browse/FLINK-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457827#comment-16457827 ] Elias Levy commented on FLINK-9272: --- I tried to find out before opening the issue, but I found no information, other than a notice in the docs saying {{counter}} was deprecated and to use {{count}}, and noticed that the API docs no longer list {{counter}}. > DataDog API "counter" metric type is deprecated > > > Key: FLINK-9272 > URL: https://issues.apache.org/jira/browse/FLINK-9272 > Project: Flink > Issue Type: Improvement > Components: Metrics >Affects Versions: 1.4.2 >Reporter: Elias Levy >Priority: Major > > It appears to have been replaced by the "count" metric type. > https://docs.datadoghq.com/developers/metrics/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
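If the reporter only needs to change the type string it emits, the migration is mechanical. A minimal sketch — the class and method names below are illustrative, not taken from Flink's actual `flink-metrics-datadog` code:

```java
// Illustrative sketch: maps the deprecated DataDog series type "counter"
// to its replacement "count"; all other type strings pass through.
public class DatadogTypeMigration {

    static String migrateType(String type) {
        return "counter".equals(type) ? "count" : type;
    }

    public static void main(String[] args) {
        System.out.println(migrateType("counter")); // count
        System.out.println(migrateType("gauge"));   // gauge
    }
}
```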
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457821#comment-16457821 ] ASF GitHub Bot commented on FLINK-8690: --- Github user suez1224 commented on a diff in the pull request: https://github.com/apache/flink/pull/5940#discussion_r184866337 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/CommonAggregate.scala --- @@ -49,11 +49,16 @@ trait CommonAggregate { val aggs = namedAggregates.map(_.getKey) val aggStrings = aggs.map( a => s"${a.getAggregation}(${ - if (a.getArgList.size() > 0) { + val prefix = if (a.isDistinct) { --- End diff -- I think one line should be fine here, slightly more compact IMO. val prefix = if (a.isDistinct) "DISTINCT " else "" > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
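The reviewer's compact form boils down to a conditional `DISTINCT ` prefix on the rendered aggregate string. Mirrored here in Java for a quick check, with a plain boolean standing in for Calcite's `isDistinct` flag — the helper is illustrative only, not the actual `CommonAggregate` code:

```java
// Illustrative only: renders an aggregate-call string with an optional
// DISTINCT prefix, mirroring the suggested Scala one-liner
//   val prefix = if (a.isDistinct) "DISTINCT " else ""
public class AggStringSketch {

    static String aggString(String aggregation, boolean distinct, String args) {
        String prefix = distinct ? "DISTINCT " : "";
        return aggregation + "(" + prefix + args + ")";
    }

    public static void main(String[] args) {
        System.out.println(aggString("COUNT", true, "user_id")); // COUNT(DISTINCT user_id)
        System.out.println(aggString("SUM", false, "amount"));   // SUM(amount)
    }
}
```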
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457820#comment-16457820 ] ASF GitHub Bot commented on FLINK-8690: --- Github user suez1224 commented on a diff in the pull request: https://github.com/apache/flink/pull/5940#discussion_r184866214 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/logical/FlinkLogicalAggregate.scala --- @@ -82,7 +82,7 @@ private class FlinkLogicalAggregateConverter case _ => true } -!agg.containsDistinctCall() && supported +supported --- End diff -- I don't think we need this extra local variable here after your change.
[jira] [Updated] (FLINK-9266) Upgrade AWS Kinesis Client version to 1.9.0 to reduce Kinesis describe streams calls
[ https://issues.apache.org/jira/browse/FLINK-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Moser updated FLINK-9266: Priority: Minor (was: Trivial) > Upgrade AWS Kinesis Client version to 1.9.0 to reduce Kinesis describe > streams calls > > > Key: FLINK-9266 > URL: https://issues.apache.org/jira/browse/FLINK-9266 > Project: Flink > Issue Type: Improvement > Components: Kinesis Connector >Affects Versions: 1.4.2 >Reporter: Thomas Moser >Priority: Minor > Labels: pull-request-available, usability > > Versions of the AWS Kinesis Client before 1.9.0 can cause a large number of > Kinesis Describe Stream events which can result in other AWS services such as > AWS CloudFormation failing to deploy any stacks with Kinesis streams in them. > For accounts that use a large number of Kinesis streams and Flink clusters > this can cripple the ability to deploy anything in the account. Bumping the > version of the KCL to 1.9.0 changes to a different underlying call that has a > much larger AWS account limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
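If the fix is a plain dependency bump, the edit in the Kinesis connector's `pom.xml` would look roughly like this — whether the actual `flink-connector-kinesis` pom sets the version inline or through a property is not shown in this thread, so the fragment is a sketch:

```xml
<!-- Bump the Amazon Kinesis Client; 1.9.0 reportedly switches shard
     discovery to a call with a much higher per-account rate limit than
     DescribeStream, which is what crowds out CloudFormation deploys. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>amazon-kinesis-client</artifactId>
  <version>1.9.0</version>
</dependency>
```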
[jira] [Updated] (FLINK-9266) Upgrade AWS Kinesis Client version to 1.9.0 to reduce Kinesis describe streams calls
[ https://issues.apache.org/jira/browse/FLINK-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Moser updated FLINK-9266: Affects Version/s: 1.4.2
[jira] [Updated] (FLINK-9266) Upgrade AWS Kinesis Client version to 1.9.0 to reduce Kinesis describe streams calls
[ https://issues.apache.org/jira/browse/FLINK-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Moser updated FLINK-9266: Labels: pull-request-available usability (was: pull-request-available)
[jira] [Updated] (FLINK-9266) Upgrade AWS Kinesis Client version to 1.9.0 to reduce Kinesis describe streams calls
[ https://issues.apache.org/jira/browse/FLINK-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Moser updated FLINK-9266: Labels: pull-request-available (was: )
[jira] [Commented] (FLINK-9272) DataDog API "counter" metric type is deprecated
[ https://issues.apache.org/jira/browse/FLINK-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457778#comment-16457778 ] Chesnay Schepler commented on FLINK-9272: - Do you know what the actual difference between {{Counter}} and {{Count}} is?
[jira] [Commented] (FLINK-9181) Add SQL Client documentation page
[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457733#comment-16457733 ] ASF GitHub Bot commented on FLINK-9181: --- Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860963 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink’s Table & SQL API allows to declare queries in the SQL language. A SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early developement phase. Even though the application is not production-ready yet, it can be a quite useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to setup and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster where table programs can be submitted to. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. + +### Running SQL Queries + +Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it: + +{% highlight sql %} +SELECT 'Hello World' +{% endhighlight %} + +This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key. + +The CLI supports **two modes** for maintaining and visualizing results. + +The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI: + +{% highlight text %} +SET execution.result-mode=table +{% endhighlight %} + +The *changelog mode* does not materialize results and visualizes the result stream that is produced by a continuous query [LINK] consisting of insertions (`+`) and retractions (`-`). 
+ +{% highlight text %} +SET execution.result-mode=changelog +{% endhighlight %} + +You can use the following query to see both result modes in action: + +{% highlight sql %} +SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name +{% endhighlight %} + +This query performs a bounded word count example. The following sections explain how to read from table sources and configure other table program properties. --- End diff -- can we replace "following sections" with the actual link, docs might be updated later and the "following" statement might no longer be true > Add SQL Client documentation page > - > > Key: FLINK-9181 > URL: https://issues.apache.org/jira/browse/FLINK-9181 > Project:
[jira] [Commented] (FLINK-9181) Add SQL Client documentation page
[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457731#comment-16457731 ] ASF GitHub Bot commented on FLINK-9181: --- Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860868 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,538 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink’s Table & SQL API allows to declare queries in the SQL language. A SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early developement phase. Even though the application is not production-ready yet, it can be a quite useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to setup and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster where table programs can be submitted to. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + --- End diff -- correct me if I'm wrong, seems like @twalthr is referring to the embedded mode of the SQL-client. It might be useful to have another section "SQL Client Mode" -> "Embedded" / "Gateway" preceding this section, and put "Gateway" mode linked to "Limitation & Future" section > Add SQL Client documentation page > - > > Key: FLINK-9181 > URL: https://issues.apache.org/jira/browse/FLINK-9181 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > > The current implementation of the SQL Client needs > documentation for the upcoming 1.5 release. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9181) Add SQL Client documentation page
[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457730#comment-16457730 ] ASF GitHub Bot commented on FLINK-9181: --- Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860810 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink’s Table & SQL API allows to declare queries in the SQL language. A SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early developement phase. Even though the application is not production-ready yet, it can be a quite useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to setup and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster where table programs can be submitted to. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: --- End diff -- `see the [Cluster & Deployment](...) section.`
[GitHub] flink pull request #5913: [FLINK-9181] [docs] [sql-client] Add documentation...
Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184861354 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows to declare queries in the SQL language. A SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early developement phase. Even though the application is not production-ready yet, it can be a quite useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to setup and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster where table programs can be submitted to. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. + +### Running SQL Queries + +Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it: + +{% highlight sql %} +SELECT 'Hello World' +{% endhighlight %} + +This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key. + +The CLI supports **two modes** for maintaining and visualizing results. + +The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI: + +{% highlight text %} +SET execution.result-mode=table +{% endhighlight %} + +The *changelog mode* does not materialize results and visualizes the result stream that is produced by a continuous query [LINK] consisting of insertions (`+`) and retractions (`-`). 
+ +{% highlight text %} +SET execution.result-mode=changelog +{% endhighlight %} + +You can use the following query to see both result modes in action: + +{% highlight sql %} +SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name +{% endhighlight %} + +This query performs a bounded word count example. The following sections explain how to read from table sources and configure other table program properties. + +{% top %} + +Configuration +- + +The SQL Client can be started with the following optional CLI commands. They are discussed in detail in the subsequent paragraphs. + +{% highlight text %} +./bin/sql-client.sh embedded --help + +Mode "embedded" submits Flink jobs from the local machine. + + Syntax: embedded [OPTIONS] + "embedded" mode options: + -d,--defaults The environment properties with which + every new session is initialized. +
[GitHub] flink pull request #5913: [FLINK-9181] [docs] [sql-client] Add documentation...
Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184861107 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows to declare queries in the SQL language. A SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early developement phase. Even though the application is not production-ready yet, it can be a quite useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to setup and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster where table programs can be submitted to. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. + +### Running SQL Queries + +Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it: + +{% highlight sql %} +SELECT 'Hello World' +{% endhighlight %} + +This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key. + +The CLI supports **two modes** for maintaining and visualizing results. + +The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI: + +{% highlight text %} +SET execution.result-mode=table +{% endhighlight %} + +The *changelog mode* does not materialize results and visualizes the result stream that is produced by a continuous query [LINK] consisting of insertions (`+`) and retractions (`-`). 
+ +{% highlight text %} +SET execution.result-mode=changelog +{% endhighlight %} + +You can use the following query to see both result modes in action: + +{% highlight sql %} +SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name --- End diff -- seems like the `result.mode` can only be set through the environment config file. executing it in CLI doesn't take effect to me. ---
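For illustration only, here is a small Python simulation (not Flink code; the function name is hypothetical) of how the two result modes treat the word-count query quoted in the diff above: changelog mode emits a retraction (`-`) followed by an updated insertion (`+`) whenever a group's count changes, while table mode materializes only the final counts.

```python
from collections import Counter

def changelog(names):
    """Simulate the insert/retract stream a continuous COUNT(*) query produces."""
    counts, stream = Counter(), []
    for name in names:
        if counts[name] > 0:
            # Retract the previously emitted result row for this key.
            stream.append(('-', name, counts[name]))
        counts[name] += 1
        # Emit the updated result row.
        stream.append(('+', name, counts[name]))
    return stream, dict(counts)

stream, table = changelog(['Bob', 'Alice', 'Greg', 'Bob'])
# Changelog mode shows every change; table mode shows only the final state.
print(stream)  # [('+', 'Bob', 1), ('+', 'Alice', 1), ('+', 'Greg', 1), ('-', 'Bob', 1), ('+', 'Bob', 2)]
print(table)   # {'Bob': 2, 'Alice': 1, 'Greg': 1}
```

Note how the second `'Bob'` first retracts `('Bob', 1)` before inserting `('Bob', 2)`; in table mode the CLI applies these updates in memory and only shows the materialized rows.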
[jira] [Commented] (FLINK-9181) Add SQL Client documentation page
[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457735#comment-16457735 ] ASF GitHub Bot commented on FLINK-9181: --- Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184861107 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows declaring queries in the SQL language, a SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to set up and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. + +### Running SQL Queries + +Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it: + +{% highlight sql %} +SELECT 'Hello World' +{% endhighlight %} + +This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key. + +The CLI supports **two modes** for maintaining and visualizing results. + +The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI: + +{% highlight text %} +SET execution.result-mode=table +{% endhighlight %} + +The *changelog mode* does not materialize results and visualizes the result stream that is produced by a continuous query [LINK] consisting of insertions (`+`) and retractions (`-`). 
+ +{% highlight text %} +SET execution.result-mode=changelog +{% endhighlight %} + +You can use the following query to see both result modes in action: + +{% highlight sql %} +SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name --- End diff -- seems like the `result.mode` can only be set through the environment config file. executing it in CLI doesn't take effect to me. > Add SQL Client documentation page > - > > Key: FLINK-9181 > URL: https://issues.apache.org/jira/browse/FLINK-9181 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > > The current implementation
[GitHub] flink pull request #5913: [FLINK-9181] [docs] [sql-client] Add documentation...
Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860963 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows declaring queries in the SQL language, a SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to set up and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. + +### Running SQL Queries + +Once the CLI has been started, you can use the `HELP` command to list all available SQL statements. For validating your setup and cluster connection, you can enter your first SQL query and press the `Enter` key to execute it: + +{% highlight sql %} +SELECT 'Hello World' +{% endhighlight %} + +This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key. + +The CLI supports **two modes** for maintaining and visualizing results. + +The *table mode* materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI: + +{% highlight text %} +SET execution.result-mode=table +{% endhighlight %} + +The *changelog mode* does not materialize results and visualizes the result stream that is produced by a continuous query [LINK] consisting of insertions (`+`) and retractions (`-`). 
+ +{% highlight text %} +SET execution.result-mode=changelog +{% endhighlight %} + +You can use the following query to see both result modes in action: + +{% highlight sql %} +SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name +{% endhighlight %} + +This query performs a bounded word count example. The following sections explain how to read from table sources and configure other table program properties. --- End diff -- can we replace "following sections" with the actual link, docs might be updated later and the "following" statement might no longer be true ---
[jira] [Commented] (FLINK-9181) Add SQL Client documentation page
[ https://issues.apache.org/jira/browse/FLINK-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457732#comment-16457732 ] ASF GitHub Bot commented on FLINK-9181: --- Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860924 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,538 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows declaring queries in the SQL language, a SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to set up and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. --- End diff -- also `enviroment-files` is technically 4 parts away instead of the next part :-P > Add SQL Client documentation page > - > > Key: FLINK-9181 > URL: https://issues.apache.org/jira/browse/FLINK-9181 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > > The current implementation of the SQL Client implementation needs > documentation for the upcoming 1.5 release. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5913: [FLINK-9181] [docs] [sql-client] Add documentation...
Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860924 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,538 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows declaring queries in the SQL language, a SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to set up and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + +{% highlight bash %} +./bin/sql-client.sh embedded +{% endhighlight %} + +This command starts the submission service and CLI embedded in one application process. By default, the SQL Client will read its configuration from the environment file located in `./conf/sql-client-defaults.yaml`. See the [next part](sqlClient.html#environment-files) for more information about the structure of environment files. --- End diff -- also `enviroment-files` is technically 4 parts away instead of the next part :-P ---
[GitHub] flink pull request #5913: [FLINK-9181] [docs] [sql-client] Add documentation...
Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860868 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,538 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows declaring queries in the SQL language, a SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to set up and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). 
If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: + +{% highlight bash %} +./bin/start-cluster.sh +{% endhighlight %} + +### Starting the SQL Client CLI + +The SQL Client scripts are also located in the binary directory of Flink. You can start the CLI by calling: + --- End diff -- correct me if i were wrong, seems like @twalthr is referring to the embedded mode of the SQL-client. It might be useful to have another section "SQL Client Mode" -> "Embedded" / "Gateway" preceding this section, and put "Gateway" mode linked to "Limitation & Future" section ---
[GitHub] flink pull request #5913: [FLINK-9181] [docs] [sql-client] Add documentation...
Github user walterddr commented on a diff in the pull request: https://github.com/apache/flink/pull/5913#discussion_r184860810 --- Diff: docs/dev/table/sqlClient.md --- @@ -0,0 +1,539 @@ +--- +title: "SQL Client" +nav-parent_id: tableapi +nav-pos: 100 +is_beta: true +--- + + + +Although Flink's Table & SQL API allows declaring queries in the SQL language, a SQL query needs to be embedded within a table program that is written either in Java or Scala. The table program needs to be packaged with a build tool before it can be submitted to a cluster. This limits the usage of Flink to mostly Java/Scala programmers. + +The *SQL Client* aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of code. The *SQL Client CLI* allows for retrieving and visualizing real-time results from the running distributed application on the command line. + + + +**Note:** The SQL Client is in an early development phase. Even though the application is not production-ready yet, it can be quite a useful tool for prototyping and playing around with Flink SQL. In the future, the community plans to extend its functionality by providing a REST-based [SQL Client Gateway](sqlClient.html#limitations--future). + +* This will be replaced by the TOC +{:toc} + +Getting Started +--- + +This section describes how to set up and run your first Flink SQL program from the command-line. The SQL Client is bundled in the regular Flink distribution and thus runnable out of the box. + +The SQL Client requires a running Flink cluster to which table programs can be submitted. For more information about setting up a Flink cluster see the [deployment part of this documentation]({{ site.baseurl }}/ops/deployment/cluster_setup.html). If you simply want to try out the SQL Client, you can also start a local cluster with one worker using the following command: --- End diff -- `see the [Cluster & Deployment](...) section.` ---
[jira] [Commented] (FLINK-9270) Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure
[ https://issues.apache.org/jira/browse/FLINK-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457695#comment-16457695 ] ASF GitHub Bot commented on FLINK-9270: --- Github user bowenli86 closed the pull request at: https://github.com/apache/flink/pull/5937 > Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of > @RetryOnFailure > > > Key: FLINK-9270 > URL: https://issues.apache.org/jira/browse/FLINK-9270 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.6.0 > > > Upgrade RocksDB to 5.11.3 to take latest bug fixes > Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be > run concurrently if there's only {{try}} clause without a {{catch}} > following. For example, sometimes, > {{RocksDBPerformanceTest.testRocksDbMergePerformance()}} will actually be > running in 3 concurrent invocations, and multiple concurrent write to RocksDB > result in errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5937: [FLINK-9270] Upgrade RocksDB to 5.11.3, and resolv...
Github user bowenli86 closed the pull request at: https://github.com/apache/flink/pull/5937 ---
[jira] [Created] (FLINK-9272) DataDog API "counter" metric type is deprecated
Elias Levy created FLINK-9272: - Summary: DataDog API "counter" metric type is deprecated Key: FLINK-9272 URL: https://issues.apache.org/jira/browse/FLINK-9272 Project: Flink Issue Type: Improvement Components: Metrics Affects Versions: 1.4.2 Reporter: Elias Levy It appears to have been replaced by the "count" metric type. https://docs.datadoghq.com/developers/metrics/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
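For context on the migration described above, a reporter that switched from the deprecated `counter` type would submit its series with `"type": "count"` instead. The sketch below builds such a payload; the endpoint shape and field names are assumptions based on the DataDog v1 series API and should be verified against the current DataDog docs before use.

```python
import time

def count_series(metric, value, tags=None):
    """Build a DataDog v1 /api/v1/series payload using the 'count' metric type.

    Field names ('series', 'metric', 'points', 'type', 'tags') are assumed
    from the DataDog metrics API docs; verify before relying on them.
    """
    return {
        "series": [{
            "metric": metric,
            "points": [[int(time.time()), value]],
            "type": "count",  # 'counter' is deprecated; 'count' replaces it
            "tags": tags or [],
        }]
    }

payload = count_series("flink.task.numRecordsIn", 42, tags=["job:wordcount"])
print(payload["series"][0]["type"])  # count
```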
[jira] [Commented] (FLINK-7144) Optimize multiple LogicalAggregate into one
[ https://issues.apache.org/jira/browse/FLINK-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457685#comment-16457685 ] Rong Rong commented on FLINK-7144: -- FLINK-8689 has been merged and FLINK-8690 is on the way. Will fix this one as well :-) > Optimize multiple LogicalAggregate into one > --- > > Key: FLINK-7144 > URL: https://issues.apache.org/jira/browse/FLINK-7144 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Jark Wu >Assignee: Rong Rong >Priority: Major > > When applying multiple GROUP BY, and no aggregates or expression in the first > GROUP BY, and the second GROUP fields is a subset of first GROUP fields. > Then the first GROUP BY can be removed. > Such as the following SQL , > {code} > SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a > {code} > should be optimized into > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > but get: > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamGroupAggregate(groupBy=[a, b, c], select=[a, b, c]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > I looked for the Calcite built-in rules, but can't find a match one. So maybe > we should implement one , and maybe we should implement it in Calcite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-7144) Optimize multiple LogicalAggregate into one
[ https://issues.apache.org/jira/browse/FLINK-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-7144: Assignee: Rong Rong > Optimize multiple LogicalAggregate into one > --- > > Key: FLINK-7144 > URL: https://issues.apache.org/jira/browse/FLINK-7144 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Jark Wu >Assignee: Rong Rong >Priority: Major > > When applying multiple GROUP BY, and no aggregates or expression in the first > GROUP BY, and the second GROUP fields is a subset of first GROUP fields. > Then the first GROUP BY can be removed. > Such as the following SQL , > {code} > SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a > {code} > should be optimized into > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > but get: > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamGroupAggregate(groupBy=[a, b, c], select=[a, b, c]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > I looked for the Calcite built-in rules, but can't find a match one. So maybe > we should implement one , and maybe we should implement it in Calcite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
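The condition described in the issue above — the inner GROUP BY computes no aggregates and the outer grouping keys are a subset of the inner ones — can be sketched as a simple predicate. This is illustrative Python only, not the actual Calcite rule:

```python
def inner_aggregate_is_redundant(inner_group, inner_aggs, outer_group):
    """The inner GROUP BY can be removed when it computes no aggregate
    calls and the outer grouping fields are a subset of the inner ones."""
    return not inner_aggs and set(outer_group) <= set(inner_group)

# SELECT a FROM (SELECT a, b, c FROM MyTable GROUP BY a, b, c) GROUP BY a
print(inner_aggregate_is_redundant(['a', 'b', 'c'], [], ['a']))  # True

# An inner COUNT(*) changes the result, so the rewrite must not fire:
print(inner_aggregate_is_redundant(['a', 'b', 'c'], ['COUNT(*)'], ['a']))  # False
```

A real implementation would express this as a planner rule matching `Aggregate(Aggregate(...))` and replacing the pair with the outer aggregate alone, which is what the ticket proposes contributing to Calcite.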
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457679#comment-16457679 ] Rong Rong commented on FLINK-8690: -- Thanks for the suggestion [~fhueske]. I think this is much cleaner than having 2 logical plans since they are suppose to be "logical" LOL. I created the PR based on your suggestions and it looks pretty darn good covering all the cases (and potentially it also clears the way to support distinct group aggregation w/o time window in the future). Please take a look when you have time :-) > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457676#comment-16457676 ] ASF GitHub Bot commented on FLINK-8690: --- GitHub user walterddr opened a pull request: https://github.com/apache/flink/pull/5940 [FLINK-8690][table]Support DistinctAgg on DataStream ## What is the purpose of the change * Allow FlinkLogicalAggregate to support distinct aggregations on DataStream, while keeping DataSet to decompose distinct aggs into GROUP BY follow by normal aggregates. ## Brief change log - Moved `AggregateExpandDistinctAggregatesRule.JOIN` to `DATASET_NORM_RULES` - Enabled `DataStreamGroupWindowAggregate` to support distinct agg while maintaining unsupported for `[DataStream/DataSet]GroupAggregate`. - Fixed typo in codegen for distinct aggregate when merge - Fixed a possible codegen test error for `UNION ALL`. ## Verifying this change - Unit-test are added for `DistinctAggregateTest` - Added ITCase for distinct group window agg ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): yes (codegen) - Anything that affects deployment or recovery: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? not yet, should we put in `Aggregate` section or `Group Window` section? inputs are highly appreciated. Also distinct over aggregate is bug-fixed in FLINK-8689 but not documented. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/walterddr/flink FLINK-8690 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5940.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5940 commit c517821d13341ae10b5d47acdbd0cc7d5bbe38b7 Author: Rong RongDate: 2018-04-28T15:59:12Z moving AggregateExpandDistinctAggregatesRule.JOIN to DATASET_NORM_RULES, and enabled distinct aggregate support for window aggregate over datastream > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5940: [FLINK-8690][table]Support DistinctAgg on DataStre...
GitHub user walterddr opened a pull request: https://github.com/apache/flink/pull/5940 [FLINK-8690][table]Support DistinctAgg on DataStream ## What is the purpose of the change * Allow FlinkLogicalAggregate to support distinct aggregations on DataStream, while keeping DataSet to decompose distinct aggs into GROUP BY follow by normal aggregates. ## Brief change log - Moved `AggregateExpandDistinctAggregatesRule.JOIN` to `DATASET_NORM_RULES` - Enabled `DataStreamGroupWindowAggregate` to support distinct agg while maintaining unsupported for `[DataStream/DataSet]GroupAggregate`. - Fixed typo in codegen for distinct aggregate when merge - Fixed a possible codegen test error for `UNION ALL`. ## Verifying this change - Unit-test are added for `DistinctAggregateTest` - Added ITCase for distinct group window agg ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): yes (codegen) - Anything that affects deployment or recovery: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? not yet, should we put in `Aggregate` section or `Group Window` section? inputs are highly appreciated. Also distinct over aggregate is bug-fixed in FLINK-8689 but not documented. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/walterddr/flink FLINK-8690 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5940.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5940 commit c517821d13341ae10b5d47acdbd0cc7d5bbe38b7 Author: Rong Rong Date: 2018-04-28T15:59:12Z moving AggregateExpandDistinctAggregatesRule.JOIN to DATASET_NORM_RULES, and enabled distinct aggregate support for window aggregate over datastream ---
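The decomposition the PR keeps on the DataSet side — rewriting a distinct aggregate as an inner GROUP BY that removes duplicates, followed by a normal outer aggregate — can be sketched outside of Flink. The following is a minimal Python illustration of that rewrite (not Flink code; the data and helper names are made up):

```python
from collections import defaultdict

def count_distinct_direct(rows, key, col):
    """COUNT(DISTINCT col) GROUP BY key, computed directly with sets."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[key]].add(row[col])
    return {k: len(v) for k, v in groups.items()}

def count_distinct_expanded(rows, key, col):
    """Same result via the expansion: an inner GROUP BY (key, col) removes
    duplicates, then an outer COUNT(*) GROUP BY key counts what is left."""
    inner = {(row[key], row[col]) for row in rows}   # GROUP BY key, col
    outer = defaultdict(int)
    for k, _ in inner:                               # COUNT(*) GROUP BY key
        outer[k] += 1
    return dict(outer)

rows = [
    {"user": "a", "page": "x"},
    {"user": "a", "page": "x"},
    {"user": "a", "page": "y"},
    {"user": "b", "page": "x"},
]
assert count_distinct_direct(rows, "user", "page") == \
       count_distinct_expanded(rows, "user", "page")
```

Both paths yield the same counts; the expanded form only needs plain grouping operators, which is why the rule can move to the normalization rule set.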
[jira] [Commented] (FLINK-9268) RockDB errors from WindowOperator
[ https://issues.apache.org/jira/browse/FLINK-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457672#comment-16457672 ] Narayanan Arunachalam commented on FLINK-9268: -- Looks like I can't control whether a set or a list is used, because keyBy causes the values to be treated as a list. > RockDB errors from WindowOperator > - > > Key: FLINK-9268 > URL: https://issues.apache.org/jira/browse/FLINK-9268 > Project: Flink > Issue Type: Bug > Components: DataStream API, State Backends, Checkpointing >Affects Versions: 1.4.2 >Reporter: Narayanan Arunachalam >Priority: Major > > The job has no sinks, one Kafka source, does a windowing based on session and > uses processing time. The job fails with the error given below after running > for few hours. The only way to recover from this error is to cancel the job > and start a new one. > Using S3 backend for externalized checkpoints. > A representative job DAG: > val streams = sEnv > .addSource(makeKafkaSource(config)) > .map(makeEvent) > .keyBy(_.get(EVENT_GROUP_ID)) > .window(ProcessingTimeSessionWindows.withGap(Time.seconds(60))) > .trigger(PurgingTrigger.of(ProcessingTimeTrigger.create())) > .apply(makeEventsList) > .addSink(makeNoOpSink) > A representative config: > state.backend=rocksDB > checkpoint.enabled=true > external.checkpoint.enabled=true > checkpoint.mode=AT_LEAST_ONCE > checkpoint.interval=90 > checkpoint.timeout=30 > Error: > TimerException\{java.lang.NegativeArraySizeException} > at > org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:252) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NegativeArraySizeException > at org.rocksdb.RocksDB.get(Native Method) > at org.rocksdb.RocksDB.get(RocksDB.java:810) > at > org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:86) > at > org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:49) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime(WindowOperator.java:496) > at > org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime(HeapInternalTimerService.java:255) > at > org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:249) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
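The set-vs-list point in the comment above matters because, with AT_LEAST_ONCE checkpointing, a restart can re-deliver events into an open window: list-backed state keeps the duplicates and keeps growing, while set-backed state deduplicates. A tiny Python sketch of that difference (illustrative only, not Flink state backends):

```python
def accumulate(events, state_factory, add):
    """Fold events into a fresh piece of window state."""
    state = state_factory()
    for e in events:
        add(state, e)
    return state

# "e1" is re-delivered after a simulated restart (at-least-once semantics).
events = ["e1", "e2", "e1"]
as_list = accumulate(events, list, list.append)
as_set = accumulate(events, set, set.add)
print(len(as_list), len(as_set))  # the list keeps the duplicate; the set does not
```

This is why switching to EXACTLY_ONCE (no replayed elements in state) or deduplicating inside the window function are both plausible mitigations.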
[GitHub] flink pull request #5924: [hotfix][README.md] Update building prerequisites
Github user W4anD0eR96 commented on a diff in the pull request: https://github.com/apache/flink/pull/5924#discussion_r184857746 --- Diff: README.md --- @@ -67,10 +67,10 @@ counts.writeAsCsv(outputPath) Prerequisites for building Flink: -* Unix-like environment (We use Linux, Mac OS X, Cygwin) +* Unix-like environment (we use Linux, Mac OS X, Cygwin) * git * Maven (we recommend version 3.2.5) -* Java 8 +* Java 8 (exactly 8, not 9 or 10) --- End diff -- sure, and done :-) ---
[jira] [Commented] (FLINK-9233) Merging state may cause runtime exception when windows trigger onMerge
[ https://issues.apache.org/jira/browse/FLINK-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457648#comment-16457648 ] Stephan Ewen commented on FLINK-9233: - Let's try upgrading for the {{master}} branch (Flink 1.6) - then the new RocksDB version should get some exposure during test runs before we make a release with the upgraded version. > Merging state may cause runtime exception when windows trigger onMerge > --- > > Key: FLINK-9233 > URL: https://issues.apache.org/jira/browse/FLINK-9233 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.4.0 >Reporter: Hai Zhou >Priority: Major > > the main logic of my flink job is as follows: > {code:java} > clickStream.coGroup(exposureStream).where(...).equalTo(...) > .window(EventTimeSessionWindows.withGap()) > .trigger(new SessionMatchTrigger) > .evictor() > .apply(); > {code} > {code:java} > SessionMatchTrigger{ > ReducingStateDescriptor stateDesc = new ReducingStateDescriptor() > ... > public boolean canMerge() { > return true; > } > public void onMerge(TimeWindow window, OnMergeContext ctx) { > ctx.mergePartitionedState(this.stateDesc); > ctx.registerEventTimeTimer(window.maxTimestamp()); > } > > } > {code} > {panel:title=detailed error logs} > java.lang.RuntimeException: Error while merging state. 
> at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator$Context.mergePartitionedState(WindowOperator.java:895) > at > com.package.trigger.SessionMatchTrigger.onMerge(SessionMatchTrigger.java:56) > at > com.package.trigger.SessionMatchTrigger.onMerge(SessionMatchTrigger.java:14) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator$Context.onMerge(WindowOperator.java:939) > at > org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator$1.merge(EvictingWindowOperator.java:141) > at > org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator$1.merge(EvictingWindowOperator.java:120) > at > org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSet.addWindow(MergingWindowSet.java:212) > at > org.apache.flink.streaming.runtime.operators.windowing.EvictingWindowOperator.processElement(EvictingWindowOperator.java:119) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:207) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.Exception: Error while merging state in RocksDB > at > org.apache.flink.contrib.streaming.state.RocksDBReducingState.mergeNamespaces(RocksDBReducingState.java:186) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator$Context.mergePartitionedState(WindowOperator.java:887) > ... 12 more > Caused by: java.lang.IllegalArgumentException: Illegal value provided for > SubCode. 
> at org.rocksdb.Status$SubCode.getSubCode(Status.java:109) > at org.rocksdb.Status.(Status.java:30) > at org.rocksdb.RocksDB.delete(Native Method) > at org.rocksdb.RocksDB.delete(RocksDB.java:1110) > at > org.apache.flink.contrib.streaming.state.RocksDBReducingState.mergeNamespaces(RocksDBReducingState.java:143) > ... 13 more > {panel} > > I found the reason for this error: Java's {{RocksDB.Status.SubCode}} was out of sync with {{include/rocksdb/status.h:SubCode}}. When running out of disc space, this led to an {{IllegalArgumentException}} because of an invalid status code, rather than just returning the corresponding status code without an exception. > More details: [https://github.com/facebook/rocksdb/pull/3050] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
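The failure mode described above — a Java enum falling out of sync with the native enum and throwing on unknown values — is a general pattern. A minimal Python sketch of the strict lookup versus a defensive lookup that maps unknown codes to a sentinel (the code table here is an illustrative subset, not the real RocksDB mapping):

```python
SUB_CODES = {0: "None", 1: "MutexTimeout", 2: "LockTimeout",
             3: "LockLimit", 4: "NoSpace"}  # illustrative subset only

def get_sub_code_strict(value):
    """Mirrors the old behavior: unknown native codes raise."""
    if value not in SUB_CODES:
        raise ValueError(f"Illegal value provided for SubCode: {value}")
    return SUB_CODES[value]

def get_sub_code_defensive(value):
    """Maps unknown native codes to a sentinel instead of throwing, so an
    ordinary out-of-disc-space status surfaces as a status, not a crash."""
    return SUB_CODES.get(value, "Unknown")

print(get_sub_code_defensive(7))  # an unmapped code yields the sentinel
```

The RocksDB fix linked in the comment takes essentially the defensive route on the Java side.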
[GitHub] flink pull request #5924: [hotfix][README.md] Update building prerequisites
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/5924#discussion_r184856717 --- Diff: README.md --- @@ -67,10 +67,10 @@ counts.writeAsCsv(outputPath) Prerequisites for building Flink: -* Unix-like environment (We use Linux, Mac OS X, Cygwin) +* Unix-like environment (we use Linux, Mac OS X, Cygwin) * git * Maven (we recommend version 3.2.5) -* Java 8 +* Java 8 (exactly 8, not 9 or 10) --- End diff -- Can we phrase this more like "(Java 9 and 10 are not yet supported) ---
[jira] [Commented] (FLINK-8255) Key expressions on named row types do not work
[ https://issues.apache.org/jira/browse/FLINK-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457643#comment-16457643 ] Sergey Nuyanzin commented on FLINK-8255: Hello [~twalthr] Could I pick this issue up if you do not mind? > Key expressions on named row types do not work > -- > > Key: FLINK-8255 > URL: https://issues.apache.org/jira/browse/FLINK-8255 > Project: Flink > Issue Type: Bug > Components: DataSet API, DataStream API >Affects Versions: 1.4.0, 1.5.0 >Reporter: Timo Walther >Priority: Major > > The following program fails with a {{ClassCastException}}. It seems that key > expressions and rows are not tested well. We should add more tests for them. > {code} > final StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > TypeInformation[] types = new TypeInformation[] {Types.INT, Types.INT}; > String[] fieldNames = new String[]{"id", "value"}; > RowTypeInfo rowTypeInfo = new RowTypeInfo(types, fieldNames); > env.fromCollection(Collections.singleton(new Row(2)), rowTypeInfo) > .keyBy("id").sum("value").print(); > env.execute("Streaming WordCount"); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
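What the failing program above intends — keying rows by a named field and summing another — is conceptually simple, which is why the ClassCastException points at a gap in how key expressions resolve against named row types rather than at the user code. A plain-Python sketch of the intended semantics (dicts stand in for named rows; this is not Flink API):

```python
from collections import defaultdict

def key_by_sum(rows, key_field, sum_field):
    """Group dict-like 'rows' by a named field and sum another field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_field]] += row[sum_field]
    return dict(totals)

rows = [{"id": 1, "value": 10}, {"id": 1, "value": 5}, {"id": 2, "value": 7}]
print(key_by_sum(rows, "id", "value"))  # {1: 15, 2: 7}
```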
[jira] [Commented] (FLINK-9268) RockDB errors from WindowOperator
[ https://issues.apache.org/jira/browse/FLINK-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457641#comment-16457641 ] Narayanan Arunachalam commented on FLINK-9268: -- One possibility is that after the job is restarted following some kind of failure, a window sees the same events again, and a job failure before the window closes would then lead to this state. I think I could try turning the list into a set or configuring the checkpointing mode to EXACTLY_ONCE. > RockDB errors from WindowOperator > - > > Key: FLINK-9268 > URL: https://issues.apache.org/jira/browse/FLINK-9268 > Project: Flink > Issue Type: Bug > Components: DataStream API, State Backends, Checkpointing >Affects Versions: 1.4.2 >Reporter: Narayanan Arunachalam >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9268) RockDB errors from WindowOperator
[ https://issues.apache.org/jira/browse/FLINK-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Narayanan Arunachalam updated FLINK-9268: - Affects Version/s: 1.4.2 > RockDB errors from WindowOperator > - > > Key: FLINK-9268 > URL: https://issues.apache.org/jira/browse/FLINK-9268 > Project: Flink > Issue Type: Bug > Components: DataStream API, State Backends, Checkpointing >Affects Versions: 1.4.2 >Reporter: Narayanan Arunachalam >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8500) Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher)
[ https://issues.apache.org/jira/browse/FLINK-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457600#comment-16457600 ] Fred Teunissen commented on FLINK-8500: --- I've created a [pull request|https://github.com/apache/flink/pull/5939] for this issue. This is my first pull request, so I hope that I addressed all of the 'contribution code guidelines' correctly. Please let me know whether I should do something differently or whether I forgot something. > Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher) > --- > > Key: FLINK-8500 > URL: https://issues.apache.org/jira/browse/FLINK-8500 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Affects Versions: 1.4.0 >Reporter: yanxiaobin >Priority: Major > Fix For: 1.6.0 > > Attachments: image-2018-01-30-14-58-58-167.png, > image-2018-01-31-10-48-59-633.png > > > The method deserialize of KeyedDeserializationSchema needs a parameter > 'kafka message timestamp' (from ConsumerRecord). In some business scenarios, > this is useful! > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8500) Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher)
[ https://issues.apache.org/jira/browse/FLINK-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457587#comment-16457587 ] ASF GitHub Bot commented on FLINK-8500: --- GitHub user FredTing opened a pull request: https://github.com/apache/flink/pull/5939 [FLINK-8500] [Kafka Connector] Get the timestamp of the Kafka message from kafka consumer ## What is the purpose of the change This pull request makes the Kafka timestamp and timestampType available during message deserialisation so one can use them in business logic processing. ## Brief change log - Introduced a new interface `KeyedWithTimestampDeserializationSchema` with extra parameters in the `deserialize` method. - Added the `KeyedWithTimestampDeserializationSchemaWrapper` class to keep the code backwards compatible. - Adjusted the Kafka connectors 0.10+ to support the new interface too. - Adjusted the Kafka connectors 0.9 and earlier to 'hide' this new interface, since those Kafka versions don't support timestamps. - Added some documentation. ## Verifying this change This change is already covered by existing tests, such as most of the Kafka consumer tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes - The serializers: no - The runtime per-record code paths (performance sensitive): yes - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? docs / JavaDocs You can merge this pull request into a Git repository by running: $ git pull https://github.com/FredTing/flink FLINK-8500 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5939.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5939 commit 30293ac49a1d31c2abfa2b3fb3640e9e04ef8bcf Author: Fred Teunissen Date: 2018-04-28T08:06:41Z [FLINK-8500] Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5939: [FLINK-8500] [Kafka Connector] Get the timestamp o...
GitHub user FredTing opened a pull request: https://github.com/apache/flink/pull/5939 [FLINK-8500] [Kafka Connector] Get the timestamp of the Kafka message from kafka consumer ## What is the purpose of the change This pull request makes the Kafka timestamp and timestampType available during message deserialisation so one can use them in business logic processing. ## Brief change log - Introduced a new interface `KeyedWithTimestampDeserializationSchema` with extra parameters in the `deserialize` method. - Added the `KeyedWithTimestampDeserializationSchemaWrapper` class to keep the code backwards compatible. - Adjusted the Kafka connectors 0.10+ to support the new interface too. - Adjusted the Kafka connectors 0.9 and earlier to 'hide' this new interface, since those Kafka versions don't support timestamps. - Added some documentation. ## Verifying this change This change is already covered by existing tests, such as most of the Kafka consumer tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes - The serializers: no - The runtime per-record code paths (performance sensitive): yes - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? docs / JavaDocs You can merge this pull request into a Git repository by running: $ git pull https://github.com/FredTing/flink FLINK-8500 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5939.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5939 commit 30293ac49a1d31c2abfa2b3fb3640e9e04ef8bcf Author: Fred Teunissen Date: 2018-04-28T08:06:41Z [FLINK-8500] Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher) ---
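The backwards-compatibility trick in this PR — a wrapper that adapts the old deserialization interface to the new timestamp-aware one — is a plain adapter pattern. A hedged Python sketch of the idea (class and parameter names are modeled on the PR's description but the signatures here are illustrative, not Flink's actual API):

```python
class KeyedDeserializationSchema:
    """Old-style schema: no access to the record timestamp."""
    def deserialize(self, key, value, topic, partition, offset):
        raise NotImplementedError

class KeyedWithTimestampDeserializationSchema:
    """New-style schema: timestamp and timestamp_type are passed through."""
    def deserialize(self, key, value, topic, partition, offset,
                    timestamp, timestamp_type):
        raise NotImplementedError

class TimestampWrapper(KeyedWithTimestampDeserializationSchema):
    """Adapts an old schema to the new interface by dropping the timestamp,
    so existing user code keeps working unchanged."""
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def deserialize(self, key, value, topic, partition, offset,
                    timestamp, timestamp_type):
        return self.wrapped.deserialize(key, value, topic, partition, offset)

class UpperValue(KeyedDeserializationSchema):
    """Example legacy user schema."""
    def deserialize(self, key, value, topic, partition, offset):
        return value.upper()

schema = TimestampWrapper(UpperValue())
print(schema.deserialize(b"k", "msg", "topic", 0, 42,
                         1_525_000_000, "CreateTime"))  # prints "MSG"
```

The connector can then call the new seven-argument method everywhere, while legacy schemas are wrapped once at construction time.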
[jira] [Created] (FLINK-9271) flink-1.4.2-bin-scala_2.11.tgz is not in gzip format
Martin Grigorov created FLINK-9271: -- Summary: flink-1.4.2-bin-scala_2.11.tgz is not in gzip format Key: FLINK-9271 URL: https://issues.apache.org/jira/browse/FLINK-9271 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 1.4.2 Reporter: Martin Grigorov Hi, I've just downloaded "Flink Without Hadoop" from [http://flink.apache.org/downloads.html]. The name of the downloaded file is "flink-1.4.2-bin-scala_2.11.tgz", but trying to unpack it fails with: {code} tar zxvf flink-1.4.2-bin-scala_2.11.tgz gzip: stdin: not in gzip format tar: Child returned status 1 tar: Error is not recoverable: exiting now {code} {code} file flink-1.4.2-bin-scala_2.11.tgz flink-1.4.2-bin-scala_2.11.tgz: POSIX tar archive (GNU) {code} I'd suggest renaming the artefact to flink-1.4.2-bin-scala_2.11.*tar* to make it clearer what is inside and how to unpack it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
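The mismatch reported above is easy to reproduce without downloading anything: a plain POSIX tar fails when a reader forces gzip mode (as `tar zxvf` does), but an auto-detecting reader still handles it. A small, self-contained Python sketch using the standard-library `tarfile` module:

```python
import io
import tarfile

# Build a plain (uncompressed) tar archive in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"hello"
    info = tarfile.TarInfo(name="greeting.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
buf.seek(0)

# Forcing gzip mode fails, just like `tar zxvf` on the misnamed .tgz ...
try:
    tarfile.open(fileobj=buf, mode="r:gz")
    print("gzip mode: ok")
except tarfile.ReadError:
    print("gzip mode: not in gzip format")

# ... while transparent-compression mode detects the plain tar and succeeds.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:*") as tar:
    print(tar.getnames())
```

This also shows the practical workaround for the misnamed artifact: use `tar xvf` (no `z`), which lets GNU tar auto-detect the format.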
[jira] [Commented] (FLINK-9269) Concurrency problem in HeapKeyedStateBackend when performing checkpoint async
[ https://issues.apache.org/jira/browse/FLINK-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457539#comment-16457539 ] ASF GitHub Bot commented on FLINK-9269: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5934 I am also a bit torn. Sometimes I am thinking we might just have a pool with as many serializer copies as we can have concurrent checkpoints + savepoints. But then again, it is borderline premature optimization. For this particular case, I think your suggestion sounds good. > Concurrency problem in HeapKeyedStateBackend when performing checkpoint async > - > > Key: FLINK-9269 > URL: https://issues.apache.org/jira/browse/FLINK-9269 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Sihua Zhou >Assignee: Sihua Zhou >Priority: Blocker > Fix For: 1.5.0 > > > {code:java} > @Nonnull > @Override > protected SnapshotResult performOperation() throws > Exception { > // do something >long[] keyGroupRangeOffsets = new > long[keyGroupRange.getNumberOfKeyGroups()]; >for (int keyGroupPos = 0; keyGroupPos < > keyGroupRange.getNumberOfKeyGroups(); ++keyGroupPos) { > int keyGroupId = keyGroupRange.getKeyGroupId(keyGroupPos); > keyGroupRangeOffsets[keyGroupPos] = localStream.getPos(); > outView.writeInt(keyGroupId); > for (Map.Entry> kvState : > stateTables.entrySet()) { > // do something > } > } > // do something > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5934: [FLINK-9269][state] fix concurrency problem when performi...
Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5934 I am also a bit torn. Sometimes I am thinking we might just have a pool with as many serializer copies as we can have concurrent checkpoints + savepoints. But then again, it is borderline premature optimization. For this particular case, I think your suggestion sounds good. ---
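The pool idea floated above — pre-duplicating as many serializer copies as there can be concurrent checkpoints, so the synchronous phase never pays for `duplicate()` — can be sketched with a bounded queue. This is a hypothetical illustration in Python, not Flink's implementation; all names here are made up:

```python
import queue

class SerializerPool:
    """Bounded pool of serializer duplicates, sized for the maximum number
    of concurrent checkpoints + savepoints, so duplication cost is paid
    once up front rather than in each checkpoint's synchronous part."""
    def __init__(self, prototype, max_concurrent_checkpoints):
        self._pool = queue.Queue(maxsize=max_concurrent_checkpoints)
        for _ in range(max_concurrent_checkpoints):
            self._pool.put(prototype.duplicate())

    def acquire(self):
        return self._pool.get()   # blocks if all copies are checked out

    def release(self, serializer):
        self._pool.put(serializer)

class StringSerializer:
    """Stand-in for a stateful serializer that is unsafe to share."""
    def duplicate(self):
        return StringSerializer()
    def serialize(self, value):
        return value.encode("utf-8")

pool = SerializerPool(StringSerializer(), max_concurrent_checkpoints=2)
s = pool.acquire()
print(s.serialize("state"))   # each checkpoint thread uses its own copy
pool.release(s)
```

The tradeoff the thread discusses is exactly this: pooling removes the per-checkpoint duplication cost but adds lifecycle complexity, which is why it can feel like premature optimization.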
[jira] [Commented] (FLINK-9269) Concurrency problem in HeapKeyedStateBackend when performing checkpoint async
[ https://issues.apache.org/jira/browse/FLINK-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457528#comment-16457528 ] ASF GitHub Bot commented on FLINK-9269: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5934 About the serializer duplication problem, I think you are right: duplicating a serializer is not always super cheap. So I think maybe the best tradeoff is to not duplicate the serializer, to save the performance cost, and to add some dedicated comments describing why we don't duplicate it, to make the code more "defensive". What do you think? > Concurrency problem in HeapKeyedStateBackend when performing checkpoint async > - > > Key: FLINK-9269 > URL: https://issues.apache.org/jira/browse/FLINK-9269 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Sihua Zhou >Assignee: Sihua Zhou >Priority: Blocker > Fix For: 1.5.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5934: [FLINK-9269][state] fix concurrency problem when performi...
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5934 About the serializer duplication problem, I think you are right: duplicating a serializer is not always super cheap. So I think maybe the best tradeoff is to not duplicate the serializer, to save the performance cost, and to add some dedicated comments describing why we don't duplicate it, to make the code more "defensive". What do you think? ---
[jira] [Commented] (FLINK-9269) Concurrency problem in HeapKeyedStateBackend when performing checkpoint async
[ https://issues.apache.org/jira/browse/FLINK-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457526#comment-16457526 ] ASF GitHub Bot commented on FLINK-9269: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5934 Hi @StefanRRichter, sorry for the unclear description here. What this PR is trying to fix mainly relates to the code below, which runs async: ```java for (Map.Entry> kvState : stateTables.entrySet()) { // do something } // this will just close the outer compression stream ``` The `stateTables` may cause a concurrency problem. > Concurrency problem in HeapKeyedStateBackend when performing checkpoint async > - > > Key: FLINK-9269 > URL: https://issues.apache.org/jira/browse/FLINK-9269 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Sihua Zhou >Assignee: Sihua Zhou >Priority: Blocker > Fix For: 1.5.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink issue #5934: [FLINK-9269][state] fix concurrency problem when performi...
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5934 Hi @StefanRRichter, sorry for the unclear description here. What this PR is trying to fix mainly relates to the code below, which runs async: ```java for (Map.Entry> kvState : stateTables.entrySet()) { // do something } // this will just close the outer compression stream ``` The `stateTables` may cause a concurrency problem. ---
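The hazard described above — an asynchronous checkpoint phase iterating a live `stateTables` map while the task thread keeps mutating it — is typically avoided by capturing a cheap copy during the synchronous phase. A minimal Python sketch of the two approaches (purely illustrative, not the HeapKeyedStateBackend code):

```python
def start_checkpoint_unsafe(state_tables):
    # BAD: the async phase later iterates the live map, racing with the
    # task thread that keeps registering new states.
    return lambda: sorted(state_tables)

def start_checkpoint_safe(state_tables):
    # GOOD: capture a snapshot in the synchronous phase; the async phase
    # only ever sees this frozen copy.
    snapshot = dict(state_tables)
    return lambda: sorted(snapshot)

state_tables = {"window-contents": [1, 2]}
async_part = start_checkpoint_safe(state_tables)
state_tables["new-state"] = []   # task thread mutates after the sync phase
print(async_part())              # the snapshot is unaffected by the mutation
```

In Python the unsafe variant would "only" see surprising entries or raise during iteration; in Java, concurrent mutation of an unsynchronized map from another thread is a genuine correctness bug, which is what the PR addresses.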
[jira] [Commented] (FLINK-9269) Concurrency problem in HeapKeyedStateBackend when performing checkpoint async
[ https://issues.apache.org/jira/browse/FLINK-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457525#comment-16457525 ] ASF GitHub Bot commented on FLINK-9269: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5934 Hi, can you give some more detail about the actual problem you are trying to fix here? To me it looks like duplicating the serializer only for the meta data should not be required, because the serializer is just written and the getter is only used in a restore, which is never async. You can make an argument that this is just making the code more defensive, which is a good thing. But I just want to raise awareness that duplicating a serializer is not always super cheap, and this counts for the time spent in the synchronous part. So there is a tradeoff, and that is why I would like to discuss whether this is really a benefit. > Concurrency problem in HeapKeyedStateBackend when performing checkpoint async > - > > Key: FLINK-9269 > URL: https://issues.apache.org/jira/browse/FLINK-9269 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Sihua Zhou >Assignee: Sihua Zhou >Priority: Blocker > Fix For: 1.5.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9270) Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure
[ https://issues.apache.org/jira/browse/FLINK-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457520#comment-16457520 ] ASF GitHub Bot commented on FLINK-9270: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5937 @bowenli86 I already wanted to do this for some time, but unfortunately we currently cannot upgrade RocksDB to any version higher than what is used in master. It seems there is again a performance regression in the merge operator for all newer versions. If you check your Travis runs, you will see that `RocksDBPerformanceTest.testRocksDbMergePerformance` fails all the time. I have commented about this problem in the RocksDB issue tracker, but there has been no reaction so far. If you agree, please close this PR. > Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of > @RetryOnFailure > > > Key: FLINK-9270 > URL: https://issues.apache.org/jira/browse/FLINK-9270 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing > Affects Versions: 1.5.0 > Reporter: Bowen Li > Assignee: Bowen Li > Priority: Major > Fix For: 1.6.0 > > > Upgrade RocksDB to 5.11.3 to take in the latest bug fixes. > Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be > run concurrently if there's only a {{try}} clause without a {{catch}} > following. For example, sometimes > {{RocksDBPerformanceTest.testRocksDbMergePerformance()}} will actually be > running in 3 concurrent invocations, and multiple concurrent writes to RocksDB > result in errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9196) YARN: Flink binaries are not deleted from HDFS after cluster shutdown
[ https://issues.apache.org/jira/browse/FLINK-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457513#comment-16457513 ] ASF GitHub Bot commented on FLINK-9196: --- Github user GJL commented on a diff in the pull request: https://github.com/apache/flink/pull/5938#discussion_r184848405 --- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterClient.java --- @@ -265,6 +264,20 @@ public void waitForClusterToBeReady() { } } + @Override + public void shutDownCluster() { + LOG.info("Sending shutdown request to the Application Master"); + try { + final Future response = Patterns.ask(applicationClient.get(), + new YarnMessages.LocalStopYarnSession(ApplicationStatus.SUCCEEDED, --- End diff -- Always `SUCCEEDED` because previous logic in 1.4.2 used ``` YarnApplicationState appState = lastReport.getYarnApplicationState(); ApplicationStatus status = (appState == YarnApplicationState.FAILED || appState == YarnApplicationState.KILLED) ? ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED; if (status != ApplicationStatus.SUCCEEDED) { LOG.warn("YARN reported application state {}", appState); LOG.warn("Diagnostics: {}", lastReport.getDiagnostics()); } return status; ``` which does not make sense imo. > YARN: Flink binaries are not deleted from HDFS after cluster shutdown > - > > Key: FLINK-9196 > URL: https://issues.apache.org/jira/browse/FLINK-9196 > Project: Flink > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Gary Yao >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > Attachments: 0001-xxx.patch > > > When deploying on YARN in flip6 mode, the Flink binaries are not deleted from > HDFS after the cluster shuts down. 
> *Steps to reproduce*
> # Submit job in YARN job mode, non-detached:
> {noformat}
> HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster -yjm 2048 -ytm 2048 ./examples/streaming/WordCount.jar
> {noformat}
> # Check contents of {{/user/hadoop/.flink/}} on HDFS after job is finished:
> {noformat}
> [hadoop@ip-172-31-43-78 flink-1.5.0]$ hdfs dfs -ls /user/hadoop/.flink/application_1523966184826_0016
> Found 6 items
> -rw-r--r-- 1 hadoop hadoop 583 2018-04-17 14:54 /user/hadoop/.flink/application_1523966184826_0016/90cf5b3a-039e-4d52-8266-4e9563d74827-taskmanager-conf.yaml
> -rw-r--r-- 1 hadoop hadoop 332 2018-04-17 14:54 /user/hadoop/.flink/application_1523966184826_0016/application_1523966184826_0016-flink-conf.yaml3818971235442577934.tmp
> -rw-r--r-- 1 hadoop hadoop 89779342 2018-04-02 17:08 /user/hadoop/.flink/application_1523966184826_0016/flink-dist_2.11-1.5.0.jar
> drwxrwxrwx - hadoop hadoop 0 2018-04-17 14:54 /user/hadoop/.flink/application_1523966184826_0016/lib
> -rw-r--r-- 1 hadoop hadoop 1939 2018-04-02 15:37 /user/hadoop/.flink/application_1523966184826_0016/log4j.properties
> -rw-r--r-- 1 hadoop hadoop 2331 2018-04-02 15:37 /user/hadoop/.flink/application_1523966184826_0016/logback.xml
> {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
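The 1.4.2 status-mapping logic quoted in the review comment above can be distilled into a small runnable sketch (simplified enums, not Hadoop's or Flink's real classes) showing why always sending `SUCCEEDED` is equivalent for every state except FAILED/KILLED:

```java
// Simplified stand-ins for YARN's and Flink's enums.
enum YarnApplicationState { NEW, RUNNING, FINISHED, FAILED, KILLED }
enum ApplicationStatus { SUCCEEDED, FAILED }

public class StatusMappingDemo {
    // The 1.4.2 logic quoted in the review: only FAILED/KILLED map to
    // FAILED; everything else (even RUNNING) is treated as SUCCEEDED.
    static ApplicationStatus toStatus(YarnApplicationState s) {
        return (s == YarnApplicationState.FAILED || s == YarnApplicationState.KILLED)
            ? ApplicationStatus.FAILED
            : ApplicationStatus.SUCCEEDED;
    }

    public static void main(String[] args) {
        System.out.println(toStatus(YarnApplicationState.KILLED));  // FAILED
        System.out.println(toStatus(YarnApplicationState.RUNNING)); // SUCCEEDED
    }
}
```

As the reviewer notes, a client-initiated shutdown reaches this path only on success anyway, which is why hardcoding `SUCCEEDED` was considered acceptable.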
[jira] [Commented] (FLINK-9196) YARN: Flink binaries are not deleted from HDFS after cluster shutdown
[ https://issues.apache.org/jira/browse/FLINK-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457510#comment-16457510 ] ASF GitHub Bot commented on FLINK-9196: --- Github user GJL commented on a diff in the pull request: https://github.com/apache/flink/pull/5938#discussion_r184848340 --- Diff: flink-yarn/src/test/java/org/apache/flink/yarn/UtilsTest.java --- @@ -18,227 +18,35 @@ package org.apache.flink.yarn; -import org.apache.flink.configuration.Configuration; -import org.apache.flink.runtime.akka.AkkaUtils; -import org.apache.flink.runtime.clusterframework.ContaineredTaskManagerParameters; -import org.apache.flink.runtime.clusterframework.messages.NotifyResourceStarted; -import org.apache.flink.runtime.clusterframework.messages.RegisterResourceManager; -import org.apache.flink.runtime.clusterframework.messages.RegisterResourceManagerSuccessful; -import org.apache.flink.runtime.instance.AkkaActorGateway; -import org.apache.flink.runtime.leaderretrieval.SettableLeaderRetrievalService; -import org.apache.flink.runtime.messages.Acknowledge; -import org.apache.flink.runtime.testingUtils.TestingUtils; -import org.apache.flink.util.TestLogger; -import org.apache.flink.yarn.messages.NotifyWhenResourcesRegistered; -import org.apache.flink.yarn.messages.RequestNumberOfRegisteredResources; - -import akka.actor.ActorRef; -import akka.actor.ActorSystem; -import akka.actor.PoisonPill; -import akka.actor.Props; -import akka.testkit.JavaTestKit; -import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; -import org.apache.hadoop.yarn.api.records.ApplicationId; -import org.apache.hadoop.yarn.api.records.Container; -import org.apache.hadoop.yarn.api.records.ContainerId; -import org.apache.hadoop.yarn.api.records.ContainerLaunchContext; -import org.apache.hadoop.yarn.api.records.NodeId; -import org.apache.hadoop.yarn.client.api.AMRMClient; -import org.apache.hadoop.yarn.client.api.NMClient; -import 
org.apache.hadoop.yarn.client.api.async.AMRMClientAsync; -import org.apache.hadoop.yarn.conf.YarnConfiguration; -import org.junit.AfterClass; -import org.junit.BeforeClass; +import org.junit.Rule; import org.junit.Test; -import org.mockito.Matchers; -import org.mockito.invocation.InvocationOnMock; -import org.mockito.stubbing.Answer; +import org.junit.rules.TemporaryFolder; -import java.util.ArrayList; +import java.nio.file.Files; +import java.nio.file.Path; import java.util.Collections; -import java.util.HashMap; -import java.util.List; -import java.util.UUID; -import java.util.concurrent.CompletableFuture; -import java.util.concurrent.TimeUnit; - -import scala.Option; -import scala.concurrent.Await; -import scala.concurrent.Future; -import scala.concurrent.duration.Deadline; -import scala.concurrent.duration.FiniteDuration; -import static org.junit.Assert.assertEquals; -import static org.mockito.Mockito.doAnswer; -import static org.mockito.Mockito.mock; -import static org.mockito.Mockito.when; +import static org.hamcrest.Matchers.equalTo; +import static org.junit.Assert.assertThat; /** * Tests for {@link Utils}. */ -public class UtilsTest extends TestLogger { --- End diff -- Renamed class to `YarnFlinkResourceManagerTest` > YARN: Flink binaries are not deleted from HDFS after cluster shutdown > - > > Key: FLINK-9196 > URL: https://issues.apache.org/jira/browse/FLINK-9196 > Project: Flink > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Gary Yao >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > Attachments: 0001-xxx.patch > > > When deploying on YARN in flip6 mode, the Flink binaries are not deleted from > HDFS after the cluster shuts down. 
> *Steps to reproduce* > # Submit job in YARN job mode, non-detached: > {noformat} HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster > -yjm 2048 -ytm 2048 ./examples/streaming/WordCount.jar {noformat} > # Check contents of {{/user/hadoop/.flink/}} on HDFS after > job is finished: > {noformat} > [hadoop@ip-172-31-43-78 flink-1.5.0]$ hdfs dfs -ls > /user/hadoop/.flink/application_1523966184826_0016 > Found 6 items > -rw-r--r-- 1 hadoop hadoop583 2018-04-17 14:54 > /user/hadoop/.flink/application_1523966184826_0016/90cf5b3a-039e-4d52-8266-4e9563d74827-taskmanager-conf.yaml > -rw-r--r-- 1 hadoop hadoop332 2018-04-17 14:54 >
[jira] [Commented] (FLINK-9196) YARN: Flink binaries are not deleted from HDFS after cluster shutdown
[ https://issues.apache.org/jira/browse/FLINK-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457511#comment-16457511 ] ASF GitHub Bot commented on FLINK-9196: --- Github user GJL commented on a diff in the pull request: https://github.com/apache/flink/pull/5938#discussion_r184848354 --- Diff: flink-yarn/src/test/java/org/apache/flink/yarn/UtilsTest.java --- @@ -18,227 +18,35 @@ package org.apache.flink.yarn; -import org.apache.flink.configuration.Configuration; -import org.apache.flink.runtime.akka.AkkaUtils; -import org.apache.flink.runtime.clusterframework.ContaineredTaskManagerParameters; -import org.apache.flink.runtime.clusterframework.messages.NotifyResourceStarted; -import org.apache.flink.runtime.clusterframework.messages.RegisterResourceManager; -import org.apache.flink.runtime.clusterframework.messages.RegisterResourceManagerSuccessful; -import org.apache.flink.runtime.instance.AkkaActorGateway; -import org.apache.flink.runtime.leaderretrieval.SettableLeaderRetrievalService; -import org.apache.flink.runtime.messages.Acknowledge; -import org.apache.flink.runtime.testingUtils.TestingUtils; -import org.apache.flink.util.TestLogger; -import org.apache.flink.yarn.messages.NotifyWhenResourcesRegistered; -import org.apache.flink.yarn.messages.RequestNumberOfRegisteredResources; - -import akka.actor.ActorRef; -import akka.actor.ActorSystem; -import akka.actor.PoisonPill; -import akka.actor.Props; -import akka.testkit.JavaTestKit; -import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; -import org.apache.hadoop.yarn.api.records.ApplicationId; -import org.apache.hadoop.yarn.api.records.Container; -import org.apache.hadoop.yarn.api.records.ContainerId; -import org.apache.hadoop.yarn.api.records.ContainerLaunchContext; -import org.apache.hadoop.yarn.api.records.NodeId; -import org.apache.hadoop.yarn.client.api.AMRMClient; -import org.apache.hadoop.yarn.client.api.NMClient; -import 
org.apache.hadoop.yarn.client.api.async.AMRMClientAsync; -import org.apache.hadoop.yarn.conf.YarnConfiguration; -import org.junit.AfterClass; -import org.junit.BeforeClass; +import org.junit.Rule; import org.junit.Test; -import org.mockito.Matchers; -import org.mockito.invocation.InvocationOnMock; -import org.mockito.stubbing.Answer; +import org.junit.rules.TemporaryFolder; -import java.util.ArrayList; +import java.nio.file.Files; +import java.nio.file.Path; import java.util.Collections; -import java.util.HashMap; -import java.util.List; -import java.util.UUID; -import java.util.concurrent.CompletableFuture; -import java.util.concurrent.TimeUnit; - -import scala.Option; -import scala.concurrent.Await; -import scala.concurrent.Future; -import scala.concurrent.duration.Deadline; -import scala.concurrent.duration.FiniteDuration; -import static org.junit.Assert.assertEquals; -import static org.mockito.Mockito.doAnswer; -import static org.mockito.Mockito.mock; -import static org.mockito.Mockito.when; +import static org.hamcrest.Matchers.equalTo; +import static org.junit.Assert.assertThat; /** * Tests for {@link Utils}. */ -public class UtilsTest extends TestLogger { +public class UtilsTest { --- End diff -- `TestLogger` missing. > YARN: Flink binaries are not deleted from HDFS after cluster shutdown > - > > Key: FLINK-9196 > URL: https://issues.apache.org/jira/browse/FLINK-9196 > Project: Flink > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Gary Yao >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > Attachments: 0001-xxx.patch > > > When deploying on YARN in flip6 mode, the Flink binaries are not deleted from > HDFS after the cluster shuts down. 
> *Steps to reproduce* > # Submit job in YARN job mode, non-detached: > {noformat} HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster > -yjm 2048 -ytm 2048 ./examples/streaming/WordCount.jar {noformat} > # Check contents of {{/user/hadoop/.flink/}} on HDFS after > job is finished: > {noformat} > [hadoop@ip-172-31-43-78 flink-1.5.0]$ hdfs dfs -ls > /user/hadoop/.flink/application_1523966184826_0016 > Found 6 items > -rw-r--r-- 1 hadoop hadoop583 2018-04-17 14:54 > /user/hadoop/.flink/application_1523966184826_0016/90cf5b3a-039e-4d52-8266-4e9563d74827-taskmanager-conf.yaml > -rw-r--r-- 1 hadoop hadoop332 2018-04-17 14:54 >
[jira] [Commented] (FLINK-9196) YARN: Flink binaries are not deleted from HDFS after cluster shutdown
[ https://issues.apache.org/jira/browse/FLINK-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457508#comment-16457508 ] ASF GitHub Bot commented on FLINK-9196: --- GitHub user GJL opened a pull request: https://github.com/apache/flink/pull/5938 [FLINK-9196][flip6, yarn] Cleanup application files when deregistering YARN AM ## What is the purpose of the change *Ensure that YARN application files are removed if the cluster is shut down.* cc: @StephanEwen @tillrohrmann ## Brief change log - *Enable graceful cluster shut down via HTTP.* - *Remove Flink application files from the remote file system when the YarnResourceManager deregisters the YARN ApplicationMaster.* ## Verifying this change This change added tests and can be verified as follows: - *Manually verified that files are removed from HDFS when running stream (attached/detached) and batch jobs (attached).* - *Manually verified that files are removed from HDFS when stopping a YARN session gracefully.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes** / no / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? 
(**not applicable** / docs / JavaDocs / not documented) You can merge this pull request into a Git repository by running: $ git pull https://github.com/GJL/flink FLINK-9196 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5938.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5938
commit 6f0c0aed8a5b54814ed2e0fa761f06317592e4b3 Author: gyao Date: 2018-04-19T08:29:43Z [hotfix] Replace String concatenation with Slf4j placeholders.
commit 34b5b40fec62502579a3f3804839c1e9d1e95952 Author: gyao Date: 2018-04-19T09:03:20Z [hotfix] Indent method parameters.
commit bcb0f24ec587c15287c6144d1c088a5327d98c6d Author: gyao Date: 2018-04-19T09:04:27Z [hotfix] Remove unnecessary int cast.
commit 264b3e664fe84583ab8e372824f6d4424627e6e1 Author: gyao Date: 2018-04-19T09:05:05Z [hotfix] Fix raw types warning.
commit 1b6eb96b3d287a20ea86606fd01b5e10564c3f5d Author: gyao Date: 2018-04-19T09:18:32Z [hotfix][tests] Rename UtilsTest to YarnFlinkResourceManagerTest. Test was misnamed.
commit e8d43ff72a2861713db934fe42163fac6d9ecb8d Author: gyao Date: 2018-04-26T15:38:20Z [hotfix][mesos] Delete unused class FlinkMesosSessionCli.
commit a4f9a5c6a44f08aa5f4a8dbbfb28a0bdb562b8c5 Author: gyao Date: 2018-04-26T15:44:56Z [hotfix][yarn] Remove unused field appReport in YarnClusterClient.
commit 1260dfac974670f325b21d175e1e29064530bb53 Author: gyao Date: 2018-04-19T10:07:54Z [FLINK-9196][flip6, yarn] Cleanup application files when deregistering YARN AM Enable graceful cluster shut down via HTTP. Remove Flink application files from remote file system when the YarnResourceManager deregisters the YARN ApplicationMaster. 
> YARN: Flink binaries are not deleted from HDFS after cluster shutdown > - > > Key: FLINK-9196 > URL: https://issues.apache.org/jira/browse/FLINK-9196 > Project: Flink > Issue Type: Bug > Components: YARN >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Gary Yao >Priority: Blocker > Labels: flip-6 > Fix For: 1.5.0 > > Attachments: 0001-xxx.patch > > > When deploying on YARN in flip6 mode, the Flink binaries are not deleted from > HDFS after the cluster shuts down. > *Steps to reproduce* > # Submit job in YARN job mode, non-detached: > {noformat} HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster > -yjm 2048 -ytm 2048 ./examples/streaming/WordCount.jar {noformat} > # Check contents of {{/user/hadoop/.flink/}} on HDFS after > job is finished: > {noformat} > [hadoop@ip-172-31-43-78 flink-1.5.0]$ hdfs dfs -ls >
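The cleanup this PR performs — removing the per-application staging directory when the ApplicationMaster deregisters — can be sketched as follows. Against HDFS the real code would use Hadoop's `FileSystem#delete(path, true)`; `java.nio` is used here only to keep the sketch self-contained and runnable on a local file system:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class AppFilesCleanupDemo {
    // Recursively delete a directory: children first, then the directory
    // itself (Files.delete refuses to remove non-empty directories).
    static void deleteRecursively(Path dir) throws IOException {
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for /user/hadoop/.flink/application_<id> on HDFS.
        Path appDir = Files.createTempDirectory("flink-app-staging");
        Files.createFile(appDir.resolve("flink-conf.yaml"));

        deleteRecursively(appDir);
        System.out.println(Files.exists(appDir)); // false
    }
}
```

The directory name and file here are illustrative; the point is that the staging directory must be deleted recursively, since it contains the distribution jar, configuration, and the `lib/` subdirectory listed in the bug report.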
[jira] [Commented] (FLINK-9268) RockDB errors from WindowOperator
[ https://issues.apache.org/jira/browse/FLINK-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457505#comment-16457505 ] Stefan Richter commented on FLINK-9268: --- This is a known issue with RocksDB, see https://github.com/facebook/rocksdb/issues/2383. Summary: in RocksDB, the whole list state for one key in a window must fit into a single byte array when returned over JNI. That means the maximum number of bytes per value of a key in a window is Integer.MAX_VALUE, i.e. 2 GB. I would guess that your job somehow collects more than that for a single key. > RockDB errors from WindowOperator > - > > Key: FLINK-9268 > URL: https://issues.apache.org/jira/browse/FLINK-9268 > Project: Flink > Issue Type: Bug > Components: DataStream API, State Backends, Checkpointing > Reporter: Narayanan Arunachalam > Priority: Major > > The job has one Kafka source and no sinks, does windowing based on sessions, and uses processing time. The job fails with the error given below after running for a few hours. The only way to recover from this error is to cancel the job and start a new one. > Using an S3 backend for externalized checkpoints. 
> A representative job DAG: > val streams = sEnv > .addSource(makeKafkaSource(config)) > .map(makeEvent) > .keyBy(_.get(EVENT_GROUP_ID)) > .window(ProcessingTimeSessionWindows.withGap(Time.seconds(60))) > .trigger(PurgingTrigger.of(ProcessingTimeTrigger.create())) > .apply(makeEventsList) > .addSink(makeNoOpSink) > A representative config: > state.backend=rocksDB > checkpoint.enabled=true > external.checkpoint.enabled=true > checkpoint.mode=AT_LEAST_ONCE > checkpoint.interval=90 > checkpoint.timeout=30 > Error: > TimerException\{java.lang.NegativeArraySizeException} > at > org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:252) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NegativeArraySizeException > at org.rocksdb.RocksDB.get(Native Method) > at org.rocksdb.RocksDB.get(RocksDB.java:810) > at > org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:86) > at > org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:49) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime(WindowOperator.java:496) > at > org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime(HeapInternalTimerService.java:255) > at > 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:249) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
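The NegativeArraySizeException in the stack trace above is consistent with this limit: the serialized list state for a key comes back from JNI as a single byte[], whose length is an int, so a size just over Integer.MAX_VALUE wraps negative. A minimal, self-contained Java sketch of that narrowing (illustrative only, not Flink or RocksDB code):

```java
public class JniSizeOverflowSketch {
    public static void main(String[] args) {
        // Hypothetical serialized size of all list entries for one key: one byte over 2 GB.
        long serializedSize = (long) Integer.MAX_VALUE + 1L;
        // A Java array length is an int, so the JNI-side size gets narrowed and wraps negative.
        int narrowed = (int) serializedSize;
        System.out.println(narrowed); // Integer.MIN_VALUE, i.e. -2147483648
        try {
            byte[] buffer = new byte[narrowed];
        } catch (NegativeArraySizeException e) {
            // Same exception type that surfaces at org.rocksdb.RocksDB.get(Native Method) above.
            System.out.println("caught NegativeArraySizeException");
        }
    }
}
```

This is why the workaround suggested in such cases is to keep the total serialized state per key per window below 2 GB, not a Flink configuration knob.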
[jira] [Commented] (FLINK-8715) RocksDB does not propagate reconfiguration of serializer to the states
[ https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457476#comment-16457476 ] Tzu-Li (Gordon) Tai commented on FLINK-8715: Merged. 1.6.0: 07aa2d469e34884d715b01166db077a4cf7cf3af 1.5.0: fe0d8135bf3520af70071e634cf4dadae634b56a > RocksDB does not propagate reconfiguration of serializer to the states > -- > > Key: FLINK-8715 > URL: https://issues.apache.org/jira/browse/FLINK-8715 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.2 >Reporter: Arvid Heise >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.5.0 > > > Any changes to the serializer done in #ensureCompatibility are lost during the > state creation. > In particular, > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68] > always uses a fresh copy of the StateDescriptor. > An easy fix is to pass the reconfigured serializer as an additional parameter > in > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681] > , which can be retrieved through the side-output of getColumnFamily > {code:java} > kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer() > {code} > I encountered it in 1.3.2 but the code in the master seems unchanged (hence > the pointer into master). I encountered it in ValueState, but I suspect the > same issue can be observed for all kinds of RocksDB states. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
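The failure mode and the suggested fix in the description can be sketched with deliberately simplified stand-ins (the class names and the kvStateInformation map below only mimic the Flink internals mentioned above; they are not the real API):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the bug report: the serializer reconfigured during
// compatibility checks is recorded in the backend's registry, but state
// creation reads a fresh copy of the StateDescriptor and loses it.
// All names here are illustrative stand-ins, not Flink's actual classes.
public class SerializerPropagationSketch {

    static final class Serializer {
        final String config;
        Serializer(String config) { this.config = config; }
    }

    static final class StateDescriptor {
        final String name;
        final Serializer serializer;
        StateDescriptor(String name, Serializer serializer) {
            this.name = name;
            this.serializer = serializer;
        }
    }

    // Stand-in for kvStateInformation: state name -> reconfigured serializer.
    static final Map<String, Serializer> kvStateInformation = new HashMap<>();

    public static void main(String[] args) {
        // Restore reconfigured the serializer and recorded it in the registry:
        kvStateInformation.put("my-state", new Serializer("reconfigured"));

        // State creation is handed a fresh descriptor copy with the default serializer:
        StateDescriptor freshCopy = new StateDescriptor("my-state", new Serializer("default"));

        // Buggy path: use the descriptor's serializer, dropping the reconfiguration.
        Serializer buggy = freshCopy.serializer;

        // Suggested fix: look the serializer up in the registry instead.
        Serializer fixed = kvStateInformation.getOrDefault(freshCopy.name, freshCopy.serializer);

        System.out.println(buggy.config + " vs " + fixed.config); // default vs reconfigured
    }
}
```

The point of the sketch is only the lookup direction: state construction must consult the backend registry (the role `kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer()` plays in the description) rather than the descriptor copy.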
[jira] [Resolved] (FLINK-8715) RocksDB does not propagate reconfiguration of serializer to the states
[ https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai resolved FLINK-8715. Resolution: Fixed > RocksDB does not propagate reconfiguration of serializer to the states > -- > > Key: FLINK-8715 > URL: https://issues.apache.org/jira/browse/FLINK-8715 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.2 >Reporter: Arvid Heise >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.5.0 > > > Any changes to the serializer done in #ensureCompatibility are lost during the > state creation. > In particular, > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68] > always uses a fresh copy of the StateDescriptor. > An easy fix is to pass the reconfigured serializer as an additional parameter > in > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681] > , which can be retrieved through the side-output of getColumnFamily > {code:java} > kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer() > {code} > I encountered it in 1.3.2 but the code in the master seems unchanged (hence > the pointer into master). I encountered it in ValueState, but I suspect the > same issue can be observed for all kinds of RocksDB states. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8715) RocksDB does not propagate reconfiguration of serializer to the states
[ https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457473#comment-16457473 ] ASF GitHub Bot commented on FLINK-8715: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/5885 Thanks for the reviews @bowenli86 @StefanRRichter! I've merged this. > RocksDB does not propagate reconfiguration of serializer to the states > -- > > Key: FLINK-8715 > URL: https://issues.apache.org/jira/browse/FLINK-8715 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.2 >Reporter: Arvid Heise >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.5.0 > > > Any changes to the serializer done in #ensureCompatibility are lost during the > state creation. > In particular, > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68] > always uses a fresh copy of the StateDescriptor. > An easy fix is to pass the reconfigured serializer as an additional parameter > in > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681] > , which can be retrieved through the side-output of getColumnFamily > {code:java} > kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer() > {code} > I encountered it in 1.3.2 but the code in the master seems unchanged (hence > the pointer into master). I encountered it in ValueState, but I suspect the > same issue can be observed for all kinds of RocksDB states. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8715) RocksDB does not propagate reconfiguration of serializer to the states
[ https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457474#comment-16457474 ] ASF GitHub Bot commented on FLINK-8715: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5885 > RocksDB does not propagate reconfiguration of serializer to the states > -- > > Key: FLINK-8715 > URL: https://issues.apache.org/jira/browse/FLINK-8715 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.2 >Reporter: Arvid Heise >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.5.0 > > > Any changes to the serializer done in #ensureCompatibility are lost during the > state creation. > In particular, > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68] > always uses a fresh copy of the StateDescriptor. > An easy fix is to pass the reconfigured serializer as an additional parameter > in > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681] > , which can be retrieved through the side-output of getColumnFamily > {code:java} > kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer() > {code} > I encountered it in 1.3.2 but the code in the master seems unchanged (hence > the pointer into master). I encountered it in ValueState, but I suspect the > same issue can be observed for all kinds of RocksDB states. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5885: [FLINK-8715] Remove usage of StateDescriptor in st...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/5885 ---
[GitHub] flink issue #5885: [FLINK-8715] Remove usage of StateDescriptor in state han...
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/5885 Thanks for the reviews @bowenli86 @StefanRRichter! I've merged this. ---
[jira] [Commented] (FLINK-8715) RocksDB does not propagate reconfiguration of serializer to the states
[ https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457468#comment-16457468 ] ASF GitHub Bot commented on FLINK-8715: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/5885#discussion_r184846476 --- Diff: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/AbstractRocksDBState.java --- @@ -190,4 +197,12 @@ protected void writeKeyWithGroupAndNamespace( RocksDBKeySerializationUtils.writeKey(key, keySerializer, keySerializationStream, keySerializationDataOutputView, ambiguousKeyPossible); RocksDBKeySerializationUtils.writeNameSpace(namespace, namespaceSerializer, keySerializationStream, keySerializationDataOutputView, ambiguousKeyPossible); } + + protected V getDefaultValue() { --- End diff -- Since this PR is fixing a blocker bug for the 1.5 release, I'll proceed to merge this as it is. @bowenli86 Perhaps we can open a separate JIRA for this to keep this in mind? > RocksDB does not propagate reconfiguration of serializer to the states > -- > > Key: FLINK-8715 > URL: https://issues.apache.org/jira/browse/FLINK-8715 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.2 >Reporter: Arvid Heise >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.5.0 > > > Any changes to the serializer done in #ensureCompatibility are lost during the > state creation. > In particular, > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68] > always uses a fresh copy of the StateDescriptor. 
> An easy fix is to pass the reconfigured serializer as an additional parameter > in > [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681] > , which can be retrieved through the side-output of getColumnFamily > {code:java} > kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer() > {code} > I encountered it in 1.3.2 but the code in the master seems unchanged (hence > the pointer into master). I encountered it in ValueState, but I suspect the > same issue can be observed for all kinds of RocksDB states. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5885: [FLINK-8715] Remove usage of StateDescriptor in st...
Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/5885#discussion_r184846476 --- Diff: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/AbstractRocksDBState.java --- @@ -190,4 +197,12 @@ protected void writeKeyWithGroupAndNamespace( RocksDBKeySerializationUtils.writeKey(key, keySerializer, keySerializationStream, keySerializationDataOutputView, ambiguousKeyPossible); RocksDBKeySerializationUtils.writeNameSpace(namespace, namespaceSerializer, keySerializationStream, keySerializationDataOutputView, ambiguousKeyPossible); } + + protected V getDefaultValue() { --- End diff -- Since this PR is fixing a blocker bug for the 1.5 release, I'll proceed to merge this as it is. @bowenli86 Perhaps we can open a separate JIRA for this to keep this in mind? ---
[jira] [Updated] (FLINK-9270) Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure
[ https://issues.apache.org/jira/browse/FLINK-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li updated FLINK-9270: Description: Upgrade RocksDB to 5.11.3 to pick up the latest bug fixes Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be run concurrently if there's only a {{try}} clause without a {{catch}} following. For example, sometimes, {{RocksDBPerformanceTest.testRocksDbMergePerformance()}} will actually be running in 3 concurrent invocations, and multiple concurrent writes to RocksDB result in errors. was: Upgrade RocksDB to 5.11.3 Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be run concurrently. For example, sometimes, a unit test with {{@RetryOnFailure(times=3)}} will actually be running in 3 concurrent invocations from the beginning, and the success of any one of them will count as success of the unit test. But with a RocksDB unit test, that behavior can lead to test failures because of multiple concurrent writes. > Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of > @RetryOnFailure > > > Key: FLINK-9270 > URL: https://issues.apache.org/jira/browse/FLINK-9270 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.6.0 > > > Upgrade RocksDB to 5.11.3 to pick up the latest bug fixes > Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be > run concurrently if there's only a {{try}} clause without a {{catch}} > following. For example, sometimes, > {{RocksDBPerformanceTest.testRocksDbMergePerformance()}} will actually be > running in 3 concurrent invocations, and multiple concurrent writes to RocksDB > result in errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9270) Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure
[ https://issues.apache.org/jira/browse/FLINK-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457453#comment-16457453 ] ASF GitHub Bot commented on FLINK-9270: --- GitHub user bowenli86 opened a pull request: https://github.com/apache/flink/pull/5937 [FLINK-9270] Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure ## What is the purpose of the change - Upgrade RocksDB to 5.11.3 to pick up the latest bug fixes - Besides, I found that unit tests annotated with `@RetryOnFailure` will be run concurrently if there's only a `try` clause without a `catch` following. For example, sometimes, `RocksDBPerformanceTest.testRocksDbMergePerformance()` will actually be running in 3 concurrent invocations, and multiple concurrent writes to RocksDB result in errors. ## Brief change log - Upgrade RocksDB to 5.11.3 - For all RocksDB performance tests, add a `catch` clause to follow `try` This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: none ## Documentation none You can merge this pull request into a Git repository by running: $ git pull https://github.com/bowenli86/flink FLINK-9270 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5937.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5937 commit 9fed7b9cba78dfdc6512818d9c2c07fc80892d72 Author: Bowen Li Date: 2018-04-28T08:32:09Z [FLINK-9270] Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure > Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of > @RetryOnFailure > > > Key: FLINK-9270 > URL: https://issues.apache.org/jira/browse/FLINK-9270 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Bowen Li >Assignee: Bowen Li 
>Priority: Major > Fix For: 1.6.0 > > > Upgrade RocksDB to 5.11.3 > Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be > run concurrently. For example, sometimes, a unit test with > {{@RetryOnFailure(times=3)}} will actually be running in 3 concurrent > invocations from the beginning, and the success of any one of them will count as success > of the unit test. But with a RocksDB unit test, that behavior can lead to test > failures because of multiple concurrent writes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5937: [FLINK-9270] Upgrade RocksDB to 5.11.3, and resolv...
GitHub user bowenli86 opened a pull request: https://github.com/apache/flink/pull/5937 [FLINK-9270] Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure ## What is the purpose of the change - Upgrade RocksDB to 5.11.3 to pick up the latest bug fixes - Besides, I found that unit tests annotated with `@RetryOnFailure` will be run concurrently if there's only a `try` clause without a `catch` following. For example, sometimes, `RocksDBPerformanceTest.testRocksDbMergePerformance()` will actually be running in 3 concurrent invocations, and multiple concurrent writes to RocksDB result in errors. ## Brief change log - Upgrade RocksDB to 5.11.3 - For all RocksDB performance tests, add a `catch` clause to follow `try` This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: none ## Documentation none You can merge this pull request into a Git repository by running: $ git pull https://github.com/bowenli86/flink FLINK-9270 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5937.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5937 commit 9fed7b9cba78dfdc6512818d9c2c07fc80892d72 Author: Bowen Li Date: 2018-04-28T08:32:09Z [FLINK-9270] Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure ---
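The retry semantics under discussion can be sketched with a plain helper in the spirit of @RetryOnFailure(times = 3) (this is not Flink's actual test rule; it only illustrates why a failed attempt needs an explicit catch to clean up, e.g. close its RocksDB instance, before the next attempt touches the same on-disk state):

```java
// Hedged sketch of retry-on-failure semantics; not Flink's real @RetryOnFailure rule.
public class RetryOnFailureSketch {

    interface Attempt {
        void run() throws Exception;
    }

    static void retry(int times, Attempt attempt) throws Exception {
        Exception last = null;
        for (int i = 0; i < times; i++) {
            try {
                attempt.run();
                return; // success: stop retrying
            } catch (Exception e) {
                // The catch block is where a failed attempt can release its
                // resources before the next attempt runs, so attempts never
                // overlap on the same state.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        retry(3, () -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IllegalStateException("flaky attempt " + calls[0]);
            }
        });
        System.out.println("succeeded after " + calls[0] + " attempts");
    }
}
```

The PR's fix of "add a `catch` clause to follow `try`" corresponds to giving each attempt such a clean-up point instead of letting the failure escape without releasing resources.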
[jira] [Updated] (FLINK-9270) Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure
[ https://issues.apache.org/jira/browse/FLINK-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bowen Li updated FLINK-9270: Description: Upgrade RocksDB to 5.11.3 Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be run concurrently. For example, sometimes, a unit test with {{@RetryOnFailure(times=3)}} will actually be running in 3 concurrent invocations from the beginning, and the success of any one of them will count as success of the unit test. But with a RocksDB unit test, that behavior can lead to test failures because of multiple concurrent writes. > Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of > @RetryOnFailure > > > Key: FLINK-9270 > URL: https://issues.apache.org/jira/browse/FLINK-9270 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.5.0 >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Major > Fix For: 1.6.0 > > > Upgrade RocksDB to 5.11.3 > Besides, I found that unit tests annotated with {{@RetryOnFailure}} will be > run concurrently. For example, sometimes, a unit test with > {{@RetryOnFailure(times=3)}} will actually be running in 3 concurrent > invocations from the beginning, and the success of any one of them will count as success > of the unit test. But with a RocksDB unit test, that behavior can lead to test > failures because of multiple concurrent writes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9270) Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure
Bowen Li created FLINK-9270: --- Summary: Upgrade RocksDB to 5.11.3, and resolve concurrent test invocation problem of @RetryOnFailure Key: FLINK-9270 URL: https://issues.apache.org/jira/browse/FLINK-9270 Project: Flink Issue Type: Improvement Components: State Backends, Checkpointing Affects Versions: 1.5.0 Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.6.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9268) RockDB errors from WindowOperator
[ https://issues.apache.org/jira/browse/FLINK-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457446#comment-16457446 ] Sihua Zhou commented on FLINK-9268: --- Hi [~narayaruna] which version are you using? > RockDB errors from WindowOperator > - > > Key: FLINK-9268 > URL: https://issues.apache.org/jira/browse/FLINK-9268 > Project: Flink > Issue Type: Bug > Components: DataStream API, State Backends, Checkpointing >Reporter: Narayanan Arunachalam >Priority: Major > > The job has no sinks, one Kafka source, does a windowing based on session and > uses processing time. The job fails with the error given below after running > for few hours. The only way to recover from this error is to cancel the job > and start a new one. > Using S3 backend for externalized checkpoints. > A representative job DAG: > val streams = sEnv > .addSource(makeKafkaSource(config)) > .map(makeEvent) > .keyBy(_.get(EVENT_GROUP_ID)) > .window(ProcessingTimeSessionWindows.withGap(Time.seconds(60))) > .trigger(PurgingTrigger.of(ProcessingTimeTrigger.create())) > .apply(makeEventsList) > .addSink(makeNoOpSink) > A representative config: > state.backend=rocksDB > checkpoint.enabled=true > external.checkpoint.enabled=true > checkpoint.mode=AT_LEAST_ONCE > checkpoint.interval=90 > checkpoint.timeout=30 > Error: > TimerException\{java.lang.NegativeArraySizeException} > at > org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:252) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NegativeArraySizeException > at org.rocksdb.RocksDB.get(Native Method) > at org.rocksdb.RocksDB.get(RocksDB.java:810) > at > org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:86) > at > org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:49) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime(WindowOperator.java:496) > at > org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime(HeapInternalTimerService.java:255) > at > org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:249) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9265) Upgrade Prometheus version
[ https://issues.apache.org/jira/browse/FLINK-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457402#comment-16457402 ] ASF GitHub Bot commented on FLINK-9265: --- GitHub user yanghua opened a pull request: https://github.com/apache/flink/pull/5936 [FLINK-9265] Upgrade Prometheus version ## What is the purpose of the change *This pull request upgrades the Prometheus version to the latest version* ## Brief change log - *Upgrade the Prometheus version* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (**yes** / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? 
(not applicable / docs / JavaDocs / **not documented**) You can merge this pull request into a Git repository by running: $ git pull https://github.com/yanghua/flink FLINK-9265 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5936.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5936 commit e29ddaa56d8236c7df284b9f818c24df662c395b Author: yanghua Date: 2018-04-28T06:38:16Z [FLINK-9265] Upgrade Prometheus version > Upgrade Prometheus version > -- > > Key: FLINK-9265 > URL: https://issues.apache.org/jira/browse/FLINK-9265 > Project: Flink > Issue Type: Improvement >Reporter: Ted Yu >Assignee: vinoyang >Priority: Minor > > We're using 0.0.26 > Latest release is 2.2.1 > This issue is for upgrading the Prometheus version -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] flink pull request #5936: [FLINK-9265] Upgrade Prometheus version
GitHub user yanghua opened a pull request: https://github.com/apache/flink/pull/5936 [FLINK-9265] Upgrade Prometheus version ## What is the purpose of the change *This pull request upgrades the Prometheus version to the latest version* ## Brief change log - *Upgrade the Prometheus version* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (**yes** / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / **not documented**) You can merge this pull request into a Git repository by running: $ git pull https://github.com/yanghua/flink FLINK-9265 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/5936.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5936 commit e29ddaa56d8236c7df284b9f818c24df662c395b Author: yanghua Date: 2018-04-28T06:38:16Z [FLINK-9265] Upgrade Prometheus version ---
[jira] [Commented] (FLINK-9265) Upgrade Prometheus version
[ https://issues.apache.org/jira/browse/FLINK-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457398#comment-16457398 ] vinoyang commented on FLINK-9265: - It seems the latest release version is 0.3.0. > Upgrade Prometheus version > -- > > Key: FLINK-9265 > URL: https://issues.apache.org/jira/browse/FLINK-9265 > Project: Flink > Issue Type: Improvement >Reporter: Ted Yu >Assignee: vinoyang >Priority: Minor > > We're using 0.0.26 > Latest release is 2.2.1 > This issue is for upgrading the Prometheus version -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8795) Scala shell broken for Flip6
[ https://issues.apache.org/jira/browse/FLINK-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457396#comment-16457396 ] Alex Chen commented on FLINK-8795: -- Is it worth documenting the workaround (for now) in [https://ci.apache.org/projects/flink/flink-docs-master/dev/scala_shell.html]? If so, I'd like to do that. > Scala shell broken for Flip6 > > > Key: FLINK-8795 > URL: https://issues.apache.org/jira/browse/FLINK-8795 > Project: Flink > Issue Type: Bug >Reporter: kant kodali >Priority: Blocker > Fix For: 1.6.0 > > > I am trying to run the simple code below after building everything from > Flink's github master branch for various reasons. I get the exception below > and I wonder what runs on port 9065, and how to fix this exception? > I followed the instructions from the Flink master branch so I did the > following. > {code:java} > git clone https://github.com/apache/flink.git > cd flink mvn clean package -DskipTests > cd build-target > ./bin/start-scala-shell.sh local{code} > {{And Here is the code I ran}} > {code:java} > val dataStream = senv.fromElements(1, 2, 3, 4) > dataStream.countWindowAll(2).sum(0).print() > senv.execute("My streaming program"){code} > {{And I finally get this exception}} > {code:java} > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to > submit JobGraph. 
at > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$18(RestClusterClient.java:306) > at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) > at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > at > org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$222(RestClient.java:196) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > at > 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) > at java.lang.Thread.run(Thread.java:745) Caused by: > java.util.concurrent.CompletionException: java.net.ConnectException: > Connection refused: localhost/127.0.0.1:9065 at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) > at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) > at > java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943) > at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926) > ... 16 more Caused by: java.net.ConnectException: Connection refused: > localhost/127.0.0.1:9065 at sun.nio.ch.SocketChannelImpl.checkConnect(Native > Method) at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at > org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281){code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)