[jira] [Commented] (FLINK-19004) Fail to call Hive percentile function together with distinct aggregate call

2020-08-20 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181194#comment-17181194
 ] 

Rui Li commented on FLINK-19004:


It seems the {{MIN}} is introduced when 
{{FlinkAggregateExpandDistinctAggregatesRule}} rewrites the original aggregate 
with grouping sets: the rule wraps each non-distinct aggregate call in a {{MIN}} 
over the grouping-sets output. Since the return type of {{percentile}} is an 
array, and {{MIN}} does not support the ARRAY type, the rewritten plan fails.

> Fail to call Hive percentile function together with distinct aggregate call
> ---
>
> Key: FLINK-19004
> URL: https://issues.apache.org/jira/browse/FLINK-19004
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive, Table SQL / Planner
>Reporter: Rui Li
>Priority: Major
>
> The following test case would fail:
> {code}
> @Test
> public void test() throws Exception {
>     TableEnvironment tableEnv = getTableEnvWithHiveCatalog();
>     tableEnv.unloadModule("core");
>     tableEnv.loadModule("hive", new HiveModule());
>     tableEnv.loadModule("core", CoreModule.INSTANCE);
>     tableEnv.executeSql("create table src(x int,y int)");
>     tableEnv.executeSql("select count(distinct y),`percentile`(y,`array`(0.5,0.99)) from src group by x").collect();
> }
> {code}
> The error is:
> {noformat}
> org.apache.flink.table.api.TableException: Cannot generate a valid execution plan for the given query:
> FlinkLogicalLegacySink(name=[collect], fields=[EXPR$0, EXPR$1])
> +- FlinkLogicalCalc(select=[EXPR$0, EXPR$1])
>    +- FlinkLogicalAggregate(group=[{0}], EXPR$0=[COUNT($1) FILTER $3], EXPR$1=[MIN($2) FILTER $4])
>       +- FlinkLogicalCalc(select=[x, y, EXPR$1, =(CASE(=($e, 0:BIGINT), 0:BIGINT, 1:BIGINT), 0) AS $g_0, =(CASE(=($e, 0:BIGINT), 0:BIGINT, 1:BIGINT), 1) AS $g_1])
>          +- FlinkLogicalAggregate(group=[{0, 1, 3}], EXPR$1=[percentile($4, $2)])
>             +- FlinkLogicalExpand(projects=[x, y, $f2, $e, y_0])
>                +- FlinkLogicalCalc(select=[x, y, array(0.5:DECIMAL(2, 1), 0.99:DECIMAL(3, 2)) AS $f2])
>                   +- FlinkLogicalLegacyTableSourceScan(table=[[test-catalog, default, src, source: [HiveTableSource(x, y) TablePath: default.src, PartitionPruned: false, PartitionNums: null]]], fields=[x, y])
> Min aggregate function does not support type: ''ARRAY''.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread ZhuShang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181190#comment-17181190
 ] 

ZhuShang commented on FLINK-18850:
--

Sorry, I get it.

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] danny0405 commented on a change in pull request #13050: [FLINK-18750][table] SqlValidatorException thrown when select from a …

2020-08-20 Thread GitBox


danny0405 commented on a change in pull request #13050:
URL: https://github.com/apache/flink/pull/13050#discussion_r473975682



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/utils/Expander.java
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.utils;
+
+import org.apache.flink.table.planner.calcite.FlinkPlannerImpl;
+
+import org.apache.flink.shaded.guava18.com.google.common.collect.ImmutableMap;
+
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.parser.SqlParser;
+import org.apache.calcite.sql.parser.SqlParserPos;
+import org.apache.calcite.sql.util.SqlBasicVisitor;
+import org.apache.calcite.sql.util.SqlShuttle;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Objects;
+import java.util.function.Function;
+
+/**
+ * Utility that expands SQL identifiers in a SQL query.
+ *
+ * <p>Simple use:
+ *
+ * <pre>{@code
+ * final String sql =
+ *     "select ename from emp where deptno < 10";
+ * final Expander.Expanded expanded =
+ *     Expander.create(planner).expanded(sql);
+ * print(expanded); // "select `emp`.`ename` from `catalog`.`db`.`emp` where `emp`.`deptno` < 10"
+ * }</pre>
+ *
+ * <p>Calling {@link Expanded#toString()} generates a string that is similar to
+ * the original SQL but with all identifiers manually expanded, and which could
+ * then be persisted as the expanded query of a catalog view.
+ *
+ * <p>For more advanced formatting, use {@link Expanded#substitute(Function)}.
+ *
+ * <p>Adjust {@link SqlParser.Config} to use a different parser or parsing
+ * options.
+ */
+public class Expander {

Review comment:
   Yes, `BridgingSqlFunction` expands the function identifier outside the 
scope of the `SqlValidator`, so with the code base of this patch it does not 
work: the `BridgingSqlFunction` identifier position is always `SqlParserPos.ZERO`, 
so it cannot be matched against the original identifier.
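
   For context, the expansion is keyed on parser positions, roughly like the following simplified sketch (my own illustration of the idea, not code from this patch), which is why an identifier carrying `SqlParserPos.ZERO` can never be substituted:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.calcite.sql.SqlIdentifier;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParserPos;
import org.apache.calcite.sql.util.SqlBasicVisitor;

public class IdentifierPositions {

    /** Maps each identifier's source position to the identifier itself. */
    public static Map<SqlParserPos, SqlIdentifier> collect(SqlNode validated) {
        final Map<SqlParserPos, SqlIdentifier> expansions = new HashMap<>();
        validated.accept(new SqlBasicVisitor<Void>() {
            @Override
            public Void visit(SqlIdentifier id) {
                // SqlParserPos.ZERO carries no real source offset, so such an
                // identifier cannot be matched back to the original query text.
                if (!SqlParserPos.ZERO.equals(id.getParserPosition())) {
                    expansions.put(id.getParserPosition(), id);
                }
                return null;
            }
        });
        return expansions;
    }
}
```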





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xccui commented on a change in pull request #13089: [FLINK-18813][docs-zh] Translate the 'Local Installation' page of 'Try Flink' into Chinese

2020-08-20 Thread GitBox


xccui commented on a change in pull request #13089:
URL: https://github.com/apache/flink/pull/13089#discussion_r473944845



##
File path: docs/try-flink/local_installation.zh.md
##
@@ -26,36 +26,35 @@ under the License.
 {% if site.version contains "SNAPSHOT" %}
 
   
-  NOTE: The Apache Flink community only publishes official builds for
-  released versions of Apache Flink.
+  注意:Apache Flink 社区只发布 Apache Flink 的 release 版本。
   
-  Since you are currently looking at the latest SNAPSHOT
-  version of the documentation, all version references below will not work.
-  Please switch the documentation to the latest released version via the 
release picker which you
-  find on the left side below the menu.
+  由于你当前正在查看的是文档最新的 SNAPSHOT 版本,因此相关内容会被隐藏。请通过左侧菜单底部的版本选择将文档切换到最新的 release 版本。
 
 {% else %}
-Follow these few steps to download the latest stable versions and get started.
+请按照以下几个步骤下载最新的稳定版本并开始使用。
 
-## Step 1: Download
+
 
-To be able to run Flink, the only requirement is to have a working __Java 8 or 
11__ installation.
-You can check the correct installation of Java by issuing the following 
command:
+## 步骤 1:下载
+
+为了能够运行 Flink,唯一的要求就是安装有效的 __Java 8 或者 Java 11__。你可以通过运行以下命令来检查 Java 的正确安装。

Review comment:
   
   为了运行Flink,只需提前安装好 __Java 8 或者 Java 11__ 。你可以通过以下命令来检查 Java 是否已经安装正确。

##
File path: docs/try-flink/local_installation.zh.md
##
@@ -26,36 +26,35 @@ under the License.
 {% if site.version contains "SNAPSHOT" %}
 
   
-  NOTE: The Apache Flink community only publishes official builds for
-  released versions of Apache Flink.
+  注意:Apache Flink 社区只发布 Apache Flink 的 release 版本。
   
-  Since you are currently looking at the latest SNAPSHOT
-  version of the documentation, all version references below will not work.
-  Please switch the documentation to the latest released version via the 
release picker which you
-  find on the left side below the menu.
+  由于你当前正在查看的是文档最新的 SNAPSHOT 版本,因此相关内容会被隐藏。请通过左侧菜单底部的版本选择将文档切换到最新的 release 版本。
 
 {% else %}
-Follow these few steps to download the latest stable versions and get started.
+请按照以下几个步骤下载最新的稳定版本并开始使用。

Review comment:
   Remove “并”

##
File path: docs/try-flink/local_installation.zh.md
##
@@ -64,10 +63,11 @@ Starting standalonesession daemon on host.
 Starting taskexecutor daemon on host.
 {% endhighlight %}
 
-## Step 3: Submit a Job
+
 
-Releases of Flink come with a number of example Jobs.
-You can quickly deploy one of these applications to the running cluster. 
+## 步骤 3:提交作业(Job)
+
+Flink 的 Releases 附带了许多的示例作业。你可以将这些应用程序之一快速部署到正在运行的集群。

Review comment:
   Flink 的 Releases 附带了许多的示例作业。你可以任意选择一个,快速部署到已运行的集群上。

##
File path: docs/try-flink/local_installation.zh.md
##
@@ -80,11 +80,13 @@ $ tail log/flink-*-taskexecutor-*.out
   (be,2)
 {% endhighlight %}
 
-Additionally, you can check Flink's [Web UI](http://localhost:8080) to monitor 
the status of the Cluster and running Job.
+另外,你可以查看 Flink 的 [Web UI](http://localhost:8080) 来监视集群的状态和正在运行的作业。

Review comment:
   查看 -> 通过





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13208: [FLINK-18676] [FileSystem] Bump s3 aws version

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13208:
URL: https://github.com/apache/flink/pull/13208#issuecomment-677651717


   
   ## CI report:
   
   * 3c1d30e68c960e1009b93d51582053ccf67ceb37 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5746)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12880: [FLINK-18555][table sql/api] Make TableConfig options can be configur…

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #12880:
URL: https://github.com/apache/flink/pull/12880#issuecomment-657437173


   
   ## CI report:
   
   * 72e7dcb66f3a2c6eda4899d51ddd6420d2d5822b Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5744)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13207: [FLINK-18952][python][docs] Add 10 minutes to DataStream API documentation

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13207:
URL: https://github.com/apache/flink/pull/13207#issuecomment-677643498


   
   ## CI report:
   
   * 371eaa2083054d66ef256a9a0ef7a90e116fee0d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5745)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13208: [FLINK-18676] [FileSystem] Bump s3 aws version

2020-08-20 Thread GitBox


flinkbot commented on pull request #13208:
URL: https://github.com/apache/flink/pull/13208#issuecomment-677651717


   
   ## CI report:
   
   * 3c1d30e68c960e1009b93d51582053ccf67ceb37 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181158#comment-17181158
 ] 

Benchao Li commented on FLINK-18850:


[~ZhuShang] It seems you misunderstood this issue. What I want to do in this 
issue is to add the metric to the over windows, not to the window operator.
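
For illustration, a rough sketch of registering such a counter in a row-time over-window function (the class and names below are illustrative, not the final implementation):
{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.table.data.RowData;
import org.apache.flink.util.Collector;

public class OverWindowLateMetricSketch extends KeyedProcessFunction<RowData, RowData, RowData> {

    private static final String LATE_ELEMENTS_DROPPED_METRIC_NAME = "numLateRecordsDropped";
    private transient Counter numLateRecordsDropped;

    @Override
    public void open(Configuration parameters) {
        // register the counter on the operator's metric group, as WindowOperator does
        numLateRecordsDropped =
            getRuntimeContext().getMetricGroup().counter(LATE_ELEMENTS_DROPPED_METRIC_NAME);
    }

    @Override
    public void processElement(RowData input, Context ctx, Collector<RowData> out) {
        long timestamp = ctx.timestamp();
        if (timestamp <= ctx.timerService().currentWatermark()) {
            // the record is late for the over window: count it instead of dropping it silently
            numLateRecordsDropped.inc();
            return;
        }
        // ... normal over-window aggregation would continue here ...
    }
}
{code}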

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19006) project transformation does not support conversion to Tuple25 type

2020-08-20 Thread ming li (Jira)
ming li created FLINK-19006:
---

 Summary: project transformation does not support conversion to 
Tuple25 type
 Key: FLINK-19006
 URL: https://issues.apache.org/jira/browse/FLINK-19006
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.11.1
Reporter: ming li


The {{DataStream#project}} method checks whether the length of 
{{fieldIndexes}} is between 1 and {{Tuple.MAX_ARITY - 1}}, and then calls the 
matching {{projectTupleXX}} method based on that length. This limits the 
maximum arity of the projected {{Tuple}} to 24.
{code:java}
protected StreamProjection(DataStream<IN> dataStream, int[] fieldIndexes) {
    if (!dataStream.getType().isTupleType()) {
        throw new RuntimeException("Only Tuple DataStreams can be projected");
    }
    if (fieldIndexes.length == 0) {
        throw new IllegalArgumentException("project() needs to select at least one (1) field.");
    } else if (fieldIndexes.length > Tuple.MAX_ARITY - 1) {
        throw new IllegalArgumentException(
            "project() may select only up to (" + (Tuple.MAX_ARITY - 1) + ") fields.");
    }

    int maxFieldIndex = (dataStream.getType()).getArity();
    for (int i = 0; i < fieldIndexes.length; i++) {
        Preconditions.checkElementIndex(fieldIndexes[i], maxFieldIndex);
    }

    this.dataStream = dataStream;
    this.fieldIndexes = fieldIndexes;
}{code}
This problem also appears in {{ProjectOperator}}.
{code:java}
public Projection(DataSet<T> ds, int[] fieldIndexes) {
    if (!(ds.getType() instanceof TupleTypeInfo)) {
        throw new UnsupportedOperationException("project() can only be applied to DataSets of Tuples.");
    }

    if (fieldIndexes.length == 0) {
        throw new IllegalArgumentException("project() needs to select at least one (1) field.");
    } else if (fieldIndexes.length > Tuple.MAX_ARITY - 1) {
        throw new IllegalArgumentException(
            "project() may select only up to (" + (Tuple.MAX_ARITY - 1) + ") fields.");
    }

    int maxFieldIndex = ds.getType().getArity();
    for (int fieldIndexe : fieldIndexes) {
        Preconditions.checkElementIndex(fieldIndexe, maxFieldIndex);
    }

    this.ds = ds;
    this.fieldIndexes = fieldIndexes;
}{code}
I think the allowed length should be 1 to {{Tuple.MAX_ARITY}} instead of 1 to 
{{Tuple.MAX_ARITY - 1}}.
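
As a self-contained illustration of the off-by-one (the snippet below is my own, not from the Flink code base):
{code:java}
import java.util.stream.IntStream;

import org.apache.flink.api.java.tuple.Tuple;

public class ProjectArityDemo {
    public static void main(String[] args) {
        // Tuple.MAX_ARITY is 25, so Tuple25 is a valid tuple type...
        int[] all25Fields = IntStream.range(0, Tuple.MAX_ARITY).toArray();

        // ...but calling project(all25Fields) on a Tuple25-typed DataStream/DataSet
        // currently fails with: project() may select only up to (24) fields.
        System.out.println("selected fields: " + all25Fields.length
                + ", current limit: " + (Tuple.MAX_ARITY - 1));
    }
}
{code}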



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #13207: [FLINK-18952][python][docs] Add 10 minutes to DataStream API documentation

2020-08-20 Thread GitBox


flinkbot commented on pull request #13207:
URL: https://github.com/apache/flink/pull/13207#issuecomment-677643498


   
   ## CI report:
   
   * 371eaa2083054d66ef256a9a0ef7a90e116fee0d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13205: [FLINK-17330[runtime] Merge cyclic dependent pipelined regions into one region

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13205:
URL: https://github.com/apache/flink/pull/13205#issuecomment-677463515


   
   ## CI report:
   
   * e11ffec55b9151857069d64c806b47cc98d9679d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5739)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13208: [FLINK-18676] [FileSystem] Bump s3 aws version

2020-08-20 Thread GitBox


flinkbot commented on pull request #13208:
URL: https://github.com/apache/flink/pull/13208#issuecomment-677643552


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 3c1d30e68c960e1009b93d51582053ccf67ceb37 (Thu Aug 20 
12:47:30 UTC 2020)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-18676).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-18676) Update version of aws to support use of default constructor of "WebIdentityTokenCredentialsProvider"

2020-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-18676:
---
Labels: pull-request-available  (was: )

> Update version of aws to support use of default constructor of 
> "WebIdentityTokenCredentialsProvider"
> 
>
> Key: FLINK-18676
> URL: https://issues.apache.org/jira/browse/FLINK-18676
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.11.0
>Reporter: Ravi Bhushan Ratnakar
>Priority: Minor
>  Labels: pull-request-available
>
> *Background:*
> I am using Flink 1.11.0 on kubernetes platform. To give access of aws 
> services to taskmanager/jobmanager, we are using "IAM Roles for Service 
> Accounts" . I have configured below property in flink-conf.yaml to use 
> credential provider.
> fs.s3a.aws.credentials.provider: 
> com.amazonaws.auth.WebIdentityTokenCredentialsProvider
>  
> *Issue:*
> When taskmanager/jobmanager is starting up, during this it complains that 
> "WebIdentityTokenCredentialsProvider" doesn't have "public constructor" and 
> container doesn't come up.
>  
> *Solution:*
> Currently the above credential's class is being used from 
> "*flink-s3-fs-hadoop"* which gets "aws-java-sdk-core" dependency from 
> "*flink-s3-fs-base*". In *"flink-s3-fs-base",*  version of aws is 1.11.754 . 
> The support of default constructor for "WebIdentityTokenCredentialsProvider" 
> is provided from aws version 1.11.788 and onward.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18938) Throw better exception message for querying sink-only connector

2020-08-20 Thread liufangliang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181156#comment-17181156
 ] 

liufangliang commented on FLINK-18938:
--

[~jark] I understand what you mean.

When we are querying a sink-only connector:

if we don't have the dependency jar in the classpath, return the following 
exception message:
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'elasticsearch-7' that implements 
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
{code}
if we do have the dependency jar in the classpath, return the following exception 
message:  
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: The connector named 
'elasticsearch-7' is only supported as a sink, it can't be used as a source.
{code}
Besides, a hint about the supported capability (sink and/or source) can follow each identifier. For example:
{code:java}
Available factory identifiers are: 

datagen (source,sink)
elasticsearch-7 (sink-only)
test-connector (source-only)
{code}
What do you think?
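
A rough sketch of how the source-factory lookup could detect this case before falling back to the generic "Could not find any factory" error (the class and method below are my own illustration, not existing Flink code):
{code:java}
import java.util.ServiceLoader;

import org.apache.flink.table.api.ValidationException;
import org.apache.flink.table.factories.DynamicTableSinkFactory;
import org.apache.flink.table.factories.DynamicTableSourceFactory;
import org.apache.flink.table.factories.Factory;

public final class SinkOnlyCheck {

    /** Throws a more explicit error if the identifier matches a sink-only factory. */
    public static void checkNotSinkOnly(String identifier) {
        boolean hasSink = false;
        boolean hasSource = false;
        // factories are discovered via the same ServiceLoader mechanism FactoryUtil uses
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            if (!factory.factoryIdentifier().equals(identifier)) {
                continue;
            }
            hasSink |= factory instanceof DynamicTableSinkFactory;
            hasSource |= factory instanceof DynamicTableSourceFactory;
        }
        if (hasSink && !hasSource) {
            throw new ValidationException(
                "The connector '" + identifier + "' is only supported as a sink, "
                    + "it can't be used as a source.");
        }
    }

    private SinkOnlyCheck() {
    }
}
{code}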

 

> Throw better exception message for querying sink-only connector
> --
>
> Key: FLINK-18938
> URL: https://issues.apache.org/jira/browse/FLINK-18938
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jark Wu
>Priority: Major
>
> Currently, if we are querying a sink-only connector, for example {{SELECT * 
> FROM elasticsearch_sink}}, the following exception will be thrown:
> {code}
> Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
> factory for identifier 'elasticsearch-7' that implements 
> 'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
> Available factory identifiers are:
> datagen
> {code}
> The above exception is very misleading: it sounds like the elasticsearch 
> jar is not loaded, even though the elasticsearch jar is in the lib directory of 
> the Flink cluster. 
> We can improve the exception to explicitly tell users that the found 
> connector is only supported as a sink and can't be used as a source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] batas opened a new pull request #13208: [FLINK-18676] [FileSystem] Bump s3 aws version

2020-08-20 Thread GitBox


batas opened a new pull request #13208:
URL: https://github.com/apache/flink/pull/13208


   ## What is the purpose of the change
   
   This pull request bumps the AWS SDK to the minimal version that supports 
WebIdentityTokenCredentialsProvider in the s3-fs packages, as described in the 
issue.
   
   ## Brief change log
   
   - flink-s3-fs-base - bumped aws dependency
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (don't know)
 - The runtime per-record code paths (performance sensitive): (don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (don't know)
 - The S3 file system connector: (yes)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] Myasuka commented on a change in pull request #366: Add Apache Flink release 1.10.2

2020-08-20 Thread GitBox


Myasuka commented on a change in pull request #366:
URL: https://github.com/apache/flink-web/pull/366#discussion_r473940917



##
File path: _posts/2020-08-11-release-1.10.2.md
##
@@ -0,0 +1,199 @@
+---
+layout: post
+title:  "Apache Flink 1.10.2 Released"
+date:   2020-08-11 18:00:00
+categories: news
+authors:
+- Zhu Zhu:
+  name: "Zhu Zhu"
+  twitter: "zhuzhv"
+---
+
+The Apache Flink community released the second bugfix version of the Apache 
Flink 1.10 series.
+
+This release includes 73 fixes and minor improvements for Flink 1.10.1. The 
list below includes a detailed list of all fixes and improvements.
+
+We highly recommend all users to upgrade to Flink 1.10.2.
+
+Updated Maven dependencies:
+
+```xml
+
+  org.apache.flink
+  flink-java
+  1.10.2
+
+
+  org.apache.flink
+  flink-streaming-java_2.11
+  1.10.2
+
+
+  org.apache.flink
+  flink-clients_2.11
+  1.10.2
+
+```
+
+You can find the binaries on the updated [Downloads page]({{ site.baseurl 
}}/downloads.html).
+
+List of resolved issues:
+
+Sub-task
+
+
+[FLINK-15836] - 
Throw fatal error in KubernetesResourceManager when the pods watcher is 
closed with exception
+
+[FLINK-16160] - 
Schema#proctime and Schema#rowtime don't work in 
TableEnvironment#connect code path
+
+
+
+Bug
+
+
+[FLINK-13689] - 
Rest High Level Client for Elasticsearch6.x connector leaks threads if no 
connection could be established
+
+[FLINK-14369] - 

KafkaProducerAtLeastOnceITCaseKafkaProducerTestBase.testOneToOneAtLeastOnceCustomOperator
 fails on Travis
+
+[FLINK-14836] - 
Unable to set yarn container number for scala shell in yarn mode
+
+[FLINK-14894] - 
HybridOffHeapUnsafeMemorySegmentTest#testByteBufferWrap failed on Travis
+
+[FLINK-15758] - 
Investigate potential out-of-memory problems due to managed unsafe memory 
allocation
+
+[FLINK-15849] - 
Update SQL-CLIENT document from type to data-type
+
+[FLINK-16309] - 
ElasticSearch 7 connector is missing in SQL connector list
+
+[FLINK-16346] - 
BlobsCleanupITCase.testBlobServerCleanupCancelledJob fails on Travis
+
+[FLINK-16432] - 
Building Hive connector gives problems
+
+[FLINK-16451] - 
Fix IndexOutOfBoundsException for DISTINCT AGG with constants
+
+[FLINK-16510] - 
Task manager safeguard shutdown may not be reliable
+
+[FLINK-17092] - 
Pyflink test BlinkStreamDependencyTests is instable
+
+[FLINK-17322] - 
Enable latency tracker would corrupt the broadcast state
+
+[FLINK-17420] - 
Cannot alias Tuple and Row fields when converting DataStream to Table
+
+[FLINK-17466] - 
toRetractStream doesn't work correctly with Pojo conversion class
+
+[FLINK-17555] - 
docstring of pyflink.table.descriptors.FileSystem:1:duplicate object 
description of pyflink.table.descriptors.FileSystem
+
+[FLINK-17558] - 
Partitions are released in TaskExecutor Main Thread
+
+[FLINK-17562] - 
POST /jars/:jarid/plan is not working
+
+[FLINK-17578] - 
Union of 2 SideOutputs behaviour incorrect
+
+[FLINK-17639] - 
Document which FileSystems are supported by the StreamingFileSink
+
+[FLINK-17643] - 
LaunchCoordinatorTest fails
+
+[FLINK-17700] - 
The callback client of JavaGatewayServer should run in a daemon thread
+
+[FLINK-17744] - 
StreamContextEnvironment#execute cannot be call JobListener#onJobExecuted
+
+[FLINK-17763] - 
No log files when starting scala-shell
+
+[FLINK-17788] - 
scala shell in yarn mode is broken
+
+[FLINK-17800] - 
RocksDB optimizeForPointLookup results in missing time windows
+
+[FLINK-17801] - 
TaskExecutorTest.testHeartbeatTimeoutWithResourceManager timeout
+
+[FLINK-17809] - 
BashJavaUtil script logic does not work for paths with spaces
+
+[FLINK-17822] - 
Nightly Flink CLI end-to-end test failed with 
JavaGcCleanerWrapper$PendingCleanersRunner cannot access class 
jdk.internal.misc.SharedSecrets in Java 11 
+
+[FLINK-17870] - 
dependent jars are missing to be shipped to cluster in scala shell
+
+[FLINK-17891] - 
 FlinkYarnSessionCli sets wrong execution.target type
+
+[FLINK-17959] - 
Exception: CANCELLED: call already cancelled is thrown when run 
python udf
+
+[FLINK-18008] - 
HistoryServer does not log environment information on startup
+
+[FLINK-18012] - 
Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive is called
+
+[FLINK-18035] - 
Executors#newCachedThreadPool could not work as expected
+
+[FLINK-18045] - 
Fix Kerberos credentials checking to unblock Flink on secured MapR
+
+[FLINK-18048] - 
--host option could not take effect for standalone application 
cluster
+
+[FLINK-18097] - 
History server doesn't clean all job json files
+
+[FLINK-18168] - 
Error results when use UDAF with Object Array return type
+
+[FLINK-18223] -

[jira] [Updated] (FLINK-18800) Avro serialization schema doesn't support Kafka key/value serialization

2020-08-20 Thread Mohammad Hossein Gerami (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Hossein Gerami updated FLINK-18800:

Priority: Critical  (was: Major)

> Avro serialization schema doesn't support  Kafka key/value serialization
> 
>
> Key: FLINK-18800
> URL: https://issues.apache.org/jira/browse/FLINK-18800
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Formats (JSON, Avro, Parquet, ORC, 
> SequenceFile)
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Mohammad Hossein Gerami
>Priority: Critical
>
> {color:#ff8b00}AvroSerializationSchema{color} and 
> {color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't 
> support Kafka key/value serialization. I implemented a custom Avro 
> serialization schema to solve this problem. 
> Please reach consensus on implementing a new class that supports Kafka key/value 
> serialization.
> For example, Flink should implement a class like this:
> {code:java}
> public class KafkaAvroRegistrySchemaSerializationSchema extends 
> RegistryAvroSerializationSchema implements 
> KafkaSerializationSchema{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread ZhuShang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181153#comment-17181153
 ] 

ZhuShang commented on FLINK-18850:
--

Hi [~libenchao], can I take this if no one else is working on this issue?

My solution: when a late element is dropped, increment the 
numLateRecordsDropped accumulator, like this:
{code:java}
org.apache.flink.table.runtime.operators.window.WindowOperator

if (isElementDropped) {
   numLateRecordsDropped.inc();// ← add here
   // markEvent will increase numLateRecordsDropped
   lateRecordsDroppedRate.markEvent();
}
{code}
What do you think?

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18800) Avro serialization schema doesn't support Kafka key/value serialization

2020-08-20 Thread Mohammad Hossein Gerami (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Hossein Gerami updated FLINK-18800:

Issue Type: Task  (was: Improvement)

> Avro serialization schema doesn't support  Kafka key/value serialization
> 
>
> Key: FLINK-18800
> URL: https://issues.apache.org/jira/browse/FLINK-18800
> Project: Flink
>  Issue Type: Task
>  Components: Connectors / Kafka, Formats (JSON, Avro, Parquet, ORC, 
> SequenceFile)
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Mohammad Hossein Gerami
>Priority: Critical
>
> {color:#ff8b00}AvroSerializationSchema{color} and 
> {color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't 
> support Kafka key/value serialization. I implemented a custom Avro 
> serialization schema to solve this problem. 
> Please reach consensus on implementing a new class that supports Kafka key/value 
> serialization.
> For example, Flink should implement a class like this:
> {code:java}
> public class KafkaAvroRegistrySchemaSerializationSchema extends 
> RegistryAvroSerializationSchema implements 
> KafkaSerializationSchema{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18800) Avro serialization schema doesn't support Kafka key/value serialization

2020-08-20 Thread Mohammad Hossein Gerami (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Hossein Gerami updated FLINK-18800:

Description: 
{color:#ff8b00}AvroSerializationSchema{color} and 
{color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't support 
Kafka key/value serialization. I implemented a custom Avro serialization schema 
to solve this problem. 

Please reach consensus on implementing a new class that supports Kafka key/value 
serialization.

For example, Flink should implement a class like this:
{code:java}
public class KafkaAvroRegistrySchemaSerializationSchema extends 
RegistryAvroSerializationSchema implements 
KafkaSerializationSchema{code}

  was:
{color:#ff8b00}AvroSerializationSchema{color} and 
{color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't support 
Kafka key/value serialization. I implemented a custom Avro serialization schema 
to solve this problem. 

For example, Flink should implement a class like this.
{code:java}
public class KafkaAvroRegistrySchemaSerializationSchema extends 
RegistryAvroSerializationSchema implements 
KafkaSerializationSchema{code}


> Avro serialization schema doesn't support  Kafka key/value serialization
> 
>
> Key: FLINK-18800
> URL: https://issues.apache.org/jira/browse/FLINK-18800
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Formats (JSON, Avro, Parquet, ORC, 
> SequenceFile)
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Mohammad Hossein Gerami
>Priority: Major
>
> {color:#ff8b00}AvroSerializationSchema{color} and 
> {color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't 
> support Kafka key/value serialization. I implemented a custom Avro 
> serialization schema to solve this problem. 
> Please reach consensus on implementing a new class that supports Kafka key/value 
> serialization.
> For example, Flink should implement a class like this:
> {code:java}
> public class KafkaAvroRegistrySchemaSerializationSchema extends 
> RegistryAvroSerializationSchema implements 
> KafkaSerializationSchema{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18960) flink sideoutput union

2020-08-20 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-18960:

Fix Version/s: (was: 1.10.2)

> flink sideoutput union
> --
>
> Key: FLINK-18960
> URL: https://issues.apache.org/jira/browse/FLINK-18960
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.10.1
>Reporter: xiaohang.li
>Priority: Minor
>
> Flink side-output union does not seem to work correctly. If we union the side 
> outputs of the same operator, the output is the last side output repeated once 
> per united stream, which is not expected. For example,
> {code:java}
> val side = new OutputTag[String]("side")
> val side2 = new OutputTag[String]("side2")
> val side3 = new OutputTag[String]("side3")
> val ds = env.socketTextStream("master", 9001)
> val res = ds.process(new ProcessFunction[String, String] {
>   override def processElement(value: String, ctx: ProcessFunction[String, String]#Context, out: Collector[String]): Unit = {
>     if (value.contains("hello")) {
>       ctx.output(side, value)
>     } else if (value.contains("world")) {
>       ctx.output(side2, value)
>     } else if (value.contains("flink")) {
>       ctx.output(side3, value)
>     }
>     out.collect(value)
>   }
> })
>
> val res1 = res.getSideOutput(side)
> val res2 = res.getSideOutput(side2)
> val res3 = res.getSideOutput(side3)
>
> println(">" + res1.getClass)
> println(">" + res2.getClass)
>
> res1.print("res1")
> res2.print("res2")
> res3.print("res3")
>
> res2.union(res1).union(res3).print("all")
> {code}
>  
>  If we input 
> {code:java}
> hello
> world
> flink
> {code}
> The output will be:
> {code:java}
> res1> hello
> res2> world
> res3> flink
> all> flink
> all> flink
> all> flink
> {code}
>  
> But the expected output would be 
> {code:java}
> res1> hello
> res2> world
> res3> flink
> all> hello 
> all> world 
> all> flink
> {code}
>  
>  
> If we add a _map_ after each side output and then union them, the output is 
> correct, but adding the map should not be necessary. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] curcur commented on a change in pull request #13175: [FLINK-18955][Checkpointing] Add checkpoint path to job startup/restore message

2020-08-20 Thread GitBox


curcur commented on a change in pull request #13175:
URL: https://github.com/apache/flink/pull/13175#discussion_r473935762



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java
##
@@ -320,6 +320,12 @@ void setDiscardCallback(@Nullable 
CompletedCheckpointStats.DiscardCallback disca
 
@Override
public String toString() {
-   return String.format("Checkpoint %d @ %d for %s", checkpointID, 
timestamp, job);
+   return String.format(
+   "%s %d @ %d for %s located at %s",
+   props.getCheckpointType(),

Review comment:
   Please let me know what you think :-). I would be happy to change it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] curcur commented on a change in pull request #13175: [FLINK-18955][Checkpointing] Add checkpoint path to job startup/restore message

2020-08-20 Thread GitBox


curcur commented on a change in pull request #13175:
URL: https://github.com/apache/flink/pull/13175#discussion_r473929844



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java
##
@@ -320,6 +320,12 @@ void setDiscardCallback(@Nullable 
CompletedCheckpointStats.DiscardCallback disca
 
@Override
public String toString() {
-   return String.format("Checkpoint %d @ %d for %s", checkpointID, 
timestamp, job);
+   return String.format(
+   "%s %d @ %d for %s located at %s",
+   props.getCheckpointType(),

Review comment:
   I do not have a strong opinion on this either. But since you ask, let 
me have a try LOL :-)
   
   The only concern is that we "might" have more types in the future
   1. checkpoint or savepoint; 2. Synchronous or not; 3. Global or Individual; 
4. unaligned or not (I mean it could be);
   
   - It is a bit difficult to categorize the enum of CheckpointType to just 
"Checkpoint" and "Savepoint"
   - It is also difficult to tell what level of technicality a user needs. Or 
in other words, if we think SYNC_SAVEPOINT is not that understandable, we can 
improve its expressiveness.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread ZhuShang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuShang updated FLINK-18850:
-
Attachment: (was: image-2020-08-20-20-25-03-981.png)

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] curcur commented on a change in pull request #13175: [FLINK-18955][Checkpointing] Add checkpoint path to job startup/restore message

2020-08-20 Thread GitBox


curcur commented on a change in pull request #13175:
URL: https://github.com/apache/flink/pull/13175#discussion_r473929844



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java
##
@@ -320,6 +320,12 @@ void setDiscardCallback(@Nullable 
CompletedCheckpointStats.DiscardCallback disca
 
@Override
public String toString() {
-   return String.format("Checkpoint %d @ %d for %s", checkpointID, 
timestamp, job);
+   return String.format(
+   "%s %d @ %d for %s located at %s",
+   props.getCheckpointType(),

Review comment:
   I do not have a strong opinion on this either. But since you ask, let 
me have a try LOL :-)
   
   The only concern is that we "might" have more types in the future
   1. checkpoint or savepoint; 2. Synchronous or not; 3. Global or Individual; 
4. unaligned or not (I mean it could be);
   
   - It is a bit difficult to categorize the enum of CheckpointType to just 
"Checkpoint" and "Savepoint"
   - It is also difficult to tell what level of technicality a user needs. For 
a user that makes an SYNC_SAVEPOINT, he/she probably has enough knowledge to 
understand what an SYNC_SAVEPOINT is. Or in other words, if we think 
SYNC_SAVEPOINT is not that understandable, we can improve its expressiveness.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13207: [FLINK-18952][python][docs] Add 10 minutes to DataStream API documentation

2020-08-20 Thread GitBox


flinkbot commented on pull request #13207:
URL: https://github.com/apache/flink/pull/13207#issuecomment-677633881


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 371eaa2083054d66ef256a9a0ef7a90e116fee0d (Thu Aug 20 
12:26:13 UTC 2020)
   
✅no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Issue Comment Deleted] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread ZhuShang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuShang updated FLINK-18850:
-
Comment: was deleted

(was: Hi [~libenchao],

I found the accumulator in WindowOperator:
{code:java}
private static final String LATE_ELEMENTS_DROPPED_METRIC_NAME = 
"numLateRecordsDropped";
protected transient Counter numLateRecordsDropped;
{code}
The Flink UI also shows this metric, as below:

!image-2020-08-20-20-25-03-981.png!

 )

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread ZhuShang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181149#comment-17181149
 ] 

ZhuShang commented on FLINK-18850:
--

Hi [~libenchao],

I found the accumulator in WindowOperator:
{code:java}
private static final String LATE_ELEMENTS_DROPPED_METRIC_NAME = 
"numLateRecordsDropped";
protected transient Counter numLateRecordsDropped;
{code}
The Flink UI also shows this metric, as below:

!image-2020-08-20-20-25-03-981.png!

 

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
> Attachments: image-2020-08-20-20-25-03-981.png
>
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18850) Add late records dropped metric for row time over windows

2020-08-20 Thread ZhuShang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuShang updated FLINK-18850:
-
Attachment: image-2020-08-20-20-25-03-981.png

> Add late records dropped metric for row time over windows
> -
>
> Key: FLINK-18850
> URL: https://issues.apache.org/jira/browse/FLINK-18850
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Reporter: Benchao Li
>Priority: Major
>  Labels: Starter
> Attachments: image-2020-08-20-20-25-03-981.png
>
>
> Currently all the row time over windows in blink planner runtime discards 
> late records silently, it would be good to have a metric about the late 
> records dropping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19005) used metaspace grow on every execution

2020-08-20 Thread Jira
Guillermo Sánchez created FLINK-19005:
-

 Summary: used metaspace grow on every execution
 Key: FLINK-19005
 URL: https://issues.apache.org/jira/browse/FLINK-19005
 Project: Flink
  Issue Type: Bug
  Components: API / DataSet, Client / Job Submission
Affects Versions: 1.11.1
Reporter: Guillermo Sánchez


Hi!

I'm running a Flink 1.11.1 cluster where I execute batch jobs built with the 
DataSet API.

I submit these jobs every day to calculate daily data.

On every execution, the cluster's used metaspace increases by 7 MB and is never 
released.

This ends up with an OutOfMemoryError caused by Metaspace every 15 days, and I 
need to restart the cluster to clear the metaspace.

taskmanager.memory.jvm-metaspace.size is set to 512mb.

Any idea what could be causing this metaspace growth and why it is not 
released?
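
For reference, this is how I read the used-metaspace number from inside the JVM; a minimal probe using only the standard management beans (my own snippet, independent of Flink):
{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MetaspaceProbe {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                // prints the currently used metaspace in MB
                System.out.printf("Metaspace used: %d MB%n",
                        pool.getUsage().getUsed() / (1024 * 1024));
            }
        }
    }
}
{code}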

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] curcur commented on a change in pull request #13175: [FLINK-18955][Checkpointing] Add checkpoint path to job startup/restore message

2020-08-20 Thread GitBox


curcur commented on a change in pull request #13175:
URL: https://github.com/apache/flink/pull/13175#discussion_r473929844



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java
##
@@ -320,6 +320,12 @@ void setDiscardCallback(@Nullable 
CompletedCheckpointStats.DiscardCallback disca
 
@Override
public String toString() {
-   return String.format("Checkpoint %d @ %d for %s", checkpointID, 
timestamp, job);
+   return String.format(
+   "%s %d @ %d for %s located at %s",
+   props.getCheckpointType(),

Review comment:
   I do not have a strong opinion on this either. But since you ask, let 
me have a try LOL :-)
   
   The only concern is that we "might" have more types in the future
   1. checkpoint or savepoint; 2. Synchronous or not; 3. Global or Individual; 
4. unaligned or not (I mean it could be);
   
   - It is a bit difficult to categorize the enum of CheckpointType to just 
"Checkpoint" and "Savepoint"
   - It is also difficult to tell what level of technicality a user needs. For 
a user that makes an SYNC_SAVEPOINT, he/she probably has enough knowledge to 
understand what an SYNC_SAVEPOINT is. Or in other words, if we think 
SYNC_SAVEPOINT is not that understandable, we probably need to improve its 
expressiveness.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-18952) Add 10 minutes to DataStream API documentation

2020-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-18952:
---
Labels: pull-request-available  (was: )

> Add 10 minutes to DataStream API documentation
> --
>
> Key: FLINK-18952
> URL: https://issues.apache.org/jira/browse/FLINK-18952
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] hequn8128 opened a new pull request #13207: [FLINK-18952][python][docs] Add 10 minutes to DataStream API documentation

2020-08-20 Thread GitBox


hequn8128 opened a new pull request #13207:
URL: https://github.com/apache/flink/pull/13207


   ## What is the purpose of the change
   
   This pull request adds the "10 minutes to DataStream API" documentation. This 
page gives a short introduction to the Python DataStream API, geared mainly toward new users.
   
   
   ## Brief change log
   
 - Adds "10 minutes to DataStream API" documentation
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-18938) Throw better exception message for querying sink-only connector

2020-08-20 Thread liufangliang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181135#comment-17181135
 ] 

liufangliang edited comment on FLINK-18938 at 8/20/20, 12:22 PM:
-

Hi [~jark], can you assign it to me? 

The following message is already complete and precise; I think there is no need to 
change it:
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'elasticsearch-7' that implements 
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
{code}
A hint about the supported capability (sink and/or source) can follow each identifier. For example: 
{code:java}
Available factory identifiers are: 

datagen (source,sink)
elasticsearch-7 (sink-only)
test-connector (source-only)

{code}
[~jark]  [~lzljs3620320] What do you think of my solution?  


was (Author: liufangliang):
Hi [~jark], can you assign it to me? 

The following message is already complete and precise; I think there is no need to 
change it:
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'elasticsearch-7' that implements 
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
{code}
A hint about the supported usage (source or sink) can follow each identifier. For example: 
{code:java}
Available factory identifiers are: 

datagen (source,sink)
elasticsearch-7 (sink-only)
test-connector (source-only)

{code}
[~jark]  [~lzljs3620320] What do you think of my solution? 

> Throw better exception message for querying sink-only connector
> --
>
> Key: FLINK-18938
> URL: https://issues.apache.org/jira/browse/FLINK-18938
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jark Wu
>Priority: Major
>
> Currently, if we query a sink-only connector, for example {{SELECT * 
> FROM elasticsearch_sink}}, the following exception is thrown:
> {code}
> Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
> factory for identifier 'elasticsearch-7' that implements 
> 'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
> Available factory identifiers are:
> datagen
> {code}
> The above exception is very misleading: it sounds as if the elasticsearch 
> jar is not loaded, even though the elasticsearch jar is in the lib directory 
> of the Flink cluster. 
> We can improve the exception to explicitly tell users that the connector 
> found can only be used as a sink, not as a source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18938) Throw better exception message for querying sink-only connector

2020-08-20 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181145#comment-17181145
 ] 

Jark Wu commented on FLINK-18938:
-

The exception is exactly the same if we don't have the dependency jar in the 
classpath. So many people are misled into thinking it is caused by a missing 
dependency and do not notice the "DynamicTableSourceFactory" part. What I want 
to improve is to make the exception message explicit for this case. 
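
For illustration, a minimal sketch of what such an explicit check could look like (the helper method and its wiring are hypothetical, not the actual planner code; only the factory interfaces and the exception type are real Flink classes):
{code:java}
import java.util.List;

import org.apache.flink.table.api.ValidationException;
import org.apache.flink.table.factories.DynamicTableSinkFactory;
import org.apache.flink.table.factories.DynamicTableSourceFactory;
import org.apache.flink.table.factories.Factory;

class FactoryLookupSketch {
    // Hypothetical helper: before failing with the generic "not found" error,
    // check whether the identifier matches a sink-only factory and say so.
    static void checkSourceFactoryExists(String identifier, List<Factory> discovered) {
        boolean hasSource = discovered.stream().anyMatch(f ->
                f.factoryIdentifier().equals(identifier) && f instanceof DynamicTableSourceFactory);
        if (hasSource) {
            return;
        }
        boolean hasSink = discovered.stream().anyMatch(f ->
                f.factoryIdentifier().equals(identifier) && f instanceof DynamicTableSinkFactory);
        if (hasSink) {
            throw new ValidationException(
                    "Connector '" + identifier + "' can only be used as a sink, not as a source.");
        }
        throw new ValidationException(
                "Could not find any factory for identifier '" + identifier + "' in the classpath.");
    }
}
{code}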

> Throw better exception message for querying sink-only connector
> --
>
> Key: FLINK-18938
> URL: https://issues.apache.org/jira/browse/FLINK-18938
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jark Wu
>Priority: Major
>
> Currently, if we query a sink-only connector, for example {{SELECT * 
> FROM elasticsearch_sink}}, the following exception is thrown:
> {code}
> Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
> factory for identifier 'elasticsearch-7' that implements 
> 'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
> Available factory identifiers are:
> datagen
> {code}
> The above exception is very misleading: it sounds as if the elasticsearch 
> jar is not loaded, even though the elasticsearch jar is in the lib directory 
> of the Flink cluster. 
> We can improve the exception to explicitly tell users that the connector 
> found can only be used as a sink, not as a source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12880: [FLINK-18555][table sql/api] Make TableConfig options can be configur…

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #12880:
URL: https://github.com/apache/flink/pull/12880#issuecomment-657437173


   
   ## CI report:
   
   * 94e8eb39032665f77e897cb572cd4320b66635f9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5703)
 
   * 72e7dcb66f3a2c6eda4899d51ddd6420d2d5822b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5744)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12880: [FLINK-18555][table sql/api] Make TableConfig options can be configur…

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #12880:
URL: https://github.com/apache/flink/pull/12880#issuecomment-657437173


   
   ## CI report:
   
   * 94e8eb39032665f77e897cb572cd4320b66635f9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5703)
 
   * 72e7dcb66f3a2c6eda4899d51ddd6420d2d5822b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18938) Throw better exception message for querying sink-only connector

2020-08-20 Thread liufangliang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181135#comment-17181135
 ] 

liufangliang commented on FLINK-18938:
--

Hi [~jark], can you assign it to me?

The message below is already complete and precise; I don't think it needs to 
change.
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'elasticsearch-7' that implements 
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
{code}
A hint about the supported usage (source or sink) can follow each identifier. For example:
{code:java}
Available factory identifiers are: 

datagen (source,sink)
elasticsearch-7 (sink-only)
test-connector (source-only)

{code}
[~jark]  [~lzljs3620320] What do you think of my solution? 

> Throw better exception message for querying sink-only connector
> --
>
> Key: FLINK-18938
> URL: https://issues.apache.org/jira/browse/FLINK-18938
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jark Wu
>Priority: Major
>
> Currently, if we query a sink-only connector, for example {{SELECT * 
> FROM elasticsearch_sink}}, the following exception is thrown:
> {code}
> Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
> factory for identifier 'elasticsearch-7' that implements 
> 'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
> Available factory identifiers are:
> datagen
> {code}
> The above exception is very misleading: it sounds as if the elasticsearch 
> jar is not loaded, even though the elasticsearch jar is in the lib directory 
> of the Flink cluster. 
> We can improve the exception to explicitly tell users that the connector 
> found can only be used as a sink, not as a source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-18938) Throw better exception message for querying sink-only connector

2020-08-20 Thread liufangliang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181135#comment-17181135
 ] 

liufangliang edited comment on FLINK-18938 at 8/20/20, 11:49 AM:
-

Hi [~jark], can you assign it to me?

The message below is already complete and precise; I don't think it needs to 
change.
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'elasticsearch-7' that implements 
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
{code}
A hint about the supported usage (source or sink) can follow each identifier. For example: 
{code:java}
Available factory identifiers are: 

datagen (source,sink)
elasticsearch-7 (sink-only)
test-connector (source-only)

{code}
[~jark]  [~lzljs3620320] What do you think of my solution? 

was (Author: liufangliang):
Hi [~jark], can you assign it to me?

The message below is already complete and precise; I don't think it needs to 
change.
{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'elasticsearch-7' that implements 
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
{code}
A hint about the supported usage (source or sink) can follow each identifier. For example: 
{code:java}
Available factory identifiers are: 

datagen (source,sink)
elasticsearch-7 (sink-only)
test-connector (source-only)

{code}
[~jark]  [~lzljs3620320] What do you think of my solution? 

> Throw better exception message for querying sink-only connector
> --
>
> Key: FLINK-18938
> URL: https://issues.apache.org/jira/browse/FLINK-18938
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jark Wu
>Priority: Major
>
> Currently, if we query a sink-only connector, for example {{SELECT * 
> FROM elasticsearch_sink}}, the following exception is thrown:
> {code}
> Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
> factory for identifier 'elasticsearch-7' that implements 
> 'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath.
> Available factory identifiers are:
> datagen
> {code}
> The above exception is very misleading: it sounds as if the elasticsearch 
> jar is not loaded, even though the elasticsearch jar is in the lib directory 
> of the Flink cluster. 
> We can improve the exception to explicitly tell users that the connector 
> found can only be used as a sink, not as a source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wangzzu removed a comment on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


wangzzu removed a comment on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-677524943


   @flinkbot run azure



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] zhuzhurk commented on a change in pull request #366: Add Apache Flink release 1.10.2

2020-08-20 Thread GitBox


zhuzhurk commented on a change in pull request #366:
URL: https://github.com/apache/flink-web/pull/366#discussion_r473908601



##
File path: _posts/2020-08-11-release-1.10.2.md
##
@@ -0,0 +1,199 @@
+---
+layout: post
+title:  "Apache Flink 1.10.2 Released"
+date:   2020-08-11 18:00:00
+categories: news
+authors:
+- Zhu Zhu:
+  name: "Zhu Zhu"
+  twitter: "zhuzhv"
+---
+
+The Apache Flink community released the second bugfix version of the Apache 
Flink 1.10 series.
+
+This release includes 73 fixes and minor improvements for Flink 1.10.1. The 
list below details all fixes and improvements.
+
+We highly recommend that all users upgrade to Flink 1.10.2.
+
+Updated Maven dependencies:
+
+```xml
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-java</artifactId>
+  <version>1.10.2</version>
+</dependency>
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-streaming-java_2.11</artifactId>
+  <version>1.10.2</version>
+</dependency>
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-clients_2.11</artifactId>
+  <version>1.10.2</version>
+</dependency>
+```
+
+You can find the binaries on the updated [Downloads page]({{ site.baseurl 
}}/downloads.html).
+
+List of resolved issues:
+
+Sub-task
+
+
+[FLINK-15836] - 
Throw fatal error in KubernetesResourceManager when the pods watcher is 
closed with exception
+
+[FLINK-16160] - 
Schema#proctime and Schema#rowtime don't work in 
TableEnvironment#connect code path
+
+
+
+Bug
+
+
+[FLINK-13689] - 
Rest High Level Client for Elasticsearch6.x connector leaks threads if no 
connection could be established
+
+[FLINK-14369] - 

KafkaProducerAtLeastOnceITCaseKafkaProducerTestBase.testOneToOneAtLeastOnceCustomOperator
 fails on Travis
+
+[FLINK-14836] - 
Unable to set yarn container number for scala shell in yarn mode
+
+[FLINK-14894] - 
HybridOffHeapUnsafeMemorySegmentTest#testByteBufferWrap failed on Travis
+
+[FLINK-15758] - 
Investigate potential out-of-memory problems due to managed unsafe memory 
allocation
+
+[FLINK-15849] - 
Update SQL-CLIENT document from type to data-type
+
+[FLINK-16309] - 
ElasticSearch 7 connector is missing in SQL connector list
+
+[FLINK-16346] - 
BlobsCleanupITCase.testBlobServerCleanupCancelledJob fails on Travis
+
+[FLINK-16432] - 
Building Hive connector gives problems
+
+[FLINK-16451] - 
Fix IndexOutOfBoundsException for DISTINCT AGG with constants
+
+[FLINK-16510] - 
Task manager safeguard shutdown may not be reliable
+
+[FLINK-17092] - 
Pyflink test BlinkStreamDependencyTests is instable
+
+[FLINK-17322] - 
Enable latency tracker would corrupt the broadcast state
+
+[FLINK-17420] - 
Cannot alias Tuple and Row fields when converting DataStream to Table
+
+[FLINK-17466] - 
toRetractStream doesn't work correctly with Pojo conversion class
+
+[FLINK-17555] - 
docstring of pyflink.table.descriptors.FileSystem:1:duplicate object 
description of pyflink.table.descriptors.FileSystem
+
+[FLINK-17558] - 
Partitions are released in TaskExecutor Main Thread
+
+[FLINK-17562] - 
POST /jars/:jarid/plan is not working
+
+[FLINK-17578] - 
Union of 2 SideOutputs behaviour incorrect
+
+[FLINK-17639] - 
Document which FileSystems are supported by the StreamingFileSink
+
+[FLINK-17643] - 
LaunchCoordinatorTest fails
+
+[FLINK-17700] - 
The callback client of JavaGatewayServer should run in a daemon thread
+
+[FLINK-17744] - 
StreamContextEnvironment#execute cannot be call JobListener#onJobExecuted
+
+[FLINK-17763] - 
No log files when starting scala-shell
+
+[FLINK-17788] - 
scala shell in yarn mode is broken
+
+[FLINK-17800] - 
RocksDB optimizeForPointLookup results in missing time windows
+
+[FLINK-17801] - 
TaskExecutorTest.testHeartbeatTimeoutWithResourceManager timeout
+
+[FLINK-17809] - 
BashJavaUtil script logic does not work for paths with spaces
+
+[FLINK-17822] - 
Nightly Flink CLI end-to-end test failed with 
JavaGcCleanerWrapper$PendingCleanersRunner cannot access class 
jdk.internal.misc.SharedSecrets in Java 11 
+
+[FLINK-17870] - 
dependent jars are missing to be shipped to cluster in scala shell
+
+[FLINK-17891] - 
 FlinkYarnSessionCli sets wrong execution.target type
+
+[FLINK-17959] - 
Exception: CANCELLED: call already cancelled is thrown when run 
python udf
+
+[FLINK-18008] - 
HistoryServer does not log environment information on startup
+
+[FLINK-18012] - 
Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive is called
+
+[FLINK-18035] - 
Executors#newCachedThreadPool could not work as expected
+
+[FLINK-18045] - 
Fix Kerberos credentials checking to unblock Flink on secured MapR
+
+[FLINK-18048] - 
--host option could not take effect for standalone application 
cluster
+
+[FLINK-18097] - 
History server doesn't clean all job json files
+
+[FLINK-18168] - 
Error results when use UDAF with Object Array return type
+
+[FLINK-18223] -   

[GitHub] [flink-web] asfgit closed pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


asfgit closed pull request #370:
URL: https://github.com/apache/flink-web/pull/370


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] lirui-apache commented on a change in pull request #13157: [FLINK-18900][hive] HiveCatalog should error out when listing partitions with an invalid spec

2020-08-20 Thread GitBox


lirui-apache commented on a change in pull request #13157:
URL: https://github.com/apache/flink/pull/13157#discussion_r473905054



##
File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
##
@@ -777,8 +777,11 @@ public void dropPartition(ObjectPath tablePath, 
CatalogPartitionSpec partitionSp
// partition spec can be partial
List<String> partialVals = 
HiveReflectionUtils.getPvals(hiveShim, hiveTable.getPartitionKeys(),
partitionSpec.getPartitionSpec());
+   checkValidPartitionSpec(partitionSpec, 
getFieldNames(hiveTable.getPartitionKeys()), tablePath);
return 
client.listPartitionNames(tablePath.getDatabaseName(), 
tablePath.getObjectName(), partialVals,
(short) 
-1).stream().map(HiveCatalog::createPartitionSpec).collect(Collectors.toList());
+   } catch (PartitionSpecInvalidException e) {

Review comment:
   Why do we need to catch the exception and throw another one?

##
File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
##
@@ -777,8 +777,11 @@ public void dropPartition(ObjectPath tablePath, 
CatalogPartitionSpec partitionSp
// partition spec can be partial
List<String> partialVals = 
HiveReflectionUtils.getPvals(hiveShim, hiveTable.getPartitionKeys(),
partitionSpec.getPartitionSpec());
+   checkValidPartitionSpec(partitionSpec, 
getFieldNames(hiveTable.getPartitionKeys()), tablePath);

Review comment:
   This should be done before we call `HiveReflectionUtils.getPvals`
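
   For reference, a minimal sketch of the suggested ordering (simplified from the quoted diff; not the full `HiveCatalog` method):

```java
// Validate the (possibly partial) spec against the partition keys first,
// so an invalid spec fails fast before any reflection-based work runs.
checkValidPartitionSpec(partitionSpec, getFieldNames(hiveTable.getPartitionKeys()), tablePath);

// Only then resolve the partial partition values and list the partitions.
List<String> partialVals = HiveReflectionUtils.getPvals(
        hiveShim, hiveTable.getPartitionKeys(), partitionSpec.getPartitionSpec());

return client.listPartitionNames(
        tablePath.getDatabaseName(), tablePath.getObjectName(), partialVals, (short) -1)
        .stream()
        .map(HiveCatalog::createPartitionSpec)
        .collect(Collectors.toList());
```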





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-666148245


   
   ## CI report:
   
   * eb611d5b39f997fa3e986b5d163cb65a44b4b0ba UNKNOWN
   * a5db01bd0f63289039dfcdae41eaf099ed28a812 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5742)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5730)
 
   * 047cc5ec9e12641f2787ac6fa4a3525fd6fc7d70 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5743)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19004) Fail to call Hive percentile function together with distinct aggregate call

2020-08-20 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated FLINK-19004:
---
Description: 
The following test case would fail:
{code}
@Test
public void test() throws Exception {
TableEnvironment tableEnv = getTableEnvWithHiveCatalog();
tableEnv.unloadModule("core");
tableEnv.loadModule("hive", new HiveModule());
tableEnv.loadModule("core", CoreModule.INSTANCE);

tableEnv.executeSql("create table src(x int,y int)");
tableEnv.executeSql("select count(distinct 
y),`percentile`(y,`array`(0.5,0.99)) from src group by x").collect();
}
{code}

The error is:
{noformat}
org.apache.flink.table.api.TableException: Cannot generate a valid execution 
plan for the given query: 

FlinkLogicalLegacySink(name=[collect], fields=[EXPR$0, EXPR$1])
+- FlinkLogicalCalc(select=[EXPR$0, EXPR$1])
   +- FlinkLogicalAggregate(group=[{0}], EXPR$0=[COUNT($1) FILTER $3], 
EXPR$1=[MIN($2) FILTER $4])
  +- FlinkLogicalCalc(select=[x, y, EXPR$1, =(CASE(=($e, 0:BIGINT), 
0:BIGINT, 1:BIGINT), 0) AS $g_0, =(CASE(=($e, 0:BIGINT), 0:BIGINT, 1:BIGINT), 
1) AS $g_1])
 +- FlinkLogicalAggregate(group=[{0, 1, 3}], EXPR$1=[percentile($4, 
$2)])
+- FlinkLogicalExpand(projects=[x, y, $f2, $e, y_0])
   +- FlinkLogicalCalc(select=[x, y, array(0.5:DECIMAL(2, 1), 
0.99:DECIMAL(3, 2)) AS $f2])
  +- FlinkLogicalLegacyTableSourceScan(table=[[test-catalog, 
default, src, source: [HiveTableSource(x, y) TablePath: default.src, 
PartitionPruned: false, PartitionNums: null]]], fields=[x, y])

Min aggregate function does not support type: ''ARRAY''.
{noformat}
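
One possible workaround until this is fixed (a sketch only, not verified against the planner) is to compute the distinct count and the percentile in separate subqueries and join them on the group key, so that the distinct-aggregate rewrite never wraps the array-returning call in a {{MIN}}:
{code:java}
// Hypothetical workaround sketch, reusing the tableEnv from the test above.
tableEnv.executeSql(
    "select d.cnt, p.pct from " +
    "(select x, count(distinct y) as cnt from src group by x) d " +
    "join (select x, `percentile`(y,`array`(0.5,0.99)) as pct from src group by x) p " +
    "on d.x = p.x").collect();
{code}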

> Fail to call Hive percentile function together with distinct aggregate call
> ---
>
> Key: FLINK-19004
> URL: https://issues.apache.org/jira/browse/FLINK-19004
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive, Table SQL / Planner
>Reporter: Rui Li
>Priority: Major
>
> The following test case would fail:
> {code}
>   @Test
>   public void test() throws Exception {
>   TableEnvironment tableEnv = getTableEnvWithHiveCatalog();
>   tableEnv.unloadModule("core");
>   tableEnv.loadModule("hive", new HiveModule());
>   tableEnv.loadModule("core", CoreModule.INSTANCE);
>   tableEnv.executeSql("create table src(x int,y int)");
>   tableEnv.executeSql("select count(distinct 
> y),`percentile`(y,`array`(0.5,0.99)) from src group by x").collect();
>   }
> {code}
> The error is:
> {noformat}
> org.apache.flink.table.api.TableException: Cannot generate a valid execution 
> plan for the given query: 
> FlinkLogicalLegacySink(name=[collect], fields=[EXPR$0, EXPR$1])
> +- FlinkLogicalCalc(select=[EXPR$0, EXPR$1])
>+- FlinkLogicalAggregate(group=[{0}], EXPR$0=[COUNT($1) FILTER $3], 
> EXPR$1=[MIN($2) FILTER $4])
>   +- FlinkLogicalCalc(select=[x, y, EXPR$1, =(CASE(=($e, 0:BIGINT), 
> 0:BIGINT, 1:BIGINT), 0) AS $g_0, =(CASE(=($e, 0:BIGINT), 0:BIGINT, 1:BIGINT), 
> 1) AS $g_1])
>  +- FlinkLogicalAggregate(group=[{0, 1, 3}], EXPR$1=[percentile($4, 
> $2)])
> +- FlinkLogicalExpand(projects=[x, y, $f2, $e, y_0])
>+- FlinkLogicalCalc(select=[x, y, array(0.5:DECIMAL(2, 1), 
> 0.99:DECIMAL(3, 2)) AS $f2])
>   +- FlinkLogicalLegacyTableSourceScan(table=[[test-catalog, 
> default, src, source: [HiveTableSource(x, y) TablePath: default.src, 
> PartitionPruned: false, PartitionNums: null]]], fields=[x, y])
> Min aggregate function does not support type: ''ARRAY''.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19004) Fail to call Hive percentile function together with distinct aggregate call

2020-08-20 Thread Rui Li (Jira)
Rui Li created FLINK-19004:
--

 Summary: Fail to call Hive percentile function together with 
distinct aggregate call
 Key: FLINK-19004
 URL: https://issues.apache.org/jira/browse/FLINK-19004
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive, Table SQL / Planner
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-web] Myasuka commented on a change in pull request #366: Add Apache Flink release 1.10.2

2020-08-20 Thread GitBox


Myasuka commented on a change in pull request #366:
URL: https://github.com/apache/flink-web/pull/366#discussion_r473897004



##
File path: _posts/2020-08-11-release-1.10.2.md
##
@@ -0,0 +1,199 @@
+---
+layout: post
+title:  "Apache Flink 1.10.2 Released"
+date:   2020-08-11 18:00:00
+categories: news
+authors:
+- Zhu Zhu:
+  name: "Zhu Zhu"
+  twitter: "zhuzhv"
+---
+
+The Apache Flink community released the second bugfix version of the Apache 
Flink 1.10 series.
+
+This release includes 73 fixes and minor improvements for Flink 1.10.1. The 
list below details all fixes and improvements.
+
+We highly recommend that all users upgrade to Flink 1.10.2.
+
+Updated Maven dependencies:
+
+```xml
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-java</artifactId>
+  <version>1.10.2</version>
+</dependency>
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-streaming-java_2.11</artifactId>
+  <version>1.10.2</version>
+</dependency>
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-clients_2.11</artifactId>
+  <version>1.10.2</version>
+</dependency>
+```
+
+You can find the binaries on the updated [Downloads page]({{ site.baseurl 
}}/downloads.html).
+
+List of resolved issues:
+
+Sub-task
+
+
+[FLINK-15836] - 
Throw fatal error in KubernetesResourceManager when the pods watcher is 
closed with exception
+
+[FLINK-16160] - 
Schema#proctime and Schema#rowtime don't work in 
TableEnvironment#connect code path
+
+
+
+Bug
+
+
+[FLINK-13689] - 
Rest High Level Client for Elasticsearch6.x connector leaks threads if no 
connection could be established
+
+[FLINK-14369] - 

KafkaProducerAtLeastOnceITCaseKafkaProducerTestBase.testOneToOneAtLeastOnceCustomOperator
 fails on Travis
+
+[FLINK-14836] - 
Unable to set yarn container number for scala shell in yarn mode
+
+[FLINK-14894] - 
HybridOffHeapUnsafeMemorySegmentTest#testByteBufferWrap failed on Travis
+
+[FLINK-15758] - 
Investigate potential out-of-memory problems due to managed unsafe memory 
allocation
+
+[FLINK-15849] - 
Update SQL-CLIENT document from type to data-type
+
+[FLINK-16309] - 
ElasticSearch 7 connector is missing in SQL connector list
+
+[FLINK-16346] - 
BlobsCleanupITCase.testBlobServerCleanupCancelledJob fails on Travis
+
+[FLINK-16432] - 
Building Hive connector gives problems
+
+[FLINK-16451] - 
Fix IndexOutOfBoundsException for DISTINCT AGG with constants
+
+[FLINK-16510] - 
Task manager safeguard shutdown may not be reliable
+
+[FLINK-17092] - 
Pyflink test BlinkStreamDependencyTests is instable
+
+[FLINK-17322] - 
Enable latency tracker would corrupt the broadcast state
+
+[FLINK-17420] - 
Cannot alias Tuple and Row fields when converting DataStream to Table
+
+[FLINK-17466] - 
toRetractStream doesn't work correctly with Pojo conversion class
+
+[FLINK-17555] - 
docstring of pyflink.table.descriptors.FileSystem:1:duplicate object 
description of pyflink.table.descriptors.FileSystem
+
+[FLINK-17558] - 
Partitions are released in TaskExecutor Main Thread
+
+[FLINK-17562] - 
POST /jars/:jarid/plan is not working
+
+[FLINK-17578] - 
Union of 2 SideOutputs behaviour incorrect
+
+[FLINK-17639] - 
Document which FileSystems are supported by the StreamingFileSink
+
+[FLINK-17643] - 
LaunchCoordinatorTest fails
+
+[FLINK-17700] - 
The callback client of JavaGatewayServer should run in a daemon thread
+
+[FLINK-17744] - 
StreamContextEnvironment#execute cannot be call JobListener#onJobExecuted
+
+[FLINK-17763] - 
No log files when starting scala-shell
+
+[FLINK-17788] - 
scala shell in yarn mode is broken
+
+[FLINK-17800] - 
RocksDB optimizeForPointLookup results in missing time windows
+
+[FLINK-17801] - 
TaskExecutorTest.testHeartbeatTimeoutWithResourceManager timeout
+
+[FLINK-17809] - 
BashJavaUtil script logic does not work for paths with spaces
+
+[FLINK-17822] - 
Nightly Flink CLI end-to-end test failed with 
JavaGcCleanerWrapper$PendingCleanersRunner cannot access class 
jdk.internal.misc.SharedSecrets in Java 11 
+
+[FLINK-17870] - 
dependent jars are missing to be shipped to cluster in scala shell
+
+[FLINK-17891] - 
 FlinkYarnSessionCli sets wrong execution.target type
+
+[FLINK-17959] - 
Exception: CANCELLED: call already cancelled is thrown when run 
python udf
+
+[FLINK-18008] - 
HistoryServer does not log environment information on startup
+
+[FLINK-18012] - 
Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive is called
+
+[FLINK-18035] - 
Executors#newCachedThreadPool could not work as expected
+
+[FLINK-18045] - 
Fix Kerberos credentials checking to unblock Flink on secured MapR
+
+[FLINK-18048] - 
--host option could not take effect for standalone application 
cluster
+
+[FLINK-18097] - 
History server doesn't clean all job json files
+
+[FLINK-18168] - 
Error results when use UDAF with Object Array return type
+
+[FLINK-18223] -

[GitHub] [flink] flinkbot edited a comment on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-666148245


   
   ## CI report:
   
   * eb611d5b39f997fa3e986b5d163cb65a44b4b0ba UNKNOWN
   * a5db01bd0f63289039dfcdae41eaf099ed28a812 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5742)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5730)
 
   * 047cc5ec9e12641f2787ac6fa4a3525fd6fc7d70 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18959) Fail to archiveExecutionGraph because job is not finished when dispatcher close

2020-08-20 Thread ZhuShang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181104#comment-17181104
 ] 

ZhuShang commented on FLINK-18959:
--

I also found that archiveExecutionGraph cannot be reached while the job is 
running.

> Fail to archiveExecutionGraph because job is not finished when dispatcher 
> close
> ---
>
> Key: FLINK-18959
> URL: https://issues.apache.org/jira/browse/FLINK-18959
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0, 1.12.0, 1.11.1
>Reporter: Liu
>Priority: Critical
> Fix For: 1.12.0, 1.11.2, 1.10.3
>
> Attachments: flink-debug-log
>
>
> When a job is cancelled, we expect to see it in Flink's history server. But I 
> cannot see my job after it is cancelled.
> After digging into the problem, I find that the function 
> archiveExecutionGraph is not executed. Below is the brief log:
> {panel:title=log}
> 2020-08-14 15:10:06,406 INFO 
> org.apache.flink.runtime.executiongraph.ExecutionGraph 
> [flink-akka.actor.default-dispatcher- 15] - Job EtlAndWindow 
> (6f784d4cc5bae88a332d254b21660372) switched from state RUNNING to CANCELLING.
> 2020-08-14 15:10:06,415 DEBUG 
> org.apache.flink.runtime.dispatcher.MiniDispatcher 
> [flink-akka.actor.default-dispatcher-3] - Shutting down per-job cluster 
> because the job was canceled.
> 2020-08-14 15:10:06,629 INFO 
> org.apache.flink.runtime.dispatcher.MiniDispatcher 
> [flink-akka.actor.default-dispatcher-3] - Stopping dispatcher 
> akka.tcp://flink@bjfk-c9865.yz02:38663/user/dispatcher.
> 2020-08-14 15:10:06,629 INFO 
> org.apache.flink.runtime.dispatcher.MiniDispatcher 
> [flink-akka.actor.default-dispatcher-3] - Stopping all currently running jobs 
> of dispatcher akka.tcp://flink@bjfk-c9865.yz02:38663/user/dispatcher.
> 2020-08-14 15:10:06,631 INFO org.apache.flink.runtime.jobmaster.JobMaster 
> [flink-akka.actor.default-dispatcher-29] - Stopping the JobMaster for job 
> EtlAndWindow(6f784d4cc5bae88a332d254b21660372).
> 2020-08-14 15:10:06,632 DEBUG org.apache.flink.runtime.jobmaster.JobMaster 
> [flink-akka.actor.default-dispatcher-29] - Disconnect TaskExecutor 
> container_e144_1590060720089_2161_01_06 because: Stopping JobMaster for 
> job EtlAndWindow(6f784d4cc5bae88a332d254b21660372).
> 2020-08-14 15:10:06,646 INFO 
> org.apache.flink.runtime.executiongraph.ExecutionGraph 
> [flink-akka.actor.default-dispatcher-29] - Job EtlAndWindow 
> (6f784d4cc5bae88a332d254b21660372) switched from state CANCELLING to CANCELED.
> 2020-08-14 15:10:06,664 DEBUG 
> org.apache.flink.runtime.dispatcher.MiniDispatcher 
> [flink-akka.actor.default-dispatcher-4] - There is a newer JobManagerRunner 
> for the job 6f784d4cc5bae88a332d254b21660372.
> {panel}
> From the log, we can see that the job is not finished when the dispatcher 
> closes. The process is as follows:
>  * Receive the cancel command and send it to all tasks asynchronously.
>  * In MiniDispatcher, begin shutting down the per-job cluster.
>  * Stop the dispatcher and remove the job.
>  * The job is cancelled and the callback is executed in the method 
> startJobManagerRunner.
>  * Because the job was removed earlier, currentJobManagerRunner is null and 
> does not equal the original jobManagerRunner. In this case, the 
> archivedExecutionGraph will not be uploaded.
> In normal cases, the job is cancelled first and then the dispatcher is 
> stopped, so archiving the execution graph succeeds. But the order is not 
> constrained, and it is hard to know which comes first. 
> The above is what I suspect; if it is correct, we should fix it.
>  
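
A minimal, self-contained sketch of the suspected race (hypothetical names; not the actual Dispatcher code):
{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// If remove() runs before the job's termination callback, the identity check
// in the callback fails and the execution graph is never archived.
public class ArchiveRaceSketch {
    private final ConcurrentHashMap<String, Object> runners = new ConcurrentHashMap<>();

    void startRunner(String jobId, Object runner, CompletableFuture<Void> jobTerminated) {
        runners.put(jobId, runner);
        jobTerminated.whenComplete((ignored, failure) -> {
            // Mirrors the "currentJobManagerRunner equals original" check.
            if (runners.get(jobId) == runner) {
                System.out.println("archiving execution graph for " + jobId);
            } else {
                System.out.println("runner already removed; graph NOT archived");
            }
        });
    }

    void stopDispatcher(String jobId) {
        runners.remove(jobId); // dispatcher shutdown removes the job first
    }

    public static void main(String[] args) {
        ArchiveRaceSketch dispatcher = new ArchiveRaceSketch();
        CompletableFuture<Void> terminated = new CompletableFuture<>();
        dispatcher.startRunner("job-1", new Object(), terminated);
        dispatcher.stopDispatcher("job-1"); // removal happens before...
        terminated.complete(null);          // ...the termination callback fires
    }
}
{code}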



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] RocMarshal commented on pull request #13172: [FLINK-18854][docs-zh] Translate the 'API Migration Guides' page of 'Application Development' into Chinese

2020-08-20 Thread GitBox


RocMarshal commented on pull request #13172:
URL: https://github.com/apache/flink/pull/13172#issuecomment-677536186


   Hi, @XBaith @klion26 .
   Could you help me to review this PR if you have free time?
   Thank you.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] RocMarshal removed a comment on pull request #13172: [FLINK-18854][docs-zh] Translate the 'API Migration Guides' page of 'Application Development' into Chinese

2020-08-20 Thread GitBox


RocMarshal removed a comment on pull request #13172:
URL: https://github.com/apache/flink/pull/13172#issuecomment-675233767


   Hi, @XBaith @klion26 .
   Could you help me to review this PR if you have free time?
   Thank you.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-666148245


   
   ## CI report:
   
   * eb611d5b39f997fa3e986b5d163cb65a44b4b0ba UNKNOWN
   * a5db01bd0f63289039dfcdae41eaf099ed28a812 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5730)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5742)
 
   * 047cc5ec9e12641f2787ac6fa4a3525fd6fc7d70 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wangzzu commented on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


wangzzu commented on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-677524943


   @flinkbot run azure



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] rmetzger commented on a change in pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


rmetzger commented on a change in pull request #370:
URL: https://github.com/apache/flink-web/pull/370#discussion_r473867137



##
File path: _posts/2020-08-20-flink-docker.md
##
@@ -0,0 +1,90 @@
+---
+layout: post
+title: "The State of Flink on Docker"
+date: 2020-08-08T00:00:00.000Z
+authors:
+- rmetzger:
+  name: "Robert Metzger"
+  twitter: rmetzger_
+categories: news
+
+excerpt: This blog post gives an update on the recent developments of Flink's 
support for Docker.
+---
+
+The Flink community recently put some effort into upgrading the Docker 
experience for our users. The goal was to reduce confusion and improve 
usability. With over 50 million downloads from Docker Hub, the Flink docker 
images are a very popular deployment option.

Review comment:
   okay, I'll use "improve"





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-12351) AsyncWaitOperator should deep copy StreamElement when object reuse is enabled

2020-08-20 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181094#comment-17181094
 ] 

Piotr Nowojski commented on FLINK-12351:


Yes, that's true. I was more worried about relying on the assumption that the 
network stack is not reusing the records in any way. But maybe this is not that 
big of an issue and could be guarded by a unit test for `AsyncWaitOperator`.
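
For reference, a minimal sketch of the deep-copy approach (assuming the operator keeps the input {{TypeSerializer}} in a field named {{inputSerializer}}; this is not the actual AsyncWaitOperator code):
{code:java}
// Deep-copy the input before queueing it when object reuse is enabled, so
// later mutation of the reused instance cannot corrupt the queued element.
private StreamRecord<IN> copyIfNeeded(StreamRecord<IN> element) {
    if (getExecutionConfig().isObjectReuseEnabled()) {
        return element.copy(inputSerializer.copy(element.getValue()));
    }
    return element;
}
{code}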

> AsyncWaitOperator should deep copy StreamElement when object reuse is enabled
> -
>
> Key: FLINK-12351
> URL: https://issues.apache.org/jira/browse/FLINK-12351
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, AsyncWaitOperator directly puts the input StreamElement into 
> {{StreamElementQueue}}. But when object reuse is enabled, the StreamElement 
> is reused, which means the element in {{StreamElementQueue}} will be 
> modified. As a result, the output of AsyncWaitOperator might be wrong.
> An easy way to fix this might be to deep-copy the input StreamElement when 
> object reuse is enabled, like this: 
> https://github.com/apache/flink/blob/blink/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/async/AsyncWaitOperator.java#L209



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-web] rmetzger commented on pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


rmetzger commented on pull request #370:
URL: https://github.com/apache/flink-web/pull/370#issuecomment-677515170


   Thanks a lot for your reviews! I pushed a new version. Unless there's any 
more feedback, I'll merge it later today.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-benchmarks] pnowojski commented on a change in pull request #2: [FLINK-19003][checkpointing] Add micro-benchmark for unaligned checkpoints

2020-08-20 Thread GitBox


pnowojski commented on a change in pull request #2:
URL: https://github.com/apache/flink-benchmarks/pull/2#discussion_r473849350



##
File path: 
src/main/java/org/apache/flink/benchmark/UnalignedCheckpointBenchmark.java
##
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.benchmark;
+
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.configuration.ConfigConstants;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.TaskManagerOptions;
+import org.apache.flink.runtime.state.CheckpointListener;
+import org.apache.flink.streaming.api.datastream.DataStreamSource;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.OperationsPerInvocation;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.runner.Runner;
+import org.openjdk.jmh.runner.RunnerException;
+import org.openjdk.jmh.runner.options.Options;
+import org.openjdk.jmh.runner.options.OptionsBuilder;
+import org.openjdk.jmh.runner.options.VerboseMode;
+
+import java.io.IOException;
+
+@OperationsPerInvocation(value = 
UnalignedCheckpointBenchmark.RECORDS_PER_INVOCATION)

Review comment:
   This is not correct, you are not using `RECORDS_PER_INVOCATION` to 
control the number of records per invocation.
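
   For comparison, a sketch of how the constant could be tied to the workload (assuming a bounded source such as the repo's `LongSource`; adapt to whatever source this benchmark actually uses):
```java
// The source should emit exactly RECORDS_PER_INVOCATION records, so that
// @OperationsPerInvocation matches the work measured per invocation.
DataStreamSource<Long> source = env.addSource(new LongSource(RECORDS_PER_INVOCATION));
```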

##
File path: 
src/main/java/org/apache/flink/benchmark/UnalignedCheckpointBenchmark.java
##
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.benchmark;
+
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.configuration.ConfigConstants;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.TaskManagerOptions;
+import org.apache.flink.runtime.state.CheckpointListener;
+import org.apache.flink.streaming.api.datastream.DataStreamSource;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.OperationsPerInvocation;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.runner.Runner;
+import org.openjdk.jmh.runner.RunnerException;
+import org.openjdk.jmh.runner.options.Options;
+import org.openjdk.jmh.runner.options.OptionsBuilder;
+import org.openjdk.jmh.runner.options.VerboseMode;
+
+import java.io.IOException;
+
+@OperationsPerInvocation(value = 
UnalignedCheckpointBenchmark.RECORDS_PER_INVOCATION)
+public class UnalignedCheckpointBenchmark extends BenchmarkBase {
+public static final int RECORDS_PER_INVOCATION = 10_000_000;
+private static final int NUM_VERTICES = 3;
+private static final int PARALLELISM = 4;
+private static final long CHECKPOINT_INTERVAL_MS = 100;
+
+public static void main(String[] args) throws RunnerException {
+Options options = new OptionsBuilder()
+.verbosity(VerboseMode.NORMAL)
+.include(UnalignedCheckpointBenchmark.class.getCanonicalName())
+

[GitHub] [flink] flinkbot edited a comment on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-666148245


   
   ## CI report:
   
   * eb611d5b39f997fa3e986b5d163cb65a44b4b0ba UNKNOWN
   * a5db01bd0f63289039dfcdae41eaf099ed28a812 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5730)
 
   * 047cc5ec9e12641f2787ac6fa4a3525fd6fc7d70 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] rmetzger commented on a change in pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


rmetzger commented on a change in pull request #370:
URL: https://github.com/apache/flink-web/pull/370#discussion_r473850331



##
File path: _posts/2020-08-20-flink-docker.md
##
@@ -0,0 +1,90 @@
+---
+layout: post
+title: "The State of Flink on Docker"
+date: 2020-08-08T00:00:00.000Z
+authors:
+- rmetzger:
+  name: "Robert Metzger"
+  twitter: rmetzger_
+categories: news
+
+excerpt: This blog post gives an update on the recent developments of Flink's 
support for Docker.
+---
+
+The Flink community recently put some effort into upgrading the Docker 
experience for our users. The goal was to reduce confusion and improve 
usability. With over 50 million downloads from Docker Hub, the Flink docker 
images are a very popular deployment option.

Review comment:
   ... elevating the Docker experience to a new level ... ?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] rmetzger commented on a change in pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


rmetzger commented on a change in pull request #370:
URL: https://github.com/apache/flink-web/pull/370#discussion_r473849162



##
File path: _posts/2020-08-20-flink-docker.md
##
@@ -0,0 +1,90 @@
+---
+layout: post
+title: "The State of Flink on Docker"
+date: 2020-08-08T00:00:00.000Z
+authors:
+- rmetzger:
+  name: "Robert Metzger"
+  twitter: rmetzger_
+categories: news
+
+excerpt: This blog post gives an update on the recent developments of Flink's 
support for Docker.
+---
+
+The Flink community recently put some effort into upgrading the Docker 
experience for our users. The goal was to reduce confusion and improve 
usability. With over 50 million downloads from Docker Hub, the Flink docker 
images are a very popular deployment option.

Review comment:
   Seth didn't correct the word in his review





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13206: [FLINK-18948][python] Add end to end test for Python DataStream API.

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13206:
URL: https://github.com/apache/flink/pull/13206#issuecomment-677500353


   
   ## CI report:
   
   * bea57ee386d4ffdd264e3cc0116a653454103c42 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5741)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13204: [FLINK-16080][docs-zh]Translate \docs\dev\table\hive\hive_streaming.zh.md into Chinese

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13204:
URL: https://github.com/apache/flink/pull/13204#issuecomment-677254693


   
   ## CI report:
   
   * f3ff67413489210527790afa12a9f50b63ced883 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5737)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] rmetzger commented on a change in pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


rmetzger commented on a change in pull request #370:
URL: https://github.com/apache/flink-web/pull/370#discussion_r473846656



##
File path: _posts/2020-08-20-flink-docker.md
##
@@ -0,0 +1,90 @@
+---
+layout: post
+title: "The State of Flink on Docker"
+date: 2020-08-08T00:00:00.000Z
+authors:
+- rmetzger:
+  name: "Robert Metzger"
+  twitter: rmetzger_
+categories: news
+
+excerpt: This blog post gives an update on the recent developments of Flink's 
support for Docker.
+---
+
+The Flink community recently put some effort into upgrading the Docker 
experience for our users. The goal was to reduce confusion and improve 
usability. With over 50 million downloads from Docker Hub, the Flink docker 
images are a very popular deployment option.

Review comment:
   Damn, me trying to sound eloquent often backfires :) 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-12351) AsyncWaitOperator should deep copy StreamElement when object reuse is enabled

2020-08-20 Thread Wenlong Lyu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181081#comment-17181081
 ] 

Wenlong Lyu commented on FLINK-12351:
-

[~pnowojski] As far as I know, it is easy to tell whether the operator is the 
head of a chain via StreamConfig#isChainStart.
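
A sketch of how that check could gate the copying (hypothetical placement inside the operator; not actual Flink code):
{code:java}
// Limit deep-copying to where it can matter: object reuse is enabled and this
// operator is the chain head, receiving records straight from the network stack.
private boolean needsDeepCopy() {
    return getExecutionConfig().isObjectReuseEnabled()
            && getOperatorConfig().isChainStart();
}
{code}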

> AsyncWaitOperator should deep copy StreamElement when object reuse is enabled
> -
>
> Key: FLINK-12351
> URL: https://issues.apache.org/jira/browse/FLINK-12351
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, AsyncWaitOperator directly puts the input StreamElement into 
> {{StreamElementQueue}}. But when object reuse is enabled, the StreamElement 
> is reused, which means the element in {{StreamElementQueue}} will be 
> modified. As a result, the output of AsyncWaitOperator might be wrong.
> An easy way to fix this might be to deep-copy the input StreamElement when 
> object reuse is enabled, like this: 
> https://github.com/apache/flink/blob/blink/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/async/AsyncWaitOperator.java#L209



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #13206: [FLINK-18948][python] Add end to end test for Python DataStream API.

2020-08-20 Thread GitBox


flinkbot commented on pull request #13206:
URL: https://github.com/apache/flink/pull/13206#issuecomment-677500353


   
   ## CI report:
   
   * bea57ee386d4ffdd264e3cc0116a653454103c42 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] RocMarshal commented on pull request #13089: [FLINK-18813][docs-zh] Translate the 'Local Installation' page of 'Try Flink' into Chinese

2020-08-20 Thread GitBox


RocMarshal commented on pull request #13089:
URL: https://github.com/apache/flink/pull/13089#issuecomment-677495556


   ping @xccui 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kl0u commented on pull request #6613: [FLINK-9940] [API/DataStream][File source] out-of-order files were missed in continuous monitoring

2020-08-20 Thread GitBox


kl0u commented on pull request #6613:
URL: https://github.com/apache/flink/pull/6613#issuecomment-677494490


   Sorry @lvhuyen for closing this, I am reopening it because I just noticed 
there is discussion on the related JIRA.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] lvhuyen opened a new pull request #6613: [FLINK-9940] [API/DataStream][File source] out-of-order files were missed in continuous monitoring

2020-08-20 Thread GitBox


lvhuyen opened a new pull request #6613:
URL: https://github.com/apache/flink/pull/6613


   [FLINK-9940] Fix - File-source continuous monitoring mode - out-of-order 
files were missed
   
   ## Fix the issue with ContinuousFileMonitoringFunction - out-of-order files 
were missed in continuous directory scanning mode.
   
   - _Cause_: In the existing directory monitoring mechanism, Flink maintains 
the maximum last-modified timestamp of all identified files 
(_globalModificationTime_), so that in the next scan all files with a 
last-modified timestamp equal to or earlier than that _globalModificationTime_ 
are ignored.
   
   
   - _Fix_: This fix adds a new parameter, readConsistencyOffset, when creating a 
ContinuousFileMonitoringFunction. Every scan now starts from the maximum 
last-modified timestamp minus this offset. A new collection of processedFiles is 
also maintained, consisting of all known files whose modTimestamp falls within 
that offset window (see the sketch after this list).
   - For testing this fix, a change to flink-fs-tests has also been made: the 
collection of seenFiles is changed from a TreeSet to a SortedList, to verify 
exactly-once file scanning instead of at-least-once.
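
   A minimal sketch of the offset-based filtering described above (class and
field names are illustrative assumptions, not the actual implementation):

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the readConsistencyOffset-based scan filter. */
class ScanFilterSketch {
    private long globalModificationTime = Long.MIN_VALUE;
    private final long readConsistencyOffset;
    // Files already processed whose modification timestamp is inside the offset window.
    private final Set<String> processedFiles = new HashSet<>();

    ScanFilterSketch(long readConsistencyOffset) {
        this.readConsistencyOffset = readConsistencyOffset;
    }

    /** Decides whether a file seen during a directory scan should be processed. */
    boolean shouldProcess(String path, long modTime) {
        if (modTime <= globalModificationTime - readConsistencyOffset) {
            return false; // older than the consistency window: cannot be new
        }
        if (!processedFiles.add(path)) {
            return false; // already processed inside the window
        }
        globalModificationTime = Math.max(globalModificationTime, modTime);
        // A real implementation would also prune processedFiles entries that
        // fall out of the offset window to keep the set bounded.
        return true;
    }
}
```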
   
   ## Verifying this change
   This change is already covered by existing tests with slight updates:
   - ContinuousFileProcessingMigrationTest.testMonitoringSourceRestore
   - ContinuousFileProcessingTest.{testFunctionRestore, testProcessContinuously}
   This change also added a test: 
   
   - ContinuousFileProcessingTest.testProcessContinuouslyWithNoteTooLateFile
   
   ## Does this pull request potentially affect one of the following parts:
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): yes 
(per-file). This is expected to have minimal impact.
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? JavaDocs
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zhuxiaoshang commented on pull request #13162: [FLINK-18685][API / DataStream]JobClient.getAccumulators() blocks until streaming job has finished in local environment

2020-08-20 Thread GitBox


zhuxiaoshang commented on pull request #13162:
URL: https://github.com/apache/flink/pull/13162#issuecomment-677492835


   @rmetzger Sorry for the failure. What I intended was to make getAccumulators() 
asynchronously invokable, but the actual behavior is not what I expected.
   When I debug the code, the CompletableFuture returned by 
'miniCluster.getExecutionGraph(jobID)' is 'not completed', so the accumulators 
are empty.
   Could you give me some suggestions? Thanks a lot.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13206: [FLINK-18948][python] Add end to end test for Python DataStream API.

2020-08-20 Thread GitBox


flinkbot commented on pull request #13206:
URL: https://github.com/apache/flink/pull/13206#issuecomment-677491849


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit bea57ee386d4ffdd264e3cc0116a653454103c42 (Thu Aug 20 
09:37:21 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kl0u closed pull request #6613: [FLINK-9940] [API/DataStream][File source] out-of-order files were missed in continuous monitoring

2020-08-20 Thread GitBox


kl0u closed pull request #6613:
URL: https://github.com/apache/flink/pull/6613


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kl0u commented on pull request #6613: [FLINK-9940] [API/DataStream][File source] out-of-order files were missed in continuous monitoring

2020-08-20 Thread GitBox


kl0u commented on pull request #6613:
URL: https://github.com/apache/flink/pull/6613#issuecomment-677491493


   I'm closing this as "Abandoned", since there is no more activity.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-18948) Add end to end test for Python DataStream API

2020-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-18948:
---
Labels: pull-request-available  (was: )

> Add end to end test for Python DataStream API
> -
>
> Key: FLINK-18948
> URL: https://issues.apache.org/jira/browse/FLINK-18948
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Hequn Cheng
>Assignee: Shuiqiang Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] shuiqiangchen opened a new pull request #13206: [FLINK-18948][python] Add end to end test for Python DataStream API.

2020-08-20 Thread GitBox


shuiqiangchen opened a new pull request #13206:
URL: https://github.com/apache/flink/pull/13206


   
   
   ## What is the purpose of the change
   
   Add end to end test for Python DataStream API.
   
   
   ## Brief change log
   
   Added a simple Python DataStream job that consumes data from a specific Kafka 
topic, applies a series of transformations, and then writes the result data 
back to Kafka.
   
   
   ## Verifying this change
   
   This pull request adds an end-to-end test; it is started by running the 
test_python_datastream.sh script.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): ( no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: ( no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? ( not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-18962) Improve error message if checkpoint directory is not writable

2020-08-20 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-18962.
--
Fix Version/s: 1.12.0
   Resolution: Fixed

Merged to master as f8ce30a50b^^..f8ce30a50b.

Thanks for submitting the idea [~NicoK] and addressing the issue 
[~roman_khachatryan] :)

> Improve error message if checkpoint directory is not writable
> -
>
> Key: FLINK-18962
> URL: https://issues.apache.org/jira/browse/FLINK-18962
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.1
>Reporter: Nico Kruber
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.12.0
>
>
> If the checkpoint directory from {{state.checkpoints.dir}} is not writable by 
> the user that Flink runs as, checkpoints will be declined, but the 
> real cause is not mentioned anywhere:
> * the Web UI says: "Cause: The job has failed" (the Flink job is running 
> though)
> * the JM log says:
> {code}
> 2020-08-14 12:13:18,820 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Triggering 
> checkpoint 2 (type=CHECKPOINT) @ 159738819 for job 
> 2c567b14e8d0833404931ef47dfec266.
> 2020-08-14 12:13:18,921 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Decline 
> checkpoint 2 by task 0d4fd75374ad16c8d963679e3c2171ec of job 
> 2c567b14e8d0833404931ef47dfec266 at a184deea621e3923fbfcb1d899348448 @ 
> Nico-PC.lan (dataPort=35531).
> {code}
> * the TM log says:
> {code}
> 2020-08-14 12:13:14,102 INFO  
> org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl [] 
> - Checkpoint 1 has been notified as aborted, would not trigger any checkpoint.
> {code}
> And that's it. It should have a real error message indicating that the 
> checkpoint (sub)-directory could not be created.
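
A minimal sketch of such an upfront writability check (a hypothetical helper
based on java.nio; the actual fix in Flink may differ):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: fail fast with a descriptive error message. */
final class CheckpointDirCheck {
    static void ensureWritable(Path checkpointDir) throws IOException {
        try {
            Files.createDirectories(checkpointDir);
            // Probe write access by creating and deleting a temporary file.
            Path probe = Files.createTempFile(checkpointDir, ".probe", null);
            Files.delete(probe);
        } catch (IOException e) {
            throw new IOException("Checkpoint directory '" + checkpointDir
                    + "' is not writable by the user running Flink, so"
                    + " checkpoints would be declined.", e);
        }
    }
}
{code}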



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] pnowojski merged pull request #13180: [FLINK-18962][checkpointing] Improve logging when checkpoint declined

2020-08-20 Thread GitBox


pnowojski merged pull request #13180:
URL: https://github.com/apache/flink/pull/13180


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] pnowojski commented on a change in pull request #13180: [FLINK-18962][checkpointing] Improve logging when checkpoint declined

2020-08-20 Thread GitBox


pnowojski commented on a change in pull request #13180:
URL: https://github.com/apache/flink/pull/13180#discussion_r473803693



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/AsyncCheckpointRunnable.java
##
@@ -129,12 +129,10 @@ public void run() {
checkpointMetaData.getCheckpointId());
}
} catch (Exception e) {
-   if (LOG.isDebugEnabled()) {
-   LOG.debug("{} - asynchronous part of checkpoint 
{} could not be completed.",
-   taskName,
-   checkpointMetaData.getCheckpointId(),
-   e);
-   }
+   LOG.info("{} - asynchronous part of checkpoint {} could 
not be completed.",

Review comment:
   > Honestly, I have not had this happen to me; I’m not against this change here.
   
   Ok, in that case let's not overthink it and let's try this out :) Thanks for 
your inputs @NicoK and @klion26 .





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13175: [FLINK-18955][Checkpointing] Add checkpoint path to job startup/restore message

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13175:
URL: https://github.com/apache/flink/pull/13175#issuecomment-674839926


   
   ## CI report:
   
   * 635f839466122e36674a38a9845c2d6b5eb5c244 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5735)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13188: [hotfix][docs] Translate building.md to building.zh.md

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13188:
URL: https://github.com/apache/flink/pull/13188#issuecomment-675323573


   
   ## CI report:
   
   * 8bbc06f7a8931c9e02130f668805a82ccc00a9e8 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5738)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] azagrebin commented on a change in pull request #370: Add a blog post about the current state of Flink on Docker

2020-08-20 Thread GitBox


azagrebin commented on a change in pull request #370:
URL: https://github.com/apache/flink-web/pull/370#discussion_r473789842



##
File path: _posts/2020-08-20-flink-docker.md
##
@@ -0,0 +1,90 @@
+---
+layout: post
+title: "The State of Flink on Docker"
+date: 2020-08-08T00:00:00.000Z
+authors:
+- rmetzger:
+  name: "Robert Metzger"
+  twitter: rmetzger_
+categories: news
+
+excerpt: This blog post gives an update on the recent developments of Flink's 
support for Docker.
+---
+
+The Flink community recently put some effort into upgrading the Docker 
experience for our users. The goal was to reduce confusion and improve 
usability. With over 50 million downloads from Docker Hub, the Flink docker 
images are a very popular deployment option.
+
+Let's quickly break down the recent improvements:
+
+- Reduce confusion: Flink used to have 2 Dockerfiles and a 3rd file maintained 
outside of the official repository — all with different features and varying 
stability. Now, we have one central place for all images: 
[apache/flink-docker](https://github.com/apache/flink-docker).
+
+  Here, we keep all the Dockerfiles for the different releases. Check out the 
[detailed readme](https://github.com/apache/flink-docker/blob/master/README.md) 
of that repository for further explanation on the different branches, as well 
as the [Flink Improvement Proposal 
(FLIP-111)](https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification)
 that contains the detailed planning.
+
+  The apache/flink-docker repository also seeds the o[fficial Flink image on 
Docker Hub](https://hub.docker.com/_/flink).
+
+- Improve Usability: The Dockerfiles are used for various purposes: [Native 
docker 
deployments](https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/docker.html),
 [Flink on 
Kubernetes](https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html),
 the (unofficial) [Flink helm 
example](https://github.com/docker-flink/examples) and the project's [internal 
end to end 
tests](https://github.com/apache/flink/tree/master/flink-end-to-end-tests). 
With one unified image, all these consumers of the images benefit from the same 
set of features, documentation and testing. 
+
+  The new images support passing configuration variables via a 
`FLINK_PROPERTIES` environment variable. Users can enable default plugins with 
the `ENABLE_BUILT_IN_PLUGINS` environment variable. The images also allow 
loading custom jar paths and configuration files.
+
+Looking into the future, there are already some interesting potential 
improvements lined up: 
+
+- [Java 11 Docker images](https://issues.apache.org/jira/browse/FLINK-16260) 
(already completed)
+- [Use vanilla docker-entrypoint with 
flink-kubernetes](https://issues.apache.org/jira/browse/FLINK-15793) (in 
progress)
+- [History server support](https://issues.apache.org/jira/browse/FLINK-17167)
+- [Support for OpenShift](https://issues.apache.org/jira/browse/FLINK-15587)
+
+## How do I get started?
+
+This is a short tutorial on how to start a Flink Session Cluster with docker.

Review comment:
   again, I would either make `a Flink Session Cluster with docker` a 
clickable link into the docs or add `see also details in docs link`.

##
File path: _posts/2020-08-20-flink-docker.md
##
@@ -0,0 +1,90 @@
+---
+layout: post
+title: "The State of Flink on Docker"
+date: 2020-08-08T00:00:00.000Z
+authors:
+- rmetzger:
+  name: "Robert Metzger"
+  twitter: rmetzger_
+categories: news
+
+excerpt: This blog post gives an update on the recent developments of Flink's 
support for Docker.
+---
+
+The Flink community recently put some effort into upgrading the Docker 
experience for our users. The goal was to reduce confusion and improve 
usability. With over 50 million downloads from Docker Hub, the Flink docker 
images are a very popular deployment option.
+
+Let's quickly break down the recent improvements:
+
+- Reduce confusion: Flink used to have 2 Dockerfiles and a 3rd file maintained 
outside of the official repository — all with different features and varying 
stability. Now, we have one central place for all images: 
[apache/flink-docker](https://github.com/apache/flink-docker).
+
+  Here, we keep all the Dockerfiles for the different releases. Check out the 
[detailed readme](https://github.com/apache/flink-docker/blob/master/README.md) 
of that repository for further explanation on the different branches, as well 
as the [Flink Improvement Proposal 
(FLIP-111)](https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification)
 that contains the detailed planning.
+
+  The apache/flink-docker repository also seeds the o[fficial Flink image on 
Docker Hub](https://hub.docker.com/_/flink).

Review comment:
   ```suggestion
 The `apache/flink-docker` repository also seeds the [official Flink image 
on Docker Hub](https://hub.docker.com/_/flink).
   ```

[jira] [Commented] (FLINK-18643) Migrate Jenkins jobs to ci-builds.apache.org

2020-08-20 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181055#comment-17181055
 ] 

Robert Metzger commented on FLINK-18643:


I've disabled the Jenkins profiles for Flink (except the statefun snapshot 
deployment ones)

> Migrate Jenkins jobs to ci-builds.apache.org
> 
>
> Key: FLINK-18643
> URL: https://issues.apache.org/jira/browse/FLINK-18643
> Project: Flink
>  Issue Type: Improvement
>  Components: Release System
>Reporter: Chesnay Schepler
>Assignee: Robert Metzger
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2
>
>
> Infra is [reworking the Jenkins 
> setup|https://lists.apache.org/thread.html/re974eed417a1bc294694701d5c91b4bf92689fcf32a4c91f169be87d%40%3Cbuilds.apache.org%3E],
>  so we have to migrate our jobs that do the snapshot deployments.
> Alternatively, find other ways to do this (Azure?) to reduce number of used 
> infrastructure services.
> /cc [~rmetzger]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13205: [FLINK-17330][runtime] Merge cyclic dependent pipelined regions into one region

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13205:
URL: https://github.com/apache/flink/pull/13205#issuecomment-677463515


   
   ## CI report:
   
   * e11ffec55b9151857069d64c806b47cc98d9679d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5739)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13205: [FLINK-17330][runtime] Merge cyclic dependent pipelined regions into one region

2020-08-20 Thread GitBox


flinkbot commented on pull request #13205:
URL: https://github.com/apache/flink/pull/13205#issuecomment-677463515


   
   ## CI report:
   
   * e11ffec55b9151857069d64c806b47cc98d9679d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] AHeise commented on a change in pull request #13175: [FLINK-18955][Checkpointing] Add checkpoint path to job startup/restore message

2020-08-20 Thread GitBox


AHeise commented on a change in pull request #13175:
URL: https://github.com/apache/flink/pull/13175#discussion_r473766209



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java
##
@@ -320,6 +320,12 @@ void setDiscardCallback(@Nullable 
CompletedCheckpointStats.DiscardCallback disca
 
@Override
public String toString() {
-   return String.format("Checkpoint %d @ %d for %s", checkpointID, 
timestamp, job);
+   return String.format(
+   "%s %d @ %d for %s located at %s",
+   props.getCheckpointType(),

Review comment:
   Currently that would display:
   "CHECKPOINT"
   "SAVEPOINT"
   "SYNC_SAVEPOINT"
   
   I'm proposing to translate them into just "Checkpoint" and "Savepoint" to 
reduce the technicality. However, no strong feelings, so you can also convince 
me that it's better to leave as is.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13205: [FLINK-17330][runtime] Merge cyclic dependent pipelined regions into one region

2020-08-20 Thread GitBox


flinkbot commented on pull request #13205:
URL: https://github.com/apache/flink/pull/13205#issuecomment-677459348


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit e11ffec55b9151857069d64c806b47cc98d9679d (Thu Aug 20 
08:33:48 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zhuzhurk opened a new pull request #13205: [FLINK-17330][runtime] Merge cyclic dependent pipelined regions into one region

2020-08-20 Thread GitBox


zhuzhurk opened a new pull request #13205:
URL: https://github.com/apache/flink/pull/13205


   ## What is the purpose of the change
   
   This PR changes region building to merge cyclically dependent pipelined 
regions into one region. This avoids scheduling deadlocks due to cyclic 
dependencies.
   For more details, see FLINK-17330.
   
   ## Brief change log
   
 - *Added StronglyConnectedComponentsComputeUtils to find out cyclic 
dependent regions*
 - *Changed PipelinedRegionComputeUtil to merge cyclic dependent pipelined 
regions into one region*
   
   ## Verifying this change
   
 - *Added UT StronglyConnectedComponentsComputeUtilsTest*
 - *Added test case in PipelinedRegionComputeUtilTest to verify regions on 
the same cycles are merged*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (**yes** / no / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-17330) Avoid scheduling deadlocks caused by cyclic input dependencies between regions

2020-08-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-17330:
---
Labels: pull-request-available  (was: )

> Avoid scheduling deadlocks caused by cyclic input dependencies between regions
> --
>
> Key: FLINK-17330
> URL: https://issues.apache.org/jira/browse/FLINK-17330
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Imagine a job like this:
> A -- (pipelined FORWARD) --> B -- (blocking ALL-to-ALL) --> D
> A -- (pipelined FORWARD) --> C -- (pipelined FORWARD) --> D
> parallelism=2 for all vertices.
> We will have 2 execution pipelined regions:
> R1 = {A1, B1, C1, D1}
> R2 = {A2, B2, C2, D2}
> R1 has a cross-region input edge (B2->D1).
> R2 has a cross-region input edge (B1->D2).
> A scheduling deadlock will happen since we schedule a region only when all its 
> inputs are consumable (i.e. all its blocking input partitions are finished). This 
> is because R1 can be scheduled only if R2 finishes, while R2 can be scheduled 
> only if R1 finishes.
> To avoid this, one solution is to force a logical pipelined region with 
> intra-region ALL-to-ALL blocking edges to form only one execution pipelined 
> region, so that there is no cyclic input dependency between regions.
> Besides that, we should also pay attention to avoiding cyclic cross-region 
> POINTWISE blocking edges.
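
A minimal sketch of detecting such cyclically dependent regions via strongly
connected components (Tarjan's algorithm; the integer region ids and the
adjacency-list representation are illustrative assumptions):

{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: Tarjan's SCC over a region dependency graph. */
class SccSketch {
    private final List<List<Integer>> graph; // adjacency list: region -> consumer regions
    private final int[] num;                 // DFS visit order, -1 = unvisited
    private final int[] low;                 // lowest visit order reachable
    private final boolean[] onStack;
    private final Deque<Integer> stack = new ArrayDeque<>();
    private final List<List<Integer>> components = new ArrayList<>();
    private int index = 0;

    SccSketch(List<List<Integer>> graph) {
        this.graph = graph;
        int n = graph.size();
        num = new int[n];
        low = new int[n];
        onStack = new boolean[n];
        Arrays.fill(num, -1);
    }

    /** Each returned component is a set of regions to merge into one region. */
    List<List<Integer>> run() {
        for (int v = 0; v < graph.size(); v++) {
            if (num[v] == -1) {
                dfs(v);
            }
        }
        return components;
    }

    private void dfs(int v) {
        num[v] = low[v] = index++;
        stack.push(v);
        onStack[v] = true;
        for (int w : graph.get(v)) {
            if (num[w] == -1) {
                dfs(w);
                low[v] = Math.min(low[v], low[w]);
            } else if (onStack[w]) {
                low[v] = Math.min(low[v], num[w]);
            }
        }
        if (low[v] == num[v]) { // v is the root of a component
            List<Integer> component = new ArrayList<>();
            int w;
            do {
                w = stack.pop();
                onStack[w] = false;
                component.add(w);
            } while (w != v);
            components.add(component);
        }
    }
}
{code}

All regions within one component can then be merged into a single schedulable
region, so the resulting region graph is acyclic.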



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-18960) flink sideoutput union

2020-08-20 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz closed FLINK-18960.

Fix Version/s: 1.10.2
   Resolution: Duplicate

> flink sideoutput union
> --
>
> Key: FLINK-18960
> URL: https://issues.apache.org/jira/browse/FLINK-18960
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.10.1
>Reporter: xiaohang.li
>Priority: Minor
> Fix For: 1.10.2
>
>
> Flink side-output union does not seem to work correctly. If we union the side 
> outputs from the same operator, the output is the last side output repeated as 
> many times as there are unions, which is not expected. For example,
> {code:java}
> val side = new OutputTag[String]("side")
>  val side2 = new OutputTag[String]("side2")
>  val side3 = new OutputTag[String]("side3")
>  val ds = env.socketTextStream("master",9001)
>  val res = ds.process(new ProcessFunction[String,String] {
>  override def processElement(value: String, ctx: ProcessFunction[String, 
> String]#Context, out: Collector[String]): Unit = {
>  if(value.contains("hello"))
> { ctx.output(side,value) }
> else if(value.contains("world"))
> { ctx.output(side2,value) }
> else if(value.contains("flink"))
> { ctx.output(side3,value) }
> out.collect(value)
>  }
>  })
> val res1 = res.getSideOutput(side)
>  val res2 = res.getSideOutput(side2)
>  val res3 = res.getSideOutput(side3)
> println( ">"+res1.getClass)
>  println( ">"+res2.getClass)
> res1.print("res1")
>  res2.print("res2")
>  res3.print("res3")
> res2.union(res1).union(res3).print("all")
> {code}
>  
>  If we input 
> {code:java}
> hello
> world
> flink
> {code}
> The output will be 
>  
> {code:java}
> res1> hello
>  res2> world
>  res3> flink
>  all> flink
>  all> flink
>  all> flink
> {code}
>  
> But the expected output would be 
> {code:java}
> res1> hello
> res2> world
> res3> flink
> all> hello 
> all> world 
> all> flink
> {code}
>  
>  
> If we add a _map_ after each side output and then union them, the output is 
> correct, but adding the map should not be needed. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] kl0u commented on pull request #13024: [FLINK-18742][flink-clients] Some configuration args do not take effect at client

2020-08-20 Thread GitBox


kl0u commented on pull request #13024:
URL: https://github.com/apache/flink/pull/13024#issuecomment-677440255


   @wangzzu There is a test failure related to your changes. Please fix it and 
then I think I can merge :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13188: [hotfix][docs] Translate building.md to building.zh.md

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13188:
URL: https://github.com/apache/flink/pull/13188#issuecomment-675323573


   
   ## CI report:
   
   * 8301e1e53770c95e2f903526ac93e80e039cf4f2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5733)
 
   * 8bbc06f7a8931c9e02130f668805a82ccc00a9e8 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5738)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18960) flink sideoutput union

2020-08-20 Thread Yun Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181004#comment-17181004
 ] 

Yun Gao commented on FLINK-18960:
-

Hi [~xiaohang.li], I checked the issue; it did exist in older versions, but it 
should be fixed by https://issues.apache.org/jira/browse/FLINK-17578. You may 
try upgrading to 1.10.2 or 1.11. 

> flink sideoutput union
> --
>
> Key: FLINK-18960
> URL: https://issues.apache.org/jira/browse/FLINK-18960
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.10.1
>Reporter: xiaohang.li
>Priority: Minor
>
> Flink side-output union does not seem to work correctly. If we union the side 
> outputs from the same operator, the output is the last side output repeated as 
> many times as there are unions, which is not expected. For example,
> {code:java}
> val side = new OutputTag[String]("side")
>  val side2 = new OutputTag[String]("side2")
>  val side3 = new OutputTag[String]("side3")
>  val ds = env.socketTextStream("master",9001)
>  val res = ds.process(new ProcessFunction[String,String] {
>  override def processElement(value: String, ctx: ProcessFunction[String, 
> String]#Context, out: Collector[String]): Unit = {
>  if(value.contains("hello"))
> { ctx.output(side,value) }
> else if(value.contains("world"))
> { ctx.output(side2,value) }
> else if(value.contains("flink"))
> { ctx.output(side3,value) }
> out.collect(value)
>  }
>  })
> val res1 = res.getSideOutput(side)
>  val res2 = res.getSideOutput(side2)
>  val res3 = res.getSideOutput(side3)
> println( ">"+res1.getClass)
>  println( ">"+res2.getClass)
> res1.print("res1")
>  res2.print("res2")
>  res3.print("res3")
> res2.union(res1).union(res3).print("all")
> {code}
>  
>  If we input 
> {code:java}
> hello
> world
> flink
> {code}
> The output will be 
>  
> {code:java}
> res1> hello
>  res2> world
>  res3> flink
>  all> flink
>  all> flink
>  all> flink
> {code}
>  
> But the expected output would be 
> {code:java}
> res1> hello
> res2> world
> res3> flink
> all> hello 
> all> world 
> all> flink
> {code}
>  
>  
> If we add a _map_ after each side output and then union them, the output is 
> correct, but adding the map should not be needed. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18800) Avro serialization schema doesn't support Kafka key/value serialization

2020-08-20 Thread Mohammad Hossein Gerami (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181003#comment-17181003
 ] 

Mohammad Hossein Gerami commented on FLINK-18800:
-

Does anybody have any ideas?

> Avro serialization schema doesn't support  Kafka key/value serialization
> 
>
> Key: FLINK-18800
> URL: https://issues.apache.org/jira/browse/FLINK-18800
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Formats (JSON, Avro, Parquet, ORC, 
> SequenceFile)
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Mohammad Hossein Gerami
>Priority: Major
>
> {color:#ff8b00}AvroSerializationSchema{color} and 
> {color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't 
> support Kafka key/value serialization. I implemented a custom Avro 
> serialization schema to solve this problem. 
> For example, one must implement a class like this in Flink:
> {code:java}
> public class KafkaAvroRegistrySchemaSerializationSchema extends 
> RegistryAvroSerializationSchema implements 
> KafkaSerializationSchema{code}
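
A minimal sketch of such an adapter, assuming separate key and value
{{SerializationSchema}} instances (e.g. Avro-based ones); the class and its
key-extractor interface below are illustrative, not an existing Flink API:

{code:java}
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

/**
 * Hypothetical adapter: combines separate key and value SerializationSchemas
 * (e.g. Avro-based ones) into Kafka's key/value record format.
 */
class KeyValueKafkaSchema<K, V> implements KafkaSerializationSchema<V> {

    /** Illustrative functional interface for deriving the key from an element. */
    interface KeyExtractor<V, K> extends java.io.Serializable {
        K extract(V element);
    }

    private final String topic;
    private final SerializationSchema<K> keySchema;   // assumption: e.g. an Avro key schema
    private final SerializationSchema<V> valueSchema; // assumption: e.g. AvroSerializationSchema
    private final KeyExtractor<V, K> keyExtractor;

    KeyValueKafkaSchema(
            String topic,
            SerializationSchema<K> keySchema,
            SerializationSchema<V> valueSchema,
            KeyExtractor<V, K> keyExtractor) {
        this.topic = topic;
        this.keySchema = keySchema;
        this.valueSchema = valueSchema;
        this.keyExtractor = keyExtractor;
    }

    @Override
    public ProducerRecord<byte[], byte[]> serialize(V element, Long timestamp) {
        byte[] key = keySchema.serialize(keyExtractor.extract(element));
        byte[] value = valueSchema.serialize(element);
        // A null partition lets Kafka partition records by the serialized key.
        return new ProducerRecord<>(topic, null, timestamp, key, value);
    }
}
{code}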



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-18800) Avro serialization schema doesn't support Kafka key/value serialization

2020-08-20 Thread Mohammad Hossein Gerami (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Hossein Gerami updated FLINK-18800:

Description: 
{color:#ff8b00}AvroSerializationSchema{color} and 
{color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't support 
Kafka key/value serialization. I implemented a custom Avro serialization schema 
to solve this problem. 

For example, one must implement a class like this in Flink:
{code:java}
public class KafkaAvroRegistrySchemaSerializationSchema extends 
RegistryAvroSerializationSchema implements 
KafkaSerializationSchema{code}

  
was:{color:#ff8b00}[AvroSerializationSchema|[https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/formats/avro/AvroDeserializationSchema.html]]{color}
 and 
{color:#ff8b00}[ConfluentRegistryAvroSerializationSchema|[https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/formats/avro/registry/confluent/ConfluentRegistryAvroSerializationSchema.html]]{color}
 doesn't support Kafka key/value serialization. I implemented a custom Avro 
serialization schema for solving this problem. 


> Avro serialization schema doesn't support  Kafka key/value serialization
> 
>
> Key: FLINK-18800
> URL: https://issues.apache.org/jira/browse/FLINK-18800
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Formats (JSON, Avro, Parquet, ORC, 
> SequenceFile)
>Affects Versions: 1.11.0, 1.11.1
>Reporter: Mohammad Hossein Gerami
>Priority: Major
>
> {color:#ff8b00}AvroSerializationSchema{color} and 
> {color:#ff8b00}ConfluentRegistryAvroSerializationSchema{color} don't 
> support Kafka key/value serialization. I implemented a custom Avro 
> serialization schema to solve this problem. 
> For example, one must implement a class like this in Flink:
> {code:java}
> public class KafkaAvroRegistrySchemaSerializationSchema extends 
> RegistryAvroSerializationSchema implements 
> KafkaSerializationSchema{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] tillrohrmann commented on a change in pull request #13190: [FLINK-18986][kubernetes] Create ClusterClient only for attached deployments

2020-08-20 Thread GitBox


tillrohrmann commented on a change in pull request #13190:
URL: https://github.com/apache/flink/pull/13190#discussion_r473711834



##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/cli/KubernetesSessionCli.java
##
@@ -88,6 +88,7 @@ public Configuration getEffectiveConfiguration(String[] args) 
throws CliArgsExce
 
private int run(String[] args) throws FlinkException, CliArgsException {
final Configuration configuration = 
getEffectiveConfiguration(args);
+   
KubernetesClusterClientFactory.ensureClusterIdIsSet(configuration);

Review comment:
   Hmm, yeah, one should get the `clusterId` either from the user or, after a 
cluster is deployed, via the `ClusterClient` or `ClusterClientProvider`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann commented on a change in pull request #13190: [FLINK-18986][kubernetes] Create ClusterClient only for attached deployments

2020-08-20 Thread GitBox


tillrohrmann commented on a change in pull request #13190:
URL: https://github.com/apache/flink/pull/13190#discussion_r473709875



##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesClusterClientFactory.java
##
@@ -51,10 +51,7 @@ public boolean isCompatibleWith(Configuration configuration) 
{
@Override
public KubernetesClusterDescriptor 
createClusterDescriptor(Configuration configuration) {
checkNotNull(configuration);
-   if 
(!configuration.contains(KubernetesConfigOptions.CLUSTER_ID)) {
-   final String clusterId = generateClusterId();
-   
configuration.setString(KubernetesConfigOptions.CLUSTER_ID, clusterId);
-   }
+   ensureClusterIdIsSet(configuration);
return new KubernetesClusterDescriptor(configuration, 
KubeClientFactory.fromConfiguration(configuration));

Review comment:
   I think the idea was that the `ClusterDescriptor` will generate a 
cluster id when you call any of the `deploy*` methods. I believe what is set in 
the constructor of the `YarnClusterDescriptor` is fine and does not contradict 
what I was saying.
   
   If one really needs to configure the cluster id manually, then one could 
extend the `ClusterDescriptor` interface accordingly.
   
   I think that we are adding technical debt by working around broken contracts, 
which might be fine for now, but we should definitely straighten this out eventually.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13188: [hotfix][docs] Translate building.md to building.zh.md

2020-08-20 Thread GitBox


flinkbot edited a comment on pull request #13188:
URL: https://github.com/apache/flink/pull/13188#issuecomment-675323573


   
   ## CI report:
   
   * 8301e1e53770c95e2f903526ac93e80e039cf4f2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=5733)
 
   * 8bbc06f7a8931c9e02130f668805a82ccc00a9e8 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-18643) Migrate Jenkins jobs to ci-builds.apache.org

2020-08-20 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180998#comment-17180998
 ] 

Chesnay Schepler commented on FLINK-18643:
--

Hmm... I guess it's okay to only cover 1.11+ with this. Go ahead and remove the 
Jenkins setup.

> Migrate Jenkins jobs to ci-builds.apache.org
> 
>
> Key: FLINK-18643
> URL: https://issues.apache.org/jira/browse/FLINK-18643
> Project: Flink
>  Issue Type: Improvement
>  Components: Release System
>Reporter: Chesnay Schepler
>Assignee: Robert Metzger
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2
>
>
> Infra is [reworking the Jenkins 
> setup|https://lists.apache.org/thread.html/re974eed417a1bc294694701d5c91b4bf92689fcf32a4c91f169be87d%40%3Cbuilds.apache.org%3E],
>  so we have to migrate our jobs that do the snapshot deployments.
> Alternatively, find other ways to do this (Azure?) to reduce number of used 
> infrastructure services.
> /cc [~rmetzger]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] rkhachatryan commented on a change in pull request #13180: [FLINK-18962][checkpointing] Improve logging when checkpoint declined

2020-08-20 Thread GitBox


rkhachatryan commented on a change in pull request #13180:
URL: https://github.com/apache/flink/pull/13180#discussion_r473689643



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/AsyncCheckpointRunnable.java
##
@@ -129,12 +129,10 @@ public void run() {
checkpointMetaData.getCheckpointId());
}
} catch (Exception e) {
-   if (LOG.isDebugEnabled()) {
-   LOG.debug("{} - asynchronous part of checkpoint 
{} could not be completed.",
-   taskName,
-   checkpointMetaData.getCheckpointId(),
-   e);
-   }
+   LOG.info("{} - asynchronous part of checkpoint {} could 
not be completed.",

Review comment:
   I think everything said above about failure frequency is also true for 
expirations (in fact, there is only one counter for all types of failures).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



