[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114549#comment-17114549
 ] 

Nicholas Jiang commented on FLINK-17883:


[~dian.fu] [~lzljs3620320] [~jark] As you discussed, do you mean that this should 
use the INSERT OVERWRITE statement or addInsert instead of configuring a write mode 
for the FileSystem() connector in PyFlink?
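
For context, a hedged sketch of the INSERT OVERWRITE alternative mentioned above (table names are placeholders; executeSql is available on TableEnvironment since 1.11):
{code:java}
// Overwrite semantics expressed in the DML rather than in the connector DDL.
tEnv.executeSql("INSERT OVERWRITE MySink SELECT * FROM MySource");
{code}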

> Unable to configure write mode for FileSystem() connector in PyFlink
> 
>
> Key: FLINK-17883
> URL: https://issues.apache.org/jira/browse/FLINK-17883
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.10.1
>Reporter: Robert Metzger
>Assignee: Nicholas Jiang
>Priority: Major
>
> As a user of PyFlink, I'm getting the following exception:
> {code}
> File or directory /tmp/output already exists. Existing files and directories 
> are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite 
> existing files and directories.
> {code}
> I would like to be able to configure writeMode = OVERWRITE for the FileSystem 
> connector.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread hailong wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114547#comment-17114547
 ] 

hailong wang edited comment on FLINK-17892 at 5/23/20, 5:34 AM:


Hi [~lzljs3620320], no, the Kafka source will not be reused.

For the following SQL:
{code:java}
val sql1 = "CREATE TABLE SS (" +
   " a int," +
   " b bigint," +
   " c varchar" +
  ") WITH (" +
  "'connector.type' = 'TestTableSource' "+
  ")"
util.tableEnv.sqlUpdate(sql1);
util.tableEnv.registerFunction("random_udf", new NonDeterministicUdf())

val sqlQuery =
  """
|(SELECT a, random_udf() FROM SS /*+ OPTIONS('k1' = 'v1') */ WHERE a > 10)
|UNION ALL
|(SELECT a, random_udf() FROM SS /*+ OPTIONS('k2' = 'v2') */ WHERE a > 10)
  """.stripMargin
util.verifyPlan(sqlQuery)
{code}
The resulting plan is:
{code:java}
Union(all=[true], union=[a, EXPR$1])
:- Calc(select=[a, random_udf() AS EXPR$1], where=[>(a, 10)])
:  +- LegacyTableSourceScan(table=[[default_catalog, default_database, SS, 
source: [TestTableSource(a, b, c)], dynamic options: {k1=v1}]], fields=[a, b, 
c])
+- Calc(select=[a, random_udf() AS EXPR$1], where=[>(a, 10)])
   +- LegacyTableSourceScan(table=[[default_catalog, default_database, SS, 
source: [TestTableSource(a, b, c)], dynamic options: {k2=v2}]], fields=[a, b, 
c])
{code}
Because the dynamic options are part of the table digest, the source is not 
reused.

But I think dynamic options are used to override table properties, and since 
table properties are not part of the table digest, dynamic options may also not 
need to be part of the table digest.

What I'm looking forward to is that the source can be reused even if the table 
hints are different.

As another example, if the source is Kafka, whether or not the source is reused 
will produce different results.

If the Kafka source is reused, tableSink and tableSink1 will each receive a 
full set of data from the source at the same time.

But if the Kafka source is not reused, tableSink and tableSink1 will only have 
a full set of data between them.

I think the first case is the correct behavior.


was (Author: hailong wang):
Hi [~lzljs3620320], no, the Kafka source will not be reused.

For the following SQL:
{code:java}
val sql1 = "CREATE TABLE SS (" +
   " a int," +
   " b bigint," +
   " c varchar" +
  ") WITH (" +
  "'connector.type' = 'TestTableSource' "+
  ")"
util.tableEnv.sqlUpdate(sql1);
util.tableEnv.registerFunction("random_udf", new NonDeterministicUdf())

val sqlQuery =
  """
|(SELECT a, random_udf() FROM SS /*+ OPTIONS('k1' = 'v1') */ WHERE a > 10)
|UNION ALL
|(SELECT a, random_udf() FROM SS /*+ OPTIONS('k2' = 'v2') */ WHERE a > 10)
  """.stripMargin
util.verifyPlan(sqlQuery)
{code}
The resulting plan is:
{code:java}
Union(all=[true], union=[a, EXPR$1])
:- Calc(select=[a, random_udf() AS EXPR$1], where=[>(a, 10)])
:  +- LegacyTableSourceScan(table=[[default_catalog, default_database, SS, 
source: [TestTableSource(a, b, c)], dynamic options: {k1=v1}]], fields=[a, b, 
c])
+- Calc(select=[a, random_udf() AS EXPR$1], where=[>(a, 10)])
   +- LegacyTableSourceScan(table=[[default_catalog, default_database, SS, 
source: [TestTableSource(a, b, c)], dynamic options: {k2=v2}]], fields=[a, b, 
c])
{code}
Because the dynamic options are part of the table digest, the source is not 
reused.

But I think dynamic options are used to override table properties, and since 
table properties are not part of the table digest, dynamic options may also not 
need to be part of the table digest.

What I'm looking forward to is that the source can be reused even if the table 
hints are different.

As another example, if the source is Kafka, whether or not the source is reused 
will produce different results.

If the Kafka source is reused, tableSink and tableSink1 will each receive a 
full set of data from the source at the same time.

But if the Kafka source is not reused, tableSink and tableSink1 will only have 
a full set of data between them.

I think the first case is the correct behavior.

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Critical
> Fix For: 1.11.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from 

[jira] [Commented] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread hailong wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114547#comment-17114547
 ] 

hailong wang commented on FLINK-17892:
--

Hi [~lzljs3620320], no, the Kafka source will not be reused.

For the following SQL:
{code:java}
val sql1 = "CREATE TABLE SS (" +
   " a int," +
   " b bigint," +
   " c varchar" +
  ") WITH (" +
  "'connector.type' = 'TestTableSource' "+
  ")"
util.tableEnv.sqlUpdate(sql1);
util.tableEnv.registerFunction("random_udf", new NonDeterministicUdf())

val sqlQuery =
  """
|(SELECT a, random_udf() FROM SS /*+ OPTIONS('k1' = 'v1') */ WHERE a > 10)
|UNION ALL
|(SELECT a, random_udf() FROM SS /*+ OPTIONS('k2' = 'v2') */ WHERE a > 10)
  """.stripMargin
util.verifyPlan(sqlQuery)
{code}
The resulting plan is:
{code:java}
Union(all=[true], union=[a, EXPR$1])
:- Calc(select=[a, random_udf() AS EXPR$1], where=[>(a, 10)])
:  +- LegacyTableSourceScan(table=[[default_catalog, default_database, SS, 
source: [TestTableSource(a, b, c)], dynamic options: {k1=v1}]], fields=[a, b, 
c])
+- Calc(select=[a, random_udf() AS EXPR$1], where=[>(a, 10)])
   +- LegacyTableSourceScan(table=[[default_catalog, default_database, SS, 
source: [TestTableSource(a, b, c)], dynamic options: {k2=v2}]], fields=[a, b, 
c])
{code}
Because the dynamic options are part of the table digest, the source is not 
reused.

But I think dynamic options are used to override table properties, and since 
table properties are not part of the table digest, dynamic options may also not 
need to be part of the table digest.

What I'm looking forward to is that the source can be reused even if the table 
hints are different.

As another example, if the source is Kafka, whether or not the source is reused 
will produce different results.

If the Kafka source is reused, tableSink and tableSink1 will each receive a 
full set of data from the source at the same time.

But if the Kafka source is not reused, tableSink and tableSink1 will only have 
a full set of data between them.

I think the first case is the correct behavior.

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Critical
> Fix For: 1.11.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
> Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
> {code}
> There will be two Kafka table sources using the same group id to consume the 
> same topic.
> So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632945310


   
   ## CI report:
   
   * a1304a3c20a9ba848ddc3308d6da9ecab8a85a5b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] klion26 commented on a change in pull request #11808: [FLINK-16155][docs-zh] Translate "Operator/Process Function" page into Chinese

2020-05-22 Thread GitBox


klion26 commented on a change in pull request #11808:
URL: https://github.com/apache/flink/pull/11808#discussion_r429510266



##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介
 
-The `ProcessFunction` is a low-level stream processing operation, giving 
access to the basic building blocks of
-all (acyclic) streaming applications:
+`ProcessFunction` 是一种低级别的流处理操作,基于它用户可以访问所有(非循环)流应用程序的基本构建块:
 
-  - events (stream elements)
-  - state (fault-tolerant, consistent, only on keyed stream)
-  - timers (event time and processing time, only on keyed stream)
+   -事件(流元素)
+   -状态(容错,一致性,仅在 keyed stream 上)
+   -计时器(事件时间和处理时间,仅在 keyed stream 上)

Review comment:
   ```suggestion
 - 计时器(事件时间和处理时间,仅在 keyed stream 上)
   ```

##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介
 
-The `ProcessFunction` is a low-level stream processing operation, giving 
access to the basic building blocks of
-all (acyclic) streaming applications:
+`ProcessFunction` 是一种低级别的流处理操作,基于它用户可以访问所有(非循环)流应用程序的基本构建块:

Review comment:
   Would it be better to change “非循环” to “无环”?

##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介

Review comment:
   ```suggestion
   ## ProcessFunction 简介
   ```

##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介
 
-The `ProcessFunction` is a low-level stream processing operation, giving 
access to the basic building blocks of
-all (acyclic) streaming applications:
+`ProcessFunction` 是一种低级别的流处理操作,基于它用户可以访问所有(非循环)流应用程序的基本构建块:
 
-  - events (stream elements)
-  - state (fault-tolerant, consistent, only on keyed stream)
-  - timers (event time and processing time, only on keyed stream)
+   -事件(流元素)

Review comment:
   ```suggestion 
 - 事件(流元素)
   ```
   This could also be changed to something else, as long as it renders with the intended effect; the same applies below.
   You can check the rendered result locally by running `sh docs/build.sh -p` and opening localhost:4000.

##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介
 
-The `ProcessFunction` is a low-level stream processing operation, giving 
access to the basic building blocks of
-all (acyclic) streaming applications:
+`ProcessFunction` 是一种低级别的流处理操作,基于它用户可以访问所有(非循环)流应用程序的基本构建块:
 
-  - events (stream elements)
-  - state (fault-tolerant, consistent, only on keyed stream)
-  - timers (event time and processing time, only on keyed stream)
+   -事件(流元素)
+   -状态(容错,一致性,仅在 keyed stream 上)

Review comment:
   ```suggestion
 - 状态(容错,一致性,仅在 keyed stream 上)
   ```

##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介
 
-The `ProcessFunction` is a low-level stream processing operation, giving 
access to the basic building blocks of
-all (acyclic) streaming applications:
+`ProcessFunction` 是一种低级别的流处理操作,基于它用户可以访问所有(非循环)流应用程序的基本构建块:
 
-  - events (stream elements)
-  - state (fault-tolerant, consistent, only on keyed stream)
-  - timers (event time and processing time, only on keyed stream)
+   -事件(流元素)
+   -状态(容错,一致性,仅在 keyed stream 上)
+   -计时器(事件时间和处理时间,仅在 keyed stream 上)
 
-The `ProcessFunction` can be thought of as a `FlatMapFunction` with access to 
keyed state and timers. It handles events
-by being invoked for each event received in the input stream(s).
+可以将 `ProcessFunction` 视为一种可以访问 keyed state 和计时器的 `FlatMapFunction`。
+Flink 为收到的输入流中的每个事件都调用该函数来进行处理。

Review comment:
   "Flink 为收到的输入流中的每个事件" -> "Flink 为收到的输入流中每个事件" 会好一些吗?

##
File path: docs/dev/stream/operators/process_function.zh.md
##
@@ -26,66 +26,65 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## The ProcessFunction
+## ProcessFunction简介
 
-The `ProcessFunction` is a low-level stream processing operation, giving 
access to the basic building blocks of
-all (acyclic) streaming applications:
+`ProcessFunction` 是一种低级别的流处理操作,基于它用户可以访问所有(非循环)流应用程序的基本构建块:
 
-  - events (stream elements)
-  - state (fault-tolerant, consistent, only on keyed stream)
-  - timers (event time and processing time, only on keyed stream)
+   -事件(流元素)
+   -状态(容错,一致性,仅在 keyed stream 上)
+   -计时器(事件时间和处理时间,仅在 keyed stream 上)
 
-The `ProcessFunction` can be thought of as a `FlatMapFunction` with access to 
keyed state and timers. It handles events
-by being invoked for 

[jira] [Commented] (FLINK-17893) SQL-CLI no exception stack

2020-05-22 Thread godfrey he (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114515#comment-17114515
 ] 

godfrey he commented on FLINK-17893:


The SQL client should support a {{-verbose}} option.

> SQL-CLI no exception stack
> --
>
> Key: FLINK-17893
> URL: https://issues.apache.org/jira/browse/FLINK-17893
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Reporter: Jingsong Lee
>Priority: Blocker
> Fix For: 1.11.0
>
>
> If I write a wrong DDL, there is only an "[ERROR] Unknown or invalid SQL statement" message.
> No exception stack appears in the client or the logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17896) HiveCatalog can't work with new table factory because of is_generic

2020-05-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17896:


 Summary: HiveCatalog can't work with new table factory because of 
is_generic
 Key: FLINK-17896
 URL: https://issues.apache.org/jira/browse/FLINK-17896
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive, Table SQL / API
Reporter: Jingsong Lee
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12299: [FLINK-17878][filesystem] StreamingFileWriter currentWatermark attrib…

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12299:
URL: https://github.com/apache/flink/pull/12299#issuecomment-632977800


   
   ## CI report:
   
   * 6c0db10514575ddaeb66689ff7b3e2ee2975bdfe Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2067)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Myasuka commented on pull request #12282: [FLINK-17865][checkpoint] Increase default size of 'state.backend.fs.memory-threshold'

2020-05-22 Thread GitBox


Myasuka commented on pull request #12282:
URL: https://github.com/apache/flink/pull/12282#issuecomment-632978514


   My azure CI pipeline has run successfully twice: 
[my-build-1](https://myasuka.visualstudio.com/flink/_build/results?buildId=77) 
and 
[my-build-2](https://myasuka.visualstudio.com/flink/_build/results?buildId=75)
   
   Current CI in this PR failed due to FLINK-16572
   
   @flinkbot run azure



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-17895) Default value of rows-per-second in datagen should be limited

2020-05-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17895:


 Summary: Default value of rows-per-second in datagen should be 
limited
 Key: FLINK-17895
 URL: https://issues.apache.org/jira/browse/FLINK-17895
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Jingsong Lee
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16572) CheckPubSubEmulatorTest is flaky on Azure

2020-05-22 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114503#comment-17114503
 ] 

Yun Tang commented on FLINK-16572:
--

Another instance: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2040=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5

> CheckPubSubEmulatorTest is flaky on Azure
> -
>
> Key: FLINK-16572
> URL: https://issues.apache.org/jira/browse/FLINK-16572
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Google Cloud PubSub, Tests
>Affects Versions: 1.11.0
>Reporter: Aljoscha Krettek
>Assignee: Richard Deurwaarder
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Log: 
> https://dev.azure.com/aljoschakrettek/Flink/_build/results?buildId=56=logs=1f3ed471-1849-5d3c-a34c-19792af4ad16=ce095137-3e3b-5f73-4b79-c42d3d5f8283=7842



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #12299: [FLINK-17878][filesystem] StreamingFileWriter currentWatermark attrib…

2020-05-22 Thread GitBox


flinkbot commented on pull request #12299:
URL: https://github.com/apache/flink/pull/12299#issuecomment-632977800


   
   ## CI report:
   
   * 6c0db10514575ddaeb66689ff7b3e2ee2975bdfe UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12293: [FLINK-17878][filesystem] StreamingFileWriter currentWatermark attrib…

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12293:
URL: https://github.com/apache/flink/pull/12293#issuecomment-632624268


   
   ## CI report:
   
   * 10d13796e80f7c566995155edabb38996bda Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2041)
 
   * 3d58b38f7bf87bfb660b7c24adb90ada3d077e20 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2066)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632945310


   
   ## CI report:
   
   * 52b0395a9fc4d3fa99def97fe79b42edef32bcf6 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2064)
 
   * a1304a3c20a9ba848ddc3308d6da9ecab8a85a5b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-17894) RowGenerator in datagen connector should be serializable

2020-05-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17894:


 Summary: RowGenerator in datagen connector should be serializable
 Key: FLINK-17894
 URL: https://issues.apache.org/jira/browse/FLINK-17894
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Jingsong Lee
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory

2020-05-22 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17889:
-
Priority: Blocker  (was: Major)

> flink-connector-hive jar contains wrong class in its SPI config file 
> org.apache.flink.table.factories.TableFactory
> --
>
> Key: FLINK-17889
> URL: https://issues.apache.org/jira/browse/FLINK-17889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Priority: Blocker
>
> These 2 classes are in flink-connector-hive jar's SPI config file
> {code:java}
> org.apache.flink.orc.OrcFileSystemFormatFactory
> License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
> Due to this issue, I get the following exception in zeppelin side.
> {code:java}
> Caused by: java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtypeCaused by: 
> java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype at 
> java.util.ServiceLoader.fail(ServiceLoader.java:239) at 
> java.util.ServiceLoader.access$300(ServiceLoader.java:185) at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376) at 
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at 
> java.util.ServiceLoader$1.next(ServiceLoader.java:480) at 
> java.util.Iterator.forEachRemaining(Iterator.java:116) at 
> org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214)
>  ... 35 more {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory

2020-05-22 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17889:
-
Fix Version/s: 1.11.0

> flink-connector-hive jar contains wrong class in its SPI config file 
> org.apache.flink.table.factories.TableFactory
> --
>
> Key: FLINK-17889
> URL: https://issues.apache.org/jira/browse/FLINK-17889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Priority: Blocker
> Fix For: 1.11.0
>
>
> These 2 classes are in flink-connector-hive jar's SPI config file
> {code:java}
> org.apache.flink.orc.OrcFileSystemFormatFactory
> License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
> Due to this issue, I get the following exception in zeppelin side.
> {code:java}
> Caused by: java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtypeCaused by: 
> java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype at 
> java.util.ServiceLoader.fail(ServiceLoader.java:239) at 
> java.util.ServiceLoader.access$300(ServiceLoader.java:185) at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376) at 
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at 
> java.util.ServiceLoader$1.next(ServiceLoader.java:480) at 
> java.util.Iterator.forEachRemaining(Iterator.java:116) at 
> org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214)
>  ... 35 more {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17893) SQL-CLI no exception stack

2020-05-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-17893:


 Summary: SQL-CLI no exception stack
 Key: FLINK-17893
 URL: https://issues.apache.org/jira/browse/FLINK-17893
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Jingsong Lee
 Fix For: 1.11.0


If I write a wrong DDL, there is only an "[ERROR] Unknown or invalid SQL statement" message.

No exception stack appears in the client or the logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12293: [FLINK-17878][filesystem] StreamingFileWriter currentWatermark attrib…

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12293:
URL: https://github.com/apache/flink/pull/12293#issuecomment-632624268


   
   ## CI report:
   
   * 10d13796e80f7c566995155edabb38996bda Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2041)
 
   * 3d58b38f7bf87bfb660b7c24adb90ada3d077e20 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #12299: [FLINK-17878][filesystem] StreamingFileWriter currentWatermark attrib…

2020-05-22 Thread GitBox


flinkbot commented on pull request #12299:
URL: https://github.com/apache/flink/pull/12299#issuecomment-632975818


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 6c0db10514575ddaeb66689ff7b3e2ee2975bdfe (Sat May 23 
03:02:29 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17878) Transient watermark attribute should be initial at runtime in streaming file operators

2020-05-22 Thread xiaogang zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114499#comment-17114499
 ] 

xiaogang zhou commented on FLINK-17878:
---

[~lzljs3620320] Hi Jingsong, thanks for the advice. I have created another PR for 
release-1.11.

[~jark] Hi Jark, thanks for the advice. When we use the processing-time mode, the 
watermark will not be overwritten. And when I tried to work on this issue, I 
didn't remove the transient; I initialize the value in initializeState.
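
For illustration, a minimal sketch of the approach described above (not the actual patch; the surrounding operator code is assumed):
{code:java}
// Keep the field transient; a transient long is deserialized to 0 on the
// task managers, so restore the intended default in initializeState(),
// which runs after the operator has been deserialized.
private transient long currentWatermark;

@Override
public void initializeState(StateInitializationContext context) throws Exception {
    super.initializeState(context);
    currentWatermark = Long.MIN_VALUE;
}
{code}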

> Transient watermark attribute should be initial at runtime in streaming file 
> operators
> --
>
> Key: FLINK-17878
> URL: https://issues.apache.org/jira/browse/FLINK-17878
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.11.0
>Reporter: xiaogang zhou
>Assignee: xiaogang zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> StreamingFileWriter has a 
> private transient long currentWatermark = Long.MIN_VALUE;
>  
> In case a developer wants to create a custom bucket assigner, it will receive 
> a currentWatermark of 0; this might conflict with the original Flink 
> approach of handling Long.MIN_VALUE.
>  
> Should we remove the transient keyword?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zhougit86 opened a new pull request #12299: [FLINK-17878][filesystem] StreamingFileWriter currentWatermark attrib…

2020-05-22 Thread GitBox


zhougit86 opened a new pull request #12299:
URL: https://github.com/apache/flink/pull/12299


   ## Verifying this change
   
   This change is a trivial rework 
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`:  no
 - The serializers:  no 
 - The runtime per-record code paths (performance sensitive):  no 
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper:  no 
 - The S3 file system connector: no 
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  no
 - If yes, how is the feature documented? not applicable 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-14527) Add integration tests for PostgreSQL and MySQL dialects in flink jdbc module

2020-05-22 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114494#comment-17114494
 ] 

Lijie Wang commented on FLINK-14527:


About UT, we can check whether the results returned by the methods defined in the 
JdbcDialect interface (e.g. `upsertStatement`) meet our expectations.

About IT, I think it needs external system support (for example, a MySQL 
database), so I don't know how to deal with it. Do you have any suggestions?
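
As a sketch of the UT idea, something like the following could work (hypothetical: the dialect lookup, method name, and generated SQL asserted below are assumptions, not verified against a particular Flink release):
{code:java}
import static org.junit.Assert.assertTrue;
import java.util.Optional;
import org.junit.Test;

public class MySqlDialectTest {
    @Test
    public void testUpsertStatement() {
        // Hypothetical lookup of the MySQL dialect by JDBC URL.
        JdbcDialect dialect = JdbcDialects.get("jdbc:mysql://localhost:3306/test").get();
        Optional<String> upsert = dialect.getUpsertStatement(
                "my_table", new String[]{"id", "name"}, new String[]{"id"});
        // A MySQL upsert is expected to be INSERT ... ON DUPLICATE KEY UPDATE.
        assertTrue(upsert.isPresent());
        assertTrue(upsert.get().contains("ON DUPLICATE KEY UPDATE"));
    }
}
{code}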

> Add integration tests for PostgreSQL and MySQL dialects in flink jdbc module
> -
>
> Key: FLINK-14527
> URL: https://issues.apache.org/jira/browse/FLINK-14527
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Reporter: Jark Wu
>Priority: Major
>
> Currently, we already support the PostgreSQL, MySQL, and Derby dialects in 
> flink-jdbc as sink and source. However, we only have integration tests for 
> Derby. 
> We should add integration tests for the PostgreSQL and MySQL dialects too. Maybe 
> we can use the JUnit {{Parameterized}} feature to avoid duplicated testing code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14527) Add integration tests for PostgreSQL and MySQL dialects in flink jdbc module

2020-05-22 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114493#comment-17114493
 ] 

Lijie Wang commented on FLINK-14527:


Hi [~jark], I'm considering the same thing. I found that the MySQL dialect and PG 
dialect are completely untested (no UT and no IT), so I'm considering adding tests 
for them. Can you assign this to me?

 

> Add integration tests for PostgreSQL and MySQL dialects in flink jdbc module
> -
>
> Key: FLINK-14527
> URL: https://issues.apache.org/jira/browse/FLINK-14527
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Reporter: Jark Wu
>Priority: Major
>
> Currently, we already support the PostgreSQL, MySQL, and Derby dialects in 
> flink-jdbc as sink and source. However, we only have integration tests for 
> Derby. 
> We should add integration tests for the PostgreSQL and MySQL dialects too. Maybe 
> we can use the JUnit {{Parameterized}} feature to avoid duplicated testing code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zhougit86 opened a new pull request #12298: Feature/watermark

2020-05-22 Thread GitBox


zhougit86 opened a new pull request #12298:
URL: https://github.com/apache/flink/pull/12298


   ## Verifying this change
   
   This change is a trivial rework 
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`:  no
 - The serializers:  no 
 - The runtime per-record code paths (performance sensitive):  no 
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper:  no 
 - The S3 file system connector: no 
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  no
 - If yes, how is the feature documented? not applicable 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zhougit86 closed pull request #12298: Feature/watermark

2020-05-22 Thread GitBox


zhougit86 closed pull request #12298:
URL: https://github.com/apache/flink/pull/12298


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingsongLi commented on pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector

2020-05-22 Thread GitBox


JingsongLi commented on pull request #12283:
URL: https://github.com/apache/flink/pull/12283#issuecomment-632971290


   > @JingsongLi Thank you for working on this. I have one additional question 
that is not addressed in the docs. What happens when a record is supposed to be 
written into a partition that has already been committed? Is it dropped? 
Written into its own file? What happens?
   
   Late data processing can be either "drop late data" or "continue appending". 
For now, it is just "continue appending", so the partition commit policy needs 
to be idempotent.
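   
   For illustration, a hypothetical sketch of what an idempotent custom commit policy could look like (the `PartitionCommitPolicy` interface and its `Context#partitionPath()` method as used here are assumptions for the example, not a confirmed API):
   
   ```java
   // Hypothetical: a policy that stays correct when late data causes the
   // same partition to be committed more than once.
   public class SuccessFilePolicy implements PartitionCommitPolicy {
       @Override
       public void commit(Context context) throws Exception {
           Path successFile = new Path(context.partitionPath(), "_SUCCESS");
           FileSystem fs = successFile.getFileSystem();
           // Overwriting unconditionally makes repeated commits harmless.
           fs.create(successFile, FileSystem.WriteMode.OVERWRITE).close();
       }
   }
   ```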



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114483#comment-17114483
 ] 

Jingsong Lee commented on FLINK-17892:
--

Changed to a 1.11 bug and critical priority.

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Critical
> Fix For: 1.11.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
> Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
> {code}
> There will be two Kafka table sources using the same group id to consume the 
> same topic.
> So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17892:
-
Priority: Critical  (was: Major)

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Critical
> Fix For: 1.11.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
> Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
> {code}
> There will be two Kafka table sources using the same group id to consume the 
> same topic.
> So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17892:
-
Fix Version/s: (was: 1.12.0)
   1.11.0

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Major
> Fix For: 1.11.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
> Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
> {code}
> There will be two Kafka table sources using the same group id to consume the 
> same topic.
> So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-17892:
-
Issue Type: Bug  (was: Improvement)

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Major
> Fix For: 1.12.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
> Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
> {code}
> There will be two Kafka table sources using the same group id to consume the 
> same topic.
> So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114482#comment-17114482
 ] 

Jingsong Lee commented on FLINK-17892:
--

Hi [~hailong wang], thanks for reporting. Do you mean the KAFKA source will be 
reused in the physical graph, with one dynamic option picked at random?

If so, this is a noteworthy bug.

CC: [~danny0405]

> Dynamic option may not be a part of the table digest
> 
>
> Key: FLINK-17892
> URL: https://issues.apache.org/jira/browse/FLINK-17892
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0
>Reporter: hailong wang
>Priority: Major
> Fix For: 1.12.0
>
>
> For now, table properties are not part of the table digest, but dynamic 
> options are included.
> This can lead to an error when plans are reused.
> If I define a Kafka table:
> {code:java}
> CREATE TABLE KAFKA (
> ……
> ) with (
> topic = 'xx',
> groupid = 'xxx'
> ……
> )
> Insert into sinktable select * from KAFKA;
> Insert into sinktable1 select * from KAFKA;{code}
> The KAFKA source will be reused according to the SQL above.
> But if I add different table hints to the DML, like:
> {code:java}
> Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
> Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
> {code}
> There will be two Kafka table sources using the same group id to consume the 
> same topic.
> So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] JingsongLi commented on pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector

2020-05-22 Thread GitBox


JingsongLi commented on pull request #12283:
URL: https://github.com/apache/flink/pull/12283#issuecomment-632966730


   @sjwiesman Thank you very much for your review. Very useful. I will modify 
them.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114479#comment-17114479
 ] 

Dian Fu edited comment on FLINK-17883 at 5/23/20, 1:53 AM:
---

[~jark] [~lzljs3620320] Thanks for your reply. It makes sense to me that the 
"insert overwrite" semantics should be specified in the DML instead of the DDL. I 
just noticed the following interface in StatementSet (since 1.11):
{code}
StatementSet addInsert(String targetPath, Table table, boolean overwrite);
{code}
It has also been supported in the Python Table API. So I think users could 
specify "insert overwrite" semantics in both the Java and Python Table APIs via 
this interface since 1.11.
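
For illustration, a minimal usage sketch of this interface in the Java Table API (the sink name and table variable are placeholders):
{code:java}
StatementSet stmtSet = tEnv.createStatementSet();
// overwrite = true gives INSERT OVERWRITE semantics for the registered sink
stmtSet.addInsert("MySink", table, true);
stmtSet.execute();
{code}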



was (Author: dian.fu):
[~jark] [~lzljs3620320] Thanks for your reply. It makes sense to me that the 
"insert overwrite" semantics should be specified in the DML instead of the DDL. I 
just noticed the following interface in StatementSet (since 1.11):
{code}
StatementSet addInsert(String targetPath, Table table, boolean overwrite);
{code}
It's also supported in the Python Table API. So I think users could specify 
"insert overwrite" semantics in both the Java and Python Table APIs via this 
interface since 1.11.


> Unable to configure write mode for FileSystem() connector in PyFlink
> 
>
> Key: FLINK-17883
> URL: https://issues.apache.org/jira/browse/FLINK-17883
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.10.1
>Reporter: Robert Metzger
>Assignee: Nicholas Jiang
>Priority: Major
>
> As a user of PyFlink, I'm getting the following exception:
> {code}
> File or directory /tmp/output already exists. Existing files and directories 
> are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite 
> existing files and directories.
> {code}
> I would like to be able to configure writeMode = OVERWRITE for the FileSystem 
> connector.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] JingsongLi commented on a change in pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector

2020-05-22 Thread GitBox


JingsongLi commented on a change in pull request #12283:
URL: https://github.com/apache/flink/pull/12283#discussion_r429503816



##
File path: docs/dev/table/connectors/filesystem.md
##
@@ -0,0 +1,352 @@
+---
+title: "Hadoop FileSystem Connector"
+nav-title: Hadoop FileSystem
+nav-parent_id: connectors
+nav-pos: 5
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+This connector provides access to partitioned files in filesystems
+supported by the [Flink FileSystem abstraction]({{ 
site.baseurl}}/ops/filesystems/index.html).
+
+The file system connector itself is included in Flink and does not require an 
additional dependency.
+A corresponding format needs to be specified for reading and writing rows from 
and to a file system.
+
+The file system connector allows for reading and writing from a local or 
distributed filesystem. A filesystem can be defined as:
+
+
+
+{% highlight sql %}
+CREATE TABLE MyUserTable (
+  column_name1 INT,
+  column_name2 STRING,
+  ...
+  part_name1 INT,
+  part_name2 STRING
+) PARTITIONED BY (part_name1, part_name2) WITH (
+  'connector' = 'filesystem',   -- required: specify the connector type
+  'path' = 'file:///path/to/whatever',  -- required: path to a directory
+  'format' = '...', -- required: the file system connector 
requires a format to be specified.
+-- Please refer to the Table Formats
+-- section for more details.
+  'partition.default-name' = '...', -- optional: default partition name in 
case the dynamic partition
+-- column value is null/empty string.
+  
+  -- optional: whether to shuffle data by dynamic partition fields 
in the sink phase; this can greatly
+  -- reduce the number of files for the filesystem sink but may lead to data 
skew. Disabled by default.
+  'sink.shuffle-by-partition.enable' = '...',
+  ...
+)
+{% endhighlight %}
+
+
+
+Attention Make sure to include [Flink 
File System specific dependencies]({{ site.baseurl 
}}/internals/filesystems.html).
+
+Attention File system sources for 
streaming are only experimental. In the future, we will support actual 
streaming use cases, i.e., partition and directory monitoring.
+
+## Partition files
+
+Partition support in the file system connector is similar to Hive. But unlike 
Hive,
+which manages partitions through the catalog, a file system table manages 
partitions according to the
+directory structure of the file system. The file system connector discovers 
and infers partitions automatically.
+For example, a table partitioned by datetime and hour has the following 
structure in the file system path:
+
+```
+path
+└── datetime=2019-08-25
+    └── hour=11
+        ├── part-0.parquet
+        ├── part-1.parquet
+    └── hour=12
+        ├── part-0.parquet
+└── datetime=2019-08-26
+    └── hour=6
+        ├── part-0.parquet
+```
+
+The file system table supports partition inserts and overwrite inserts. See 
[INSERT Statement]({{ site.baseurl }}/table/sql/insert.html).
+
+**NOTE:** When you insert overwrite into a partitioned table, only the 
corresponding partition will be overwritten, not the entire table.
+
+## File Formats
+
+The file system connector supports multiple formats:
+
+ - CSV: [RFC-4180](https://tools.ietf.org/html/rfc4180). Uncompressed.
+ - JSON: Note JSON format for file system connector is not a typical JSON 
file. It is [Newline-delimited JSON](http://jsonlines.org/). Uncompressed.
+ - Avro: [Apache Avro](http://avro.apache.org). Support compression by 
configuring `avro.codec`.
+ - Parquet: [Apache Parquet](http://parquet.apache.org). Compatible with Hive.
+ - Orc: [Apache Orc](http://orc.apache.org). Compatible with Hive.
+
+## Streaming sink
+
+The file system connector supports a streaming sink; it uses the [Streaming File 
Sink]({{ site.baseurl }}/connectors/streamfile_sink.html)
+to write records to files. Row-encoded formats are csv and json; bulk-encoded 
formats are parquet, orc and avro.
+
+### Rolling policy
+
+Data within the partition directories is split into part files. Each 
partition will contain at least one part file for
+each subtask of the sink that has received data for that partition. The 
in-progress part file will be closed and an additional
+part file will be created according to the configurable rolling policy. The 
policy rolls part files based on size and
+a timeout that specifies the maximum duration for which a file can stay open.
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th>Key</th>
+      <th>Default</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>sink.rolling-policy.file-size</td>
+      <td>1024L * 1024L * 128L</td>
+      <td>Long</td>
+      <td>The maximum part file size before rolling.</td>
+    </tr>
+    <tr>
+      <td>sink.rolling-policy.time-interval</td>
+      <td>30 m</td>
+      <td>Duration</td>
+      <td>The maximum time duration a part file can stay open before rolling 
(by default 30 min, to avoid too many small files).</td>
+    </tr>
+  </tbody>
+</table>
+
+**NOTE:** For bulk formats 

[GitHub] [flink] JingsongLi commented on a change in pull request #12283: [FLINK-16975][documentation] Add docs for FileSystem connector

2020-05-22 Thread GitBox


JingsongLi commented on a change in pull request #12283:
URL: https://github.com/apache/flink/pull/12283#discussion_r429503816



##
File path: docs/dev/table/connectors/filesystem.md
##
@@ -0,0 +1,352 @@
+---
+title: "Hadoop FileSystem Connector"
+nav-title: Hadoop FileSystem
+nav-parent_id: connectors
+nav-pos: 5
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+This connector provides access to partitioned files in filesystems
+supported by the [Flink FileSystem abstraction]({{ 
site.baseurl}}/ops/filesystems/index.html).
+
+The file system connector itself is included in Flink and does not require an 
additional dependency.
+A corresponding format needs to be specified for reading and writing rows from 
and to a file system.
+
+The file system connector allows for reading and writing from a local or 
distributed filesystem. A filesystem can be defined as:
+
+
+
+{% highlight sql %}
+CREATE TABLE MyUserTable (
+  column_name1 INT,
+  column_name2 STRING,
+  ...
+  part_name1 INT,
+  part_name2 STRING
+) PARTITIONED BY (part_name1, part_name2) WITH (
+  'connector' = 'filesystem',   -- required: specify to connector type
+  'path' = 'file:///path/to/whatever',  -- required: path to a directory
+  'format' = '...', -- required: file system connector 
requires to specify a format,
+-- Please refer to Table Formats
+-- section for more details.s
+  'partition.default-name' = '...', -- optional: default partition name in 
case the dynamic partition
+-- column value is null/empty string.
+  
+  -- optional: the option to enable shuffle data by dynamic partition fields 
in sink phase, this can greatly
+  -- reduce the number of file for filesystem sink but may lead data skew, the 
default value is disabled.
+  'sink.shuffle-by-partition.enable' = '...',
+  ...
+)
+{% endhighlight %}
+
+
+
+Attention Make sure to include [Flink 
File System specific dependencies]({{ site.baseurl 
}}/internals/filesystems.html).
+
+Attention File system sources for 
streaming are only experimental. In the future, we will support actual 
streaming use cases, i.e., partition and directory monitoring.
+
+## Partition files
+
+Partitioning in the file system connector is similar to Hive. Unlike Hive, however, which
+manages partitions through its catalog, a file system table manages partitions according to the
+directory structure of the file system. The file system connector discovers and infers
+partitions automatically.
+For example, a table partitioned by datetime and hour has the following structure in the file
+system path:
+
+```
+path
+└── datetime=2019-08-25
+└── hour=11
+├── part-0.parquet
+├── part-1.parquet
+└── hour=12
+├── part-0.parquet
+└── datetime=2019-08-26
+└── hour=6
+├── part-0.parquet
+```
+
+The file system table supports both partition inserting and overwrite inserting. See [INSERT Statement]({{ site.baseurl }}/table/sql/insert.html).
+
+**NOTE:** When you insert overwrite into a partitioned table, only the corresponding partition will be overwritten, not the entire table.
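+
+For example, the following statement (a sketch based on the example table above; `AnotherTable` is a placeholder source table) replaces only the partition it names:
+
+{% highlight sql %}
+-- Only partition (part_name1=1, part_name2='A') of MyUserTable is replaced;
+-- all other partitions are left untouched.
+INSERT OVERWRITE MyUserTable PARTITION (part_name1=1, part_name2='A')
+  SELECT column_name1, column_name2 FROM AnotherTable;
+{% endhighlight %}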
+
+## File Formats
+
+The file system connector supports multiple formats:
+
+ - CSV: [RFC-4180](https://tools.ietf.org/html/rfc4180). Uncompressed.
+ - JSON: Note that the JSON format for the file system connector is not a typical JSON file but [newline-delimited JSON](http://jsonlines.org/). Uncompressed.
+ - Avro: [Apache Avro](http://avro.apache.org). Supports compression by configuring `avro.codec`.
+ - Parquet: [Apache Parquet](http://parquet.apache.org). Compatible with Hive.
+ - Orc: [Apache Orc](http://orc.apache.org). Compatible with Hive.
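+
+For example, a sketch of enabling Avro compression (assuming `avro.codec` accepts standard Avro codec names such as `snappy`):
+
+{% highlight sql %}
+CREATE TABLE CompressedAvroTable (
+  user_id BIGINT,
+  message STRING
+) WITH (
+  'connector' = 'filesystem',
+  'path' = 'file:///path/to/avro-output',
+  'format' = 'avro',
+  'avro.codec' = 'snappy'  -- assumption: snappy is one of the supported codec names
+)
+{% endhighlight %}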
+
+## Streaming sink
+
+The file system connector supports streaming writes. It uses the [Streaming File Sink]({{ site.baseurl }}/connectors/streamfile_sink.html)
+to write records to files. Row-encoded formats are csv and json; bulk-encoded formats are parquet, orc and avro.
+
+### Rolling policy
+
+Data within the partition directories is split into part files. Each partition will contain at least one part file for
+each subtask of the sink that has received data for that partition. The in-progress part file will be closed and an additional
+part file will be created according to the configurable rolling policy. The policy rolls part files based on their size and on
+a timeout that specifies the maximum duration for which a file can stay open.
+
+
+  
+
+Key
+Default
+Type
+Description
+
+  
+  
+
+sink.rolling-policy.file-size
+1024L * 1024L * 128L
+Long
+The maximum part file size before rolling.
+
+
+sink.rolling-policy.time-interval
+30 m
+Duration
+The maximum time duration a part file can stay open before rolling (by default 30 min, to avoid too many small files).
+
+  
+
+
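+These options are set in the table's WITH clause. A minimal sketch (assuming the value syntax follows the types listed above, i.e. a byte count for the Long option and a duration string for the Duration option):
+
+{% highlight sql %}
+CREATE TABLE RollingTable (
+  user_id BIGINT,
+  message STRING
+) WITH (
+  'connector' = 'filesystem',
+  'path' = 'file:///path/to/json-output',
+  'format' = 'json',
+  'sink.rolling-policy.file-size' = '67108864',   -- roll part files at 64 MiB, expressed in bytes
+  'sink.rolling-policy.time-interval' = '15 min'  -- or after a part file has been open 15 minutes
+)
+{% endhighlight %}
+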
+**NOTE:** For bulk formats 

[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114479#comment-17114479
 ] 

Dian Fu commented on FLINK-17883:
-

[~jark] [~lzljs3620320] Thanks for your reply. It makes sense to me that the 
"insert overwrite" semantics should be specified in the DML instead of the DDL. I 
just noticed the following interface in StatementSet (since 1.11):
{code}
StatementSet addInsert(String targetPath, Table table, boolean overwrite);
{code}
It's also supported in the Python Table API. So I think users could specify 
"insert overwrite" semantics in both the Java/Python Table API via this 
interface since 1.11.
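For illustration, a minimal Java sketch (hypothetical table names; {{tEnv}} is a {{TableEnvironment}} with {{source_table}} and {{sink_table}} already registered):
{code:java}
// The boolean flag requests OVERWRITE semantics for this particular insert,
// instead of configuring a write mode on the connector itself.
StatementSet statementSet = tEnv.createStatementSet();
statementSet.addInsert("sink_table", tEnv.from("source_table"), true);
statementSet.execute();
{code}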


> Unable to configure write mode for FileSystem() connector in PyFlink
> 
>
> Key: FLINK-17883
> URL: https://issues.apache.org/jira/browse/FLINK-17883
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.10.1
>Reporter: Robert Metzger
>Assignee: Nicholas Jiang
>Priority: Major
>
> As a user of PyFlink, I'm getting the following exception:
> {code}
> File or directory /tmp/output already exists. Existing files and directories 
> are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite 
> existing files and directories.
> {code}
> I would like to be able to configure writeMode = OVERWRITE for the FileSystem 
> connector.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632945310


   
   ## CI report:
   
   * 52b0395a9fc4d3fa99def97fe79b42edef32bcf6 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2064)
 
   * a1304a3c20a9ba848ddc3308d6da9ecab8a85a5b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2065)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632945310


   
   ## CI report:
   
   * 52b0395a9fc4d3fa99def97fe79b42edef32bcf6 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2064)
 
   * a1304a3c20a9ba848ddc3308d6da9ecab8a85a5b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu commented on pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on pull request #8885:
URL: https://github.com/apache/flink/pull/8885#issuecomment-632947709


   Replaced with https://github.com/apache/flink/pull/12297



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu commented on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632947574


   @aljoscha 
   Here is the new PR. I will make the change for other window assigners after 
this one is approved.
   
   Thank you,
   Niel



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-17892) Dynamic option may not be a part of the table digest

2020-05-22 Thread hailong wang (Jira)
hailong wang created FLINK-17892:


 Summary: Dynamic option may not be a part of the table digest
 Key: FLINK-17892
 URL: https://issues.apache.org/jira/browse/FLINK-17892
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.11.0
Reporter: hailong wang
 Fix For: 1.12.0


For now, table properties are not part of the table digest, but dynamic options 
are included.

This can lead to errors when a plan is reused.

If I define a Kafka table:
{code:java}
CREATE TABLE KAFKA (
……
) with (
topic = 'xx',
groupid = 'xxx'
……
)
Insert into sinktable select * from KAFKA;
Insert into sinktable1 select * from KAFKA;{code}
The KAFKA source will be reused for the SQL above.

But if I add different table hints to the DML, like:
{code:java}
Insert into sinktable select * from KAFKA /*+ OPTIONS('k1' = 'v1')*/;
Insert into sinktable1 select * from KAFKA /*+ OPTIONS('k2' = 'v2')*/;
{code}
There will be two Kafka table sources using the same group id to consume the 
same topic.

So I think dynamic options may not need to be part of the table digest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] TengHu commented on a change in pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on a change in pull request #8885:
URL: https://github.com/apache/flink/pull/8885#discussion_r429491587



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/assigners/TumblingProcessingTimeWindows.java
##
@@ -46,28 +46,44 @@
 
private final long size;
 
-   private final long offset;
+   private final long globalOffset;
 
-   private TumblingProcessingTimeWindows(long size, long offset) {
+   private Long staggerOffset = null;
+
+   private final WindowStagger windowStagger;
+
+   private TumblingProcessingTimeWindows(long size, long offset, 
WindowStagger windowStagger) {
if (Math.abs(offset) >= size) {
throw new 
IllegalArgumentException("TumblingProcessingTimeWindows parameters must satisfy 
abs(offset) < size");
}
-
this.size = size;
-   this.offset = offset;
+   this.globalOffset = offset;
+   this.windowStagger = windowStagger;
}
 
@Override
public Collection assignWindows(Object element, long 
timestamp, WindowAssignerContext context) {
final long now = context.getCurrentProcessingTime();
-   long start = TimeWindow.getWindowStartWithOffset(now, offset, 
size);
+
+   if (staggerOffset == null) {
+   staggerOffset = 
windowStagger.getStaggerOffset(context.getCurrentProcessingTime(), size);

Review comment:
   The intention was to align the pane to its first event. In our 
experience, the offsets among partitions were usually caused by the design of 
partition keys, network delays, etc., which led to normally distributed 
staggering.
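   
   For context, a minimal usage sketch (hypothetical pipeline; `stream`, `event.getKey()` and `merge` are placeholders, and `WindowStagger.RANDOM` stands in for whichever stagger strategy the final API exposes):
   
   ```java
   // Each pane samples its own stagger offset at runtime, so the triggers of
   // different panes no longer all fire at the same instant.
   stream
       .keyBy(event -> event.getKey())
       .window(TumblingProcessingTimeWindows.of(Time.minutes(1), Time.seconds(0), WindowStagger.RANDOM))
       .reduce((a, b) -> a.merge(b));
   ```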





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu closed pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu closed pull request #8885:
URL: https://github.com/apache/flink/pull/8885


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


flinkbot commented on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632945310


   
   ## CI report:
   
   * 52b0395a9fc4d3fa99def97fe79b42edef32bcf6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu commented on a change in pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on a change in pull request #8885:
URL: https://github.com/apache/flink/pull/8885#discussion_r429490081



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/assigners/TumblingProcessingTimeWindows.java
##
@@ -107,7 +123,48 @@ public static TumblingProcessingTimeWindows of(Time size) {
 * @return The time policy.
 */
public static TumblingProcessingTimeWindows of(Time size, Time offset) {
-   return new TumblingProcessingTimeWindows(size.toMilliseconds(), 
offset.toMilliseconds());
+   return new TumblingProcessingTimeWindows(size.toMilliseconds(), 
offset.toMilliseconds(), WindowStagger.ALIGNED);
+   }
+
+   /**
+* Creates a new {@code TumblingProcessingTimeWindows} {@link 
WindowAssigner} that assigns
+* elements to time windows based on the element timestamp, offset and 
a staggering offset sampled
+* from uniform distribution(0, window size) for each pane.
+*
+* @param size The size of the generated windows.
+* @param offset The offset which window start would be shifted by.
+* @param windowStagger The utility that produces staggering offset in 
runtime.
+*
+* @return The time policy.
+*/
+   public static TumblingProcessingTimeWindows of(Time size, Time offset, 
WindowStagger windowStagger) throws Exception {
+   return new TumblingProcessingTimeWindows(size.toMilliseconds(), 
offset.toMilliseconds(), windowStagger);
+   }
+
+   /**
+* Creates a new {@code TumblingProcessingTimeWindows} {@link 
WindowAssigner} that assigns
+* elements to time windows based on the element timestamp and a 
staggering offset sampled
+* from uniform distribution(0, window size) for each pane.
+*
+* @param size The size of the generated windows.
+* @return The time policy.
+*/
+   public static TumblingProcessingTimeWindows withStaggerOf(Time size) {

Review comment:
   I removed those in the new PR. I agree that `withStaggerOf` is a 
confusing name; any name suggestions for a shortcut method with random stagger? 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu commented on pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on pull request #8885:
URL: https://github.com/apache/flink/pull/8885#issuecomment-632943368


   > @TengHu have you been successfully using this change in production? We've 
run into a similar challenge with window alignment, would love to see this 
change merged and hear about success/or challenges with it in your usage.
   
   Yes, we deployed this into some of our production pipelines a year ago. I 
also briefly talked about this during my [flink forward 
presentation](https://youtu.be/9U8ksIqrgLM?t=1484).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu commented on pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on pull request #8885:
URL: https://github.com/apache/flink/pull/8885#issuecomment-632942879


   > This is a very good feature! Sorry for taking so (extremely) long to 
finally review this. I didn't initially see this PR.
   > 
   > Could you please address my comments?
   > 
   > Also, when creating a PR the individual commits should also have summaries 
according to the Jira issue, i.e. `[FLINK-X] Add ...`.
   
   Thanks for getting back to me. I've lost the original fork (it's been a year 
since I did this change), so I created a new PR 
https://github.com/apache/flink/pull/12297.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu commented on pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu commented on pull request #8885:
URL: https://github.com/apache/flink/pull/8885#issuecomment-632942543


   > ## What is the purpose of the change
   > Flink triggers all panes belonging to one window at the same time. In 
other words, all panes are aligned and their triggers all fire simultaneously, 
causing the spiking workload effect. This pull request adds WindowStagger to 
generate staggering offset for each window assignment at runtime, so the 
workloads are distributed across time. Hence each window assignment is based on 
window size, window offset and staggering offset (generated in runtime).
   > 
   > This change only modifies TumblingProcessingTimeWindows, will send out 
other windows change in other PRs.
   > 
   > ## Brief change log
   > * _Add WindowStagger for generating staggering offsets_
   > * _Enable TumblingProcessingTimeWindows to generate staggering offsets if 
user enabled_
   > 
   > ## Verifying this change
   > This change is already covered by existing tests, such as 
_TumblingProcessingTimeWindowsTest_.
   > 
   > This change added tests and can be verified as follows:
   > 
   > * _Added unit tests for WindowStagger_
   > * _Validated the change by running in our clusters with 3500 task managers 
in total on a stateful streaming program using sliding and tumbling windowing. 
Some dashboards are shown below_
   > 
   > 
![](https://camo.githubusercontent.com/298c26fefe598c1b4605a83f84451adfde4fae50/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835362f737461676765725f77696e646f775f64656c61792e706e67)
   > 
![](https://camo.githubusercontent.com/3c4769bd8b7352c8860eaaa4bc6bbffc64104975/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835372f737461676765725f77696e646f775f7468726f7567687075742e706e67)
   > 
![](https://camo.githubusercontent.com/5fd01124c4c21d9388eab78c498c928ff6a651db/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835392f737461676765725f77696e646f772e706e67)
   > 
   > _some system metrics_
   > 
   > 
![buffers_in_queue](https://user-images.githubusercontent.com/10646097/60139232-00f29d80-9762-11e9-84f4-3bfbde28c028.png)
   > 
![buffer_usage](https://user-images.githubusercontent.com/10646097/60139234-03ed8e00-9762-11e9-99d1-de845d02a8c6.png)
   > 
![output_record_rate](https://user-images.githubusercontent.com/10646097/60139237-064fe800-9762-11e9-959d-2db96c7f7bf6.png)
   > 
   > ## Does this pull request potentially affect one of the following parts:
   > * Dependencies (does it add or upgrade a dependency): no
   > * The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes, TumblingProcessingTimeWindows(potentially all 
WindowAssigners)
   > * The serializers: no
   > * The runtime per-record code paths (performance sensitive): don't know, 
probably no ?
   > * Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
   > * The S3 file system connector: no
   > 
   > ## Documentation
   > * Does this pull request introduce a new feature? yes
   > * If yes, how is the feature documented? JavaDocs
   
   
   
   > This is a very good feature! Sorry for taking so (extremely) long to 
finally review this. I didn't initially see this PR.
   > 
   > Could you please address my comments?
   > 
   > Also, when creating a PR the individual commits should also have summaries 
according to the Jira issue, i.e. `[FLINK-X] Add ...`.
   
   Thank you for getting back to me. I lost the original fork (it's been a year 
since I made this change) so I created a new PR 
https://github.com/apache/flink/pull/12297
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu removed a comment on pull request #8885: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu removed a comment on pull request #8885:
URL: https://github.com/apache/flink/pull/8885#issuecomment-632942543


   > ## What is the purpose of the change
   > Flink triggers all panes belonging to one window at the same time. In 
other words, all panes are aligned and their triggers all fire simultaneously, 
causing the spiking workload effect. This pull request adds WindowStagger to 
generate staggering offset for each window assignment at runtime, so the 
workloads are distributed across time. Hence each window assignment is based on 
window size, window offset and staggering offset (generated in runtime).
   > 
   > This change only modifies TumblingProcessingTimeWindows, will send out 
other windows change in other PRs.
   > 
   > ## Brief change log
   > * _Add WindowStagger for generating staggering offsets_
   > * _Enable TumblingProcessingTimeWindows to generate staggering offsets if 
user enabled_
   > 
   > ## Verifying this change
   > This change is already covered by existing tests, such as 
_TumblingProcessingTimeWindowsTest_.
   > 
   > This change added tests and can be verified as follows:
   > 
   > * _Added unit tests for WindowStagger_
   > * _Validated the change by running in our clusters with 3500 task managers 
in total on a stateful streaming program using sliding and tumbling windowing. 
Some dashboards are shown below_
   > 
   > 
![](https://camo.githubusercontent.com/298c26fefe598c1b4605a83f84451adfde4fae50/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835362f737461676765725f77696e646f775f64656c61792e706e67)
   > 
![](https://camo.githubusercontent.com/3c4769bd8b7352c8860eaaa4bc6bbffc64104975/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835372f737461676765725f77696e646f775f7468726f7567687075742e706e67)
   > 
![](https://camo.githubusercontent.com/5fd01124c4c21d9388eab78c498c928ff6a651db/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835392f737461676765725f77696e646f772e706e67)
   > 
   > _some system metrics_
   > 
   > 
![buffers_in_queue](https://user-images.githubusercontent.com/10646097/60139232-00f29d80-9762-11e9-84f4-3bfbde28c028.png)
   > 
![buffer_usage](https://user-images.githubusercontent.com/10646097/60139234-03ed8e00-9762-11e9-99d1-de845d02a8c6.png)
   > 
![output_record_rate](https://user-images.githubusercontent.com/10646097/60139237-064fe800-9762-11e9-959d-2db96c7f7bf6.png)
   > 
   > ## Does this pull request potentially affect one of the following parts:
   > * Dependencies (does it add or upgrade a dependency): no
   > * The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes, TumblingProcessingTimeWindows(potentially all 
WindowAssigners)
   > * The serializers: no
   > * The runtime per-record code paths (performance sensitive): don't know, 
probably no ?
   > * Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
   > * The S3 file system connector: no
   > 
   > ## Documentation
   > * Does this pull request introduce a new feature? yes
   > * If yes, how is the feature documented? JavaDocs
   
   
   
   > This is a very good feature! Sorry for taking so (extremely) long to 
finally review this. I didn't initially see this PR.
   > 
   > Could you please address my comments?
   > 
   > Also, when creating a PR the individual commits should also have summaries 
according to the Jira issue, i.e. `[FLINK-X] Add ...`.
   
   Thank you for getting back to me. I lost the original fork (it's been a year 
since I made this change) so I created a new PR 
https://github.com/apache/flink/pull/12297
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


flinkbot commented on pull request #12297:
URL: https://github.com/apache/flink/pull/12297#issuecomment-632942068


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 8851d6a45084f93f4ee25085dcd135dc8cde2e1f (Fri May 22 
23:24:48 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TengHu opened a new pull request #12297: [FLINK-12855] [streaming-java][window-assigners] Add functionality that staggers panes on partitions to distribute workload.

2020-05-22 Thread GitBox


TengHu opened a new pull request #12297:
URL: https://github.com/apache/flink/pull/12297


   > ## What is the purpose of the change
   > Flink triggers all panes belonging to one window at the same time. In 
other words, all panes are aligned and their triggers all fire simultaneously, 
causing the spiking workload effect. This pull request adds WindowStagger to 
generate staggering offset for each window assignment at runtime, so the 
workloads are distributed across time. Hence each window assignment is based on 
window size, window offset and staggering offset (generated in runtime).
   > 
   > This change only modifies TumblingProcessingTimeWindows, will send out 
other windows change in other PRs.
   > 
   > ## Brief change log
   > * _Add WindowStagger for generating staggering offsets_
   > * _Enable TumblingProcessingTimeWindows to generate staggering offsets if 
user enabled_
   > 
   > ## Verifying this change
   > This change is already covered by existing tests, such as 
_TumblingProcessingTimeWindowsTest_.
   > 
   > This change added tests and can be verified as follows:
   > 
   > * _Added unit tests for WindowStagger_
   > * _Validated the change by running in our clusters with 3500 task managers 
in total on a stateful streaming program using sliding and tumbling windowing. 
Some dashboards are shown below_
   > 
   > 
![](https://camo.githubusercontent.com/298c26fefe598c1b4605a83f84451adfde4fae50/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835362f737461676765725f77696e646f775f64656c61792e706e67)
   > 
![](https://camo.githubusercontent.com/3c4769bd8b7352c8860eaaa4bc6bbffc64104975/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835372f737461676765725f77696e646f775f7468726f7567687075742e706e67)
   > 
![](https://camo.githubusercontent.com/5fd01124c4c21d9388eab78c498c928ff6a651db/68747470733a2f2f6973737565732e6170616368652e6f72672f6a6972612f7365637572652f6174746163686d656e742f31323937313835392f737461676765725f77696e646f772e706e67)
   > 
   > _some system metrics_
   > 
   > 
![buffers_in_queue](https://user-images.githubusercontent.com/10646097/60139232-00f29d80-9762-11e9-84f4-3bfbde28c028.png)
   > 
![buffer_usage](https://user-images.githubusercontent.com/10646097/60139234-03ed8e00-9762-11e9-99d1-de845d02a8c6.png)
   > 
![output_record_rate](https://user-images.githubusercontent.com/10646097/60139237-064fe800-9762-11e9-959d-2db96c7f7bf6.png)
   > 
   > ## Does this pull request potentially affect one of the following parts:
   > * Dependencies (does it add or upgrade a dependency): no
   > * The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes, TumblingProcessingTimeWindows(potentially all 
WindowAssigners)
   > * The serializers: no
   > * The runtime per-record code paths (performance sensitive): don't know, 
probably no ?
   > * Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
   > * The S3 file system connector: no
   > 
   > ## Documentation
   > * Does this pull request introduce a new feature? yes
   > * If yes, how is the feature documented? JavaDocs
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12296: [FLINK-17814][chinese-translation]Translate native kubernetes document to Chinese

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12296:
URL: https://github.com/apache/flink/pull/12296#issuecomment-632902501


   
   ## CI report:
   
   * 22bae967d6b59eb8e4c5dcd6a37f5a27fc492ed1 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2063)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12199: [FLINK-17774] [table] supports all kinds of changes for select result

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12199:
URL: https://github.com/apache/flink/pull/12199#issuecomment-629793563


   
   ## CI report:
   
   * 5b62118449cdf8d0de8d5b98781fdff9c2d0c571 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2057)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #12296: [FLINK-17814][chinese-translation]Translate native kubernetes document to Chinese

2020-05-22 Thread GitBox


flinkbot commented on pull request #12296:
URL: https://github.com/apache/flink/pull/12296#issuecomment-632902501


   
   ## CI report:
   
   * 22bae967d6b59eb8e4c5dcd6a37f5a27fc492ed1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #12296: [FLINK-17814][chinese-translation]Translate native kubernetes document to Chinese

2020-05-22 Thread GitBox


flinkbot commented on pull request #12296:
URL: https://github.com/apache/flink/pull/12296#issuecomment-632882488


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 22bae967d6b59eb8e4c5dcd6a37f5a27fc492ed1 (Fri May 22 
19:38:35 UTC 2020)
   
✅no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] caozhen1937 commented on pull request #12296: [FLINK-17814][chinese-translation]Translate native kubernetes document to Chinese

2020-05-22 Thread GitBox


caozhen1937 commented on pull request #12296:
URL: https://github.com/apache/flink/pull/12296#issuecomment-632881152


   @wangyang0918 hi, please review if you have time.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-17814) Translate native kubernetes document to Chinese

2020-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-17814:
---
Labels: pull-request-available  (was: )

> Translate native kubernetes document to Chinese
> ---
>
> Key: FLINK-17814
> URL: https://issues.apache.org/jira/browse/FLINK-17814
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Documentation
>Reporter: Yang Wang
>Assignee: CaoZhen
>Priority: Major
>  Labels: pull-request-available
>
> [https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html]
>  
> Translate the native kubernetes document to Chinese.
> English updated in 7723774a0402e10bc914b1fa6128e3c80678dafe



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] caozhen1937 opened a new pull request #12296: [FLINK-17814][chinese-translation]Translate native kubernetes document to Chinese

2020-05-22 Thread GitBox


caozhen1937 opened a new pull request #12296:
URL: https://github.com/apache/flink/pull/12296


   
   
   ## What is the purpose of the change
   
   Translate native kubernetes document to Chinese



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12285: [FLINK-17445][State Processor] Add Scala support for OperatorTransformation

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12285:
URL: https://github.com/apache/flink/pull/12285#issuecomment-632360179


   
   ## CI report:
   
   * 10e16dd7e0679b6d5cd4517d7f3ebd006ce45b4d Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2060)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12285: [FLINK-17445][State Processor] Add Scala support for OperatorTransformation

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12285:
URL: https://github.com/apache/flink/pull/12285#issuecomment-632360179


   
   ## CI report:
   
   * 754d93703e9ecf3043b9bf57121d1636a3a4c167 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2018)
 
   * 10e16dd7e0679b6d5cd4517d7f3ebd006ce45b4d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-13553) KvStateServerHandlerTest.readInboundBlocking unstable on Travis

2020-05-22 Thread Gary Yao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114285#comment-17114285
 ] 

Gary Yao commented on FLINK-13553:
--

Added more debug/trace logs.

1.11:
7cbdd91413ee26d00d9015581ce2fa8538fd5963
3e18c109051821176575a15a6b10aaa5cc2e3e12 

master: 
564e8802a8f1a8c92d3c46686b109dfb826856fe
0cc7aae86dfdb5e51c661620d39caa79a16fd647



> KvStateServerHandlerTest.readInboundBlocking unstable on Travis
> ---
>
> Key: FLINK-13553
> URL: https://issues.apache.org/jira/browse/FLINK-13553
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Queryable State
>Affects Versions: 1.10.0, 1.11.0
>Reporter: Till Rohrmann
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available, test-stability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{KvStateServerHandlerTest.readInboundBlocking}} and 
> {{KvStateServerHandlerTest.testQueryExecutorShutDown}} fail on Travis with a 
> {{TimeoutException}}.
> https://api.travis-ci.org/v3/job/566420641/log.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] GJL closed pull request #12276: [FLINK-13553][tests][qs] Improve Logging to debug Test Instability

2020-05-22 Thread GitBox


GJL closed pull request #12276:
URL: https://github.com/apache/flink/pull/12276


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-22 Thread Shangwen Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114251#comment-17114251
 ] 

Shangwen Tang commented on FLINK-17891:
---

This is a simple fix. Please assign this issue to me and I can submit the code 
if possible.

 

>  FlinkYarnSessionCli sets wrong execution.target type
> -
>
> Key: FLINK-17891
> URL: https://issues.apache.org/jira/browse/FLINK-17891
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Shangwen Tang
>Priority: Major
> Attachments: image-2020-05-23-00-59-32-702.png, 
> image-2020-05-23-01-00-19-549.png
>
>
> I submitted a flink session job at the local YARN cluster, and I found that 
> the *execution.target* is of the wrong type, which should be of yarn-session 
> type
> !image-2020-05-23-00-59-32-702.png|width=545,height=75!
> !image-2020-05-23-01-00-19-549.png|width=544,height=94!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-22 Thread Shangwen Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114246#comment-17114246
 ] 

Shangwen Tang edited comment on FLINK-17891 at 5/22/20, 5:14 PM:
-

My idea is that if we start the Flink session job with FlinkYarnSessionCli, we 
should set the *execution.target* to the yarn-session type, not yarn-per-job. 
The problem is in this line:
{code:java}
effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
{code}
{code:java}
// FlinkYarnSessionCli.java
@Override
public Configuration applyCommandLineOptionsToConfiguration(CommandLine 
commandLine) throws FlinkException {
   // we ignore the addressOption because it can only contain "yarn-cluster"
   final Configuration effectiveConfiguration = new 
Configuration(configuration);

   applyDescriptorOptionToConfig(commandLine, effectiveConfiguration);

   final ApplicationId applicationId = getApplicationId(commandLine);
   if (applicationId != null) {
  final String zooKeeperNamespace;
  if (commandLine.hasOption(zookeeperNamespace.getOpt())){
 zooKeeperNamespace = 
commandLine.getOptionValue(zookeeperNamespace.getOpt());
  } else {
 zooKeeperNamespace = effectiveConfiguration.getString(HA_CLUSTER_ID, 
applicationId.toString());
  }

  effectiveConfiguration.setString(HA_CLUSTER_ID, zooKeeperNamespace);
  effectiveConfiguration.setString(YarnConfigOptions.APPLICATION_ID, 
ConverterUtils.toString(applicationId));
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnSessionClusterExecutor.NAME);
   } else {
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
   }
 ...
}{code}


was (Author: tangshangwen):
My idea is that if we start the flink session job with FlinkYarnSessionCli, we 
should set the *execution.target* to be of yarn-session type, not yarn-per-job. 
and that's where the problem comes in 
`effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);`
{code:java}
// FlinkYarnSessionCli.java
@Override
public Configuration applyCommandLineOptionsToConfiguration(CommandLine 
commandLine) throws FlinkException {
   // we ignore the addressOption because it can only contain "yarn-cluster"
   final Configuration effectiveConfiguration = new 
Configuration(configuration);

   applyDescriptorOptionToConfig(commandLine, effectiveConfiguration);

   final ApplicationId applicationId = getApplicationId(commandLine);
   if (applicationId != null) {
  final String zooKeeperNamespace;
  if (commandLine.hasOption(zookeeperNamespace.getOpt())){
 zooKeeperNamespace = 
commandLine.getOptionValue(zookeeperNamespace.getOpt());
  } else {
 zooKeeperNamespace = effectiveConfiguration.getString(HA_CLUSTER_ID, 
applicationId.toString());
  }

  effectiveConfiguration.setString(HA_CLUSTER_ID, zooKeeperNamespace);
  effectiveConfiguration.setString(YarnConfigOptions.APPLICATION_ID, 
ConverterUtils.toString(applicationId));
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnSessionClusterExecutor.NAME);
   } else {
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
   }
 ...
}{code}

>  FlinkYarnSessionCli sets wrong execution.target type
> -
>
> Key: FLINK-17891
> URL: https://issues.apache.org/jira/browse/FLINK-17891
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Shangwen Tang
>Priority: Major
> Attachments: image-2020-05-23-00-59-32-702.png, 
> image-2020-05-23-01-00-19-549.png
>
>
> I submitted a flink session job at the local YARN cluster, and I found that 
> the *execution.target* is of the wrong type, which should be of yarn-session 
> type
> !image-2020-05-23-00-59-32-702.png|width=545,height=75!
> !image-2020-05-23-01-00-19-549.png|width=544,height=94!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-22 Thread Shangwen Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114246#comment-17114246
 ] 

Shangwen Tang edited comment on FLINK-17891 at 5/22/20, 5:14 PM:
-

My idea is that if we start the Flink session job with FlinkYarnSessionCli, we 
should set the *execution.target* to the yarn-session type, not yarn-per-job. 
The problem is in this line:
{code:java}
effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
{code}
 
{code:java}
// FlinkYarnSessionCli.java
@Override
public Configuration applyCommandLineOptionsToConfiguration(CommandLine 
commandLine) throws FlinkException {
   // we ignore the addressOption because it can only contain "yarn-cluster"
   final Configuration effectiveConfiguration = new 
Configuration(configuration);

   applyDescriptorOptionToConfig(commandLine, effectiveConfiguration);

   final ApplicationId applicationId = getApplicationId(commandLine);
   if (applicationId != null) {
  final String zooKeeperNamespace;
  if (commandLine.hasOption(zookeeperNamespace.getOpt())){
 zooKeeperNamespace = 
commandLine.getOptionValue(zookeeperNamespace.getOpt());
  } else {
 zooKeeperNamespace = effectiveConfiguration.getString(HA_CLUSTER_ID, 
applicationId.toString());
  }

  effectiveConfiguration.setString(HA_CLUSTER_ID, zooKeeperNamespace);
  effectiveConfiguration.setString(YarnConfigOptions.APPLICATION_ID, 
ConverterUtils.toString(applicationId));
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnSessionClusterExecutor.NAME);
   } else {
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
   }
 ...
}{code}


was (Author: tangshangwen):
My idea is that if we start the flink session job with FlinkYarnSessionCli, we 
should set the *execution.target* to be of yarn-session type, not yarn-per-job. 
and the problem is in this line

{code:java}
effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
{code}
{code:java}
// FlinkYarnSessionCli.java
@Override
public Configuration applyCommandLineOptionsToConfiguration(CommandLine 
commandLine) throws FlinkException {
   // we ignore the addressOption because it can only contain "yarn-cluster"
   final Configuration effectiveConfiguration = new 
Configuration(configuration);

   applyDescriptorOptionToConfig(commandLine, effectiveConfiguration);

   final ApplicationId applicationId = getApplicationId(commandLine);
   if (applicationId != null) {
  final String zooKeeperNamespace;
  if (commandLine.hasOption(zookeeperNamespace.getOpt())){
 zooKeeperNamespace = 
commandLine.getOptionValue(zookeeperNamespace.getOpt());
  } else {
 zooKeeperNamespace = effectiveConfiguration.getString(HA_CLUSTER_ID, 
applicationId.toString());
  }

  effectiveConfiguration.setString(HA_CLUSTER_ID, zooKeeperNamespace);
  effectiveConfiguration.setString(YarnConfigOptions.APPLICATION_ID, 
ConverterUtils.toString(applicationId));
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnSessionClusterExecutor.NAME);
   } else {
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
   }
 ...
}{code}

>  FlinkYarnSessionCli sets wrong execution.target type
> -
>
> Key: FLINK-17891
> URL: https://issues.apache.org/jira/browse/FLINK-17891
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Shangwen Tang
>Priority: Major
> Attachments: image-2020-05-23-00-59-32-702.png, 
> image-2020-05-23-01-00-19-549.png
>
>
> I submitted a flink session job at the local YARN cluster, and I found that 
> the *execution.target* is of the wrong type, which should be of yarn-session 
> type
> !image-2020-05-23-00-59-32-702.png|width=545,height=75!
> !image-2020-05-23-01-00-19-549.png|width=544,height=94!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-22 Thread Shangwen Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114246#comment-17114246
 ] 

Shangwen Tang commented on FLINK-17891:
---

My idea is that if we start the flink session job with FlinkYarnSessionCli, we 
should set the *execution.target* to be of yarn-session type, not yarn-per-job. 
and that's where the problem comes in 
`effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);`
{code:java}
// FlinkYarnSessionCli.java
@Override
public Configuration applyCommandLineOptionsToConfiguration(CommandLine 
commandLine) throws FlinkException {
   // we ignore the addressOption because it can only contain "yarn-cluster"
   final Configuration effectiveConfiguration = new 
Configuration(configuration);

   applyDescriptorOptionToConfig(commandLine, effectiveConfiguration);

   final ApplicationId applicationId = getApplicationId(commandLine);
   if (applicationId != null) {
  final String zooKeeperNamespace;
  if (commandLine.hasOption(zookeeperNamespace.getOpt())){
 zooKeeperNamespace = 
commandLine.getOptionValue(zookeeperNamespace.getOpt());
  } else {
 zooKeeperNamespace = effectiveConfiguration.getString(HA_CLUSTER_ID, 
applicationId.toString());
  }

  effectiveConfiguration.setString(HA_CLUSTER_ID, zooKeeperNamespace);
  effectiveConfiguration.setString(YarnConfigOptions.APPLICATION_ID, 
ConverterUtils.toString(applicationId));
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnSessionClusterExecutor.NAME);
   } else {
  effectiveConfiguration.setString(DeploymentOptions.TARGET, 
YarnJobClusterExecutor.NAME);
   }
 ...
}{code}

>  FlinkYarnSessionCli sets wrong execution.target type
> -
>
> Key: FLINK-17891
> URL: https://issues.apache.org/jira/browse/FLINK-17891
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Shangwen Tang
>Priority: Major
> Attachments: image-2020-05-23-00-59-32-702.png, 
> image-2020-05-23-01-00-19-549.png
>
>
> I submitted a flink session job at the local YARN cluster, and I found that 
> the *execution.target* is of the wrong type, which should be of yarn-session 
> type
> !image-2020-05-23-00-59-32-702.png|width=545,height=75!
> !image-2020-05-23-01-00-19-549.png|width=544,height=94!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] alpinegizmo commented on a change in pull request #12257: [FLINK-17076][docs] Revamp Kafka Connector Documentation

2020-05-22 Thread GitBox


alpinegizmo commented on a change in pull request #12257:
URL: https://github.com/apache/flink/pull/12257#discussion_r429283693



##
File path: docs/dev/connectors/kafka.md
##
@@ -23,125 +23,59 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+Flink provides an [Apache Kafka](https://kafka.apache.org) connector for reading data from and writing data to Kafka topics with exactly-once guarantees.
+
 * This will be replaced by the TOC
 {:toc}
 
-This connector provides access to event streams served by [Apache 
Kafka](https://kafka.apache.org/).
-
-Flink provides special Kafka Connectors for reading and writing data from/to 
Kafka topics.
-The Flink Kafka Consumer integrates with Flink's checkpointing mechanism to 
provide
-exactly-once processing semantics. To achieve that, Flink does not purely rely 
on Kafka's consumer group
-offset tracking, but tracks and checkpoints these offsets internally as well.
-
-Please pick a package (maven artifact id) and class name for your use-case and 
environment.
-For most users, the `FlinkKafkaConsumer010` (part of `flink-connector-kafka`) 
is appropriate.
-
-
-  
-
-  Maven Dependency
-  Supported since
-  Consumer and 
-  Producer Class name
-  Kafka version
-  Notes
-
-  
-  
-
-flink-connector-kafka-0.10{{ site.scala_version_suffix }}
-1.2.0
-FlinkKafkaConsumer010
-FlinkKafkaProducer010
-0.10.x
-This connector supports https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+timestamps+to+Kafka+message;>Kafka
 messages with timestamps both for producing and consuming.
-
-
-flink-connector-kafka-0.11{{ site.scala_version_suffix }}
-1.4.0
-FlinkKafkaConsumer011
-FlinkKafkaProducer011
-0.11.x
-Since 0.11.x Kafka does not support scala 2.10. This connector 
supports https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging;>Kafka
 transactional messaging to provide exactly once semantic for the 
producer.
-
-
-flink-connector-kafka{{ site.scala_version_suffix }}
-1.7.0
-FlinkKafkaConsumer
-FlinkKafkaProducer
->= 1.0.0
-
-This universal Kafka connector attempts to track the latest version of 
the Kafka client.
-The version of the client it uses may change between Flink releases. 
Starting with Flink 1.9 release, it uses the Kafka 2.2.0 client.
-Modern Kafka clients are backwards compatible with broker versions 
0.10.0 or later.
-However for Kafka 0.11.x and 0.10.x versions, we recommend using 
dedicated
-flink-connector-kafka-0.11{{ site.scala_version_suffix }} and 
flink-connector-kafka-0.10{{ site.scala_version_suffix }} respectively.
-
-
-  
-
-
-Then, import the connector in your maven project:
+## Dependency
+
+Apache Flink ships with multiple Kafka connectors: universal, 0.10, and 0.11.
+This universal Kafka connector attempts to track the latest version of the 
Kafka client.
+The version of the client it uses may change between Flink releases.
+Modern Kafka clients are backwards compatible with broker versions 0.10.0 or 
later.
+For most users, the universal Kafka connector is the most appropriate.
+However, for Kafka 0.11.x and 0.10.x versions, we recommend using the dedicated 
``0.11`` and ``0.10`` connectors respectively.
+For details on Kafka compatibility, please refer to the official [Kafka 
documentation](https://kafka.apache.org/protocol.html#protocol_compatibility).
 
+
+
 {% highlight xml %}
 <dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-connector-kafka{{ site.scala_version_suffix }}</artifactId>
-  <version>{{ site.version }}</version>
+	<groupId>org.apache.flink</groupId>
+	<artifactId>flink-connector-kafka{{ site.scala_version_suffix }}</artifactId>
+	<version>{{ site.version }}</version>
+</dependency>
{% endhighlight %}
+
+
{% highlight xml %}
+<dependency>
+	<groupId>org.apache.flink</groupId>
+	<artifactId>flink-connector-kafka-0.11{{ site.scala_version_suffix }}</artifactId>
+	<version>{{ site.version }}</version>
 </dependency>
 {% endhighlight %}
-
-Note that the streaming connectors are currently not part of the binary 
distribution.
-See how to link with them for cluster execution [here]({{ 
site.baseurl}}/dev/projectsetup/dependencies.html).
-
-## Installing Apache Kafka
-
-* Follow the instructions from [Kafka's 
quickstart](https://kafka.apache.org/documentation.html#quickstart) to download 
the code and launch a server (launching a Zookeeper and a Kafka server is 
required every time before starting the application).
-* If the Kafka and Zookeeper servers are running on a remote machine, then the 
`advertised.host.name` setting in the `config/server.properties` file must be 
set to the machine's IP address.
-
-## Kafka 1.0.0+ Connector
-
-Starting with Flink 1.7, there is a new universal Kafka connector that does 
not track a specific Kafka major version.
-Rather, it tracks the latest version of Kafka at the time of the Flink release.
-
-If your Kafka broker version is 
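
A minimal sketch of using the universal connector from the Scala DataStream API, to make the dependency section above concrete; the topic name, bootstrap server, group id, and job name are illustrative placeholders, not part of this PR:

```scala
import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

object KafkaReadSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // With checkpointing enabled, the consumer checkpoints its offsets
    // internally, which is what backs the exactly-once guarantee above.
    env.enableCheckpointing(5000)

    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092")
    props.setProperty("group.id", "doc-example")

    env
      .addSource(new FlinkKafkaConsumer[String]("my-topic", new SimpleStringSchema(), props))
      .print()

    env.execute("kafka-read-sketch")
  }
}
```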

[GitHub] [flink] flinkbot edited a comment on pull request #12199: [FLINK-17774] [table] supports all kinds of changes for select result

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12199:
URL: https://github.com/apache/flink/pull/12199#issuecomment-629793563


   
   ## CI report:
   
   * 7f7d4c37fb57c5caaa862226305c6994fe622898 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1861)
 
   * 5b62118449cdf8d0de8d5b98781fdff9c2d0c571 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2057)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-22 Thread Shangwen Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shangwen Tang updated FLINK-17891:
--
Description: 
I submitted a flink session job at the local YARN cluster, and I found that the 
*execution.target* is of the wrong type, which should be of yarn-session type

!image-2020-05-23-00-59-32-702.png|width=545,height=75!

!image-2020-05-23-01-00-19-549.png|width=544,height=94!

 

  was:
I submitted a flink session job at the local YARN cluster, and I found that the 
execution. Target is of the wrong type, which should be of yarn-session type

!image-2020-05-23-00-59-32-702.png|width=545,height=75!

!image-2020-05-23-01-00-19-549.png|width=544,height=94!

 


>  FlinkYarnSessionCli sets wrong execution.target type
> -
>
> Key: FLINK-17891
> URL: https://issues.apache.org/jira/browse/FLINK-17891
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Shangwen Tang
>Priority: Major
> Attachments: image-2020-05-23-00-59-32-702.png, 
> image-2020-05-23-01-00-19-549.png
>
>
> I submitted a flink session job at the local YARN cluster, and I found that 
> the *execution.target* is of the wrong type, which should be of yarn-session 
> type
> !image-2020-05-23-00-59-32-702.png|width=545,height=75!
> !image-2020-05-23-01-00-19-549.png|width=544,height=94!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17891) FlinkYarnSessionCli sets wrong execution.target type

2020-05-22 Thread Shangwen Tang (Jira)
Shangwen Tang created FLINK-17891:
-

 Summary:  FlinkYarnSessionCli sets wrong execution.target type
 Key: FLINK-17891
 URL: https://issues.apache.org/jira/browse/FLINK-17891
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.11.0
Reporter: Shangwen Tang
 Attachments: image-2020-05-23-00-59-32-702.png, 
image-2020-05-23-01-00-19-549.png

I submitted a flink session job at the local YARN cluster, and I found that the 
*execution.target* is of the wrong type, which should be of yarn-session type

!image-2020-05-23-00-59-32-702.png|width=545,height=75!

!image-2020-05-23-01-00-19-549.png|width=544,height=94!

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17801) TaskExecutorTest.testHeartbeatTimeoutWithResourceManager timeout

2020-05-22 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-17801.
-
Fix Version/s: 1.10.2
   Resolution: Fixed

Fixed via

master: a47d7050c77c373c1cb27f7c826bf0af8cfaa700
1.11.0: 828ba1dd7356ac3694f3c4b8688ae8ecbd188771
1.10.2: 713203808b54050431d687dbf1d524c900c7141b

> TaskExecutorTest.testHeartbeatTimeoutWithResourceManager timeout
> 
>
> Key: FLINK-17801
> URL: https://issues.apache.org/jira/browse/FLINK-17801
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.11.0, 1.10.2
>
>
> CI: 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1705=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=4ed44b66-cdd6-5dcf-5f6a-88b07dda665d
> {code}
> 2020-05-18T10:06:52.8403444Z [ERROR] 
> testHeartbeatTimeoutWithResourceManager(org.apache.flink.runtime.taskexecutor.TaskExecutorTest)
>   Time elapsed: 0.484 s  <<< ERROR!
> 2020-05-18T10:06:52.8404158Z java.util.concurrent.TimeoutException
> 2020-05-18T10:06:52.8404749Z  at 
> java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
> 2020-05-18T10:06:52.8405467Z  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
> 2020-05-18T10:06:52.8406269Z  at 
> org.apache.flink.runtime.taskexecutor.TaskExecutorTest.testHeartbeatTimeoutWithResourceManager(TaskExecutorTest.java:449)
> 2020-05-18T10:06:52.8407050Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-05-18T10:06:52.8407685Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-05-18T10:06:52.8408463Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-05-18T10:06:52.8409118Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-05-18T10:06:52.8409804Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-05-18T10:06:52.8410528Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-05-18T10:06:52.8411388Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-05-18T10:06:52.8412167Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-05-18T10:06:52.8412884Z  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2020-05-18T10:06:52.8413795Z  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2020-05-18T10:06:52.8414435Z  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2020-05-18T10:06:52.8415052Z  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> 2020-05-18T10:06:52.8415692Z  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 2020-05-18T10:06:52.8416251Z  at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 2020-05-18T10:06:52.8416863Z  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2020-05-18T10:06:52.8417498Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2020-05-18T10:06:52.8418235Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2020-05-18T10:06:52.8418883Z  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2020-05-18T10:06:52.8419374Z  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2020-05-18T10:06:52.8419775Z  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2020-05-18T10:06:52.8420151Z  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2020-05-18T10:06:52.8420539Z  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2020-05-18T10:06:52.8420912Z  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 2020-05-18T10:06:52.8421493Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 2020-05-18T10:06:52.846Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 2020-05-18T10:06:52.8422977Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 2020-05-18T10:06:52.8423807Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 2020-05-18T10:06:52.8424842Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 2020-05-18T10:06:52.8425680Z  at 
> 

[GitHub] [flink] flinkbot edited a comment on pull request #12199: [FLINK-17774] [table] supports all kinds of changes for select result

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12199:
URL: https://github.com/apache/flink/pull/12199#issuecomment-629793563


   
   ## CI report:
   
   * 7f7d4c37fb57c5caaa862226305c6994fe622898 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1861)
 
   * 5b62118449cdf8d0de8d5b98781fdff9c2d0c571 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] brandonbevans commented on pull request #12285: [FLINK-17445][State Processor] Add Scala support for OperatorTransformation

2020-05-22 Thread GitBox


brandonbevans commented on pull request #12285:
URL: https://github.com/apache/flink/pull/12285#issuecomment-632792699


   @sjwiesman Thank you for the thorough review, I really appreciate it. I'll 
make the advised changes / clean up and get back.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] sjwiesman commented on pull request #12285: [FLINK-17445][State Processor] Add Scala support for OperatorTransformation

2020-05-22 Thread GitBox


sjwiesman commented on pull request #12285:
URL: https://github.com/apache/flink/pull/12285#issuecomment-632790591


   Also, the build is failing due to missing license headers and checkstyle 
issues. If you run `mvn verify` from within `flink-state-processing-api-scala` 
locally you can check these issues. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tillrohrmann closed pull request #12242: [FLINK-17801][tests] Increase timeout of TaskExecutorTest.testHeartbeatTimeoutWithResourceManager

2020-05-22 Thread GitBox


tillrohrmann closed pull request #12242:
URL: https://github.com/apache/flink/pull/12242


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] sjwiesman commented on a change in pull request #12285: [FLINK-17445][State Processor] Add Scala support for OperatorTransformation

2020-05-22 Thread GitBox


sjwiesman commented on a change in pull request #12285:
URL: https://github.com/apache/flink/pull/12285#discussion_r429340831



##
File path: flink-libraries/flink-state-processing-api-scala/pom.xml
##
@@ -0,0 +1,119 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+		xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+		xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+	<modelVersion>4.0.0</modelVersion>
+
+	<parent>
+		<groupId>org.apache.flink</groupId>
+		<artifactId>flink-libraries</artifactId>
+		<version>1.11-SNAPSHOT</version>
+		<relativePath>..</relativePath>
+	</parent>
+
+	<artifactId>flink-state-processor-api-scala_${scala.binary.version}</artifactId>
+	<name>flink-state-processor-api-scala</name>
+
+	<packaging>jar</packaging>
+
+	<dependencies>
+
+		<!-- core dependencies -->
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<scope>provided</scope>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-java</artifactId>
+			<version>${project.version}</version>
+			<scope>provided</scope>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-state-processor-api_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<scope>provided</scope>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<scope>provided</scope>
+		</dependency>
+
+		<!-- test dependencies -->
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-statebackend-rocksdb_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<scope>test</scope>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<type>test-jar</type>
+			<scope>test</scope>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-test-utils_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<scope>test</scope>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-runtime_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+			<type>test-jar</type>
+			<scope>test</scope>
+		</dependency>

Review comment:
   It doesn't look like you are using any of these test dependencies, 
please remove. 

##
File path: flink-examples/flink-examples-streaming/pom.xml
##
@@ -37,6 +37,17 @@ under the License.

 

+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-state-processor-api_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-state-processor-api-scala_${scala.binary.version}</artifactId>
+			<version>${project.version}</version>
+		</dependency>

Review comment:
   Unrelated changes, please remove

##
File path: flink-libraries/flink-state-processing-api-scala/pom.xml
##
@@ -0,0 +1,119 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+		xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+		xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+	<modelVersion>4.0.0</modelVersion>
+
+	<parent>
+		<groupId>org.apache.flink</groupId>
+		<artifactId>flink-libraries</artifactId>
+		<version>1.11-SNAPSHOT</version>
+		<relativePath>..</relativePath>
+	</parent>
+
+	<artifactId>flink-state-processor-api-scala_${scala.binary.version}</artifactId>
+	<name>flink-state-processor-api-scala</name>
+
+	<packaging>jar</packaging>
+
+	<dependencies>
+
+		<!-- core dependencies -->

Review comment:
   There should be an explicit `provided` dependency on `flink-scala`. 

##
File path: 
flink-libraries/flink-state-processing-api-scala/src/main/scala/org/apache/flink/state/api/scala/OperatorTransformation.scala
##
@@ -0,0 +1,10 @@
+package org.apache.flink.state.api.scala
+
+import org.apache.flink.api.scala.DataSet
+import org.apache.flink.state.api.{OneInputOperatorTransformation, OperatorTransformation => JOperatorTransformation}
+
+object OperatorTransformation {
+  def bootstrapWith[T](dataSet: DataSet[T]): OneInputOperatorTransformation[T] = {
+    JOperatorTransformation.bootstrapWith(dataSet.javaSet)

Review comment:
   Unfortunately this isn't going to work. See my main comment. 

##
File path: flink-libraries/flink-state-processing-api-scala/pom.xml
##
@@ -0,0 +1,119 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+		xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+		xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
[GitHub] [flink] tillrohrmann commented on pull request #12242: [FLINK-17801][tests] Increase timeout of TaskExecutorTest.testHeartbeatTimeoutWithResourceManager

2020-05-22 Thread GitBox


tillrohrmann commented on pull request #12242:
URL: https://github.com/apache/flink/pull/12242#issuecomment-632781352


   Thanks for the review @azagrebin. Merging this PR now.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] leonardBang commented on pull request #11900: [FLINK-17284][jdbc][postgres] Support serial fields

2020-05-22 Thread GitBox


leonardBang commented on pull request #11900:
URL: https://github.com/apache/flink/pull/11900#issuecomment-632779593


   > @wuchong or @leonardBang ? Do you have time to review this?
   
   Thanks for the effort @fpompermaier, I'll look into it this weekend. ^_^



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-17848) Allow scala datastream api to create named sources

2020-05-22 Thread Zack LB (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zack LB updated FLINK-17848:

Description: 
Currently you can only provide a custom name to a source operator from the java 
API. There is no matching API on the scala API. It would be nice to allow scala 
to also name sources something custom.

 

I'm referring to this [api 
specifically|https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/StreamExecutionEnvironment.scala#L640]

 

The change would be as simple as adding a default parameter to that method 
which mirrors the java api's default value "Custom Source". Or if preferred 
could duplicate the method with a required argument that then passes the value 
to the java function.
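
A sketch of the default-parameter approach being suggested; the code below is a self-contained toy with illustrative names, not Flink's actual internals:

{code:scala}
// Toy showing a default parameter that mirrors the Java API's
// "Custom Source" default; the real change would go into
// StreamExecutionEnvironment.addSource in flink-streaming-scala.
object NamedSourceSketch {
  final case class Source[T](data: Seq[T], name: String)

  def addSource[T](data: Seq[T], sourceName: String = "Custom Source"): Source[T] =
    Source(data, sourceName)

  def main(args: Array[String]): Unit = {
    println(addSource(Seq(1, 2, 3)).name)              // Custom Source
    println(addSource(Seq(1, 2, 3), "my-source").name) // my-source
  }
}
{code}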

  was:Currently you can only provide a custom name to a source operator from 
the java API. There is no matching API on the scala API. It would be nice to 
allow scala to also name sources something custom.


> Allow scala datastream api to create named sources
> --
>
> Key: FLINK-17848
> URL: https://issues.apache.org/jira/browse/FLINK-17848
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, API / Scala
>Reporter: Zack LB
>Priority: Major
>
> Currently you can only provide a custom name to a source operator from the 
> java API. There is no matching API on the scala API. It would be nice to 
> allow scala to also name sources something custom.
>  
> I'm referring to this [api 
> specifically|https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/StreamExecutionEnvironment.scala#L640]
>  
> The change would be as simple as adding a default parameter to that method 
> which mirrors the java api's default value "Custom Source". Or if preferred 
> could duplicate the method with a required argument that then passes the 
> value to the java function.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17848) Allow scala datastream api to create named sources

2020-05-22 Thread Zack LB (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zack LB updated FLINK-17848:

Description: 
Currently you can only provide a custom name to a source operator from the java 
API. There is no matching API on the scala API. It would be nice to allow scala 
to also name sources something custom.

I'm referring to this [api specifically|#L640]]

The change would be as simple as adding a default parameter to that method 
which mirrors the java api's default value "Custom Source". Or if preferred 
could duplicate the method with a required argument that then passes the value 
to the java function.

  was:
Currently you can only provide a custom name to a source operator from the java 
API. There is no matching API on the scala API. It would be nice to allow scala 
to also name sources something custom.

 

I'm referring to this [api 
specifically|https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/StreamExecutionEnvironment.scala#L640]

 

The change would be as simple as adding a default parameter to that method 
which mirrors the java api's default value "Custom Source". Or if preferred 
could duplicate the method with a required argument that then passes the value 
to the java function.


> Allow scala datastream api to create named sources
> --
>
> Key: FLINK-17848
> URL: https://issues.apache.org/jira/browse/FLINK-17848
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, API / Scala
>Reporter: Zack LB
>Priority: Major
>
> Currently you can only provide a custom name to a source operator from the 
> java API. There is no matching API on the scala API. It would be nice to 
> allow scala to also name sources something custom.
> I'm referring to this [api specifically|#L640]]
> The change would be as simple as adding a default parameter to that method 
> which mirrors the java api's default value "Custom Source". Or if preferred 
> could duplicate the method with a required argument that then passes the 
> value to the java function.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17848) Allow scala datastream api to create named sources

2020-05-22 Thread Zack LB (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zack LB updated FLINK-17848:

Description: 
Currently you can only provide a custom name to a source operator from the java 
API. There is no matching API on the scala API. It would be nice to allow scala 
to also name sources something custom.

I'm referring to this [api specifically|#L640]

The change would be as simple as adding a default parameter to that method 
which mirrors the java api's default value "Custom Source". Or if preferred 
could duplicate the method with a required argument that then passes the value 
to the java function.

  was:
Currently you can only provide a custom name to a source operator from the java 
API. There is no matching API on the scala API. It would be nice to allow scala 
to also name sources something custom.

I'm referring to this [api specifically|#L640]]

The change would be as simple as adding a default parameter to that method 
which mirrors the java api's default value "Custom Source". Or if preferred 
could duplicate the method with a required argument that then passes the value 
to the java function.


> Allow scala datastream api to create named sources
> --
>
> Key: FLINK-17848
> URL: https://issues.apache.org/jira/browse/FLINK-17848
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, API / Scala
>Reporter: Zack LB
>Priority: Major
>
> Currently you can only provide a custom name to a source operator from the 
> java API. There is no matching API on the scala API. It would be nice to 
> allow scala to also name sources something custom.
> I'm referring to this [api specifically|#L640]
> The change would be as simple as adding a default parameter to that method 
> which mirrors the java api's default value "Custom Source". Or if preferred 
> could duplicate the method with a required argument that then passes the 
> value to the java function.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17848) Allow scala datastream api to create named sources

2020-05-22 Thread Zack LB (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zack LB updated FLINK-17848:

Component/s: API / DataStream

> Allow scala datastream api to create named sources
> --
>
> Key: FLINK-17848
> URL: https://issues.apache.org/jira/browse/FLINK-17848
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, API / Scala
>Reporter: Zack LB
>Priority: Major
>
> Currently you can only provide a custom name to a source operator from the 
> java API. There is no matching API on the scala API. It would be nice to 
> allow scala to also name sources something custom.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-17890) support in DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE in JDBC

2020-05-22 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu closed FLINK-17890.
--
Resolution: Duplicate

> support in DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE in JDBC
> --
>
> Key: FLINK-17890
> URL: https://issues.apache.org/jira/browse/FLINK-17890
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Affects Versions: 1.12.0
>Reporter: Leonard Xu
>Priority: Major
> Fix For: 1.11.0
>
>
> As a user described, the JDBC connector does not support LocalDateTime yet; we 
> need to support it.
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Why-Flink-Connector-JDBC-does-t-support-LocalDateTime-td35342.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17890) support in DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE in JDBC

2020-05-22 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-17890:
--

 Summary: support in DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE in JDBC
 Key: FLINK-17890
 URL: https://issues.apache.org/jira/browse/FLINK-17890
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.12.0
Reporter: Leonard Xu
 Fix For: 1.11.0


As a user described, the JDBC connector does not support LocalDateTime yet; we 
need to support it.

[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Why-Flink-Connector-JDBC-does-t-support-LocalDateTime-td35342.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] roeyshemtov commented on pull request #11972: [FLINK-17058] Adding ProcessingTimeoutTrigger of nested triggers.

2020-05-22 Thread GitBox


roeyshemtov commented on pull request #11972:
URL: https://github.com/apache/flink/pull/11972#issuecomment-632767977


   @aljoscha I personally don't like using null (or would at least wrap it with 
Objects.isNull), but I guess it is okay because the other Triggers use it too.
   You can merge this, thanks for the help.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory

2020-05-22 Thread Jeff Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114191#comment-17114191
 ] 

Jeff Zhang commented on FLINK-17889:


\cc [~lirui] [~lzljs3620320]

> flink-connector-hive jar contains wrong class in its SPI config file 
> org.apache.flink.table.factories.TableFactory
> --
>
> Key: FLINK-17889
> URL: https://issues.apache.org/jira/browse/FLINK-17889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Priority: Major
>
> These 2 classes are in flink-connector-hive jar's SPI config file
> {code:java}
> org.apache.flink.orc.OrcFileSystemFormatFactory
> License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
> Due to this issue, I get the following exception in zeppelin side.
> {code:java}
> Caused by: java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtypeCaused by: 
> java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype at 
> java.util.ServiceLoader.fail(ServiceLoader.java:239) at 
> java.util.ServiceLoader.access$300(ServiceLoader.java:185) at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376) at 
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at 
> java.util.ServiceLoader$1.next(ServiceLoader.java:480) at 
> java.util.Iterator.forEachRemaining(Iterator.java:116) at 
> org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214)
>  ... 35 more {code}
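
To see why the loader blows up, here is a minimal reproduction of the "not a subtype" failure mode, with illustrative names rather than Flink's classes:

{code:scala}
// A services file that lists a class which does not implement the requested
// interface makes ServiceLoader fail on iteration, exactly as above.
//
// src/main/resources/META-INF/services/demo.TableFactory:
//   demo.GoodFactory
//   demo.NotAFactory   <- wrong entry, analogous to OrcFileSystemFormatFactory
package demo

import java.util.ServiceLoader

trait TableFactory
class GoodFactory extends TableFactory
class NotAFactory // does not extend TableFactory

object SpiSketch {
  def main(args: Array[String]): Unit = {
    val it = ServiceLoader.load(classOf[TableFactory]).iterator()
    // Throws java.util.ServiceConfigurationError:
    // "Provider demo.NotAFactory not a subtype"
    while (it.hasNext) println(it.next().getClass.getName)
  }
}
{code}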



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory

2020-05-22 Thread Jeff Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated FLINK-17889:
---
Description: 
These 2 classes are in flink-connector-hive jar's SPI config file
{code:java}
org.apache.flink.orc.OrcFileSystemFormatFactory
License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
Due to this issue, I get the following exception in zeppelin side.
{code:java}
Caused by: java.util.ServiceConfigurationError: 
org.apache.flink.table.factories.TableFactory: Provider 
org.apache.flink.orc.OrcFileSystemFormatFactory not a subtypeCaused by: 
java.util.ServiceConfigurationError: 
org.apache.flink.table.factories.TableFactory: Provider 
org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype at 
java.util.ServiceLoader.fail(ServiceLoader.java:239) at 
java.util.ServiceLoader.access$300(ServiceLoader.java:185) at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376) at 
java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at 
java.util.ServiceLoader$1.next(ServiceLoader.java:480) at 
java.util.Iterator.forEachRemaining(Iterator.java:116) at 
org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214)
 ... 35 more {code}

  was:
These 2 classes are in flink-connector-hive jar's SPI config file
{code:java}
org.apache.flink.orc.OrcFileSystemFormatFactory
License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
Due to this issue, I get the following exception in zeppelin side.
{code:java}

 {code}


> flink-connector-hive jar contains wrong class in its SPI config file 
> org.apache.flink.table.factories.TableFactory
> --
>
> Key: FLINK-17889
> URL: https://issues.apache.org/jira/browse/FLINK-17889
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.11.0
>Reporter: Jeff Zhang
>Priority: Major
>
> These 2 classes are in flink-connector-hive jar's SPI config file
> {code:java}
> org.apache.flink.orc.OrcFileSystemFormatFactory
> License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
> Due to this issue, I get the following exception in zeppelin side.
> {code:java}
> Caused by: java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtypeCaused by: 
> java.util.ServiceConfigurationError: 
> org.apache.flink.table.factories.TableFactory: Provider 
> org.apache.flink.orc.OrcFileSystemFormatFactory not a subtype at 
> java.util.ServiceLoader.fail(ServiceLoader.java:239) at 
> java.util.ServiceLoader.access$300(ServiceLoader.java:185) at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376) at 
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) at 
> java.util.ServiceLoader$1.next(ServiceLoader.java:480) at 
> java.util.Iterator.forEachRemaining(Iterator.java:116) at 
> org.apache.flink.table.factories.TableFactoryService.discoverFactories(TableFactoryService.java:214)
>  ... 35 more {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16468) BlobClient rapid retrieval retries on failure opens too many sockets

2020-05-22 Thread Jason Kania (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114190#comment-17114190
 ] 

Jason Kania commented on FLINK-16468:
-

Sorry [~gjy], not to this point. The current economic/health situation has 
resulted in a need to redirect our efforts for the moment. We have not done 
more testing in the short term.

> BlobClient rapid retrieval retries on failure opens too many sockets
> 
>
> Key: FLINK-16468
> URL: https://issues.apache.org/jira/browse/FLINK-16468
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.8.3, 1.9.2, 1.10.0
> Environment: Linux ubuntu servers running, patch current latest 
> Ubuntu patch current release java 8 JRE
>Reporter: Jason Kania
>Priority: Major
> Fix For: 1.11.0
>
>
> In situations where the BlobClient retrieval fails as in the following log, 
> rapid retries will exhaust the open sockets. All the retries happen within a 
> few milliseconds.
> {noformat}
> 2020-03-06 17:19:07,116 ERROR org.apache.flink.runtime.blob.BlobClient - 
> Failed to fetch BLOB 
> cddd17ef76291dd60eee9fd36085647a/p-bcd61652baba25d6863cf17843a2ef64f4c801d5-c1781532477cf65ff1c1e7d72dccabc7
>  from aaa-1/10.0.1.1:45145 and store it under 
> /tmp/blobStore-7328ed37-8bc7-4af7-a56c-474e264157c9/incoming/temp-0004 
> Retrying...
> {noformat}
> The above is output repeatedly until the following error occurs:
> {noformat}
> java.io.IOException: Could not connect to BlobServer at address 
> aaa-1/10.0.1.1:45145
>  at org.apache.flink.runtime.blob.BlobClient.(BlobClient.java:100)
>  at 
> org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:143)
>  at 
> org.apache.flink.runtime.blob.AbstractBlobCache.getFileInternal(AbstractBlobCache.java:181)
>  at 
> org.apache.flink.runtime.blob.PermanentBlobCache.getFile(PermanentBlobCache.java:202)
>  at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:120)
>  at 
> org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:915)
>  at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:595)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.SocketException: Too many open files
>  at java.net.Socket.createImpl(Socket.java:478)
>  at java.net.Socket.connect(Socket.java:605)
>  at org.apache.flink.runtime.blob.BlobClient.(BlobClient.java:95)
>  ... 8 more
> {noformat}
>  The retries should have some form of backoff in this situation to avoid 
> flooding the logs and exhausting other resources on the server.
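
For illustration, a generic sketch of the kind of bounded exponential backoff the report asks for; this is not Flink's actual retry logic, and the names and limits are illustrative:

{code:scala}
import java.io.IOException

object RetrySketch {
  // Retries op up to maxAttempts times, doubling the sleep after each
  // failure and capping it, instead of retrying in a tight loop.
  def retryWithBackoff[T](maxAttempts: Int, initialDelayMs: Long, maxDelayMs: Long)(op: => T): T = {
    var delay = initialDelayMs
    var attempt = 1
    while (true) {
      try {
        return op
      } catch {
        case _: IOException if attempt < maxAttempts =>
          Thread.sleep(delay)
          delay = math.min(delay * 2, maxDelayMs)
          attempt += 1
      }
    }
    throw new IllegalStateException("unreachable")
  }

  def main(args: Array[String]): Unit = {
    // e.g. wrap the BLOB download that fails above:
    val blob = retryWithBackoff(maxAttempts = 5, initialDelayMs = 100L, maxDelayMs = 30000L) {
      "downloaded" // stand-in for BlobClient.downloadFromBlobServer(...)
    }
    println(blob)
  }
}
{code}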



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17888) JDBC Connector support LocalDateTime type

2020-05-22 Thread forideal (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114189#comment-17114189
 ] 

forideal commented on FLINK-17888:
--

[~lzljs3620320] Can you assign this task to me? I'll try to finish it. Thanks~

 

> JDBC Connector support LocalDateTime type
> -
>
> Key: FLINK-17888
> URL: https://issues.apache.org/jira/browse/FLINK-17888
> Project: Flink
>  Issue Type: Task
>  Components: Connectors / JDBC
>Reporter: forideal
>Priority: Major
>
> Flink has only supported the LocalDateTime type since 1.9. Currently, Flink's 
> JDBC connector does not support this type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17889) flink-connector-hive jar contains wrong class in its SPI config file org.apache.flink.table.factories.TableFactory

2020-05-22 Thread Jeff Zhang (Jira)
Jeff Zhang created FLINK-17889:
--

 Summary: flink-connector-hive jar contains wrong class in its SPI 
config file org.apache.flink.table.factories.TableFactory
 Key: FLINK-17889
 URL: https://issues.apache.org/jira/browse/FLINK-17889
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.11.0
Reporter: Jeff Zhang


These 2 classes are in flink-connector-hive jar's SPI config file
{code:java}
org.apache.flink.orc.OrcFileSystemFormatFactory
License.org.apache.flink.formats.parquet.ParquetFileSystemFormatFactory {code}
Due to this issue, I get the following exception in zeppelin side.
{code:java}

 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17888) JDBC Connector support LocalDateTime type

2020-05-22 Thread forideal (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

forideal updated FLINK-17888:
-
Affects Version/s: (was: 1.10.0)

> JDBC Connector support LocalDateTime type
> -
>
> Key: FLINK-17888
> URL: https://issues.apache.org/jira/browse/FLINK-17888
> Project: Flink
>  Issue Type: Task
>  Components: Connectors / JDBC
>Reporter: forideal
>Priority: Major
>
> Flink has only supported the LocalDateTime type since 1.9. Currently, Flink's 
> JDBC connector does not support this type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17888) JDBC Connector support LocalDateTime type

2020-05-22 Thread forideal (Jira)
forideal created FLINK-17888:


 Summary: JDBC Connector support LocalDateTime type
 Key: FLINK-17888
 URL: https://issues.apache.org/jira/browse/FLINK-17888
 Project: Flink
  Issue Type: Task
  Components: Connectors / JDBC
Affects Versions: 1.10.0
Reporter: forideal


Flink has only supported the LocalDateTime type since 1.9. Currently, Flink's JDBC 
connector does not support this type.
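
For reference, a sketch of the underlying JDBC 4.2 mapping the connector would build on, assuming an H2 in-memory database on the classpath; the table and values are illustrative:

{code:scala}
import java.sql.DriverManager
import java.time.LocalDateTime

object JdbcLocalDateTimeSketch {
  def main(args: Array[String]): Unit = {
    // JDBC 4.2 drivers map TIMESTAMP columns to java.time.LocalDateTime
    // through setObject/getObject, which is what connector support means here.
    val conn = DriverManager.getConnection("jdbc:h2:mem:demo")
    val stmt = conn.createStatement()
    stmt.executeUpdate("CREATE TABLE t (ts TIMESTAMP)")

    val insert = conn.prepareStatement("INSERT INTO t VALUES (?)")
    insert.setObject(1, LocalDateTime.of(2020, 5, 22, 12, 0))
    insert.executeUpdate()

    val rs = stmt.executeQuery("SELECT ts FROM t")
    while (rs.next()) {
      val ts = rs.getObject(1, classOf[LocalDateTime])
      println(ts) // 2020-05-22T12:00
    }
    conn.close()
  }
}
{code}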



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] godfreyhe commented on a change in pull request #12199: [FLINK-17774] [table] supports all kinds of changes for select result

2020-05-22 Thread GitBox


godfreyhe commented on a change in pull request #12199:
URL: https://github.com/apache/flink/pull/12199#discussion_r429320579



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/sinks/SelectTableSinkBase.java
##
@@ -28,61 +28,106 @@
 import org.apache.flink.streaming.api.operators.collect.CollectSinkOperatorFactory;
 import org.apache.flink.streaming.api.operators.collect.CollectStreamSink;
 import org.apache.flink.table.api.TableSchema;
-import org.apache.flink.table.api.internal.SelectTableSink;
-import org.apache.flink.table.runtime.types.TypeInfoDataTypeConverter;
+import org.apache.flink.table.api.internal.SelectResultProvider;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.util.DataFormatConverters;
+import org.apache.flink.table.runtime.typeutils.RowDataTypeInfo;
+import org.apache.flink.table.sinks.StreamTableSink;
+import org.apache.flink.table.sinks.TableSink;
 import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
 import org.apache.flink.types.Row;
 
 import java.util.Iterator;
 import java.util.UUID;
+import java.util.stream.Stream;
 
 /**
- * Basic implementation of {@link SelectTableSink}.
+ * Basic implementation of {@link StreamTableSink} for select job to collect the result to local.
  */
-public class SelectTableSinkBase implements SelectTableSink {
+public abstract class SelectTableSinkBase<T> implements StreamTableSink<T> {
 
 	private final TableSchema tableSchema;
-	private final CollectSinkOperatorFactory<Row> factory;
-	private final CollectResultIterator<Row> iterator;
+	protected final DataFormatConverters.DataFormatConverter<RowData, Row> converter;
+
+	private final CollectSinkOperatorFactory<T> factory;
+	private final CollectResultIterator<T> iterator;
 
 	@SuppressWarnings("unchecked")
-	public SelectTableSinkBase(TableSchema tableSchema) {
-		this.tableSchema = SelectTableSinkSchemaConverter.convertTimeAttributeToRegularTimestamp(
-			SelectTableSinkSchemaConverter.changeDefaultConversionClass(tableSchema));
+	public SelectTableSinkBase(TableSchema schema, TypeSerializer<T> typeSerializer) {
+		this.tableSchema = schema;
+		this.converter = DataFormatConverters.getConverterForDataType(this.tableSchema.toPhysicalRowDataType());

Review comment:
   I found there are some types that `DataStructureConverters` can't handle. For 
example, when running `CalcITCase#testExternalTypeFunc1`, I get the following 
exception:
   
   ```
   java.lang.ClassCastException: org.apache.flink.table.types.logical.TypeInformationRawType cannot be cast to org.apache.flink.table.types.logical.RawType
   
	at org.apache.flink.table.data.conversion.RawObjectConverter.create(RawObjectConverter.java:56)
	at org.apache.flink.table.data.conversion.DataStructureConverters.getConverterInternal(DataStructureConverters.java:157)
	at org.apache.flink.table.data.conversion.DataStructureConverters.getConverter(DataStructureConverters.java:136)
	at org.apache.flink.table.data.conversion.RowRowConverter.lambda$create$0(RowRowConverter.java:87)
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-13462) Add table api examples into flink-quickstart

2020-05-22 Thread nathan Xie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114175#comment-17114175
 ] 

nathan Xie commented on FLINK-13462:


Hi, can I work on this? I'd like to try adding a Table API example into flink-quickstart.

Thanks.

> Add table api examples into flink-quickstart
> 
>
> Key: FLINK-13462
> URL: https://issues.apache.org/jira/browse/FLINK-13462
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.9.0
>Reporter: Jeff Zhang
>Priority: Major
>
> \cc [~ykt836]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] fpompermaier commented on pull request #11900: [FLINK-17284][jdbc][postgres] Support serial fields

2020-05-22 Thread GitBox


fpompermaier commented on pull request #11900:
URL: https://github.com/apache/flink/pull/11900#issuecomment-632754987


   @wuchong or @leonardBang ? Do you have time to review this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] GJL commented on a change in pull request #12256: [FLINK-17018][runtime] DefaultExecutionSlotAllocator allocates slots in bulks

2020-05-22 Thread GitBox


GJL commented on a change in pull request #12256:
URL: https://github.com/apache/flink/pull/12256#discussion_r429309837



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/PhysicalSlotRequestBulk.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.jobmaster.slotpool;
+
+import org.apache.flink.runtime.clusterframework.types.AllocationID;
+import org.apache.flink.runtime.jobmaster.SlotRequestId;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+/**
+ * Represents a bulk of physical slot requests.
+ */
+public class PhysicalSlotRequestBulk {
+
+   final Map pendingRequests;

Review comment:
   `private`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #12294: [FLINK-17882][table-common] Check for self references in structured types

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #12294:
URL: https://github.com/apache/flink/pull/12294#issuecomment-632633082


   
   ## CI report:
   
   * 11cbf0ab775549319c4609fa5f1393277a28ec50 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2042)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #11978: [FLINK-16086][chinese-translation]Translate "Temporal Tables" page of "Streaming Concepts" into Chinese

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #11978:
URL: https://github.com/apache/flink/pull/11978#issuecomment-623121072


   
   ## CI report:
   
   * 689f08ae035886e85ab02097ec800afcaf8cf235 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=2048)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #8622: [FLINK-12438][doc-zh]Translate Task Lifecycle document into Chinese

2020-05-22 Thread GitBox


flinkbot edited a comment on pull request #8622:
URL: https://github.com/apache/flink/pull/8622#issuecomment-632686183


   
   ## CI report:
   
   * 31efc8a0e9e9347c883706b481f2e060bae8bc32 Travis: 
[FAILURE](https://travis-ci.com/github/flink-ci/flink/builds/167771628) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



