[jira] [Commented] (FLINK-6962) Add a table SQL DDL

2018-11-01 Thread Shuyi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672624#comment-16672624
 ] 

Shuyi Chen commented on FLINK-6962:
---

Design doc is attached.

> Add a table SQL DDL
> ---
>
> Key: FLINK-6962
> URL: https://issues.apache.org/jira/browse/FLINK-6962
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shuyi Chen
>Priority: Major
>
> This JIRA adds support to allow users to define the DDL for source and sink 
> tables, including the watermark (on the source table) and the emit SLA (on the 
> result table). The detailed design doc will be attached soon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-10711) flink-end-to-end-tests can fail silently

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reassigned FLINK-10711:


Assignee: Timo Walther  (was: Hequn Cheng)

> flink-end-to-end-tests can fail silently
> 
>
> Key: FLINK-10711
> URL: https://issues.apache.org/jira/browse/FLINK-10711
> Project: Flink
>  Issue Type: Bug
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Piotr Nowojski
>Assignee: Timo Walther
>Priority: Blocker
> Fix For: 1.7.0
>
>
> Because they are written in Bash and do not set
> {code:bash}
> set -e
> {code}
> at the beginning, errors can be swallowed silently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10232) Add a SQL DDL

2018-11-01 Thread Shuyi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672623#comment-16672623
 ] 

Shuyi Chen commented on FLINK-10232:


Design doc is attached.

> Add a SQL DDL
> -
>
> Key: FLINK-10232
> URL: https://issues.apache.org/jira/browse/FLINK-10232
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Shuyi Chen
>Priority: Major
>
> This is an umbrella issue for all efforts related to supporting a SQL Data 
> Definition Language (DDL) in Flink's Table & SQL API.
> Such a DDL includes creating, deleting, replacing:
> - tables
> - views
> - functions
> - types
> - libraries
> - catalogs
> If possible, the parsing/validating/logical part should be done using 
> Calcite. Related issues are CALCITE-707, CALCITE-2045, CALCITE-2046, 
> CALCITE-2214, and others.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10757) TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672611#comment-16672611
 ] 

ASF GitHub Bot commented on FLINK-10757:


TisonKun opened a new pull request #6998: [FLINK-10757] [tests] Avoid port 
conflicts in AbstractTaskManagerProc…
URL: https://github.com/apache/flink/pull/6998
 
 
   …essFailureRecoveryTest
   
   ## What is the purpose of the change
   
   Harden `AbstractTaskManagerProcessFailureRecoveryTest` by avoiding port 
conflicts.
   
   The relevant log is https://api.travis-ci.org/v3/job/449439623/log.txt, and 
the test fails with
   
   > Caused by: java.net.BindException: Address already in use
   
   Thus, we start the cluster with REST port 0 to avoid possible port conflicts.
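   
   For illustration, a minimal sketch of the idea, assuming Flink's standard 
`Configuration`/`RestOptions` API (the actual change in this PR may differ):
   
```java
// Sketch only: bind the REST endpoint to an ephemeral port so parallel
// test clusters cannot collide on a fixed port.
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.RestOptions;

public class RestPortSketch {
    public static Configuration testConfiguration() {
        Configuration config = new Configuration();
        config.setInteger(RestOptions.PORT, 0); // 0 = let the OS pick a free port
        return config;
    }
}
```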
   
   ## Verifying this change
   
   This change is a trivial rework, and it is itself a test.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (**no**)
 - The public API, i.e., is any changed class annotated with `(Evolving)`: 
(**no**)
 - The serializers: (**no**)
 - The runtime per-record code paths (performance sensitive): (**no**)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
 - The S3 file system connector: (**no**)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**no**)
 - If yes, how is the feature documented? (**not applicable**)
   
   cc @zentol @tillrohrmann 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the 
> cluster entrypoint StandaloneSessionClusterEntrypoint.
> 
>
> Key: FLINK-10757
> URL: https://issues.apache.org/jira/browse/FLINK-10757
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Bowen Li
>Assignee: TisonKun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> {code:java}
> Failed tests: 
>   ...
>  
> TaskManagerProcessFailureStreamingRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:223
>  Failed to initialize the cluster entrypoint 
> StandaloneSessionClusterEntrypoint.
> Tests in error: 
> {code}
> https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10757) TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.

2018-11-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10757:
---
Labels: pull-request-available  (was: )

> TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the 
> cluster entrypoint StandaloneSessionClusterEntrypoint.
> 
>
> Key: FLINK-10757
> URL: https://issues.apache.org/jira/browse/FLINK-10757
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Bowen Li
>Assignee: TisonKun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> {code:java}
> Failed tests: 
>   ...
>  
> TaskManagerProcessFailureStreamingRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:223
>  Failed to initialize the cluster entrypoint 
> StandaloneSessionClusterEntrypoint.
> Tests in error: 
> {code}
> https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] TisonKun opened a new pull request #6998: [FLINK-10757] [tests] Avoid port conflicts in AbstractTaskManagerProc…

2018-11-01 Thread GitBox
TisonKun opened a new pull request #6998: [FLINK-10757] [tests] Avoid port 
conflicts in AbstractTaskManagerProc…
URL: https://github.com/apache/flink/pull/6998
 
 
   …essFailureRecoveryTest
   
   ## What is the purpose of the change
   
   Harden `AbstractTaskManagerProcessFailureRecoveryTest` by avoiding port 
conflicts.
   
   The relevant log is https://api.travis-ci.org/v3/job/449439623/log.txt, and 
the test fails with
   
   > Caused by: java.net.BindException: Address already in use
   
   Thus, we start the cluster with REST port 0 to avoid possible port conflicts.
   
   ## Verifying this change
   
   This change is a trivial rework, and it is itself a test.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (**no**)
 - The public API, i.e., is any changed class annotated with `(Evolving)`: 
(**no**)
 - The serializers: (**no**)
 - The runtime per-record code paths (performance sensitive): (**no**)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
 - The S3 file system connector: (**no**)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**no**)
 - If yes, how is the feature documented? (**not applicable**)
   
   cc @zentol @tillrohrmann 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (FLINK-10757) TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.

2018-11-01 Thread TisonKun (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TisonKun reassigned FLINK-10757:


Assignee: TisonKun

> TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the 
> cluster entrypoint StandaloneSessionClusterEntrypoint.
> 
>
> Key: FLINK-10757
> URL: https://issues.apache.org/jira/browse/FLINK-10757
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Bowen Li
>Assignee: TisonKun
>Priority: Major
> Fix For: 1.7.0
>
>
> {code:java}
> Failed tests: 
>   ...
>  
> TaskManagerProcessFailureStreamingRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:223
>  Failed to initialize the cluster entrypoint 
> StandaloneSessionClusterEntrypoint.
> Tests in error: 
> {code}
> https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-10759) Adapt SQL-client configuration file to specify external catalogs and default catalog

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-10759.

Resolution: Duplicate

Duplicate of FLINK-9172.

> Adapt SQL-client configuration file to specify external catalogs and default 
> catalog
> 
>
> Key: FLINK-10759
> URL: https://issues.apache.org/jira/browse/FLINK-10759
> Project: Flink
>  Issue Type: Sub-task
>  Components: SQL Client
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> The configuration (YAML) file does not currently seem to allow specifications 
> of external catalogs. The request here is to add support for external catalog 
> specifications in the YAML file. Users should also be able to specify one 
> catalog as the default.
> The catalog-related configurations then need to be processed and passed to 
> {{TableEnvironment}} accordingly by calling relevant APIs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9172) Support external catalog factory that comes default with SQL-Client

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-9172:

Component/s: SQL Client

> Support external catalog factory that comes default with SQL-Client
> ---
>
> Key: FLINK-9172
> URL: https://issues.apache.org/jira/browse/FLINK-9172
> Project: Flink
>  Issue Type: Sub-task
>  Components: SQL Client
>Reporter: Rong Rong
>Assignee: Eron Wright 
>Priority: Major
>
> The configuration (YAML) file does not currently seem to allow specifications 
> of external catalogs. The request here is to add support for external catalog 
> specifications in the YAML file. Users should also be able to specify one 
> catalog as the default.
> It would be great to have the SQL Client support some external catalogs 
> out-of-the-box for SQL users to configure and utilize easily. I am currently 
> thinking of having an external catalog factory that spins up both streaming 
> and batch external catalog table sources and sinks. This could greatly unify 
> and provide easy access for SQL users.
> The catalog-related configurations then need to be processed and passed to 
> TableEnvironment accordingly by calling relevant APIs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9172) Support external catalog factory that comes default with SQL-Client

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-9172:

Issue Type: Sub-task  (was: New Feature)
Parent: FLINK-10744

> Support external catalog factory that comes default with SQL-Client
> ---
>
> Key: FLINK-9172
> URL: https://issues.apache.org/jira/browse/FLINK-9172
> Project: Flink
>  Issue Type: Sub-task
>  Components: SQL Client
>Reporter: Rong Rong
>Assignee: Eron Wright 
>Priority: Major
>
> The configuration (YAML) file does not currently seem to allow specifications 
> of external catalogs. The request here is to add support for external catalog 
> specifications in the YAML file. Users should also be able to specify one 
> catalog as the default.
> It would be great to have the SQL Client support some external catalogs 
> out-of-the-box for SQL users to configure and utilize easily. I am currently 
> thinking of having an external catalog factory that spins up both streaming 
> and batch external catalog table sources and sinks. This could greatly unify 
> and provide easy access for SQL users.
> The catalog-related configurations then need to be processed and passed to 
> TableEnvironment accordingly by calling relevant APIs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9172) Support external catalog factory that comes default with SQL-Client

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-9172:

Issue Type: New Feature  (was: Sub-task)
Parent: (was: FLINK-7594)

> Support external catalog factory that comes default with SQL-Client
> ---
>
> Key: FLINK-9172
> URL: https://issues.apache.org/jira/browse/FLINK-9172
> Project: Flink
>  Issue Type: New Feature
>Reporter: Rong Rong
>Assignee: Eron Wright 
>Priority: Major
>
> The configuration (YAML) file does not currently seem to allow specifications 
> of external catalogs. The request here is to add support for external catalog 
> specifications in the YAML file. Users should also be able to specify one 
> catalog as the default.
> It would be great to have the SQL Client support some external catalogs 
> out-of-the-box for SQL users to configure and utilize easily. I am currently 
> thinking of having an external catalog factory that spins up both streaming 
> and batch external catalog table sources and sinks. This could greatly unify 
> and provide easy access for SQL users.
> The catalog-related configurations then need to be processed and passed to 
> TableEnvironment accordingly by calling relevant APIs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9172) Support external catalog factory that comes default with SQL-Client

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-9172:

Description: 
The configuration (YAML) file does not currently seem to allow specifications of 
external catalogs. The request here is to add support for external catalog 
specifications in the YAML file. Users should also be able to specify one 
catalog as the default.

It would be great to have the SQL Client support some external catalogs 
out-of-the-box for SQL users to configure and utilize easily. I am currently 
thinking of having an external catalog factory that spins up both streaming and 
batch external catalog table sources and sinks. This could greatly unify and 
provide easy access for SQL users.

The catalog-related configurations then need to be processed and passed to 
TableEnvironment accordingly by calling relevant APIs.

  was:It would be great to have the SQL Client support some external catalogs 
out-of-the-box for SQL users to configure and utilize easily. I am currently 
thinking of having an external catalog factory that spins up both streaming and 
batch external catalog table sources and sinks. This could greatly unify and 
provide easy access for SQL users.


> Support external catalog factory that comes default with SQL-Client
> ---
>
> Key: FLINK-9172
> URL: https://issues.apache.org/jira/browse/FLINK-9172
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Rong Rong
>Assignee: Eron Wright 
>Priority: Major
>
> The configuration (YAML) file does not currently seem to allow specifications 
> of external catalogs. The request here is to add support for external catalog 
> specifications in the YAML file. Users should also be able to specify one 
> catalog as the default.
> It would be great to have the SQL Client support some external catalogs 
> out-of-the-box for SQL users to configure and utilize easily. I am currently 
> thinking of having an external catalog factory that spins up both streaming 
> and batch external catalog table sources and sinks. This could greatly unify 
> and provide easy access for SQL users.
> The catalog-related configurations then need to be processed and passed to 
> TableEnvironment accordingly by calling relevant APIs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-8831) Create SQL Client dependencies

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-8831.
-
Resolution: Fixed

All subtasks have been implemented. We will create new issues for new 
dependencies.

> Create SQL Client dependencies
> --
>
> Key: FLINK-8831
> URL: https://issues.apache.org/jira/browse/FLINK-8831
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>
> A first minimal version of FLIP-24 for the upcoming 
> Flink SQL Client has been merged to master. We also merged 
> possibilities to discover and configure table sources without a single 
> line of code using string-based properties and Java service provider 
> discovery.
> We are now facing the issue of how to manage dependencies in this new 
> environment. It is different from how regular Flink projects are created 
> (by setting up a new Maven project and building a JAR or fat JAR). 
> Ideally, a user should be able to select from a set of prepared 
> connectors, catalogs, and formats. E.g., if a Kafka connector and Avro 
> format are needed, all that should be required is to move a 
> "flink-kafka.jar" and "flink-avro.jar" into the "sql_lib" directory that 
> is shipped to a Flink cluster together with the SQL query.
> [As discussed on 
> ML|http://mail-archives.apache.org/mod_mbox/flink-dev/201802.mbox/%3C9c73518b-ec8e-3b01-f200-dea816c75efc%40apache.org%3E],
>  we will build fat jars for these modules with every Flink release that can 
> be hosted somewhere (e.g. Apache infrastructure, but not Maven Central). This 
> would make it very easy to add a dependency by downloading the prepared JAR 
> files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-9248) Create a SQL Client Avro format fat-jar

2018-11-01 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-9248.
-
Resolution: Fixed

Fixed in FLINK-9444.

> Create a SQL Client Avro format fat-jar
> ---
>
> Key: FLINK-9248
> URL: https://issues.apache.org/jira/browse/FLINK-9248
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>
> Create a format fat-jar for the SQL Client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10689) Port UDFs in Table API extension points to flink-table-common

2018-11-01 Thread Timo Walther (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672594#comment-16672594
 ] 

Timo Walther commented on FLINK-10689:
--

Thanks for splitting the issue [~phoenixjiangnan].

Yes, the Hive catalog work must not be blocked on FLINK-10755. Once classes are 
modified for a feature, we can first migrate them to Java in a separate commit 
and then perform the actual feature changes. They can be placed in 
{{flink-table/src/main/java}} until we find a better module structure.

> Port UDFs in Table API extension points to flink-table-common
> -
>
> Key: FLINK-10689
> URL: https://issues.apache.org/jira/browse/FLINK-10689
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: xueyu
>Priority: Major
> Fix For: 1.8.0
>
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting UDFs. This JIRA does NOT depend on FLINK-16088, so 
> it can be started at any time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10727) Remove unnecessary synchronization in SingleInputGate#requestPartitions()

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672581#comment-16672581
 ] 

ASF GitHub Bot commented on FLINK-10727:


zhijiangW commented on issue #6974: [FLINK-10727][network] remove unnecessary 
synchronization in SingleInputGate#requestPartitions()
URL: https://github.com/apache/flink/pull/6974#issuecomment-435274230
 
 
   Thanks Till for these guiding suggestions!
   
   My review of such a small change was careless; I should have found this 
issue if I had thought more about it.
   I will be more careful when reviewing such a critical component next time! :(


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Remove unnecessary synchronization in SingleInputGate#requestPartitions()
> -
>
> Key: FLINK-10727
> URL: https://issues.apache.org/jira/browse/FLINK-10727
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.5.5, 1.6.2, 1.7.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> For every {{SingleInputGate#getNextBufferOrEvent()}}, 
> {{SingleInputGate#requestPartitions()}} is called and this always 
> synchronizes on the {{requestLock}} before checking the 
> {{requestedPartitionsFlag}}. Since {{SingleInputGate#requestPartitions()}} is 
> only called from the same thread (the task thread getting the record), it is 
> enough to check the {{requestedPartitionsFlag}} first before synchronizing 
> for the actual requests (if needed). {{UnionInputGate}} already goes the same 
> way in its {{requestPartitions()}} implementation.
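
For illustration, a minimal sketch of the check-before-synchronize pattern 
described above (not the actual Flink code):

{code:java}
// Sketch only: check the flag on the calling thread first, so the common
// path (partitions already requested) avoids synchronization entirely.
class RequestPartitionsSketch {
    private volatile boolean requestedPartitionsFlag = false;
    private final Object requestLock = new Object();

    void requestPartitions() {
        if (!requestedPartitionsFlag) {         // cheap check without the lock
            synchronized (requestLock) {
                if (!requestedPartitionsFlag) { // re-check under the lock
                    // ... issue the actual partition requests ...
                    requestedPartitionsFlag = true;
                }
            }
        }
    }
}
{code}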



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zhijiangW commented on issue #6974: [FLINK-10727][network] remove unnecessary synchronization in SingleInputGate#requestPartitions()

2018-11-01 Thread GitBox
zhijiangW commented on issue #6974: [FLINK-10727][network] remove unnecessary 
synchronization in SingleInputGate#requestPartitions()
URL: https://github.com/apache/flink/pull/6974#issuecomment-435274230
 
 
   Thanks Till for these guiding suggestions!
   
   My review of such a small change was careless; I should have found this 
issue if I had thought more about it.
   I will be more careful when reviewing such a critical component next time! :(


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-10760) Create a command line tool to migrate meta objects specified in SQL client configuration

2018-11-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10760:
---

 Summary: Create a command line tool to migrate meta objects 
specified in SQL client configuration
 Key: FLINK-10760
 URL: https://issues.apache.org/jira/browse/FLINK-10760
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


With a persistent catalog for Flink meta objects (tables, views, functions, 
etc.), it becomes unnecessary to specify such objects in the SQL client 
configuration (YAML) file. However, it would be helpful to users who 
already have some meta objects specified in the YAML file to have a command 
line tool that migrates objects specified in YAML files to the persistent 
catalog once and for all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10759) Adapt SQL-client configuration file to specify external catalogs and default catalog

2018-11-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10759:
---

 Summary: Adapt SQL-client configuration file to specify external 
catalogs and default catalog
 Key: FLINK-10759
 URL: https://issues.apache.org/jira/browse/FLINK-10759
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


The configuration (YAML) file does not currently seem to allow specifications of 
external catalogs. The request here is to add support for external catalog 
specifications in the YAML file. Users should also be able to specify one 
catalog as the default.

The catalog-related configurations then need to be processed and passed to 
{{TableEnvironment}} accordingly by calling relevant APIs.
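
As a rough illustration of that last step, here is a sketch that assumes the 
YAML entries have already been parsed into {{ExternalCatalog}} instances 
({{registerExternalCatalog}} being the relevant {{TableEnvironment}} API at the 
time):

{code:java}
// Sketch only: hand each catalog parsed from the YAML file to the
// TableEnvironment; how the "default" catalog gets marked is still open.
import java.util.Map;

import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.ExternalCatalog;

class CatalogRegistrationSketch {
    static void registerCatalogs(TableEnvironment tableEnv,
                                 Map<String, ExternalCatalog> catalogsFromYaml) {
        for (Map.Entry<String, ExternalCatalog> e : catalogsFromYaml.entrySet()) {
            tableEnv.registerExternalCatalog(e.getKey(), e.getValue());
        }
    }
}
{code}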



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10697) Create an in-memory catalog that stores Flink's meta objects

2018-11-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10697:
---
Labels: pull-request-available  (was: )

> Create an in-memory catalog that stores Flink's meta objects
> 
>
> Key: FLINK-10697
> URL: https://issues.apache.org/jira/browse/FLINK-10697
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently all Flink meta objects (currently tables only) are stored in memory 
> as part of the Calcite catalog. Some of those objects are temporary (such as 
> inline tables); others are meant to live beyond the user session. As we 
> introduce a catalog for those objects (tables, views, and UDFs), it makes 
> sense to organize them neatly. Further, having a catalog implementation that 
> stores those objects in memory retains the current behavior, which can be 
> configured by the user.
> Please note that this implementation is different from the current 
> {{InMemoryExternalCatalog}}, which is used mainly for testing and doesn't 
> reflect what's actually needed for Flink meta objects.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10697) Create an in-memory catalog that stores Flink's meta objects

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672563#comment-16672563
 ] 

ASF GitHub Bot commented on FLINK-10697:


bowenli86 opened a new pull request #6997: [FLINK-10697] Create an in-memory 
catalog that stores Flink's meta objects
URL: https://github.com/apache/flink/pull/6997
 
 
   ## What is the purpose of the change
   
   Currently all Flink meta objects (currently tables only) are temporarily 
stored in memory as part of the Calcite catalog. As we introduce a catalog for 
those objects (tables, views, and UDFs), it makes sense to organize them neatly. 
Further, having a catalog implementation that stores those objects in memory 
retains the current behavior, which can be configured by the user.
   
   `FlinkInMemoryCatalog` is similar to the existing `InMemoryExternalCatalog` 
class, which exists only for testing purposes. `FlinkInMemoryCatalog` requires a 
minimal implementation of the `CrudExternalCatalog` interface. For instance, it 
doesn’t need subcatalogs. While it’s possible to extract some common code from 
`InMemoryExternalCatalog` and let `InMemoryExternalCatalog` extend from 
`FlinkInMemoryCatalog`, it’s very likely the two will diverge at some point. 
Thus, it’s acceptable to have two independent classes from the very beginning.
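   
   As a generic illustration of the in-memory storage idea (the names below are 
illustrative, not the actual `CrudExternalCatalog` signatures):
   
```java
// Sketch only: an in-memory registry keyed by object name, mirroring the
// "store meta objects in memory" behavior described above.
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class InMemoryCatalogSketch<T> {
    private final ConcurrentMap<String, T> objects = new ConcurrentHashMap<>();

    public void create(String name, T metaObject, boolean ignoreIfExists) {
        if (objects.putIfAbsent(name, metaObject) != null && !ignoreIfExists) {
            throw new IllegalArgumentException("Object already exists: " + name);
        }
    }

    public Optional<T> get(String name) {
        return Optional.ofNullable(objects.get(name));
    }
}
```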
   
   ## Brief change log
   
   - added an in-memory impl of `CrudExternalCatalog` as `FlinkInMemoryCatalog` 
for production use
   - renamed `InMemoryExternalCatalog` to `TestingInMemoryCatalog` to clarify 
its usage scope
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as 
*InMemoryExternalCatalogTests*.
   
   ## Does this pull request potentially affect one of the following parts:
   
none
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (JavaDocs)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create an in-memory catalog that stores Flink's meta objects
> 
>
> Key: FLINK-10697
> URL: https://issues.apache.org/jira/browse/FLINK-10697
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently all Flink meta objects (currently tables only) are stored in memory 
> as part of the Calcite catalog. Some of those objects are temporary (such as 
> inline tables); others are meant to live beyond the user session. As we 
> introduce a catalog for those objects (tables, views, and UDFs), it makes 
> sense to organize them neatly. Further, having a catalog implementation that 
> stores those objects in memory retains the current behavior, which can be 
> configured by the user.
> Please note that this implementation is different from the current 
> {{InMemoryExternalCatalog}}, which is used mainly for testing and doesn't 
> reflect what's actually needed for Flink meta objects.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] bowenli86 opened a new pull request #6997: [FLINK-10697] Create an in-memory catalog that stores Flink's meta objects

2018-11-01 Thread GitBox
bowenli86 opened a new pull request #6997: [FLINK-10697] Create an in-memory 
catalog that stores Flink's meta objects
URL: https://github.com/apache/flink/pull/6997
 
 
   ## What is the purpose of the change
   
   Currently all Flink meta objects (currently tables only) are temporarily 
stored in memory as part of the Calcite catalog. As we introduce a catalog for 
those objects (tables, views, and UDFs), it makes sense to organize them neatly. 
Further, having a catalog implementation that stores those objects in memory 
retains the current behavior, which can be configured by the user.
   
   `FlinkInMemoryCatalog` is similar to the existing `InMemoryExternalCatalog` 
class, which exists only for testing purposes. `FlinkInMemoryCatalog` requires a 
minimal implementation of the `CrudExternalCatalog` interface. For instance, it 
doesn’t need subcatalogs. While it’s possible to extract some common code from 
`InMemoryExternalCatalog` and let `InMemoryExternalCatalog` extend from 
`FlinkInMemoryCatalog`, it’s very likely the two will diverge at some point. 
Thus, it’s acceptable to have two independent classes from the very beginning.
   
   ## Brief change log
   
   - added an in-memory impl of `CrudExternalCatalog` as `FlinkInMemoryCatalog` 
for production use
   - renamed `InMemoryExternalCatalog` to `TestingInMemoryCatalog` to clarify 
its usage scope
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as 
*InMemoryExternalCatalogTests*.
   
   ## Does this pull request potentially affect one of the following parts:
   
none
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (JavaDocs)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-10758) Refactor TableEnvironment so that all registration calls delegate to CatalogManager

2018-11-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10758:
---

 Summary: Refactor TableEnvironment so that all registration calls 
delegate to CatalogManager 
 Key: FLINK-10758
 URL: https://issues.apache.org/jira/browse/FLINK-10758
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


There are many different APIs defined in the {{TableEnvironment}} class that 
register tables/views/functions. Based on the design doc, those calls need to 
be delegated to {{CatalogManager}}. However, not all delegations are 
straightforward. For example, table registration could mean registering 
permanent tables, temp tables, or views. This JIRA takes care of the details.

Please refer to the "TableEnvironment Class" section in the design doc 
(attached to the parent task) for more details.
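
For illustration, a minimal sketch of the delegation idea ({{CatalogManager}}'s 
API was still being designed at this point, so the names below are purely 
hypothetical):

{code:java}
// Sketch only: the TableEnvironment forwards a registration call to the
// catalog manager instead of writing into the Calcite schema directly.
// CatalogManager and its registerTable method are hypothetical stand-ins.
interface CatalogManager {
    void registerTable(String name, Object table, boolean isTemporary);
}

class TableEnvironmentSketch {
    private final CatalogManager catalogManager;

    TableEnvironmentSketch(CatalogManager catalogManager) {
        this.catalogManager = catalogManager;
    }

    void registerTable(String name, Object table) {
        // tables registered inline via the API are session-scoped, hence temporary
        catalogManager.registerTable(name, table, true);
    }
}
{code}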



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10729) Create a Hive connector for Hive data access in Flink

2018-11-01 Thread Zhenqiu Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672530#comment-16672530
 ] 

Zhenqiu Huang commented on FLINK-10729:
---

[~xuefuz]

Agreed. If you want to use native formats in Hive, I feel an initial prototype 
is definitely needed for me to follow. For the direction of the implementation, 
we can probably also align with [~fhueske]. Looking forward to the design.

> Create a Hive connector for Hive data access in Flink
> -
>
> Key: FLINK-10729
> URL: https://issues.apache.org/jira/browse/FLINK-10729
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Affects Versions: 1.6.2
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> As part of the Flink-Hive integration effort, it's important for Flink to 
> access (read/write) Hive data, which is the responsibility of the Hive 
> connector. While there is an HCatalog data connector in the code base, it's 
> not complete (i.e., it's missing all connector-related classes such as 
> validators). Further, the HCatalog interface has many limitations, such as 
> accessing only a subset of Hive data and supporting only a subset of Hive 
> data types. In addition, it's not actively maintained; in fact, it's now only 
> a sub-project in Hive.
> Therefore, we propose here a complete connector set for Hive tables, not via 
> HCatalog, but via the direct Hive interface. The HCatalog connector will be 
> deprecated.
> Please note that the connector for Hive metadata is already covered in other 
> JIRAs, as {{HiveExternalCatalog}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10473) State TTL incremental cleanup using Heap backend key iterator

2018-11-01 Thread lingqi.lpf (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672526#comment-16672526
 ] 

lingqi.lpf commented on FLINK-10473:


Hi, does this issue have a detailed description?

> State TTL incremental cleanup using Heap backend key iterator
> -
>
> Key: FLINK-10473
> URL: https://issues.apache.org/jira/browse/FLINK-10473
> Project: Flink
>  Issue Type: New Feature
>  Components: State Backends, Checkpointing
>Affects Versions: 1.7.0
>Reporter: Andrey Zagrebin
>Assignee: Andrey Zagrebin
>Priority: Major
> Fix For: 1.7.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-10729) Create a Hive connector for Hive data access in Flink

2018-11-01 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned FLINK-10729:
---

Assignee: Xuefu Zhang  (was: Zhenqiu Huang)

> Create a Hive connector for Hive data access in Flink
> -
>
> Key: FLINK-10729
> URL: https://issues.apache.org/jira/browse/FLINK-10729
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Affects Versions: 1.6.2
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> As part of the Flink-Hive integration effort, it's important for Flink to 
> access (read/write) Hive data, which is the responsibility of the Hive 
> connector. While there is an HCatalog data connector in the code base, it's 
> not complete (i.e., it's missing all connector-related classes such as 
> validators). Further, the HCatalog interface has many limitations, such as 
> accessing only a subset of Hive data and supporting only a subset of Hive 
> data types. In addition, it's not actively maintained; in fact, it's now only 
> a sub-project in Hive.
> Therefore, we propose here a complete connector set for Hive tables, not via 
> HCatalog, but via the direct Hive interface. The HCatalog connector will be 
> deprecated.
> Please note that the connector for Hive metadata is already covered in other 
> JIRAs, as {{HiveExternalCatalog}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10729) Create a Hive connector for Hive data access in Flink

2018-11-01 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672525#comment-16672525
 ] 

Xuefu Zhang commented on FLINK-10729:
-

Hi [~ZhenqiuHuang], I did some initial research and found this is much more 
involved than it appears on the surface. It seems that the complexity comes 
mostly from the Hive side, and we will probably need a prototype and a design. 
I will spend more time on this. Thus, if you don't mind, I'm going to assign 
this back to myself.

Once the design is completed, I'm sure there will be many subtasks to be 
created. I'd appreciate it if you could help at that time.

Thanks.

> Create a Hive connector for Hive data access in Flink
> -
>
> Key: FLINK-10729
> URL: https://issues.apache.org/jira/browse/FLINK-10729
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Affects Versions: 1.6.2
>Reporter: Xuefu Zhang
>Assignee: Zhenqiu Huang
>Priority: Major
>
> As part of the Flink-Hive integration effort, it's important for Flink to 
> access (read/write) Hive data, which is the responsibility of the Hive 
> connector. While there is an HCatalog data connector in the code base, it's 
> not complete (i.e., it's missing all connector-related classes such as 
> validators). Further, the HCatalog interface has many limitations, such as 
> accessing only a subset of Hive data and supporting only a subset of Hive 
> data types. In addition, it's not actively maintained; in fact, it's now only 
> a sub-project in Hive.
> Therefore, we propose here a complete connector set for Hive tables, not via 
> HCatalog, but via the direct Hive interface. The HCatalog connector will be 
> deprecated.
> Please note that the connector for Hive metadata is already covered in other 
> JIRAs, as {{HiveExternalCatalog}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10742) Let Netty use Flink's buffers directly in credit-based mode

2018-11-01 Thread Yun Gao (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672521#comment-16672521
 ] 

Yun Gao commented on FLINK-10742:
-

Hi Nico, I just assigned this JIRA to myself temporarily, and I will first 
propose a design doc as soon as possible in the next several days. Let me know 
if this conflicts with your previous plan. :)

> Let Netty use Flink's buffers directly in credit-based mode
> ---
>
> Key: FLINK-10742
> URL: https://issues.apache.org/jira/browse/FLINK-10742
> Project: Flink
>  Issue Type: Sub-task
>  Components: Network
>Affects Versions: 1.7.0
>Reporter: Nico Kruber
>Priority: Major
> Fix For: 1.8.0
>
>
> For credit-based flow control, we always have buffers available for data that 
> is sent to us. We could thus use them directly and not copy the network 
> stream into Netty buffers first and then into our buffers.
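
For illustration, a sketch of the copy this would save, with plain NIO types 
standing in for Netty's ByteBuf and Flink's network Buffer:

{code:java}
// Sketch only: today the network bytes land in a Netty buffer first and are
// then copied into a Flink buffer. Since credit-based flow control guarantees
// the Flink buffer is available up front, the channel could read into it
// directly and drop the intermediate copy.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

class DirectReadSketch {
    static int readDirectly(ReadableByteChannel channel, ByteBuffer flinkBuffer)
            throws IOException {
        return channel.read(flinkBuffer); // no Netty-buffer detour
    }
}
{code}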



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-10742) Let Netty use Flink's buffers directly in credit-based mode

2018-11-01 Thread Yun Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao reassigned FLINK-10742:
---

Assignee: Yun Gao

> Let Netty use Flink's buffers directly in credit-based mode
> ---
>
> Key: FLINK-10742
> URL: https://issues.apache.org/jira/browse/FLINK-10742
> Project: Flink
>  Issue Type: Sub-task
>  Components: Network
>Affects Versions: 1.7.0
>Reporter: Nico Kruber
>Assignee: Yun Gao
>Priority: Major
> Fix For: 1.8.0
>
>
> For credit-based flow control, we always have buffers available for data that 
> is sent to us. We could thus use them directly and not copy the network 
> stream into Netty buffers first and then into our buffers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10205) Batch Job: InputSplit Fault tolerant for DataSourceTask

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672512#comment-16672512
 ] 

ASF GitHub Bot commented on FLINK-10205:


wenlong88 commented on a change in pull request #6684: [FLINK-10205] Batch 
Job: InputSplit Fault tolerant for DataSource…
URL: https://github.com/apache/flink/pull/6684#discussion_r230257155
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.java
 ##
 @@ -249,6 +254,19 @@ public CoLocationConstraint getLocationConstraint() {
return locationConstraint;
}
 
+   public InputSplit getNextInputSplit(int index, String host) {
+   final int taskId = this.getParallelSubtaskIndex();
+   synchronized (this.inputSplits) {
+   if (index < this.inputSplits.size()) {
+   return this.inputSplits.get(index);
 
 Review comment:
   there is no need to try to get the input split from the input split history.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Batch Job: InputSplit Fault tolerant for DataSourceTask
> ---
>
> Key: FLINK-10205
> URL: https://issues.apache.org/jira/browse/FLINK-10205
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager
>Affects Versions: 1.6.1, 1.6.2, 1.7.0
>Reporter: JIN SUN
>Assignee: JIN SUN
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Today, DataSourceTask pulls InputSplits from the JobManager to achieve better 
> performance; however, when a DataSourceTask fails and reruns, it will not get 
> the same splits as its previous attempt. This will introduce inconsistent 
> results or even data corruption.
> Furthermore, if two executions run at the same time (in a batch scenario), 
> these two executions should process the same splits.
> We need to fix this issue to make the inputs of a DataSourceTask 
> deterministic. The proposal is to save all splits in the ExecutionVertex so 
> that the DataSourceTask will pull splits from there.
>  document:
> [https://docs.google.com/document/d/1FdZdcA63tPUEewcCimTFy9Iz2jlVlMRANZkO4RngIuk/edit?usp=sharing]
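
For illustration, a minimal sketch of the proposed bookkeeping, modeled on the 
PR excerpt above (the method shape is illustrative; 
{{InputSplitAssigner#getNextInputSplit(host, taskId)}} is the existing assigner 
call):

{code:java}
// Sketch only: the ExecutionVertex records every split it hands out, so a
// restarted attempt asking for split #index receives exactly the same split.
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.core.io.InputSplit;
import org.apache.flink.core.io.InputSplitAssigner;

class SplitHistorySketch {
    private final List<InputSplit> inputSplits = new ArrayList<>();

    InputSplit getNextInputSplit(int index, String host, int taskId,
                                 InputSplitAssigner assigner) {
        synchronized (inputSplits) {
            if (index < inputSplits.size()) {
                return inputSplits.get(index); // replay a previously assigned split
            }
            InputSplit next = assigner.getNextInputSplit(host, taskId);
            if (next != null) {
                inputSplits.add(next);         // remember it for future retries
            }
            return next;
        }
    }
}
{code}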



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] wenlong88 commented on a change in pull request #6684: [FLINK-10205] Batch Job: InputSplit Fault tolerant for DataSource…

2018-11-01 Thread GitBox
wenlong88 commented on a change in pull request #6684: [FLINK-10205] Batch 
Job: InputSplit Fault tolerant for DataSource…
URL: https://github.com/apache/flink/pull/6684#discussion_r230257155
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.java
 ##
 @@ -249,6 +254,19 @@ public CoLocationConstraint getLocationConstraint() {
return locationConstraint;
}
 
+   public InputSplit getNextInputSplit(int index, String host) {
+   final int taskId = this.getParallelSubtaskIndex();
+   synchronized (this.inputSplits) {
+   if (index < this.inputSplits.size()) {
+   return this.inputSplits.get(index);
 
 Review comment:
   there is no need to try to get the input split from the input split history.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10711) flink-end-to-end-tests can fail silently

2018-11-01 Thread Hequn Cheng (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672495#comment-16672495
 ] 

Hequn Cheng commented on FLINK-10711:
-

[~twalthr] Hi, I haven't started yet. Take it as you wish. Feel free to let me 
know if there is anything I can help with.
[~dawidwys] I agree with you. Enabling it on a per-test basis is a good idea.

> flink-end-to-end-tests can fail silently
> 
>
> Key: FLINK-10711
> URL: https://issues.apache.org/jira/browse/FLINK-10711
> Project: Flink
>  Issue Type: Bug
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Piotr Nowojski
>Assignee: Hequn Cheng
>Priority: Blocker
> Fix For: 1.7.0
>
>
> Because they are written in Bash and do not set
> {code:bash}
> set -e
> {code}
> at the beginning, errors can be swallowed silently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10507) Set target parallelism to maximum when using the standalone job cluster mode

2018-11-01 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672445#comment-16672445
 ] 

vinoyang commented on FLINK-10507:
--

[~mikemintz2] This is just a plan. But since 1.7.0 no longer accepts new 
features, this feature will be released after 1.7.0.

> Set target parallelism to maximum when using the standalone job cluster mode
> 
>
> Key: FLINK-10507
> URL: https://issues.apache.org/jira/browse/FLINK-10507
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: vinoyang
>Priority: Major
> Fix For: 1.7.0
>
>
> In order to enable the reactive container mode, we should set the target 
> value to the maximum parallelism if we run in standalone job cluster mode. 
> That way, we will always use all available resources and scale up if new 
> resources are being added.
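
For illustration, a rough sketch of the idea using {{JobGraph}}'s existing 
parallelism setters (the actual mechanism may differ):

{code:java}
// Sketch only: in standalone job cluster mode, raise every vertex to its
// max parallelism so newly added resources are put to use.
import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.runtime.jobgraph.JobVertex;

class MaxParallelismSketch {
    static void scaleToMax(JobGraph jobGraph) {
        for (JobVertex vertex : jobGraph.getVertices()) {
            if (vertex.getMaxParallelism() > 0) { // -1 means "not configured"
                vertex.setParallelism(vertex.getMaxParallelism());
            }
        }
    }
}
{code}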



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] isunjin commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault tolerant for DataSource…

2018-11-01 Thread GitBox
isunjin commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault 
tolerant for DataSource…
URL: https://github.com/apache/flink/pull/6684#issuecomment-435235888
 
 
   Thanks @tillrohrmann and @StefanRRichter. I pushed the implementation of 
```return InputSplits to the InputSplitAssigner```. 
   
   The ```InputSplit``` will be returned to the ```InputSplitAssigner``` if the 
task fails to process it. 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10205) Batch Job: InputSplit Fault tolerant for DataSourceTask

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672420#comment-16672420
 ] 

ASF GitHub Bot commented on FLINK-10205:


isunjin commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault 
tolerant for DataSource…
URL: https://github.com/apache/flink/pull/6684#issuecomment-435235888
 
 
   Thanks @tillrohrmann and @StefanRRichter. I pushed the implementation of 
```return InputSplits to the InputSplitAssigner```. 
   
   The ```InputSplit``` will be returned to the ```InputSplitAssigner``` if the 
task fails to process it. 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Batch Job: InputSplit Fault tolerant for DataSourceTask
> ---
>
> Key: FLINK-10205
> URL: https://issues.apache.org/jira/browse/FLINK-10205
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager
>Affects Versions: 1.6.1, 1.6.2, 1.7.0
>Reporter: JIN SUN
>Assignee: JIN SUN
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Today, DataSourceTask pulls InputSplits from the JobManager to achieve better 
> performance; however, when a DataSourceTask fails and reruns, it will not get 
> the same splits as its previous attempt. This will introduce inconsistent 
> results or even data corruption.
> Furthermore, if two executions run at the same time (in a batch scenario), 
> these two executions should process the same splits.
> We need to fix this issue to make the inputs of a DataSourceTask 
> deterministic. The proposal is to save all splits in the ExecutionVertex so 
> that the DataSourceTask will pull splits from there.
>  document:
> [https://docs.google.com/document/d/1FdZdcA63tPUEewcCimTFy9Iz2jlVlMRANZkO4RngIuk/edit?usp=sharing]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10756) TaskManagerProcessFailureBatchRecoveryITCase did not finish on time

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10756:
-
Affects Version/s: 1.6.2

> TaskManagerProcessFailureBatchRecoveryITCase did not finish on time
> ---
>
> Key: FLINK-10756
> URL: https://issues.apache.org/jira/browse/FLINK-10756
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Bowen Li
>Priority: Major
> Fix For: 1.7.0
>
>
> {code:java}
> Failed tests: 
>   
> TaskManagerProcessFailureBatchRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:207
>  The program did not finish in time
> {code}
> https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10757) TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10757:
-
Component/s: Tests

> TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the 
> cluster entrypoint StandaloneSessionClusterEntrypoint.
> 
>
> Key: FLINK-10757
> URL: https://issues.apache.org/jira/browse/FLINK-10757
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Bowen Li
>Priority: Major
> Fix For: 1.7.0
>
>
> {code:java}
> Failed tests: 
>   ...
>  
> TaskManagerProcessFailureStreamingRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:223
>  Failed to initialize the cluster entrypoint 
> StandaloneSessionClusterEntrypoint.
> Tests in error: 
> {code}
> https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10757) TaskManagerProcessFailureStreamingRecoveryITCase failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.

2018-11-01 Thread Bowen Li (JIRA)
Bowen Li created FLINK-10757:


 Summary: TaskManagerProcessFailureStreamingRecoveryITCase failed 
to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.
 Key: FLINK-10757
 URL: https://issues.apache.org/jira/browse/FLINK-10757
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.6.2, 1.7.0
Reporter: Bowen Li
 Fix For: 1.7.0


{code:java}
Failed tests: 
  ...
 
TaskManagerProcessFailureStreamingRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:223
 Failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.
Tests in error: 
{code}


https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10756) TaskManagerProcessFailureBatchRecoveryITCase did not finish on time

2018-11-01 Thread Bowen Li (JIRA)
Bowen Li created FLINK-10756:


 Summary: TaskManagerProcessFailureBatchRecoveryITCase did not 
finish on time
 Key: FLINK-10756
 URL: https://issues.apache.org/jira/browse/FLINK-10756
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.7.0
Reporter: Bowen Li
 Fix For: 1.7.0


{code:java}
Failed tests: 
  
TaskManagerProcessFailureBatchRecoveryITCase>AbstractTaskManagerProcessFailureRecoveryTest.testTaskManagerProcessFailure:207
 The program did not finish in time
{code}

https://travis-ci.org/apache/flink/jobs/449439623



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10689) Port UDFs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672359#comment-16672359
 ] 

Bowen Li commented on FLINK-10689:
--

Thanks [~twalthr]! I read the code, and your comment makes sense.

As a result of the discussion to parallelize the work, I re-scoped this subtask 
to cover only UDFs and created FLINK-10755 to migrate external catalogs. 
[~xueyu] This task can now proceed without any dependencies.

An alternative that not only unblocks the Hive catalog work but also doesn't 
introduce much migration overhead for new code is that any new code/feature or 
necessary refactoring related to Hive should be written in Java and placed in 
{{flink-table}}'s Java package.


> Port UDFs in Table API extension points to flink-table-common
> -
>
> Key: FLINK-10689
> URL: https://issues.apache.org/jira/browse/FLINK-10689
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: xueyu
>Priority: Major
> Fix For: 1.8.0
>
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting UDFs. This JIRA does NOT depend on FLINK-16088, so 
> it can be started at any time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-5697) Add per-shard watermarks for FlinkKinesisConsumer

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672295#comment-16672295
 ] 

ASF GitHub Bot commented on FLINK-5697:
---

tweise commented on issue #6980: [FLINK-5697] [kinesis] Add periodic per-shard 
watermark support
URL: https://github.com/apache/flink/pull/6980#issuecomment-435206371
 
 
   @EronWright that's correct and I will make sure to document this. Even our 
planned follow-up work won't be able to address such resharding scenario. I 
think we will only be able to address that with the new source design that is 
currently under discussion (which should permit centralized discovery and more 
sophisticated splitting/shard distribution).  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add per-shard watermarks for FlinkKinesisConsumer
> -
>
> Key: FLINK-5697
> URL: https://issues.apache.org/jira/browse/FLINK-5697
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector, Streaming Connectors
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Thomas Weise
>Priority: Major
>  Labels: pull-request-available
>
> It would be nice to let the Kinesis consumer be on-par in functionality with 
> the Kafka consumer, since they share very similar abstractions. Per-partition 
> / shard watermarks is something we can add also to the Kinesis consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672290#comment-16672290
 ] 

ASF GitHub Bot commented on FLINK-10696:


bowenli86 commented on a change in pull request #6970: [FLINK-10696][Table API 
& SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and 
InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970#discussion_r230216165
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/catalog/InMemoryExternalCatalog.scala
 ##
 @@ -98,6 +101,77 @@ class InMemoryExternalCatalog(name: String) extends 
CrudExternalCatalog {
 }
   }
 
+  @throws[ViewAlreadyExistException]
+  override def createView(
+viewName: String,
+view: String,
 
 Review comment:
   sounds good!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add APIs to ExternalCatalog for views and UDFs
> --
>
> Key: FLINK-10696
> URL: https://issues.apache.org/jira/browse/FLINK-10696
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently there are APIs for tables only. However, views and UDFs are also 
> common objects in a catalog.
> This is required when we store Flink tables/views/UDFs in an external 
> persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
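
For readers following along, here is a minimal, hypothetical sketch of how the 
view/UDF API proposed in this PR could be exercised from Java. The catalog 
name, the view query, and the {{MyUpper}} function are illustrative only, and 
the API shape may still change during review:

{code:java}
import org.apache.flink.table.catalog.InMemoryExternalCatalog;
import org.apache.flink.table.functions.ScalarFunction;

public class CatalogViewUdfExample {

    // Hypothetical UDF, used only for illustration.
    public static class MyUpper extends ScalarFunction {
        public String eval(String s) {
            return s == null ? null : s.toUpperCase();
        }
    }

    public static void main(String[] args) throws Exception {
        InMemoryExternalCatalog catalog = new InMemoryExternalCatalog("demo");

        // In this PR, a view is registered as its defining query (a plain string).
        catalog.createView(
            "highValueOrders", "SELECT * FROM orders WHERE amount > 100", false);

        // A UDF is registered as a UserDefinedFunction instance.
        catalog.createFunction("myUpper", new MyUpper(), false);

        // With ignoreIfExists = false, registering the same name again would
        // throw ViewAlreadyExistException / FunctionAlreadyExistException.
    }
}
{code}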


[jira] [Commented] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672289#comment-16672289
 ] 

ASF GitHub Bot commented on FLINK-10696:


bowenli86 commented on a change in pull request #6970: [FLINK-10696][Table API 
& SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and 
InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970#discussion_r230216165
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/catalog/InMemoryExternalCatalog.scala
 ##
 @@ -98,6 +101,77 @@ class InMemoryExternalCatalog(name: String) extends 
CrudExternalCatalog {
 }
   }
 
+  @throws[ViewAlreadyExistException]
+  override def createView(
+viewName: String,
+view: String,
 
 Review comment:
   sounds good!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add APIs to ExternalCatalog for views and UDFs
> --
>
> Key: FLINK-10696
> URL: https://issues.apache.org/jira/browse/FLINK-10696
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently there are APIs for tables only. However, views and UDFs are also 
> common objects in a catalog.
> This is required when we store Flink tables/views/UDFs in an external 
> persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-5697) Add per-shard watermarks for FlinkKinesisConsumer

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672285#comment-16672285
 ] 

ASF GitHub Bot commented on FLINK-5697:
---

EronWright commented on issue #6980: [FLINK-5697] [kinesis] Add periodic 
per-shard watermark support
URL: https://github.com/apache/flink/pull/6980#issuecomment-435202809
 
 
   There is a caveat with this implementation that the docs should perhaps 
mention.  The caveat is that it may produce spurious late events when 
processing a backlog of data.
   
   Here's an example of when that may occur.  Imagine that subtask 1 is 
processing shard A and subtask 2 is processing shard B.  Shard A has reached 
6:00 in event time (as per the assigner), and so the subtask emits the 
corresponding watermark.  At this point, the subtask has made the irrevocable 
assertion that subsequent events will be past 6:00.   Meanwhile, Shard B is at 
5:30 and undergoes a split into C/D.  If either shard is subsequently assigned 
to subtask 1, the events will be considered late due to the assertion made 
earlier.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add per-shard watermarks for FlinkKinesisConsumer
> -
>
> Key: FLINK-5697
> URL: https://issues.apache.org/jira/browse/FLINK-5697
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector, Streaming Connectors
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Thomas Weise
>Priority: Major
>  Labels: pull-request-available
>
> It would be nice to let the Kinesis consumer be on-par in functionality with 
> the Kafka consumer, since they share very similar abstractions. Per-partition 
> / shard watermarks is something we can add also to the Kinesis consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
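
To make the caveat above concrete, the following is a purely illustrative 
sketch (not code from the PR) of the per-subtask invariant involved: once a 
subtask has emitted a watermark, any record with a smaller or equal timestamp 
that later arrives on that subtask, e.g. from a newly assigned child shard, is 
late by definition:

{code:java}
// Illustrative only: models the watermark contract of a single source subtask.
public class WatermarkInvariantExample {

    public static void main(String[] args) {
        long emittedWatermark = Long.MIN_VALUE;

        // Shard A reaches 6:00 in event time; subtask 1 emits watermark 6:00,
        // asserting that no element with timestamp <= 6:00 will follow.
        long sixOClock = 6 * 60 * 60 * 1000L;
        emittedWatermark = Math.max(emittedWatermark, sixOClock);

        // Shard B is at 5:30 and splits into C/D; child shard C is later
        // assigned to subtask 1 and delivers a record stamped 5:30.
        long recordFromShardC = (5 * 60 + 30) * 60 * 1000L;

        // The earlier assertion cannot be revoked, so the record is late.
        boolean isLate = recordFromShardC <= emittedWatermark;
        System.out.println("late event: " + isLate); // prints "late event: true"
    }
}
{code}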


[jira] [Commented] (FLINK-5697) Add per-shard watermarks for FlinkKinesisConsumer

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672248#comment-16672248
 ] 

ASF GitHub Bot commented on FLINK-5697:
---

tweise closed pull request #6980: [FLINK-5697] [kinesis] Add periodic per-shard 
watermark support
URL: https://github.com/apache/flink/pull/6980
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java
 
b/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java
index 407a5a95524..f0852584ade 100644
--- 
a/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java
+++ 
b/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java
@@ -31,6 +31,7 @@
 import org.apache.flink.runtime.state.FunctionInitializationContext;
 import org.apache.flink.runtime.state.FunctionSnapshotContext;
 import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
+import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
 import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
 import org.apache.flink.streaming.api.functions.source.SourceFunction;
 import 
org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;
@@ -78,6 +79,17 @@
  * A custom assigner implementation can be set via {@link 
#setShardAssigner(KinesisShardAssigner)} to optimize the
  * hash function or use static overrides to limit skew.
  *
+ * In order for the consumer to emit watermarks, a timestamp assigner needs 
to be set via {@link
+ * #setPeriodicWatermarkAssigner(AssignerWithPeriodicWatermarks)} and the auto 
watermark emit
+ * interval configured via {@link
+ * org.apache.flink.api.common.ExecutionConfig#setAutoWatermarkInterval(long)}.
+ *
+ * Watermarks can only advance when all shards of a subtask continuously 
deliver records. To
+ * avoid an inactive or closed shard to block the watermark progress, the idle 
timeout should be
+ * configured via configuration property {@link
+ * ConsumerConfigConstants#SHARD_IDLE_INTERVAL_MILLIS}. By default, shards 
won't be considered
+ * idle and watermark calculation will wait for newer records to arrive from 
all shards.
+ *
 * @param <T> the type of data emitted
  */
 @PublicEvolving
@@ -108,6 +120,8 @@
 */
private KinesisShardAssigner shardAssigner = 
KinesisDataFetcher.DEFAULT_SHARD_ASSIGNER;
 
+   private AssignerWithPeriodicWatermarks<T> periodicWatermarkAssigner;
+
// 

//  Runtime state
// 

@@ -220,6 +234,22 @@ public void setShardAssigner(KinesisShardAssigner 
shardAssigner) {
ClosureCleaner.clean(shardAssigner, true);
}
 
+   public AssignerWithPeriodicWatermarks<T> getPeriodicWatermarkAssigner() {
+   return periodicWatermarkAssigner;
+   }
+
+   /**
+* Set the assigner that will extract the timestamp from {@link T} and 
calculate the
+* watermark.
+*
+* @param periodicWatermarkAssigner
+*/
+   public void setPeriodicWatermarkAssigner(
+   AssignerWithPeriodicWatermarks<T> periodicWatermarkAssigner) {
+   this.periodicWatermarkAssigner = periodicWatermarkAssigner;
+   ClosureCleaner.clean(this.periodicWatermarkAssigner, true);
+   }
+
// 

//  Source life cycle
// 

@@ -414,7 +444,7 @@ public void snapshotState(FunctionSnapshotContext context) 
throws Exception {
Properties configProps,
KinesisDeserializationSchema deserializationSchema) {
 
-   return new KinesisDataFetcher<>(streams, sourceContext, 
runtimeContext, configProps, deserializationSchema, shardAssigner);
+   return new KinesisDataFetcher<>(streams, sourceContext, 
runtimeContext, configProps, deserializationSchema, shardAssigner, 
periodicWatermarkAssigner);
}
 
@VisibleForTesting
diff --git 
a/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/config/ConsumerConfigConstants.java
 
b/flink-connectors/flink-connector-kinesis/src/main/java/org/apache

[jira] [Commented] (FLINK-5697) Add per-shard watermarks for FlinkKinesisConsumer

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672249#comment-16672249
 ] 

ASF GitHub Bot commented on FLINK-5697:
---

tweise opened a new pull request #6980: [FLINK-5697] [kinesis] Add periodic 
per-shard watermark support
URL: https://github.com/apache/flink/pull/6980
 
 
   
   
   ## What is the purpose of the change
   
   Adds support for periodic per-shard watermarks to the Kinesis consumer. This 
functionality is off by default and can be enabled by setting an optional 
watermark assigner on the consumer. When enabled, the watermarking also 
optionally supports idle shard detection based on configurable interval of 
inactivity.
   
   ## Brief change log
   
   * Add watermark assigner to consumer
   * Modify data fetcher to track watermark state per shard
   * Modify emitRecordAndUpdateState to extract timestamp and update watermark
   * Timer driven periodic watermark emit
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   Added a unit test and planning to add more test coverage with subsequent 
work for shared watermark state and emit queue as discussed on ML. This change 
is ported from Lyft internal codebase that is used in production. 
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / *no* / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**yes** / no)
 - If yes, how is the feature documented? (not applicable / docs / 
**JavaDocs** / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add per-shard watermarks for FlinkKinesisConsumer
> -
>
> Key: FLINK-5697
> URL: https://issues.apache.org/jira/browse/FLINK-5697
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector, Streaming Connectors
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Thomas Weise
>Priority: Major
>  Labels: pull-request-available
>
> It would be nice to let the Kinesis consumer be on-par in functionality with 
> the Kafka consumer, since they share very similar abstractions. Per-partition 
> / shard watermarks is something we can add also to the Kinesis consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
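
Based on the Javadoc added in the diff above, a minimal usage sketch could look 
as follows. The stream name, region, idle interval, and the 10-second 
out-of-orderness bound are placeholder choices, and the timestamp parsing is a 
stand-in for application logic:

{code:java}
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

public class KinesisWatermarkExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        // Watermarks are emitted on this interval; without an assigner the feature stays off.
        env.getConfig().setAutoWatermarkInterval(1000L);

        Properties config = new Properties();
        config.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
        // Assumption for this sketch: treat a shard as idle after 60s without
        // records so it does not hold back the watermark.
        config.setProperty(ConsumerConfigConstants.SHARD_IDLE_INTERVAL_MILLIS, "60000");

        FlinkKinesisConsumer<String> consumer =
            new FlinkKinesisConsumer<>("myStream", new SimpleStringSchema(), config);

        // Opt in to per-shard watermarks by setting a periodic assigner.
        consumer.setPeriodicWatermarkAssigner(
            new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(10)) {
                @Override
                public long extractTimestamp(String element) {
                    // Placeholder: parse the event-time timestamp from the record.
                    return Long.parseLong(element.split(",")[0]);
                }
            });

        env.addSource(consumer).print();
        env.execute("kinesis-per-shard-watermarks");
    }
}
{code}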


[jira] [Issue Comment Deleted] (FLINK-10686) Introduce a flink-table-common module

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10686:
-
Comment: was deleted

(was: How about parallelizing the subtasks of FLINK-10688 and FLINK-10689?

Given that FLINK-10687 is done and {{flink-table-common}} is created, rather 
than waiting for FLINK-10688 and being blocked, I think a better way to make 
progress is:
 * finish this subtask first by porting {{org.apache.flink.table.catalog}} and 
{{org.apache.flink.table.functions}} to {{flink-table-common}}
 * let {{flink-connectors}} temporarily depend on both {{flink-table}} and 
{{flink-table-common}}
 * as part of FLINK-10688, remove {{flink-connectors}}'s dependency on 
{{flink-table}}.

The reason is that the community is starting to work on Flink-Hive integration 
and external catalogs. Since we've already decided to move the UDF/catalog APIs 
to Java, I don't think writing new Scala code and then porting it to Java, 
which is cumbersome and time-consuming, is a good option. I'd rather port the 
existing code to Java first and then start writing all new code/features. With 
the approach I proposed, we can parallelize the work, and FLINK-10689 won't get 
blocked by FLINK-10688.

What do you think? [~twalthr])

> Introduce a flink-table-common module
> -
>
> Key: FLINK-10686
> URL: https://issues.apache.org/jira/browse/FLINK-10686
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.7.0
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
> Fix For: 1.8.0
>
>
> Because more and more table factories for connectors and formats are added 
> and external catalog support is also on the horizon, {{flink-table}} becomes 
> a dependency for many Flink modules. Since {{flink-table}} is implemented in 
> Scala it requires other modules to be suffixed with Scala prefixes. However, 
> as we have learned in the past, Scala code is hard to maintain which is why 
> our long-term goal is to avoid Scala/Scala dependencies.
> Therefore we propose a new module {{flink-table-common}} that contains 
> interfaces between {{flink-table}} and other modules. This module is 
> implemented in Java and should contain minimal (or better no) external 
> dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10755) Port external catalogs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10755:
-
Fix Version/s: 1.8.0

> Port external catalogs in Table API extension points to flink-table-common
> --
>
> Key: FLINK-10755
> URL: https://issues.apache.org/jira/browse/FLINK-10755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.8.0
>
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting external catalogs. This jira depends on 
> FLINK-10688.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10689) Port UDFs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10689:
-
Fix Version/s: 1.8.0

> Port UDFs in Table API extension points to flink-table-common
> -
>
> Key: FLINK-10689
> URL: https://issues.apache.org/jira/browse/FLINK-10689
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: xueyu
>Priority: Major
> Fix For: 1.8.0
>
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting UDFs. This jira does NOT depend on FLINK-10688, so 
> it can be started at any time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10686) Introduce a flink-table-common module

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10686:
-
Fix Version/s: (was: 1.7.0)
   1.8.0

> Introduce a flink-table-common module
> -
>
> Key: FLINK-10686
> URL: https://issues.apache.org/jira/browse/FLINK-10686
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.7.0
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
> Fix For: 1.8.0
>
>
> Because more and more table factories for connectors and formats are added 
> and external catalog support is also on the horizon, {{flink-table}} becomes 
> a dependency for many Flink modules. Since {{flink-table}} is implemented in 
> Scala it requires other modules to be suffixed with Scala prefixes. However, 
> as we have learned in the past, Scala code is hard to maintain which is why 
> our long-term goal is to avoid Scala/Scala dependencies.
> Therefore we propose a new module {{flink-table-common}} that contains 
> interfaces between {{flink-table}} and other modules. This module is 
> implemented in Java and should contain minimal (or better no) external 
> dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10755) Port external catalogs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10755:
-
Description: 
After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
remaining extension points of the Table API to flink-table-common. This 
includes interfaces for UDFs and the external catalog interface.

This ticket is for porting external catalogs. This jira depends on FLINK-10688.

  was:
After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
remaining extension points of the Table API to flink-table-common. This 
includes interfaces for UDFs and the external catalog interface.

This ticket is for porting UDFs. This jira does NOT depend on FLINK-10688, so it 
can be started at any time.


> Port external catalogs in Table API extension points to flink-table-common
> --
>
> Key: FLINK-10755
> URL: https://issues.apache.org/jira/browse/FLINK-10755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Bowen Li
>Priority: Major
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting external catalogs. This jira depends on 
> FLINK-10688.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-10755) Port external catalogs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li reassigned FLINK-10755:


Assignee: Bowen Li

> Port external catalogs in Table API extension points to flink-table-common
> --
>
> Key: FLINK-10755
> URL: https://issues.apache.org/jira/browse/FLINK-10755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting external catalogs. This jira depends on 
> FLINK-10688.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-10755) Port external catalogs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li reassigned FLINK-10755:


Assignee: (was: xueyu)

> Port external catalogs in Table API extension points to flink-table-common
> --
>
> Key: FLINK-10755
> URL: https://issues.apache.org/jira/browse/FLINK-10755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Bowen Li
>Priority: Major
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting UDFs. This jira does NOT depend on FLINK-10688, so 
> it can be started at any time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10755) Port external catalogs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)
Bowen Li created FLINK-10755:


 Summary: Port external catalogs in Table API extension points to 
flink-table-common
 Key: FLINK-10755
 URL: https://issues.apache.org/jira/browse/FLINK-10755
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Bowen Li
Assignee: xueyu


After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
remaining extension points of the Table API to flink-table-common. This 
includes interfaces for UDFs and the external catalog interface.

This ticket is for porting UDFs. This jira does NOT depend on FLINK-10688, so it 
can be started at any time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10689) Port UDFs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10689:
-
Description: 
After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
remaining extension points of the Table API to flink-table-common. This 
includes interfaces for UDFs and the external catalog interface.

This ticket is for porting UDFs. This jira does NOT depend on FLINK-10688, so it 
can be started at any time.

  was:After FLINK-10687 and FLINK-10688 have been resolved, we should also port 
the remaining extension points of the Table API to flink-table-common. This 
includes interfaces for UDFs and the external catalog interface.


> Port UDFs in Table API extension points to flink-table-common
> -
>
> Key: FLINK-10689
> URL: https://issues.apache.org/jira/browse/FLINK-10689
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: xueyu
>Priority: Major
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.
> This ticket is for porting UDFs. This jira does NOT depend on FLINK-10688, so 
> it can be started at any time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10689) Port UDFs in Table API extension points to flink-table-common

2018-11-01 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-10689:
-
Summary: Port UDFs in Table API extension points to flink-table-common  
(was: Port Table API extension points to flink-table-common)

> Port UDFs in Table API extension points to flink-table-common
> -
>
> Key: FLINK-10689
> URL: https://issues.apache.org/jira/browse/FLINK-10689
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: xueyu
>Priority: Major
>
> After FLINK-10687 and FLINK-10688 have been resolved, we should also port the 
> remaining extension points of the Table API to flink-table-common. This 
> includes interfaces for UDFs and the external catalog interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672058#comment-16672058
 ] 

ASF GitHub Bot commented on FLINK-10696:


xuefuz commented on a change in pull request #6970: [FLINK-10696][Table API & 
SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and 
InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970#discussion_r230161785
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/catalog/InMemoryExternalCatalog.scala
 ##
 @@ -98,6 +101,77 @@ class InMemoryExternalCatalog(name: String) extends 
CrudExternalCatalog {
 }
   }
 
+  @throws[ViewAlreadyExistException]
+  override def createView(
+viewName: String,
+view: String,
 
 Review comment:
   I think it might be good to define a View class following the descriptor 
approach, though the only thing significant might be just a query as a string. 
However, this can be taken as a followup JIRA.
   
   I will update the doc accordingly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add APIs to ExternalCatalog for views and UDFs
> --
>
> Key: FLINK-10696
> URL: https://issues.apache.org/jira/browse/FLINK-10696
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently there are APIs for tables only. However, views and UDFs are also 
> common objects in a catalog.
> This is required when we store Flink tables/views/UDFs in an external 
> persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
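
Purely as an illustration of the descriptor idea suggested here (no such class 
exists in the PR), a View descriptor might carry little more than the defining 
query plus optional metadata, for example:

{code:java}
// Hypothetical sketch of a View descriptor; not part of the PR under review.
public final class View {

    private final String query;    // the defining SELECT statement
    private final String comment;  // optional human-readable description

    public View(String query, String comment) {
        if (query == null || query.isEmpty()) {
            throw new IllegalArgumentException("query must not be empty");
        }
        this.query = query;
        this.comment = comment;
    }

    public String getQuery() {
        return query;
    }

    public String getComment() {
        return comment;
    }
}
{code}

The advantage over a raw string is that further properties (for example, the 
default database the query was written against) could be added later without 
changing the {{createView}} signature.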


[jira] [Commented] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672037#comment-16672037
 ] 

ASF GitHub Bot commented on FLINK-10696:


xuefuz commented on a change in pull request #6970: [FLINK-10696][Table API & 
SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and 
InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970#discussion_r230161785
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/catalog/InMemoryExternalCatalog.scala
 ##
 @@ -98,6 +101,77 @@ class InMemoryExternalCatalog(name: String) extends 
CrudExternalCatalog {
 }
   }
 
+  @throws[ViewAlreadyExistException]
+  override def createView(
+viewName: String,
+view: String,
 
 Review comment:
   I think it might be good to define a View class following the descriptor 
approach, though the only thing significant might be just a query as a string. 
I will update the doc accordingly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add APIs to ExternalCatalog for views and UDFs
> --
>
> Key: FLINK-10696
> URL: https://issues.apache.org/jira/browse/FLINK-10696
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently there are APIs for tables only. However, views and UDFs are also 
> common objects in a catalog.
> This is required when we store Flink tables/views/UDFs in an external 
> persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10507) Set target parallelism to maximum when using the standalone job cluster mode

2018-11-01 Thread Mike Mintz (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672027#comment-16672027
 ] 

Mike Mintz commented on FLINK-10507:


This ticket is marked for "Fix Version: 1.7.0" but I didn't see any activity 
around it in git. Will this actually be included in Flink 1.7?

> Set target parallelism to maximum when using the standalone job cluster mode
> 
>
> Key: FLINK-10507
> URL: https://issues.apache.org/jira/browse/FLINK-10507
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: vinoyang
>Priority: Major
> Fix For: 1.7.0
>
>
> In order to enable the reactive container mode, we should set the target 
> value to the maximum parallelism if we run in standalone job cluster mode. 
> That way, we will always use all available resources and scale up if new 
> resources are being added.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672018#comment-16672018
 ] 

ASF GitHub Bot commented on FLINK-10696:


xuefuz commented on a change in pull request #6970: [FLINK-10696][Table API & 
SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and 
InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970#discussion_r230159488
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/catalog/CrudExternalCatalog.scala
 ##
 @@ -103,4 +105,86 @@ trait CrudExternalCatalog extends ExternalCatalog {
   @throws[CatalogNotExistException]
   def alterSubCatalog(name: String, catalog: ExternalCatalog, 
ignoreIfNotExists: Boolean): Unit
 
+  /**
+* Adds a view to this catalog.
+*
+* @param viewName  The name of the view to add.
+* @param view  The view to add.
+* @param ignoreIfExists Flag to specify behavior if a view with the given 
name already exists:
+*   if set to false, throw an exception,
+*   if set to true, nothing happens.
+* @throws ViewAlreadyExistException thrown if view already exists and 
ignoreIfExists is false
+*/
+  @throws[ViewAlreadyExistException]
+  def createView(viewName: String, view: String, ignoreIfExists: Boolean): Unit
+
+  /**
+* Deletes a view from this catalog.
+*
+* @param viewName Name of the view to delete.
+* @param ignoreIfNotExists Flag to specify behavior if the view does not 
exist:
+*  if set to false, throw an exception,
+*  if set to true, nothing happens.
+* @throws ViewNotExistExceptionthrown if the view does not exist in 
the catalog
+*/
+  @throws[ViewNotExistException]
+  def dropView(viewName: String, ignoreIfNotExists: Boolean): Unit
+
+  /**
+* Modifies an existing view of this catalog.
+*
+* @param viewName The name of the view to modify.
+* @param view The new view which replaces the existing table.
+* @param ignoreIfNotExists Flag to specify behavior if the view does not 
exist:
+*  if set to false, throw an exception,
+*  if set to true, nothing happens.
+* @throws ViewNotExistException   thrown if the view does not exist in the 
catalog
+*/
+  @throws[ViewNotExistException]
+  def alterView(viewName: String, view: String, ignoreIfNotExists: Boolean): 
Unit
+
+  /**
+* Adds a UDF to this catalog.
+*
+* @param functionName  The name of the function to add.
+* @param function  The function to add.
+* @param ignoreIfExists Flag to specify behavior if function with the 
given name already exists:
+*   if set to false, throw an exception,
+*   if set to true, nothing happens.
+* @throws FunctionAlreadyExistException thrown if function already exists 
and ignoreIfExists
+*   is false
+*/
+  @throws[FunctionAlreadyExistException]
+  def createFunction(
+functionName: String,
+function: UserDefinedFunction,
+ignoreIfExists: Boolean): Unit
+
+  /**
+* Deletes a UDF from this catalog.
+*
+* @param functionName Name of the function to delete.
+* @param ignoreIfNotExists Flag to specify behavior if the function does 
not exist:
+*  if set to false, throw an exception,
+*  if set to true, nothing happens.
+* @throws FunctionNotExistExceptionthrown if the function does not 
exist in the catalog
+*/
+  @throws[FunctionNotExistException]
+  def dropFunction(functionName: String, ignoreIfNotExists: Boolean): Unit
+
+  /**
+* Modifies an existing UDF of this catalog.
+*
+* @param functionName The name of the function to modify.
+* @param function The new function which replaces the existing 
table.
 
 Review comment:
   Same typo as above


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add APIs to ExternalCatalog for views and UDFs
> --
>
> Key: FLINK-10696
> URL: https://issues.apache.org/jira/browse/FLINK-10696
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-av


[jira] [Commented] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672016#comment-16672016
 ] 

ASF GitHub Bot commented on FLINK-10696:


xuefuz commented on a change in pull request #6970: [FLINK-10696][Table API & 
SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and 
InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970#discussion_r230159003
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/catalog/CrudExternalCatalog.scala
 ##
 @@ -103,4 +105,86 @@ trait CrudExternalCatalog extends ExternalCatalog {
   @throws[CatalogNotExistException]
   def alterSubCatalog(name: String, catalog: ExternalCatalog, 
ignoreIfNotExists: Boolean): Unit
 
+  /**
+* Adds a view to this catalog.
+*
+* @param viewName  The name of the view to add.
+* @param view  The view to add.
+* @param ignoreIfExists Flag to specify behavior if a view with the given 
name already exists:
+*   if set to false, throw an exception,
+*   if set to true, nothing happens.
+* @throws ViewAlreadyExistException thrown if view already exists and 
ignoreIfExists is false
+*/
+  @throws[ViewAlreadyExistException]
+  def createView(viewName: String, view: String, ignoreIfExists: Boolean): Unit
+
+  /**
+* Deletes a view from this catalog.
+*
+* @param viewName Name of the view to delete.
+* @param ignoreIfNotExists Flag to specify behavior if the view does not 
exist:
+*  if set to false, throw an exception,
+*  if set to true, nothing happens.
+* @throws ViewNotExistException thrown if the view does not exist in 
the catalog
+*/
+  @throws[ViewNotExistException]
+  def dropView(viewName: String, ignoreIfNotExists: Boolean): Unit
+
+  /**
+* Modifies an existing view of this catalog.
+*
+* @param viewName The name of the view to modify.
+* @param view The new view which replaces the existing table.
 
 Review comment:
   Typo: table -> view


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add APIs to ExternalCatalog for views and UDFs
> --
>
> Key: FLINK-10696
> URL: https://issues.apache.org/jira/browse/FLINK-10696
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently there are APIs for tables only. However, views and UDFs are also 
> common objects in a catalog.
> This is required when we store Flink tables/views/UDFs in an external 
> persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
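
A minimal sketch of how the proposed view methods would be called, assuming the
PR's InMemoryCrudExternalCatalog changes land with the signatures quoted above;
the catalog class usage and the SQL strings here are illustrative only:

{code:scala}
import org.apache.flink.table.catalog.InMemoryExternalCatalog

object ViewCatalogExample {
  def main(args: Array[String]): Unit = {
    val catalog = new InMemoryExternalCatalog("demo")

    // Add a view; throws ViewAlreadyExistException if one already exists.
    catalog.createView(
      "hot_items", "SELECT * FROM items WHERE clicks > 100", ignoreIfExists = false)

    // Replace the view definition; throws ViewNotExistException if it is missing.
    catalog.alterView(
      "hot_items", "SELECT * FROM items WHERE clicks > 500", ignoreIfNotExists = false)

    // Remove the view; silently does nothing if it is already gone.
    catalog.dropView("hot_items", ignoreIfNotExists = true)
  }
}
{code}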



[jira] [Assigned] (FLINK-10490) OperatorSnapshotUtil should probably use SavepointV2Serializer

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-10490:
-

Assignee: Stefan Richter

> OperatorSnapshotUtil should probably use SavepointV2Serializer
> --
>
> Key: FLINK-10490
> URL: https://issues.apache.org/jira/browse/FLINK-10490
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> {{OperatorSnapshotUtil}} is used for testing savepoint migration. This 
> utility internally still uses {{SavepointV1Serializer}} and I would assume 
> that it should use {{SavepointV2Serializer}}. I wonder if that means that 
> some newer cases are actually not covered in the migration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10720) Add stress deployment end-to-end test

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671914#comment-16671914
 ] 

ASF GitHub Bot commented on FLINK-10720:


zentol commented on a change in pull request #6994: [FLINK-10720][tests] Add 
deployment end-to-end stress test with many …
URL: https://github.com/apache/flink/pull/6994#discussion_r230128699
 
 

 ##
 File path: flink-end-to-end-tests/test-scripts/test_heavy_deployment.sh
 ##
 @@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+source "$(dirname "$0")"/common.sh
+
+CHECKPOINT_DIR="file://$TEST_DATA_DIR/savepoint-e2e-test-chckpt-dir"
+
+TEST=flink-heavy-deployment-stress-test
+TEST_PROGRAM_NAME=HeavyDeploymentStressTestProgram
+TEST_PROGRAM_JAR=${END_TO_END_DIR}/$TEST/target/$TEST_PROGRAM_NAME.jar
+
+set_conf "taskmanager.numberOfTaskSlots" "13"
+
+start_cluster
+
+NUM_TMS=20
+
+#20 x 13 slots to support a parallelism of 256
+echo "Start $NUM_TMS more task managers"
+for i in `seq 1 $NUM_TMS`; do
 
 Review comment:
   this is likely to fail on travis without additional config tuning to reduce 
the memory consumption of each taskmanager. See the [high parallelism e2e 
test](https://github.com/apache/flink/blob/885640f781aa66359d929eb387f27a6024d75025/flink-end-to-end-tests/test-scripts/test_high_parallelism_iterations.sh)
 as an example.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add stress deployment end-to-end test
> -
>
> Key: FLINK-10720
> URL: https://issues.apache.org/jira/browse/FLINK-10720
> Project: Flink
>  Issue Type: Sub-task
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> In order to test Flink's scalability, I suggest to add an end-to-end test 
> which tests the deployment of a job which is very demanding. The job should 
> have large {{TaskDeploymentDescriptors}} (e.g. a job using union state or 
> having a high degree of parallelism). That way we can test that the 
> serialization overhead of the TDDs does not affect the health of the cluster 
> (e.g. heartbeats are not affected because the serialization does not happen 
> in the main thread).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
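
The config tuning zentol refers to would look roughly like the following,
mirroring the high parallelism e2e test; the keys follow the Flink 1.7
configuration, but the concrete values are illustrative guesses, not the final
script:

{code:bash}
# Shrink each TaskManager so that 20 of them fit on a single Travis worker,
# in addition to the existing 13-slot configuration.
set_conf "taskmanager.heap.mb" "256"
set_conf "taskmanager.memory.size" "8"
set_conf "taskmanager.network.memory.min" "8388608"
set_conf "taskmanager.network.memory.max" "8388608"
{code}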



[jira] [Commented] (FLINK-10720) Add stress deployment end-to-end test

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671910#comment-16671910
 ] 

ASF GitHub Bot commented on FLINK-10720:


zentol commented on a change in pull request #6994: [FLINK-10720][tests] Add 
deployment end-to-end stress test with many …
URL: https://github.com/apache/flink/pull/6994#discussion_r230126767
 
 

 ##
 File path: flink-end-to-end-tests/flink-heavy-deployment-stress-test/pom.xml
 ##
 @@ -0,0 +1,108 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Apache license header (XML comment stripped by the list archive) -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+	<modelVersion>4.0.0</modelVersion>
+
+	<parent>
+		<groupId>org.apache.flink</groupId>
+		<artifactId>flink-end-to-end-tests</artifactId>
+		<version>1.7-SNAPSHOT</version>
+		<relativePath>..</relativePath>
+	</parent>
+
+	<artifactId>flink-heavy-deployment-stress-test</artifactId>
+	<name>flink-heavy-deployment-stress-test</name>
+	<packaging>jar</packaging>
+
+	<dependencies>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-core</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-java</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-java_2.11</artifactId>
+			<version>1.7-SNAPSHOT</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-datastream-allround-test</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>junit</groupId>
+			<artifactId>junit</artifactId>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-test-utils_2.11</artifactId>
+			<version>1.7-SNAPSHOT</version>
+			<scope>compile</scope>
+		</dependency>
+		<dependency>
+			<groupId>log4j</groupId>
+			<artifactId>log4j</artifactId>
+		</dependency>
+	</dependencies>
+
+	<build>
+		<plugins>
+			<plugin>
+				<groupId>org.apache.maven.plugins</groupId>
+				<artifactId>maven-shade-plugin</artifactId>
+				<executions>
+					<execution>
+						<id>HeavyDeploymentStressTestProgram</id>
+						<phase>package</phase>
+						<goals>
+							<goal>shade</goal>
+						</goals>
+						<configuration>
+							<finalName>HeavyDeploymentStressTestProgram</finalName>
+							<!-- further configuration children (the element the review
+							     comment below asks about) were reduced to blank lines by
+							     the archive and are not recoverable here -->
 
 Review comment:
   what is this for?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add stress deployment end-to-end test
> -
>
> Key: FLINK-10720
> URL: https://issues.apache.org/jira/browse/FLINK-10720
> Project: Flink
>  Issue Type: Sub-task
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> In order to test Flink's scalability, I suggest to add an end-to-end test 
> which tests the deployment of a job which is very demanding. The job should 
> have large {{TaskDeploymentDescriptors}} (e.g. a job using union state or 
> having a high degree of parallelism). That way we can test that the 
> serialization overhead of the TDDs does not affect the health of the cluster 
> (e.g. heartbeats are not affected because the serialization does not happen 
> in the main thread).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10720) Add stress deployment end-to-end test

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671909#comment-16671909
 ] 

ASF GitHub Bot commented on FLINK-10720:


zentol commented on a change in pull request #6994: [FLINK-10720][tests] Add 
deployment end-to-end stress test with many …
URL: https://github.com/apache/flink/pull/6994#discussion_r230126606
 
 

 ##
 File path: flink-end-to-end-tests/flink-heavy-deployment-stress-test/pom.xml
 ##
 @@ -0,0 +1,108 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Apache license header (XML comment stripped by the list archive) -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+	<modelVersion>4.0.0</modelVersion>
+
+	<parent>
+		<groupId>org.apache.flink</groupId>
+		<artifactId>flink-end-to-end-tests</artifactId>
+		<version>1.7-SNAPSHOT</version>
+		<relativePath>..</relativePath>
+	</parent>
+
+	<artifactId>flink-heavy-deployment-stress-test</artifactId>
+	<name>flink-heavy-deployment-stress-test</name>
+	<packaging>jar</packaging>
+
+	<dependencies>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-core</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-java</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-java_2.11</artifactId>
+			<version>1.7-SNAPSHOT</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-datastream-allround-test</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>junit</groupId>
+			<artifactId>junit</artifactId>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-test-utils_2.11</artifactId>
+			<version>1.7-SNAPSHOT</version>
 
 Review comment:
   wrong version


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add stress deployment end-to-end test
> -
>
> Key: FLINK-10720
> URL: https://issues.apache.org/jira/browse/FLINK-10720
> Project: Flink
>  Issue Type: Sub-task
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> In order to test Flink's scalability, I suggest to add an end-to-end test 
> which tests the deployment of a job which is very demanding. The job should 
> have large {{TaskDeploymentDescriptors}} (e.g. a job using union state or 
> having a high degree of parallelism). That way we can test that the 
> serialization overhead of the TDDs does not affect the health of the cluster 
> (e.g. heartbeats are not affected because the serialization does not happen 
> in the main thread).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
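
The conventional fix for such a hard-coded version, presumably what the
reviewer has in mind, is to let the parent POM drive it:

{code:xml}
<dependency>
	<groupId>org.apache.flink</groupId>
	<artifactId>flink-test-utils_2.11</artifactId>
	<!-- inherit the version instead of pinning 1.7-SNAPSHOT -->
	<version>${project.version}</version>
	<scope>compile</scope>
</dependency>
{code}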


[jira] [Commented] (FLINK-10720) Add stress deployment end-to-end test

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671908#comment-16671908
 ] 

ASF GitHub Bot commented on FLINK-10720:


zentol commented on a change in pull request #6994: [FLINK-10720][tests] Add 
deployment end-to-end stress test with many …
URL: https://github.com/apache/flink/pull/6994#discussion_r230126550
 
 

 ##
 File path: flink-end-to-end-tests/flink-heavy-deployment-stress-test/pom.xml
 ##
 @@ -0,0 +1,108 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Apache license header (XML comment stripped by the list archive) -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+	<modelVersion>4.0.0</modelVersion>
+
+	<parent>
+		<groupId>org.apache.flink</groupId>
+		<artifactId>flink-end-to-end-tests</artifactId>
+		<version>1.7-SNAPSHOT</version>
+		<relativePath>..</relativePath>
+	</parent>
+
+	<artifactId>flink-heavy-deployment-stress-test</artifactId>
+	<name>flink-heavy-deployment-stress-test</name>
+	<packaging>jar</packaging>
+
+	<dependencies>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-core</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-java</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-java_2.11</artifactId>
+			<version>1.7-SNAPSHOT</version>
 
 Review comment:
   wrong version


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add stress deployment end-to-end test
> -
>
> Key: FLINK-10720
> URL: https://issues.apache.org/jira/browse/FLINK-10720
> Project: Flink
>  Issue Type: Sub-task
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> In order to test Flink's scalability, I suggest to add an end-to-end test 
> which tests the deployment of a job which is very demanding. The job should 
> have large {{TaskDeploymentDescriptors}} (e.g. a job using union state or 
> having a high degree of parallelism). That way we can test that the 
> serialization overhead of the TDDs does not affect the health of the cluster 
> (e.g. heartbeats are not affected because the serialization does not happen 
> in the main thread).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)





[jira] [Commented] (FLINK-10631) Update jepsen tests to run with multiple slots

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671906#comment-16671906
 ] 

ASF GitHub Bot commented on FLINK-10631:


tillrohrmann closed pull request #6978: [FLINK-10631] Set number of slots per 
TM to 3 for Jepsen tests
URL: https://github.com/apache/flink/pull/6978
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/flink-jepsen/src/jepsen/flink/db.clj 
b/flink-jepsen/src/jepsen/flink/db.clj
index e0f5ff856d4..30f70843d4b 100644
--- a/flink-jepsen/src/jepsen/flink/db.clj
+++ b/flink-jepsen/src/jepsen/flink/db.clj
@@ -42,7 +42,7 @@
 (def deb-mesos-package "1.5.0-2.0.2")
 (def deb-marathon-package "1.6.322")
 
-(def taskmanager-slots 1)
+(def taskmanager-slots 3)
 
 (defn flink-configuration
   [test node]
@@ -293,7 +293,6 @@
"-Djobmanager.rpc.port=6123 "
"-Dmesos.resourcemanager.tasks.mem=2048 "
"-Dtaskmanager.heap.mb=2048 "
-   "-Dtaskmanager.numberOfTaskSlots=2 "
"-Dmesos.resourcemanager.tasks.cpus=1 "
"-Drest.bind-address=$(hostname -f) "))
 
diff --git a/flink-jepsen/src/jepsen/flink/mesos.clj 
b/flink-jepsen/src/jepsen/flink/mesos.clj
index aef73598da9..400ed42073a 100644
--- a/flink-jepsen/src/jepsen/flink/mesos.clj
+++ b/flink-jepsen/src/jepsen/flink/mesos.clj
@@ -68,7 +68,8 @@
 (str "--log_dir=" log-dir)
 (str "--master=" (zookeeper-uri test zk-namespace))
 (str "--recovery_timeout=30secs")
-(str "--work_dir=" slave-dir)]))
+(str "--work_dir=" slave-dir)
+(str "--resources='cpus:8'")]))
 
 (defn create-mesos-master-supervised-service!
   [test node]


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update jepsen tests to run with multiple slots
> --
>
> Key: FLINK-10631
> URL: https://issues.apache.org/jira/browse/FLINK-10631
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> After fixing FLINK-9455, we should update the Jepsen tests to run with 
> multiple slots per {{TaskExecutor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (FLINK-10631) Update jepsen tests to run with multiple slots

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-10631.
---
Resolution: Fixed

Fixed via 
https://github.com/apache/flink/commit/6047321cd26904ea323cb8c2e0c4a69adb9dbeac

> Update jepsen tests to run with multiple slots
> --
>
> Key: FLINK-10631
> URL: https://issues.apache.org/jira/browse/FLINK-10631
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> After fixing FLINK-9455, we should update the Jepsen tests to run with 
> multiple slots per {{TaskExecutor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Commented] (FLINK-10631) Update jepsen tests to run with multiple slots

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671905#comment-16671905
 ] 

ASF GitHub Bot commented on FLINK-10631:


tillrohrmann commented on issue #6978: [FLINK-10631] Set number of slots per TM 
to 3 for Jepsen tests
URL: https://github.com/apache/flink/pull/6978#issuecomment-435117292
 
 
   Thanks for your review @igalshilman. Merging this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update jepsen tests to run with multiple slots
> --
>
> Key: FLINK-10631
> URL: https://issues.apache.org/jira/browse/FLINK-10631
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> After fixing FLINK-9455, we should update the Jepsen tests to run with 
> multiple slots per {{TaskExecutor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Commented] (FLINK-10631) Update jepsen tests to run with multiple slots

2018-11-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671899#comment-16671899
 ] 

ASF GitHub Bot commented on FLINK-10631:


tillrohrmann commented on a change in pull request #6978: [FLINK-10631] Set 
number of slots per TM to 3 for Jepsen tests
URL: https://github.com/apache/flink/pull/6978#discussion_r230124525
 
 

 ##
 File path: flink-jepsen/src/jepsen/flink/db.clj
 ##
 @@ -293,7 +293,6 @@
"-Djobmanager.rpc.port=6123 "
"-Dmesos.resourcemanager.tasks.mem=2048 "
"-Dtaskmanager.heap.mb=2048 "
-   "-Dtaskmanager.numberOfTaskSlots=2 "
 
 Review comment:
   I think Mesos will pick up the configuration settings from the Flink 
configuration. Therefore, there should be no need to explicitly override it 
here. The same may apply to the other configuration settings as well. But I 
would like to address this in a separate issue.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update jepsen tests to run with multiple slots
> --
>
> Key: FLINK-10631
> URL: https://issues.apache.org/jira/browse/FLINK-10631
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> After fixing FLINK-9455, we should update the Jepsen tests to run with 
> multiple slots per {{TaskExecutor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Commented] (FLINK-10685) Support history server on YARN

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671885#comment-16671885
 ] 

Till Rohrmann commented on FLINK-10685:
---

What exactly is the scope of this issue [~yanghua]?

> Support history server on YARN
> --
>
> Key: FLINK-10685
> URL: https://issues.apache.org/jira/browse/FLINK-10685
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10703) Race condition in YarnResourceManager

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671881#comment-16671881
 ] 

Till Rohrmann commented on FLINK-10703:
---

I think all accesses to {{numPendingContainerRequests}} happen from within the 
{{RpcEndpoint's}} main thread. Thus, there should be no need to make this field 
atomic. Do you see any concurrent accesses [~dangdangdang]?

> Race condition in YarnResourceManager
> -
>
> Key: FLINK-10703
> URL: https://issues.apache.org/jira/browse/FLINK-10703
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Reporter: Shimin Yang
>Assignee: Shimin Yang
>Priority: Major
>
> Race condition on numPendingContainerRequests, this instance variable should 
> be atomic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
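
A sketch of the threading model Till describes, built on the generic
RpcEndpoint helpers rather than the actual YarnResourceManager code: callbacks
arriving on foreign threads hop onto the endpoint's main thread before touching
state, so a plain field suffices.

{code:java}
import org.apache.flink.runtime.rpc.RpcEndpoint;
import org.apache.flink.runtime.rpc.RpcService;

class ContainerBookkeeper extends RpcEndpoint {

    // Plain int: only ever read or written from the endpoint's main thread.
    private int numPendingContainerRequests;

    ContainerBookkeeper(RpcService rpcService) {
        super(rpcService);
    }

    // Called from a foreign thread, e.g. the YARN AMRM client callback thread.
    public void onContainerAllocated() {
        runAsync(() -> {
            validateRunsInMainThread();
            numPendingContainerRequests--;
        });
    }
}
{code}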


[jira] [Commented] (FLINK-10629) FlinkKafkaProducerITCase.testFlinkKafkaProducer10FailTransactionCoordinatorBeforeNotify failed on Travis

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671878#comment-16671878
 ] 

Till Rohrmann commented on FLINK-10629:
---

Another instance: https://api.travis-ci.org/v3/job/449386092/log.txt

> FlinkKafkaProducerITCase.testFlinkKafkaProducer10FailTransactionCoordinatorBeforeNotify
>  failed on Travis
> 
>
> Key: FLINK-10629
> URL: https://issues.apache.org/jira/browse/FLINK-10629
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector, Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Assignee: Piotr Nowojski
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.7.0
>
>
> The 
> {{FlinkKafkaProducerITCase.testFlinkKafkaProducer10FailTransactionCoordinatorBeforeNotify}}
>  failed on Travis.
> https://api.travis-ci.org/v3/job/443777257/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10712) RestartPipelinedRegionStrategy does not restore state

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671867#comment-16671867
 ] 

Till Rohrmann commented on FLINK-10712:
---

Thanks a lot for contributing [~yunta]. 

[~aljoscha] I don't think that this is a release blocker since it has been 
broken for quite some time. However, we should fix it soon.

> RestartPipelinedRegionStrategy does not restore state
> -
>
> Key: FLINK-10712
> URL: https://issues.apache.org/jira/browse/FLINK-10712
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.3.3, 1.4.2, 1.5.5, 1.6.2, 1.7.0
>Reporter: Stefan Richter
>Assignee: Yun Tang
>Priority: Critical
> Fix For: 1.7.0
>
>
> RestartPipelinedRegionStrategy does not perform any state restore. This is 
> big problem because all restored regions will be restarted with empty state. 
> We need to take checkpoints into account when restoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10713) RestartIndividualStrategy does not restore state

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671865#comment-16671865
 ] 

Till Rohrmann commented on FLINK-10713:
---

I would not say so because this restart strategy has been broken for a long 
time. I think we should add a release note to warn about it and then plan to 
fix it.

> RestartIndividualStrategy does not restore state
> 
>
> Key: FLINK-10713
> URL: https://issues.apache.org/jira/browse/FLINK-10713
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.3.3, 1.4.2, 1.5.5, 1.6.2, 1.7.0
>Reporter: Stefan Richter
>Priority: Critical
> Fix For: 1.7.0
>
>
> RestartIndividualStrategy does not perform any state restore. This is big 
> problem because all restored regions will be restarted with empty state. We 
> need to take checkpoints into account when restoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10711) flink-end-to-end-tests can fail silently

2018-11-01 Thread Timo Walther (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671863#comment-16671863
 ] 

Timo Walther commented on FLINK-10711:
--

[~hequn8128] have you started already? I would take a look at this. I might 
enable it on a per-test basis for now.

> flink-end-to-end-tests can fail silently
> 
>
> Key: FLINK-10711
> URL: https://issues.apache.org/jira/browse/FLINK-10711
> Project: Flink
>  Issue Type: Bug
>  Components: E2E Tests
>Affects Versions: 1.7.0
>Reporter: Piotr Nowojski
>Assignee: Hequn Cheng
>Priority: Blocker
> Fix For: 1.7.0
>
>
> Because they are written in bash and they are not setting
> {code:bash}
> set -e
> {code}
> at the beginning, errors can be swallowed silently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
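
One possible per-test incarnation of the fix (illustrative only; the final
scripts may settle on a different mechanism):

{code:bash}
#!/usr/bin/env bash
# Abort on the first error, on unset variables, and on failures anywhere in a
# pipeline, instead of swallowing them silently.
set -Eeuo pipefail

on_error() {
    echo "Test failed at line $1" >&2
    exit 1
}
# -E makes the ERR trap fire inside functions and subshells as well.
trap 'on_error $LINENO' ERR
{code}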


[jira] [Updated] (FLINK-10721) kafkaFetcher runFetchLoop throw exception will cause follow-up code not execute in FlinkKafkaConsumerBase run method

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10721:
--
Fix Version/s: 1.7.0
   1.6.3

> kafkaFetcher runFetchLoop throw exception will cause follow-up code not 
> execute in FlinkKafkaConsumerBase run method 
> -
>
> Key: FLINK-10721
> URL: https://issues.apache.org/jira/browse/FLINK-10721
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.6.2
>Reporter: zhaoshijie
>Priority: Major
> Fix For: 1.6.3, 1.7.0
>
>
> In the FlinkKafkaConsumerBase run method, on line 721 (master branch), if 
> kafkaFetcher.runFetchLoop() throws an exception (for example, the 
> discoveryLoopThread throws, so its finally block runs the cancel method; 
> cancel calls kafkaFetcher.cancel, which in the Kafka09Fetcher implementation 
> calls handover.close, which in turn makes handover.pollNext throw a 
> ClosedException), then the follow-up code is not executed. In particular, 
> discoveryLoopError is never thrown, so the real culprit exception is swallowed.
> A failure log looks like this:
> {code:java}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$10.apply$mcV$sp(JobManager.scala:1229)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$10.apply(JobManager.scala:1172)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$10.apply(JobManager.scala:1172)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: 
> org.apache.flink.streaming.connectors.kafka.internal.Handover$ClosedException
>   at 
> org.apache.flink.streaming.connectors.kafka.internal.Handover.close(Handover.java:180)
>   at 
> org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.cancel(Kafka09Fetcher.java:174)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.cancel(FlinkKafkaConsumerBase.java:753)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase$2.run(FlinkKafkaConsumerBase.java:695)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> Should we modify it as follows?
> {code:java}
> try {
>   kafkaFetcher.runFetchLoop();
> } catch (Exception e) {
>   // if discoveryLoopErrorRef is not null, we should throw the real culprit exception
>   if (discoveryLoopErrorRef.get() != null) {
>     throw new RuntimeException(discoveryLoopErrorRef.get());
>   } else {
>     throw e;
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
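
A slight variant of the snippet above that keeps both failure causes visible
instead of discarding the fetch loop exception (illustrative only, not part of
the reporter's proposal):

{code:java}
try {
    kafkaFetcher.runFetchLoop();
} catch (Exception e) {
    Throwable discoveryError = discoveryLoopErrorRef.get();
    if (discoveryError != null) {
        // Surface the discovery loop failure as the root cause, but keep the
        // fetch loop exception attached so that neither one is swallowed.
        RuntimeException culprit = new RuntimeException(discoveryError);
        culprit.addSuppressed(e);
        throw culprit;
    } else {
        throw e;
    }
}
{code}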


[jira] [Commented] (FLINK-10734) Temporal joins on heavily filtered tables might fail in planning

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671862#comment-16671862
 ] 

Till Rohrmann commented on FLINK-10734:
---

Do we want to fix this for {{1.7.0}} [~pnowojski]?

> Temporal joins on heavily filtered tables might fail in planning
> 
>
> Key: FLINK-10734
> URL: https://issues.apache.org/jira/browse/FLINK-10734
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.7.0
>Reporter: Piotr Nowojski
>Priority: Major
>
> Following query:
> {code}
> val sqlQuery =
>   """
> |SELECT
> |  o.amount * r.rate AS amount
> |FROM
> |  Orders AS o,
> |  LATERAL TABLE (Rates(o.rowtime)) AS r
> |WHERE r.currency = o.currency
> |""".stripMargin
> {code}
> with {{Rates}} defined as follows:
> {code}
> tEnv.registerTable("EuroRatesHistory", 
> tEnv.scan("RatesHistory").filter('currency === "Euro"))
> tEnv.registerFunction(
>   "Rates",
>   tEnv.scan("EuroRatesHistory").createTemporalTableFunction('rowtime, 
> 'currency))
> {code}
> Will fail with:
> {noformat}
> org.apache.flink.table.api.ValidationException: Only single column join key 
> is supported. Found [] in [InnerJoin(where: 
> (__TEMPORAL_JOIN_CONDITION(rowtime, rowtime0, currency)), join: (amount, 
> rowtime, currency, rate, rowtime0))]
>  at 
> org.apache.flink.table.plan.nodes.datastream.DataStreamTemporalJoinToCoProcessTranslator$TemporalJoinConditionExtractor.validateRightPrimaryKey(DataStreamTemporalJoinToCoProcessTranslator.scala:215)
>  at 
> org.apache.flink.table.plan.nodes.datastream.DataStreamTemporalJoinToCoProcessTranslator$TemporalJoinConditionExtractor.visitCall(DataStreamTemporalJoinToCoProcessTranslator.scala:183)
>  at 
> org.apache.flink.table.plan.nodes.datastream.DataStreamTemporalJoinToCoProcessTranslator$TemporalJoinConditionExtractor.visitCall(DataStreamTemporalJoinToCoProcessTranslator.scala:152)
> {noformat}
> The problem is that the filtering condition {{('currency === "Euro")}} interferes 
> with the joining condition, simplifying it to nothing. Note how the top 
> {{LogicalFilter(condition=[=($3, $1)])}} changes during optimising and 
> finally disappears:
> {noformat}
> LogicalProject(amount=[*($0, $4)])
>   LogicalFilter(condition=[=($3, $1)])
> LogicalTemporalTableJoin(condition=[__TEMPORAL_JOIN_CONDITION($2, $5, 
> $3)], joinType=[inner])
>   LogicalTableScan(table=[[_DataStreamTable_0]])
>   LogicalFilter(condition=[=($0, _UTF-16LE'Euro')])
> LogicalTableScan(table=[[_DataStreamTable_1]])
> {noformat}
> {noformat}
> LogicalProject(amount=[*($0, $4)])
>   LogicalFilter(condition=[=(_UTF-16LE'Euro', $1)])
> LogicalProject(amount=[$0], currency=[$1], rowtime=[$2], currency0=[$3], 
> rate=[$4], rowtime0=[CAST($5):TIMESTAMP(3) NOT NULL])
>   LogicalTemporalTableJoin(condition=[__TEMPORAL_JOIN_CONDITION($2, $5, 
> $3)], joinType=[inner])
> LogicalTableScan(table=[[_DataStreamTable_0]])
> LogicalFilter(condition=[=($0, _UTF-16LE'Euro')])
>   LogicalTableScan(table=[[_DataStreamTable_1]])
> {noformat}
> {noformat}
> FlinkLogicalCalc(expr#0..4=[{inputs}], expr#5=[*($t0, $t3)], amount=[$t5])
>   FlinkLogicalTemporalTableJoin(condition=[__TEMPORAL_JOIN_CONDITION($1, $4, 
> $2)], joinType=[inner])
> FlinkLogicalCalc(expr#0..2=[{inputs}], expr#3=[_UTF-16LE'Euro'], 
> expr#4=[=($t3, $t1)], amount=[$t0], rowtime=[$t2], $condition=[$t4])
>   FlinkLogicalNativeTableScan(table=[[_DataStreamTable_0]])
> FlinkLogicalCalc(expr#0..2=[{inputs}], expr#3=[_UTF-16LE'Euro'], 
> expr#4=[=($t0, $t3)], proj#0..2=[{exprs}], $condition=[$t4])
>   FlinkLogicalNativeTableScan(table=[[_DataStreamTable_1]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10735) flink on yarn close container exception

2018-11-01 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671861#comment-16671861
 ] 

Till Rohrmann commented on FLINK-10735:
---

Thanks for reporting this issue [~ffjl1985]. Could you please post the complete 
log for further debugging?

> flink on yarn close container exception
> ---
>
> Key: FLINK-10735
> URL: https://issues.apache.org/jira/browse/FLINK-10735
> Project: Flink
>  Issue Type: Bug
>  Components: ResourceManager, TaskManager, YARN
>Affects Versions: 1.6.2
> Environment: Hadoop 2.7
> flink 1.6.2
>Reporter: Fei Feng
>Priority: Critical
>  Labels: yarn
>
> When running Flink on YARN in detached mode, cancelling a Flink job releases 
> the YARN resources very slowly.
> If the job fails and continuously restarts, it acquires more and more 
> containers until the resources are used up.
> Log:
> 18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Job 
> 32F01A0FC50EFE8F4794AD0C45678EC4: xxx switched from state RUNNING to 
> CANCELLING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Source: 
> Kafka09TableSource(SS_SOLD_DATE_SK, SS_SOLD_TIME_SK, SS_ITEM_SK, 
> SS_CUSTOMER_SK, SS_CDEMO_SK, SS_HDEMO_SK, SS_ADDR_SK, SS_STORE_SK, 
> SS_PROMO_SK, SS_TICKET_NUMBER, SS_QUANTITY, SS_WHOLESALE_COST, SS_LIST_PRICE, 
> SS_SALES_PRICE, SS_EXT_DISCOUNT_AMT, SS_EXT_SALES_PRICE, 
> SS_EXT_WHOLESALE_COST, SS_EXT_LIST_PRICE, SS_EXT_TAX, SS_COUPON_AMT, 
> SS_NET_PAID, SS_NET_PAID_INC_TAX, SS_NET_PROFIT, ROWTIME) -> from: 
> (SS_WHOLESALE_COST, SS_SALES_PRICE, SS_COUPON_AMT, SS_NET_PAID, 
> SS_NET_PROFIT, ROWTIME) -> Timestamps/Watermarks -> where: (>(SS_COUPON_AMT, 
> 0)), select: (ROWTIME, SS_WHOLESALE_COST, SS_SALES_PRICE, SS_NET_PAID, 
> SS_NET_PROFIT) -> time attribute: (ROWTIME) (1/3) 
> (0807b5f291f897ac4545dbfdb8ec3448) switched from RUNNING to CANCELING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Source: 
> Kafka09TableSource(SS_SOLD_DATE_SK, SS_SOLD_TIME_SK, SS_ITEM_SK, 
> SS_CUSTOMER_SK, SS_CDEMO_SK, SS_HDEMO_SK, SS_ADDR_SK, SS_STORE_SK, 
> SS_PROMO_SK, SS_TICKET_NUMBER, SS_QUANTITY, SS_WHOLESALE_COST, SS_LIST_PRICE, 
> SS_SALES_PRICE, SS_EXT_DISCOUNT_AMT, SS_EXT_SALES_PRICE, 
> SS_EXT_WHOLESALE_COST, SS_EXT_LIST_PRICE, SS_EXT_TAX, SS_COUPON_AMT, 
> SS_NET_PAID, SS_NET_PAID_INC_TAX, SS_NET_PROFIT, ROWTIME) -> from: 
> (SS_WHOLESALE_COST, SS_SALES_PRICE, SS_COUPON_AMT, SS_NET_PAID, 
> SS_NET_PROFIT, ROWTIME) -> Timestamps/Watermarks -> where: (>(SS_COUPON_AMT, 
> 0)), select: (ROWTIME, SS_WHOLESALE_COST, SS_SALES_PRICE, SS_NET_PAID, 
> SS_NET_PROFIT) -> time attribute: (ROWTIME) (2/3) 
> (a56a70eb6807dacf18fbf272ee6160e2) switched from RUNNING to CANCELING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Source: 
> Kafka09TableSource(SS_SOLD_DATE_SK, SS_SOLD_TIME_SK, SS_ITEM_SK, 
> SS_CUSTOMER_SK, SS_CDEMO_SK, SS_HDEMO_SK, SS_ADDR_SK, SS_STORE_SK, 
> SS_PROMO_SK, SS_TICKET_NUMBER, SS_QUANTITY, SS_WHOLESALE_COST, SS_LIST_PRICE, 
> SS_SALES_PRICE, SS_EXT_DISCOUNT_AMT, SS_EXT_SALES_PRICE, 
> SS_EXT_WHOLESALE_COST, SS_EXT_LIST_PRICE, SS_EXT_TAX, SS_COUPON_AMT, 
> SS_NET_PAID, SS_NET_PAID_INC_TAX, SS_NET_PROFIT, ROWTIME) -> from: 
> (SS_WHOLESALE_COST, SS_SALES_PRICE, SS_COUPON_AMT, SS_NET_PAID, 
> SS_NET_PROFIT, ROWTIME) -> Timestamps/Watermarks -> where: (>(SS_COUPON_AMT, 
> 0)), select: (ROWTIME, SS_WHOLESALE_COST, SS_SALES_PRICE, SS_NET_PAID, 
> SS_NET_PROFIT) -> time attribute: (ROWTIME) (3/3) 
> (a2eca3dc06087dfdaf3fd0200e545cc4) switched from RUNNING to CANCELING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: window: 
> (TumblingGroupWindow('w$, 'ROWTIME, 360.millis)), select: 
> (SUM(SS_WHOLESALE_COST) AS EXPR$1, SUM(SS_SALES_PRICE) AS EXPR$2, 
> SUM(SS_NET_PAID) AS EXPR$3, SUM(SS_NET_PROFIT) AS EXPR$4, start('w$) AS 
> w$start, end('w$) AS w$end, rowtime('w$) AS w$rowtime, proctime('w$) AS 
> w$proctime) -> where: (>(EXPR$4, 1000)), select: (CAST(w$start) AS WSTART, 
> EXPR$1, EXPR$2, EXPR$3, EXPR$4) -> to: Row (1/1) 
> (9fa3592c74eda97124a033b6afea6c87) switched from RUNNING to CANCELING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Sink: 
> JDBCAppendTableSink(RS_DAY, RS_WHOLESALE_COST, RS_SALES_PRICE, RS_NET_PAID, 
> RS_NET_PROFIT) (1/3) (6d4f413ace1206714566b610e5bf47b6) switched from RUNNING 
> to CANCELING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Sink: 
> JDBCAppendTableSink(RS_DAY, RS_WHOLESALE_COST, RS_SALES_PRICE, RS_NET_PAID, 
> RS_NET_PROFIT) (2/3) (8871efc7c19abae7f833e02aa1d2107d) switched from RUNNING 
> to CANCELING.
>  18/10/31 19:43:59 INFO executiongraph.ExecutionGraph: Sink: 
> JDBCAppendTableSink(RS_DAY, RS_WHOLESALE_COST, RS_SALES_PRICE, RS_NET_PAID, 
> RS_NET_PROFIT) (3/3) (fcaf0b9247d60a31e3381d70871d929f) switched from RUNNING 
> to CANCELING.
>  18/10/31 19:4

[jira] [Updated] (FLINK-10751) Checkpoints should be retained when job reaches suspended state

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10751:
--
Fix Version/s: 1.8.0

> Checkpoints should be retained when job reaches suspended state
> ---
>
> Key: FLINK-10751
> URL: https://issues.apache.org/jira/browse/FLINK-10751
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
> Fix For: 1.7.0
>
>
> {{CheckpointProperties}} define in which terminal job status a checkpoint 
> should be disposed.
> I've noticed that the properties for {{CHECKPOINT_NEVER_RETAINED}}, 
> {{CHECKPOINT_RETAINED_ON_FAILURE}} prescribe checkpoint disposal in (locally) 
> terminal job status {{SUSPENDED}}.
> Since a job reaches the {{SUSPENDED}} state when its {{JobMaster}} loses 
> leadership, this would result in the checkpoint being cleaned up and not 
> being available for recovery by the new leader. Therefore, we should rather 
> retain checkpoints when reaching job status {{SUSPENDED}}.
> *BUT:* Because we special case this terminal state in the only highly 
> available {{CompletedCheckpointStore}} implementation (see 
> [ZooKeeperCompletedCheckpointStore|https://github.com/apache/flink/blob/e7ac3ba/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/ZooKeeperCompletedCheckpointStore.java#L315])
>  and don't use regular checkpoint disposal, this issue has not surfaced yet.
> I think we should proactively fix the properties to indicate to retain 
> checkpoints in {{SUSPENDED}} state. We might actually completely remove this 
> case since with this change, all properties will indicate to retain on 
> suspension.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10751) Checkpoints should be retained when job reaches suspended state

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10751:
--
Fix Version/s: (was: 1.8.0)
   1.7.0

> Checkpoints should be retained when job reaches suspended state
> ---
>
> Key: FLINK-10751
> URL: https://issues.apache.org/jira/browse/FLINK-10751
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.6.2, 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
> Fix For: 1.7.0
>
>
> {{CheckpointProperties}} define in which terminal job status a checkpoint 
> should be disposed.
> I've noticed that the properties for {{CHECKPOINT_NEVER_RETAINED}}, 
> {{CHECKPOINT_RETAINED_ON_FAILURE}} prescribe checkpoint disposal in (locally) 
> terminal job status {{SUSPENDED}}.
> Since a job reaches the {{SUSPENDED}} state when its {{JobMaster}} loses 
> leadership, this would result in the checkpoint being cleaned up and not 
> being available for recovery by the new leader. Therefore, we should rather 
> retain checkpoints when reaching job status {{SUSPENDED}}.
> *BUT:* Because we special case this terminal state in the only highly 
> available {{CompletedCheckpointStore}} implementation (see 
> [ZooKeeperCompletedCheckpointStore|https://github.com/apache/flink/blob/e7ac3ba/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/ZooKeeperCompletedCheckpointStore.java#L315])
>  and don't use regular checkpoint disposal, this issue has not surfaced yet.
> I think we should proactively fix the properties to indicate to retain 
> checkpoints in {{SUSPENDED}} state. We might actually completely remove this 
> case since with this change, all properties will indicate to retain on 
> suspension.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
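
A sketch of the shape the fix could take; the field names are illustrative and
not the actual CheckpointProperties API:

{code:java}
// After the proposed change, no property set discards checkpoints on
// SUSPENDED, since a newly elected JobMaster needs them to recover.
final class RetentionFlags {
    final boolean discardOnJobFinished;
    final boolean discardOnJobCancelled;
    final boolean discardOnJobFailed;
    final boolean discardOnJobSuspended; // uniformly false after the fix

    RetentionFlags(boolean onFinished, boolean onCancelled, boolean onFailed) {
        this.discardOnJobFinished = onFinished;
        this.discardOnJobCancelled = onCancelled;
        this.discardOnJobFailed = onFailed;
        // Retain on suspension so the new leader can restore from it.
        this.discardOnJobSuspended = false;
    }
}
{code}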


[jira] [Updated] (FLINK-10743) Use 0 processExitCode for ApplicationStatus.CANCELED

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10743:
--
Fix Version/s: 1.7.0

> Use 0 processExitCode for ApplicationStatus.CANCELED
> 
>
> Key: FLINK-10743
> URL: https://issues.apache.org/jira/browse/FLINK-10743
> Project: Flink
>  Issue Type: Bug
>  Components: Cluster Management, Kubernetes, Mesos, YARN
>Affects Versions: 1.6.3, 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
> Fix For: 1.7.0
>
>
> {{org.apache.flink.runtime.clusterframework.ApplicationStatus}} is used to 
> map {{org.apache.flink.runtime.jobgraph.JobStatus}} to a process exit code.
> We currently map {{ApplicationStatus.CANCELED}} to a non-zero exit code 
> ({{1444}}). Since cancellation is a user-triggered operation I would consider 
> this to be a successful exit and map it to exit code {{0}}.
> Our current behavior results in applications running via the 
> {{StandaloneJobClusterEntryPoint}} and Kubernetes pods as documented in 
> [flink-container|https://github.com/apache/flink/tree/master/flink-container/kubernetes]
>  to be immediately restarted when cancelled. This only leaves the option of 
> killing the respective job cluster master container.
> The {{ApplicationStatus}} is also used in the YARN and Mesos clients, but I'm 
> not familiar with that part of the code base and can't asses how changing the 
> exit code would affect these clients. A quick usage scan for 
> {{ApplicationStatus.CANCELED}} did not surface any problematic usages though.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
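
A sketch of the proposed mapping; 1444 is the current CANCELED code per the
issue text, while the other non-zero codes are assumed for the example:

{code:java}
// Mirrors the shape of org.apache.flink.runtime.clusterframework.ApplicationStatus.
enum ApplicationStatusSketch {
    SUCCEEDED(0),
    CANCELED(0),   // proposed: was 1444; user-triggered cancel is a clean exit
    FAILED(1443),  // assumed
    UNKNOWN(1445); // assumed

    private final int processExitCode;

    ApplicationStatusSketch(int processExitCode) {
        this.processExitCode = processExitCode;
    }

    public int processExitCode() {
        return processExitCode;
    }
}
{code}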


[jira] [Updated] (FLINK-10742) Let Netty use Flink's buffers directly in credit-based mode

2018-11-01 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10742:
--
Fix Version/s: 1.8.0

> Let Netty use Flink's buffers directly in credit-based mode
> ---
>
> Key: FLINK-10742
> URL: https://issues.apache.org/jira/browse/FLINK-10742
> Project: Flink
>  Issue Type: Sub-task
>  Components: Network
>Affects Versions: 1.7.0
>Reporter: Nico Kruber
>Priority: Major
> Fix For: 1.8.0
>
>
> For credit-based flow control, we always have buffers available for data that 
> is sent to use. We could thus use them directly and not copy the network 
> stream into Netty buffers first and then into our buffers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

