[jira] [Commented] (FLINK-10702) Yarn app is not killed when scala shell is terminated
[ https://issues.apache.org/jira/browse/FLINK-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665904#comment-16665904 ]

ASF GitHub Bot commented on FLINK-10702:

TisonKun commented on issue #6948: [FLINK-10702][scala-shell] Yarn app is not killed when scala shell is terminated
URL: https://github.com/apache/flink/pull/6948#issuecomment-433589505

   Oh sorry, we now build on `RestClusterClient`, so what I mentioned above is not a problem. Sorry for the noise.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Yarn app is not killed when scala shell is terminated
> -----------------------------------------------------
>
>                 Key: FLINK-10702
>                 URL: https://issues.apache.org/jira/browse/FLINK-10702
>             Project: Flink
>          Issue Type: Bug
>          Components: Scala Shell
>    Affects Versions: 1.6.1
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>            Priority: Minor
>             Labels: pull-request-available
>
> When I quit the scala shell in yarn mode, the yarn app is not killed.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (FLINK-10702) Yarn app is not killed when scala shell is terminated
[ https://issues.apache.org/jira/browse/FLINK-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665901#comment-16665901 ]

ASF GitHub Bot commented on FLINK-10702:

TisonKun commented on issue #6948: [FLINK-10702][scala-shell] Yarn app is not killed when scala shell is terminated
URL: https://github.com/apache/flink/pull/6948#issuecomment-433589159

   Looks reasonable. However, I have recently been working on FLINK-10392 "Remove legacy mode", and I found that `YarnClusterClient` is used by the so-called legacy mode. A FLIP-6 replacement for it could be `FlinkYarnSessionCli` or `RestClusterClient`. I would prefer to refactor the Scala Shell to build on the new client. That said, if we want a hotfix *for now*, this pull request does the job.

   cc @tillrohrmann @GJL
[jira] [Commented] (FLINK-9718) Add enviroment variable in start-scala-shell.sh & flink to enable remote debug
[ https://issues.apache.org/jira/browse/FLINK-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665879#comment-16665879 ]

ASF GitHub Bot commented on FLINK-9718:
---------------------------------------

zjffdu commented on issue #6245: [FLINK-9718][scala-shell]. Add enviroment variable in start-scala-shell.sh and flink to enable remote debug
URL: https://github.com/apache/flink/pull/6245#issuecomment-433587399

   Sorry for the late response. @yanghua, do you have any more comments?

> Add enviroment variable in start-scala-shell.sh & flink to enable remote debug
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-9718
>                 URL: https://issues.apache.org/jira/browse/FLINK-9718
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>            Priority: Major
>             Labels: pull-request-available
[jira] [Updated] (FLINK-10702) Yarn app is not killed when scala shell is terminated
[ https://issues.apache.org/jira/browse/FLINK-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-10702:
-----------------------------------
    Labels: pull-request-available  (was: )
[jira] [Commented] (FLINK-10702) Yarn app is not killed when scala shell is terminated
[ https://issues.apache.org/jira/browse/FLINK-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665868#comment-16665868 ]

ASF GitHub Bot commented on FLINK-10702:

zjffdu opened a new pull request #6948: [FLINK-10702][scala-shell] Yarn app is not killed when scala shell is terminated
URL: https://github.com/apache/flink/pull/6948

   ## What is the purpose of the change

   This PR fixes the issue that the yarn app is not killed when the scala shell is terminated.

   ## Brief change log

   It is a trivial change that just calls `yarnCluster.shutDownCluster()` to kill the yarn app.

   ## Verifying this change

   This change is a trivial rework / code cleanup without any test coverage. I also tested it manually.

   ## Does this pull request potentially affect one of the following parts:

     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)

   ## Documentation

     - Does this pull request introduce a new feature? (no)
     - If yes, how is the feature documented? (not applicable)
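The change above amounts to invoking the cluster client's shutdown from the shell's termination path. A minimal, hedged sketch of the idea (the `ClusterClient` interface and `onShellExit` helper are illustrative stand-ins, not Flink's actual API):

```java
// Illustrative sketch only: the real fix calls yarnCluster.shutDownCluster()
// when the Scala shell terminates; a stand-in interface models that call here.
public class ShellShutdownSketch {
    interface ClusterClient {
        void shutDownCluster();
    }

    // Invoked from the shell's exit path; guards against a null client
    // (e.g. local mode, where no YARN application was started).
    static boolean onShellExit(ClusterClient yarnCluster) {
        if (yarnCluster == null) {
            return false; // nothing to tear down
        }
        yarnCluster.shutDownCluster(); // kills the YARN application
        return true;
    }

    public static void main(String[] args) {
        boolean[] shutDown = {false};
        onShellExit(() -> shutDown[0] = true);
        System.out.println(shutDown[0]); // true: shutdown was requested
    }
}
```

Without such a call, the YARN application outlives the shell process, which is exactly the symptom reported in the issue.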
[jira] [Assigned] (FLINK-10702) Yarn app is not killed when scala shell is terminated
[ https://issues.apache.org/jira/browse/FLINK-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang reassigned FLINK-10702:
----------------------------------
    Assignee: Jeff Zhang
[jira] [Created] (FLINK-10702) Yarn app is not killed when scala shell is terminated
Jeff Zhang created FLINK-10702:
-----------------------------------
             Summary: Yarn app is not killed when scala shell is terminated
                 Key: FLINK-10702
                 URL: https://issues.apache.org/jira/browse/FLINK-10702
             Project: Flink
          Issue Type: Bug
          Components: Scala Shell
    Affects Versions: 1.6.1
            Reporter: Jeff Zhang

When I quit scala shell in yarn mode, yarn app is not killed.
[jira] [Commented] (FLINK-9697) Provide connector for modern Kafka
[ https://issues.apache.org/jira/browse/FLINK-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665833#comment-16665833 ]

ASF GitHub Bot commented on FLINK-9697:
---------------------------------------

yanghua commented on issue #6703: [FLINK-9697] Provide connector for modern Kafka
URL: https://github.com/apache/flink/pull/6703#issuecomment-433583162

   @pnowojski Yes, I have created [FLINK-10701](https://issues.apache.org/jira/browse/FLINK-10701) to track this problem.

> Provide connector for modern Kafka
> ----------------------------------
>
>                 Key: FLINK-9697
>                 URL: https://issues.apache.org/jira/browse/FLINK-9697
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: vinoyang
>            Priority: Major
>             Labels: pull-request-available
>             Fix For: 1.7.0
>
> Kafka 2.0.0 would be released soon.
> Here is the vote thread:
> http://search-hadoop.com/m/Kafka/uyzND1vxnEd23QLxb?subj=+VOTE+2+0+0+RC1
> We should provide a connector for Kafka 2.0.0 once it is released.
> Upgrade to 2.0 documentation:
> http://kafka.apache.org/20/documentation.html#upgrade_2_0_0
[jira] [Updated] (FLINK-10701) Move modern kafka connector module into connector profile
[ https://issues.apache.org/jira/browse/FLINK-10701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang updated FLINK-10701:
-----------------------------
    Description:
The modern connector is run in the {{misc}} profile since it wasn't properly added to the {{connector}} profile in {{stage.sh}}; click [here|https://github.com/apache/flink/pull/6890#issuecomment-431917344] to see more details.

  was: The modern connector is run in the {{misc}} profile since it wasn't properly added to the {{connector}}profile in {{stage.sh click [here|https://github.com/apache/flink/pull/6890#issuecomment-431917344] to see more details.}}
[jira] [Updated] (FLINK-10701) Move modern kafka connector module into connector profile
[ https://issues.apache.org/jira/browse/FLINK-10701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang updated FLINK-10701:
-----------------------------
    Description:
The modern connector is run in the {{misc}} profile since it wasn't properly added to the {{connector}} profile in {{stage.sh}}; click [here|https://github.com/apache/flink/pull/6890#issuecomment-431917344] to see more details.
[jira] [Created] (FLINK-10701) Move modern kafka connector module into connector profile
vinoyang created FLINK-10701:
---------------------------------
             Summary: Move modern kafka connector module into connector profile
                 Key: FLINK-10701
                 URL: https://issues.apache.org/jira/browse/FLINK-10701
             Project: Flink
          Issue Type: Sub-task
          Components: Kafka Connector, Tests
            Reporter: vinoyang
            Assignee: vinoyang
[GitHub] TisonKun opened a new pull request #6947: [hotfix] [tests] refactor YARNSessionCapacitySchedulerITCase
TisonKun opened a new pull request #6947: [hotfix] [tests] refactor YARNSessionCapacitySchedulerITCase
URL: https://github.com/apache/flink/pull/6947

   ## What is the purpose of the change

   Refactor `YARNSessionCapacitySchedulerITCase` to get rid of its dependency on `JobClient`.

   However, I think the following checkers are invalid. If confirmed, we can remove them or try to fix them (where do these strings occur?):

   ```java
   /* prohibited strings: (we want to see "DataSink (...) (2/2) switched to FINISHED") */
   new String[]{"DataSink \\(.*\\) \\(1/1\\) switched to FINISHED"},
   ```

   ## Verifying this change

   This change is a trivial rework / code cleanup without any test coverage.

   ## Does this pull request potentially affect one of the following parts:

     - Dependencies (does it add or upgrade a dependency): (**no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**no**)
     - The serializers: (**no**)
     - The runtime per-record code paths (performance sensitive): (**no**)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
     - The S3 file system connector: (**no**)

   ## Documentation

     - Does this pull request introduce a new feature? (**no**)
     - If yes, how is the feature documented? (**not applicable**)

   cc @zentol @aljoscha
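The checker quoted above is easy to exercise in isolation. A small sketch showing that the prohibited pattern (parallelism 1/1) does not fire on the parallelism-2 line the comment says is expected (the operator name `(collect())` is made up for illustration):

```java
import java.util.regex.Pattern;

public class ProhibitedStringDemo {
    // The prohibited pattern quoted in the PR description.
    static final Pattern PROHIBITED =
        Pattern.compile("DataSink \\(.*\\) \\(1/1\\) switched to FINISHED");

    static boolean isProhibited(String logLine) {
        return PROHIBITED.matcher(logLine).find();
    }

    public static void main(String[] args) {
        // "(collect())" is a hypothetical operator name, not from the test.
        System.out.println(isProhibited("DataSink (collect()) (1/1) switched to FINISHED")); // true
        System.out.println(isProhibited("DataSink (collect()) (2/2) switched to FINISHED")); // false
    }
}
```

So the checker only trips if a sink runs with parallelism 1, which is consistent with the question of whether such lines still occur at all.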
[jira] [Updated] (FLINK-10700) Remove LegacyStandaloneClusterDescriptor
[ https://issues.apache.org/jira/browse/FLINK-10700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-10700:
-----------------------------------
    Labels: pull-request-available  (was: )

> Remove LegacyStandaloneClusterDescriptor
> ----------------------------------------
>
>                 Key: FLINK-10700
>                 URL: https://issues.apache.org/jira/browse/FLINK-10700
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>    Affects Versions: 1.7.0
>            Reporter: TisonKun
>            Assignee: TisonKun
>            Priority: Major
>             Labels: pull-request-available
>             Fix For: 1.7.0
>
> Remove legacy {{LegacyStandaloneClusterDescriptor}}
[jira] [Commented] (FLINK-10700) Remove LegacyStandaloneClusterDescriptor
[ https://issues.apache.org/jira/browse/FLINK-10700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665823#comment-16665823 ]

ASF GitHub Bot commented on FLINK-10700:

TisonKun opened a new pull request #6946: [FLINK-10700] [cluster managerment] Remove LegacyStandaloneClusterDescriptor
URL: https://github.com/apache/flink/pull/6946

   ## What is the purpose of the change

   Remove the legacy `LegacyStandaloneClusterDescriptor`. There are no other usages and we don't maintain it any more, so it can be removed directly.

   By the way, I came across this removal while digging into `StandaloneClusterClient`. If `StandaloneClusterClient` is only for legacy usage, I can try to get rid of it as well. Its name is quite confusing, so I am not 100% sure it is no longer used.

   ## Verifying this change

   This change is a trivial rework / code cleanup without any test coverage.

   ## Does this pull request potentially affect one of the following parts:

     - Dependencies (does it add or upgrade a dependency): (**no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**no**)
     - The serializers: (**no**)
     - The runtime per-record code paths (performance sensitive): (**no**)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
     - The S3 file system connector: (**no**)

   ## Documentation

     - Does this pull request introduce a new feature? (**no**)
     - If yes, how is the feature documented? (**not applicable**)

   cc @zentol @tillrohrmann
[jira] [Created] (FLINK-10700) Remove LegacyStandaloneClusterDescriptor
TisonKun created FLINK-10700:
---------------------------------
             Summary: Remove LegacyStandaloneClusterDescriptor
                 Key: FLINK-10700
                 URL: https://issues.apache.org/jira/browse/FLINK-10700
             Project: Flink
          Issue Type: Sub-task
          Components: Cluster Management
    Affects Versions: 1.7.0
            Reporter: TisonKun
            Assignee: TisonKun
             Fix For: 1.7.0

Remove legacy {{LegacyStandaloneClusterDescriptor}}
[jira] [Created] (FLINK-10699) Create a catalog implementation for persistent Flink meta objects using Hive metastore as a registry
Xuefu Zhang created FLINK-10699:
------------------------------------
             Summary: Create a catalog implementation for persistent Flink meta objects using Hive metastore as a registry
                 Key: FLINK-10699
                 URL: https://issues.apache.org/jira/browse/FLINK-10699
             Project: Flink
          Issue Type: Sub-task
          Components: Table API & SQL
    Affects Versions: 1.6.1
            Reporter: Xuefu Zhang

Similar to FLINK-10697, but using Hive metastore as persistent storage.
[jira] [Created] (FLINK-10698) Create CatalogManager class manages all external catalogs and temporary meta objects
Xuefu Zhang created FLINK-10698:
------------------------------------
             Summary: Create CatalogManager class manages all external catalogs and temporary meta objects
                 Key: FLINK-10698
                 URL: https://issues.apache.org/jira/browse/FLINK-10698
             Project: Flink
          Issue Type: Sub-task
          Components: Table API & SQL
    Affects Versions: 1.6.1
            Reporter: Xuefu Zhang

Currently {{TableEnvironment}} manages a list of registered external catalogs as well as in-memory meta objects, and interacts with the Calcite schema. It would be cleaner to delegate all those responsibilities to a dedicated class, especially when Flink's meta objects are also stored in a catalog. {{CatalogManager}} is responsible for managing all meta objects, including external catalogs, temporary meta objects, and the Calcite schema.
[jira] [Created] (FLINK-10697) Create an in-memory catalog that stores Flink's meta objects
Xuefu Zhang created FLINK-10697:
------------------------------------
             Summary: Create an in-memory catalog that stores Flink's meta objects
                 Key: FLINK-10697
                 URL: https://issues.apache.org/jira/browse/FLINK-10697
             Project: Flink
          Issue Type: Sub-task
          Components: Table API & SQL
    Affects Versions: 1.6.1
            Reporter: Xuefu Zhang

Currently all Flink meta objects (currently tables only) are stored in memory as part of the Calcite catalog. Some of those objects are temporary (such as inline tables), while others are meant to live beyond the user session. As we introduce a catalog for those objects (tables, views, and UDFs), it makes sense to organize them neatly. Further, a catalog implementation that stores those objects in memory retains the current behavior, which can be configured by the user.

Please note that this implementation is different from the current {{InMemoryExternalCatalog}}, which is used mainly for testing and doesn't reflect what's actually needed for Flink meta objects.
[jira] [Updated] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs
[ https://issues.apache.org/jira/browse/FLINK-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated FLINK-10696:
--------------------------------
    Description:
Currently there are APIs for tables only. However, views and UDFs are also common objects in a catalog.

This is required when we store Flink tables/views/UDFs in an external persistent storage.

  was: Currently there are APIs for tables only. However, views and UDFs are also common objects in a catalog.
[jira] [Created] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs
Xuefu Zhang created FLINK-10696:
------------------------------------
             Summary: Add APIs to ExternalCatalog for views and UDFs
                 Key: FLINK-10696
                 URL: https://issues.apache.org/jira/browse/FLINK-10696
             Project: Flink
          Issue Type: Sub-task
          Components: Table API & SQL
    Affects Versions: 1.6.1
            Reporter: Xuefu Zhang

Currently there are APIs for tables only. However, views and UDFs are also common objects in a catalog.
[jira] [Commented] (FLINK-10695) "Cannot load user class" errors should set the underlying ClassNotFoundException as their cause
[ https://issues.apache.org/jira/browse/FLINK-10695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665713#comment-16665713 ]

ASF GitHub Bot commented on FLINK-10695:

A412 opened a new pull request #6945: [FLINK-10695] Cannot load user class errors set underlying cause
URL: https://github.com/apache/flink/pull/6945

   ## What is the purpose of the change

   * This pull request improves the traceability of "Cannot load user class" exceptions.

   ## Brief change log

   * Set the underlying `ClassNotFoundException`s as the cause of the resulting `StreamTaskException`s.

   ## Verifying this change

   This change is a trivial rework / code cleanup without any test coverage.

   ## Does this pull request potentially affect one of the following parts:

     - Dependencies (does it add or upgrade a dependency): (yes / *no*)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / *no*)
     - The serializers: (yes / *no* / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / *no* / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / *no* / don't know)
     - The S3 file system connector: (yes / *no* / don't know)

   ## Documentation

     - Does this pull request introduce a new feature? (yes / *no*)
     - If yes, how is the feature documented? (*not applicable* / docs / JavaDocs / not documented)

> "Cannot load user class" errors should set the underlying ClassNotFoundException as their cause
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-10695
>                 URL: https://issues.apache.org/jira/browse/FLINK-10695
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Max Feng
>            Priority: Major
>             Labels: pull-request-available
>
> In {{StreamConfig#getStreamOperator}}, {{ClassNotFoundException}} is propagated as {{StreamTaskException}} but the underlying {{ClassNotFoundException}} is not set as the cause of the {{StreamTaskException}}.
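The exception-chaining pattern the PR proposes is standard Java. A self-contained sketch (the `StreamTaskException` stand-in and `loadUserClass` helper are illustrative; the real code lives in Flink's `StreamConfig#getStreamOperator`):

```java
public class CauseChainingSketch {
    // Stand-in for Flink's StreamTaskException, modeled only for illustration.
    static class StreamTaskException extends RuntimeException {
        StreamTaskException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static Object loadUserClass(String className, ClassLoader cl) {
        try {
            return Class.forName(className, false, cl)
                    .getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // The fix: attach the ClassNotFoundException as the cause
            // instead of dropping it, so stack traces stay traceable.
            throw new StreamTaskException("Cannot load user class: " + className, e);
        } catch (ReflectiveOperationException e) {
            throw new StreamTaskException("Cannot instantiate user class: " + className, e);
        }
    }

    public static void main(String[] args) {
        try {
            loadUserClass("com.example.MissingOperator",
                    CauseChainingSketch.class.getClassLoader());
        } catch (StreamTaskException e) {
            // The underlying reason now survives in the exception chain.
            System.out.println(e.getCause() instanceof ClassNotFoundException); // true
        }
    }
}
```

With the cause attached, logs show the missing class name and the original `ClassNotFoundException` frames rather than only the wrapper message.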
[jira] [Updated] (FLINK-10695) "Cannot load user class" errors should set the underlying ClassNotFoundException as their cause
[ https://issues.apache.org/jira/browse/FLINK-10695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-10695:
-----------------------------------
    Labels: pull-request-available  (was: )
[jira] [Updated] (FLINK-10694) ZooKeeperRunningJobsRegistry Cleanup
[ https://issues.apache.org/jira/browse/FLINK-10694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated FLINK-10694: - Description: When a streaming job with Zookeeper-HA enabled gets cancelled all the job-related Zookeeper nodes are not removed. Is there a reason behind that? I noticed that Zookeeper paths are created of type "Container Node" (an Ephemeral node that can have nested nodes) and fall back to Persistent node type in case Zookeeper doesn't support this sort of nodes. But anyway, it is worth removing the job Zookeeper node when a job is cancelled, isn't it? zookeeper version 3.4.10 flink version 1.6.1 # The job is deployed as a YARN cluster with the following properties set {noformat} high-availability: zookeeper high-availability.zookeeper.quorum: high-availability.zookeeper.storageDir: hdfs:/// high-availability.zookeeper.path.root: high-availability.zookeeper.path.namespace: {noformat} # The job is cancelled via flink cancel command. 
What I've noticed: when the job is running the following directory structure is created in zookeeper {noformat} ///leader/resource_manager_lock ///leader/rest_server_lock ///leader/dispatcher_lock ///leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/041 ///checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde ///running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde {noformat} when the job is cancelled some ephemeral nodes disappear, but most of them are still there: {noformat} ///leader/5c21f00b9162becf5ce25a1cf0e67cde ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/ ///checkpoint-counter/ ///running_job_registry/ {noformat} Here is the method [1] responsible for cleaning zookeeper folders up [1] which is called when the job manager has stopped [2]. And it seems it only cleans up the folder *running_job_registry*, other folders stay untouched. I suppose that everything under the *///* folder is cleaned up when the job is cancelled. [1] [https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/zookeeper/ZooKeeperRunningJobsRegistry.java#L107] [2] [https://github.com/apache/flink/blob/f087f57749004790b6f5b823d66822c36ae09927/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L332] was: When a streaming job with Zookeeper-HA enabled gets cancelled all the job-related Zookeeper nodes are not removed. Is there a reason behind that? 
I noticed that Zookeeper paths are created of type "Container Node" (an Ephemeral node that can have nested nodes) and fall back to Persistent node type in case Zookeeper doesn't support this sort of nodes. But anyway, it is worth removing the job Zookeeper node when a job is cancelled, isn't it? zookeeper version 3.4.10 flink version 1.6.1 # The job is deployed as a YARN cluster with the following properties set {noformat} high-availability: zookeeper high-availability.zookeeper.quorum: high-availability.zookeeper.storageDir: hdfs:/// high-availability.zookeeper.path.root: high-availability.zookeeper.path.namespace: {noformat} # The job is cancelled via flink cancel command. What I've noticed: when the job is running the following directory structure is created in zookeeper {noformat} ///leader/resource_manager_lock ///leader/rest_server_lock ///leader/dispatcher_lock ///leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/041 ///checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde ///running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde {noformat} when the job is cancelled the some ephemeral nodes disappear, but most of them are still there: {noformat} ///leader/5c21f00b9162becf5ce25a1cf0e67cde ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/ ///checkpoint-counter/ ///running_job_registry/ {noformat} Here is the method [1] responsible for cleaning zookeeper folders up [1] which is called when the job manager has stopped [2]. And it seems it only cleans up the folder *running_job_registry*, other folders stay untouched. I suppose that everything under the *///* folder is cleaned up when the job is cancelled. 
[1] [https://github.com/apache/flin
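The cleanup the reporter expects amounts to deleting the job's entire subtree under each HA folder once the job reaches a terminal state. A toy in-memory model of that idea is below; all paths are illustrative, and a real implementation would instead issue a recursive delete against ZooKeeper (e.g. Curator's `delete().deletingChildrenIfNeeded().forPath(...)`):

```java
import java.util.TreeSet;

// Toy model of the expected cleanup: on job cancellation, remove the job's
// whole subtree (leader, leaderlatch, checkpoints, ...) from the set of
// ZooKeeper paths, not just running_job_registry. Paths are illustrative.
public class ZkCleanupDemo {

    // Returns the paths that survive removing `prefix` and everything below it.
    static TreeSet<String> removeSubtree(TreeSet<String> nodes, String prefix) {
        TreeSet<String> remaining = new TreeSet<>();
        for (String n : nodes) {
            if (!n.equals(prefix) && !n.startsWith(prefix + "/")) {
                remaining.add(n);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        String job = "5c21f00b9162becf5ce25a1cf0e67cde";
        TreeSet<String> nodes = new TreeSet<>();
        nodes.add("/leader/" + job + "/job_manager_lock");
        nodes.add("/leaderlatch/" + job + "/job_manager_lock");
        nodes.add("/leaderlatch/rest_server_lock");        // cluster-wide, kept
        nodes.add("/running_job_registry/" + job);
        for (String folder : new String[] {"/leader/", "/leaderlatch/", "/running_job_registry/"}) {
            nodes = removeSubtree(nodes, folder + job);
        }
        if (nodes.size() != 1) throw new AssertionError();
        System.out.println("remaining: " + nodes);
    }
}
```

Cluster-wide locks (rest_server_lock, dispatcher_lock, ...) stay, since they belong to the session cluster rather than to an individual job.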
[jira] [Created] (FLINK-10695) "Cannot load user class" errors should set the underlying ClassNotFoundException as their cause
Max Feng created FLINK-10695: Summary: "Cannot load user class" errors should set the underlying ClassNotFoundException as their cause Key: FLINK-10695 URL: https://issues.apache.org/jira/browse/FLINK-10695 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Max Feng In {{StreamConfig#getStreamOperator}}, {{ClassNotFoundException}}s are propagated as {{StreamTaskException}}s but do not set the underlying {{ClassNotFoundException}} as the cause of the {{StreamTaskException}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10695) "Cannot load user class" errors should set the underlying ClassNotFoundException as their cause
[ https://issues.apache.org/jira/browse/FLINK-10695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Feng updated FLINK-10695: - Description: In {{StreamConfig#getStreamOperator}}, {{ClassNotFoundException}} is propagated as {{StreamTaskException}} but does not set the underlying {{ClassNotFoundException}} as the cause of the {{StreamTaskException}}. (was: In {{StreamConfig#getStreamOperator}}, {{ClassNotFoundException}}s are propagated as {{StreamTaskException}}s but do not set the underlying {{ClassNotFoundException}} as the cause of the {{StreamTaskException}}.) > "Cannot load user class" errors should set the underlying > ClassNotFoundException as their cause > --- > > Key: FLINK-10695 > URL: https://issues.apache.org/jira/browse/FLINK-10695 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Max Feng >Priority: Major > > In {{StreamConfig#getStreamOperator}}, {{ClassNotFoundException}} is > propagated as {{StreamTaskException}} but does not set the underlying > {{ClassNotFoundException}} as the cause of the {{StreamTaskException}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-4582) Allow FlinkKinesisConsumer to adapt for AWS DynamoDB Streams
[ https://issues.apache.org/jira/browse/FLINK-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665593#comment-16665593 ] Devin Thomson commented on FLINK-4582: -- [~yxu-apache] I appreciate the quick response! "The use of _DynamodbProxy.getShardList()_ is interesting." Haha very kind of you to not point out the potential performance issue of always fetching all the shard ids. I didn't want to duplicate the code of DynamoDBProxy, but yes this approach suffers from a complete traversal of the shard ids every time. I have also observed that the shard ids are always returned in the same (unsorted) ordering, so your approach sounds good to me. Due to the fact that we run our Flink clusters in AWS EMR, which does not support high-availability master nodes, we will not be using multi-stream consumers so I did not implement that support here and saw DynamoDBProxy as a natural solution. The alternative was to use one instance of DynamoDBProxy per stream. I didn't look into the memory implications but I assume it's worse than what you have built :) We are still in development over here using the solution I built, but I would be glad to cutover to your solution once it's available! One question would be - is it compatible with Flink 1.5.2? As I mentioned, we run in AWS EMR which only supports 1.5.2 in the latest release. Thank you!! Devin > Allow FlinkKinesisConsumer to adapt for AWS DynamoDB Streams > > > Key: FLINK-4582 > URL: https://issues.apache.org/jira/browse/FLINK-4582 > Project: Flink > Issue Type: New Feature > Components: Kinesis Connector, Streaming Connectors >Reporter: Tzu-Li (Gordon) Tai >Assignee: Ying Xu >Priority: Major > > AWS DynamoDB is a NoSQL database service that has a CDC-like (change data > capture) feature called DynamoDB Streams > (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html), > which is a stream feed of item-level table activities. 
> The DynamoDB Streams shard abstraction follows that of Kinesis Streams with > only a slight difference in resharding behaviours, so it is possible to build > on the internals of our Flink Kinesis Consumer for an exactly-once DynamoDB > Streams source. > I propose an API something like this: > {code} > DataStream dynamoItemsCdc = > FlinkKinesisConsumer.asDynamoDBStream(tableNames, schema, config) > {code} > The feature adds more connectivity to popular AWS services for Flink, and > combining what Flink has for exactly-once semantics, out-of-core state > backends, and queryable state with CDC can have very strong use cases. For > this feature there should only be an extra dependency to the AWS Java SDK for > DynamoDB, which has Apache License 2.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-4582) Allow FlinkKinesisConsumer to adapt for AWS DynamoDB Streams
[ https://issues.apache.org/jira/browse/FLINK-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665527#comment-16665527 ] Ying Xu commented on FLINK-4582: Hi [~tinder-dthomson] thanks for raising this issue. And sorry for the delay in responding to the original request. We actually implemented a version of the flink-dynamodbstreams connector on top of the existing flink-kinesis connector. The work is currently in production and was presented at a meetup event back in Sep. I wasn't able to get a chance to contribute back because of other work priorities – my bad! I looked at your PR. The use of _DynamodbProxy.getShardList()_ is interesting. We took a slightly different approach, which plugs a dynamodbstreams-kinesis adapter object into KinesisProxy and makes it an equivalent _DynamodbProxy_ (approach mentioned in another thread titled *Consuming data from dynamoDB streams to flink*). We rely on the assumption that during resharding, one can retrieve all the new child shard Ids by passing the last seen shardId. Although Dynamodbstreams does not officially claim this, we consistently observed similar behavior in production during resharding. Another benefit of directly embedding a dynamodbstreams-kinesis adapter is allowing *ONE* source (consumer) to consume from multiple data streams (which is important for our use cases), plus other error handling in the existing KinesisProxy. I agree that if the _DynamodbProxy_ provides an _efficient multi-stream_ implementation, it is an interesting direction to look into. If you can wait a few days, I can adapt my PR on top of the OSS flink and post it by early next week. We can have more discussions then. What do you think? Thank you very much! 
> Allow FlinkKinesisConsumer to adapt for AWS DynamoDB Streams > > > Key: FLINK-4582 > URL: https://issues.apache.org/jira/browse/FLINK-4582 > Project: Flink > Issue Type: New Feature > Components: Kinesis Connector, Streaming Connectors >Reporter: Tzu-Li (Gordon) Tai >Assignee: Ying Xu >Priority: Major > > AWS DynamoDB is a NoSQL database service that has a CDC-like (change data > capture) feature called DynamoDB Streams > (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html), > which is a stream feed of item-level table activities. > The DynamoDB Streams shard abstraction follows that of Kinesis Streams with > only a slight difference in resharding behaviours, so it is possible to build > on the internals of our Flink Kinesis Consumer for an exactly-once DynamoDB > Streams source. > I propose an API something like this: > {code} > DataStream dynamoItemsCdc = > FlinkKinesisConsumer.asDynamoDBStream(tableNames, schema, config) > {code} > The feature adds more connectivity to popular AWS services for Flink, and > combining what Flink has for exactly-once semantics, out-of-core state > backends, and queryable state with CDC can have very strong use cases. For > this feature there should only be an extra dependency to the AWS Java SDK for > DynamoDB, which has Apache License 2.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
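The shard-listing idea discussed in this thread — after a reshard, fetch only the shards whose IDs sort after the last seen shard ID, relying on the observed (but not officially documented) stable ordering — can be sketched as a toy model. Names and ID formats here are illustrative, not the connector's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of incremental shard discovery: instead of traversing the full
// shard list on every refresh, return only the shards whose ids sort after
// the last seen shard id. This leans on the stable ordering the commenters
// observed in production, which DynamoDB Streams does not guarantee.
public class ShardListDemo {

    static List<String> shardsAfter(TreeSet<String> allShards, String lastSeenShardId) {
        // tailSet(..., false) excludes lastSeenShardId itself.
        return new ArrayList<>(allShards.tailSet(lastSeenShardId, false));
    }

    public static void main(String[] args) {
        TreeSet<String> shards = new TreeSet<>();
        shards.add("shardId-00000001");
        shards.add("shardId-00000002");
        shards.add("shardId-00000003");
        List<String> fresh = shardsAfter(shards, "shardId-00000002");
        if (!fresh.equals(List.of("shardId-00000003"))) throw new AssertionError();
        System.out.println("new shards: " + fresh);
    }
}
```

The trade-off discussed above is exactly this: the `getShardList()` approach re-reads everything but makes no ordering assumption, while the last-seen-ID approach is cheaper but depends on observed behaviour.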
[GitHub] kingbase opened a new pull request #6944: fix typo on comment
kingbase opened a new pull request #6944: fix typo on comment URL: https://github.com/apache/flink/pull/6944 Just a little typo: "working directory" occurred twice.
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665409#comment-16665409 ] ASF GitHub Bot commented on FLINK-10600: yanghua commented on issue #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors URL: https://github.com/apache/flink/pull/6924#issuecomment-433472224 No, I mean that if we package a jar which contains the old connector, the new connector, and multiple Kafka clients, this PR's end-to-end test would fail for the same reason. But it seems the shade plugin cannot pick or exclude specific versions of dependencies. > Provide End-to-end test cases for modern Kafka connectors > - > > Key: FLINK-10600 > URL: https://issues.apache.org/jira/browse/FLINK-10600 > Project: Flink > Issue Type: Sub-task > Components: Kafka Connector >Reporter: vinoyang >Assignee: vinoyang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.0 > >
[jira] [Updated] (FLINK-10694) ZooKeeperRunningJobsRegistry Cleanup
[ https://issues.apache.org/jira/browse/FLINK-10694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated FLINK-10694: - Description: When a streaming job with Zookeeper-HA enabled gets cancelled all the job-related Zookeeper nodes are not removed. Is there a reason behind that? I noticed that Zookeeper paths are created of type "Container Node" (an Ephemeral node that can have nested nodes) and fall back to Persistent node type in case Zookeeper doesn't support this sort of nodes. But anyway, it is worth removing the job Zookeeper node when a job is cancelled, isn't it? zookeeper version 3.4.10 flink version 1.6.1 # The job is deployed as a YARN cluster with the following properties set {noformat} high-availability: zookeeper high-availability.zookeeper.quorum: high-availability.zookeeper.storageDir: hdfs:/// high-availability.zookeeper.path.root: high-availability.zookeeper.path.namespace: {noformat} # The job is cancelled via flink cancel command. 
What I've noticed: when the job is running the following directory structure is created in zookeeper {noformat} ///leader/resource_manager_lock ///leader/rest_server_lock ///leader/dispatcher_lock ///leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/041 ///checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde ///running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde {noformat} when the job is cancelled the some ephemeral nodes disappear, but most of them are still there: {noformat} ///leader/5c21f00b9162becf5ce25a1cf0e67cde ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/ ///checkpoint-counter/ ///running_job_registry/ {noformat} Here is the method [1] responsible for cleaning zookeeper folders up [1] which is called when the job manager has stopped [2]. And it seems it only cleans up the folder *running_job_registry*, other folders stay untouched. I suppose that everything under the *///* folder is cleaned up when the job is cancelled. [1] [https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/zookeeper/ZooKeeperRunningJobsRegistry.java#L107] [2] [https://github.com/apache/flink/blob/f087f57749004790b6f5b823d66822c36ae09927/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L332] was: When a streaming job with Zookeeper-HA enabled gets cancelled all the job-related Zookeeper nodes are not removed. Is there a reason behind that? 
I noticed that Zookeeper paths are created of type "Container Node" (an Ephemeral node that can have nested nodes) and fall back to Persistent node type in case Zookeeper doesn't support this sort of nodes. But anyway, it is worth removing the job Zookeeper node when a job is cancelled, isn't it? zookeeper version 3.4.10 flink version 1.6.1 # The job is deployed as a YARN cluster with the following properties set {noformat} high-availability: zookeeper high-availability.zookeeper.quorum: high-availability.zookeeper.storageDir: hdfs:/// high-availability.zookeeper.path.root: high-availability.zookeeper.path.namespace: {noformat} # The job is cancelled via flink cancel command. What I've noticed: when the job is running the following directory structure is created in zookeeper {noformat} ///leader/resource_manager_lock ///leader/rest_server_lock ///leader/dispatcher_lock ///leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/041 ///checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde ///running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde {noformat} when the job is cancelled the some ephemeral nodes disappear, but most of them are still there: {noformat} ///leader/5c21f00b9162becf5ce25a1cf0e67cde ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/ ///checkpoint-counter/ ///running_job_registry/ {noformat} Here is the method [1] responsible for cleaning zookeeper folders up [1] which is called when the job manager has stopped [2]. And it seems it only cleans up the folder running_job_registry, other folders stay untouched. I supposed that everything under the *///* folder is cleaned up when the job is cancelled. 
[1] [https://github.com/apache/fl
[jira] [Created] (FLINK-10694) ZooKeeperRunningJobsRegistry Cleanup
Mikhail Pryakhin created FLINK-10694: Summary: ZooKeeperRunningJobsRegistry Cleanup Key: FLINK-10694 URL: https://issues.apache.org/jira/browse/FLINK-10694 Project: Flink Issue Type: Bug Components: DataStream API Affects Versions: 1.6.1 Reporter: Mikhail Pryakhin When a streaming job with Zookeeper-HA enabled gets cancelled all the job-related Zookeeper nodes are not removed. Is there a reason behind that? I noticed that Zookeeper paths are created of type "Container Node" (an Ephemeral node that can have nested nodes) and fall back to Persistent node type in case Zookeeper doesn't support this sort of nodes. But anyway, it is worth removing the job Zookeeper node when a job is cancelled, isn't it? zookeeper version 3.4.10 flink version 1.6.1 # The job is deployed as a YARN cluster with the following properties set {noformat} high-availability: zookeeper high-availability.zookeeper.quorum: high-availability.zookeeper.storageDir: hdfs:/// high-availability.zookeeper.path.root: high-availability.zookeeper.path.namespace: {noformat} # The job is cancelled via flink cancel command. 
What I've noticed: when the job is running the following directory structure is created in zookeeper {noformat} ///leader/resource_manager_lock ///leader/rest_server_lock ///leader/dispatcher_lock ///leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/041 ///checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde ///running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde {noformat} when the job is cancelled the some ephemeral nodes disappear, but most of them are still there: {noformat} ///leader/5c21f00b9162becf5ce25a1cf0e67cde ///leaderlatch/resource_manager_lock ///leaderlatch/rest_server_lock ///leaderlatch/dispatcher_lock ///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock ///checkpoints/ ///checkpoint-counter/ ///running_job_registry/ {noformat} Here is the method [1] responsible for cleaning zookeeper folders up [1] which is called when the job manager has stopped [2]. And it seems it only cleans up the folder running_job_registry, other folders stay untouched. I supposed that everything under the *///* folder is cleaned up when the job is cancelled. [1] [https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/zookeeper/ZooKeeperRunningJobsRegistry.java#L107] [2] [https://github.com/apache/flink/blob/f087f57749004790b6f5b823d66822c36ae09927/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L332] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9083) Add async backpressure support to Cassandra Connector
[ https://issues.apache.org/jira/browse/FLINK-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665336#comment-16665336 ] ASF GitHub Bot commented on FLINK-9083: --- jparkie commented on issue #6782: [FLINK-9083][Cassandra Connector] Add async backpressure support to Cassandra Connector URL: https://github.com/apache/flink/pull/6782#issuecomment-433455493 My apologies; I've been quite busy with my personal life and work, but I should be able to carve time this weekend to address your comments. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add async backpressure support to Cassandra Connector > - > > Key: FLINK-9083 > URL: https://issues.apache.org/jira/browse/FLINK-9083 > Project: Flink > Issue Type: Improvement > Components: Cassandra Connector >Reporter: Jacob Park >Assignee: Jacob Park >Priority: Minor > Labels: pull-request-available > > As the CassandraSinkBase derivatives utilize async writes, they do not block > the task to introduce any backpressure. > I am currently using a semaphore to provide backpressure support by blocking > at a maximum concurrent requests limit like how DataStax's Spark Cassandra > Connector functions: > [https://github.com/datastax/spark-cassandra-connector/blob/v2.0.7/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/writer/AsyncExecutor.scala#L18] > This improvement has greatly improved the fault-tolerance of our Cassandra > Sink Connector implementation on Apache Flink in production. I would like to > contribute this feature back upstream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
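The semaphore pattern described in the comment above can be sketched as follows. This is a simplified model, not the actual `CassandraSinkBase` code; class and method names are illustrative:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

// Sketch of semaphore-based backpressure for an async sink: acquire a permit
// before each async write and release it when the write completes, so the
// caller blocks once maxConcurrentRequests writes are in flight.
public class BackpressuredSink {

    private final Semaphore permits;

    BackpressuredSink(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    CompletableFuture<Void> send(Runnable asyncWrite) throws InterruptedException {
        permits.acquire(); // blocks the task thread at the in-flight limit
        return CompletableFuture.runAsync(asyncWrite)
                .whenComplete((result, error) -> permits.release());
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws Exception {
        BackpressuredSink sink = new BackpressuredSink(2);
        CompletableFuture<Void> f1 = sink.send(() -> { });
        CompletableFuture<Void> f2 = sink.send(() -> { });
        CompletableFuture.allOf(f1, f2).join(); // waits for both releases
        if (sink.availablePermits() != 2) throw new AssertionError();
        System.out.println("permits restored: " + sink.availablePermits());
    }
}
```

Releasing in `whenComplete` (rather than only on success) matters: a failed write must also return its permit, or the sink eventually deadlocks.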
[jira] [Commented] (FLINK-10107) SQL Client end-to-end test fails for releases
[ https://issues.apache.org/jira/browse/FLINK-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665322#comment-16665322 ] ASF GitHub Bot commented on FLINK-10107: twalthr opened a new pull request #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534 ## What is the purpose of the change This PR enforces even more shading for SQL JARs than before. In the past, we only shaded Kafka dependencies. However, Flink's Kafka connectors mutually depend on each other and do not use version-specific package names (unlike, e.g., Elasticsearch, where we use `elasticsearch2`, `elasticsearch3`, etc.), so broader relocation is the only way of avoiding dependency conflicts between different Kafka SQL JARs. The end-to-end test has been extended to detect classloading issues in builds. ## Brief change log - Add more relocation to Flink's Kafka SQL JARs ## Verifying this change The SQL Client end-to-end test has been adapted to detect classloading issues earlier. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): yes, but only for SQL JARs - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > SQL Client end-to-end test fails for releases > - > > Key: FLINK-10107 > URL: https://issues.apache.org/jira/browse/FLINK-10107 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > > It seems that SQL JARs for Kafka 0.10 and Kafka 0.9 have conflicts that only > occur for releases and not SNAPSHOT builds. This might be due to their file > name. Depending on the file name either 0.9 is loaded before 0.10 and vice > versa. > One of the following errors occured: > {code} > 2018-08-08 18:28:51,636 ERROR > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.ClientUtils - > Failed to close coordinator > java.lang.NoClassDefFoundError: > org/apache/flink/kafka09/shaded/org/apache/kafka/common/requests/OffsetCommitResponse > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:473) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:357) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.maybeAutoCommitOffsetsSync(ConsumerCoordinator.java:439) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.close(ConsumerCoordinator.java:319) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.ClientUtils.closeQuietly(ClientUtils.java:63) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1277) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1258) > at > org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread.run(KafkaConsumerThread.java:286) 
> Caused by: java.lang.ClassNotFoundException: > org.apache.flink.kafka09.shaded.org.apache.kafka.common.requests.OffsetCommitResponse > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:120) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 8 more > {code} > {code} > java.lang.NoSuchFieldError: producer > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010.invoke(FlinkKafkaProducer010.java:369) > at > org.apache.flink.streaming.api.operators.Stream
[jira] [Commented] (FLINK-10107) SQL Client end-to-end test fails for releases
[ https://issues.apache.org/jira/browse/FLINK-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665320#comment-16665320 ] ASF GitHub Bot commented on FLINK-10107: pnowojski closed pull request #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-connectors/flink-connector-kafka-0.10/pom.xml b/flink-connectors/flink-connector-kafka-0.10/pom.xml index 135dc59b655..c7a840caafb 100644 --- a/flink-connectors/flink-connector-kafka-0.10/pom.xml +++ b/flink-connectors/flink-connector-kafka-0.10/pom.xml @@ -225,6 +225,7 @@ under the License. true sql-jar + org.apache.kafka:* @@ -232,6 +233,11 @@ under the License. org.apache.flink:flink-connector-kafka-0.9_${scala.binary.version} + + + + + *:* @@ -239,12 +245,23 @@ under the License. kafka/kafka-version.properties + + org.apache.flink:flink-connector-kafka-0.9_${scala.binary.version} + + META-INF/services/org.apache.flink.table.factories.TableFactory + + + org.apache.kafka org.apache.flink.kafka010.shaded.org.apache.kafka + + org.apache.flink.streaming.connectors.kafka + org.apache.flink.kafka010.shaded.org.apache.flink.streaming.connectors.kafka + diff --git a/flink-connectors/flink-connector-kafka-0.11/pom.xml b/flink-connectors/flink-connector-kafka-0.11/pom.xml index bf04aebb390..c6bf71e8d5a 100644
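The relocations visible in the diff rename packages such as `org.apache.kafka` to `org.apache.flink.kafka010.shaded.org.apache.kafka`, so each SQL JAR carries its own copy of the classes it needs. The real rewriting happens at the bytecode level inside the maven-shade-plugin; the toy below only models the renaming rule itself (and, as a simplification, matches on a raw string prefix rather than package segments):

```java
// Toy model of a shade-plugin relocation rule: classes under `pattern` are
// rewritten to live under `shadedPattern`. This is a string-level sketch; the
// maven-shade-plugin performs the equivalent transformation on bytecode.
public class RelocationDemo {

    static String relocate(String className, String pattern, String shadedPattern) {
        return className.startsWith(pattern)
                ? shadedPattern + className.substring(pattern.length())
                : className;
    }

    public static void main(String[] args) {
        String out = relocate(
                "org.apache.kafka.clients.consumer.KafkaConsumer",
                "org.apache.kafka",
                "org.apache.flink.kafka010.shaded.org.apache.kafka");
        if (!out.equals("org.apache.flink.kafka010.shaded.org.apache.kafka"
                + ".clients.consumer.KafkaConsumer")) {
            throw new AssertionError();
        }
        System.out.println(out);
    }
}
```

Because the 0.10 and 0.9 JARs relocate to distinct prefixes (`kafka010` vs. `kafka09`), their class names can no longer collide on the user classpath, which is the conflict the stack traces in this issue show.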
[jira] [Commented] (FLINK-10107) SQL Client end-to-end test fails for releases
[ https://issues.apache.org/jira/browse/FLINK-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665319#comment-16665319 ] ASF GitHub Bot commented on FLINK-10107: pnowojski commented on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534#issuecomment-433452892 e This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > SQL Client end-to-end test fails for releases > - > > Key: FLINK-10107 > URL: https://issues.apache.org/jira/browse/FLINK-10107 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > > It seems that SQL JARs for Kafka 0.10 and Kafka 0.9 have conflicts that only > occur for releases and not SNAPSHOT builds. This might be due to their file > name. Depending on the file name either 0.9 is loaded before 0.10 and vice > versa. 
> One of the following errors occurred: > {code} > 2018-08-08 18:28:51,636 ERROR > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.ClientUtils - > Failed to close coordinator > java.lang.NoClassDefFoundError: > org/apache/flink/kafka09/shaded/org/apache/kafka/common/requests/OffsetCommitResponse > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:473) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:357) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.maybeAutoCommitOffsetsSync(ConsumerCoordinator.java:439) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.close(ConsumerCoordinator.java:319) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.ClientUtils.closeQuietly(ClientUtils.java:63) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1277) > at > org.apache.flink.kafka09.shaded.org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1258) > at > org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread.run(KafkaConsumerThread.java:286) > Caused by: java.lang.ClassNotFoundException: > org.apache.flink.kafka09.shaded.org.apache.kafka.common.requests.OffsetCommitResponse > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:120) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 8 more > {code} > {code} > java.lang.NoSuchFieldError: producer > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010.invoke(FlinkKafkaProducer010.java:369) > at > org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:689) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:667) > at > org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:689) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamO
[jira] [Commented] (FLINK-10107) SQL Client end-to-end test fails for releases
[ https://issues.apache.org/jira/browse/FLINK-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665321#comment-16665321 ] ASF GitHub Bot commented on FLINK-10107: pnowojski removed a comment on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534#issuecomment-433452892 e This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > SQL Client end-to-end test fails for releases > - > > Key: FLINK-10107 > URL: https://issues.apache.org/jira/browse/FLINK-10107 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > > It seems that SQL JARs for Kafka 0.10 and Kafka 0.9 have conflicts that only > occur for releases and not SNAPSHOT builds. This might be due to their file > name. Depending on the file name either 0.9 is loaded before 0.10 and vice > versa. 
[GitHub] twalthr opened a new pull request #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs
twalthr opened a new pull request #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534 ## What is the purpose of the change This PR enforces even more shading for SQL JARs than before. In the past, we only shaded Kafka dependencies. However, Flink's Kafka connectors mutually depend on each other and do not use version-specific package names (unlike e.g. Elasticsearch, where we use `elasticsearch2`, `elasticsearch3`, etc.), so relocating the connector classes as well is the only way of avoiding dependency conflicts between different Kafka SQL JARs. The end-to-end tests have been extended to detect classloading issues in builds. ## Brief change log - Add more relocation to Flink's Kafka SQL JARs ## Verifying this change SQL Client end-to-end tests have been adapted to detect classloading issues earlier. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): yes, but only for SQL JARs - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
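The relocation described in this PR can be sketched as a Maven Shade Plugin configuration. The surrounding plugin boilerplate here is an assumption (the XML markup in the quoted diff was stripped during extraction), but the two `pattern`/`shadedPattern` pairs match the ones visible in the diff:

```xml
<!-- Sketch of the relocations for the Kafka 0.10 SQL JAR; plugin wiring
     is an assumption, the relocation patterns come from the diff above. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Relocate the Kafka client classes, as was done before this PR. -->
      <relocation>
        <pattern>org.apache.kafka</pattern>
        <shadedPattern>org.apache.flink.kafka010.shaded.org.apache.kafka</shadedPattern>
      </relocation>
      <!-- New in this PR: also relocate Flink's own connector classes, so
           the 0.9 and 0.10 SQL JARs cannot shadow each other's classes. -->
      <relocation>
        <pattern>org.apache.flink.streaming.connectors.kafka</pattern>
        <shadedPattern>org.apache.flink.kafka010.shaded.org.apache.flink.streaming.connectors.kafka</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```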
[GitHub] pnowojski commented on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs
pnowojski commented on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534#issuecomment-433452892 e This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] pnowojski removed a comment on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs
pnowojski removed a comment on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534#issuecomment-433452892 e This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] pnowojski closed pull request #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs
pnowojski closed pull request #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-connectors/flink-connector-kafka-0.10/pom.xml b/flink-connectors/flink-connector-kafka-0.10/pom.xml index 135dc59b655..c7a840caafb 100644 --- a/flink-connectors/flink-connector-kafka-0.10/pom.xml +++ b/flink-connectors/flink-connector-kafka-0.10/pom.xml @@ -225,6 +225,7 @@ under the License. true sql-jar + org.apache.kafka:* @@ -232,6 +233,11 @@ under the License. org.apache.flink:flink-connector-kafka-0.9_${scala.binary.version} + + + + + *:* @@ -239,12 +245,23 @@ under the License. kafka/kafka-version.properties + + org.apache.flink:flink-connector-kafka-0.9_${scala.binary.version} + + META-INF/services/org.apache.flink.table.factories.TableFactory + + + org.apache.kafka org.apache.flink.kafka010.shaded.org.apache.kafka + + org.apache.flink.streaming.connectors.kafka + org.apache.flink.kafka010.shaded.org.apache.flink.streaming.connectors.kafka + diff --git a/flink-connectors/flink-connector-kafka-0.11/pom.xml b/flink-connectors/flink-connector-kafka-0.11/pom.xml index bf04aebb390..c6bf71e8d5a 100644 --- a/flink-connectors/flink-connector-kafka-0.11/pom.xml +++ b/flink-connectors/flink-connector-kafka-0.11/pom.xml @@ -234,6 +234,7 @@ under the License.
[jira] [Commented] (FLINK-10107) SQL Client end-to-end test fails for releases
[ https://issues.apache.org/jira/browse/FLINK-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665316#comment-16665316 ] ASF GitHub Bot commented on FLINK-10107: pnowojski commented on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534#issuecomment-433452599 @twalthr what is the problem here? Do I understand this correctly, that different versions of `flink-kafka-connector`s conflict with one another if they are used/loaded at the same time? And that's a potential production issue (albeit a rare use case), but it's causing problems in our end-to-end tests? If so why would this help: > A perfect solution would be: > > Move version-specific Kafka code into version-specific packages (e.g. o.a.f.kafka10) > Move version-agnostic Kafka code into a common package (e.g. o.a.f.kafka) ? Isn't the correct "perfect" solution to load each connector/plugin in a separate independent class loader, isolated from others? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > SQL Client end-to-end test fails for releases > - > > Key: FLINK-10107 > URL: https://issues.apache.org/jira/browse/FLINK-10107 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > > It seems that SQL JARs for Kafka 0.10 and Kafka 0.9 have conflicts that only > occur for releases and not SNAPSHOT builds. This might be due to their file > name. Depending on the file name either 0.9 is loaded before 0.10 and vice > versa. 
[GitHub] pnowojski commented on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs
pnowojski commented on issue #6534: [FLINK-10107] [sql-client] Relocate Flink Kafka connectors for SQL JARs URL: https://github.com/apache/flink/pull/6534#issuecomment-433452599 @twalthr what is the problem here? Do I understand this correctly, that different versions of `flink-kafka-connector`s conflict with one another if they are used/loaded at the same time? And that's a potential production issue (albeit a rare use case), but it's causing problems in our end-to-end tests? If so why would this help: > A perfect solution would be: > > Move version-specific Kafka code into version-specific packages (e.g. o.a.f.kafka10) > Move version-agnostic Kafka code into a common package (e.g. o.a.f.kafka) ? Isn't the correct "perfect" solution to load each connector/plugin in a separate independent class loader, isolated from others? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
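The alternative pnowojski raises, one isolated class loader per connector, can be sketched in plain Java. This is an illustration only, not Flink's actual plugin mechanism: each connector JAR would get its own `URLClassLoader` whose parent is only the platform loader, so classes from the application (or from other connectors) are simply not visible to it.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class IsolationDemo {

    /**
     * Returns true if the application's own classes leak into a loader
     * that is parented only at the platform class loader.
     */
    static boolean appClassVisibleFromIsolatedLoader() throws Exception {
        // In a real plugin system the URL array would point at the plugin's
        // JARs; an empty array is enough to demonstrate the isolation.
        try (URLClassLoader isolated =
                 new URLClassLoader(new URL[0], ClassLoader.getPlatformClassLoader())) {
            try {
                // IsolationDemo lives on the application classpath, which the
                // isolated loader deliberately cannot see.
                Class.forName("IsolationDemo", false, isolated);
                return true;   // visible -> no isolation
            } catch (ClassNotFoundException expected) {
                return false;  // invisible -> isolated as intended
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("isolated: " + !appClassVisibleFromIsolatedLoader());
    }
}
```

With one such loader per connector, two Kafka SQL JARs could never shadow each other's classes, which is the conflict the relocation in this PR works around.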
[jira] [Commented] (FLINK-10681) elasticsearch6.ElasticsearchSinkITCase fails if wrong JNA library installed
[ https://issues.apache.org/jira/browse/FLINK-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665309#comment-16665309 ] ASF GitHub Bot commented on FLINK-10681: tillrohrmann commented on a change in pull request #6928: [FLINK-10681] Harden ElasticsearchSinkITCase against wrong JNA library URL: https://github.com/apache/flink/pull/6928#discussion_r228574427 ## File path: flink-connectors/flink-connector-elasticsearch6/pom.xml ## @@ -278,6 +278,9 @@ under the License. maven-surefire-plugin 2.12.2 + + true Review comment: I think this can also be a problem in production if you have a wrong JNA library installed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > elasticsearch6.ElasticsearchSinkITCase fails if wrong JNA library installed > --- > > Key: FLINK-10681 > URL: https://issues.apache.org/jira/browse/FLINK-10681 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector, Tests >Affects Versions: 1.6.1, 1.7.0 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: pull-request-available > Fix For: 1.6.3, 1.7.0 > > > The {{elasticsearch6.ElasticsearchSinkITCase}} fails on systems where a wrong > JNA library is installed. > {code} > There is an incompatible JNA native library installed on this system > Expected: 5.2.0 > Found:4.0.0 > /usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib. 
> To resolve this issue you may do one of the following: > - remove or uninstall the offending library > - set the system property jna.nosys=true > - set jna.boot.library.path to include the path to the version of the >jnidispatch library included with the JNA jar file you are using > at com.sun.jna.Native.(Native.java:199) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:264) > at org.elasticsearch.bootstrap.Natives.(Natives.java:45) > at > org.elasticsearch.bootstrap.BootstrapInfo.isMemoryLocked(BootstrapInfo.java:50) > at > org.elasticsearch.monitor.process.ProcessProbe.processInfo(ProcessProbe.java:130) > at > org.elasticsearch.monitor.process.ProcessService.(ProcessService.java:44) > at > org.elasticsearch.monitor.MonitorService.(MonitorService.java:48) > at org.elasticsearch.node.Node.(Node.java:363) > at > org.apache.flink.streaming.connectors.elasticsearch.EmbeddedElasticsearchNodeEnvironmentImpl$PluginNode.(EmbeddedElasticsearchNodeEnvironmentImpl.java:85) > at > org.apache.flink.streaming.connectors.elasticsearch.EmbeddedElasticsearchNodeEnvironmentImpl.start(EmbeddedElasticsearchNodeEnvironmentImpl.java:53) > at > org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkTestBase.prepare(ElasticsearchSinkTestBase.java:73) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252) > at > org.apache.maven.surefire.junit4.JUnit4Provider.ex
[GitHub] tillrohrmann commented on a change in pull request #6928: [FLINK-10681] Harden ElasticsearchSinkITCase against wrong JNA library
tillrohrmann commented on a change in pull request #6928: [FLINK-10681] Harden ElasticsearchSinkITCase against wrong JNA library URL: https://github.com/apache/flink/pull/6928#discussion_r228574427 ## File path: flink-connectors/flink-connector-elasticsearch6/pom.xml ## @@ -278,6 +278,9 @@ under the License. maven-surefire-plugin 2.12.2 + + true Review comment: I think this can also be a problem in production if you have a wrong JNA library installed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
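The issue text quoted above lists `jna.nosys=true` as one possible resolution. Wiring that into the surefire plugin (the quoted diff's XML markup was stripped, so the element placement below is an assumption; only the property name comes from the error message) would look roughly like:

```xml
<!-- Hedged sketch: force tests to use JNA's bundled native library instead
     of any (possibly incompatible) system-wide one. Placement inside
     surefire's <systemPropertyVariables> is an assumption. -->
<plugin>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <jna.nosys>true</jna.nosys>
    </systemPropertyVariables>
  </configuration>
</plugin>
```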
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665305#comment-16665305 ] ASF GitHub Bot commented on FLINK-10600: pnowojski commented on issue #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors URL: https://github.com/apache/flink/pull/6924#issuecomment-433449682 @yanghua do you mean this end-to-end test is failing because of https://issues.apache.org/jira/browse/FLINK-10107 and it is blocking this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Provide End-to-end test cases for modern Kafka connectors > - > > Key: FLINK-10600 > URL: https://issues.apache.org/jira/browse/FLINK-10600 > Project: Flink > Issue Type: Sub-task > Components: Kafka Connector >Reporter: vinoyang >Assignee: vinoyang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] pnowojski commented on issue #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
pnowojski commented on issue #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors URL: https://github.com/apache/flink/pull/6924#issuecomment-433449682 @yanghua do you mean this end-to-end test is failing because of https://issues.apache.org/jira/browse/FLINK-10107 and it is blocking this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/excepions
[ https://issues.apache.org/jira/browse/FLINK-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz reassigned FLINK-10678: Assignee: Dawid Wysakowicz (was: Hequn Cheng) > Add a switch to run_test to configure if logs should be checked for > errors/excepions > > > Key: FLINK-10678 > URL: https://issues.apache.org/jira/browse/FLINK-10678 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Blocker > Fix For: 1.7.0 > > > After adding the switch, we should disable log checking for nightly-tests > that currently fail (or fix the test). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/excepions
[ https://issues.apache.org/jira/browse/FLINK-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665304#comment-16665304 ] Dawid Wysakowicz commented on FLINK-10678: -- Thx, then I will take it over. > Add a switch to run_test to configure if logs should be checked for > errors/excepions > > > Key: FLINK-10678 > URL: https://issues.apache.org/jira/browse/FLINK-10678 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Dawid Wysakowicz >Assignee: Hequn Cheng >Priority: Blocker > Fix For: 1.7.0 > > > After adding the switch, we should disable log checking for nightly-tests > that currently fail (or fix the test). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/excepions
[ https://issues.apache.org/jira/browse/FLINK-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665302#comment-16665302 ] Hequn Cheng commented on FLINK-10678: - Sure. I haven't started yet. > Add a switch to run_test to configure if logs should be checked for > errors/excepions > > > Key: FLINK-10678 > URL: https://issues.apache.org/jira/browse/FLINK-10678 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Dawid Wysakowicz >Assignee: Hequn Cheng >Priority: Blocker > Fix For: 1.7.0 > > > After adding the switch, we should disable log checking for nightly-tests > that currently fail (or fix the test). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9697) Provide connector for modern Kafka
[ https://issues.apache.org/jira/browse/FLINK-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665283#comment-16665283 ] ASF GitHub Bot commented on FLINK-9697: --- pnowojski commented on issue #6703: [FLINK-9697] Provide connector for modern Kafka URL: https://github.com/apache/flink/pull/6703#issuecomment-433442098 ok thx, we will have to clean this up somehow. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Provide connector for modern Kafka > -- > > Key: FLINK-9697 > URL: https://issues.apache.org/jira/browse/FLINK-9697 > Project: Flink > Issue Type: Improvement >Reporter: Ted Yu >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Kafka 2.0.0 would be released soon. > Here is vote thread: > [http://search-hadoop.com/m/Kafka/uyzND1vxnEd23QLxb?subj=+VOTE+2+0+0+RC1] > We should provide connector for Kafka 2.0.0 once it is released. > Upgrade to 2.0 documentation : > http://kafka.apache.org/20/documentation.html#upgrade_2_0_0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] pnowojski commented on issue #6703: [FLINK-9697] Provide connector for modern Kafka
pnowojski commented on issue #6703: [FLINK-9697] Provide connector for modern Kafka URL: https://github.com/apache/flink/pull/6703#issuecomment-433442098 ok thx, we will have to clean this up somehow. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] pnowojski commented on a change in pull request #6926: [FLINK-10623][e2e] Extended sql-client e2e test with MATCH_RECOGNIZE
pnowojski commented on a change in pull request #6926: [FLINK-10623][e2e] Extended sql-client e2e test with MATCH_RECOGNIZE URL: https://github.com/apache/flink/pull/6926#discussion_r228564019 ## File path: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/datastream/DataStreamMatchRule.scala ## @@ -42,11 +42,13 @@ class DataStreamMatchRule RelOptRule.convert(logicalMatch.getInput, FlinkConventions.DATASTREAM) try { - Class.forName("org.apache.flink.cep.pattern.Pattern") + Class +.forName("org.apache.flink.cep.pattern.Pattern", + false, + Thread.currentThread().getContextClassLoader) } catch { case ex: ClassNotFoundException => throw new TableException( -"MATCH RECOGNIZE clause requires flink-cep dependency to be present on the classpath.", -ex) +"MATCH RECOGNIZE clause requires flink-cep dependency to be present on the classpath.") Review comment: re 2: We know what has happened, but I think that `ClassNotFoundException` can have multiple different causes. One of them is that the class was indeed not found, but I think it can also be thrown if the class was found but the JVM failed to load/initialize it. Hiding this kind of thing might make debugging/troubleshooting impossible or more difficult. re 1: if that's the case, I would be inclined to change it at some point. Maybe log causes in the sql client to a file or provide a `verbose` mode. I'm not saying to do it now, but the decision to ignore/hide the exception cause shouldn't be made at this level anyway. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
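The three-argument `Class.forName` used in the diff checks for a class's presence without running its static initializers (`initialize = false`). A minimal stand-alone illustration of the pattern (the class names below are arbitrary JDK/placeholder names, not Flink's):

```java
public class PresenceCheck {

    /**
     * Checks whether a class can be found via the context class loader
     * without initializing it, mirroring the pattern in the diff above.
     */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, Thread.currentThread().getContextClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // Note: with initialize = false, static-initializer failures are
            // avoided entirely; genuine loading failures surface as
            // LinkageError subclasses (Errors), which are NOT caught here.
            // This is why swallowing the exception's cause can hide problems.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPresent("java.util.ArrayList"));      // present in the JDK
        System.out.println(isPresent("com.example.DoesNotExist")); // placeholder, absent
    }
}
```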
[jira] [Commented] (FLINK-10623) Extend streaming SQL end-to-end test to test MATCH_RECOGNIZE
[ https://issues.apache.org/jira/browse/FLINK-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665280#comment-16665280 ] ASF GitHub Bot commented on FLINK-10623: pnowojski commented on a change in pull request #6926: [FLINK-10623][e2e] Extended sql-client e2e test with MATCH_RECOGNIZE URL: https://github.com/apache/flink/pull/6926#discussion_r228564019
## File path: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/datastream/DataStreamMatchRule.scala ##
@@ -42,11 +42,13 @@ class DataStreamMatchRule
     RelOptRule.convert(logicalMatch.getInput, FlinkConventions.DATASTREAM)
     try {
-      Class.forName("org.apache.flink.cep.pattern.Pattern")
+      Class
+        .forName("org.apache.flink.cep.pattern.Pattern",
+          false,
+          Thread.currentThread().getContextClassLoader)
     } catch {
       case ex: ClassNotFoundException =>
         throw new TableException(
-          "MATCH RECOGNIZE clause requires flink-cep dependency to be present on the classpath.",
-          ex)
+          "MATCH RECOGNIZE clause requires flink-cep dependency to be present on the classpath.")
     }
Review comment: re 2: We know what has happened, but I think that `ClassNotFoundException` can have multiple different causes. One of them is that the class was indeed not found, but I think it can also be thrown if the class was found but the JVM failed to load/initialize it. Hiding this kind of thing might make debugging/troubleshooting impossible or more difficult. re 1: If that's the case, I would be inclined to change it at some point. Maybe log causes in the sql client to a file, or provide a `verbose` mode. I'm not saying to do it now, but the decision to ignore/hide the exception cause shouldn't be made at this level anyway.
> Extend streaming SQL end-to-end test to test MATCH_RECOGNIZE > > > Key: FLINK-10623 > URL: https://issues.apache.org/jira/browse/FLINK-10623 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL, Tests >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > We should extend the existing {{test_streaming_sql.sh}} to test the newly > added {{MATCH_RECOGNIZE}} functionality. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] zhijiangW commented on issue #6942: [hotfix] [docs] Fixed small typos in the documentation.
zhijiangW commented on issue #6942: [hotfix] [docs] Fixed small typos in the documentation. URL: https://github.com/apache/flink/pull/6942#issuecomment-433440581 Thanks for the contribution! @contradictioned LGTM!
[GitHub] zentol closed pull request #6942: [hotfix] [docs] Fixed small typos in the documentation.
zentol closed pull request #6942: [hotfix] [docs] Fixed small typos in the documentation. URL: https://github.com/apache/flink/pull/6942
This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:
diff --git a/docs/dev/batch/hadoop_compatibility.md b/docs/dev/batch/hadoop_compatibility.md
index c665b5d99a0..4e481822cf2 100644
--- a/docs/dev/batch/hadoop_compatibility.md
+++ b/docs/dev/batch/hadoop_compatibility.md
@@ -75,7 +75,7 @@ if you only want to use your Hadoop data types. See the
 To use Hadoop `InputFormats` with Flink the format must first be wrapped using either `readHadoopFile` or `createHadoopInput` of the
-`HadoopInputs` utilty class.
+`HadoopInputs` utility class.
 The former is used for input formats derived from `FileInputFormat` while the latter has to be used for general purpose input formats.
diff --git a/docs/dev/connectors/cassandra.md b/docs/dev/connectors/cassandra.md
index c4c3e3a4751..2a2acb3bc38 100644
--- a/docs/dev/connectors/cassandra.md
+++ b/docs/dev/connectors/cassandra.md
@@ -77,7 +77,7 @@ The following configuration methods can be used:
    * Allows exactly-once processing for non-deterministic algorithms.
 6. _setFailureHandler([CassandraFailureHandler failureHandler])_
    * An __optional__ setting
-   * Sets the custom failur handler.
+   * Sets the custom failure handler.
 7. _build()_
    * Finalizes the configuration and constructs the CassandraSink instance.
diff --git a/docs/dev/table/connect.md b/docs/dev/table/connect.md
index b413390615a..5f1112706ce 100644
--- a/docs/dev/table/connect.md
+++ b/docs/dev/table/connect.md
@@ -83,7 +83,7 @@
 The **connector** describes the external system that stores the data of a table. Some systems support different **data formats**. For example, a table that is stored in Kafka or in files can encode its rows with CSV, JSON, or Avro. A database connector might need the table schema here. Whether or not a storage system requires the definition of a format, is documented for every [connector](connect.html#table-connectors). Different systems also require different [types of formats](connect.html#table-formats) (e.g., column-oriented formats vs. row-oriented formats). The documentation states which format types and connectors are compatible.
-The **table schema** defines the schema of a table that is exposed to SQL queries. It describes how a source maps the data format to the table schema and a sink vice versa. The schema has access to fields defined by the connector or format. It can use one or more fields for extracting or inserting [time attributes](streaming/time_attributes.html). If input fields have no determinstic field order, the schema clearly defines column names, their order, and origin.
+The **table schema** defines the schema of a table that is exposed to SQL queries. It describes how a source maps the data format to the table schema and a sink vice versa. The schema has access to fields defined by the connector or format. It can use one or more fields for extracting or inserting [time attributes](streaming/time_attributes.html). If input fields have no deterministic field order, the schema clearly defines column names, their order, and origin.
 The subsequent sections will cover each definition part ([connector](connect.html#table-connectors), [format](connect.html#table-formats), and [schema](connect.html#table-schema)) in more detail. The following example shows how to pass them:
@@ -113,7 +113,7 @@ schema: ...
 The table's type (`source`, `sink`, or `both`) determines how a table is registered. In case of table type `both`, both a table source and table sink are registered under the same name. Logically, this means that we can both read and write to such a table similarly to a table in a regular DBMS.
-For streaming queries, an [update mode](connect.html#update-mode) declares how to communicate between a dynamic table and the storage system for continous queries.
+For streaming queries, an [update mode](connect.html#update-mode) declares how to communicate between a dynamic table and the storage system for continuous queries.
 The following code shows a full example of how to connect to Kafka for reading Avro records.
@@ -673,7 +673,7 @@ connector:
   # (only MB granularity is supported)
   interval: 6 # optional: bulk flush interval (in milliseconds)
   back-off: # optional: backoff strategy ("disabled" by default)
-    type: ... # valid strategis are "disabled", "constant", or "exponential"
+    type: ... # valid strategies are "disabled", "constant", or "exponential"
     max-retries: 3 # optional: maximum number
[jira] [Updated] (FLINK-10632) Run general purpose test job with failures in per-job mode
[ https://issues.apache.org/jira/browse/FLINK-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-10632: --- Labels: pull-request-available (was: ) > Run general purpose test job with failures in per-job mode > -- > > Key: FLINK-10632 > URL: https://issues.apache.org/jira/browse/FLINK-10632 > Project: Flink > Issue Type: Sub-task > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Similar to FLINK-8973, we should add an end-to-end test which runs the general > datastream job with failures on a per-job cluster with HA enabled (either > directly the {{StandaloneJobClusterEntrypoint}} or a docker image based on > this entrypoint). > We should kill the TMs as well as the cluster entrypoint and verify that the > job recovers.
[jira] [Commented] (FLINK-10632) Run general purpose test job with failures in per-job mode
[ https://issues.apache.org/jira/browse/FLINK-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665265#comment-16665265 ] ASF GitHub Bot commented on FLINK-10632: dawidwys opened a new pull request #6943: [FLINK-10632][e2e] Running general purpose testing job with failure in per-job mode URL: https://github.com/apache/flink/pull/6943 ## What is the purpose of the change Add a test that runs the general purpose job with failures in per-job cluster mode. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
> Run general purpose test job with failures in per-job mode > -- > > Key: FLINK-10632 > URL: https://issues.apache.org/jira/browse/FLINK-10632 > Project: Flink > Issue Type: Sub-task > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Similar to FLINK-8973, we should add an end-to-end test which runs the general > datastream job with failures on a per-job cluster with HA enabled (either > directly the {{StandaloneJobClusterEntrypoint}} or a docker image based on > this entrypoint). > We should kill the TMs as well as the cluster entrypoint and verify that the > job recovers.
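The "kill the TMs as well as the cluster entrypoint and verify that the job recovers" check described in the issue could look roughly like the shell sketch below. All names here (`list_jobs`, the process patterns, the polling parameters) are illustrative assumptions, not the contents of the actual e2e script:

```shell
#!/usr/bin/env bash
# Rough sketch (assumed names throughout) of the HA recovery verification
# described above: kill processes, then poll until the job is RUNNING again.

list_jobs() {
  # In the real script this would query the cluster, e.g. `flink list -r`.
  flink list -r 2>/dev/null
}

wait_for_job_state() {
  local job_id="$1" expected="$2" retries="${3:-30}"
  local i
  for ((i = 0; i < retries; i++)); do
    # Succeed as soon as the job shows up in the expected state.
    if list_jobs | grep -q "${job_id}.*(${expected})"; then
      return 0
    fi
    sleep 1
  done
  return 1
}

kill_and_verify_recovery() {
  local job_id="$1"
  # Kill one TaskManager and the cluster entrypoint; with HA enabled both
  # should come back and the job should return to RUNNING.
  pkill -f 'TaskManagerRunner' || true
  pkill -f 'StandaloneJobClusterEntryPoint' || true
  wait_for_job_state "${job_id}" "RUNNING"
}
```

A test script would call `kill_and_verify_recovery "${JOB_ID}"` once the job is up and fail if it returns non-zero.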
[jira] [Commented] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/excepions
[ https://issues.apache.org/jira/browse/FLINK-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665262#comment-16665262 ] Dawid Wysakowicz commented on FLINK-10678: -- Hi [~hequn8128], thanks for your interest in this issue. Can I take it? I actually have a fix for this; I just forgot to assign myself. I hope this is ok. > Add a switch to run_test to configure if logs should be checked for > errors/excepions > > > Key: FLINK-10678 > URL: https://issues.apache.org/jira/browse/FLINK-10678 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Dawid Wysakowicz >Assignee: Hequn Cheng >Priority: Blocker > Fix For: 1.7.0 > > > After adding the switch, we should disable log checking for nightly-tests > that currently fail (or fix the test).
[jira] [Commented] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/excepions
[ https://issues.apache.org/jira/browse/FLINK-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665259#comment-16665259 ] Hequn Cheng commented on FLINK-10678: - I think it is also useful for tests that contain a lot of legitimate exceptions, such as [FLINK-10668|https://issues.apache.org/jira/browse/FLINK-10668] > Add a switch to run_test to configure if logs should be checked for > errors/excepions > > > Key: FLINK-10678 > URL: https://issues.apache.org/jira/browse/FLINK-10678 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Dawid Wysakowicz >Assignee: Hequn Cheng >Priority: Blocker > Fix For: 1.7.0 > > > After adding the switch, we should disable log checking for nightly-tests > that currently fail (or fix the test).
[jira] [Assigned] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/excepions
[ https://issues.apache.org/jira/browse/FLINK-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng reassigned FLINK-10678: --- Assignee: Hequn Cheng > Add a switch to run_test to configure if logs should be checked for > errors/excepions > > > Key: FLINK-10678 > URL: https://issues.apache.org/jira/browse/FLINK-10678 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.7.0 >Reporter: Dawid Wysakowicz >Assignee: Hequn Cheng >Priority: Blocker > Fix For: 1.7.0 > > > After adding the switch, we should disable log checking for nightly-tests > that currently fail (or fix the test).
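A minimal sketch of the switch discussed in FLINK-10678 might look as follows. The argument name, `LOG_DIR`, and the whitelist pattern are assumptions for illustration, not the actual `run_test` implementation in the Flink e2e scripts:

```shell
#!/usr/bin/env bash
# Sketch (assumed names) of a run_test switch that lets an individual test
# opt out of the post-run scan of the logs for errors/exceptions.

LOG_DIR="${LOG_DIR:-/tmp/flink-logs}"

check_logs_for_exceptions() {
  # Fail if any log line mentions an exception that is not whitelisted.
  if grep -rE "Exception" "${LOG_DIR}" 2>/dev/null \
      | grep -vE "RemoteTransportException"; then
    echo "Found unexpected exception in logs"
    return 1
  fi
  return 0
}

run_test() {
  local description="$1"
  local test_cmd="$2"
  local skip_log_check="${3:-false}"   # the proposed switch

  echo "Running: ${description}"
  eval "${test_cmd}" || return 1

  # Only scan the logs when the test did not opt out.
  if [[ "${skip_log_check}" != "true" ]]; then
    check_logs_for_exceptions || return 1
  fi
}
```

With this shape, a nightly test that currently fails on legitimate exceptions could pass `true` as the third argument until its log handling is fixed.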
[jira] [Comment Edited] (FLINK-10668) Streaming File Sink E2E test fails because not all legitimate exceptions are excluded
[ https://issues.apache.org/jira/browse/FLINK-10668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665245#comment-16665245 ] Hequn Cheng edited comment on FLINK-10668 at 10/26/18 2:26 PM: --- [~gjy] Hi, I need some advice from you. I see two options to solve the problem: - Exclude all legitimate errors/exceptions in {{check_logs_for_exceptions}} - Remove the logs in {{test_streaming_file_sink.sh}} when the test finishes, because `Streaming File Sink E2E test` contains exceptions that shouldn't fail the test. Furthermore, we already check the output results in this test. I tried to solve the issue with the first option, but I find that we have to exclude a lot of exceptions, and I'm not sure whether there are any side effects for other tests. So I prefer the second option: it is clearer and will not affect other tests. What do you think? was (Author: hequn8128): [~gjy] Hi, I see two options to solve the problem: - Exclude all legitimate errors/exceptions in {{check_logs_for_exceptions}} - Remove the logs in {{test_streaming_file_sink.sh}} when the test finishes, because `Streaming File Sink E2E test` contains exceptions that shouldn't fail the test. Furthermore, we already check the output results in this test. I tried to solve the issue with the first option, but I find that we have to exclude a lot of exceptions, and I'm not sure whether there are any side effects for other tests. So I prefer the second option: it is clearer and will not affect other tests. What do you think? > Streaming File Sink E2E test fails because not all legitimate exceptions are > excluded > - > > Key: FLINK-10668 > URL: https://issues.apache.org/jira/browse/FLINK-10668 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.6.1, 1.7.0 >Reporter: Gary Yao >Assignee: Hequn Cheng >Priority: Critical > Fix For: 1.6.3, 1.7.0 > > > Streaming File Sink E2E test fails because not all legitimate exceptions are > excluded. 
> The stacktrace below can appear in the logs generated by the test but > {{check_logs_for_exceptions}} does not exclude all expected exceptions. > {noformat} > java.io.IOException: Connecting the channel failed: Connecting to remote task > manager + 'xxx/10.0.x.xx:50849' has failed. This might indicate that the > remote task manager has been lost. > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:196) > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:133) > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:85) > at > org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:60) > at > org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:166) > at > org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:494) > at > org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:525) > at > org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:508) > at > org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:165) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:209) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704) > at java.lang.Thread.run(Thread.java:748) > Caused by: > 
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connecting to remote task manager + 'xxx/10.0.x.xx:50849' has failed. > This might indicate that the remote task manager has been lost. > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:219) > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:133) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPro
[jira] [Commented] (FLINK-10668) Streaming File Sink E2E test fails because not all legitimate exceptions are excluded
[ https://issues.apache.org/jira/browse/FLINK-10668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665245#comment-16665245 ] Hequn Cheng commented on FLINK-10668: - [~gjy] Hi, I see two options to solve the problem: - Exclude all legitimate errors/exceptions in {{check_logs_for_exceptions}} - Remove the logs in {{test_streaming_file_sink.sh}} when the test finishes, because `Streaming File Sink E2E test` contains exceptions that shouldn't fail the test. Furthermore, we already check the output results in this test. I tried to solve the issue with the first option, but I find that we have to exclude a lot of exceptions, and I'm not sure whether there are any side effects for other tests. So I prefer the second option: it is clearer and will not affect other tests. What do you think? > Streaming File Sink E2E test fails because not all legitimate exceptions are > excluded > - > > Key: FLINK-10668 > URL: https://issues.apache.org/jira/browse/FLINK-10668 > Project: Flink > Issue Type: Bug > Components: E2E Tests >Affects Versions: 1.6.1, 1.7.0 >Reporter: Gary Yao >Assignee: Hequn Cheng >Priority: Critical > Fix For: 1.6.3, 1.7.0 > > > Streaming File Sink E2E test fails because not all legitimate exceptions are > excluded. > The stacktrace below can appear in the logs generated by the test but > {{check_logs_for_exceptions}} does not exclude all expected exceptions. > {noformat} > java.io.IOException: Connecting the channel failed: Connecting to remote task > manager + 'xxx/10.0.x.xx:50849' has failed. This might indicate that the > remote task manager has been lost. 
> at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:196) > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:133) > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:85) > at > org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:60) > at > org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:166) > at > org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:494) > at > org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:525) > at > org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:508) > at > org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:165) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:209) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connecting to remote task manager + 'xxx/10.0.x.xx:50849' has failed. > This might indicate that the remote task manager has been lost. 
> at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:219) > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:133) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.Abstra
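The first option discussed above (excluding legitimate exceptions from the log scan) can be sketched like this. The pattern list, file layout, and function name are illustrative assumptions; the sketch also shows why the list grows quickly, which is the drawback raised in the comment:

```shell
#!/usr/bin/env bash
# Sketch (assumed names) of a whitelist-based log scan. Every exception that
# can legitimately appear while TaskManagers are being killed must be listed,
# which is why removing the logs after the test was suggested as simpler.

# Exceptions that may legitimately appear during this test.
EXPECTED_EXCEPTIONS=(
  "RemoteTransportException"
  "Connecting the channel failed"
  "This might indicate that the remote task manager has been lost"
)

logs_contain_unexpected_exceptions() {
  local log_file="$1"
  local filter
  # Join the whitelist into one alternation pattern for grep -E.
  filter="$(IFS='|'; echo "${EXPECTED_EXCEPTIONS[*]}")"
  # Succeeds (exit 0) only if an exception line survives the whitelist.
  grep -E "Exception" "${log_file}" | grep -qvE "${filter}"
}
```

A check script would fail the test whenever `logs_contain_unexpected_exceptions` returns success for any collected log file.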
[jira] [Commented] (FLINK-10623) Extend streaming SQL end-to-end test to test MATCH_RECOGNIZE
[ https://issues.apache.org/jira/browse/FLINK-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665232#comment-16665232 ] ASF GitHub Bot commented on FLINK-10623: dawidwys commented on a change in pull request #6926: [FLINK-10623][e2e] Extended sql-client e2e test with MATCH_RECOGNIZE URL: https://github.com/apache/flink/pull/6926#discussion_r228542484
## File path: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/datastream/DataStreamMatchRule.scala ##
@@ -42,11 +42,13 @@ class DataStreamMatchRule
     RelOptRule.convert(logicalMatch.getInput, FlinkConventions.DATASTREAM)
     try {
-      Class.forName("org.apache.flink.cep.pattern.Pattern")
+      Class
+        .forName("org.apache.flink.cep.pattern.Pattern",
+          false,
+          Thread.currentThread().getContextClassLoader)
     } catch {
       case ex: ClassNotFoundException =>
         throw new TableException(
-          "MATCH RECOGNIZE clause requires flink-cep dependency to be present on the classpath.",
-          ex)
+          "MATCH RECOGNIZE clause requires flink-cep dependency to be present on the classpath.")
     }
Review comment: The thing is we shouldn't, because: - the sql-client prints only the root cause - we already know what the original exception is. There is only one possible exception there.
> Extend streaming SQL end-to-end test to test MATCH_RECOGNIZE > > > Key: FLINK-10623 > URL: https://issues.apache.org/jira/browse/FLINK-10623 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL, Tests >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > We should extend the existing {{test_streaming_sql.sh}} to test the newly > added {{MATCH_RECOGNIZE}} functionality. 
[jira] [Commented] (FLINK-10623) Extend streaming SQL end-to-end test to test MATCH_RECOGNIZE
[ https://issues.apache.org/jira/browse/FLINK-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665228#comment-16665228 ]

ASF GitHub Bot commented on FLINK-10623:

pnowojski commented on a change in pull request #6926: [FLINK-10623][e2e] Extended sql-client e2e test with MATCH_RECOGNIZE
URL: https://github.com/apache/flink/pull/6926#discussion_r228539342

File path: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/datastream/DataStreamMatchRule.scala
(quotes the same DataStreamMatchRule diff as dawidwys's review comment on this thread)

Review comment: Shouldn't we at least log the original exception?
[jira] [Created] (FLINK-10693) Fix Scala EitherSerializer duplication
Stephan Ewen created FLINK-10693:

> Summary: Fix Scala EitherSerializer duplication
> Key: FLINK-10693
> URL: https://issues.apache.org/jira/browse/FLINK-10693
> Project: Flink
> Issue Type: Bug
> Components: Scala API
> Affects Versions: 1.6.1
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 1.7.0
>
> The Scala EitherSerializer has buggy duplication logic, which results in the nested serializers being shared and used concurrently even when they are not thread safe.
[jira] [Commented] (FLINK-10664) Flink: Checkpointing fails with S3 exception - Please reduce your request rate
[ https://issues.apache.org/jira/browse/FLINK-10664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665210#comment-16665210 ]

Stephan Ewen commented on FLINK-10664:

I would suggest trying {{flink-s3-fs-presto}} instead of relying on Hadoop being on the classpath. That file system makes fewer requests to S3 because it skips some redundant metadata operations.

> Flink: Checkpointing fails with S3 exception - Please reduce your request rate
>
> Key: FLINK-10664
> URL: https://issues.apache.org/jira/browse/FLINK-10664
> Project: Flink
> Issue Type: Improvement
> Components: JobManager, TaskManager
> Affects Versions: 1.5.4, 1.6.1
> Reporter: Pawel Bartoszek
> Priority: Major
>
> When a checkpoint is created for a job that has many operators, Flink can upload so many checkpoint files to S3 at the same time that S3 throttles the requests.
>
> {code:java}
> Caused by: org.apache.hadoop.fs.s3a.AWSS3IOException: saving output on flink/state-checkpoints/7bbd6495f90257e4bc037ecc08ba21a5/chk-19/4422b088-0836-4f12-bbbe-7e731da11231: com.amazonaws.services.s3.model.AmazonS3Exception: Please reduce your request rate. (Service: Amazon S3; Status Code: 503; Error Code: SlowDown; Request ID: ; S3 Extended Request ID: XXX), S3 Extended Request ID: XXX: Please reduce your request rate. (Service: Amazon S3; Status Code: 503; Error Code: SlowDown; Request ID: 5310EA750DF8B949; S3 Extended Request ID: XXX)
> at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:178)
> at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:121)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
> at org.apache.flink.runtime.fs.hdfs.HadoopDataOutputStream.close(HadoopDataOutputStream.java:52)
> at org.apache.flink.core.fs.ClosingFSDataOutputStream.close(ClosingFSDataOutputStream.java:64)
> at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.closeAndGetHandle(FsCheckpointStreamFactory.java:311){code}
>
> Can the upload be retried with some kind of back-off?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
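The back-off suggested at the end of the report can be sketched generically. This is not Flink's or the S3A filesystem's actual retry logic, just a minimal illustration of exponential back-off around a flaky operation; all names are hypothetical:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Generic exponential back-off sketch; NOT the real Flink/S3A retry code.
public class BackoffRetry {

    static <T> T retryWithBackoff(Callable<T> op, int maxAttempts, long baseDelayMs)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up after the last attempt
                }
                // Delay doubles per attempt: base, 2*base, 4*base, ...
                Thread.sleep(baseDelayMs << (attempt - 1));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated S3 throttling: the first two calls fail, the third succeeds.
        String result = retryWithBackoff(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new RuntimeException("503 SlowDown: Please reduce your request rate.");
            }
            return "upload complete";
        }, 5, 10);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

A production version would also add jitter and cap the maximum delay, so that many parallel uploaders do not retry in lockstep.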
[jira] [Commented] (FLINK-10682) EOFException occurs during deserialization of Avro class
[ https://issues.apache.org/jira/browse/FLINK-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665203#comment-16665203 ]

Stephan Ewen commented on FLINK-10682:

This is a deserialization error when reading records coming in over the network: the Avro deserializer tries to read more data than the record contains. I would check whether there is any inconsistency in the types / schemas or in the Avro configuration used on the sender and receiver side.

> EOFException occurs during deserialization of Avro class
>
> Key: FLINK-10682
> URL: https://issues.apache.org/jira/browse/FLINK-10682
> Project: Flink
> Issue Type: Bug
> Components: Type Serialization System
> Affects Versions: 1.5.4
> Environment: AWS EMR 5.17 (upgraded to Flink 1.5.4); 3 task managers, 1 job manager running in YARN in Hadoop; running on Amazon Linux with OpenJDK 1.8
> Reporter: Ben La Monica
> Priority: Critical
>
> I'm having trouble (it usually occurs after an hour of processing in a StreamExecutionEnvironment) with this failure message, and I'm at a loss for what is causing it. I'm running this in AWS on EMR 5.17 with 3 task managers and a job manager in a YARN cluster, and I've upgraded my Flink libraries to 1.5.4 to bypass another serialization issue and the Kerberos auth issues.
> The Avro classes that are being deserialized were generated with Avro 1.8.2.
> {code:java}
> 2018-10-22 16:12:10,680 [INFO ] class=o.a.flink.runtime.taskmanager.Task thread="Calculate Estimated NAV -> Split into single messages (3/10)" Calculate Estimated NAV -> Split into single messages (3/10) (de7d8fa7784903a475391d0168d56f2e) switched from RUNNING to FAILED.
> java.io.EOFException: null
> at org.apache.flink.core.memory.DataInputDeserializer.readLong(DataInputDeserializer.java:219)
> at org.apache.flink.core.memory.DataInputDeserializer.readDouble(DataInputDeserializer.java:138)
> at org.apache.flink.formats.avro.utils.DataInputDecoder.readDouble(DataInputDecoder.java:70)
> at org.apache.avro.io.ResolvingDecoder.readDouble(ResolvingDecoder.java:190)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
> at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:266)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
> at org.apache.flink.formats.avro.typeutils.AvroSerializer.deserialize(AvroSerializer.java:172)
> at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:208)
> at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:49)
> at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
> at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:140)
> at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:208)
> at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:116)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> at java.lang.Thread.run(Thread.java:748){code}
>
> Do you have any ideas on how I could further troubleshoot this issue?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
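The failure mode described above — a decoder asking for more bytes than the serialized record contains — can be reproduced in miniature with plain `java.io`. This is only an illustration of the schema-mismatch symptom, not Flink's or Avro's actual deserialization path:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Miniature reproduction of the symptom: the reader's schema expects an
// 8-byte long, but the record on the wire only carries 4 bytes.
public class SchemaUnderflow {

    static String readAsLong(byte[] record) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            long value = in.readLong(); // fails if fewer than 8 bytes remain
            return "read " + value;
        } catch (EOFException e) {
            return "EOFException: record shorter than the reader's schema expects";
        } catch (IOException e) {
            return "IOException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Sender serialized 4 bytes; receiver's schema tries to read a long.
        System.out.println(readAsLong(new byte[] {0, 0, 0, 42}));
    }
}
```

This is why comparing the writer-side and reader-side schemas (and the Avro versions that generated them) is usually the first step when this exception appears mid-record.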
[jira] [Commented] (FLINK-10683) Error while executing BLOB connection. java.io.IOException: Unknown operation
[ https://issues.apache.org/jira/browse/FLINK-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665185#comment-16665185 ]

Stephan Ewen commented on FLINK-10683:

This looks like an error (or byte flip) in the blob protocol. Does this take the system down fully, or does it only show up in the log while Flink continues to run?

> Error while executing BLOB connection. java.io.IOException: Unknown operation
>
> Key: FLINK-10683
> URL: https://issues.apache.org/jira/browse/FLINK-10683
> Project: Flink
> Issue Type: Bug
> Reporter: Yee
> Priority: Major
>
> ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 5
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:35,247 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 18
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:35,550 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
> java.io.IOException: Unknown type of BLOB addressing.
> at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:347)
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:127)
> 2018-10-26 01:49:35,854 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 3
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:36,159 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
> java.io.IOException: Unexpected number of incoming bytes: 50353152
> at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:368)
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:127)
> 2018-10-26 01:49:36,463 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 105
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:36,765 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 71
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:37,069 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 128
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:37,373 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
> java.io.IOException: Unexpected number of incoming bytes: 4302592
> at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:368)
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:127)
> 2018-10-26 01:49:37,676 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 115
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
> 2018-10-26 01:49:37,980 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
> java.io.IOException: Unknown operation 71
> at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
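The error shape in the log above — a single unexpected byte reported as "Unknown operation N" — is what a one-byte op-code dispatch produces when something other than a blob client writes to the port. The toy sketch below shows that pattern; the numeric constants and class name are invented for illustration and are not Flink's real blob protocol values:

```java
// Toy single-byte op-code dispatch. The constants are hypothetical and
// do NOT reflect Flink's actual blob protocol.
public class BlobOpDispatch {

    static final int PUT_OPERATION = 0;
    static final int GET_OPERATION = 1;

    static String describe(int op) {
        switch (op) {
            case PUT_OPERATION: return "put";
            case GET_OPERATION: return "get";
            // Anything else on the wire surfaces exactly as in the log above.
            default: return "Unknown operation " + op;
        }
    }

    public static void main(String[] args) {
        // 71 is ASCII 'G' -- plausibly the first byte of an HTTP "GET " line
        // from a scanner or misdirected client hitting the blob port.
        System.out.println(describe(71));
        System.out.println(describe(PUT_OPERATION));
    }
}
```

Notably, several of the logged op codes (71, 105, 115) fall in the printable ASCII range, which is consistent with text-protocol traffic reaching the blob port rather than corruption inside Flink.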
[GitHub] contradictioned opened a new pull request #6942: [hotfix] [docs] Fixed small typos in the documentation.
contradictioned opened a new pull request #6942: [hotfix] [docs] Fixed small typos in the documentation.
URL: https://github.com/apache/flink/pull/6942

Fixed some small typos that I spotted in the documentation.
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665123#comment-16665123 ]

ASF GitHub Bot commented on FLINK-10600:

yanghua edited a comment on issue #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
URL: https://github.com/apache/flink/pull/6924#issuecomment-433394209

@pnowojski With the maven-shade-plugin it is hard to package a jar (in `flink-example-streaming`) that contains neither the flink-kafka-0.10 connector nor the old Kafka client, which is why FLINK-10107 appeared. We could adopt a Maven profile instead, but that is not a good fit either.

> Provide End-to-end test cases for modern Kafka connectors
>
> Key: FLINK-10600
> URL: https://issues.apache.org/jira/browse/FLINK-10600
> Project: Flink
> Issue Type: Sub-task
> Components: Kafka Connector
> Reporter: vinoyang
> Assignee: vinoyang
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.7.0
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665122#comment-16665122 ]

ASF GitHub Bot commented on FLINK-10600:

yanghua commented on issue #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
URL: https://github.com/apache/flink/pull/6924#issuecomment-433394209

@pnowojski Using the maven-shade-plugin, it's hard to package a jar that doesn't contain the flink-kafka-0.10 connector or the old Kafka client, which is why FLINK-10107 appeared. We could adopt a Maven profile instead, but that is not a good fit.
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665102#comment-16665102 ]

ASF GitHub Bot commented on FLINK-10600:

yanghua commented on a change in pull request #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
URL: https://github.com/apache/flink/pull/6924#discussion_r228505807

File path: flink-end-to-end-tests/test-scripts/test-data/modern-kafka-common.sh

@@ -0,0 +1,152 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -o pipefail
+
+if [[ -z $TEST_DATA_DIR ]]; then
+  echo "Must run common.sh before modern-kafka-common.sh."
+  exit 1
+fi
+
+KAFKA_DIR=$TEST_DATA_DIR/kafka_2.11-2.0.0
+CONFLUENT_DIR=$TEST_DATA_DIR/confluent-5.0.0
+SCHEMA_REGISTRY_PORT=8082
+SCHEMA_REGISTRY_URL=http://localhost:${SCHEMA_REGISTRY_PORT}
+
+function setup_kafka_dist {
+  # download Kafka
+  mkdir -p $TEST_DATA_DIR
+  KAFKA_URL="https://archive.apache.org/dist/kafka/2.0.0/kafka_2.11-2.0.0.tgz"
+  echo "Downloading Kafka from $KAFKA_URL"
+  curl "$KAFKA_URL" > $TEST_DATA_DIR/modern-kafka.tgz
+
+  tar xzf $TEST_DATA_DIR/modern-kafka.tgz -C $TEST_DATA_DIR/
+
+  # fix kafka config
+  sed -i -e "s+^\(dataDir\s*=\s*\).*$+\1$TEST_DATA_DIR/zookeeper+" $KAFKA_DIR/config/zookeeper.properties
+  sed -i -e "s+^\(log\.dirs\s*=\s*\).*$+\1$TEST_DATA_DIR/kafka+" $KAFKA_DIR/config/server.properties
+}
+
+function setup_confluent_dist {

Review comment: Yes, I know. My strategy has two steps: 1) first make sure the end-to-end test passes; 2) then refactor it to reuse the original `kafka-common`, so that in the end there is no `modern-kafka-common`. Currently I am working on the first problem (a wrong dependency version causes a class not to be found).
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665094#comment-16665094 ]

ASF GitHub Bot commented on FLINK-10600:

pnowojski commented on a change in pull request #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
URL: https://github.com/apache/flink/pull/6924#discussion_r228504540

File path: flink-end-to-end-tests/test-scripts/test-data/modern-kafka-common.sh
(quotes the same modern-kafka-common.sh hunk as yanghua's review comment on this thread, anchored at `setup_confluent_dist`)

Review comment: Please deduplicate those two `kafka-common` files. Right now this `setup_confluent_dist` is not only duplicated but also unused.
[jira] [Commented] (FLINK-9697) Provide connector for modern Kafka
[ https://issues.apache.org/jira/browse/FLINK-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665092#comment-16665092 ]

ASF GitHub Bot commented on FLINK-9697:

yanghua commented on issue #6703: [FLINK-9697] Provide connector for modern Kafka
URL: https://github.com/apache/flink/pull/6703#issuecomment-433387046

hi @pnowojski see [here](https://github.com/apache/flink/pull/6890#issuecomment-431917344).

> Provide connector for modern Kafka
>
> Key: FLINK-9697
> URL: https://issues.apache.org/jira/browse/FLINK-9697
> Project: Flink
> Issue Type: Improvement
> Reporter: Ted Yu
> Assignee: vinoyang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
> Kafka 2.0.0 would be released soon.
> Here is the vote thread: http://search-hadoop.com/m/Kafka/uyzND1vxnEd23QLxb?subj=+VOTE+2+0+0+RC1
> We should provide a connector for Kafka 2.0.0 once it is released.
> Upgrade to 2.0 documentation: http://kafka.apache.org/20/documentation.html#upgrade_2_0_0

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9697) Provide connector for modern Kafka
[ https://issues.apache.org/jira/browse/FLINK-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665089#comment-16665089 ]

ASF GitHub Bot commented on FLINK-9697:
---------------------------------------

pnowojski commented on issue #6703: [FLINK-9697] Provide connector for modern Kafka
URL: https://github.com/apache/flink/pull/6703#issuecomment-433386403

@yanghua is this new `flink-connector-kafka` module being compiled/tested on Travis? I do not see the logs in the Travis builds, and `flink-connector-kafka` is missing from `MODULES_CONNECTORS` defined in `tools/travis/stage.sh`.
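The fix pnowojski is pointing at amounts to adding the new module to the connectors stage list. A minimal sketch, assuming `MODULES_CONNECTORS` is a comma-separated list of module paths (the surrounding contents shown here are illustrative, not the actual `tools/travis/stage.sh`):

```shell
# Illustrative excerpt only; the real tools/travis/stage.sh defines many more
# modules. The point is that flink-connectors/flink-connector-kafka must
# appear in the comma-separated MODULES_CONNECTORS list to be built on Travis.
MODULES_CONNECTORS="\
flink-connectors/flink-connector-kafka-0.10,\
flink-connectors/flink-connector-kafka-0.11,\
flink-connectors/flink-connector-kafka"

# sanity check that the modern connector is staged
case ",$MODULES_CONNECTORS," in
  *,flink-connectors/flink-connector-kafka,*) echo "modern kafka staged" ;;
  *) echo "modern kafka missing" ;;
esac
```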
[jira] [Commented] (FLINK-10603) Reduce kafka test duration
[ https://issues.apache.org/jira/browse/FLINK-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665088#comment-16665088 ]

vinoyang commented on FLINK-10603:
----------------------------------

[~pnowojski] In fact, no timeout has been added. Do you mean to reduce the timeouts further (via Kafka configuration)? The current state is that the test time of the modern Kafka connector is the same as the Kafka 0.11 connector, but much longer than 0.10.

> Reduce kafka test duration
> --------------------------
>
>                 Key: FLINK-10603
>                 URL: https://issues.apache.org/jira/browse/FLINK-10603
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Kafka Connector, Tests
>    Affects Versions: 1.7.0
>            Reporter: Chesnay Schepler
>            Assignee: vinoyang
>            Priority: Major
>              Labels: pull-request-available
>
> The tests for the modern kafka connector take more than 10 minutes, which is
> simply unacceptable.
[jira] [Commented] (FLINK-10603) Reduce kafka test duration
[ https://issues.apache.org/jira/browse/FLINK-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665080#comment-16665080 ]

Piotr Nowojski commented on FLINK-10603:
----------------------------------------

[~yanghua] Can we not decrease the timeouts in Kafka instead of increasing the timeouts in the tests?
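Concretely, shortening broker-side delays tends to cut test runtime more than raising test timeouts does. A hedged sketch of how the scripts' existing config-patching style could apply such overrides (the property names are standard Kafka broker settings, but the values below are illustrative choices, not settings taken from the Flink scripts):

```shell
# Append test-friendly overrides to a broker config file. These are real
# Kafka broker properties; the values are illustrative picks for speeding
# up a single-broker test cluster.
SERVER_PROPERTIES=$(mktemp)
cat >> "$SERVER_PROPERTIES" <<'EOF'
# start consumer group rebalancing immediately instead of waiting
group.initial.rebalance.delay.ms=0
# a single-broker test cluster does not need a replicated offsets topic
offsets.topic.replication.factor=1
# fail fast if ZooKeeper is unreachable
zookeeper.connection.timeout.ms=6000
EOF
grep -c '=' "$SERVER_PROPERTIES"
```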
[GitHub] yanghua commented on a change in pull request #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
yanghua commented on a change in pull request #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
URL: https://github.com/apache/flink/pull/6924#discussion_r228492647

## File path: flink-end-to-end-tests/test-scripts/test-data/modern-kafka-common.sh

@@ -0,0 +1,152 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -o pipefail
+
+if [[ -z $TEST_DATA_DIR ]]; then
+  echo "Must run common.sh before modern-kafka-common.sh."
+  exit 1
+fi
+
+KAFKA_DIR=$TEST_DATA_DIR/kafka_2.11-2.0.0
+CONFLUENT_DIR=$TEST_DATA_DIR/confluent-5.0.0
+SCHEMA_REGISTRY_PORT=8082
+SCHEMA_REGISTRY_URL=http://localhost:${SCHEMA_REGISTRY_PORT}
+
+function setup_kafka_dist {
+  # download Kafka
+  mkdir -p $TEST_DATA_DIR
+  KAFKA_URL="https://archive.apache.org/dist/kafka/2.0.0/kafka_2.11-2.0.0.tgz"
+  echo "Downloading Kafka from $KAFKA_URL"
+  curl "$KAFKA_URL" > $TEST_DATA_DIR/modern-kafka.tgz
+
+  tar xzf $TEST_DATA_DIR/modern-kafka.tgz -C $TEST_DATA_DIR/
+
+  # fix kafka config
+  sed -i -e "s+^\(dataDir\s*=\s*\).*$+\1$TEST_DATA_DIR/zookeeper+" $KAFKA_DIR/config/zookeeper.properties
+  sed -i -e "s+^\(log\.dirs\s*=\s*\).*$+\1$TEST_DATA_DIR/kafka+" $KAFKA_DIR/config/server.properties
+}
+
+function setup_confluent_dist {

Review comment:
Currently, modern-kafka-common is a simple copy of kafka-common, just to get the tests running as soon as possible. The final modification may land in kafka-common, so I don't think we need to move it yet?
[jira] [Commented] (FLINK-10600) Provide End-to-end test cases for modern Kafka connectors
[ https://issues.apache.org/jira/browse/FLINK-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665058#comment-16665058 ]

ASF GitHub Bot commented on FLINK-10600:
----------------------------------------

yanghua commented on a change in pull request #6924: [FLINK-10600] Provide End-to-end test cases for modern Kafka connectors
URL: https://github.com/apache/flink/pull/6924#discussion_r228492647

> Provide End-to-end test cases for modern Kafka connectors
> ----------------------------------------------------------
>
>                 Key: FLINK-10600
>                 URL: https://issues.apache.org/jira/browse/FLINK-10600
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Kafka Connector
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.7.0
[jira] [Commented] (FLINK-10624) Extend SQL client end-to-end to test new KafkaTableSink
[ https://issues.apache.org/jira/browse/FLINK-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665055#comment-16665055 ]

ASF GitHub Bot commented on FLINK-10624:
----------------------------------------

yanghua commented on issue #6927: [FLINK-10624] Extend SQL client end-to-end to test new KafkaTableSink
URL: https://github.com/apache/flink/pull/6927#issuecomment-433375753

@pnowojski I am currently running it locally. From the JM log, I found that multiple Kafka and connector dependencies are mixed together, resulting in class-not-found exceptions, which I'm dealing with.

> Extend SQL client end-to-end to test new KafkaTableSink
> --------------------------------------------------------
>
>                 Key: FLINK-10624
>                 URL: https://issues.apache.org/jira/browse/FLINK-10624
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Kafka Connector, Table API & SQL, Tests
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Assignee: vinoyang
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> With FLINK-9697, we added support for Kafka 2.0. We should also extend the
> existing streaming client end-to-end test to also test the new
> {{KafkaTableSink}} against Kafka 2.0.
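A generic way to hunt down this kind of dependency mixing (a hypothetical diagnostic, not part of the PR) is to list the class entries of every jar on the classpath and flag classes that appear in more than one jar:

```shell
# Hypothetical diagnostic: flag class files that appear in more than one jar.
# `unzip -Z1` lists archive entry names; the duplicate filter itself is
# plain sort/uniq on those names.
list_jar_classes() {
  local jar
  for jar in "$@"; do
    unzip -Z1 "$jar" '*.class' 2>/dev/null
  done
}

flag_duplicates() {
  sort | uniq -d
}

# usage, assuming a lib/ directory full of connector jars:
#   list_jar_classes lib/*.jar | flag_duplicates
```

Any class path printed by `flag_duplicates` is shaded or bundled by at least two jars, which is a common cause of `NoClassDefFoundError` when mismatched versions shadow each other.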