[jira] [Updated] (FLINK-29368) Modify DESCRIBE statement docs for new syntax

2022-09-20 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-29368:
--
Summary: Modify DESCRIBE statement docs for new syntax  (was: Modify 
DESCRIBE statement docs for now syntax)

> Modify DESCRIBE statement docs for new syntax
> -
>
> Key: FLINK-29368
> URL: https://issues.apache.org/jira/browse/FLINK-29368
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> In Flink 1.17.0, the DESCRIBE statement syntax will be changed to DESCRIBE/DESC 
> [EXTENDED] [catalog_name.][database_name.]table_name 
> [PARTITION(partition_spec)] [col_name], so the docs for this statement need 
> to be updated.
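For illustration, a minimal sketch of the new syntax driven through a 
TableEnvironment (the table, column, and datagen connector are made up for the 
example; only the statement forms come from the syntax above):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DescribeSyntaxSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "CREATE TABLE orders (user_id BIGINT, amount DOUBLE) "
                        + "WITH ('connector' = 'datagen')");
        // Existing form: describe the whole table.
        tEnv.executeSql("DESCRIBE orders").print();
        // New form: EXTENDED output narrowed to a single column.
        tEnv.executeSql("DESC EXTENDED orders amount").print();
    }
}
{code}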





[GitHub] [flink-ml] lindong28 commented on a diff in pull request #157: [FLINK-28906] Support windowing in AgglomerativeClustering

2022-09-20 Thread GitBox


lindong28 commented on code in PR #157:
URL: https://github.com/apache/flink-ml/pull/157#discussion_r975994492


##
flink-ml-core/src/main/java/org/apache/flink/ml/common/window/BoundedWindow.java:
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.window;
+
+/** A {@link Window} that groups all elements in a bounded stream into one window. */
+public class BoundedWindow implements Window {

Review Comment:
   Would it be simpler to just use an existing window with size = MAX_LONG?



##
flink-ml-core/src/main/java/org/apache/flink/ml/common/window/BoundedWindow.java:
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.window;
+
+/** A {@link Window} that groups all elements in a bounded stream into one window. */
+public class BoundedWindow implements Window {
+    private static final BoundedWindow INSTANCE = new BoundedWindow();
+
+    private BoundedWindow() {}
+
+    public static BoundedWindow get() {

Review Comment:
   Would it be better to use `getInstance()` to be consistent with 
`EuclideanDistanceMeasure::getInstance()`?
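   For reference, a sketch of the suggested rename (same singleton, only the 
accessor name changes; the body is the obvious `return INSTANCE;`):
   ```java
   // Renamed for consistency with EuclideanDistanceMeasure::getInstance().
   public static BoundedWindow getInstance() {
       return INSTANCE;
   }
   ```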



##
docs/content/docs/operators/clustering/agglomerativeclustering.md:
##
@@ -49,15 +49,16 @@ format of the merging information is
 
 ### Parameters
 
-| Key               | Default        | Type    | Required | Description                                                                                                          |
-|:------------------|:---------------|:--------|:---------|:---------------------------------------------------------------------------------------------------------------------|
-| numClusters       | `2`            | Integer | no       | The max number of clusters to create.                                                                                |
-| distanceThreshold | `null`         | Double  | no       | Threshold to decide whether two clusters should be merged.                                                           |
-| linkage           | `"ward"`       | String  | no       | Criterion for computing distance between two clusters. Supported values: `'ward', 'complete', 'single', 'average'`. |
-| computeFullTree   | `false`        | Boolean | no       | Whether computes the full tree after convergence.                                                                    |
-| distanceMeasure   | `"euclidean"`  | String  | no       | Distance measure. Supported values: `'euclidean', 'manhattan', 'cosine'`.                                            |
-| featuresCol       | `"features"`   | String  | no       | Features column name.                                                                                                |
-| predictionCol     | `"prediction"` | String  | no       | Prediction column name.                                                                                              |
+| Key               | Default        | Type    | Required | Description |
+| :---------------- | :------------- | :------ | :------- | :---------- |
+| numClusters       | `2`            | Integer | no       | The max number of clusters to create. |
+| distanceThreshold | `null`         | Double  | no
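   For illustration, a sketch of configuring these parameters; the setter names 
are assumed from the keys above via Flink ML's `set<ParamName>` convention:
   ```java
   // Sketch only: setters assumed from the parameter keys in the table.
   AgglomerativeClustering agglomerativeClustering =
           new AgglomerativeClustering()
                   .setNumClusters(2)
                   .setLinkage("ward")
                   .setDistanceMeasure("euclidean")
                   .setFeaturesCol("features")
                   .setPredictionCol("prediction");
   ```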

[jira] [Updated] (FLINK-29367) Avoid manifest corruption for incorrect checkpoint recovery

2022-09-20 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29367:
-
Priority: Blocker  (was: Major)

> Avoid manifest corruption for incorrect checkpoint recovery
> ---
>
> Key: FLINK-29367
> URL: https://issues.apache.org/jira/browse/FLINK-29367
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Caizhi Weng
>Priority: Blocker
> Fix For: table-store-0.3.0
>
>
> When the job runs to checkpoint N, if the user recovers from an old 
> checkpoint (such as checkpoint N-5), the sink of the current FTS will cause 
> manifest corruption because duplicate files may be committed.
> We should avoid such corruption; the storage should be robust enough.
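A standalone sketch (not Table Store code) of the kind of idempotence this asks 
for: a commit that skips file names it has already committed, so replaying an 
old checkpoint cannot duplicate manifest entries.
{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IdempotentCommitSketch {
    private final Set<String> committed = new HashSet<>();
    private final List<String> manifest = new ArrayList<>();

    public void commit(List<String> files) {
        for (String file : files) {
            if (committed.add(file)) { // add() returns false for duplicates
                manifest.add(file);
            }
        }
    }

    public static void main(String[] args) {
        IdempotentCommitSketch sketch = new IdempotentCommitSketch();
        sketch.commit(List.of("data-1", "data-2")); // committed before recovery
        sketch.commit(List.of("data-2", "data-3")); // replayed after recovery
        System.out.println(sketch.manifest); // [data-1, data-2, data-3]
    }
}
{code}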





[jira] [Updated] (FLINK-29369) Commit delete file failure due to Checkpoint aborted

2022-09-20 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29369:
-
Priority: Blocker  (was: Major)

> Commit delete file failure due to Checkpoint aborted
> 
>
> Key: FLINK-29369
> URL: https://issues.apache.org/jira/browse/FLINK-29369
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Blocker
> Fix For: table-store-0.3.0, table-store-0.2.1
>
>
> After a checkpoint abort, the files in cp5 may fall into cp6. Because the 
> compaction commit deletes files first and then adds them, this may lead to:
> - Delete a file
> - Add the same file again
> This causes the deleted file not to be found.
> We need to properly process the merge of the compaction files.
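A standalone sketch (not Table Store code) of the merge the last sentence calls 
for: a DELETE and a later ADD of the same file cancel out instead of being 
committed as a deletion of a file that is still live.
{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeChangesSketch {
    enum Kind { ADD, DELETE }

    // Merge per-file entries; an opposite pair for the same file cancels out.
    static Map<String, Kind> merge(List<Map.Entry<String, Kind>> changes) {
        Map<String, Kind> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Kind> change : changes) {
            Kind previous = merged.get(change.getKey());
            if (previous != null && previous != change.getValue()) {
                merged.remove(change.getKey()); // DELETE then ADD: a no-op
            } else {
                merged.put(change.getKey(), change.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // cp5's compaction deletes data-5; the replay into cp6 adds it back.
        System.out.println(
                merge(List.of(
                        Map.entry("data-5", Kind.DELETE),
                        Map.entry("data-5", Kind.ADD)))); // prints {}
    }
}
{code}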





[GitHub] [flink] yuzelin commented on a diff in pull request #20790: [FLINK-29228][hive] Align the schema of the HiveServer2 getMetadata with JDBC

2022-09-20 Thread GitBox


yuzelin commented on code in PR #20790:
URL: https://github.com/apache/flink/pull/20790#discussion_r976003878


##
flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/table/endpoint/hive/HiveServer2EndpointITCase.java:
##
@@ -572,6 +595,107 @@ public void testGetInfo() throws Exception {
 }
 }
 
+    @Test
+    public void testExecuteStatementInSyncMode() throws Exception {
+        TCLIService.Client client = createClient();
+        TSessionHandle sessionHandle =
+                client.OpenSession(new TOpenSessionReq()).getSessionHandle();
+        TOperationHandle operationHandle =
+                client.ExecuteStatement(new TExecuteStatementReq(sessionHandle, "SHOW CATALOGS"))
+                        .getOperationHandle();
+
+        assertThat(
+                        client.GetOperationStatus(new TGetOperationStatusReq(operationHandle))
+                                .getOperationState())
+                .isEqualTo(TOperationState.FINISHED_STATE);
+
+        RowSet rowSet =
+                RowSetFactory.create(
+                        client.FetchResults(
+                                        new TFetchResultsReq(
+                                                operationHandle,
+                                                TFetchOrientation.FETCH_NEXT,
+                                                Integer.MAX_VALUE))
+                                .getResults(),
+                        HIVE_CLI_SERVICE_PROTOCOL_V10);
+        Iterator<Object[]> iterator = rowSet.iterator();
+        List<List<Object>> actual = new ArrayList<>();
+        while (iterator.hasNext()) {
+            actual.add(new ArrayList<>(Arrays.asList(iterator.next())));
+        }
+
+        assertThat(actual)
+                .isEqualTo(Collections.singletonList(Collections.singletonList("hive")));
+    }

Review Comment:
   Is it necessary to call client.CloseSession?
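   For instance, a sketch of the shape this could take, assuming the 
Thrift-generated `TCloseSessionReq` from the HiveServer2 API:
   ```java
   // Sketch only: release the session even if an assertion fails.
   TSessionHandle sessionHandle =
           client.OpenSession(new TOpenSessionReq()).getSessionHandle();
   try {
       // ... execute statements and assertions against sessionHandle ...
   } finally {
       client.CloseSession(new TCloseSessionReq(sessionHandle));
   }
   ```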






[jira] [Created] (FLINK-29369) Commit delete file failure due to Checkpoint aborted

2022-09-20 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29369:


 Summary: Commit delete file failure due to Checkpoint aborted
 Key: FLINK-29369
 URL: https://issues.apache.org/jira/browse/FLINK-29369
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0, table-store-0.2.1


After a checkpoint abort, the files in cp5 may fall into cp6. Because the 
compaction commit deletes files first and then adds them, this may lead to:
- Delete a file
- Add the same file again

This causes the deleted file not to be found.

We need to properly process the merge of the compaction files.





[jira] [Assigned] (FLINK-29369) Commit delete file failure due to Checkpoint aborted

2022-09-20 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-29369:


Assignee: Jingsong Lee

> Commit delete file failure due to Checkpoint aborted
> 
>
> Key: FLINK-29369
> URL: https://issues.apache.org/jira/browse/FLINK-29369
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.3.0, table-store-0.2.1
>
>
> After a checkpoint abort, the files in cp5 may fall into cp6. Because the 
> compaction commit deletes files first and then adds them, this may lead to:
> - Delete a file
> - Add the same file again
> This causes the deleted file not to be found.
> We need to properly process the merge of the compaction files.





[GitHub] [flink] luoyuxia commented on pull request #20789: [FLINK-29152][hive] fix inconsistent behavior with Hive for `desc table` in Hive dialect

2022-09-20 Thread GitBox


luoyuxia commented on PR #20789:
URL: https://github.com/apache/flink/pull/20789#issuecomment-1253188618

   @flinkbot run azure





[jira] [Created] (FLINK-29368) Modify DESCRIBE statement docs for now syntax

2022-09-20 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-29368:
-

 Summary: Modify DESCRIBE statement docs for now syntax
 Key: FLINK-29368
 URL: https://issues.apache.org/jira/browse/FLINK-29368
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.17.0
Reporter: Yunhong Zheng
 Fix For: 1.17.0


In Flink 1.17.0, the DESCRIBE statement syntax will be changed to DESCRIBE/DESC 
[EXTENDED] [catalog_name.][database_name.]table_name 
[PARTITION(partition_spec)] [col_name], so the docs for this statement need to 
be updated.





[jira] [Updated] (FLINK-29368) Modify DESCRIBE statement docs for now syntax

2022-09-20 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-29368:
--
Issue Type: Improvement  (was: Bug)

> Modify DESCRIBE statement docs for now syntax
> -
>
> Key: FLINK-29368
> URL: https://issues.apache.org/jira/browse/FLINK-29368
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> In Flink 1.17.0, the DESCRIBE statement syntax will be changed to DESCRIBE/DESC 
> [EXTENDED] [catalog_name.][database_name.]table_name 
> [PARTITION(partition_spec)] [col_name], so the docs for this statement need 
> to be updated.





[jira] [Created] (FLINK-29367) Avoid manifest corruption for incorrect checkpoint recovery

2022-09-20 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29367:


 Summary: Avoid manifest corruption for incorrect checkpoint 
recovery
 Key: FLINK-29367
 URL: https://issues.apache.org/jira/browse/FLINK-29367
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Caizhi Weng
 Fix For: table-store-0.3.0


When the job runs to checkpoint N, if the user recovers from an old checkpoint 
(such as checkpoint N-5), the sink of the current FTS will cause manifest 
corruption because duplicate files may be committed.

We should avoid such corruption; the storage should be robust enough.





[GitHub] [flink-kubernetes-operator] mbalassi merged pull request #374: [hotfix] Remove strange configs from helm defaults

2022-09-20 Thread GitBox


mbalassi merged PR #374:
URL: https://github.com/apache/flink-kubernetes-operator/pull/374





[jira] [Updated] (FLINK-29366) Use flink-shaded-jackson to parse flink-conf.yaml

2022-09-20 Thread Yuan Kui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Kui updated FLINK-29366:
-
Summary: Use flink-shaded-jackson to parse flink-conf.yaml  (was: Use flink 
shaded jackson to parse flink-conf.yaml)

> Use flink-shaded-jackson to parse flink-conf.yaml
> 
>
> Key: FLINK-29366
> URL: https://issues.apache.org/jira/browse/FLINK-29366
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.13.3
>Reporter: Yuan Kui
>Priority: Major
>
> Now we use a simple implementation 
> (org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
> flink-conf.yaml, which can only parse simple key-value pairs.
> Although there have been discussions on this issue historically (see: 
> [https://github.com/stratosphere/stratosphere/issues/113]), in actual 
> production environments we often need to put complex structures into 
> flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
> using a YAML library to parse flink-conf.yaml instead of our own 
> implementation.
> In fact, the flink-core module already has a dependency on 
> flink-shaded-jackson, which can parse the YAML format, so we can use this jar 
> without adding more dependencies.
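For illustration, a minimal sketch of parsing a nested flink-conf.yaml with 
Jackson's YAML support; the `YAMLFactory` shown is the vanilla 
jackson-dataformat-yaml API that flink-shaded-jackson repackages:
{code:java}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

import java.io.File;
import java.io.IOException;

public class YamlConfSketch {
    public static void main(String[] args) throws IOException {
        ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
        // Unlike loadYAMLResource's flat "key: value" handling, readTree
        // yields a tree, so nested structures survive parsing.
        JsonNode root = mapper.readTree(new File("flink-conf.yaml"));
        System.out.println(root.toPrettyString());
    }
}
{code}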





[jira] [Updated] (FLINK-29366) Use flink-shaded-jackson library to parse flink-conf.yaml

2022-09-20 Thread Yuan Kui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Kui updated FLINK-29366:
-
Summary: Use flink-shaded-jackson library to parse flink-conf.yaml  (was: 
Use flink-shaded-jackson to parse flink-conf.yaml)

> Use flink-shaded-jackson library to parse flink-conf.yaml
> 
>
> Key: FLINK-29366
> URL: https://issues.apache.org/jira/browse/FLINK-29366
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.13.3
>Reporter: Yuan Kui
>Priority: Major
>
> Now we use a simple implementation 
> (org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
> flink-conf.yaml, which can only parse simple key-value pairs.
> Although there have been discussions on this issue historically (see: 
> [https://github.com/stratosphere/stratosphere/issues/113]), in actual 
> production environments we often need to put complex structures into 
> flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
> using a YAML library to parse flink-conf.yaml instead of our own 
> implementation.
> In fact, the flink-core module already has a dependency on 
> flink-shaded-jackson, which can parse the YAML format, so we can use this jar 
> without adding more dependencies.





[jira] [Updated] (FLINK-29366) Use flink shaded jackson to parse flink-conf.yaml

2022-09-20 Thread Yuan Kui (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Kui updated FLINK-29366:
-
Description: 
Now we use a simple implementation 
(org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
flink-conf.yaml, which can only parse simple key-value pairs.

Although there have been discussions on this issue historically (see: 
[https://github.com/stratosphere/stratosphere/issues/113]), in actual 
production environments we often need to put complex structures into 
flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
using a YAML library to parse flink-conf.yaml instead of our own 
implementation.

In fact, the flink-core module already has a dependency on flink-shaded-jackson, 
which can parse the YAML format, so we can use this jar without adding more 
dependencies.

  was:
Now we use a simple implementation 
(org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
flink-conf.yaml, which can only parse key-value pairs.

Although there have been discussions on this issue historically (see: 
https://github.com/stratosphere/stratosphere/issues/113), in actual 
production environments we often need to put complex structures into 
flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
using a YAML library to parse flink-conf.yaml instead of our own 
implementation.

In fact, the flink-core module already has a dependency on flink-shaded-jackson, 
which can parse the YAML format, so we can use this jar without adding more 
dependencies.


> Use flink shaded jackson to parse flink-conf.yaml
> 
>
> Key: FLINK-29366
> URL: https://issues.apache.org/jira/browse/FLINK-29366
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.13.3
>Reporter: Yuan Kui
>Priority: Major
>
> Now we use a simple implementation 
> (org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
> flink-conf.yaml, which can only parse simple key-value pairs.
> Although there have been discussions on this issue historically (see: 
> [https://github.com/stratosphere/stratosphere/issues/113]), in actual 
> production environments we often need to put complex structures into 
> flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
> using a YAML library to parse flink-conf.yaml instead of our own 
> implementation.
> In fact, the flink-core module already has a dependency on 
> flink-shaded-jackson, which can parse the YAML format, so we can use this jar 
> without adding more dependencies.





[jira] [Commented] (FLINK-29366) Use flink shaded jackson to parse flink-conf.yaml

2022-09-20 Thread Yuan Kui (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607487#comment-17607487
 ] 

Yuan Kui commented on FLINK-29366:
--

[~chesnay] What do you think about this idea?

> Use flink shaded jackson to parse flink-conf.yaml
> 
>
> Key: FLINK-29366
> URL: https://issues.apache.org/jira/browse/FLINK-29366
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.13.3
>Reporter: Yuan Kui
>Priority: Major
>
> Now we use a simple implementation 
> (org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
> flink-conf.yaml, which can only parse key-value pairs.
> Although there have been discussions on this issue historically (see: 
> https://github.com/stratosphere/stratosphere/issues/113), in actual 
> production environments we often need to put complex structures into 
> flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
> using a YAML library to parse flink-conf.yaml instead of our own 
> implementation.
> In fact, the flink-core module already has a dependency on 
> flink-shaded-jackson, which can parse the YAML format, so we can use this jar 
> without adding more dependencies.





[jira] [Created] (FLINK-29366) Use flink shaded jackson to parse flink-conf.yaml

2022-09-20 Thread Yuan Kui (Jira)
Yuan Kui created FLINK-29366:


 Summary: Use flink shaded jackson to parse flink-conf.yaml
 Key: FLINK-29366
 URL: https://issues.apache.org/jira/browse/FLINK-29366
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Affects Versions: 1.13.3
Reporter: Yuan Kui


Now we use a simple implementation 
(org.apache.flink.configuration.GlobalConfiguration#loadYAMLResource) to parse 
flink-conf.yaml, which can only parse key-value pairs.

Although there have been discussions on this issue historically (see: 
https://github.com/stratosphere/stratosphere/issues/113), in actual 
production environments we often need to put complex structures into 
flink-conf.yaml. A YAML library is then required for parsing, so I suggest 
using a YAML library to parse flink-conf.yaml instead of our own 
implementation.

In fact, the flink-core module already has a dependency on flink-shaded-jackson, 
which can parse the YAML format, so we can use this jar without adding more 
dependencies.





[GitHub] [flink] gaoyunhaii closed pull request #20786: [hotfix] make ParquetProtoWriters.ParquetProtoWriterBuilder public

2022-09-20 Thread GitBox


gaoyunhaii closed pull request #20786: [hotfix] make 
ParquetProtoWriters.ParquetProtoWriterBuilder public
URL: https://github.com/apache/flink/pull/20786





[jira] [Closed] (FLINK-29345) Too many open files in table store orc writer

2022-09-20 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-29345.

Resolution: Fixed

master: 835632c6e4758ad7d11ccbdb3a8ebb8dfa6aa709

> Too many open files in table store orc writer
> -
>
> Key: FLINK-29345
> URL: https://issues.apache.org/jira/browse/FLINK-29345
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Shammon
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
> Attachments: image-2022-09-20-11-57-11-373.png
>
>
>  !image-2022-09-20-11-57-11-373.png! 
> We can avoid reading the local file to obtain the config every time we create 
> a new writer by reusing the prepared configuration.





[GitHub] [flink-table-store] JingsongLi merged pull request #296: [FLINK-29345] Create reusing reader/writer config in orc format

2022-09-20 Thread GitBox


JingsongLi merged PR #296:
URL: https://github.com/apache/flink-table-store/pull/296





[jira] [Assigned] (FLINK-29339) JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager blocks main thread

2022-09-20 Thread Yun Gao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao reassigned FLINK-29339:
---

Assignee: Xuannan Su

> JobMasterPartitionTrackerImpl#requestShuffleDescriptorsFromResourceManager 
> blocks main thread
> -
>
> Key: FLINK-29339
> URL: https://issues.apache.org/jira/browse/FLINK-29339
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Chesnay Schepler
>Assignee: Xuannan Su
>Priority: Critical
>  Labels: pull-request-available
>
> {code:java}
> private List<ShuffleDescriptor> requestShuffleDescriptorsFromResourceManager(
>         IntermediateDataSetID intermediateDataSetID) {
>     Preconditions.checkNotNull(
>             resourceManagerGateway, "JobMaster is not connected to ResourceManager");
>     try {
>         return this.resourceManagerGateway
>                 .getClusterPartitionsShuffleDescriptors(intermediateDataSetID)
>                 .get(); // <-- there's your problem
>     } catch (Throwable e) {
>         throw new RuntimeException(
>                 String.format(
>                         "Failed to get shuffle descriptors of intermediate dataset %s from ResourceManager",
>                         intermediateDataSetID),
>                 e);
>     }
> }
> {code}
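For context, a sketch of the non-blocking shape such a call could take 
(illustrative only, not the actual fix): hand the CompletableFuture back to the 
caller and translate failures, rather than calling get() on the main thread.
{code:java}
// Illustrative sketch: return the future instead of blocking with get().
private CompletableFuture<List<ShuffleDescriptor>> requestShuffleDescriptorsFromResourceManager(
        IntermediateDataSetID intermediateDataSetID) {
    Preconditions.checkNotNull(
            resourceManagerGateway, "JobMaster is not connected to ResourceManager");
    return this.resourceManagerGateway
            .getClusterPartitionsShuffleDescriptors(intermediateDataSetID)
            .exceptionally(
                    e -> {
                        // Surfaces asynchronously to whoever composes on it.
                        throw new CompletionException(
                                String.format(
                                        "Failed to get shuffle descriptors of intermediate dataset %s from ResourceManager",
                                        intermediateDataSetID),
                                e);
                    });
}
{code}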





[jira] [Commented] (FLINK-23409) CrossITCase fails with "NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout"

2022-09-20 Thread felixzh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-23409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607477#comment-17607477
 ] 

felixzh commented on FLINK-23409:
-

I also encountered this problem. After investigating, the cause was that the 
YARN queue resources for this job were exhausted.

> CrossITCase fails with "NoResourceAvailableException: Slot request bulk is 
> not fulfillable! Could not allocate the required slot within slot request 
> timeout"
> -
>
> Key: FLINK-23409
> URL: https://issues.apache.org/jira/browse/FLINK-23409
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Table SQL / Planner
>Affects Versions: 1.14.0
>Reporter: Dawid Wysakowicz
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20548=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=5360d54c-8d94-5d85-304e-a89267eb785a=10074
> {code}
> Jul 16 09:21:37   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> Jul 16 09:21:37   at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> Jul 16 09:21:37   at 
> akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> Jul 16 09:21:37   at 
> akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> Jul 16 09:21:37   at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> Jul 16 09:21:37   at 
> akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> Jul 16 09:21:37   at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> Jul 16 09:21:37   at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> Jul 16 09:21:37   ... 4 more
> Jul 16 09:21:37 Caused by: java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Slot request bulk is not fulfillable! Could not allocate the required slot 
> within slot request timeout
> Jul 16 09:21:37   at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
> Jul 16 09:21:37   at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
> Jul 16 09:21:37   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
> Jul 16 09:21:37   at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> Jul 16 09:21:37   ... 31 more
> Jul 16 09:21:37 Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Slot request bulk is not fulfillable! Could not allocate the required slot 
> within slot request timeout
> Jul 16 09:21:37   at 
> org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkCheckerImpl.lambda$schedulePendingRequestBulkWithTimestampCheck$0(PhysicalSlotRequestBulkCheckerImpl.java:86)
> Jul 16 09:21:37   ... 24 more
> Jul 16 09:21:37 Caused by: java.util.concurrent.TimeoutException: Timeout has 
> occurred: 30 ms
> Jul 16 09:21:37   ... 25 more
> {code}





[jira] [Resolved] (FLINK-29325) Fix documentation bug on how to enable batch mode for streaming examples

2022-09-20 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang resolved FLINK-29325.
--
Resolution: Fixed

merged in master(1.17): 05600f844a904f34ab45f512715a76193974b497

release-1.16: de4aa4b7fee0f112fa3cfe66d0ad620841e18d74

release-1.15: 9ee1589c42565f47fdae8b82d488e6610bdb7fc6

> Fix documentation bug on how to enable batch mode for streaming examples
> 
>
> Key: FLINK-29325
> URL: https://issues.apache.org/jira/browse/FLINK-29325
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission, Documentation
>Affects Versions: 1.15.2
>Reporter: Jun He
>Assignee: Jun He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0, 1.15.3
>
>
> In the latest Flink docs, it says that we should use 
> '-Dexecution.runtime-mode=BATCH' to enable batch mode, but it does not 
> actually work. The wrong way is as below:
> bin/flink run -Dexecution.runtime-mode=BATCH examples/streaming/WordCount.jar
> We should use '--execution-mode batch' instead; the correct way is as below:
> bin/flink run examples/streaming/WordCount.jar --execution-mode batch





[GitHub] [flink] Myasuka closed pull request #20849: [FLINK-29325][documentation]Fix documentation bug on how to enable ba…

2022-09-20 Thread GitBox


Myasuka closed pull request #20849: [FLINK-29325][documentation]Fix 
documentation bug on how to enable ba…
URL: https://github.com/apache/flink/pull/20849





[jira] [Closed] (FLINK-17141) Name of SQL Operator is too long

2022-09-20 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-17141.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> Name of SQL Operator is too long
> 
>
> Key: FLINK-17141
> URL: https://issues.apache.org/jira/browse/FLINK-17141
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Wenlong Lyu
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
> Fix For: 1.15.0
>
>
> The name of the operator contains the detailed logic of the operator, which 
> makes it very long when there are a lot of columns. It is a disaster for 
> logging and the web UI, and it can also cost a lot of memory because we use 
> the name widely, such as in ExecutionVertex and failover messages.





[jira] [Commented] (FLINK-6573) Flink MongoDB Connector

2022-09-20 Thread Jiabao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607465#comment-17607465
 ] 

Jiabao Sun commented on FLINK-6573:
---

Thanks [~martijnvisser] for the feedback.

I just noticed that the mongo-flink repository seems to have been implemented 
via the new Source and Sink interfaces in recent commits. Compared to it, 
PR-20848 provides more powerful features such as parallel reads and writes and 
lookup abilities.

If this makes sense, I'm willing to create a FLIP to introduce it.

> Flink MongoDB Connector
> ---
>
> Key: FLINK-6573
> URL: https://issues.apache.org/jira/browse/FLINK-6573
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.2.0
> Environment: Linux Operating System, Mongo DB
>Reporter: Nagamallikarjuna
>Assignee: ZhuoYu Chen
>Priority: Not a Priority
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-15-14-41-07-514.png
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hi Community,
> Currently we are using Flink in our project. We have a huge amount of data 
> residing in MongoDB that we need to process with Flink, and we require 
> parallel data connectivity between Flink and MongoDB for both reads and 
> writes. We are currently planning to create this connector and contribute it 
> to the community.
> I will share further details once I receive your feedback.
> Please let us know if you have any concerns.





[GitHub] [flink-table-store] tsreaper merged pull request #295: [FLINK-29297] Group Table Store file writers into SingleFileWriter and RollingFileWriter

2022-09-20 Thread GitBox


tsreaper merged PR #295:
URL: https://github.com/apache/flink-table-store/pull/295





[GitHub] [flink] Myasuka commented on pull request #20849: [FLINK-29325][documentation]Fix documentation bug on how to enable ba…

2022-09-20 Thread GitBox


Myasuka commented on PR #20849:
URL: https://github.com/apache/flink/pull/20849#issuecomment-1253127101

   My forked CI passed: 
https://dev.azure.com/myasuka/flink/_build/results?buildId=436=results





[GitHub] [flink] HuangXingBo commented on pull request #20859: [release] release notes for the 1.16 release

2022-09-20 Thread GitBox


HuangXingBo commented on PR #20859:
URL: https://github.com/apache/flink/pull/20859#issuecomment-1253126326

   @alpinegizmo Thanks a lot for the review. My understanding is that the 
release notes should only contain what users need to know about the version 
upgrade, such as a config or dependency change, or a change in old behavior. 
New features will be covered in the release announcement.





[jira] [Closed] (FLINK-29222) Wrong behavior for Hive's load data inpath

2022-09-20 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-29222.
---
Resolution: Fixed

> Wrong behavior for Hive's load data inpath
> --
>
> Key: FLINK-29222
> URL: https://issues.apache.org/jira/browse/FLINK-29222
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.16.0
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>
> In Hive, `load data inpath` will remove the src file, while `load data local 
> inpath` won't remove the src file.
> But when using the following SQL with the Hive dialect:
> {code:java}
> load data local inpath 'test.txt' INTO TABLE tab2 {code}
> the file `test.txt` is removed, although the expectation is that `test.txt` 
> is not removed.
> The reason is that the parameter order is wrong in the call to 
> `HiveCatalog#loadTable(..., isOverWrite, isSourceLocal)`. It is called with:
> {code:java}
> hiveCatalog.loadTable(
>         ...,
>         hiveLoadDataOperation.isSrcLocal(),  // should be isOverwrite
>         hiveLoadDataOperation.isOverwrite()); // should be isSrcLocal{code}
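A sketch of the corrected call, with the two flags swapped into the declared 
order (the leading arguments stay elided as in the snippet above):
{code:java}
hiveCatalog.loadTable(
        ...,
        hiveLoadDataOperation.isOverwrite(),  // isOverwrite
        hiveLoadDataOperation.isSrcLocal());  // isSrcLocal
{code}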





[jira] [Closed] (FLINK-29185) Failed to execute USING JAR in Hive Dialect

2022-09-20 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-29185.
---
Resolution: Fixed

> Failed to execute USING JAR in Hive Dialect
> ---
>
> Key: FLINK-29185
> URL: https://issues.apache.org/jira/browse/FLINK-29185
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.16.0
>Reporter: Shengkai Fang
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>






[jira] [Closed] (FLINK-29045) Optimize error message in Flink SQL Client and Gateway when try to use Hive Dialect

2022-09-20 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-29045.
---
  Assignee: luoyuxia
Resolution: Fixed

> Optimize error message in Flink SQL Client and Gateway when try to use Hive 
> Dialect
> ---
>
> Key: FLINK-29045
> URL: https://issues.apache.org/jira/browse/FLINK-29045
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>
> Since Flink 1.15, if users want to use the Hive dialect, they have to swap 
> flink-table-planner-loader located in /lib with flink-table-planner_2.12 
> located in /opt.
> Noticing it bothers some users, as reported in [FLINK-27020| 
> https://issues.apache.org/jira/browse/FLINK-27020] and 
> [FLINK-28618|https://issues.apache.org/jira/browse/FLINK-28618].
> Although the documentation notes this, some users may still miss it. It would 
> be better to show a detailed error message and tell users how to deal with 
> such a case in the Flink SQL client.





[jira] [Comment Edited] (FLINK-29045) Optimize error message in Flink SQL Client and Gateway when try to use Hive Dialect

2022-09-20 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607244#comment-17607244
 ] 

Jark Wu edited comment on FLINK-29045 at 9/21/22 1:48 AM:
--

Fixed in 
 - master: 791d8396163a8eb045493f7333218c5d881cc6ff
 - release-1.16: 9e16d54b9ea0422a97bcbe20ebb244be54dc1c3c


was (Author: jark):
Fixed in 
 - master: 791d8396163a8eb045493f7333218c5d881cc6ff
 - release-1.16: TODO

> Optimize error message in Flink SQL Client and Gateway when try to use Hive 
> Dialect
> ---
>
> Key: FLINK-29045
> URL: https://issues.apache.org/jira/browse/FLINK-29045
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Reporter: luoyuxia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>
> Since Flink 1.15, if users want to use the Hive dialect, they have to swap 
> flink-table-planner-loader located in /lib with flink-table-planner_2.12 
> located in /opt.
> Noticing it bothers some users, as reported in [FLINK-27020| 
> https://issues.apache.org/jira/browse/FLINK-27020] and 
> [FLINK-28618|https://issues.apache.org/jira/browse/FLINK-28618].
> Although the documentation notes this, some users may still miss it. It would 
> be better to show a detailed error message and tell users how to deal with 
> such a case in the Flink SQL client.





[jira] [Comment Edited] (FLINK-29185) Failed to execute USING JAR in Hive Dialect

2022-09-20 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607023#comment-17607023
 ] 

Jark Wu edited comment on FLINK-29185 at 9/21/22 1:48 AM:
--

Fixed in
 - master: 3994788892fc761cf0c2fd09f362d4dab8f14c61
 - release-1.16: 82ab2918e992f747043dbe49d900b36fe28df282


was (Author: jark):
Fixed in
 - master: 3994788892fc761cf0c2fd09f362d4dab8f14c61
 - release-1.16: TODO

> Failed to execute USING JAR in Hive Dialect
> ---
>
> Key: FLINK-29185
> URL: https://issues.apache.org/jira/browse/FLINK-29185
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.16.0
>Reporter: Shengkai Fang
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>






[jira] [Closed] (FLINK-29191) Hive dialect can't get value for the variables set by set command

2022-09-20 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-29191.
---
Fix Version/s: 1.17.0
 Assignee: luoyuxia
   Resolution: Fixed

Fixed in 
 - master: 64c550c67c2d580f369dfaa6ff48e2e6816c6fcd
 - release-1.16: d4d855a3c08733afac935d87df6544f0811aef84

> Hive dialect can't get value for the variables set by  set command
> --
>
> Key: FLINK-29191
> URL: https://issues.apache.org/jira/browse/FLINK-29191
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.16.0
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>
> When using the Hive dialect, we can use
> {code:java}
> set k1=v1;
> {code}
> to set a variable in Flink's table config.
> But if we want to get the value of `k1` by using
> {code:java}
> set k1;
> {code}
> we will get nothing.
> The reason is that the Hive dialect won't look up the variable in Flink's 
> table config. To fix it, we also need to look it up in Flink's table config.
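A sketch of the fallback the last sentence describes; the names here are 
illustrative, not the actual patch:
{code:java}
// Illustrative sketch: check the Hive substitution variables first, then
// fall back to Flink's table config. `hiveVariables` is a hypothetical map
// and `tableConfig` is assumed to be the session's TableConfig.
private String lookupVariable(String key, Map<String, String> hiveVariables) {
    String value = hiveVariables.get(key);
    if (value == null) {
        value = tableConfig.getConfiguration().getString(key, null);
    }
    return value;
}
{code}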





[jira] [Comment Edited] (FLINK-29222) Wrong behavior for Hive's load data inpath

2022-09-20 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607246#comment-17607246
 ] 

Jark Wu edited comment on FLINK-29222 at 9/21/22 1:47 AM:
--

Fixed in  
 - master: 4448d9fd5e344bd0c2e197c2676c403bc2b665b9
 - release-1.16: bff0985aef4ed43681e6ad3bd81fc460bef3c6a5


was (Author: jark):
Fixed in  
 - master: 4448d9fd5e344bd0c2e197c2676c403bc2b665b9
 - release-1.16: TODO

> Wrong behavior for Hive's load data inpath
> --
>
> Key: FLINK-29222
> URL: https://issues.apache.org/jira/browse/FLINK-29222
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.16.0
>Reporter: luoyuxia
>Assignee: luoyuxia
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
>
> In Hive, `load data inpath` will remove the src file, while `load data local 
> inpath` won't remove the src file.
> But when using the following SQL with the Hive dialect:
> {code:java}
> load data local inpath 'test.txt' INTO TABLE tab2 {code}
> the file `test.txt` is removed, although the expectation is that `test.txt` 
> is not removed.
> The reason is that the parameter order is wrong in the call to 
> `HiveCatalog#loadTable(..., isOverWrite, isSourceLocal)`. It is called with:
> {code:java}
> hiveCatalog.loadTable(
>         ...,
>         hiveLoadDataOperation.isSrcLocal(),  // should be isOverwrite
>         hiveLoadDataOperation.isOverwrite()); // should be isSrcLocal{code}
>  
>  
>  





[GitHub] [flink] wuchong merged pull request #20774: [FLINK-29191][hive] fix Hive dialect can't get value for the variables via `SET` command

2022-09-20 Thread GitBox


wuchong merged PR #20774:
URL: https://github.com/apache/flink/pull/20774





[GitHub] [flink] wuchong merged pull request #20864: [BP-1.16][FLINK-29191][FLINK-29222][FLINK-29045][FLINK-29185][hive] Backport commits to release-1.16

2022-09-20 Thread GitBox


wuchong merged PR #20864:
URL: https://github.com/apache/flink/pull/20864





[GitHub] [flink] luoyuxia commented on pull request #20855: [FLINK-29337][hive] Fix fail to use Hive Dialect for non-hive table

2022-09-20 Thread GitBox


luoyuxia commented on PR #20855:
URL: https://github.com/apache/flink/pull/20855#issuecomment-1253096232

   @flinkbot run azure





[GitHub] [flink] afedulov commented on a diff in pull request #20757: [FLINK-27919] Add FLIP-27-based source for data generation (FLIP-238)

2022-09-20 Thread GitBox


afedulov commented on code in PR #20757:
URL: https://github.com/apache/flink/pull/20757#discussion_r975913726


##
flink-tests/src/test/java/org/apache/flink/api/connector/source/lib/util/RateLimitedSourceReaderITCase.java:
##
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.source.lib.util;
+
+import org.apache.flink.api.common.eventtime.WatermarkStrategy;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.connector.source.SourceReader;
+import org.apache.flink.api.connector.source.SourceReaderContext;
+import org.apache.flink.api.connector.source.SourceReaderFactory;
+import org.apache.flink.api.connector.source.datagen.DataGeneratorSource;
+import org.apache.flink.api.connector.source.datagen.GeneratorFunction;
+import org.apache.flink.api.connector.source.lib.NumberSequenceSource;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.test.junit5.MiniClusterExtension;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.jupiter.api.DisplayName;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.extension.RegisterExtension;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+import java.util.stream.LongStream;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.assertj.core.api.Assertions.assertThat;
+
+/** An integration test for rate limiting built into the DataGeneratorSource. */
+public class RateLimitedSourceReaderITCase extends TestLogger {

Review Comment:
   I like the idea of testing this. 
   Added a sketch, please let me know what you think: 
https://github.com/apache/flink/pull/20757/commits/a452c659e5e4ffa1e59ad0a722e9991c7757cdc1






[GitHub] [flink-jira-bot] dependabot[bot] opened a new pull request, #24: Bump urllib3 from 1.26.4 to 1.26.5

2022-09-20 Thread GitBox


dependabot[bot] opened a new pull request, #24:
URL: https://github.com/apache/flink-jira-bot/pull/24

   Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.4 to 1.26.5.

   Release notes

   Sourced from [urllib3's releases](https://github.com/urllib3/urllib3/releases).

   1.26.5

   :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: [read more in the v2.0 Roadmap](https://urllib3.readthedocs.io/en/latest/v2-roadmap.html)

   - Fixed deprecation warnings emitted in Python 3.10.
   - Updated vendored six library to 1.16.0.
   - Improved performance of URL parser when splitting the authority component.

   If you or your organization rely on urllib3, consider supporting us via [GitHub Sponsors](https://github.com/sponsors/urllib3).

   Changelog

   Sourced from [urllib3's changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst).

   1.26.5 (2021-05-26)

   - Fixed deprecation warnings emitted in Python 3.10.
   - Updated vendored six library to 1.16.0.
   - Improved performance of URL parser when splitting the authority component.

   Commits

   - [d161647](https://github.com/urllib3/urllib3/commit/d1616473df94b94f0f5ad19d2a6608cfe93b7cdf) Release 1.26.5
   - [2d4a3fe](https://github.com/urllib3/urllib3/commit/2d4a3fee6de2fa45eb82169361918f759269b4ec) Improve performance of sub-authority splitting in URL
   - [2698537](https://github.com/urllib3/urllib3/commit/2698537d52f8ff1f0bbb1d45cf018b118e91f637) Update vendored six to 1.16.0
   - [07bed79](https://github.com/urllib3/urllib3/commit/07bed791e9c391d8bf12950f76537dc3c6f90550) Fix deprecation warnings for Python 3.10 ssl module
   - [d725a9b](https://github.com/urllib3/urllib3/commit/d725a9b56bb8baf87c9e6eee0e9edf010034b63b) Add Python 3.10 to GitHub Actions
   - [339ad34](https://github.com/urllib3/urllib3/commit/339ad34c677c98fd9ad008de1d8bbeb9dbf34381) Use pytest==6.2.4 on Python 3.10+
   - [f271c9c](https://github.com/urllib3/urllib3/commit/f271c9c3149e20d7feffb6429b135bbb6c09ddf4) Apply latest Black formatting
   - [1884878](https://github.com/urllib3/urllib3/commit/1884878aac87ef0494b282e940c32c24ee917d52) [1.26] Properly proxy EOF on the SSLTransport test suite
   - See full diff in [compare view](https://github.com/urllib3/urllib3/compare/1.26.4...1.26.5)

   [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=urllib3=pip=1.26.4=1.26.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.

   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)

   ---

   Dependabot commands and options

   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
   - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
   - `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
   - `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
   - `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
   - `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language

   You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/apache/flink-jira-bot/network/alerts).



[GitHub] [flink-kubernetes-operator] HuangZhenQiu commented on a diff in pull request #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


HuangZhenQiu commented on code in PR #375:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/375#discussion_r975840333


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java:
##
@@ -165,7 +165,8 @@ public void submitApplicationCluster(
 if (requireHaMetadata) {
 validateHaMetadataExists(conf);
 }
-deployApplicationCluster(jobSpec, conf);
+
+deployApplicationCluster(jobSpec, removeOperatorConfigs(conf));

Review Comment:
   Do you mean submitSessionCluster in NativeFlinkService? 
kubernetesClusterDescriptor.deploySessionCluster takes cluster specification as 
input. It already helped to filter unnecessary configs?






[jira] [Updated] (FLINK-28755) Error when switching from stateless to savepoint upgrade mode

2022-09-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-28755:
---
Labels: pull-request-available  (was: )

> Error when switching from stateless to savepoint upgrade mode
> -
>
> Key: FLINK-28755
> URL: https://issues.apache.org/jira/browse/FLINK-28755
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0, kubernetes-operator-1.2.0
>Reporter: Gyula Fora
>Assignee: Gabor Somogyi
>Priority: Major
>  Labels: pull-request-available
>
> When using the savepoint upgrade mode the state.savepoints.dir currently 
> comes from the currently deployed spec / config.
> This causes a nullpointer exception when switching to savepoint upgrade mode 
> from stateless if state.savepoints.dir was previously undefined: 
> {noformat}
> org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59)
> org.apache.flink.kubernetes.operator.service.AbstractFlinkService.cancelJob(AbstractFlinkService.java:279)
> org.apache.flink.kubernetes.operator.service.NativeFlinkService.cancelJob(NativeFlinkService.java:93)
> org.apache.flink.kubernetes.operator.reconciler.deployment.ApplicationReconciler.cancelJob(ApplicationReconciler.java:172)
> org.apache.flink.kubernetes.operator.reconciler.deployment.ApplicationReconciler.cancelJob(ApplicationReconciler.java:52)
> org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractJobReconciler.reconcileSpecChange(AbstractJobReconciler.java:108)
> org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractFlinkResourceReconciler.reconcile(AbstractFlinkResourceReconciler.java:148)
> org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractFlinkResourceReconciler.reconcile(AbstractFlinkResourceReconciler.java:56)
> org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:115){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gaborgsomogyi opened a new pull request, #376: [FLINK-28755] Fix error when switching from stateless to savepoint upgrade mode

2022-09-20 Thread GitBox


gaborgsomogyi opened a new pull request, #376:
URL: https://github.com/apache/flink-kubernetes-operator/pull/376

   ## What is the purpose of the change
   
   Switching from stateless to savepoint upgrade mode blew up with an exception. The 
reason was that a stateless configuration doesn't require the `state.savepoints.dir` 
config, while the `cancel` operation of the old job wanted to use it to take a 
savepoint. In this PR I've copied the necessary config from the deploy config into 
the observe config to make it available, as sketched below.
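   
   A minimal sketch of the idea, assuming Flink's `Configuration` API (names are 
illustrative, not the PR's exact code):
   
   ```java
   import org.apache.flink.configuration.CheckpointingOptions;
   import org.apache.flink.configuration.Configuration;
   
   final class SavepointDirCarryOver {
       /** Carries state.savepoints.dir over so cancel-with-savepoint can find it. */
       static void copySavepointDir(Configuration deployConf, Configuration observeConf) {
           if (deployConf.contains(CheckpointingOptions.SAVEPOINT_DIRECTORY)) {
               observeConf.set(
                       CheckpointingOptions.SAVEPOINT_DIRECTORY,
                       deployConf.get(CheckpointingOptions.SAVEPOINT_DIRECTORY));
           }
       }
   }
   ```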
   
   ## Brief change log
   
   * Copied `state.savepoints.dir` from deploy config to observe config
   * Moved all upgrade mode tests to `ApplicationReconcilerUpgradeModeTest`
   * Split a single upgrade test into 9 which represent all the transition 
combinations
   * Enforce state pre-requisites in transitions which is described 
[here](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/job-management/#stateful-and-stateless-application-upgrades)
   * Minor beautifications here and there
   
   ## Verifying this change
   
   Existing + additional unit tests
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
no
 - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27962) KafkaSourceReader fails to commit consumer offsets for checkpoints

2022-09-20 Thread Tommy Schnabel (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommy Schnabel updated FLINK-27962:
---
Attachment: Screen Shot 2022-09-20 at 5.18.04 PM.png

> KafkaSourceReader fails to commit consumer offsets for checkpoints
> --
>
> Key: FLINK-27962
> URL: https://issues.apache.org/jira/browse/FLINK-27962
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4, 1.15.0
>Reporter: Dmytro
>Priority: Major
> Attachments: Screen Shot 2022-09-20 at 5.18.04 PM.png
>
>
> The KafkaSourceReader works well for many hours, then fails and re-connects 
> successfully, then continues to work for some time. After the first three 
> failures it hangs on "Offset commit failed" and never connects again. 
> Restarting the Flink job does help, and it works until the next "3 times fail".
> I am aware of [the 
> note|https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#consumer-offset-committing]
>  that the Kafka source does NOT rely on committed offsets for fault tolerance. 
> Committing offsets is only for exposing the progress of the consumer and 
> consumer group for monitoring.
> I agree if the failures are only periodic, but I would argue that complete 
> failures are unacceptable.
> *Failed to commit consumer offsets for checkpoint:*
> {code:java}
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:19:52,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464521
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:20:02,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464522
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:20:02,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464523
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets
> . fails permanently until the job restart
>  {code}
> *Consumer Config:*
> {code:java}
> allow.auto.create.topics = true
> auto.commit.interval.ms = 5000
> auto.offset.reset = none
> bootstrap.servers = [test.host.net:9093]
> check.crcs = true
> client.dns.lookup = use_all_dns_ips
> client.id = test-client-id
> client.rack =
> connections.max.idle.ms = 18
> default.api.timeout.ms = 6
> enable.auto.commit = false
> exclude.internal.topics = true
> fetch.max.bytes = 52428800
> fetch.max.wait.ms = 500
> fetch.min.bytes = 1
> group.id = test-group-id
> group.instance.id = null
> heartbeat.interval.ms = 3000
> interceptor.classes = []
> internal.leave.group.on.close = true
> internal.throw.on.fetch.stable.offset.unsupported = false
> isolation.level = read_uncommitted
> key.deserializer = class 
> org.apache.kafka.common.serialization.ByteArrayDeserializer
> max.partition.fetch.bytes = 1048576
> max.poll.interval.ms = 30
> max.poll.records = 500
> metadata.max.age.ms = 18
> metric.reporters = []
> metrics.num.samples = 2
> metrics.recording.level = INFO
> metrics.sample.window.ms = 3
> partition.assignment.strategy = [class 
> org.apache.kafka.clients.consumer.RangeAssignor]
> receive.buffer.bytes = 65536
> reconnect.backoff.max.ms = 1000
> reconnect.backoff.ms = 50
> request.timeout.ms = 6
> retry.backoff.ms = 100
> sasl.client.callback.handler.class = null
> sasl.jaas.config = [hidden]
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
> sasl.kerberos.min.time.before.relogin = 6
> sasl.kerberos.service.name = null
> sasl.kerberos.ticket.renew.jitter = 0.05
> sasl.kerberos.ticket.renew.window.factor = 0.8
> sasl.login.callback.handler.class = class 
> com.test.kafka.security.AzureAuthenticateCallbackHandler
> sasl.login.class = null
> sasl.login.refresh.buffer.seconds = 300
> sasl.login.refresh.min.period.seconds = 60
> sasl.login.refresh.window.factor = 0.8
> sasl.login.refresh.window.jitter = 0.05
> sasl.mechanism = OAUTHBEARER
> security.protocol = SASL_SSL
> security.providers = null

[jira] [Commented] (FLINK-27962) KafkaSourceReader fails to commit consumer offsets for checkpoints

2022-09-20 Thread Tommy Schnabel (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607408#comment-17607408
 ] 

Tommy Schnabel commented on FLINK-27962:


 !Screen Shot 2022-09-20 at 5.18.04 PM.png! 

> KafkaSourceReader fails to commit consumer offsets for checkpoints
> --
>
> Key: FLINK-27962
> URL: https://issues.apache.org/jira/browse/FLINK-27962
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4, 1.15.0
>Reporter: Dmytro
>Priority: Major
> Attachments: Screen Shot 2022-09-20 at 5.18.04 PM.png
>
>
> The KafkaSourceReader works well for many hours, then fails and re-connects 
> successfully, then continues to work for some time. After the first three 
> failures it hangs on "Offset commit failed" and never connects again. 
> Restarting the Flink job does help, and it works until the next "3 times fail".
> I am aware of [the 
> note|https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#consumer-offset-committing]
>  that the Kafka source does NOT rely on committed offsets for fault tolerance. 
> Committing offsets is only for exposing the progress of the consumer and 
> consumer group for monitoring.
> I agree if the failures are only periodic, but I would argue that complete 
> failures are unacceptable.
> *Failed to commit consumer offsets for checkpoint:*
> {code:java}
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:19:52,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464521
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:20:02,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464522
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:20:02,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464523
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets
> . fails permanently until the job restart
>  {code}
> *Consumer Config:*
> {code:java}
> allow.auto.create.topics = true
> auto.commit.interval.ms = 5000
> auto.offset.reset = none
> bootstrap.servers = [test.host.net:9093]
> check.crcs = true
> client.dns.lookup = use_all_dns_ips
> client.id = test-client-id
> client.rack =
> connections.max.idle.ms = 18
> default.api.timeout.ms = 6
> enable.auto.commit = false
> exclude.internal.topics = true
> fetch.max.bytes = 52428800
> fetch.max.wait.ms = 500
> fetch.min.bytes = 1
> group.id = test-group-id
> group.instance.id = null
> heartbeat.interval.ms = 3000
> interceptor.classes = []
> internal.leave.group.on.close = true
> internal.throw.on.fetch.stable.offset.unsupported = false
> isolation.level = read_uncommitted
> key.deserializer = class 
> org.apache.kafka.common.serialization.ByteArrayDeserializer
> max.partition.fetch.bytes = 1048576
> max.poll.interval.ms = 30
> max.poll.records = 500
> metadata.max.age.ms = 18
> metric.reporters = []
> metrics.num.samples = 2
> metrics.recording.level = INFO
> metrics.sample.window.ms = 3
> partition.assignment.strategy = [class 
> org.apache.kafka.clients.consumer.RangeAssignor]
> receive.buffer.bytes = 65536
> reconnect.backoff.max.ms = 1000
> reconnect.backoff.ms = 50
> request.timeout.ms = 6
> retry.backoff.ms = 100
> sasl.client.callback.handler.class = null
> sasl.jaas.config = [hidden]
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
> sasl.kerberos.min.time.before.relogin = 6
> sasl.kerberos.service.name = null
> sasl.kerberos.ticket.renew.jitter = 0.05
> sasl.kerberos.ticket.renew.window.factor = 0.8
> sasl.login.callback.handler.class = class 
> com.test.kafka.security.AzureAuthenticateCallbackHandler
> sasl.login.class = null
> sasl.login.refresh.buffer.seconds = 300
> sasl.login.refresh.min.period.seconds = 60
> sasl.login.refresh.window.factor = 0.8
> sasl.login.refresh.window.jitter = 0.05
> sasl.mechanism = OAUTHBEARER
> security.protocol = SASL_SSL

[jira] [Comment Edited] (FLINK-27962) KafkaSourceReader fails to commit consumer offsets for checkpoints

2022-09-20 Thread Tommy Schnabel (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607407#comment-17607407
 ] 

Tommy Schnabel edited comment on FLINK-27962 at 9/20/22 9:21 PM:
-


Hi there, we're also seeing this at Twilio Segment on versions 1.15.1 and 
1.15.2, but _not_ on 1.14.4. I didn't test 1.15.0, but I imagine it's present 
there. Here's what we're seeing in one of our task managers' logs:
{code}
2022-09-20 15:58:07,978 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:07,981 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,029 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,055 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
{code}

Is there any ETA on getting this fixed?

Updating to attach one of our graphs, which shows 2/3 of our partitions not 
committing offsets while the lag of the other 1/3 does go down. Inspecting further, 
I've discovered that we _are_ processing all those supposedly lagged messages; the 
offsets are just not being committed back to Kafka. A configuration sketch for 
reference follows below.
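
A minimal sketch of a source whose offsets should be committed on checkpoint 
completion (bootstrap servers and group id are taken from the reported config; the 
topic name is a placeholder, and commit.offsets.on.checkpoint is shown explicitly 
even though true is its default):
{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;

class MonitoredKafkaSource {
    static KafkaSource<String> build() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("test.host.net:9093")
                .setTopics("my-topic")
                .setGroupId("test-group-id")
                // Offsets are committed only when a checkpoint completes, so
                // checkpointing must be enabled for any commits to happen at all.
                .setProperty("commit.offsets.on.checkpoint", "true")
                .setDeserializer(
                        KafkaRecordDeserializationSchema.valueOnly(new SimpleStringSchema()))
                .build();
    }
}
{code}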


was (Author: JIRAUSER294826):


Hi there, we're also seeing this at Twilio Segment on versions 1.15.1 and 
1.15.2, but _not_ on 1.14.4. I didn't test 1.15.0, but I imagine it's present 
there. Here's what we're seeing in one of our task managers' logs:
{code}
2022-09-20 15:58:07,978 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:07,981 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,029 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,055 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
{code}

Is there any ETA on getting this fixed?

> KafkaSourceReader fails to commit consumer offsets for checkpoints
> --
>
> Key: FLINK-27962
> URL: https://issues.apache.org/jira/browse/FLINK-27962
> Project: Flink
>  

[GitHub] [flink] afedulov commented on a diff in pull request #20865: [FLINK-14896][connectors/kinesis] Shade and relocate Jackson dependen…

2022-09-20 Thread GitBox


afedulov commented on code in PR #20865:
URL: https://github.com/apache/flink/pull/20865#discussion_r975820738


##
flink-connectors/flink-connector-kinesis/src/main/resources/META-INF/NOTICE:
##
@@ -17,6 +17,10 @@ This project bundles the following dependencies under the 
Apache Software Licens
 - com.amazonaws:aws-java-sdk-cloudwatch:1.12.276
 - com.amazonaws:dynamodb-streams-kinesis-adapter:1.5.3
 - com.amazonaws:jmespath-java:1.12.276
+- com.fasterxml.jackson.core:jackson-annotations:jar:2.13.2

Review Comment:
   I believe the `:jar` bit at the end is redundant and causes the build to 
fail because of the NOTICE checker.
   ```
   Dependency com.fasterxml.jackson.core:jackson-core:2.13.2 is not listed.
   Dependency com.fasterxml.jackson.core:jackson-annotations:2.13.2 is not 
listed.
   Dependency com.fasterxml.jackson.core:jackson-databind:2.13.2.2 is not 
listed.
   ...
   Dependency com.fasterxml.jackson.core:jackson-core:jar:2.13.2 is not 
bundled, but listed.
   Dependency com.fasterxml.jackson.core:jackson-annotations:jar:2.13.2 is not 
bundled, but listed.
   Dependency com.fasterxml.jackson.core:jackson-databind:jar:2.13.2.2 is not 
bundled, but listed.
   ```
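   
   In other words, presumably the entries should be listed without the `:jar` 
qualifier, matching what the checker reports as bundled:
   ```
   - com.fasterxml.jackson.core:jackson-annotations:2.13.2
   - com.fasterxml.jackson.core:jackson-core:2.13.2
   - com.fasterxml.jackson.core:jackson-databind:2.13.2.2
   ```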



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-27962) KafkaSourceReader fails to commit consumer offsets for checkpoints

2022-09-20 Thread Tommy Schnabel (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607407#comment-17607407
 ] 

Tommy Schnabel edited comment on FLINK-27962 at 9/20/22 9:15 PM:
-



Hi there, we're also seeing this at Twilio Segment on versions 1.15.1 and 
1.15.2, but _not_ on 1.14.4. I didn't test 1.15.0, but I imagine it's present 
there. Here's what we're seeing in one of our task managers' logs:
{code}
2022-09-20 15:58:07,978 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:07,981 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,029 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,055 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
{code}

Is there any ETA on getting this fixed?


was (Author: JIRAUSER294826):
Hi there, we're also seeing this at Twilio Segment on versions 1.15.1 and 
1.15.2, but _not_ on 1.14.4. I didn't test 1.15.0, but I imagine it's present 
there. Here's what we're seeing in one of our task managers' logs:
```
2022-09-20 15:58:07,978 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:07,981 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,029 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,055 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
```

Is there any ETA on getting this fixed?

> KafkaSourceReader fails to commit consumer offsets for checkpoints
> --
>
> Key: FLINK-27962
> URL: https://issues.apache.org/jira/browse/FLINK-27962
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4, 1.15.0
>Reporter: Dmytro
>Priority: Major
>
> The KafkaSourceReader works well for many hours, then fails and re-connects 
> successfully, then continues to work for some time. After the 

[jira] [Commented] (FLINK-27962) KafkaSourceReader fails to commit consumer offsets for checkpoints

2022-09-20 Thread Tommy Schnabel (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607407#comment-17607407
 ] 

Tommy Schnabel commented on FLINK-27962:


Hi there, we're also seeing this at Twilio Segment on versions 1.15.1 and 
1.15.2, but _not_ on 1.14.4. I didn't test 1.15.0, but I imagine it's present 
there. Here's what we're seeing in one of our task managers' logs:
```
2022-09-20 15:58:07,978 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:07,981 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,029 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
2022-09-20 15:58:08,055 WARN  
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to 
commit consumer offsets for checkpoint 363507
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit 
failed with a retriable exception. You should retry committing the latest 
consumed offsets.
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
```

Is there any ETA on getting this fixed?

> KafkaSourceReader fails to commit consumer offsets for checkpoints
> --
>
> Key: FLINK-27962
> URL: https://issues.apache.org/jira/browse/FLINK-27962
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4, 1.15.0
>Reporter: Dmytro
>Priority: Major
>
> The KafkaSourceReader works well for many hours, then fails and re-connects 
> successfully, then continues to work for some time. After the first three 
> failures it hangs on "Offset commit failed" and never connects again. 
> Restarting the Flink job does help, and it works until the next "3 times fail".
> I am aware of [the 
> note|https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#consumer-offset-committing]
>  that the Kafka source does NOT rely on committed offsets for fault tolerance. 
> Committing offsets is only for exposing the progress of the consumer and 
> consumer group for monitoring.
> I agree if the failures are only periodic, but I would argue that complete 
> failures are unacceptable.
> *Failed to commit consumer offsets for checkpoint:*
> {code:java}
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:19:52,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464521
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:20:02,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464522
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> 2022-06-06 14:20:02,297 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 464523
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets
> . fails permanently until the job restart
>  {code}
> *Consumer Config:*
> 

[jira] [Commented] (FLINK-6573) Flink MongoDB Connector

2022-09-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607404#comment-17607404
 ] 

Martijn Visser commented on FLINK-6573:
---

[~arvid] Thanks for the ping

[~jiabao.sun] Thanks for reaching out! Great to hear that you would like to 
introduce a new connector to the Flink ecosystem. 

In order to introduce a new connector, we'll first need to go through the FLIP 
process. See 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector
 as an example. For MongoDB, a new FLIP would need to be created, discussed and 
voted on. When the vote has passed, we can create a new repository (like 
github.com/apache/flink-connector-mongodb) where the source code for that 
connector can be stored. New connectors aren't currently being merged into 
Flink's main repo.

One question: I see there's also https://github.com/mongo-flink/mongo-flink - 
are you familiar with that connector? 

> Flink MongoDB Connector
> ---
>
> Key: FLINK-6573
> URL: https://issues.apache.org/jira/browse/FLINK-6573
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.2.0
> Environment: Linux Operating System, Mongo DB
>Reporter: Nagamallikarjuna
>Assignee: ZhuoYu Chen
>Priority: Not a Priority
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-15-14-41-07-514.png
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hi Community,
> We are currently using Flink in our project. We have a huge amount of 
> data to process with Flink, which resides in MongoDB. We require 
> parallel data connectivity between Flink and MongoDB for both 
> reads and writes. We are planning to create this connector and 
> contribute it to the community.
> I will provide further details once I receive your feedback. 
> Please let us know if you have any concerns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29361) How to set headers with the new Flink KafkaSink

2022-09-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607402#comment-17607402
 ] 

Martijn Visser commented on FLINK-29361:


Can you elaborate on what you did in FlinkKafkaProducer that doesn't work for 
you with KafkaSink? I don't think there should be a problem writing Kafka 
headers; see the sketch below. 
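
A minimal sketch of a serialization schema that attaches headers (the topic, header 
name, and header value are placeholders):
{code:java}
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.internals.RecordHeaders;

import java.nio.charset.StandardCharsets;

/** Serializes each String element and attaches a (placeholder) trace-id header. */
class HeaderedSerializationSchema implements KafkaRecordSerializationSchema<String> {
    @Override
    public ProducerRecord<byte[], byte[]> serialize(
            String element, KafkaSinkContext context, Long timestamp) {
        RecordHeaders headers = new RecordHeaders();
        headers.add("trace-id", "example".getBytes(StandardCharsets.UTF_8));
        return new ProducerRecord<>(
                "my-topic", null, timestamp, null,
                element.getBytes(StandardCharsets.UTF_8), headers);
    }
}
{code}

The schema would then be passed to the sink via 
KafkaSink.builder().setRecordSerializer(new HeaderedSerializationSchema()).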

> How to set headers with the new Flink KafkaSink
> ---
>
> Key: FLINK-29361
> URL: https://issues.apache.org/jira/browse/FLINK-29361
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Reporter: Xin Hao
>Priority: Minor
>
> I'm using Flink 1.15.2. When I try to migrate to the new KafkaSink, it seems 
> that it's not possible to add Kafka record headers.
> I think we should add this feature, or document it if we already have it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #371: [FLINK-29313] Observe spec should include latest operator configs + improved test coverage

2022-09-20 Thread GitBox


morhidi commented on code in PR #371:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/371#discussion_r975795996


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/controller/RollbackTest.java:
##
@@ -179,6 +181,17 @@ public void testRollbackFailureWithLastState(FlinkVersion 
flinkVersion) throws E
 assertEquals("RUNNING", 
dep.getStatus().getJobStatus().getState());
 assertEquals(1, flinkService.listJobs().size());
 
+// Trigger deployment recovery
+flinkService.clear();
+flinkService.setPortReady(false);
+
+testController.reconcile(dep, context);
+flinkService.setPortReady(true);
+testController.reconcile(dep, context);
+var jobs = flinkService.listJobs();
+// Make sure deployment was recovered with correct 
spec/config
+assertTrue(jobs.get(jobs.size() - 1).f2.containsKey("t"));

Review Comment:
   Do we lose any `kubernetes.operator*` setting for a job in case of a 
rollback?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29363) Allow web ui to fully redirect to other page

2022-09-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607401#comment-17607401
 ] 

Martijn Visser commented on FLINK-29363:


I don't immediately see why Flink should introduce this redirect. You can still 
enable solutions like 
https://developer.okta.com/blog/2018/08/28/nginx-auth-request to secure your 
Flink UI access, right? There's no redirect needed for that. 

> Allow web ui to fully redirect to other page
> 
>
> Key: FLINK-29363
> URL: https://issues.apache.org/jira/browse/FLINK-29363
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.15.2
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> In a streaming platform, the web UI usually integrates with an internal 
> authentication and authorization system. When validation fails, the 
> request needs to be redirected to a landing page. This doesn't work for AJAX 
> requests. It would be great to make the web UI configurable to allow a full 
> automatic redirect. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29365) Millisecond behind latest jumps after Flink 1.15.2 upgrade

2022-09-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607398#comment-17607398
 ] 

Martijn Visser commented on FLINK-29365:


CC [~dannycranmer]

> Millisecond behind latest jumps after Flink 1.15.2 upgrade
> --
>
> Key: FLINK-29365
> URL: https://issues.apache.org/jira/browse/FLINK-29365
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.15.2
> Environment: Redeployment from 1.14.4 to 1.15.2
>Reporter: Wilson Wu
>Priority: Major
> Attachments: Screen Shot 2022-09-19 at 2.50.56 PM.png
>
>
> (First time filing a ticket in the Flink community; please let me know if there 
> are any guidelines I need to follow.)
> I noticed very strange behavior with a recent version bump from Flink 
> 1.14.4 to 1.15.2. My project consumes around 30K records per second from a 
> sharded Kinesis stream, and during a version upgrade we follow the 
> best practice of first triggering a savepoint from the running job, starting the 
> new job from the savepoint, and then removing the old job. So far so good, and 
> the above logic has been tested multiple times without any issue on 1.14.4. 
> Usually, after a version upgrade, our job has a few minutes of delay in 
> milliseconds behind latest, but it catches up quickly (within 
> 30 mins). Our savepoint is around one hundred MB, and our job DAG 
> becomes 90-100% busy with some backpressure when we redeploy, but after 10-20 
> minutes it goes back to normal.
> Then the strange thing happened: when I tried to redeploy with the 1.15.2 upgrade 
> from a running 1.14.4 job, I could see that a savepoint had been created and the 
> new job was running, and all the metrics looked fine, except that suddenly 
> [milliseconds behind the 
> latest|https://flink.apache.org/news/2019/02/25/monitoring-best-practices.html]
>  jumped to 10 hours, and it takes days for my application to catch up with 
> the latest record in the Kinesis stream. I don't understand why it jumps from 0 
> seconds to 10+ hours when we restart the new job. The only significant change I 
> introduced with the version bump is changing 
> [failOnError|https://nightlies.apache.org/flink/flink-docs-release-1.15/api/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSink.html]
>  from true to false, but I don't think this is the root cause.
> I tried redeploying the new 1.15.2 job with a changed parallelism; 
> redeploying a job from 1.15.2 does not introduce a big delay, so I assume the 
> issue above only happens when we bump the version from 1.14.4 to 1.15.2 (note the 
> attached screenshot). I did try bumping it twice and saw the same 10hr+ 
> jump in delay; we do not have any changes related to timezones.
> Please let me know if this can be filed as a bug, as I do not have a running 
> project with all the Kinesis setup available that can reproduce the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #371: [FLINK-29313] Observe spec should include latest operator configs + improved test coverage

2022-09-20 Thread GitBox


morhidi commented on PR #371:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/371#issuecomment-1252899555

   Went through the PR; the situation with the configs is actually not that bad. 
Just by looking at the code I was not able to figure out what happens with the 
operator configs in case of a rollback. I guess we could potentially lose some 
in case those are not kept in the new spec too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #371: [FLINK-29313] Observe spec should include latest operator configs + improved test coverage

2022-09-20 Thread GitBox


morhidi commented on code in PR #371:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/371#discussion_r975795996


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/controller/RollbackTest.java:
##
@@ -179,6 +181,17 @@ public void testRollbackFailureWithLastState(FlinkVersion 
flinkVersion) throws E
 assertEquals("RUNNING", 
dep.getStatus().getJobStatus().getState());
 assertEquals(1, flinkService.listJobs().size());
 
+// Trigger deployment recovery
+flinkService.clear();
+flinkService.setPortReady(false);
+
+testController.reconcile(dep, context);
+flinkService.setPortReady(true);
+testController.reconcile(dep, context);
+var jobs = flinkService.listJobs();
+// Make sure deployment was recovered with correct 
spec/config
+assertTrue(jobs.get(jobs.size() - 1).f2.containsKey("t"));

Review Comment:
   Do we lose every `kubernetes.operator*` setting for a job in case of a 
rollback?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #371: [FLINK-29313] Observe spec should include latest operator configs + improved test coverage

2022-09-20 Thread GitBox


morhidi commented on code in PR #371:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/371#discussion_r975778961


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/crd/spec/AbstractFlinkSpec.java:
##
@@ -51,8 +53,7 @@ public abstract class AbstractFlinkSpec implements 
Diffable {
 /** Flink configuration overrides for the Flink deployment or Flink 
session job. */
 @SpecDiff.Config({
 @SpecDiff.Entry(prefix = "parallelism.default", type = 
DiffType.IGNORE),
-@SpecDiff.Entry(prefix = "kubernetes.operator", type = 
DiffType.IGNORE),
-@SpecDiff.Entry(prefix = "metrics.scope.k8soperator", type = 
DiffType.IGNORE)
+@SpecDiff.Entry(prefix = K8S_OP_CONF_PREFIX, type = DiffType.IGNORE),

Review Comment:
   Shall we keep the deprecated configs, e.g. `metrics.scope.k8soperator`, 
here for a while, or do we simply say that deprecated operator configs will 
trigger an upgrade? A sketch of the first option follows below.
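   
   If we keep it for a while, a sketch of what the annotation would look like (same 
entries as the diff, with the deprecated prefix retained):
   ```java
   @SpecDiff.Config({
       @SpecDiff.Entry(prefix = "parallelism.default", type = DiffType.IGNORE),
       @SpecDiff.Entry(prefix = K8S_OP_CONF_PREFIX, type = DiffType.IGNORE),
       // Deprecated prefix, kept so that it does not trigger spec-change upgrades.
       @SpecDiff.Entry(prefix = "metrics.scope.k8soperator", type = DiffType.IGNORE)
   })
   ```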



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #374: [hotfix] Remove strange configs from helm defaults

2022-09-20 Thread GitBox


morhidi commented on PR #374:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/374#issuecomment-1252860453

   +1 LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-29365) Millisecond behind latest jumps after Flink 1.15.2 upgrade

2022-09-20 Thread Wilson Wu (Jira)
Wilson Wu created FLINK-29365:
-

 Summary: Millisecond behind latest jumps after Flink 1.15.2 upgrade
 Key: FLINK-29365
 URL: https://issues.apache.org/jira/browse/FLINK-29365
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kinesis
Affects Versions: 1.15.2
 Environment: Redeployment from 1.14.4 to 1.15.2
Reporter: Wilson Wu
 Attachments: Screen Shot 2022-09-19 at 2.50.56 PM.png

(First time filing a ticket in the Flink community; please let me know if there 
are any guidelines I need to follow.)

I noticed very strange behavior with a recent version bump from Flink 1.14.4 
to 1.15.2. My project consumes around 30K records per second from a sharded 
Kinesis stream, and during a version upgrade we follow the best 
practice of first triggering a savepoint from the running job, starting the new job 
from the savepoint, and then removing the old job. So far so good, and the above 
logic has been tested multiple times without any issue on 1.14.4. Usually, 
after a version upgrade, our job has a few minutes of delay in 
milliseconds behind latest, but it catches up quickly (within 
30 mins). Our savepoint is around one hundred MB, and our job DAG 
becomes 90-100% busy with some backpressure when we redeploy, but after 10-20 
minutes it goes back to normal.

Then the strange thing happened: when I tried to redeploy with the 1.15.2 upgrade 
from a running 1.14.4 job, I could see that a savepoint had been created and the new 
job was running, and all the metrics looked fine, except that suddenly [milliseconds 
behind the 
latest|https://flink.apache.org/news/2019/02/25/monitoring-best-practices.html] 
jumped to 10 hours, and it takes days for my application to catch up with the 
latest record in the Kinesis stream. I don't understand why it jumps from 0 seconds 
to 10+ hours when we restart the new job. The only significant change I introduced 
with the version bump is changing 
[failOnError|https://nightlies.apache.org/flink/flink-docs-release-1.15/api/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSink.html]
 from true to false, but I don't think this is the root cause.

I tried redeploying the new 1.15.2 job with a changed parallelism; redeploying 
a job from 1.15.2 does not introduce a big delay, so I assume the issue above 
only happens when we bump the version from 1.14.4 to 1.15.2 (note the attached 
screenshot). I did try bumping it twice and saw the same 10hr+ jump in 
delay; we do not have any changes related to timezones.

Please let me know if this can be filed as a bug, as I do not have a running 
project with all the Kinesis setup available that can reproduce the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


gyfora commented on PR #375:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/375#issuecomment-1252807583

   Btw in this outstanding PR 
(https://github.com/apache/flink-kubernetes-operator/pull/371/files) I have 
added some convenient utilities to test the configuration that was submitted to 
the cluster. Maybe you could simply rebase on it to leverage it and we can make 
sure to merge that first.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


gyfora commented on code in PR #375:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/375#discussion_r975733433


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java:
##
@@ -165,7 +165,8 @@ public void submitApplicationCluster(
 if (requireHaMetadata) {
 validateHaMetadataExists(conf);
 }
-deployApplicationCluster(jobSpec, conf);
+
+deployApplicationCluster(jobSpec, removeOperatorConfigs(conf));

Review Comment:
   I think you might have forgotten to add it for session cluster deployments.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] afedulov commented on a diff in pull request #20757: [FLINK-27919] Add FLIP-27-based source for data generation (FLIP-238)

2022-09-20 Thread GitBox


afedulov commented on code in PR #20757:
URL: https://github.com/apache/flink/pull/20757#discussion_r975719283


##
flink-core/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceTest.java:
##
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.source.lib;
+
+import org.apache.flink.api.common.eventtime.Watermark;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.connector.source.ReaderOutput;
+import org.apache.flink.api.connector.source.SourceEvent;
+import org.apache.flink.api.connector.source.SourceOutput;
+import org.apache.flink.api.connector.source.SourceReader;
+import org.apache.flink.api.connector.source.SourceReaderContext;
+import org.apache.flink.api.connector.source.SplitEnumerator;
+import org.apache.flink.api.connector.source.datagen.DataGeneratorSource;
+import org.apache.flink.api.connector.source.datagen.GeneratorFunction;
+import org.apache.flink.api.connector.source.mocks.MockSplitEnumeratorContext;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.InputStatus;
+import org.apache.flink.metrics.groups.SourceReaderMetricGroup;
+import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
+import org.apache.flink.mock.Whitebox;
+import org.apache.flink.util.SimpleUserCodeClassLoader;
+import org.apache.flink.util.UserCodeClassLoader;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.DisplayName;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.List;
+import java.util.Queue;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+/** Tests for the {@link DataGeneratorSource}. */
+public class DataGeneratorSourceTest {
+
+@Test
+@DisplayName("Correctly restores SplitEnumerator from a snapshot.")
+public void testRestoreEnumerator() throws Exception {
+final GeneratorFunction generatorFunctionStateless = index 
-> index;
+final DataGeneratorSource dataGeneratorSource =
+new DataGeneratorSource<>(generatorFunctionStateless, 100, 
Types.LONG);
+
+final int parallelism = 2;
+final 
MockSplitEnumeratorContext context =
+new MockSplitEnumeratorContext<>(parallelism);
+
+SplitEnumerator<
+NumberSequenceSource.NumberSequenceSplit,
+Collection>
+enumerator = dataGeneratorSource.createEnumerator(context);
+
+// start() is not strictly necessary in the current implementation, 
but should logically be
+// executed in this order (protect against any breaking changes in the 
start() method).
+enumerator.start();
+
+Collection enumeratorState =
+enumerator.snapshotState(0);
+
+@SuppressWarnings("unchecked")
+final Queue splits =
+(Queue)
+Whitebox.getInternalState(enumerator, 
"remainingSplits");
+
+assertThat(splits).hasSize(parallelism);
+
+enumerator = dataGeneratorSource.restoreEnumerator(context, 
enumeratorState);
+
+@SuppressWarnings("unchecked")
+final Queue restoredSplits =
+(Queue)
+Whitebox.getInternalState(enumerator, 
"remainingSplits");
+
+assertThat(restoredSplits).hasSize(enumeratorState.size());

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29363) Allow web ui to fully redirect to other page

2022-09-20 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607365#comment-17607365
 ] 

Robert Metzger commented on FLINK-29363:


So the setup would be that there's an authenticating proxy between the Flink 
Web UI and Flink's REST API.
The problem currently is that if a REST API call fails, the UI just breaks 
instead of redirecting to another page.

How would we be able to distinguish between auth errors and generic errors? I 
guess based on the HTTP error codes?
One problem I see is that this setting is purely used in the UI, so we need a 
way of forwarding a "global setting" to the UI ... but I guess that's solvable.

> Allow web ui to fully redirect to other page
> 
>
> Key: FLINK-29363
> URL: https://issues.apache.org/jira/browse/FLINK-29363
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.15.2
>Reporter: Zhenqiu Huang
>Priority: Minor
>
> In a streaming platform, the web UI usually integrates with an internal 
> authentication and authorization system. When validation fails, the 
> request needs to be redirected to a landing page. This doesn't work for AJAX 
> requests. It would be great to make the web UI configurable to allow a full 
> automatic redirect. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] afedulov commented on a diff in pull request #20757: [FLINK-27919] Add FLIP-27-based source for data generation (FLIP-238)

2022-09-20 Thread GitBox


afedulov commented on code in PR #20757:
URL: https://github.com/apache/flink/pull/20757#discussion_r975658900


##
flink-tests/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceITCase.java:
##
@@ -0,0 +1,193 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.source.lib;
+
+import org.apache.flink.api.common.eventtime.WatermarkStrategy;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.connector.source.SourceReaderContext;
+import org.apache.flink.api.connector.source.datagen.DataGeneratorSource;
+import org.apache.flink.api.connector.source.datagen.GeneratorFunction;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.test.junit5.MiniClusterExtension;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.jupiter.api.DisplayName;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.extension.RegisterExtension;
+
+import java.util.List;
+import java.util.stream.Collectors;
+import java.util.stream.LongStream;
+
+import static org.apache.flink.core.testutils.FlinkAssertions.anyCauseMatches;
+import static org.assertj.core.api.Assertions.assertThat;
+import static org.assertj.core.api.AssertionsForClassTypes.assertThatThrownBy;
+
+/** An integration test for {@code DataGeneratorSource}. */
+public class DataGeneratorSourceITCase extends TestLogger {
+
+private static final int PARALLELISM = 4;
+
+@RegisterExtension
+private static final MiniClusterExtension miniClusterExtension =
+new MiniClusterExtension(
+new MiniClusterResourceConfiguration.Builder()
+.setNumberTaskManagers(1)
+.setNumberSlotsPerTaskManager(PARALLELISM)
+.build());
+
+// 
+
+@Test
+@DisplayName("Combined results of parallel source readers produce the 
expected sequence.")
+public void testParallelSourceExecution() throws Exception {
+final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+env.setParallelism(PARALLELISM);
+
+final DataStream stream = getGeneratorSourceStream(index -> 
index, env, 1_000L);
+
+final List result = stream.executeAndCollect(1);
+
+assertThat(result).containsExactlyInAnyOrderElementsOf(range(0, 999));
+}
+
+@Test
+@DisplayName("Generator function can be instantiated as an anonymous class.")
+public void testParallelSourceExecutionWithAnonymousClass() throws Exception {
+final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.setParallelism(PARALLELISM);
+
+GeneratorFunction<Long, Long> generatorFunction =
+new GeneratorFunction<Long, Long>() {
+
+@Override
+public Long map(Long value) {
+return value;
+}
+};
+
+final DataStream<Long> stream = getGeneratorSourceStream(generatorFunction, env, 1_000L);
+
+final List<Long> result = stream.executeAndCollect(1);
+
+assertThat(result).containsExactlyInAnyOrderElementsOf(range(0, 999));
+}
+
+@Test
+@DisplayName("Exceptions from the generator function are not 'swallowed'.")
+public void testFailingGeneratorFunction() throws Exception {
+final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.setParallelism(PARALLELISM);
+
+GeneratorFunction<Long, Long> generatorFunction =
+value -> {
+throw new Exception("boom");
+};
+
+final DataStream<Long> stream = getGeneratorSourceStream(generatorFunction, env, 1_000L);
+
+assertThatThrownBy(
+() -> {
+stream.executeAndCollect(1);
+})
+

[jira] [Updated] (FLINK-29364) Root cause of Exceptions thrown in the SourceReader start() method gets "swallowed".

2022-09-20 Thread Alexander Fedulov (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Fedulov updated FLINK-29364:
--
Description: 
If an exception is thrown in the {_}SourceReader{_}'s _start()_ method, its 
root cause does not get captured.

The details are still available here: 
[Task.java#L758|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L758]

But the execution falls through to 
[Task.java#L780|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L780]
 and discards the root cause, canceling the source invokable without recording the actual reason.

 

How to reproduce: 
[DataGeneratorSourceITCase.java#L117|https://github.com/apache/flink/blob/3df7669fcc6ba08c5147195b80cc97ac1481ec8c/flink-tests/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceITCase.java#L117]
 

  was:
If an exception is thrown in the {_}SourceReader{_}'s _start()_ method, its 
root cause does not get captured.

The details are still available here: 
[Task.java#L758|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L758]

But the execution falls through to 
[Task.java#L780|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L780]
 and discards the root cause, canceling the source invokable without recording the actual reason.

 


> Root cause of Exceptions thrown in the SourceReader start() method gets 
> "swallowed".
> 
>
> Key: FLINK-29364
> URL: https://issues.apache.org/jira/browse/FLINK-29364
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.15.2
>Reporter: Alexander Fedulov
>Priority: Major
>
> If an exception is thrown in the {_}SourceReader{_}'s _start()_ method, its 
> root cause does not get captured.
> The details are still available here: 
> [Task.java#L758|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L758]
> But the execution falls through to 
> [Task.java#L780|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L780]
>  and discards the root cause, canceling the source invokable without recording the actual reason.
>  
> How to reproduce: 
> [DataGeneratorSourceITCase.java#L117|https://github.com/apache/flink/blob/3df7669fcc6ba08c5147195b80cc97ac1481ec8c/flink-tests/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceITCase.java#L117]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29364) Root cause of Exceptions thrown in the SourceReader start() method gets "swallowed".

2022-09-20 Thread Alexander Fedulov (Jira)
Alexander Fedulov created FLINK-29364:
-

 Summary: Root cause of Exceptions thrown in the SourceReader 
start() method gets "swallowed".
 Key: FLINK-29364
 URL: https://issues.apache.org/jira/browse/FLINK-29364
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.15.2
Reporter: Alexander Fedulov


If an exception is thrown in the {_}SourceReader{_}'s _start()_ method, its 
root cause does not get captured.

The details are still available here: 
[Task.java#L758|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L758]

But the execution falls through to 
[Task.java#L780|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L780]
 and discards the root cause, canceling the source invokable without recording the actual reason.
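
For illustration, a minimal, self-contained Java sketch of the difference between swallowing and preserving the root cause (all names are hypothetical, not the actual Task.java code):

{code:java}
/** Minimal illustration of the bug pattern; all names are hypothetical. */
public final class RootCausePreservation {

    /** Stands in for an invokable whose SourceReader.start() throws. */
    static void runInvokable() throws Exception {
        throw new IllegalStateException("boom from SourceReader.start()");
    }

    public static void main(String[] args) {
        try {
            runInvokable();
        } catch (Throwable t) {
            // Swallowing variant: cancel and report a generic reason -> t is lost.
            // Preserving variant: chain t so the reported failure keeps its cause.
            RuntimeException reported = new RuntimeException("task canceled", t);
            reported.printStackTrace();
        }
    }
}
{code}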

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29363) Allow web ui to fully redirect to other page

2022-09-20 Thread Zhenqiu Huang (Jira)
Zhenqiu Huang created FLINK-29363:
-

 Summary: Allow web ui to fully redirect to other page
 Key: FLINK-29363
 URL: https://issues.apache.org/jira/browse/FLINK-29363
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.15.2
Reporter: Zhenqiu Huang


In a streaming platform system, the web UI usually integrates with an internal 
authentication and authorization system. When validation fails, the request 
needs to be redirected to a landing page. This doesn't work for AJAX 
requests. It would be great to make the web UI configurable to allow automatic 
full redirects. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-3983) Allow users to set any (relevant) configuration parameter of the KinesisProducerConfiguration

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer resolved FLINK-3983.
--
Resolution: Won't Do

> Allow users to set any (relevant) configuration parameter of the 
> KinesisProducerConfiguration
> -
>
> Key: FLINK-3983
> URL: https://issues.apache.org/jira/browse/FLINK-3983
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Connectors / Kinesis
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Currently, users can only set some of the configuration parameters in the 
> {{KinesisProducerConfiguration}} through Properties.
> It would be good to introduce configuration keys for these parameters so that 
> users can change the producer configuration.
> I think these and most of the other variables in the 
> KinesisProducerConfiguration should be exposed via properties:
> - aggregationEnabled
> - collectionMaxCount
> - collectionMaxSize
> - connectTimeout
> - credentialsRefreshDelay
> - failIfThrottled
> - logLevel
> - metricsGranularity
> - metricsLevel
> - metricsNamespace
> - metricsUploadDelay
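
For context, a hedged Java sketch of how such keys could be supplied through the producer {{Properties}} (the key names mirror the KinesisProducerConfiguration setters and are assumptions here, not confirmed option names):

{code:java}
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;

public class ProducerConfigExample {
    public static FlinkKinesisProducer<String> create() {
        Properties producerConfig = new Properties();
        producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1");
        // Assumed keys, mirroring KinesisProducerConfiguration setter names:
        producerConfig.put("AggregationEnabled", "true");
        producerConfig.put("CollectionMaxCount", "500");
        producerConfig.put("MetricsLevel", "summary");
        return new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig);
    }
}
{code}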



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-3983) Allow users to set any (relevant) configuration parameter of the KinesisProducerConfiguration

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607331#comment-17607331
 ] 

Danny Cranmer commented on FLINK-3983:
--

Since the new KinesisDataStreamsSink in Flink 1.15 no longer uses the 
KinesisProducerLibrary, I am closing this as won't do.

> Allow users to set any (relevant) configuration parameter of the 
> KinesisProducerConfiguration
> -
>
> Key: FLINK-3983
> URL: https://issues.apache.org/jira/browse/FLINK-3983
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Connectors / Kinesis
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Currently, users can only set some of the configuration parameters in the 
> {{KinesisProducerConfiguration}} through Properties.
> It would be good to introduce configuration keys for these parameters so that 
> users can change the producer configuration.
> I think these and most of the other variables in the 
> KinesisProducerConfiguration should be exposed via properties:
> - aggregationEnabled
> - collectionMaxCount
> - collectionMaxSize
> - connectTimeout
> - credentialsRefreshDelay
> - failIfThrottled
> - logLevel
> - metricsGranularity
> - metricsLevel
> - metricsNamespace
> - metricsUploadDelay



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-3924) Remove protobuf shading from Kinesis connector

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607328#comment-17607328
 ] 

Danny Cranmer commented on FLINK-3924:
--

This is no longer a problem, since the license issue has been resolved. 

> Remove protobuf shading from Kinesis connector
> --
>
> Key: FLINK-3924
> URL: https://issues.apache.org/jira/browse/FLINK-3924
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Connectors / Kinesis
>Reporter: Robert Metzger
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> The Kinesis connector is currently creating a fat jar with a custom protobuf 
> version (2.6.1), relocated into a different package.
> We need to build the fat jar to change the protobuf calls from the original 
> protobuf to the relocated one.
> Because Kinesis is licensed under the Amazon Software License (which is not 
> entirely compatible with the ASL2.0), I don't want to deploy kinesis connector 
> binaries to maven central with the releases. These binaries would contain code 
> from Amazon. It would be more than just linking to (optional) dependencies.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-3924) Remove protobuf shading from Kinesis connector

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer closed FLINK-3924.

Resolution: Not A Problem

> Remove protobuf shading from Kinesis connector
> --
>
> Key: FLINK-3924
> URL: https://issues.apache.org/jira/browse/FLINK-3924
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common, Connectors / Kinesis
>Reporter: Robert Metzger
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> The Kinesis connector is currently creating a fat jar with a custom protobuf 
> version (2.6.1), relocated into a different package.
> We need to build the fat jar to change the protobuf calls from the original 
> protobuf to the relocated one.
> Because Kinesis is licensed under the Amazon Software License (which is not 
> entirely compatible with the ASL2.0), I don't want to deploy kinesis connector 
> binaries to maven central with the releases. These binaries would contain code 
> from Amazon. It would be more than just linking to (optional) dependencies.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-26699) Update AWS SDK v1 and AWS SDK v2 to the latest versions

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer resolved FLINK-26699.
---
Resolution: Duplicate

> Update AWS SDK v1 and AWS SDK v2 to the latest versions
> ---
>
> Key: FLINK-26699
> URL: https://issues.apache.org/jira/browse/FLINK-26699
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Common, Connectors / FileSystem, Connectors 
> / Kinesis
>Affects Versions: 1.14.5, 1.15.0, 1.16.0
>Reporter: Martijn Visser
>Priority: Major
>
> Flink currently includes multiple references to both AWS SDK v1 and AWS SDK 
> v2. Both of them are outdated and their transitive dependencies contain 
> vulnerabilities. Those don't immediately affect Flink, but they do cause 
> false positives in scanners. 
> We should at least upgrade both of them to their latest versions. 
> Currently used versions and latest available version (at the moment of 
> creating this ticket)
> 1.12.7 instead of 1.12.178
> 1.11.951 instead of 1.12.178
> 2.17.52 instead of 2.17.149



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] afedulov commented on a diff in pull request #20757: [FLINK-27919] Add FLIP-27-based source for data generation (FLIP-238)

2022-09-20 Thread GitBox


afedulov commented on code in PR #20757:
URL: https://github.com/apache/flink/pull/20757#discussion_r975621958


##
flink-core/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceTest.java:
##
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.source.lib;
+
+import org.apache.flink.api.common.eventtime.Watermark;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.connector.source.ReaderOutput;
+import org.apache.flink.api.connector.source.SourceEvent;
+import org.apache.flink.api.connector.source.SourceOutput;
+import org.apache.flink.api.connector.source.SourceReader;
+import org.apache.flink.api.connector.source.SourceReaderContext;
+import org.apache.flink.api.connector.source.SplitEnumerator;
+import org.apache.flink.api.connector.source.datagen.DataGeneratorSource;
+import org.apache.flink.api.connector.source.datagen.GeneratorFunction;
+import org.apache.flink.api.connector.source.mocks.MockSplitEnumeratorContext;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.InputStatus;
+import org.apache.flink.metrics.groups.SourceReaderMetricGroup;
+import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
+import org.apache.flink.mock.Whitebox;
+import org.apache.flink.util.SimpleUserCodeClassLoader;
+import org.apache.flink.util.UserCodeClassLoader;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.DisplayName;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.List;
+import java.util.Queue;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+/** Tests for the {@link DataGeneratorSource}. */
+public class DataGeneratorSourceTest {
+
+@Test
+@DisplayName("Correctly restores SplitEnumerator from a snapshot.")
+public void testRestoreEnumerator() throws Exception {
+final GeneratorFunction<Long, Long> generatorFunctionStateless = index -> index;
+final DataGeneratorSource<Long> dataGeneratorSource =
+new DataGeneratorSource<>(generatorFunctionStateless, 100, Types.LONG);
+
+final int parallelism = 2;
+final MockSplitEnumeratorContext<NumberSequenceSource.NumberSequenceSplit> context =
+new MockSplitEnumeratorContext<>(parallelism);
+
+SplitEnumerator<
+NumberSequenceSource.NumberSequenceSplit,
+Collection<NumberSequenceSource.NumberSequenceSplit>>
+enumerator = dataGeneratorSource.createEnumerator(context);
+
+// start() is not strictly necessary in the current implementation, but should logically be
+// executed in this order (protect against any breaking changes in the start() method).
+enumerator.start();
+
+Collection<NumberSequenceSource.NumberSequenceSplit> enumeratorState =
+enumerator.snapshotState(0);
+
+@SuppressWarnings("unchecked")
+final Queue<NumberSequenceSource.NumberSequenceSplit> splits =
+(Queue<NumberSequenceSource.NumberSequenceSplit>)
+Whitebox.getInternalState(enumerator, "remainingSplits");
+
+assertThat(splits).hasSize(parallelism);
+
+enumerator = dataGeneratorSource.restoreEnumerator(context, enumeratorState);
+
+@SuppressWarnings("unchecked")
+final Queue<NumberSequenceSource.NumberSequenceSplit> restoredSplits =
+(Queue<NumberSequenceSource.NumberSequenceSplit>)
+Whitebox.getInternalState(enumerator, "remainingSplits");
+
+assertThat(restoredSplits).hasSize(enumeratorState.size());
+}
+
+@Test
+@DisplayName("Uses the underlying NumberSequenceSource correctly for checkpointing.")
+public void testReaderCheckpoints() throws Exception {
+final long from = 177;
+final long mid = 333;
+final long to = 563;
+final long elementsPerCycle = (to - from) / 3;
+
+final TestingReaderOutput<Long> out = new TestingReaderOutput<>();
+
+SourceReader<Long, NumberSequenceSource.NumberSequenceSplit> reader = createReader();
+reader.addSplits(
+Arrays.asList(
+new NumberSequenceSource.NumberSequenceSplit("split-1", from, mid),
+new NumberSequenceSource.NumberSequenceSplit("split-2", mid + 1, to)));
+
+long remainingInCycle = 

[GitHub] [flink-kubernetes-operator] HuangZhenQiu commented on pull request #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


HuangZhenQiu commented on PR #375:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/375#issuecomment-1252645380

   @gyfora 
   Sounds good. Let me do it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


gyfora commented on PR #375:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/375#issuecomment-1252643707

   It would be nice to have a test that tests the removal as part of the 
regular deployment logic. I see you have a test for the removal itself but we 
don't test that the removal is actually called when we deploy a job.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


morhidi commented on PR #375:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/375#issuecomment-1252642003

   @mbalassi can you kick off the workflow on this pls?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-29327) Operator configs are showing up among standard Flink configs

2022-09-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29327:
---
Labels: pull-request-available  (was: )

> Operator configs are showing up among standard Flink configs
> 
>
> Key: FLINK-29327
> URL: https://issues.apache.org/jira/browse/FLINK-29327
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Matyas Orhidi
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] HuangZhenQiu opened a new pull request, #375: [FLINK-29327] remove operator config from job runtime config before d…

2022-09-20 Thread GitBox


HuangZhenQiu opened a new pull request, #375:
URL: https://github.com/apache/flink-kubernetes-operator/pull/375

   ## What is the purpose of the change
   Remove operator-related configs from the Flink runtime config, so that users 
will not see any operator-related config in the web UI.
   
   ## Brief change log
 - remove operator configs before deployments in Flink services.
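
   A minimal sketch of the filtering idea, assuming operator-only options share a common key prefix (the prefix and helper shape are assumptions, not the PR's actual code):

   ```java
   import java.util.Map;

   import org.apache.flink.configuration.Configuration;

   public final class OperatorConfigFilter {
       // Assumed prefix for operator-only options.
       private static final String OPERATOR_PREFIX = "kubernetes.operator.";

       /** Returns a copy of the deploy config without operator-only keys. */
       public static Configuration stripOperatorConfig(Configuration deployConfig) {
           Configuration runtimeConfig = new Configuration();
           for (Map.Entry<String, String> e : deployConfig.toMap().entrySet()) {
               if (!e.getKey().startsWith(OPERATOR_PREFIX)) {
                   runtimeConfig.setString(e.getKey(), e.getValue());
               }
           }
           return runtimeConfig;
       }
   }
   ```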
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup with unit test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
(no)
 - Core observer or reconciler logic that is regularly executed: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-26890) DynamoDB consumer error consuming partitions close to retention

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-26890:
--
Fix Version/s: (was: 1.16.0)
   (was: 1.13.7)
   (was: 1.14.7)

> DynamoDB consumer error consuming partitions close to retention
> ---
>
> Key: FLINK-26890
> URL: https://issues.apache.org/jira/browse/FLINK-26890
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.15.0
>Reporter: Danny Cranmer
>Priority: Major
>
> *Background*
> The Amazon Kinesis Data Streams consumer supports consuming from Amazon 
> DynamoDB via the [DynamoDB Streams Kinesis 
> Adapter|https://github.com/awslabs/dynamodb-streams-kinesis-adapter]. 
> *Problem*
> We have seen instances of the consumer throwing {{ResourceNotFoundException}} 
> when attempting to invoke {{GetShardIterator}}.
> {code}
> com.amazonaws.services.kinesis.model.ResourceNotFoundException: Requested 
> resource not found: Shard does not exist 
> {code}
> According to the DynamoDB team, the {{DescribeStream}} call may return shard 
> IDs that are no longer valid, and this exception needs to be handled by the 
> client. 
> *Solution*
> Modify the DynamoDB consumer to treat {{ResourceNotFoundException}} as a 
> shard closed signal.
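
A hedged sketch of the proposed handling (the proxy call and helper shape are assumptions about where the fix would land, not the merged code):

{code:java}
import com.amazonaws.services.kinesis.model.ResourceNotFoundException;

final class ShardIteratorHelper {

    /** Stands in for the connector's shard-iterator call (an assumption). */
    interface ShardIteratorCall {
        String get();
    }

    /**
     * Wraps the call so a ResourceNotFoundException is interpreted as a
     * closed shard (null iterator) instead of failing the job; DescribeStream
     * can return shard IDs that have already expired close to retention.
     */
    static String getIteratorOrShardClosed(ShardIteratorCall call) {
        try {
            return call.get();
        } catch (ResourceNotFoundException e) {
            return null;
        }
    }
}
{code}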



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-26256) AWS SDK Async Event Loop Group Classloader Issue

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer closed FLINK-26256.
-
Resolution: Won't Fix

> AWS SDK Async Event Loop Group Classloader Issue
> 
>
> Key: FLINK-26256
> URL: https://issues.apache.org/jira/browse/FLINK-26256
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Reporter: Danny Cranmer
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> h3. Background
> AWS SDK v2 async clients use a Netty async client for Kinesis Data 
> Streams/Firehose sink and Kinesis Data Streams EFO consumer. The SDK creates 
> a shared thread pool for Netty to use for network operations when one is not 
> configured. The thread pool is managed by a shared ELG (event loop group), 
> and this is stored in a static field. We do not configure this for the AWS 
> connectors in the Flink codebase. 
> When threads are spawned within the ELG, they inherit the context classloader 
> from the current thread. If the ELG is created from a shared classloader, for 
> instance Flink parent classloader, or MiniCluster parent classloader, 
> multiple Flink jobs can share the same ELG. When an ELG thread is spawned 
> from a Flink job, it will inherit the Flink user classloader. When this job 
> completes/fails, the classloader is destroyed, however the Netty thread is 
> still referencing it, and this leads to below exception.
> h3. Impact
> This issue *does not* impact jobs deployed to TM when AWS SDK v2 is loaded 
> via the Flink User Classloader. It is expected this is the standard 
> deployment configuration.
> This issue is known to impact:
> - Flink mini cluster, for example in integration tests (FLINK-26064)
> - Flink cluster loading AWS SDK v2 via parent classloader
> h3. Suggested solution
> There are a few possible solutions, as discussed 
> https://github.com/apache/flink/pull/18733
> 1. Create a separate ELG per Flink job
> 2. Create a separate ELG per subtask
> 3. Attach the correct classloader to ELG spawned threads
> h3. Error Stack
> (shortened stack trace, as full is too large)
> {noformat}
> Feb 09 20:05:04 java.util.concurrent.ExecutionException: 
> software.amazon.awssdk.core.exception.SdkClientException: Unable to execute 
> HTTP request: Trying to access closed classloader. Please check if you store 
> classloaders directly or indirectly in static fields. If the stacktrace 
> suggests that the leak occurs in a third party library and cannot be fixed 
> immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> Feb 09 20:05:04   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> (...)
> Feb 09 20:05:04 Caused by: 
> software.amazon.awssdk.core.exception.SdkClientException: Unable to execute 
> HTTP request: Trying to access closed classloader. Please check if you store 
> classloaders directly or indirectly in static fields. If the stacktrace 
> suggests that the leak occurs in a third party library and cannot be fixed 
> immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:204)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:200)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:179)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159)
> (...)
> Feb 09 20:05:04 Caused by: java.lang.IllegalStateException: Trying to access 
> closed classloader. Please check if you store classloaders directly or 
> indirectly in static fields. If the stacktrace suggests that the leak occurs 
> in a third party library and cannot be fixed immediately, you can disable 
> this check with the configuration 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164)
> Feb 09 

[jira] [Commented] (FLINK-26256) AWS SDK Async Event Loop Group Classloader Issue

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607304#comment-17607304
 ] 

Danny Cranmer commented on FLINK-26256:
---

I am closing this issue as won't do for the following reasons:
- This does not impact Flink jobs that load the connector via the user 
classloader, which is the recommended best practice
- We have resolved the race condition in the integration tests
- Fixing this is not trivial and adds a lot of code to maintain 

Please reopen if you disagree with my reasoning.
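
For reference, a minimal sketch of suggested solutions 1/2 from the ticket below (a dedicated event loop group per client), using the AWS SDK v2 Netty client builder; the thread count is arbitrary:

{code:java}
import software.amazon.awssdk.http.async.SdkAsyncHttpClient;
import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
import software.amazon.awssdk.http.nio.netty.SdkEventLoopGroup;
import software.amazon.awssdk.services.kinesis.KinesisAsyncClient;

final class DedicatedElgClientFactory {

    /** Builds a Kinesis client with its own, non-shared event loop group. */
    static KinesisAsyncClient create() {
        SdkEventLoopGroup elg = SdkEventLoopGroup.builder().numberOfThreads(4).build();
        SdkAsyncHttpClient httpClient =
                NettyNioAsyncHttpClient.builder().eventLoopGroup(elg).build();
        return KinesisAsyncClient.builder().httpClient(httpClient).build();
    }
}
{code}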

> AWS SDK Async Event Loop Group Classloader Issue
> 
>
> Key: FLINK-26256
> URL: https://issues.apache.org/jira/browse/FLINK-26256
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Reporter: Danny Cranmer
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> h3. Background
> AWS SDK v2 async clients use a Netty async client for Kinesis Data 
> Streams/Firehose sink and Kinesis Data Streams EFO consumer. The SDK creates 
> a shared thread pool for Netty to use for network operations when one is not 
> configured. The thread pool is managed by a shared ELG (event loop group), 
> and this is stored in a static field. We do not configure this for the AWS 
> connectors in the Flink codebase. 
> When threads are spawned within the ELG, they inherit the context classloader 
> from the current thread. If the ELG is created from a shared classloader, for 
> instance Flink parent classloader, or MiniCluster parent classloader, 
> multiple Flink jobs can share the same ELG. When an ELG thread is spawned 
> from a Flink job, it will inherit the Flink user classloader. When this job 
> completes/fails, the classloader is destroyed, however the Netty thread is 
> still referencing it, and this leads to below exception.
> h3. Impact
> This issue *does not* impact jobs deployed to TM when AWS SDK v2 is loaded 
> via the Flink User Classloader. It is expected this is the standard 
> deployment configuration.
> This issue is known to impact:
> - Flink mini cluster, for example in integration tests (FLINK-26064)
> - Flink cluster loading AWS SDK v2 via parent classloader
> h3. Suggested solution
> There are a few possible solutions, as discussed 
> https://github.com/apache/flink/pull/18733
> 1. Create a separate ELG per Flink job
> 2. Create a separate ELG per subtask
> 3. Attach the correct classloader to ELG spawned threads
> h3. Error Stack
> (shortened stack trace, as full is too large)
> {noformat}
> Feb 09 20:05:04 java.util.concurrent.ExecutionException: 
> software.amazon.awssdk.core.exception.SdkClientException: Unable to execute 
> HTTP request: Trying to access closed classloader. Please check if you store 
> classloaders directly or indirectly in static fields. If the stacktrace 
> suggests that the leak occurs in a third party library and cannot be fixed 
> immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> Feb 09 20:05:04   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> (...)
> Feb 09 20:05:04 Caused by: 
> software.amazon.awssdk.core.exception.SdkClientException: Unable to execute 
> HTTP request: Trying to access closed classloader. Please check if you store 
> classloaders directly or indirectly in static fields. If the stacktrace 
> suggests that the leak occurs in a third party library and cannot be fixed 
> immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:204)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:200)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:179)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159)
> (...)
> Feb 09 20:05:04 Caused by: java.lang.IllegalStateException: Trying to access 
> closed classloader. Please check if you store classloaders directly or 
> indirectly in static fields. If the 

[jira] [Updated] (FLINK-26890) DynamoDB consumer error consuming partitions close to retention

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-26890:
--
Summary: DynamoDB consumer error consuming partitions close to retention  
(was: [Kinesis][Consumer] DynamoDB consumer error consuming partitions close to 
retention)

> DynamoDB consumer error consuming partitions close to retention
> ---
>
> Key: FLINK-26890
> URL: https://issues.apache.org/jira/browse/FLINK-26890
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.15.0
>Reporter: Danny Cranmer
>Priority: Major
> Fix For: 1.16.0, 1.13.7, 1.14.7
>
>
> *Background*
> The Amazon Kinesis Data Streams consumer supports consuming from Amazon 
> DynamoDB via the [DynamoDB Streams Kinesis 
> Adapter|https://github.com/awslabs/dynamodb-streams-kinesis-adapter]. 
> *Problem*
> We have seen instances of the consumer throwing {{ResourceNotFoundException}} 
> when attempting to invoke {{GetShardIterator}}.
> {code}
> com.amazonaws.services.kinesis.model.ResourceNotFoundException: Requested 
> resource not found: Shard does not exist 
> {code}
> According to the DynamoDB team, the {{DescribeStream}} call may return shard 
> IDs that are no longer valid, and this exception needs to be handled by the 
> client. 
> *Solution*
> Modify the DynamoDB consumer to treat {{ResourceNotFoundException}} as a 
> shard closed signal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #20865: [FLINK-14896][connectors/kinesis] Shade and relocate Jackson dependen…

2022-09-20 Thread GitBox


flinkbot commented on PR #20865:
URL: https://github.com/apache/flink/pull/20865#issuecomment-1252620026

   
   ## CI report:
   
   * cdb8d49a1af3d5491590fae859ab7c11ecb62405 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-28747) "target_id can not be missing" in HTTP statefun request

2022-09-20 Thread Stephan Weinwurm (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607301#comment-17607301
 ] 

Stephan Weinwurm commented on FLINK-28747:
--

[~groot] / [~trohrmann] quick ping. I don't have enough knowledge of what the 
right fix is here. Would one of you mind taking a look and getting this fixed? 
Thank you in advance!

> "target_id can not be missing" in HTTP statefun request
> ---
>
> Key: FLINK-28747
> URL: https://issues.apache.org/jira/browse/FLINK-28747
> Project: Flink
>  Issue Type: Bug
>  Components: Stateful Functions
>Affects Versions: statefun-3.0.0, statefun-3.2.0, statefun-3.1.1
>Reporter: Stephan Weinwurm
>Priority: Major
>
> Hi all,
> We've suddenly started to see the following exception in our HTTP statefun 
> functions endpoints:
> {code}Traceback (most recent call last):
>   File 
> "/src/.venv/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", 
> line 403, in run_asgi
> result = await app(self.scope, self.receive, self.send)
>   File 
> "/src/.venv/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", 
> line 78, in __call__
> return await self.app(scope, receive, send)
>   File "/src/worker/baseplate_asgi/asgi/baseplate_asgi_middleware.py", line 
> 37, in __call__
> await span_processor.execute()
>   File "/src/worker/baseplate_asgi/asgi/asgi_http_span_processor.py", line 
> 61, in execute
> raise e
>   File "/src/worker/baseplate_asgi/asgi/asgi_http_span_processor.py", line 
> 57, in execute
> await self.app(self.scope, self.receive, self.send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/applications.py", 
> line 124, in __call__
> await self.middleware_stack(scope, receive, send)
>   File 
> "/src/.venv/lib/python3.9/site-packages/starlette/middleware/errors.py", line 
> 184, in __call__
> raise exc
>   File 
> "/src/.venv/lib/python3.9/site-packages/starlette/middleware/errors.py", line 
> 162, in __call__
> await self.app(scope, receive, _send)
>   File 
> "/src/.venv/lib/python3.9/site-packages/starlette/middleware/exceptions.py", 
> line 75, in __call__
> raise exc
>   File 
> "/src/.venv/lib/python3.9/site-packages/starlette/middleware/exceptions.py", 
> line 64, in __call__
> await self.app(scope, receive, sender)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/routing.py", line 
> 680, in __call__
> await route.handle(scope, receive, send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/routing.py", line 
> 275, in handle
> await self.app(scope, receive, send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/routing.py", line 
> 65, in app
> response = await func(request)
>   File "/src/worker/baseplate_statefun/server/asgi/make_statefun_handler.py", 
> line 25, in statefun_handler
> result = await handler.handle_async(request_body)
>   File "/src/.venv/lib/python3.9/site-packages/statefun/request_reply_v3.py", 
> line 262, in handle_async
> msg = Message(target_typename=sdk_address.typename, 
> target_id=sdk_address.id,
>   File "/src/.venv/lib/python3.9/site-packages/statefun/messages.py", line 
> 42, in __init__
> raise ValueError("target_id can not be missing"){code}
> Interestingly, this has started to happen in three separate Flink deployments 
> at the very same time. The only thing in common between the three deployments 
> is that they consume the same Kafka topics.
> No deployments had happened when the issue started, which was on 
> July 28th at 3:05 PM. We have been seeing the error continuously since then.
> We were also able to extract the request that Flink sends to the HTTP 
> statefun endpoint:
> {code}{'invocation': {'target': {'namespace': 'com.x.dummy', 'type': 
> 'dummy'}, 'invocations': [{'argument': {'typename': 
> 'type.googleapis.com/v2_event.Event', 'has_value': True, 'value': 
> '-redacted-'}}]}}
> {code}
> As you can see, no `id` field is present in the `invocation.target` object or 
> the `target_id` was an empty string.
>  
> This is our module.yaml from one of the Flink deployments:
>  
> {code}
> version: "3.0"
> module:
>   meta:
>     type: remote
>   spec:
>     endpoints:
>       - endpoint:
>           meta:
>             kind: io.statefun.endpoints.v1/http
>           spec:
>             functions: com.x.dummy/dummy
>             urlPathTemplate: http://x-worker-dummy.x-functions:9090/statefun
>             timeouts:
>               call: 2 min
>               read: 2 min
>               write: 2 min
>             maxNumBatchRequests: 100
>     ingresses:
>       - ingress:
>           meta:
>             type: io.statefun.kafka/ingress
>             id: com.x/ingress
>           spec:
>             address: x-kafka-0.x.ue1.x.net:9092
>             consumerGroupId: x-worker-dummy
>             topics:
>               - topic: v2_post_events
>                 valueType: type.googleapis.com/v2_event.Event
>                 targets:
>                   - com.x.dummy/dummy
>             startupPosition:
>               type: group-offsets
>             autoOffsetResetPosition: earliest

[jira] [Commented] (FLINK-6573) Flink MongoDB Connector

2022-09-20 Thread Jiabao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607298#comment-17607298
 ] 

Jiabao Sun commented on FLINK-6573:
---

Thanks [~arvid].

Looking forward to getting some advice from [~martijnvisser].

> Flink MongoDB Connector
> ---
>
> Key: FLINK-6573
> URL: https://issues.apache.org/jira/browse/FLINK-6573
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Affects Versions: 1.2.0
> Environment: Linux Operating System, Mongo DB
>Reporter: Nagamallikarjuna
>Assignee: ZhuoYu Chen
>Priority: Not a Priority
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-15-14-41-07-514.png
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hi Community,
> Currently we are using Flink in our project. We have a huge amount of 
> data to process with Flink, which resides in MongoDB. We have a requirement 
> for parallel data connectivity between Flink and MongoDB for both 
> reads and writes. Currently we are planning to create this connector and 
> contribute it to the community.
> I will share further details once I receive your feedback.
> Please let us know if you have any concerns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] afedulov commented on a diff in pull request #20757: [FLINK-27919] Add FLIP-27-based source for data generation (FLIP-238)

2022-09-20 Thread GitBox


afedulov commented on code in PR #20757:
URL: https://github.com/apache/flink/pull/20757#discussion_r975590103


##
flink-core/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceTest.java:
##
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.source.lib;
+
+import org.apache.flink.api.common.eventtime.Watermark;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.api.connector.source.ReaderOutput;
+import org.apache.flink.api.connector.source.SourceEvent;
+import org.apache.flink.api.connector.source.SourceOutput;
+import org.apache.flink.api.connector.source.SourceReader;
+import org.apache.flink.api.connector.source.SourceReaderContext;
+import org.apache.flink.api.connector.source.SplitEnumerator;
+import org.apache.flink.api.connector.source.datagen.DataGeneratorSource;
+import org.apache.flink.api.connector.source.datagen.GeneratorFunction;
+import org.apache.flink.api.connector.source.mocks.MockSplitEnumeratorContext;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.InputStatus;
+import org.apache.flink.metrics.groups.SourceReaderMetricGroup;
+import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
+import org.apache.flink.mock.Whitebox;
+import org.apache.flink.util.SimpleUserCodeClassLoader;
+import org.apache.flink.util.UserCodeClassLoader;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.DisplayName;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.List;
+import java.util.Queue;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+/** Tests for the {@link DataGeneratorSource}. */
+public class DataGeneratorSourceTest {
+
+@Test
+@DisplayName("Correctly restores SplitEnumerator from a snapshot.")
+public void testRestoreEnumerator() throws Exception {
+final GeneratorFunction<Long, Long> generatorFunctionStateless = index -> index;
+final DataGeneratorSource<Long> dataGeneratorSource =
+new DataGeneratorSource<>(generatorFunctionStateless, 100, Types.LONG);
+
+final int parallelism = 2;
+final MockSplitEnumeratorContext<NumberSequenceSource.NumberSequenceSplit> context =
+new MockSplitEnumeratorContext<>(parallelism);
+
+SplitEnumerator<
+NumberSequenceSource.NumberSequenceSplit,
+Collection<NumberSequenceSource.NumberSequenceSplit>>
+enumerator = dataGeneratorSource.createEnumerator(context);
+
+// start() is not strictly necessary in the current implementation, but should logically be
+// executed in this order (protect against any breaking changes in the start() method).
+enumerator.start();
+
+Collection<NumberSequenceSource.NumberSequenceSplit> enumeratorState =
+enumerator.snapshotState(0);
+
+@SuppressWarnings("unchecked")
+final Queue<NumberSequenceSource.NumberSequenceSplit> splits =
+(Queue<NumberSequenceSource.NumberSequenceSplit>)
+Whitebox.getInternalState(enumerator, "remainingSplits");
+
+assertThat(splits).hasSize(parallelism);

Review Comment:
   Good idea, I modified the test accordingly: 
https://github.com/apache/flink/pull/20757/commits/e8a11fe0afe73ae22d9ef7911b0302846cfbfd61



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] dannycranmer opened a new pull request, #20865: [FLINK-14896][connectors/kinesis] Shade and relocate Jackson dependen…

2022-09-20 Thread GitBox


dannycranmer opened a new pull request, #20865:
URL: https://github.com/apache/flink/pull/20865

   ## What is the purpose of the change
   
   - Shade and relocate Jackson dependencies within `flink-kinesis-connector` 
to avoid conflicts with user code
   
   ## Brief change log
   
   - Shade and relocate Jackson dependencies within `flink-kinesis-connector`
   - Update NOTICE file in `flink-kinesis-connector` and 
`flink-sql-kinesis-connector`
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as unit tests, 
integration and end-to-end tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): yes
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-14896) Kinesis connector doesn't shade jackson dependency

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607294#comment-17607294
 ] 

Danny Cranmer commented on FLINK-14896:
---

There is a good discussion on the old PR 
https://github.com/apache/flink/pull/10285. I am inclined to shade Jackson 
since it is not exposed on the public interfaces, and the majority of 
dependencies are shaded for this connector, including AWS SDK. This means users 
cannot change Jackson without potentially breaking AWS SDK. Since AWS SDK is 
shaded, the user cannot override this version.

Looking forward, the new Kinesis connector does not use Jackson directly, nor 
does it shade the AWS SDK.
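
For reference, a hedged sketch of what the relocation could look like in the connector's maven-shade-plugin configuration (the shaded pattern is an assumption following the existing AWS SDK relocation; the PR itself is the source of truth):

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Assumed pattern, mirroring the existing com.amazonaws relocation. -->
      <relocation>
        <pattern>com.fasterxml.jackson</pattern>
        <shadedPattern>org.apache.flink.kinesis.shaded.com.fasterxml.jackson</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
{code}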

> Kinesis connector doesn't shade jackson dependency
> --
>
> Key: FLINK-14896
> URL: https://issues.apache.org/jira/browse/FLINK-14896
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.9.0
> Environment: AWS EMR 5.28.0
>Reporter: Michel Davit
>Priority: Not a Priority
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> flink-kinesis-connector depends on aws java sdk which is shaded to 
> {{org.apache.flink.kinesis.shaded.com.amazonaws.}}
>  
> {{However, the aws sdk has a transitive dependency on jackson which is not 
> shaded in the artifact.}}
>  
> {{This creates problems when running flink on YARN: }}{{The aws sdk requires 
> jackson-core v2.6 but hadoop pulls in 2.3. See 
> [here|https://github.com/apache/flink/blob/e7c11ed672013512e5b159e7e892b27b1ef60a1b/flink-yarn/pom.xml#L133].}}
>  
> {{If YARN loads the wrong jackson version from the classpath, the job fails 
> with}}
> {code:java}
> 2019-11-20 17:23:11,563 ERROR 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler- Unhandled 
> exception.org.apache.flink.client.program.ProgramInvocationException: The 
> program caused an error: at 
> org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:93)
> at 
> org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:80)
> at 
> org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toJobGraph(JarHandlerUtils.java:126)
> at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$getJobGraphAsync$6(JarRunHandler.java:142)
> at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)Caused by: 
> java.lang.NoSuchMethodError: 
> com.fasterxml.jackson.databind.ObjectMapper.enable([Lcom/fasterxml/jackson/core/JsonParser$Feature;)Lcom/fasterxml/jackson/databind/ObjectMapper;
> at 
> com.amazonaws.partitions.PartitionsLoader.(PartitionsLoader.java:54)  
>   at 
> com.amazonaws.regions.RegionMetadataFactory.create(RegionMetadataFactory.java:30)
> at com.amazonaws.regions.RegionUtils.initialize(RegionUtils.java:65)
> at com.amazonaws.regions.RegionUtils.getRegionMetadata(RegionUtils.java:53)   
>  at com.amazonaws.regions.RegionUtils.getRegion(RegionUtils.java:107)at 
> com.amazonaws.client.builder.AwsClientBuilder.getRegionObject(AwsClientBuilder.java:256)
> at 
> com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:460)
> at 
> com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424)
> at 
> com.amazonaws.client.builder.AwsAsyncClientBuilder.build(AwsAsyncClientBuilder.java:80)
> ...
> {code}
> The flink-kinesis-connector should do as other connectors: shade jackson or 
> use the flink-shaded-jackson core dependency



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] alpinegizmo commented on pull request #20859: [release] release notes for the 1.16 release

2022-09-20 Thread GitBox


alpinegizmo commented on PR #20859:
URL: https://github.com/apache/flink/pull/20859#issuecomment-1252595721

   @HuangXingBo There are some topics I'm surprised aren't mentioned:
   
   * FLINK-27878 [FLIP-232] Add Retry Support For Async I/O In DataStream API
   * FLINK-21585 Add options to include/exclude metrics
   * FLINK-29255 [FLIP-258] - Enforce binary compatibility in patch releases
   * FLINK-26074 Improve FlameGraphs scalability for high parallelism jobs
   * FLINK-27853 [FLIP-229]: Introduces Join Hint for Flink SQL Batch Job
   * FLINK-28415 [FLIP-221]: Support Partial and Full Caching in Lookup Table 
Source
   
   Also, a lot was done relating to batch execution and speculative execution:
   
   FLINK-28131 FLIP-168: Speculative Execution For Batch Job
   FLINK-28130 FLIP-224: Blocklist Mechanism
   FLINK-28397 FLIP-245: Source Supports Speculative Execution For Batch Job
   FLINK-28706 FLIP-248: Introduce dynamic partition pruning
   FLINK-28778 FLIP-247: Bulk fetch of table and column statistics for given 
partitions
   FLINK-27862 FLIP-235: Hybrid Shuffle Mode
   
   Here's one I'm not sure about. It looks like a lot of progress was made, but 
perhaps it's not usable without FLINK-29073? 
   
   * FLINK-15472 [FLIP-91]: SQL Gateway
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607280#comment-17607280
 ] 

Danny Cranmer commented on FLINK-29324:
---

Thanks for the contribution [~xiaohei]

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0, 1.15.3
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].
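
A minimal sketch of the suggested null-safe {{close()}} (the field and shutdown method names follow the {{cancel()}} pattern but are assumptions here, not the merged change):

{code:java}
@Override
public void close() throws Exception {
    // The fetcher is created lazily in run(); it is still null if the
    // subtask is closed before it ever started running.
    KinesisDataFetcher<T> fetcher = this.fetcher;
    this.fetcher = null;
    if (fetcher != null) {
        fetcher.shutdownFetcher();
    }
    super.close();
}
{code}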



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer resolved FLINK-29324.
---
Resolution: Fixed

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0, 1.15.3
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607267#comment-17607267
 ] 

Danny Cranmer edited comment on FLINK-29324 at 9/20/22 3:51 PM:


Merged commit 
[{{71fea9a}}|https://github.com/apache/flink/commit/71fea9a4522505a6c0f23f1de599b7f87a633ccf]
 into master.

Merged commit 
[{{22086c}}|https://github.com/apache/flink/commit/22086c67a6a97148eb74ed32b281eec393721738]
 into release-1.16

Merged commit 
[{{eb6565}}|https://github.com/apache/flink/commit/eb65655f8ce39627a6bd28c8b0c92db44d2e]
 into release-1.15


was (Author: dannycranmer):
Merged commit 
[{{71fea9a}}|https://github.com/apache/flink/commit/71fea9a4522505a6c0f23f1de599b7f87a633ccf]
 into master.

Merged commit 
[{{22086c}}|https://github.com/apache/flink/commit/22086c67a6a97148eb74ed32b281eec393721738]
 into release-1.16

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0, 1.15.3
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-29324:
--
Fix Version/s: 1.16.0
   1.15.3

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0, 1.15.3
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29362) Allow loading dynamic config for kerberos authentication in CliFrontend

2022-09-20 Thread Biao Geng (Jira)
Biao Geng created FLINK-29362:
-

 Summary: Allow loading dynamic config for kerberos authentication 
in CliFrontend
 Key: FLINK-29362
 URL: https://issues.apache.org/jira/browse/FLINK-29362
 Project: Flink
  Issue Type: Improvement
  Components: Command Line Client
Reporter: Biao Geng


In the 
[code|https://github.com/apache/flink/blob/97f5a45cd035fbae37a7468c6f771451ddb4a0a4/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L1167],
 Flink's client calls {{SecurityUtils.install(new 
SecurityConfiguration(cli.configuration));}} with configs (e.g. 
{{security.kerberos.login.principal}} and {{security.kerberos.login.keytab}}) 
taken only from flink-conf.yaml.
If users specify the above two configs via the -D option, this has no effect on 
the client, because {{cli.parseAndRun(args)}} is executed only after the 
security configs from flink-conf.yaml have been installed.
However, if a user specifies principal A in the client's flink-conf.yaml and 
uses the -D option to specify principal B, the launched YARN containers will 
use principal B even though the job is submitted on the client end with 
principal A.

Such behavior can be misleading, as Flink provides two ways to set a config but 
does not keep them consistent between client and cluster. It also affects users 
who want to use Flink with Kerberos, since they must modify flink-conf.yaml 
whenever they want to authenticate as a different Kerberos user.
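
A hypothetical sketch of the proposed improvement: merge the parsed -D dynamic 
properties into the effective configuration before the security context is 
installed. {{SecurityUtils}}, {{SecurityConfiguration}} and {{Configuration}} 
are existing Flink classes; {{dynamicPropertiesFrom(args)}} is an invented 
placeholder for however the CLI extracts -D options:

{code:java}
// Hypothetical ordering in CliFrontend#main: apply -D overrides first, then
// install the security context, so client and cluster authenticate with the
// same principal/keytab.
Configuration effectiveConfig = new Configuration(cli.configuration);
effectiveConfig.addAll(dynamicPropertiesFrom(args)); // invented placeholder

SecurityUtils.install(new SecurityConfiguration(effectiveConfig));

// Run the actual command under the freshly installed security context.
int retCode =
        SecurityUtils.getInstalledContext().runSecured(() -> cli.parseAndRun(args));
System.exit(retCode);
{code}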




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607267#comment-17607267
 ] 

Danny Cranmer edited comment on FLINK-29324 at 9/20/22 3:42 PM:


Merged commit 
[{{71fea9a}}|https://github.com/apache/flink/commit/71fea9a4522505a6c0f23f1de599b7f87a633ccf]
 into master.

Merged commit 
[{{22086c}}|https://github.com/apache/flink/commit/22086c67a6a97148eb74ed32b281eec393721738]
 into release-1.16


was (Author: dannycranmer):
Merged commit 
[{{71fea9a}}|https://github.com/apache/flink/commit/71fea9a4522505a6c0f23f1de599b7f87a633ccf]
 into master

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29277) Flink submits tasks to yarn Federation and throws an exception 'org.apache.commons.lang3.NotImplementedException: Code is not implemented'

2022-09-20 Thread Biao Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Biao Geng updated FLINK-29277:
--
Attachment: screenshot-1.png

> Flink submits tasks to yarn Federation and throws an exception 
> 'org.apache.commons.lang3.NotImplementedException: Code is not implemented'
> --
>
> Key: FLINK-29277
> URL: https://issues.apache.org/jira/browse/FLINK-29277
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.14.3
> Environment: Flink 1.14.3、JDK8、hadoop-3.2.1
>Reporter: Jiankun Feng
>Priority: Blocker
> Attachments: error.log, image-2022-09-13-15-56-47-631.png, 
> screenshot-1.png
>
>
> 2022-09-13 11:02:35,488 INFO  
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The 
> derived from fraction jvm overhead memory (102.400mb (107374184 bytes)) is 
> less than its min value 192.000mb (201326592 bytes), min value will be used 
> instead
> 2022-09-13 11:02:35,751 WARN  org.apache.flink.table.client.cli.CliClient     
>              [] - Could not execute SQL statement.
> org.apache.flink.table.client.gateway.SqlExecutionException: Could not 
> execute SQL statement.
>         at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:225)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.callInserts(CliClient.java:617) 
> ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.callInsert(CliClient.java:606) 
> ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.callOperation(CliClient.java:466) 
> ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.lambda$executeStatement$1(CliClient.java:346)
>  [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_141]
>         at 
> org.apache.flink.table.client.cli.CliClient.executeStatement(CliClient.java:339)
>  [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.executeFile(CliClient.java:318) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.executeInNonInteractiveMode(CliClient.java:234)
>  [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:153) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at org.apache.flink.table.client.SqlClient.start(SqlClient.java:95) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:187) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
> Caused by: org.apache.flink.table.api.TableException: Failed to execute sql
>         at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:791)
>  ~[flink-table_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:754)
>  ~[flink-table_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeModifyOperations$4(LocalExecutor.java:223)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:223)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         ... 12 more
> Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: 
> Could not deploy Yarn job cluster.
>         at 
> org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:489)
>  ~[flink-dist_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:81)
>  ~[flink-dist_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> 

[jira] [Commented] (FLINK-29277) Flink submits tasks to yarn Federation and throws an exception 'org.apache.commons.lang3.NotImplementedException: Code is not implemented'

2022-09-20 Thread Biao Geng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607268#comment-17607268
 ] 

Biao Geng commented on FLINK-29277:
---

In Hadoop 3.2.1, 
org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor#getClusterNodes
 is not implemented. 
 !screenshot-1.png! 
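
For reference, a paraphrase of the interceptor stub shown in the screenshot 
(not the verbatim Hadoop source); this is what surfaces as the 
NotImplementedException when Flink's YARN client queries the cluster nodes 
through the Router:

{code:java}
// org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor,
// Hadoop 3.2.1 (paraphrased). The message matches the one in this issue's
// title, raised via commons-lang3's NotImplementedException.
@Override
public GetClusterNodesResponse getClusterNodes(GetClusterNodesRequest request)
        throws YarnException, IOException {
    throw new NotImplementedException("Code is not implemented");
}
{code}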

> Flink submits tasks to yarn Federation and throws an exception 
> 'org.apache.commons.lang3.NotImplementedException: Code is not implemented'
> --
>
> Key: FLINK-29277
> URL: https://issues.apache.org/jira/browse/FLINK-29277
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.14.3
> Environment: Flink 1.14.3、JDK8、hadoop-3.2.1
>Reporter: Jiankun Feng
>Priority: Blocker
> Attachments: error.log, image-2022-09-13-15-56-47-631.png, 
> screenshot-1.png
>
>
> 2022-09-13 11:02:35,488 INFO  
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The 
> derived from fraction jvm overhead memory (102.400mb (107374184 bytes)) is 
> less than its min value 192.000mb (201326592 bytes), min value will be used 
> instead
> 2022-09-13 11:02:35,751 WARN  org.apache.flink.table.client.cli.CliClient     
>              [] - Could not execute SQL statement.
> org.apache.flink.table.client.gateway.SqlExecutionException: Could not 
> execute SQL statement.
>         at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:225)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.callInserts(CliClient.java:617) 
> ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.callInsert(CliClient.java:606) 
> ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.callOperation(CliClient.java:466) 
> ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.lambda$executeStatement$1(CliClient.java:346)
>  [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_141]
>         at 
> org.apache.flink.table.client.cli.CliClient.executeStatement(CliClient.java:339)
>  [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.executeFile(CliClient.java:318) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.cli.CliClient.executeInNonInteractiveMode(CliClient.java:234)
>  [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:153) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at org.apache.flink.table.client.SqlClient.start(SqlClient.java:95) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:187) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161) 
> [flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
> Caused by: org.apache.flink.table.api.TableException: Failed to execute sql
>         at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:791)
>  ~[flink-table_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:754)
>  ~[flink-table_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeModifyOperations$4(LocalExecutor.java:223)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:223)
>  ~[flink-sql-client_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         ... 12 more
> Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: 
> Could not deploy Yarn job cluster.
>         at 
> org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:489)
>  ~[flink-dist_2.11-1.14.3-qihoo-f5.jar:1.14.3-qihoo-f5]
>         at 
> 

[jira] [Updated] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-29324:
--
Fix Version/s: 1.17.0

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29324) Calling Kinesis connector close method before subtask starts running results in NPE

2022-09-20 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607267#comment-17607267
 ] 

Danny Cranmer commented on FLINK-29324:
---

Merged commit 
[{{71fea9a}}|https://github.com/apache/flink/commit/71fea9a4522505a6c0f23f1de599b7f87a633ccf]
 into master

> Calling Kinesis connector close method before subtask starts running results 
> in NPE
> ---
>
> Key: FLINK-29324
> URL: https://issues.apache.org/jira/browse/FLINK-29324
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.14.5, 1.15.2
>Reporter: Anthony Pounds-Cornish
>Assignee: Dongming.Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> When a Flink application is stopped before a Kinesis connector subtask has 
> been started, the following exception is thrown:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.close(FlinkKinesisConsumer.java:421)
> ...{noformat}
> This appears to be related to the fact that [fetcher 
> creation|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L307]
>  may not occur by [the time it is referenced when the consumer is 
> closed|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L421].
> A suggested fix is to make the {{close()}} method null safe [as it has been 
> in the {{cancel()}} 
> method|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L407].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

