[GitHub] [flink] X-czh opened a new pull request, #23465: [FLINK-33147] Introduce endpoint field in REST API and deprecate host field

2023-09-25 Thread via GitHub


X-czh opened a new pull request, #23465:
URL: https://github.com/apache/flink/pull/23465

   
   
   ## What is the purpose of the change
   
   First step towards [FLIP-363: Unify the Representation of TaskManager Location in REST API and Web UI](https://cwiki.apache.org/confluence/display/FLINK/FLIP-363%3A+Unify+the+Representation+of+TaskManager+Location+in+REST+API+and+Web+UI): introduce a new "endpoint" field in the REST API to represent the TaskManager endpoint (host + port) and deprecate the "host" field.
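   
   As an illustration, the new field simply combines the value of the deprecated "host" field with the TaskManager port (the example values below are made up):
   
   ```java
   final class EndpointFieldSketch {
       public static void main(String[] args) {
           // Hypothetical values: the new "endpoint" field is "host:port",
           // where "host" is what the deprecated field exposed on its own.
           String host = "taskmanager-1";
           int port = 35485;
           System.out.println(host + ":" + port); // taskmanager-1:35485
       }
   }
   ```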
   
   ## Brief change log
   
   Introduce endpoint field in REST API and deprecate host field.
   
   ## Verifying this change
   
   This change is already covered by existing tests, and verified on a 
standalone cluster.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented) 
   
   The REST API doc is regenerated to cover the changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33156) Remove flakiness from tests in OperatorStateBackendTest.java

2023-09-25 Thread Asha Boyapati (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Asha Boyapati updated FLINK-33156:
--
Description: 
This issue is similar to:
https://issues.apache.org/jira/browse/FLINK-32963

We are proposing to make the following tests stable:

{quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync
org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}

The tests are currently flaky because the order of elements returned by 
iterators is non-deterministic.

The following PR fixes the flaky test by making it independent of the order of 
elements returned by the iterator:
https://github.com/apache/flink/pull/23464

We detected this using the NonDex tool using the following commands:

{quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
-DnondexRuns=10 
-Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync

mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
-DnondexRuns=10 
-Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}

Please see the following Continuous Integration log that shows the flakiness:
https://github.com/asha-boyapati/flink/actions/runs/6193757385

Please see the following Continuous Integration log that shows that the 
flakiness is fixed by this change:
https://github.com/asha-boyapati/flink/actions/runs/619409

  was:
This issue is similar to:
https://issues.apache.org/jira/browse/FLINK-32963

We are proposing to make the following tests stable:

{quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync
org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}

The tests are currently flaky because the order of elements returned by 
iterators is non-deterministic.

The following PR fixes the flaky test by making it independent of the order of 
elements returned by the iterator:
https://github.com/asha-boyapati/flink/pull/2

We detected this using the NonDex tool using the following commands:

{quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
-DnondexRuns=10 
-Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync

mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
-DnondexRuns=10 
-Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}

Please see the following Continuous Integration log that shows the flakiness:
https://github.com/asha-boyapati/flink/actions/runs/6193757385

Please see the following Continuous Integration log that shows that the 
flakiness is fixed by this change:
https://github.com/asha-boyapati/flink/actions/runs/619409


> Remove flakiness from tests in OperatorStateBackendTest.java
> 
>
> Key: FLINK-33156
> URL: https://issues.apache.org/jira/browse/FLINK-33156
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.17.1
>Reporter: Asha Boyapati
>Priority: Minor
> Fix For: 1.17.1
>
>
> This issue is similar to:
> https://issues.apache.org/jira/browse/FLINK-32963
> We are proposing to make the following tests stable:
> {quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync
> org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}
> The tests are currently flaky because the order of elements returned by 
> iterators is non-deterministic.
> The following PR fixes the flaky test by making it independent of the order 
> of elements returned by the iterator:
> https://github.com/apache/flink/pull/23464
> We detected this using the NonDex tool using the following commands:
> {quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
> -DnondexRuns=10 
> -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync
> mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
> -DnondexRuns=10 
> -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}
> Please see the following Continuous Integration log that shows the flakiness:
> https://github.com/asha-boyapati/flink/actions/runs/6193757385
> Please see the following Continuous Integration log that shows that the 
> flakiness is fixed by this change:
> https://github.com/asha-boyapati/flink/actions/runs/619409



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] asha-boyapati opened a new pull request, #23464: Remove flakiness from tests in OperatorStateBackendTest.java

2023-09-25 Thread via GitHub


asha-boyapati opened a new pull request, #23464:
URL: https://github.com/apache/flink/pull/23464

   
   
   ## What is the purpose of the change
   
   This PR is similar to the following PR which was accepted: 
https://github.com/apache/flink/pull/23298
   
   This PR makes the following tests stable:
   ```
   org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync
   org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync
   ```
   
   The tests are currently flaky because the order of elements returned by 
iterators is non-deterministic.
   
   This PR fixes the flaky tests by making them independent of the order of 
elements returned by the iterators.
   
   We detected this flakiness using the NonDex tool, with the following commands:
   
   `mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync`
   
   `mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync`
   
   Please see the following Continuous Integration log that shows the flakiness:
   https://github.com/asha-boyapati/flink/actions/runs/6193757385
   
   Please see the following Continuous Integration log that shows that the 
flakiness is fixed by this change:
   https://github.com/asha-boyapati/flink/actions/runs/619409
   
   
   ## Brief change log
   
   This PR fixes the flaky tests by making them independent of the order of 
elements returned by the iterators.
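   
   For illustration, a minimal sketch of the order-independent assertion style this change moves to (not the actual diff; the helper and the AssertJ usage are our own):
   
   ```java
   import static org.assertj.core.api.Assertions.assertThat;
   
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Iterator;
   import java.util.List;
   
   class OrderIndependentAssertionSketch {
       // Instead of asserting element-by-element in iteration order (which is
       // non-deterministic for hash-based state), collect the iterator's
       // contents and compare them ignoring order.
       static void assertSameElements(Iterator<String> actual, String... expected) {
           List<String> collected = new ArrayList<>();
           actual.forEachRemaining(collected::add);
           assertThat(collected).containsExactlyInAnyOrder(expected);
       }
   
       public static void main(String[] args) {
           assertSameElements(Arrays.asList("b", "a").iterator(), "a", "b"); // passes
       }
   }
   ```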
   
   
   ## Verifying this change
   
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   This change only modifies tests slightly and does not change any underlying 
code.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   





[jira] [Created] (FLINK-33156) Remove flakiness from tests in OperatorStateBackendTest.java

2023-09-25 Thread Asha Boyapati (Jira)
Asha Boyapati created FLINK-33156:
-

 Summary: Remove flakiness from tests in 
OperatorStateBackendTest.java
 Key: FLINK-33156
 URL: https://issues.apache.org/jira/browse/FLINK-33156
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Affects Versions: 1.17.1
Reporter: Asha Boyapati
 Fix For: 1.17.1


This issue is similar to:
https://issues.apache.org/jira/browse/FLINK-32963

We are proposing to make the following tests stable:

{quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync
org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}

The tests are currently flaky because the order of elements returned by 
iterators is non-deterministic.

The following PR fixes the flaky test by making it independent of the order of 
elements returned by the iterator:
https://github.com/asha-boyapati/flink/pull/2

We detected this using the NonDex tool using the following commands:

{quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
-DnondexRuns=10 
-Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync

mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime 
-DnondexRuns=10 
-Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote}

Please see the following Continuous Integration log that shows the flakiness:
https://github.com/asha-boyapati/flink/actions/runs/6193757385

Please see the following Continuous Integration log that shows that the 
flakiness is fixed by this change:
https://github.com/asha-boyapati/flink/actions/runs/619409





[jira] [Commented] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions

2023-09-25 Thread tanjialiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769014#comment-17769014
 ] 

tanjialiang commented on FLINK-28303:
-

[~martijnvisser] I have already checked the latest Kafka connector code; this problem still exists.

> Kafka SQL Connector loses data when restoring from a savepoint with a topic 
> with empty partitions
> -
>
> Key: FLINK-28303
> URL: https://issues.apache.org/jira/browse/FLINK-28303
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Robert Metzger
>Priority: Major
>
> Steps to reproduce:
> - Set up a Kafka topic with 10 partitions
> - produce records 0-9 into the topic
> - take a savepoint and stop the job
> - produce records 10-19 into the topic
> - restore the job from the savepoint.
> The job will be missing usually 2-4 records from 10-19.
> My assumption is that if a partition never had data (which is likely with 10 
> partitions and 10 records), the savepoint will only contain offsets for 
> partitions with data. 
> While the job was offline (and we've written record 10-19 into the topic), 
> all partitions got filled. Now, when Kafka comes online again, it will use 
> the "latest" offset for those partitions, skipping some data.
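A sketch of one common mitigation at the source level (assuming the newer KafkaSource API; this addresses the "latest"-fallback symptom for partitions without a known offset, and does not by itself change how offsets are restored from a savepoint):

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

final class KafkaEarliestFallbackSketch {
    static KafkaSource<String> buildSource() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("my-topic")
                .setGroupId("my-group")
                // Fall back to EARLIEST (instead of LATEST) for partitions that
                // have no committed offset, so records produced while the job
                // was down are not skipped on those partitions.
                .setStartingOffsets(
                        OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
{code}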





[GitHub] [flink] xintongsong commented on a diff in pull request #23456: [FLINK-33144][datastream]Deprecate Iteration API in DataStream

2023-09-25 Thread via GitHub


xintongsong commented on code in PR #23456:
URL: https://github.com/apache/flink/pull/23456#discussion_r1336633757


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStream.java:
##
@@ -526,8 +526,14 @@ public DataStream global() {
  * in the set time, the stream terminates.
  *
  * @return The iterative data stream created.
+ * @deprecated This method is deprecated since Flink 1.19. The users are 
recommended to use
+ * Iteration API in Flink ML instead.

Review Comment:
   We should not recommend that users use the Iteration API in Flink ML instead. It doesn't make sense that a user who doesn't need any ML algorithm would have to use Flink ML only to get access to the Iteration API.
   
   I'd suggest the following:
   > The only known use case of this Iteration API comes from Flink ML, which already has its own implementation of iteration and no longer uses this API. If there are any use cases other than Flink ML that need iteration support, please reach out to d...@flink.apache.org and we can consider making the Flink ML iteration implementation a separate common library.
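   
   For context, a minimal sketch of the DataStream Iteration API being deprecated here (adapted from the classic iterate/closeWith pattern; the pipeline itself is illustrative):
   
   ```java
   import org.apache.flink.api.common.typeinfo.Types;
   import org.apache.flink.streaming.api.datastream.DataStream;
   import org.apache.flink.streaming.api.datastream.IterativeStream;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
   
   final class IterateApiSketch {
       public static void main(String[] args) throws Exception {
           StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   
           DataStream<Long> input = env.fromSequence(1, 10);
           // Opens the loop; 5000 ms is the maximum wait for feedback data
           // before the iteration terminates.
           IterativeStream<Long> iteration = input.iterate(5000L);
   
           DataStream<Long> minusOne = iteration.map(v -> v - 1).returns(Types.LONG);
           // Values still above zero are fed back into the loop head.
           iteration.closeWith(minusOne.filter(v -> v > 0));
           // Values at or below zero leave the loop.
           minusOne.filter(v -> v <= 0).print();
   
           env.execute("iterate-sketch");
       }
   }
   ```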



##
flink-examples/flink-examples-streaming/pom.xml:
##
@@ -137,6 +137,11 @@ under the License.


-Xlint:deprecation

true
+   
+   
+   
org/apache/flink/streaming/examples/iteration/IterateExample.java

Review Comment:
   We should explain that this example is temporarily preserved only for testing purposes.






[GitHub] [flink-kubernetes-operator] 1996fanrui commented on a diff in pull request #677: [FLINK-33097][autoscaler] Initialize the generic autoscaler module and interfaces

2023-09-25 Thread via GitHub


1996fanrui commented on code in PR #677:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/677#discussion_r1336584237


##
flink-autoscaler/pom.xml:
##
@@ -45,6 +45,32 @@ under the License.
             <scope>provided</scope>
         </dependency>
 
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <version>${lombok.version}</version>
+            <scope>provided</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>org.junit.jupiter</groupId>
+            <artifactId>junit-jupiter-params</artifactId>
+            <scope>test</scope>
+        </dependency>
+
+

[GitHub] [flink] xiangyuf commented on a diff in pull request #23455: [FLINK-25015][Table SQL/Client] change job name to sql when submitting queries using client

2023-09-25 Thread via GitHub


xiangyuf commented on code in PR #23455:
URL: https://github.com/apache/flink/pull/23455#discussion_r1336585759


##
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java:
##
@@ -1050,10 +1052,15 @@ private TableResultInternal executeInternal(
     }
 
     private TableResultInternal executeQueryOperation(QueryOperation operation) {
+        String querySql = null;
+        if (operation instanceof QuerySqlOperation) {
+            querySql = ((QuerySqlOperation) operation).getQuerySql();
+        }
         CollectModifyOperation sinkOperation = new CollectModifyOperation(operation);
         List<Transformation<?>> transformations =
                 translate(Collections.singletonList(sinkOperation));
-        final String defaultJobName = "collect";
+        final String defaultJobName =

Review Comment:
   @FangYongs Hi, in the community version, only the jobId is used in the checkpoint path.
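   
   A hypothetical sketch (not the PR's actual code) of the fallback the truncated `defaultJobName` assignment above appears to introduce:
   
   ```java
   final class DefaultJobNameSketch {
       // Prefer the original SQL text as the default job name when the
       // operation carries one; otherwise keep the previous default.
       static String defaultJobName(String querySql) {
           return querySql != null ? querySql : "collect";
       }
   }
   ```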






[GitHub] [flink] victor9309 commented on pull request #23451: [FLINK-32108][test] KubernetesExtension calls assumeThat in @BeforeAll callback which doesn't print the actual failure message

2023-09-25 Thread via GitHub


victor9309 commented on PR #23451:
URL: https://github.com/apache/flink/pull/23451#issuecomment-1734762519

   failure





[GitHub] [flink] victor9309 closed pull request #23451: [FLINK-32108][test] KubernetesExtension calls assumeThat in @BeforeAll callback which doesn't print the actual failure message

2023-09-25 Thread via GitHub


victor9309 closed pull request #23451: [FLINK-32108][test] KubernetesExtension 
calls assumeThat in @BeforeAll callback which doesn't print the actual failure 
message
URL: https://github.com/apache/flink/pull/23451





[GitHub] [flink-kubernetes-operator] XComp commented on a diff in pull request #677: [FLINK-33097][autoscaler] Initialize the generic autoscaler module and interfaces

2023-09-25 Thread via GitHub


XComp commented on code in PR #677:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/677#discussion_r1336571925


##
flink-autoscaler/pom.xml:
##
@@ -45,6 +45,32 @@ under the License.
             <scope>provided</scope>
         </dependency>
 
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <version>${lombok.version}</version>
+            <scope>provided</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>org.junit.jupiter</groupId>
+            <artifactId>junit-jupiter-params</artifactId>
+            <scope>test</scope>
+        </dependency>
+
+

[GitHub] [flink] flinkbot commented on pull request #23463: Revert "[FLINK-33064][table-planner] Improve the error message when the lookup source is used as the scan source"

2023-09-25 Thread via GitHub


flinkbot commented on PR #23463:
URL: https://github.com/apache/flink/pull/23463#issuecomment-1734753507

   
   ## CI report:
   
   * 14e35b26265249609fb92f82c02d598f1c29ce7e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] swuferhong opened a new pull request, #23463: Revert "[FLINK-33064][table-planner] Improve the error message when the lookup source is used as the scan source"

2023-09-25 Thread via GitHub


swuferhong opened a new pull request, #23463:
URL: https://github.com/apache/flink/pull/23463

   
   
   
   
   ## What is the purpose of the change
   
   After testing different SQL patterns, we found that `FLINK-33064` can cause serious bugs in view and subquery scenarios, so this PR aims to revert `FLINK-33064`.
   
   
   ## Brief change log
   
   revert `FLINK-33064`
   
   
   ## Verifying this change
   
   no tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? no docs
   





[GitHub] [flink-connector-pulsar] minchowang commented on pull request #16: [BK-3.0][FLINK-30552][Connector/Pulsar] drop batch message size assertion, better set the cursor position.

2023-09-25 Thread via GitHub


minchowang commented on PR #16:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/16#issuecomment-1734745307

   @syhily Hi, can this PR also be applied to develop-1.14 ([develop-flink-1.14](https://github.com/streamnative/flink/tree/develop))?





[GitHub] [flink] swuferhong closed pull request #23457: [FLINK-33143][table-planner] Fix wrongly throw error while temporary table join with invalidScan lookup source and selected as view

2023-09-25 Thread via GitHub


swuferhong closed pull request #23457: [FLINK-33143][table-planner] Fix wrongly 
throw error while temporary table join with invalidScan lookup source and 
selected as view
URL: https://github.com/apache/flink/pull/23457





[jira] [Closed] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view

2023-09-25 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng closed FLINK-33143.
-
Resolution: Resolved

> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view
> --
>
> Key: FLINK-33143
> URL: https://issues.apache.org/jira/browse/FLINK-33143
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view.





[jira] [Commented] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view

2023-09-25 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768969#comment-17768969
 ] 

Yunhong Zheng commented on FLINK-33143:
---

This fix cannot cover all SQL patterns (e.g., subqueries), so this issue will be closed.

> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view
> --
>
> Key: FLINK-33143
> URL: https://issues.apache.org/jira/browse/FLINK-33143
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view.





[GitHub] [flink] xbthink commented on pull request #23454: [FLINK-33130]reuse source and sink operator io metrics for task

2023-09-25 Thread via GitHub


xbthink commented on PR #23454:
URL: https://github.com/apache/flink/pull/23454#issuecomment-1734723928

   @flinkbot run azure





[GitHub] [flink] flinkbot commented on pull request #23462: [BP-1.18][FLINK-32976][runtime] Fix NullPointException when starting flink cluster in standalone mode

2023-09-25 Thread via GitHub


flinkbot commented on PR #23462:
URL: https://github.com/apache/flink/pull/23462#issuecomment-1734714992

   
   ## CI report:
   
   * a630fd02f8232d8bbd9f521bdc7990dfd8d79e21 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Commented] (FLINK-33129) Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type

2023-09-25 Thread Zhaoyang Shao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768959#comment-17768959
 ] 

Zhaoyang Shao commented on FLINK-33129:
---

In DataStructureConverters, there is already a similar implementation:
```
putConverter(
        LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE,
        Integer.class,
        constructor(LocalZonedTimestampIntConverter::new));
```

[https://github.com/apache/flink/blob/9b2b4e3f194467aae0d299b3b403e0ca60c42ef0/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/data/conversion/DataStructureConverters.java#L134]

We can use the same/similar approach to support TIMESTAMP_WITH_LOCAL_TIME_ZONE in `RowDataToAvroConverters.createConverter`.
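
A hypothetical sketch of what such a branch could produce: Avro's timestamp-millis logical type stores UTC epoch milliseconds, which `TimestampData` yields via `toInstant()` (the helper class and method name below are ours):

```java
import org.apache.flink.table.data.TimestampData;

final class LocalZonedTimestampToAvroSketch {
    // What a TIMESTAMP_WITH_LOCAL_TIME_ZONE case in
    // RowDataToAvroConverters.createConverter could return per row value.
    static long toAvroTimestampMillis(TimestampData value) {
        return value.toInstant().toEpochMilli();
    }
}
```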

> Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type
> 
>
> Key: FLINK-33129
> URL: https://issues.apache.org/jira/browse/FLINK-33129
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.17.1
>Reporter: Zhaoyang Shao
>Priority: Critical
> Fix For: 1.17.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> While creating a converter using `RowDataToAvroConverters.createConverter` with 
> the LocalZonedTimestampType logical type, the method throws an exception. This 
> is because the switch statement is missing a case for 
> `LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE`.
> Code: 
> [https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/RowDataToAvroConverters.java#L75]
>  
> We can convert the value to `LocalDateTime` and then `TimestampData` using 
> method below. Then we can apply the same converter as 
> TIMESTAMP_WITHOUT_TIME_ZONE? 
>  
> `TimestampData fromLocalDateTime(LocalDateTime dateTime)`
> Can Flink team help adding the support for this logical type and logical type 
> root?
> This is now a blocker for creating Flink Iceberg consumer with Avro 
> GenericRecord when IcebergTable has `TimestampTZ` type field which will be 
> converted to LocalZonedTimestampType.
> See error below:
>  Unsupported type: TIMESTAMP_LTZ(6) 
> stack: [ [-] 
>   
> org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:186)
>  
>   
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) 
>   
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>  
>   
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) 
>   
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) 
>   
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:550) 
>   
> java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
>  
>   
> java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:517) 
>   
> org.apache.flink.formats.avro.RowDataToAvroConverters.createRowConverter(RowDataToAvroConverters.java:224)
>  
>   
> org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:178)
>  
>   
> org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.(RowDataToAvroGenericRecordConverter.java:46)
>  
>   
> org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.fromIcebergSchema(RowDataToAvroGenericRecordConverter.java:60)
>  
>   
> org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.lazyConverter(AvroGenericRecordReaderFunction.java:93)
>  
>   
> org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.createDataIterator(AvroGenericRecordReaderFunction.java:85)
>  
>   
> org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:39)
>  
>   
> org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:27)
>  
>   
> org.apache.iceberg.flink.source.reader.IcebergSourceSplitReader.fetch(IcebergSourceSplitReader.java:74)
>  
>   
> org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
>  
>   
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:162)
>  
>   
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:114)
>  
>   
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
>   java.util.concurrent.FutureTask.run(FutureTask.java:264) 
>   
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  
>   

[GitHub] [flink] hackergin opened a new pull request, #23462: [BP-1.18][FLINK-32976][runtime] Fix NullPointException when starting flink cluster in standalone mode

2023-09-25 Thread via GitHub


hackergin opened a new pull request, #23462:
URL: https://github.com/apache/flink/pull/23462

   Backport of https://github.com/apache/flink/pull/23446
   





[GitHub] [flink] Sxnan commented on pull request #22897: [FLINK-32476][runtime] Support configuring object-reuse for internal operators

2023-09-25 Thread via GitHub


Sxnan commented on PR #22897:
URL: https://github.com/apache/flink/pull/22897#issuecomment-1734709549

   @flinkbot run azure





[jira] [Updated] (FLINK-33129) Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type

2023-09-25 Thread Zhaoyang Shao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaoyang Shao updated FLINK-33129:
--
Priority: Critical  (was: Blocker)

> Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type
> 
>
> Key: FLINK-33129
> URL: https://issues.apache.org/jira/browse/FLINK-33129
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.17.1
>Reporter: Zhaoyang Shao
>Priority: Critical
> Fix For: 1.17.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> While creating a converter using `RowDataToAvroConverters.createConverter` with 
> the LocalZonedTimestampType logical type, the method throws an exception. This 
> is because the switch statement is missing a case for 
> `LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE`.
> Code: 
> [https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/RowDataToAvroConverters.java#L75]
>  
> We can convert the value to `LocalDateTime` and then `TimestampData` using 
> method below. Then we can apply the same converter as 
> TIMESTAMP_WITHOUT_TIME_ZONE? 
>  
> `TimestampData fromLocalDateTime(LocalDateTime dateTime)`
> Can Flink team help adding the support for this logical type and logical type 
> root?
> This is now a blocker for creating Flink Iceberg consumer with Avro 
> GenericRecord when IcebergTable has `TimestampTZ` type field which will be 
> converted to LocalZonedTimestampType.
> See error below:
>  Unsupported type: TIMESTAMP_LTZ(6) 
> stack: [ [-] 
>   
> org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:186)
>  
>   
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) 
>   
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>  
>   
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) 
>   
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) 
>   
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:550) 
>   
> java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
>  
>   
> java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:517) 
>   
> org.apache.flink.formats.avro.RowDataToAvroConverters.createRowConverter(RowDataToAvroConverters.java:224)
>  
>   
> org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:178)
>  
>   
> org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.(RowDataToAvroGenericRecordConverter.java:46)
>  
>   
> org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.fromIcebergSchema(RowDataToAvroGenericRecordConverter.java:60)
>  
>   
> org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.lazyConverter(AvroGenericRecordReaderFunction.java:93)
>  
>   
> org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.createDataIterator(AvroGenericRecordReaderFunction.java:85)
>  
>   
> org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:39)
>  
>   
> org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:27)
>  
>   
> org.apache.iceberg.flink.source.reader.IcebergSourceSplitReader.fetch(IcebergSourceSplitReader.java:74)
>  
>   
> org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
>  
>   
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:162)
>  
>   
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:114)
>  
>   
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
>   java.util.concurrent.FutureTask.run(FutureTask.java:264) 
>   
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  
>   
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  
>   java.lang.Thread.run(Thread.java:829)





[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…

2023-09-25 Thread via GitHub


yigress commented on code in PR #23425:
URL: https://github.com/apache/flink/pull/23425#discussion_r1336513394


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java:
##
@@ -368,6 +373,34 @@ public void discard() throws Exception {
         }
     }
 
+    @Override
+    public void discard() throws Exception {
+        discard(null);
+    }
+
+    private void discardOperatorStates(Executor ioExecutor) throws Exception {
+        if (ioExecutor == null) {
+            StateUtil.bestEffortDiscardAllStateObjects(operatorStates.values());
+        } else {
+            List discardables =
+                    operatorStates.values().stream()
+                            .flatMap(op -> op.getDiscardables().stream())
+                            .collect(Collectors.toList());
+            LOG.trace("Executing discard {} operator states {}", discardables.size());

Review Comment:
   updated
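   
   For reference, a minimal sketch of the parallel-discard idea under discussion (the `Discardable` interface is a stand-in for the PR's actual state-object types):
   
   ```java
   import java.util.List;
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.Executor;
   
   final class ParallelDiscardSketch {
       interface Discardable {
           void discard() throws Exception;
       }
   
       // With an I/O executor, each state object is discarded as a separate
       // task and we wait for all of them; without one, discarding would stay
       // sequential on the calling thread.
       static void discardAll(List<Discardable> discardables, Executor ioExecutor) {
           CompletableFuture<?>[] tasks =
                   discardables.stream()
                           .map(d -> CompletableFuture.runAsync(
                                   () -> {
                                       try {
                                           d.discard();
                                       } catch (Exception e) {
                                           throw new RuntimeException(e);
                                       }
                                   },
                                   ioExecutor))
                           .toArray(CompletableFuture[]::new);
           CompletableFuture.allOf(tasks).join();
       }
   }
   ```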






[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…

2023-09-25 Thread via GitHub


yigress commented on code in PR #23425:
URL: https://github.com/apache/flink/pull/23425#discussion_r1336513096


##
flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java:
##
@@ -109,6 +109,20 @@ public class CheckpointingOptions {
                     .defaultValue(1)
                     .withDescription("The maximum number of completed checkpoints to retain.");
 
+    /**
+     * Option whether to clean each checkpoint's states in fast mode. When in fast mode, operator
+     * states are discarded in parallel using the ExecutorService passed to the cleaner, otherwise
+     * operator states are discarded sequentially.
+     */
+    @Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
+    public static final ConfigOption CLEANER_FAST_MODE =

Review Comment:
   updated the config comments






[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…

2023-09-25 Thread via GitHub


yigress commented on code in PR #23425:
URL: https://github.com/apache/flink/pull/23425#discussion_r1336512951


##
flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java:
##
@@ -109,6 +109,20 @@ public class CheckpointingOptions {
                     .defaultValue(1)
                     .withDescription("The maximum number of completed checkpoints to retain.");
 
+    /**
+     * Option whether to clean each checkpoint's states in fast mode. When in fast mode, operator
+     * states are discarded in parallel using the ExecutorService passed to the cleaner, otherwise
+     * operator states are discarded sequentially.
+     */
+    @Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
+    public static final ConfigOption CLEANER_FAST_MODE =

Review Comment:
   changed to parallel-mode.
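   
   A hypothetical sketch of how such a boolean option is typically declared with Flink's `ConfigOptions` builder (the key name and default below are ours, following the rename to parallel-mode):
   
   ```java
   import org.apache.flink.configuration.ConfigOption;
   import org.apache.flink.configuration.ConfigOptions;
   
   final class CleanerOptionSketch {
       static final ConfigOption<Boolean> CLEANER_PARALLEL_MODE =
               ConfigOptions.key("state.checkpoint-cleaner.parallel-mode")
                       .booleanType()
                       .defaultValue(true)
                       .withDescription(
                               "Whether to discard the operator states of a completed "
                                       + "checkpoint in parallel using the cleaner's I/O "
                                       + "executor, instead of sequentially.");
   }
   ```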






[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…

2023-09-25 Thread via GitHub


yigress commented on code in PR #23425:
URL: https://github.com/apache/flink/pull/23425#discussion_r1336512772


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java:
##
@@ -368,6 +373,34 @@ public void discard() throws Exception {
         }
     }
 
+    @Override
+    public void discard() throws Exception {
+        discard(null);
+    }
+
+    private void discardOperatorStates(Executor ioExecutor) throws Exception {

Review Comment:
   Yes, the logic is the same. Added a default method discardOperatorStates in the Checkpoint interface to unify them.






[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…

2023-09-25 Thread via GitHub


yigress commented on code in PR #23425:
URL: https://github.com/apache/flink/pull/23425#discussion_r1336512189


##
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorFailureTest.java:
##
@@ -72,7 +72,7 @@ class CheckpointCoordinatorFailureTest {
 
     /**
      * Tests that a failure while storing a completed checkpoint in the completed checkpoint store
-     * will properly fail the originating pending checkpoint and clean upt the completed checkpoint.
+     * will properly fail the originating pending checkpoint and clean up the completed checkpoint.

Review Comment:
   restore the typo. :) 






[jira] [Updated] (FLINK-32471) IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH

2023-09-25 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-32471:
---
Labels: pull-request-available stale-major  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Major but is unassigned, and neither it nor its Sub-Tasks have been updated for 60 days. I have gone ahead and added a "stale-major" label to the issue. If this ticket is a Major, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.


> IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH
> --
>
> Key: FLINK-32471
> URL: https://issues.apache.org/jira/browse/FLINK-32471
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: grandfisher
>Priority: Major
>  Labels: pull-request-available, stale-major
>
> According to FLINK-31273:
> The reason for the error is that other filters conflict with IS_NULL, but in 
> fact they won't conflict with IS_NOT_NULL, because operators in 
> SUITABLE_FILTER_TO_PUSH such as 'SqlKind.GREATER_THAN' have an implicit 
> filter 'IS_NOT_NULL' according to SQL semantics.
>  
> So we think it is feasible to add  IS_NOT_NULL to the SUITABLE_FILTER_TO_PUSH 
> list.





[jira] [Commented] (FLINK-33149) Bump snappy-java to 1.1.10.4

2023-09-25 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768878#comment-17768878
 ] 

Matthias Pohl commented on FLINK-33149:
---

I think my previous investigation was not thorough enough: snappy also seems to be used by {{flink-avro}}, {{flink-parquet}}, and {{flink-presto}}.

> Bump snappy-java to 1.1.10.4
> 
>
> Key: FLINK-33149
> URL: https://issues.apache.org/jira/browse/FLINK-33149
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Connectors / AWS, Connectors / HBase, 
> Connectors / Kafka, Stateful Functions
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Ryan Skraba
>Assignee: Ryan Skraba
>Priority: Major
>  Labels: pull-request-available
>
> Xerial published a security alert for a Denial of Service attack that [exists 
> on 
> 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv].
> This is included in flink-dist, but also in flink-statefun, and several 
> connectors.





[jira] [Commented] (FLINK-33149) Bump snappy-java to 1.1.10.4

2023-09-25 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768867#comment-17768867
 ] 

Matthias Pohl commented on FLINK-33149:
---

Thanks for looking into it. I did a code investigation to see where we use 
snappy in flink core.

Snappy was introduced for the state backend and used in 
[SnappyStreamCompressionDecorator.java:25-26|https://github.com/apache/flink/blob/116f297478f2d443178510565b1cd5a2f387e241/flink-runtime/src/main/java/org/apache/flink/runtime/state/SnappyStreamCompressionDecorator.java#L25].
 The classes that are affected by this vulnerability ({{SnappyInputStream}} and 
{{SnappyOutputStream}}) are not used. Flink uses {{SnappyFramedInputStream}} 
and {{SnappyFramedOutputStream}}. Therefore, it's not critical and priority 
Major makes sense. But it's still good to have this fixed considering the 
alerts that might pop up in security scanners.
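
For reference, a minimal round-trip through the framed streams Flink actually uses; the vulnerable {{SnappyInputStream}}/{{SnappyOutputStream}} classes are not involved:

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import org.xerial.snappy.SnappyFramedInputStream;
import org.xerial.snappy.SnappyFramedOutputStream;

final class SnappyFramedRoundTrip {
    public static void main(String[] args) throws Exception {
        byte[] data = "state backend payload".getBytes(StandardCharsets.UTF_8);

        // Compress with the framed format, as SnappyStreamCompressionDecorator does.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (SnappyFramedOutputStream out = new SnappyFramedOutputStream(compressed)) {
            out.write(data);
        }

        // Decompress with the framed input stream.
        try (SnappyFramedInputStream in =
                new SnappyFramedInputStream(new ByteArrayInputStream(compressed.toByteArray()))) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
{code}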

I also did a brief analysis of a few connector implementations:
{code}
➜  workspace for c in $(ls -d flink-connector*); do echo $c; grep --include=pom.xml -Hirn snappy $c; done
flink-connector-aws
flink-connector-aws/pom.xml:254:<groupId>org.xerial.snappy</groupId>
flink-connector-aws/pom.xml:255:<artifactId>snappy-java</artifactId>
flink-connector-cassandra
flink-connector-elasticsearch
flink-connector-gcp-pubsub
flink-connector-hbase
flink-connector-hbase/pom.xml:245:<groupId>org.xerial.snappy</groupId>
flink-connector-hbase/pom.xml:246:<artifactId>snappy-java</artifactId>
flink-connector-hive
flink-connector-jdbc
flink-connector-kafka
flink-connector-kafka/pom.xml:70:<snappy-java.version>1.1.8.3</snappy-java.version>
flink-connector-kafka/pom.xml:231:<groupId>org.xerial.snappy</groupId>
flink-connector-kafka/pom.xml:232:<artifactId>snappy-java</artifactId>
flink-connector-kafka/pom.xml:233:<version>${snappy-java.version}</version>
flink-connector-mongodb
flink-connector-opensearch
flink-connector-pulsar
flink-connector-rabbitmq
flink-connector-redis-streams
{code}

Only {{flink-connector-kafka}} and {{flink-connector-aws}} have this dependency 
listed. None of them actually uses any classes from within the {{xerial}} 
package:
{code}
for c in $(ls -d flink-connector*); do echo $c; grep --include="*java" -Hirn 
xerial $c; done
flink-connector-aws
flink-connector-cassandra
flink-connector-elasticsearch
flink-connector-gcp-pubsub
flink-connector-hbase
flink-connector-hive
flink-connector-jdbc
flink-connector-kafka
flink-connector-mongodb
flink-connector-opensearch
flink-connector-pulsar
flink-connector-rabbitmq
flink-connector-redis-streams
{code}

Would it be worth removing the dependency from the connectors entirely? WDYT?

> Bump snappy-java to 1.1.10.4
> 
>
> Key: FLINK-33149
> URL: https://issues.apache.org/jira/browse/FLINK-33149
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Connectors / AWS, Connectors / HBase, 
> Connectors / Kafka, Stateful Functions
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Ryan Skraba
>Assignee: Ryan Skraba
>Priority: Major
>  Labels: pull-request-available
>
> Xerial published a security alert for a Denial of Service attack that [exists 
> on 
> 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv].
> This is included in flink-dist, but also in flink-statefun, and several 
> connectors.





[GitHub] [flink-connector-aws] dependabot[bot] opened a new pull request, #98: Bump org.xerial.snappy:snappy-java from 1.1.10.1 to 1.1.10.4

2023-09-25 Thread via GitHub


dependabot[bot] opened a new pull request, #98:
URL: https://github.com/apache/flink-connector-aws/pull/98

   Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.1 to 1.1.10.4.
   
   Release notes
   Sourced from [org.xerial.snappy:snappy-java's releases](https://github.com/xerial/snappy-java/releases).
   
   v1.1.10.4
   
   What's Changed
   
   Security Fix
   - Fixed SnappyInputStream so as not to allocate too large memory when decompressing data with an extremely large chunk size by [@tunnelshade](https://github.com/tunnelshade) ([code change](https://github.com/xerial/snappy-java/commit/9f8c3cf74223ed0a8a834134be9c917b9f10ceb5))
     - This does not affect users only using Snappy.compress/uncompress methods
   
   Features
   - feature: Upgrade the internal snappy version to 1.1.10 (1.1.8 was wrongly used before) by [@xerial](https://github.com/xerial) in xerial/snappy-java#508
   - Support JDK21 (no internal change)
   
   Dependency Updates
   - Update scalafmt-core to 3.7.11 by @xerial-bot in xerial/snappy-java#485
   - Update sbt to 1.9.3 by @xerial-bot in xerial/snappy-java#483
   - Update scalafmt-core to 3.7.12 by @xerial-bot in xerial/snappy-java#487
   - Bump actions/checkout from 3 to 4 by @dependabot in xerial/snappy-java#502
   - Update sbt to 1.9.4 by @xerial-bot in xerial/snappy-java#496
   - Update scalafmt-core to 3.7.14 by @xerial-bot in xerial/snappy-java#501
   - Update sbt to 1.9.6 by @xerial-bot in xerial/snappy-java#505
   - Update native libraries by @github-actions in xerial/snappy-java#503
   
   Internal Updates
   - Update airframe-log to 23.7.4 by @xerial-bot in xerial/snappy-java#486
   - Update airframe-log to 23.8.0 by @xerial-bot in xerial/snappy-java#488
   - Update sbt-scalafmt to 2.5.2 by @xerial-bot in xerial/snappy-java#500
   - Update airframe-log to 23.8.6 by @xerial-bot in xerial/snappy-java#497
   - Update sbt-scalafmt to 2.5.1 by @xerial-bot in xerial/snappy-java#499
   - Update airframe-log to 23.9.1 by @xerial-bot in xerial/snappy-java#504
   - Update airframe-log to 23.9.2 by @xerial-bot in xerial/snappy-java#509
   
   Other Changes
   - Update NOTICE by [@imsudiproy](https://github.com/imsudiproy) in xerial/snappy-java#492
   
   Full Changelog: https://github.com/xerial/snappy-java/compare/v1.1.10.3...v1.1.10.4
   
   v1.1.10.3
   
   What's Changed
   
   Bug Fixes
   - Fix the GLIBC_2.32 not found issue of libsnappyjava.so in certain Linux distributions on s390x by [@kun-lu20](https://github.com/kun-lu20) in xerial/snappy-java#481
   
   Dependency Updates
   - Update scalafmt-core to 3.7.10 by @xerial-bot in xerial/snappy-java#480
   - Update native libraries by @github-actions in xerial/snappy-java#482
   
   New Contributors
   - [@kun-lu20](https://github.com/kun-lu20) made their first contribution in xerial/snappy-java#481
   
   ... (truncated)
   
   Commits
   - [9f8c3cf](https://github.com/xerial/snappy-java/commit/9f8c3cf74223ed0a8a834134be9c917b9f10ceb5) Merge pull request from GHSA-55g7-9cwv-5qfv
   - [49d7001](https://github.com/xerial/snappy-java/commit/49d700175f18ed5f8c5d371b7c2f80c75979bd68) Update 
[GitHub] [flink-statefun] dependabot[bot] opened a new pull request, #340: Bump org.xerial.snappy:snappy-java from 1.1.10.1 to 1.1.10.4 in /statefun-flink

2023-09-25 Thread via GitHub


dependabot[bot] opened a new pull request, #340:
URL: https://github.com/apache/flink-statefun/pull/340

   Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.1 to 1.1.10.4.

[jira] [Assigned] (FLINK-33149) Bump snappy-java to 1.1.10.4

2023-09-25 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl reassigned FLINK-33149:
-

Assignee: Ryan Skraba

> Bump snappy-java to 1.1.10.4
> 
>
> Key: FLINK-33149
> URL: https://issues.apache.org/jira/browse/FLINK-33149
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Connectors / AWS, Connectors / HBase, 
> Connectors / Kafka, Stateful Functions
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Ryan Skraba
>Assignee: Ryan Skraba
>Priority: Major
>  Labels: pull-request-available
>
> Xerial published a security alert for a Denial of Service attack that [exists 
> on 
> 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv].
> This is included in flink-dist, but also in flink-statefun, and several 
> connectors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] XComp commented on pull request #23461: [BP-1.18][FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


XComp commented on PR #23461:
URL: https://github.com/apache/flink/pull/23461#issuecomment-1734177907

   ```
   16:26:18.071 [ERROR] Failed to execute goal 
com.github.eirslett:frontend-maven-plugin:1.11.0:install-node-and-npm (install 
node and npm) on project flink-runtime-web: Could not extract the Node archive: 
Could not extract archive: 
'/__w/2/.m2/repository/com/github/eirslett/node/16.13.2/node-16.13.2-linux-x64.tar.gz':
 EOFException -> [Help 1]
   16:26:18.071 [ERROR] 
   16:26:18.071 [ERROR] To see the full stack trace of the errors, re-run Maven 
with the -e switch.
   16:26:18.071 [ERROR] Re-run Maven using the -X switch to enable full debug 
logging.
   ```
   
   Azure retry initiated because of the error in the compile stage. It looks 
like an infrastructure issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #23461: [BP-1.18][FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


flinkbot commented on PR #23461:
URL: https://github.com/apache/flink/pull/23461#issuecomment-1734057845

   
   ## CI report:
   
   * 71309fb4cba1e09de71b03a2ad190137e76eab2d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #23460: [BP-1.17][FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


flinkbot commented on PR #23460:
URL: https://github.com/apache/flink/pull/23460#issuecomment-1734056213

   
   ## CI report:
   
   * 3907cb98c6ec3a08e5c1c6ba86fd8f69e774e08d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #23459: [BP-1.16][FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


flinkbot commented on PR #23459:
URL: https://github.com/apache/flink/pull/23459#issuecomment-1734054047

   
   ## CI report:
   
   * ea9b0a0cff9ae7889e246989b2e5d27779f9eb0e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] RyanSkraba opened a new pull request, #23461: [BP-1.18][FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


RyanSkraba opened a new pull request, #23461:
URL: https://github.com/apache/flink/pull/23461

   ## What is the purpose of the change
   
   Backport of https://github.com/apache/flink/pull/23458 to `release-1.18`
   
   Bump the version of snappy to address a vulnerability.
   
   ## Brief change log
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **yes**
 - The public API, i.e., is any changed class annotated with **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **no**
 - If yes, how is the feature documented? **not applicable**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] RyanSkraba opened a new pull request, #23460: [FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


RyanSkraba opened a new pull request, #23460:
URL: https://github.com/apache/flink/pull/23460

   ## What is the purpose of the change
   
   Backport of https://github.com/apache/flink/pull/23458 to `release-1.17`
   
   Bump the version of snappy to address a vulnerability.
   
   ## Brief change log
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **yes**
 - The public API, i.e., is any changed class annotated with **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **no**
 - If yes, how is the feature documented? **not applicable**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] RyanSkraba opened a new pull request, #23459: [BP-1.16][FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


RyanSkraba opened a new pull request, #23459:
URL: https://github.com/apache/flink/pull/23459

   ## What is the purpose of the change
   
   Backport of https://github.com/apache/flink/pull/23458 to `release-1.16`
   
   Bump the version of snappy to address a vulnerability.
   
   ## Brief change log
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **yes**
 - The public API, i.e., is any changed class annotated with **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **no**
 - If yes, how is the feature documented? **not applicable**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process

2023-09-25 Thread Matthias Schwalbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768751#comment-17768751
 ] 

Matthias Schwalbe edited comment on FLINK-26585 at 9/25/23 3:41 PM:


Thanks [~masteryhx] ,

... Jira somehow swallowed my responses (me confused :)), I'll take care of it 
again tomorrow (likely)

Thias


was (Author: matthias schwalbe):
Thanks [~masteryhx] ,

... Jira somehow swallowed my responses (me confused :)), I'll take of it again 
tomorrow (likely)

Thias

> State Processor API: Loading a state set buffers the whole state set in 
> memory before starting to process
> -
>
> Key: FLINK-26585
> URL: https://issues.apache.org/jira/browse/FLINK-26585
> Project: Flink
>  Issue Type: Improvement
>  Components: API / State Processor
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Matthias Schwalbe
>Assignee: Matthias Schwalbe
>Priority: Major
>  Labels: pull-request-available
> Attachments: MultiStateKeyIteratorNoStreams.java
>
>
> * When loading a state, MultiStateKeyIterator loads and buffers the whole 
> state in memory before it even processes a single data point 
>  ** This is absolutely no problem for small state (hence the unit tests work 
> fine)
>  ** The MultiStateKeyIterator ctor sets up a java Stream that iterates all 
> state descriptors and flattens all datapoints contained within
>  ** The java.util.stream.Stream#flatMap function causes the buffering of the 
> whole data set when enumerated later on
>  ** See call stack [1] 
>  *** In our case this is 150e6 data points (> 1GiB just for the pointers to 
> the data, let alone the data itself ~30GiB)
>  ** I’m not aware of any way to instrument Stream so as to avoid the 
> problem, hence
>  ** I coded an alternative implementation of MultiStateKeyIterator that 
> avoids using java Stream,
>  ** I can contribute our implementation (MultiStateKeyIteratorNoStreams)
> [1]
> Streams call stack:
> hasNext:77, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> next:82, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> forEachRemaining:116, Iterator (java.util)
> forEachRemaining:1801, Spliterators$IteratorSpliterator (java.util)
> forEach:580, ReferencePipeline$Head (java.util.stream)
> accept:270, ReferencePipeline$7$1 (java.util.stream)  
>  # <R> Stream<R> flatMap(final Function<? super P_OUT, ? extends Stream<? extends R>> var1)
> accept:373, ReferencePipeline$11$1 (java.util.stream) 
>  # Stream<P_OUT> peek(final Consumer<? super P_OUT> var1)
> accept:193, ReferencePipeline$3$1 (java.util.stream)      
>  # <R> Stream<R> map(final Function<? super P_OUT, ? extends R> var1)
> tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util)
> lambda$initPartialTraversalState$0:294, 
> StreamSpliterators$WrappingSpliterator (java.util.stream)
> getAsBoolean:-1, 1528195520 
> (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57)
> fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream)
> hasNext:681, Spliterators$1Adapter (java.util)
> hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input)
> hasNext:162, KeyedStateReaderOperator$NamespaceDecorator 
> (org.apache.flink.state.api.input.operator)
> reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input)
> invoke:191, DataSourceTask (org.apache.flink.runtime.operators)
> doRun:776, Task (org.apache.flink.runtime.taskmanager)
> run:563, Task (org.apache.flink.runtime.taskmanager)
> run:748, Thread (java.lang)
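
For illustration, here is a minimal, self-contained sketch of the buffering 
behavior described above (not Flink code; the descriptor name and sizes are 
made up). On common JDK versions, including JDK 8, consuming a flatMapped 
Stream through its iterator drains the whole inner stream into an internal 
buffer before the first element is returned:

{code:java}
import java.util.Iterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class FlatMapBufferingDemo {
    public static void main(String[] args) {
        // One "state descriptor" whose inner stream of keys is large and lazy.
        Stream<Integer> keys = Stream.of("state-a")
                .flatMap(desc -> IntStream.range(0, 1_000_000)
                        .peek(i -> {
                            if (i % 250_000 == 0) {
                                System.out.println(desc + " produced " + i);
                            }
                        })
                        .boxed());

        // MultiStateKeyIterator consumes such a stream via its iterator; with
        // flatMap, fetching the FIRST element already pushes all 1,000,000
        // inner elements into the spliterator's buffer (see fillBuffer in the
        // call stack above).
        Iterator<Integer> it = keys.iterator();
        System.out.println("first key = " + it.next());
    }
}
{code}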



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process

2023-09-25 Thread Matthias Schwalbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768751#comment-17768751
 ] 

Matthias Schwalbe commented on FLINK-26585:
---

Thanks [~masteryhx] ,

... Jira somehow swallowed my responses (me confused :)), I'll take care of it 
again tomorrow (likely)

Thias

> State Processor API: Loading a state set buffers the whole state set in 
> memory before starting to process
> -
>
> Key: FLINK-26585
> URL: https://issues.apache.org/jira/browse/FLINK-26585
> Project: Flink
>  Issue Type: Improvement
>  Components: API / State Processor
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Matthias Schwalbe
>Assignee: Matthias Schwalbe
>Priority: Major
>  Labels: pull-request-available
> Attachments: MultiStateKeyIteratorNoStreams.java
>
>
> * When loading a state, MultiStateKeyIterator loads and buffers the whole 
> state in memory before it even processes a single data point 
>  ** This is absolutely no problem for small state (hence the unit tests work 
> fine)
>  ** The MultiStateKeyIterator ctor sets up a java Stream that iterates all 
> state descriptors and flattens all datapoints contained within
>  ** The java.util.stream.Stream#flatMap function causes the buffering of the 
> whole data set when enumerated later on
>  ** See call stack [1] 
>  *** In our case this is 150e6 data points (> 1GiB just for the pointers to 
> the data, let alone the data itself ~30GiB)
>  ** I’m not aware of any way to instrument Stream so as to avoid the 
> problem, hence
>  ** I coded an alternative implementation of MultiStateKeyIterator that 
> avoids using java Stream,
>  ** I can contribute our implementation (MultiStateKeyIteratorNoStreams)
> [1]
> Streams call stack:
> hasNext:77, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> next:82, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> forEachRemaining:116, Iterator (java.util)
> forEachRemaining:1801, Spliterators$IteratorSpliterator (java.util)
> forEach:580, ReferencePipeline$Head (java.util.stream)
> accept:270, ReferencePipeline$7$1 (java.util.stream)  
>  # <R> Stream<R> flatMap(final Function<? super P_OUT, ? extends Stream<? extends R>> var1)
> accept:373, ReferencePipeline$11$1 (java.util.stream) 
>  # Stream<P_OUT> peek(final Consumer<? super P_OUT> var1)
> accept:193, ReferencePipeline$3$1 (java.util.stream)      
>  # <R> Stream<R> map(final Function<? super P_OUT, ? extends R> var1)
> tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util)
> lambda$initPartialTraversalState$0:294, 
> StreamSpliterators$WrappingSpliterator (java.util.stream)
> getAsBoolean:-1, 1528195520 
> (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57)
> fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream)
> hasNext:681, Spliterators$1Adapter (java.util)
> hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input)
> hasNext:162, KeyedStateReaderOperator$NamespaceDecorator 
> (org.apache.flink.state.api.input.operator)
> reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input)
> invoke:191, DataSourceTask (org.apache.flink.runtime.operators)
> doRun:776, Task (org.apache.flink.runtime.taskmanager)
> run:563, Task (org.apache.flink.runtime.taskmanager)
> run:748, Thread (java.lang)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33150) add the processing logic for the long type

2023-09-25 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-33150:
---
Fix Version/s: (was: 1.15.4)

>  add the processing logic for the long type
> ---
>
> Key: FLINK-33150
> URL: https://issues.apache.org/jira/browse/FLINK-33150
> Project: Flink
>  Issue Type: New Feature
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.15.4
>Reporter: wenhao.yu
>Priority: Minor
>
> The AvroToRowDataConverters class has a convertToDate method that reports an 
> error when it encounters time data represented by the long type, so add code 
> to handle the long type, as sketched below.
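
A minimal sketch of the requested handling (illustrative only, not the actual 
AvroToRowDataConverters code; treating a long as epoch milliseconds is an 
assumption taken from the reporter's use case, not something the Avro spec 
guarantees):

{code:java}
import java.time.LocalDate;
import java.util.concurrent.TimeUnit;

// Illustrative only: convert an Avro date value to the internal
// "days since epoch" int, additionally accepting a long that is
// assumed to hold epoch milliseconds.
static int convertToDate(Object object) {
    if (object instanceof Integer) {
        return (Integer) object; // already days since epoch
    } else if (object instanceof Long) {
        return (int) TimeUnit.MILLISECONDS.toDays((Long) object);
    } else if (object instanceof LocalDate) {
        return (int) ((LocalDate) object).toEpochDay();
    }
    throw new IllegalArgumentException(
            "Unexpected object type for DATE logical type: " + object.getClass());
}
{code}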



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] jgagnon1 commented on pull request #23453: [FLINK-33128] Add converter.open() method call on TestValuesRuntimeFunctions

2023-09-25 Thread via GitHub


jgagnon1 commented on PR #23453:
URL: https://github.com/apache/flink/pull/23453#issuecomment-1733850068

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] snuyanzin commented on pull request #23458: [FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


snuyanzin commented on PR #23458:
URL: https://github.com/apache/flink/pull/23458#issuecomment-1733693205

   @RyanSkraba thanks for the contribution
   looks ok from my side
   could you please also provide backports to 1.16.x, 1.17.x, 1.18.x ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768679#comment-17768679
 ] 

Yang Wang commented on FLINK-33155:
---

Thanks for your comments.

 

> Changing the default behavior from file to UGI can be a breaking change to 
> users who depend on it in some way

What I mean is to get the delegation tokens from the UGI instead of reading 
them from the file, just like we have already done in the 
{{YarnClusterDescriptor}} [1]. I am not sure why this would be a breaking 
change because the tokens in the {{ContainerLaunchContext}} are just the same.

 

[1]. 
[https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L1334]
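
A minimal sketch of that approach when preparing the TaskManager's launch 
context (the helper method and its surrounding wiring are hypothetical and 
only illustrate the idea; this is not the actual Flink code):

{code:java}
import java.nio.ByteBuffer;

import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

final class TokenAttachmentSketch {

    /**
     * Hypothetical helper: attach the current (renewed) tokens held by the
     * UGI instead of re-reading the possibly expired token file referenced
     * by HADOOP_TOKEN_FILE_LOCATION.
     */
    static void attachCurrentTokens(ContainerLaunchContext ctx) throws Exception {
        Credentials credentials =
                UserGroupInformation.getCurrentUser().getCredentials();
        DataOutputBuffer dob = new DataOutputBuffer();
        credentials.writeTokenStorageToStream(dob);
        ctx.setTokens(ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
    }
}
{code}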

> Flink ResourceManager continuously fails to start TM container on YARN when 
> Kerberos enabled
> 
>
> Key: FLINK-33155
> URL: https://issues.apache.org/jira/browse/FLINK-33155
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Yang Wang
>Priority: Major
>
> When Kerberos is enabled (with a keytab) and after one day (when the 
> container token has expired), Flink fails to create the TaskManager container 
> on YARN due to the following exception.
>  
> {code:java}
> 2023-09-25 16:48:50,030 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
> Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
> [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
> owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
>  renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
> sequenceNumber=12, masterKeyId=3) can't be found in cache
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, 
> realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, 
> masterKeyId=3) can't be found in cache
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1491)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1388)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
>     at 
> org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
>     at 
> 

[jira] [Commented] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions

2023-09-25 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768680#comment-17768680
 ] 

Martijn Visser commented on FLINK-28303:


[~tanjialiang] It would be good to double-check this in the latest Kafka 
connector code; perhaps it's already addressed, given the work done in 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source
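
If it turns out not to be addressed, a user-side mitigation is to enable 
periodic partition discovery and give newly discovered (previously empty) 
partitions an explicit starting offset. A sketch with the DataStream 
{{KafkaSource}} (broker address, topic, and group id are placeholders):

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")   // placeholder
        .setTopics("my-topic")                // placeholder
        .setGroupId("my-group")               // placeholder
        // Use "earliest" rather than "latest" so partitions that had no
        // offset in the savepoint do not silently skip records.
        .setStartingOffsets(OffsetsInitializer.earliest())
        // Periodically discover partitions (every 10s here).
        .setProperty("partition.discovery.interval.ms", "10000")
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();
{code}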

> Kafka SQL Connector loses data when restoring from a savepoint with a topic 
> with empty partitions
> -
>
> Key: FLINK-28303
> URL: https://issues.apache.org/jira/browse/FLINK-28303
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Robert Metzger
>Priority: Major
>
> Steps to reproduce:
> - Set up a Kafka topic with 10 partitions
> - produce records 0-9 into the topic
> - take a savepoint and stop the job
> - produce records 10-19 into the topic
> - restore the job from the savepoint.
> The job will usually be missing 2-4 records from 10-19.
> My assumption is that if a partition never had data (which is likely with 10 
> partitions and 10 records), the savepoint will only contain offsets for 
> partitions with data. 
> While the job was offline (and we've written records 10-19 into the topic), 
> all partitions got filled. Now, when the job comes online again, it will use 
> the "latest" offset for those partitions, skipping some data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Gabor Somogyi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768665#comment-17768665
 ] 

Gabor Somogyi edited comment on FLINK-33155 at 9/25/23 12:20 PM:
-

Not updating UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION is a known 
limitation of YARN.

If the mentioned code runs on the JM side and delegation tokens are enabled, 
then it makes sense, since the JM keeps its tokens up-to-date all the time.

A couple of notes:
 * Changing the default behavior from file to UGI can be a breaking change to 
users who depend on it in some way
 * DT handling is a single-threaded operation, but as I see it, TM creation 
uses multiple threads, which may end up in undefined behavior


was (Author: gaborgsomogyi):
Not updating UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION is a known 
limitation of YARN.

If the mentioned code runs on the JM side and delegation tokens are enabled 
then it makes sense since the JM keeps it's tokens up-to-date all the time.

Couple of notes:
 * Changing the default behavior from file to UGI can be a breaking change to 
users which are depending on that some way...

> Flink ResourceManager continuously fails to start TM container on YARN when 
> Kerberos enabled
> 
>
> Key: FLINK-33155
> URL: https://issues.apache.org/jira/browse/FLINK-33155
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Yang Wang
>Priority: Major
>
> When Kerberos is enabled (with a keytab) and after one day (when the 
> container token has expired), Flink fails to create the TaskManager container 
> on YARN due to the following exception.
>  
> {code:java}
> 2023-09-25 16:48:50,030 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
> Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
> [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
> owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
>  renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
> sequenceNumber=12, masterKeyId=3) can't be found in cache
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, 
> realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, 
> masterKeyId=3) can't be found in cache
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1491)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1388)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
>     at 
> org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
>     at 

[jira] [Commented] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Gabor Somogyi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768665#comment-17768665
 ] 

Gabor Somogyi commented on FLINK-33155:
---

Not updating UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION is a known 
limitation of YARN.

If the mentioned code runs on the JM side and delegation tokens are enabled, 
then it makes sense, since the JM keeps its tokens up-to-date all the time.

A couple of notes:
 * Changing the default behavior from file to UGI can be a breaking change to 
users who depend on it in some way...

> Flink ResourceManager continuously fails to start TM container on YARN when 
> Kerberos enabled
> 
>
> Key: FLINK-33155
> URL: https://issues.apache.org/jira/browse/FLINK-33155
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Yang Wang
>Priority: Major
>
> When Kerberos is enabled (with a keytab) and after one day (when the 
> container token has expired), Flink fails to create the TaskManager container 
> on YARN due to the following exception.
>  
> {code:java}
> 2023-09-25 16:48:50,030 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
> Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
> [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
> owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
>  renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
> sequenceNumber=12, masterKeyId=3) can't be found in cache
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, 
> realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, 
> masterKeyId=3) can't be found in cache
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1491)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1388)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
>     at 
> org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
>     at 
> 

[jira] [Updated] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-33155:
--
Description: 
When Kerberos is enabled (with a keytab) and after one day (when the container 
token has expired), Flink fails to create the TaskManager container on YARN due 
to the following exception.

 
{code:java}
2023-09-25 16:48:50,030 INFO  
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
[2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
 renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
sequenceNumber=12, masterKeyId=3) can't be found in cache
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, 
realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, 
masterKeyId=3) can't be found in cache
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
    at org.apache.hadoop.ipc.Client.call(Client.java:1491)
    at org.apache.hadoop.ipc.Client.call(Client.java:1388)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
    at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
    at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750) {code}
The root cause might be that we are reading the delegation token from the JM 
local file [1]. It expires after one day. When the old TaskManager container 
crashes and the ResourceManager tries to create a new one, the YARN NodeManager 
will use the expired token to localize the resources for the TaskManager and 
then fail.

Instead, we could read the latest valid token from 

[jira] [Updated] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-33155:
--
Description: 
When Kerberos is enabled (with a keytab) and after one day (when the container 
token has expired), Flink fails to create the TaskManager container on YARN due 
to the following exception.

 
{code:java}
2023-09-25 16:48:50,030 INFO  
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
[2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
 renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
sequenceNumber=12, masterKeyId=3) can't be found in cache
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, 
realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, 
masterKeyId=3) can't be found in cache
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
    at org.apache.hadoop.ipc.Client.call(Client.java:1491)
    at org.apache.hadoop.ipc.Client.call(Client.java:1388)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
    at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
    at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750) {code}
The root cause might be that we are reading the delegation token from the JM 
local file [1]. It expires after one day. When the old TaskManager container 
crashes and the ResourceManager tries to create a new one, the YARN NodeManager 
will use the expired token to localize the resources for the TaskManager and 
then fail.

 

[1]. 

[jira] [Updated] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-33155:
--
Component/s: Deployment / YARN

> Flink ResourceManager continuously fails to start TM container on YARN when 
> Kerberos enabled
> 
>
> Key: FLINK-33155
> URL: https://issues.apache.org/jira/browse/FLINK-33155
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Yang Wang
>Priority: Major
>
> When Kerberos is enabled (with a keytab) and after one day (when the 
> container token has expired), Flink fails to create the TaskManager container 
> on YARN due to the following exception.
>  
> {code:java}
> 2023-09-25 16:48:50,030 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
> Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
> [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
> owner=, renewer=, realUser=, issueDate=1695196431487, 
> maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in 
> cache
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hadoop: HDFS_DELEGATION_TOKEN 
> owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
>  renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
> sequenceNumber=12, masterKeyId=3) can't be found in cache
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1491)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1388)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
>     at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at 

[jira] [Created] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled

2023-09-25 Thread Yang Wang (Jira)
Yang Wang created FLINK-33155:
-

 Summary: Flink ResourceManager continuously fails to start TM 
container on YARN when Kerberos enabled
 Key: FLINK-33155
 URL: https://issues.apache.org/jira/browse/FLINK-33155
 Project: Flink
  Issue Type: Bug
Reporter: Yang Wang


When Kerberos is enabled (with a keytab) and after one day (when the container 
token has expired), Flink fails to create the TaskManager container on YARN due 
to the following exception.

 
{code:java}
2023-09-25 16:48:50,030 INFO  
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: 
Container container_1695106898104_0003_01_69 was invalid. Diagnostics: 
[2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN 
owner=, renewer=, realUser=, issueDate=1695196431487, 
maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in 
cache
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 token (token for hadoop: HDFS_DELEGATION_TOKEN 
owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com,
 renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, 
sequenceNumber=12, masterKeyId=3) can't be found in cache
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)
    at org.apache.hadoop.ipc.Client.call(Client.java:1491)
    at org.apache.hadoop.ipc.Client.call(Client.java:1388)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
    at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750) {code}
The root cause might be that we are reading the delegation token from the JM local file[1]. It will expire after one day. When the old TaskManager container crashes and the ResourceManager tries to create a new one, localizing the new container fails with the InvalidToken error above, because the token read from that file has already expired.
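As a quick sanity check on the timeline (my own arithmetic, not from the report): the issueDate/maxDate epoch millis in the exception above put the token's hard lifetime at exactly 7 days, while HDFS renews delegation tokens on a 24-hour interval by default (dfs.namenode.delegation.token.renew-interval), which matches the "expires after one day" observation.

{code:java}
import java.time.Duration;
import java.time.Instant;

public class TokenLifetimeCheck {
    public static void main(String[] args) {
        // Values copied from the InvalidToken message above
        Instant issueDate = Instant.ofEpochMilli(1695196431487L);
        Instant maxDate = Instant.ofEpochMilli(1695801231487L);
        // Hard lifetime of the token: maxDate - issueDate
        System.out.println(Duration.between(issueDate, maxDate).toDays() + " days"); // prints: 7 days
    }
}
{code}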

[jira] [Commented] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions

2023-09-25 Thread tanjialiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768637#comment-17768637
 ] 

tanjialiang commented on FLINK-28303:
-

Maybe this is the reason? [FLINK-33153|https://issues.apache.org/jira/browse/FLINK-33153]

> Kafka SQL Connector loses data when restoring from a savepoint with a topic 
> with empty partitions
> -
>
> Key: FLINK-28303
> URL: https://issues.apache.org/jira/browse/FLINK-28303
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Robert Metzger
>Priority: Major
>
> Steps to reproduce:
> - Set up a Kafka topic with 10 partitions
> - produce records 0-9 into the topic
> - take a savepoint and stop the job
> - produce records 10-19 into the topic
> - restore the job from the savepoint.
> The job will usually be missing 2-4 records from 10-19.
> My assumption is that if a partition never had data (which is likely with 10 
> partitions and 10 records), the savepoint will only contain offsets for 
> partitions with data. 
> While the job was offline (and we've written record 10-19 into the topic), 
> all partitions got filled. Now, when Kafka comes online again, it will use 
> the "latest" offset for those partitions, skipping some data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33154) flink on k8s,An error occurred during consuming rocketmq

2023-09-25 Thread Monody (Jira)
Monody created FLINK-33154:
--

 Summary: flink on k8s,An error occurred during consuming rocketmq
 Key: FLINK-33154
 URL: https://issues.apache.org/jira/browse/FLINK-33154
 Project: Flink
  Issue Type: Technical Debt
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.1.0
 Environment: 
flink-kubernetes-operator:https://github.com/apache/flink-kubernetes-operator#current-api-version-v1beta1
rocketmq-flink:https://github.com/apache/rocketmq-flink
Reporter: Monody


The following error occurs when Flink consumes RocketMQ. The Flink job is running on k8s; the projects used to consume RocketMQ are listed in the Environment field above.
The Flink job runs normally on YARN, and no abnormality is found on the RocketMQ server. Why does this happen, and how can it be solved?
!https://user-images.githubusercontent.com/47728686/265662530-231c500c-fd64-4679-9b0f-ff4a025dd766.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33153) Kafka using latest-offset maybe missing data

2023-09-25 Thread tanjialiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768631#comment-17768631
 ] 

tanjialiang commented on FLINK-33153:
-

It relates to https://issues.apache.org/jira/browse/FLINK-28303, so I am closing this issue.

> Kafka using latest-offset maybe missing data
> 
>
> Key: FLINK-33153
> URL: https://issues.apache.org/jira/browse/FLINK-33153
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: kafka-4.1.0
>Reporter: tanjialiang
>Priority: Minor
>
> When Kafka starts with the latest-offset strategy, it does not fetch the
> latest snapshot offset and specify it for consumption. Instead, it sets the
> startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which makes
> currentOffset = -1 and calls the KafkaConsumer's seekToEnd API. The
> currentOffset is only set to the consumed offset + 1 when the task consumes
> data, and this currentOffset is stored in the state during checkpointing. If
> there are very few messages in Kafka and a partition has not consumed any
> data, and I stop the task with a savepoint, then write data to that
> partition, and start the task with the savepoint, the task will resume from
> the saved state. Because the startingOffset in the state is -1, the task
> will miss the data that was written before the recovery point.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33153) Kafka using latest-offset maybe missing data

2023-09-25 Thread tanjialiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tanjialiang updated FLINK-33153:

External issue URL:   (was: 
https://issues.apache.org/jira/browse/FLINK-28303)

> Kafka using latest-offset maybe missing data
> 
>
> Key: FLINK-33153
> URL: https://issues.apache.org/jira/browse/FLINK-33153
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: kafka-4.1.0
>Reporter: tanjialiang
>Priority: Minor
>
> When Kafka starts with the latest-offset strategy, it does not fetch the
> latest snapshot offset and specify it for consumption. Instead, it sets the
> startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which makes
> currentOffset = -1 and calls the KafkaConsumer's seekToEnd API. The
> currentOffset is only set to the consumed offset + 1 when the task consumes
> data, and this currentOffset is stored in the state during checkpointing. If
> there are very few messages in Kafka and a partition has not consumed any
> data, and I stop the task with a savepoint, then write data to that
> partition, and start the task with the savepoint, the task will resume from
> the saved state. Because the startingOffset in the state is -1, the task
> will miss the data that was written before the recovery point.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33153) Kafka using latest-offset maybe missing data

2023-09-25 Thread tanjialiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tanjialiang updated FLINK-33153:

External issue URL: https://issues.apache.org/jira/browse/FLINK-28303
  Release Note:   (was: relate to 
https://issues.apache.org/jira/browse/FLINK-28303)

> Kafka using latest-offset maybe missing data
> 
>
> Key: FLINK-33153
> URL: https://issues.apache.org/jira/browse/FLINK-33153
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: kafka-4.1.0
>Reporter: tanjialiang
>Priority: Minor
>
> When Kafka starts with the latest-offset strategy, it does not fetch the
> latest snapshot offset and specify it for consumption. Instead, it sets the
> startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which makes
> currentOffset = -1 and calls the KafkaConsumer's seekToEnd API. The
> currentOffset is only set to the consumed offset + 1 when the task consumes
> data, and this currentOffset is stored in the state during checkpointing. If
> there are very few messages in Kafka and a partition has not consumed any
> data, and I stop the task with a savepoint, then write data to that
> partition, and start the task with the savepoint, the task will resume from
> the saved state. Because the startingOffset in the state is -1, the task
> will miss the data that was written before the recovery point.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33153) Kafka using latest-offset maybe missing data

2023-09-25 Thread tanjialiang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tanjialiang closed FLINK-33153.
---
Release Note: relate to https://issues.apache.org/jira/browse/FLINK-28303
  Resolution: Duplicate

> Kafka using latest-offset maybe missing data
> 
>
> Key: FLINK-33153
> URL: https://issues.apache.org/jira/browse/FLINK-33153
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: kafka-4.1.0
>Reporter: tanjialiang
>Priority: Minor
>
> When Kafka starts with the latest-offset strategy, it does not fetch the
> latest snapshot offset and specify it for consumption. Instead, it sets the
> startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which makes
> currentOffset = -1 and calls the KafkaConsumer's seekToEnd API. The
> currentOffset is only set to the consumed offset + 1 when the task consumes
> data, and this currentOffset is stored in the state during checkpointing. If
> there are very few messages in Kafka and a partition has not consumed any
> data, and I stop the task with a savepoint, then write data to that
> partition, and start the task with the savepoint, the task will resume from
> the saved state. Because the startingOffset in the state is -1, the task
> will miss the data that was written before the recovery point.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] FangYongs commented on a diff in pull request #23455: [FLINK-25015][Table SQL/Client] change job name to sql when submitting queries using client

2023-09-25 Thread via GitHub


FangYongs commented on code in PR #23455:
URL: https://github.com/apache/flink/pull/23455#discussion_r1335703134


##
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java:
##
@@ -1050,10 +1052,15 @@ private TableResultInternal executeInternal(
     }
 
     private TableResultInternal executeQueryOperation(QueryOperation operation) {
+        String querySql = null;
+        if (operation instanceof QuerySqlOperation) {
+            querySql = ((QuerySqlOperation) operation).getQuerySql();
+        }
         CollectModifyOperation sinkOperation = new CollectModifyOperation(operation);
         List<Transformation<?>> transformations =
                 translate(Collections.singletonList(sinkOperation));
-        final String defaultJobName = "collect";
+        final String defaultJobName =
+                StringUtils.isNullOrWhitespaceOnly(querySql) ? "collect" : querySql;

Review Comment:
   Please add unit tests to check that the job name is actually modified.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] FangYongs commented on a diff in pull request #23455: [FLINK-25015][Table SQL/Client] change job name to sql when submitting queries using client

2023-09-25 Thread via GitHub


FangYongs commented on code in PR #23455:
URL: https://github.com/apache/flink/pull/23455#discussion_r1335701445


##
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java:
##
@@ -1050,10 +1052,15 @@ private TableResultInternal executeInternal(
     }
 
     private TableResultInternal executeQueryOperation(QueryOperation operation) {
+        String querySql = null;
+        if (operation instanceof QuerySqlOperation) {
+            querySql = ((QuerySqlOperation) operation).getQuerySql();
+        }
         CollectModifyOperation sinkOperation = new CollectModifyOperation(operation);
         List<Transformation<?>> transformations =
                 translate(Collections.singletonList(sinkOperation));
-        final String defaultJobName = "collect";
+        final String defaultJobName =

Review Comment:
   The job name will be used in the checkpoint path; is it OK when there are some special characters in the SQL statement?
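   One possible way to sidestep that concern, as a sketch only (not part of this PR; the helper name and the length limit are made up), is to normalize the statement before using it as the job name:

   ```java
   // Hypothetical helper: collapse whitespace/control characters and truncate,
   // since the job name may end up embedded in paths such as checkpoint directories.
   private static String jobNameFromSql(String querySql, String fallback) {
       if (querySql == null || querySql.trim().isEmpty()) {
           return fallback; // keep the old default, e.g. "collect"
       }
       String normalized = querySql.replaceAll("\\s+", " ").trim();
       return normalized.length() <= 120 ? normalized : normalized.substring(0, 120);
   }
   ```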



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33153) Kafka using latest-offset maybe missing data

2023-09-25 Thread tanjialiang (Jira)
tanjialiang created FLINK-33153:
---

 Summary: Kafka using latest-offset maybe missing data
 Key: FLINK-33153
 URL: https://issues.apache.org/jira/browse/FLINK-33153
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: kafka-4.1.0
Reporter: tanjialiang


When Kafka starts with the latest-offset strategy, it does not fetch the latest 
snapshot offset and specify it for consumption. Instead, it sets the 
startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which makes 
currentOffset = -1 and calls the KafkaConsumer's seekToEnd API. The 
currentOffset is only set to the consumed offset + 1 when the task consumes 
data, and this currentOffset is stored in the state during checkpointing. If 
there are very few messages in Kafka and a partition has not consumed any data, 
and I stop the task with a savepoint, then write data to that partition, and 
start the task with the savepoint, the task will resume from the saved state. 
Because the startingOffset in the state is -1, the task will miss the data that 
was written before the recovery point.
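To make the sentinel behaviour concrete, a minimal sketch (topic and partition number are placeholders) using the split type named above: until the first record is consumed, the split still carries the -1 sentinel rather than a concrete offset, and that sentinel is what a savepoint captures.

{code:java}
import org.apache.flink.connector.kafka.source.split.KafkaPartitionSplit;
import org.apache.kafka.common.TopicPartition;

public class SentinelOffsetSketch {
    public static void main(String[] args) {
        TopicPartition tp = new TopicPartition("test-topic", 3); // placeholder
        KafkaPartitionSplit split =
                new KafkaPartitionSplit(tp, KafkaPartitionSplit.LATEST_OFFSET);
        // Still -1 (the LATEST_OFFSET sentinel) until data is consumed; a savepoint
        // taken now stores -1, so restore seeks to the end of the partition again.
        System.out.println(split.getStartingOffset());
    }
}
{code}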



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33127) HeapKeyedStateBackend: use buffered I/O to speed up local recovery

2023-09-25 Thread Hangxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768610#comment-17768610
 ] 

Hangxiang Yu commented on FLINK-33127:
--

Actually, I did a second review last month but have not received your response 
until now.

Of course, I'm fine with focusing on FLINK-26585 first.

Just kindly pinging about the duplicated ticket.

> HeapKeyedStateBackend: use buffered I/O to speed up local recovery
> --
>
> Key: FLINK-33127
> URL: https://issues.apache.org/jira/browse/FLINK-33127
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Yangyang ZHANG
>Assignee: Yangyang ZHANG
>Priority: Major
> Attachments: thread_dump.png
>
>
> Recently, I observed a slow restore case in local recovery using the hashmap 
> statebackend.
> It took 147 seconds to restore from a 467MB snapshot, 9 times slower than 
> restoring from remote fs (16s).
> The thread dump shows that it reads the local snapshot file directly through 
> an unbuffered FileInputStream / fs.local.LocalDataInputStream.
> !thread_dump.png!
> Maybe we can wrap it with a BufferedInputStream to speed up local recovery.
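A minimal sketch of the proposed wrapping (path and buffer size are placeholders; the real stream would come from the local recovery directory of the state backend):

{code:java}
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public final class BufferedLocalRestoreSketch {
    // Large buffered sequential reads instead of many small unbuffered ones.
    static InputStream openLocalSnapshot(String localSnapshotPath) throws IOException {
        return new BufferedInputStream(new FileInputStream(localSnapshotPath), 64 * 1024);
    }
}
{code}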



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process

2023-09-25 Thread Hangxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768607#comment-17768607
 ] 

Hangxiang Yu commented on FLINK-26585:
--

[~Matthias Schwalbe] I did a second review last month but have not received a 
response from you.

You can check my latest comments from last month.

> State Processor API: Loading a state set buffers the whole state set in 
> memory before starting to process
> -
>
> Key: FLINK-26585
> URL: https://issues.apache.org/jira/browse/FLINK-26585
> Project: Flink
>  Issue Type: Improvement
>  Components: API / State Processor
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Matthias Schwalbe
>Assignee: Matthias Schwalbe
>Priority: Major
>  Labels: pull-request-available
> Attachments: MultiStateKeyIteratorNoStreams.java
>
>
> * When loading a state, MultiStateKeyIterator loads and buffers the whole 
> state in memory before it even processes a single data point 
>  ** This is absolutely no problem for small state (hence the unit tests work 
> fine)
>  ** The MultiStateKeyIterator ctor sets up a java Stream that iterates all state 
> descriptors and flattens all datapoints contained within
>  ** The java.util.stream.Stream#flatMap function causes the buffering of the 
> whole data set when enumerated later on
>  ** See call stack [1] 
>  *** In our case this is 150e6 data points (> 1GiB just for the pointers to 
> the data, let alone the data itself ~30GiB)
>  ** I'm not aware of any way to instrument Stream in order to avoid the 
> problem, hence
>  ** I coded an alternative implementation of MultiStateKeyIterator that 
> avoids using java Stream,
>  ** I can contribute our implementation (MultiStateKeyIteratorNoStreams)
> [1]
> Streams call stack:
> hasNext:77, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> next:82, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> forEachRemaining:116, Iterator (java.util)
> forEachRemaining:1801, Spliterators$IteratorSpliterator (java.util)
> forEach:580, ReferencePipeline$Head (java.util.stream)
> accept:270, ReferencePipeline$7$1 (java.util.stream)
>  # Stream<R> flatMap(final Function<? super P_OUT, ? extends Stream<? extends R>> var1)
> accept:373, ReferencePipeline$11$1 (java.util.stream)
>  # Stream<P_OUT> peek(final Consumer<? super P_OUT> var1)
> accept:193, ReferencePipeline$3$1 (java.util.stream)
>  # Stream<R> map(final Function<? super P_OUT, ? extends R> var1)
> tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util)
> lambda$initPartialTraversalState$0:294, 
> StreamSpliterators$WrappingSpliterator (java.util.stream)
> getAsBoolean:-1, 1528195520 
> (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57)
> fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream)
> hasNext:681, Spliterators$1Adapter (java.util)
> hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input)
> hasNext:162, KeyedStateReaderOperator$NamespaceDecorator 
> (org.apache.flink.state.api.input.operator)
> reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input)
> invoke:191, DataSourceTask (org.apache.flink.runtime.operators)
> doRun:776, Task (org.apache.flink.runtime.taskmanager)
> run:563, Task (org.apache.flink.runtime.taskmanager)
> run:748, Thread (java.lang)
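A self-contained demo (not Flink code; sizes are arbitrary) of the flatMap buffering described above: when a flatMapped stream is consumed through an Iterator, the whole inner stream is drained into a buffer before the first element is handed out (behaviour as observed on the Java 8/11 era JDKs Flink runs on).

{code:java}
import java.util.Iterator;
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class FlatMapBufferingDemo {
    public static void main(String[] args) {
        Iterator<Long> it = Stream.of(1)
                .flatMap(x -> LongStream.range(0, 1_000_000L)
                        .peek(i -> { if (i % 250_000 == 0) System.out.println("produced " + i); })
                        .boxed())
                .iterator();
        // All "produced ..." lines print before the first element is returned,
        // i.e. the inner stream of one million elements is buffered eagerly:
        System.out.println("first element: " + it.next());
    }
}
{code}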



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process

2023-09-25 Thread Matthias Schwalbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768604#comment-17768604
 ] 

Matthias Schwalbe commented on FLINK-26585:
---

Hi [~masteryhx] ,

The PR [#23239|https://github.com/apache/flink/pull/23239] has been stuck 
for a while now, awaiting a second review. May I kindly ask you to take care of 
that?

I believe I answered all open questions...

 

Sincere greetings

Thias

> State Processor API: Loading a state set buffers the whole state set in 
> memory before starting to process
> -
>
> Key: FLINK-26585
> URL: https://issues.apache.org/jira/browse/FLINK-26585
> Project: Flink
>  Issue Type: Improvement
>  Components: API / State Processor
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Matthias Schwalbe
>Assignee: Matthias Schwalbe
>Priority: Major
>  Labels: pull-request-available
> Attachments: MultiStateKeyIteratorNoStreams.java
>
>
> * When loading a state, MultiStateKeyIterator loads and buffers the whole 
> state in memory before it even processes a single data point 
>  ** This is absolutely no problem for small state (hence the unit tests work 
> fine)
>  ** The MultiStateKeyIterator ctor sets up a java Stream that iterates all state 
> descriptors and flattens all datapoints contained within
>  ** The java.util.stream.Stream#flatMap function causes the buffering of the 
> whole data set when enumerated later on
>  ** See call stack [1] 
>  *** In our case this is 150e6 data points (> 1GiB just for the pointers to 
> the data, let alone the data itself ~30GiB)
>  ** I'm not aware of any way to instrument Stream in order to avoid the 
> problem, hence
>  ** I coded an alternative implementation of MultiStateKeyIterator that 
> avoids using java Stream,
>  ** I can contribute our implementation (MultiStateKeyIteratorNoStreams)
> [1]
> Streams call stack:
> hasNext:77, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> next:82, RocksStateKeysIterator 
> (org.apache.flink.contrib.streaming.state.iterator)
> forEachRemaining:116, Iterator (java.util)
> forEachRemaining:1801, Spliterators$IteratorSpliterator (java.util)
> forEach:580, ReferencePipeline$Head (java.util.stream)
> accept:270, ReferencePipeline$7$1 (java.util.stream)
>  # Stream<R> flatMap(final Function<? super P_OUT, ? extends Stream<? extends R>> var1)
> accept:373, ReferencePipeline$11$1 (java.util.stream)
>  # Stream<P_OUT> peek(final Consumer<? super P_OUT> var1)
> accept:193, ReferencePipeline$3$1 (java.util.stream)
>  # Stream<R> map(final Function<? super P_OUT, ? extends R> var1)
> tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util)
> lambda$initPartialTraversalState$0:294, 
> StreamSpliterators$WrappingSpliterator (java.util.stream)
> getAsBoolean:-1, 1528195520 
> (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57)
> fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator 
> (java.util.stream)
> tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream)
> hasNext:681, Spliterators$1Adapter (java.util)
> hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input)
> hasNext:162, KeyedStateReaderOperator$NamespaceDecorator 
> (org.apache.flink.state.api.input.operator)
> reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input)
> invoke:191, DataSourceTask (org.apache.flink.runtime.operators)
> doRun:776, Task (org.apache.flink.runtime.taskmanager)
> run:563, Task (org.apache.flink.runtime.taskmanager)
> run:748, Thread (java.lang)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33127) HeapKeyedStateBackend: use buffered I/O to speed up local recovery

2023-09-25 Thread Matthias Schwalbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768602#comment-17768602
 ] 

Matthias Schwalbe commented on FLINK-33127:
---

[~masteryhx]: I actually want to finish FLINK-26585 first before I can start 
FLINK-26586 (capacity).

FLINK-26585 has been stuck in PR approval without any progress for a couple of 
weeks.

(I will ping you on that ticket in a second.)

Thias

> HeapKeyedStateBackend: use buffered I/O to speed up local recovery
> --
>
> Key: FLINK-33127
> URL: https://issues.apache.org/jira/browse/FLINK-33127
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Yangyang ZHANG
>Assignee: Yangyang ZHANG
>Priority: Major
> Attachments: thread_dump.png
>
>
> Recently, I observed a slow restore case in local recovery using the hashmap 
> statebackend.
> It took 147 seconds to restore from a 467MB snapshot, 9 times slower than 
> restoring from remote fs (16s).
> The thread dump shows that it reads the local snapshot file directly through 
> an unbuffered FileInputStream / fs.local.LocalDataInputStream.
> !thread_dump.png!
> Maybe we can wrap it with a BufferedInputStream to speed up local recovery.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-33151) Prometheus Sink Connector - Create Github Repo

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer resolved FLINK-33151.
---
Resolution: Done

> Prometheus Sink Connector - Create Github Repo
> --
>
> Key: FLINK-33151
> URL: https://issues.apache.org/jira/browse/FLINK-33151
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Reporter: Danny Cranmer
>Assignee: Danny Cranmer
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>
> Create the \{{flink-connector-prometheus}} repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33151) Prometheus Sink Connector - Create Github Repo

2023-09-25 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768596#comment-17768596
 ] 

Danny Cranmer commented on FLINK-33151:
---

https://github.com/apache/flink-connector-prometheus

> Prometheus Sink Connector - Create Github Repo
> --
>
> Key: FLINK-33151
> URL: https://issues.apache.org/jira/browse/FLINK-33151
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Reporter: Danny Cranmer
>Assignee: Danny Cranmer
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>
> Create the \{{flink-connector-prometheus}} repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33152) Prometheus Sink Connector - Integration tests

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33152:
--
Component/s: Connectors / Prometheus

> Prometheus Sink Connector - Integration tests
> -
>
> Key: FLINK-33152
> URL: https://issues.apache.org/jira/browse/FLINK-33152
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>
> Integration tests against containerised Prometheus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31757) FLIP-370: Support Balanced Tasks Scheduling

2023-09-25 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-31757:

Description: This is an umbrella JIRA of 
[FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw].  (was: This is a 
umbrella JIRA of [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw].

 

Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A 
is 100, and the others are 5. If each {{TaskManager}} only have one slot, then 
we need 100 TMs.

There will be 5 slots with 21 sub-tasks, and the others will only have one 
sub-task of A. Does this mean we have to make a trade-off between wasted 
resources and insufficient resources?

From a resource utilization point of view, we expect all subtasks to be evenly 
distributed on each TM.)

> FLIP-370: Support Balanced Tasks Scheduling
> ---
>
> Key: FLINK-31757
> URL: https://issues.apache.org/jira/browse/FLINK-31757
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-04-13-08-04-04-667.png
>
>
> This is an umbrella JIRA of 
> [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33139) Prometheus Sink Connector - Table API support

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33139:
--
Fix Version/s: prometheus-connector-1.0.0

> Prometheus Sink Connector - Table API support
> -
>
> Key: FLINK-33139
> URL: https://issues.apache.org/jira/browse/FLINK-33139
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33152) Prometheus Sink Connector - Integration tests

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33152:
--
Fix Version/s: prometheus-connector-1.0.0

> Prometheus Sink Connector - Integration tests
> -
>
> Key: FLINK-33152
> URL: https://issues.apache.org/jira/browse/FLINK-33152
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>
> Integration tests against containerised Prometheus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33151) Prometheus Sink Connector - Create Github Repo

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33151:
--
Component/s: Connectors / Prometheus

> Prometheus Sink Connector - Create Github Repo
> --
>
> Key: FLINK-33151
> URL: https://issues.apache.org/jira/browse/FLINK-33151
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Reporter: Danny Cranmer
>Assignee: Danny Cranmer
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>
> Create the \{{flink-connector-prometheus}} repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33151) Prometheus Sink Connector - Create Github Repo

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33151:
--
Fix Version/s: prometheus-connector-1.0.0

> Prometheus Sink Connector - Create Github Repo
> --
>
> Key: FLINK-33151
> URL: https://issues.apache.org/jira/browse/FLINK-33151
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Danny Cranmer
>Assignee: Danny Cranmer
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>
> Create the \{{flink-connector-prometheus}} repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33141) Prometheus Sink Connector - Amazon Managed Prometheus Request Signer

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33141:
--
Fix Version/s: (was: prometheus-connector-1.0.0)

> Prometheus Sink Connector - Amazon Managed Prometheus Request Signer
> -
>
> Key: FLINK-33141
> URL: https://issues.apache.org/jira/browse/FLINK-33141
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / AWS
>Reporter: Lorenzo Nicora
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33139) Prometheus Sink Connector - Table API support

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33139:
--
Component/s: Connectors / Prometheus

> Prometheus Sink Connector - Table API support
> -
>
> Key: FLINK-33139
> URL: https://issues.apache.org/jira/browse/FLINK-33139
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33141) Prometheus Sink Connector - Amazon Managed Prometheus Request Signer

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33141:
--
Fix Version/s: prometheus-connector-1.0.0

> Prometheus Sink Connector - Amazon Managed Prometheus Request Signer
> -
>
> Key: FLINK-33141
> URL: https://issues.apache.org/jira/browse/FLINK-33141
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / AWS
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33138) Prometheus Connector Sink - DataStream API implementation

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33138:
--
Fix Version/s: prometheus-connector-1.0.0

> Prometheus Connector Sink - DataStream API implementation
> -
>
> Key: FLINK-33138
> URL: https://issues.apache.org/jira/browse/FLINK-33138
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33138) Prometheus Connector Sink - DataStream API implementation

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33138:
--
Component/s: Connectors / Prometheus

> Prometheus Connector Sink - DataStream API implementation
> -
>
> Key: FLINK-33138
> URL: https://issues.apache.org/jira/browse/FLINK-33138
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Reporter: Lorenzo Nicora
>Priority: Major
> Fix For: prometheus-connector-1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33137) FLIP-312: Prometheus Sink Connector

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33137:
--
Component/s: Connectors / Prometheus

> FLIP-312: Prometheus Sink Connector
> ---
>
> Key: FLINK-33137
> URL: https://issues.apache.org/jira/browse/FLINK-33137
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Prometheus
>Reporter: Lorenzo Nicora
>Assignee: Lorenzo Nicora
>Priority: Major
>  Labels: Connector
> Fix For: prometheus-connector-1.0.0
>
>
> Umbrella Jira for implementation of Prometheus Sink Connector
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33137) FLIP-312: Prometheus Sink Connector

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-33137:
--
Fix Version/s: prometheus-connector-1.0.0

> FLIP-312: Prometheus Sink Connector
> ---
>
> Key: FLINK-33137
> URL: https://issues.apache.org/jira/browse/FLINK-33137
> Project: Flink
>  Issue Type: New Feature
>Reporter: Lorenzo Nicora
>Assignee: Lorenzo Nicora
>Priority: Major
>  Labels: Connector
> Fix For: prometheus-connector-1.0.0
>
>
> Umbrella Jira for implementation of Prometheus Sink Connector
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #23458: [FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


flinkbot commented on PR #23458:
URL: https://github.com/apache/flink/pull/23458#issuecomment-1733193724

   
   ## CI report:
   
   * 4debb4036b86485b39be61cb284e7ba6e8daaba7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31757) FLIP-370: Support Balanced Tasks Scheduling

2023-09-25 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-31757:

Summary: FLIP-370: Support Balanced Tasks Scheduling  (was: Support 
Balanced Tasks Scheduling)

> FLIP-370: Support Balanced Tasks Scheduling
> ---
>
> Key: FLINK-31757
> URL: https://issues.apache.org/jira/browse/FLINK-31757
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-04-13-08-04-04-667.png
>
>
> This is a umbrella JIRA of 
> [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw].
>  
> Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A 
> is 100, and the others are 5. If each {{TaskManager}} only have one slot, 
> then we need 100 TMs.
> There will be 5 slots with 21 sub-tasks, and the others will only have one 
> sub-task of A. Does this mean we have to make a trade-off between wasted 
> resources and insufficient resources?
> From a resource utilization point of view, we expect all subtasks to be 
> evenly distributed on each TM.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31757) Support Balanced Tasks Scheduling

2023-09-25 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-31757:

Description: 
This is a umbrella JIRA of 
[FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw].

 

Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A 
is 100, and the others are 5. If each {{TaskManager}} only have one slot, then 
we need 100 TMs.

There will be 5 slots with 21 sub-tasks, and the others will only have one 
sub-task of A. Does this mean we have to make a trade-off between wasted 
resources and insufficient resources?

From a resource utilization point of view, we expect all subtasks to be evenly 
distributed on each TM.

  was:
Supposed we have a Job with 21 {{JobVertex}}. The parallelism of vertex A is 
100, and the others are 5. If each {{TaskManager}} only have one slot, then we 
need 100 TMs.

There will be 5 slots with 21 sub-tasks, and the others will only have one 
sub-task of A. Does this mean we have to make a trade-off between wasted 
resources and insufficient resources?

From a resource utilization point of view, we expect all subtasks to be evenly 
distributed on each TM.


> Support Balanced Tasks Scheduling
> -
>
> Key: FLINK-31757
> URL: https://issues.apache.org/jira/browse/FLINK-31757
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-04-13-08-04-04-667.png
>
>
> This is a umbrella JIRA of 
> [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw].
>  
> Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A 
> is 100, and the others are 5. If each {{TaskManager}} only have one slot, 
> then we need 100 TMs.
> There will be 5 slots with 21 sub-tasks, and the others will only have one 
> sub-task of A. Does this mean we have to make a trade-off between wasted 
> resources and insufficient resources?
> From a resource utilization point of view, we expect all subtasks to be 
> evenly distributed on each TM.
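Back-of-envelope numbers for the example above (my own arithmetic, not from the FLIP):

{code:java}
public class SlotMathSketch {
    public static void main(String[] args) {
        int slots = 100;                  // one slot per TM, driven by vertex A's parallelism
        int totalSubtasks = 100 + 20 * 5; // vertex A + 20 vertices of parallelism 5 = 200
        // Default slot sharing: 5 slots host 21 subtasks each, the other 95 host just one.
        System.out.println("even target: " + totalSubtasks / slots + " subtasks/slot"); // 2
    }
}
{code}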



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33149) Bump snappy-java to 1.1.10.4

2023-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33149:
---
Labels: pull-request-available  (was: )

> Bump snappy-java to 1.1.10.4
> 
>
> Key: FLINK-33149
> URL: https://issues.apache.org/jira/browse/FLINK-33149
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Connectors / AWS, Connectors / HBase, 
> Connectors / Kafka, Stateful Functions
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Ryan Skraba
>Priority: Major
>  Labels: pull-request-available
>
> Xerial published a security alert for a Denial of Service attack that [exists 
> on 
> 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv].
> This is included in flink-dist, but also in flink-statefun, and several 
> connectors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] RyanSkraba opened a new pull request, #23458: [FLINK-33149][build] Bump snappy to 1.1.10.4

2023-09-25 Thread via GitHub


RyanSkraba opened a new pull request, #23458:
URL: https://github.com/apache/flink/pull/23458

   ## What is the purpose of the change
   
   Bump the version of snappy to address a vulnerability.
   
   ## Brief change log
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **yes**
 - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **no**
 - If yes, how is the feature documented? **not applicable**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33152) Prometheus Sink Connector - Integration tests

2023-09-25 Thread Lorenzo Nicora (Jira)
Lorenzo Nicora created FLINK-33152:
--

 Summary: Prometheus Sink Connector - Integration tests
 Key: FLINK-33152
 URL: https://issues.apache.org/jira/browse/FLINK-33152
 Project: Flink
  Issue Type: Sub-task
Reporter: Lorenzo Nicora


Integration tests against containerised Prometheus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example on AWS

2023-09-25 Thread Lorenzo Nicora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorenzo Nicora updated FLINK-33140:
---
Summary: Prometheus Sink Connector - E2E example on AWS  (was: Prometheus 
Sink Connector - E2E example)

> Prometheus Sink Connector - E2E example on AWS
> --
>
> Key: FLINK-33140
> URL: https://issues.apache.org/jira/browse/FLINK-33140
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
>
> End-to-end example application, to be deployed on Amazon Managed Service for 
> Apache Flink, and writing to Amazon Managed Prometheus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example on AWS

2023-09-25 Thread Lorenzo Nicora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorenzo Nicora updated FLINK-33140:
---
Description: End-to-end example application, deployable on Amazon Managed 
Service for Apache Flink, and writing to Amazon Managed Prometheus  (was: 
End-to-end example application, to be deployed on Amazon Managed Service for 
Apache Flink, and writing to Amazon Managed Prometheus)

> Prometheus Sink Connector - E2E example on AWS
> --
>
> Key: FLINK-33140
> URL: https://issues.apache.org/jira/browse/FLINK-33140
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
>
> End-to-end example application, deployable on Amazon Managed Service for 
> Apache Flink, and writing to Amazon Managed Prometheus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example

2023-09-25 Thread Lorenzo Nicora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorenzo Nicora updated FLINK-33140:
---
Summary: Prometheus Sink Connector - E2E example  (was: Prometheus Sink 
Connector - E2E test)

> Prometheus Sink Connector - E2E example
> ---
>
> Key: FLINK-33140
> URL: https://issues.apache.org/jira/browse/FLINK-33140
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example

2023-09-25 Thread Lorenzo Nicora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorenzo Nicora updated FLINK-33140:
---
Description: End-to-end example application, to be deployed on Amazon 
Managed Service for Apache Flink, and writing to Amazon Managed Prometheus

> Prometheus Sink Connector - E2E example
> ---
>
> Key: FLINK-33140
> URL: https://issues.apache.org/jira/browse/FLINK-33140
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lorenzo Nicora
>Priority: Major
>
> End-to-end example application, to be deployed on Amazon Managed Service for 
> Apache Flink, and writing to Amazon Managed Prometheus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33151) Prometheus Sink Connector - Create Github Repo

2023-09-25 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-33151:
-

 Summary: Prometheus Sink Connector - Create Github Repo
 Key: FLINK-33151
 URL: https://issues.apache.org/jira/browse/FLINK-33151
 Project: Flink
  Issue Type: Sub-task
Reporter: Danny Cranmer


Create the \{{flink-connector-prometheus}} repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33151) Prometheus Sink Connector - Create Github Repo

2023-09-25 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer reassigned FLINK-33151:
-

Assignee: Danny Cranmer

> Prometheus Sink Connector - Create Github Repo
> --
>
> Key: FLINK-33151
> URL: https://issues.apache.org/jira/browse/FLINK-33151
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Danny Cranmer
>Assignee: Danny Cranmer
>Priority: Major
>
> Create the \{{flink-connector-prometheus}} repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33150) add the processing logic for the long type

2023-09-25 Thread wenhao.yu (Jira)
wenhao.yu created FLINK-33150:
-

 Summary:  add the processing logic for the long type
 Key: FLINK-33150
 URL: https://issues.apache.org/jira/browse/FLINK-33150
 Project: Flink
  Issue Type: New Feature
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.15.4
Reporter: wenhao.yu
 Fix For: 1.15.4


The AvroToRowDataConverters class has a convertToDate method that reports an 
error when it encounters time data represented by the long type, so add code 
to handle the long type.
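For illustration, a minimal sketch (assumed shape, not the actual patch) of extending such a converter to accept long values alongside the types it already handles:

{code:java}
import java.time.LocalDate;

public class DateConverterSketch {
    // Avro DATE values are days since epoch; some writers emit them as long.
    static int convertToDate(Object object) {
        if (object instanceof Integer) {
            return (Integer) object;
        } else if (object instanceof Long) {
            return ((Long) object).intValue(); // assumed: long also carries days since epoch
        } else if (object instanceof LocalDate) {
            return (int) ((LocalDate) object).toEpochDay();
        } else {
            throw new IllegalArgumentException(
                    "Unexpected object type for DATE logical type: " + object.getClass());
        }
    }
}
{code}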



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33137) FLIP-312: Prometheus Sink Connector

2023-09-25 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh reassigned FLINK-33137:
---

Assignee: Lorenzo Nicora

> FLIP-312: Prometheus Sink Connector
> ---
>
> Key: FLINK-33137
> URL: https://issues.apache.org/jira/browse/FLINK-33137
> Project: Flink
>  Issue Type: New Feature
>Reporter: Lorenzo Nicora
>Assignee: Lorenzo Nicora
>Priority: Major
>  Labels: Connector
>
> Umbrella Jira for implementation of Prometheus Sink Connector
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33149) Bump snappy-java to 1.1.10.4

2023-09-25 Thread Ryan Skraba (Jira)
Ryan Skraba created FLINK-33149:
---

 Summary: Bump snappy-java to 1.1.10.4
 Key: FLINK-33149
 URL: https://issues.apache.org/jira/browse/FLINK-33149
 Project: Flink
  Issue Type: Bug
  Components: API / Core, Connectors / AWS, Connectors / HBase, 
Connectors / Kafka, Stateful Functions
Affects Versions: 1.18.0, 1.16.3, 1.17.2
Reporter: Ryan Skraba


Xerial published a security alert for a Denial of Service attack that [exists 
on 
1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv].

This is included in flink-dist, but also in flink-statefun, and several 
connectors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] echauchot commented on pull request #23443: [FLINK-33059] Support transparent compression for file-connector for all file input formats

2023-09-25 Thread via GitHub


echauchot commented on PR #23443:
URL: https://github.com/apache/flink/pull/23443#issuecomment-1733119379

Hi @rmetzger, I saw you authored parts of this code. Can you please do a review or point me to another reviewer?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-33107) Update stable-spec upgrade mode on reconciled-spec change

2023-09-25 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-33107.
--
Fix Version/s: kubernetes-operator-1.7.0
   Resolution: Fixed

merged to main 662fa612a8ab352e43ab8a99fa61aadfbe41e4d7

> Update stable-spec upgrade mode on reconciled-spec change
> -
>
> Key: FLINK-33107
> URL: https://issues.apache.org/jira/browse/FLINK-33107
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.6.0, kubernetes-operator-1.7.0
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.7.0
>
>
> Since now the rollback mechanism uses the regular upgrade flow, we need to 
> ensure that the lastStableSpec upgrade mode is kept in sync with the 
> lastReconciled spec to ensure correct stateful upgrades.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gyfora merged pull request #681: [FLINK-33107] Use correct upgrade mode when executing rollback, simpl…

2023-09-25 Thread via GitHub


gyfora merged PR #681:
URL: https://github.com/apache/flink-kubernetes-operator/pull/681


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


