[GitHub] [flink] X-czh opened a new pull request, #23465: [FLINK-33147] Introduce endpoint field in REST API and deprecate host field
X-czh opened a new pull request, #23465: URL: https://github.com/apache/flink/pull/23465 ## What is the purpose of the change First step towards [[FLIP-363: Unify the Representation of TaskManager Location in REST API and Web UI](https://cwiki.apache.org/confluence/display/FLINK/FLIP-363%3A+Unify+the+Representation+of+TaskManager+Location+in+REST+API+and+Web+UI)]: Introduce a new "endpoint" field in REST API to represent TaskManager endpoint (host + port) and deprecate the "host" field. ## Brief change log Introduce endpoint field in REST API and deprecate host field. ## Verifying this change This change is already covered by existing tests, and verified on a standalone cluster. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) The REST API doc is regenerated to cover the changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
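The new field represents the TaskManager location as a single host-plus-port string, while the deprecated "host" field remains for compatibility. A minimal client-side sketch of the migration (the helper names here are hypothetical, not Flink API):

```java
// Hypothetical helpers (not part of Flink): illustrate moving from the
// deprecated "host" field to the new combined "endpoint" field.
public class EndpointMigration {

    // The endpoint field represents the TaskManager location as host + port.
    public static String toEndpoint(String host, int port) {
        return host + ":" + port;
    }

    // A client can prefer "endpoint" and fall back to the deprecated "host"
    // when talking to older clusters that do not serve the new field yet.
    public static String resolveLocation(String endpoint, String host) {
        return endpoint != null ? endpoint : host;
    }

    public static void main(String[] args) {
        System.out.println(toEndpoint("taskmanager-1", 6122)); // taskmanager-1:6122
        System.out.println(resolveLocation(null, "legacy-host")); // legacy-host
    }
}
```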
[jira] [Updated] (FLINK-33156) Remove flakiness from tests in OperatorStateBackendTest.java
[ https://issues.apache.org/jira/browse/FLINK-33156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asha Boyapati updated FLINK-33156: -- Description: This issue is similar to: https://issues.apache.org/jira/browse/FLINK-32963 We are proposing to make the following tests stable: {quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} The tests are currently flaky because the order of elements returned by iterators is non-deterministic. The following PR fixes the flaky test by making it independent of the order of elements returned by the iterator: https://github.com/apache/flink/pull/23464 We detected this using the NonDex tool using the following commands: {quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} Please see the following Continuous Integration log that shows the flakiness: https://github.com/asha-boyapati/flink/actions/runs/6193757385 Please see the following Continuous Integration log that shows that the flakiness is fixed by this change: https://github.com/asha-boyapati/flink/actions/runs/619409 was: This issue is similar to: https://issues.apache.org/jira/browse/FLINK-32963 We are proposing to make the following tests stable: {quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} The tests are currently flaky because the order of elements returned by iterators is non-deterministic. 
The following PR fixes the flaky test by making it independent of the order of elements returned by the iterator: https://github.com/asha-boyapati/flink/pull/2 We detected this using the NonDex tool using the following commands: {quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} Please see the following Continuous Integration log that shows the flakiness: https://github.com/asha-boyapati/flink/actions/runs/6193757385 Please see the following Continuous Integration log that shows that the flakiness is fixed by this change: https://github.com/asha-boyapati/flink/actions/runs/619409 > Remove flakiness from tests in OperatorStateBackendTest.java > > > Key: FLINK-33156 > URL: https://issues.apache.org/jira/browse/FLINK-33156 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.17.1 >Reporter: Asha Boyapati >Priority: Minor > Fix For: 1.17.1 > > > This issue is similar to: > https://issues.apache.org/jira/browse/FLINK-32963 > We are proposing to make the following tests stable: > {quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync > org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} > The tests are currently flaky because the order of elements returned by > iterators is non-deterministic. 
> The following PR fixes the flaky test by making it independent of the order > of elements returned by the iterator: > https://github.com/apache/flink/pull/23464 > We detected this using the NonDex tool using the following commands: > {quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime > -DnondexRuns=10 > -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync > mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime > -DnondexRuns=10 > -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} > Please see the following Continuous Integration log that shows the flakiness: > https://github.com/asha-boyapati/flink/actions/runs/6193757385 > Please see the following Continuous Integration log that shows that the > flakiness is fixed by this change: > https://github.com/asha-boyapati/flink/actions/runs/619409 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] asha-boyapati opened a new pull request, #23464: Remove flakiness from tests in OperatorStateBackendTest.java
asha-boyapati opened a new pull request, #23464: URL: https://github.com/apache/flink/pull/23464 ## What is the purpose of the change This PR is similar to the following PR which was accepted: https://github.com/apache/flink/pull/23298 This PR makes the following tests stable: ``` org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync ``` The tests are currently flaky because the order of elements returned by iterators is non-deterministic. This PR fixes the flaky tests by making them independent of the order of elements returned by the iterators. We detected this using the NonDex tool using the following commands: ` mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync` `mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync` Please see the following Continuous Integration log that shows the flakiness: https://github.com/asha-boyapati/flink/actions/runs/6193757385 Please see the following Continuous Integration log that shows that the flakiness is fixed by this change: https://github.com/asha-boyapati/flink/actions/runs/619409 ## Brief change log This PR fixes the flaky tests by making them independent of the order of elements returned by the iterators. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. This change only modifies tests slightly and does not change any underlying code. 
## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable)
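The actual diff is in the linked PR; as a simplified illustration of the general technique (assumed, not the PR's code), an order-sensitive assertion over an iterator can be replaced by a comparison that ignores iteration order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class OrderIndependentAssertions {

    // Collect the iterator's elements and compare as sorted lists, so the
    // check passes regardless of the (non-deterministic) iteration order
    // that hash-based state backends may produce.
    public static boolean sameElements(Iterable<String> actual, Collection<String> expected) {
        List<String> a = new ArrayList<>();
        for (String s : actual) {
            a.add(s);
        }
        List<String> e = new ArrayList<>(expected);
        Collections.sort(a);
        Collections.sort(e);
        return a.equals(e);
    }

    public static void main(String[] args) {
        // The iterator may hand back elements in any order.
        List<String> fromIterator = Arrays.asList("b", "c", "a");
        System.out.println(sameElements(fromIterator, Arrays.asList("a", "b", "c"))); // true
    }
}
```

This is the same idea NonDex exercises: it shuffles iteration orders of under-determined collections, so any assertion that encodes a particular order fails intermittently.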
[jira] [Created] (FLINK-33156) Remove flakiness from tests in OperatorStateBackendTest.java
Asha Boyapati created FLINK-33156: - Summary: Remove flakiness from tests in OperatorStateBackendTest.java Key: FLINK-33156 URL: https://issues.apache.org/jira/browse/FLINK-33156 Project: Flink Issue Type: Bug Components: Runtime / State Backends Affects Versions: 1.17.1 Reporter: Asha Boyapati Fix For: 1.17.1 This issue is similar to: https://issues.apache.org/jira/browse/FLINK-32963 We are proposing to make the following tests stable: {quote}org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} The tests are currently flaky because the order of elements returned by iterators is non-deterministic. The following PR fixes the flaky test by making it independent of the order of elements returned by the iterator: https://github.com/asha-boyapati/flink/pull/2 We detected this using the NonDex tool using the following commands: {quote}mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreSync mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl flink-runtime -DnondexRuns=10 -Dtest=org.apache.flink.runtime.state.OperatorStateBackendTest#testSnapshotRestoreAsync{quote} Please see the following Continuous Integration log that shows the flakiness: https://github.com/asha-boyapati/flink/actions/runs/6193757385 Please see the following Continuous Integration log that shows that the flakiness is fixed by this change: https://github.com/asha-boyapati/flink/actions/runs/619409 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions
[ https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769014#comment-17769014 ] tanjialiang commented on FLINK-28303: - [~martijnvisser] I have already checked the latest Kafka connector code; this problem still exists. > Kafka SQL Connector loses data when restoring from a savepoint with a topic > with empty partitions > - > > Key: FLINK-28303 > URL: https://issues.apache.org/jira/browse/FLINK-28303 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.4 >Reporter: Robert Metzger >Priority: Major > > Steps to reproduce: > - Set up a Kafka topic with 10 partitions > - produce records 0-9 into the topic > - take a savepoint and stop the job > - produce records 10-19 into the topic > - restore the job from the savepoint. > The job will be missing usually 2-4 records from 10-19. > My assumption is that if a partition never had data (which is likely with 10 > partitions and 10 records), the savepoint will only contain offsets for > partitions with data. > While the job was offline (and we've written record 10-19 into the topic), > all partitions got filled. Now, when Kafka comes online again, it will use > the "latest" offset for those partitions, skipping some data.
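The failure mode described in the issue can be sketched as follows (illustrative only; the names do not correspond to the actual connector code): partitions absent from the restored state fall back to the "latest" offset, so records written to those partitions while the job was down are skipped.

```java
import java.util.HashMap;
import java.util.Map;

public class SavepointOffsetSketch {

    // Restored state only contains offsets for partitions that had data when
    // the savepoint was taken. With a "latest" fallback, a partition that was
    // empty at savepoint time starts past any records written since.
    public static long startingOffset(
            Map<Integer, Long> restoredOffsets, int partition, long latestOffset) {
        Long restored = restoredOffsets.get(partition);
        return restored != null ? restored : latestOffset; // the lossy fallback
    }

    public static void main(String[] args) {
        Map<Integer, Long> restored = new HashMap<>();
        restored.put(0, 5L); // partition 0 had records 0-4 before the savepoint
        // Partition 1 was empty at savepoint time; 3 records arrived while offline.
        long start = startingOffset(restored, 1, 3L);
        System.out.println(start); // 3 -> offsets 0..2 on partition 1 are skipped
    }
}
```

A fallback to "earliest" (or to the offsets committed at savepoint time) for partitions missing from state would avoid the data loss, at the cost of possible duplicates.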
[GitHub] [flink] xintongsong commented on a diff in pull request #23456: [FLINK-33144][datastream]Deprecate Iteration API in DataStream
xintongsong commented on code in PR #23456: URL: https://github.com/apache/flink/pull/23456#discussion_r1336633757 ## flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStream.java: ## @@ -526,8 +526,14 @@ public DataStream global() { * in the set time, the stream terminates. * * @return The iterative data stream created. + * @deprecated This method is deprecated since Flink 1.19. The users are recommended to use + * Iteration API in Flink ML instead. Review Comment: We should not recommend that users use the Iteration API in Flink ML instead. It doesn't make sense that a user who doesn't need any ML algorithm would have to use Flink ML only to get access to the Iteration API. I'd suggest the following: > The only known use case of this Iteration API comes from Flink ML, which already has its own implementation of iteration and no longer uses this API. If there's any use cases other than Flink ML that needs iteration support, please reach out to d...@flink.apache.org and we can consider making the Flink ML iteration implementation a separate common library. ## flink-examples/flink-examples-streaming/pom.xml: ## @@ -137,6 +137,11 @@ under the License. -Xlint:deprecation true + + + org/apache/flink/streaming/examples/iteration/IterateExample.java Review Comment: We should explain that this example is temporarily preserved only for testing purposes.
[GitHub] [flink-kubernetes-operator] 1996fanrui commented on a diff in pull request #677: [FLINK-33097][autoscaler] Initialize the generic autoscaler module and interfaces
1996fanrui commented on code in PR #677: URL: https://github.com/apache/flink-kubernetes-operator/pull/677#discussion_r1336584237 ## flink-autoscaler/pom.xml: ## @@ -45,6 +45,32 @@ under the License. provided
+
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <version>${lombok.version}</version>
+            <scope>provided</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>org.junit.jupiter</groupId>
+            <artifactId>junit-jupiter-params</artifactId>
+            <scope>test</scope>
+        </dependency>
[GitHub] [flink] xiangyuf commented on a diff in pull request #23455: [FLINK-25015][Table SQL/Client] change job name to sql when submitting queries using client
xiangyuf commented on code in PR #23455: URL: https://github.com/apache/flink/pull/23455#discussion_r1336585759 ## flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java: ## @@ -1050,10 +1052,15 @@ private TableResultInternal executeInternal( } private TableResultInternal executeQueryOperation(QueryOperation operation) { +String querySql = null; +if (operation instanceof QuerySqlOperation) { +querySql = ((QuerySqlOperation) operation).getQuerySql(); +} CollectModifyOperation sinkOperation = new CollectModifyOperation(operation); List> transformations = translate(Collections.singletonList(sinkOperation)); -final String defaultJobName = "collect"; +final String defaultJobName = Review Comment: @FangYongs hi, in community version, only jobId will be used in the checkpoint path. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] victor9309 commented on pull request #23451: [FLINK-32108][test] KubernetesExtension calls assumeThat in @BeforeAll callback which doesn't print the actual failure message
victor9309 commented on PR #23451: URL: https://github.com/apache/flink/pull/23451#issuecomment-1734762519 failure
[GitHub] [flink] victor9309 closed pull request #23451: [FLINK-32108][test] KubernetesExtension calls assumeThat in @BeforeAll callback which doesn't print the actual failure message
victor9309 closed pull request #23451: [FLINK-32108][test] KubernetesExtension calls assumeThat in @BeforeAll callback which doesn't print the actual failure message URL: https://github.com/apache/flink/pull/23451
[GitHub] [flink-kubernetes-operator] XComp commented on a diff in pull request #677: [FLINK-33097][autoscaler] Initialize the generic autoscaler module and interfaces
XComp commented on code in PR #677: URL: https://github.com/apache/flink-kubernetes-operator/pull/677#discussion_r1336571925 ## flink-autoscaler/pom.xml: ## @@ -45,6 +45,32 @@ under the License. provided
+
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <version>${lombok.version}</version>
+            <scope>provided</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>org.junit.jupiter</groupId>
+            <artifactId>junit-jupiter-params</artifactId>
+            <scope>test</scope>
+        </dependency>
[GitHub] [flink] flinkbot commented on pull request #23463: Revert "[FLINK-33064][table-planner] Improve the error message when the lookup source is used as the scan source"
flinkbot commented on PR #23463: URL: https://github.com/apache/flink/pull/23463#issuecomment-1734753507 ## CI report: * 14e35b26265249609fb92f82c02d598f1c29ce7e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] swuferhong opened a new pull request, #23463: Revert "[FLINK-33064][table-planner] Improve the error message when the lookup source is used as the scan source"
swuferhong opened a new pull request, #23463: URL: https://github.com/apache/flink/pull/23463 ## What is the purpose of the change After testing different SQL patterns, we found that `FLINK-33064` can cause serious bugs in view and subquery scenarios, so this PR aims to revert `FLINK-33064`. ## Brief change log Revert `FLINK-33064`. ## Verifying this change No tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? no docs
[GitHub] [flink-connector-pulsar] minchowang commented on pull request #16: [BK-3.0][FLINK-30552][Connector/Pulsar] drop batch message size assertion, better set the cursor position.
minchowang commented on PR #16: URL: https://github.com/apache/flink-connector-pulsar/pull/16#issuecomment-1734745307 @syhily Hi, could this PR also be applied to develop-1.14 ([develop-flink-1.14](https://github.com/streamnative/flink/tree/develop))?
[GitHub] [flink] swuferhong closed pull request #23457: [FLINK-33143][table-planner] Fix wrongly throw error while temporary table join with invalidScan lookup source and selected as view
swuferhong closed pull request #23457: [FLINK-33143][table-planner] Fix wrongly throw error while temporary table join with invalidScan lookup source and selected as view URL: https://github.com/apache/flink/pull/23457
[jira] [Closed] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view
[ https://issues.apache.org/jira/browse/FLINK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunhong Zheng closed FLINK-33143. - Resolution: Resolved > Wrongly throw error while temporary table join with invalidScan lookup source > and selected as view > -- > > Key: FLINK-33143 > URL: https://issues.apache.org/jira/browse/FLINK-33143 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.19.0 >Reporter: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Wrongly throw error while temporary table join with invalidScan lookup source > and selected as view.
[jira] [Commented] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view
[ https://issues.apache.org/jira/browse/FLINK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768969#comment-17768969 ] Yunhong Zheng commented on FLINK-33143: --- This fix cannot cover all SQL patterns (e.g., subqueries), so this issue will be closed. > Wrongly throw error while temporary table join with invalidScan lookup source > and selected as view > -- > > Key: FLINK-33143 > URL: https://issues.apache.org/jira/browse/FLINK-33143 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.19.0 >Reporter: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Wrongly throw error while temporary table join with invalidScan lookup source > and selected as view.
[GitHub] [flink] xbthink commented on pull request #23454: [FLINK-33130]reuse source and sink operator io metrics for task
xbthink commented on PR #23454: URL: https://github.com/apache/flink/pull/23454#issuecomment-1734723928 @flinkbot run azure
[GitHub] [flink] flinkbot commented on pull request #23462: [BP-1.18][FLINK-32976][runtime] Fix NullPointException when starting flink cluster in standalone mode
flinkbot commented on PR #23462: URL: https://github.com/apache/flink/pull/23462#issuecomment-1734714992 ## CI report: * a630fd02f8232d8bbd9f521bdc7990dfd8d79e21 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-33129) Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type
[ https://issues.apache.org/jira/browse/FLINK-33129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768959#comment-17768959 ] Zhaoyang Shao commented on FLINK-33129: --- In DataStructureConverters, there is already a similar implementation: ``` putConverter( LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE, Integer.class, constructor(LocalZonedTimestampIntConverter::new)); ``` [https://github.com/apache/flink/blob/9b2b4e3f194467aae0d299b3b403e0ca60c42ef0/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/data/conversion/DataStructureConverters.java#L134] We can use the same/similar approach to support TIMESTAMP_WITH_LOCAL_TIME_ZONE in `RowDataToAvroConverters.createConverter`. > Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type > > > Key: FLINK-33129 > URL: https://issues.apache.org/jira/browse/FLINK-33129 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.17.1 >Reporter: Zhaoyang Shao >Priority: Critical > Fix For: 1.17.1 > > Original Estimate: 1h > Remaining Estimate: 1h > > While creating a converter using `RowDataToAvroConverters.createConverter` with > the LocalZonedTimestampType logical type, the method will throw an exception. This > is because the switch clause is missing a case for > `LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE`. > Code: > [https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/RowDataToAvroConverters.java#L75] > > We can convert the value to `LocalDateTime` and then `TimestampData` using the > method below. Then we can apply the same converter as for > TIMESTAMP_WITHOUT_TIME_ZONE? > > `TimestampData fromLocalDateTime(LocalDateTime dateTime)` > Can the Flink team help add support for this logical type and logical type > root? 
> This is now a blocker for creating Flink Iceberg consumer with Avro > GenericRecord when IcebergTable has `TimestampTZ` type field which will be > converted to LocalZonedTimestampType. > See error below: > Unsupported type: TIMESTAMP_LTZ(6) > stack: [ [-] > > org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:186) > > > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) > > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) > > > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) > > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) > > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:550) > > java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260) > > > java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:517) > > org.apache.flink.formats.avro.RowDataToAvroConverters.createRowConverter(RowDataToAvroConverters.java:224) > > > org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:178) > > > org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.(RowDataToAvroGenericRecordConverter.java:46) > > > org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.fromIcebergSchema(RowDataToAvroGenericRecordConverter.java:60) > > > org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.lazyConverter(AvroGenericRecordReaderFunction.java:93) > > > org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.createDataIterator(AvroGenericRecordReaderFunction.java:85) > > > org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:39) > > > org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:27) > > > 
org.apache.iceberg.flink.source.reader.IcebergSourceSplitReader.fetch(IcebergSourceSplitReader.java:74) > > > org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58) > > > org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:162) > > > org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:114) > > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > java.util.concurrent.FutureTask.run(FutureTask.java:264) > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > >
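The missing branch can be illustrated with a self-contained stand-in (the enum and method here are illustrative, not the actual `RowDataToAvroConverters` code): route TIMESTAMP_WITH_LOCAL_TIME_ZONE through the same epoch-based path as TIMESTAMP_WITHOUT_TIME_ZONE instead of falling through to the "Unsupported type" default.

```java
import java.time.Instant;

public class AvroTimestampBranchSketch {

    // Illustrative stand-in for LogicalTypeRoot (not the Flink enum).
    enum TypeRoot { TIMESTAMP_WITHOUT_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, BOOLEAN }

    // Sketch of the proposed fix: both timestamp roots share one branch, so a
    // TIMESTAMP_LTZ value no longer reaches the "Unsupported type" default.
    public static long toAvroMillis(TypeRoot type, Instant value) {
        switch (type) {
            case TIMESTAMP_WITHOUT_TIME_ZONE:
            case TIMESTAMP_WITH_LOCAL_TIME_ZONE: // the branch that is missing today
                return value.toEpochMilli();
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(
                toAvroMillis(TypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE, Instant.ofEpochMilli(42))); // 42
    }
}
```

The real fix would also need to pick millisecond or microsecond precision to match the Avro logical type, which this sketch glosses over.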
[GitHub] [flink] hackergin opened a new pull request, #23462: [BP-1.18][FLINK-32976][runtime] Fix NullPointException when starting flink cluster in standalone mode
hackergin opened a new pull request, #23462: URL: https://github.com/apache/flink/pull/23462 BP https://github.com/apache/flink/pull/23446
[GitHub] [flink] Sxnan commented on pull request #22897: [FLINK-32476][runtime] Support configuring object-reuse for internal operators
Sxnan commented on PR #22897: URL: https://github.com/apache/flink/pull/22897#issuecomment-1734709549 @flinkbot run azure
[jira] [Updated] (FLINK-33129) Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type
[ https://issues.apache.org/jira/browse/FLINK-33129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhaoyang Shao updated FLINK-33129: -- Priority: Critical (was: Blocker) > Can't create RowDataToAvroConverter for LocalZonedTimestampType logical type > > > Key: FLINK-33129 > URL: https://issues.apache.org/jira/browse/FLINK-33129 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.17.1 >Reporter: Zhaoyang Shao >Priority: Critical > Fix For: 1.17.1 > > Original Estimate: 1h > Remaining Estimate: 1h > > While creating a converter using `RowDataToAvroConverters.createConverter` with > the LocalZonedTimestampType logical type, the method will throw an exception. This > is because the switch clause is missing a case for > `LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE`. > Code: > [https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/RowDataToAvroConverters.java#L75] > > We can convert the value to `LocalDateTime` and then `TimestampData` using the > method below. Then we can apply the same converter as for > TIMESTAMP_WITHOUT_TIME_ZONE? > > `TimestampData fromLocalDateTime(LocalDateTime dateTime)` > Can the Flink team help add support for this logical type and logical type > root? > This is now a blocker for creating Flink Iceberg consumer with Avro > GenericRecord when IcebergTable has `TimestampTZ` type field which will be > converted to LocalZonedTimestampType. 
> See error below:
> Unsupported type: TIMESTAMP_LTZ(6)
> stack:
> org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:186)
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:550)
> java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
> java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:517)
> org.apache.flink.formats.avro.RowDataToAvroConverters.createRowConverter(RowDataToAvroConverters.java:224)
> org.apache.flink.formats.avro.RowDataToAvroConverters.createConverter(RowDataToAvroConverters.java:178)
> org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.<init>(RowDataToAvroGenericRecordConverter.java:46)
> org.apache.iceberg.flink.source.RowDataToAvroGenericRecordConverter.fromIcebergSchema(RowDataToAvroGenericRecordConverter.java:60)
> org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.lazyConverter(AvroGenericRecordReaderFunction.java:93)
> org.apache.iceberg.flink.source.reader.AvroGenericRecordReaderFunction.createDataIterator(AvroGenericRecordReaderFunction.java:85)
> org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:39)
> org.apache.iceberg.flink.source.reader.DataIteratorReaderFunction.apply(DataIteratorReaderFunction.java:27)
> org.apache.iceberg.flink.source.reader.IcebergSourceSplitReader.fetch(IcebergSourceSplitReader.java:74)
> org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:162)
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:114)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> java.util.concurrent.FutureTask.run(FutureTask.java:264)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> java.lang.Thread.run(Thread.java:829)
-- This message was sent by Atlassian Jira (v8.20.10#820010)
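The workaround the reporter proposes — normalizing a TIMESTAMP_LTZ value to a `LocalDateTime` and reusing the TIMESTAMP_WITHOUT_TIME_ZONE path — can be illustrated with a self-contained sketch using plain `java.time` stand-ins. The names `LtzConverterSketch`, `convertLtz`, and `toEpochMillis` are hypothetical, not Flink API:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class LtzConverterSketch {

    // Stand-in for the TIMESTAMP_WITHOUT_TIME_ZONE path: a LocalDateTime,
    // interpreted in UTC, becomes the Avro long value (epoch millis).
    static long toEpochMillis(LocalDateTime dateTime) {
        return dateTime.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Proposed handling for TIMESTAMP_WITH_LOCAL_TIME_ZONE: normalize the
    // instant to a UTC LocalDateTime first, then reuse the existing path.
    static long convertLtz(Instant instant) {
        return toEpochMillis(LocalDateTime.ofInstant(instant, ZoneOffset.UTC));
    }
}
```

Flink's real `TimestampData.fromLocalDateTime` preserves both millisecond and nanosecond components; the sketch collapses to epoch millis for brevity.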
[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…
yigress commented on code in PR #23425: URL: https://github.com/apache/flink/pull/23425#discussion_r1336513394 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java: ## @@ -368,6 +373,34 @@ public void discard() throws Exception { } } +@Override +public void discard() throws Exception { +discard(null); +} + +private void discardOperatorStates(Executor ioExecutor) throws Exception { +if (ioExecutor == null) { + StateUtil.bestEffortDiscardAllStateObjects(operatorStates.values()); +} else { +List discardables = +operatorStates.values().stream() +.flatMap(op -> op.getDiscardables().stream()) +.collect(Collectors.toList()); +LOG.trace("Executing discard {} operator states {}", discardables.size()); Review Comment: updated -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…
yigress commented on code in PR #23425: URL: https://github.com/apache/flink/pull/23425#discussion_r1336513096 ## flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java: ## @@ -109,6 +109,20 @@ public class CheckpointingOptions { .defaultValue(1) .withDescription("The maximum number of completed checkpoints to retain."); +/** + * Option whether to clean each checkpoint's states in fast mode. When in fast mode, operator + * states are discarded in parallel using the ExecutorService passed to the cleaner, otherwise + * operator states are discarded sequentially. + */ +@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS) +public static final ConfigOption CLEANER_FAST_MODE = Review Comment: updated the config comments -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…
yigress commented on code in PR #23425: URL: https://github.com/apache/flink/pull/23425#discussion_r1336512951 ## flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java: ## @@ -109,6 +109,20 @@ public class CheckpointingOptions { .defaultValue(1) .withDescription("The maximum number of completed checkpoints to retain."); +/** + * Option whether to clean each checkpoint's states in fast mode. When in fast mode, operator + * states are discarded in parallel using the ExecutorService passed to the cleaner, otherwise + * operator states are discarded sequentially. + */ +@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS) +public static final ConfigOption CLEANER_FAST_MODE = Review Comment: changed to parallel-mode. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
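For reference, with the rename discussed above, the option definition in Flink's `ConfigOptions` builder style would read roughly as follows. The key string and default value here are assumptions based on the review thread (check the merged PR for the actual values), and the diff above lost its generic parameter in extraction — the field is a `ConfigOption<Boolean>`:

```java
// Sketch only: key name and default are assumptions, not the merged code.
@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
public static final ConfigOption<Boolean> CLEANER_PARALLEL_MODE =
        ConfigOptions.key("state.checkpoint.cleaner.parallel-mode")
                .booleanType()
                .defaultValue(true)
                .withDescription(
                        "Option whether to discard a completed checkpoint's operator states "
                                + "in parallel using the ExecutorService passed to the cleaner, "
                                + "discarding them sequentially otherwise.");
```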
[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…
yigress commented on code in PR #23425: URL: https://github.com/apache/flink/pull/23425#discussion_r1336512772 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java: ## @@ -368,6 +373,34 @@ public void discard() throws Exception { } } +@Override +public void discard() throws Exception { +discard(null); +} + +private void discardOperatorStates(Executor ioExecutor) throws Exception { Review Comment: yes, the logic is the same. Added default method discardOperatorStates in Checkpoint interface to unify. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yigress commented on a diff in pull request #23425: [FLINK-33090][checkpointing] CheckpointsCleaner clean individual chec…
yigress commented on code in PR #23425: URL: https://github.com/apache/flink/pull/23425#discussion_r1336512189 ## flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorFailureTest.java: ## @@ -72,7 +72,7 @@ class CheckpointCoordinatorFailureTest { /** * Tests that a failure while storing a completed checkpoint in the completed checkpoint store - * will properly fail the originating pending checkpoint and clean upt the completed checkpoint. + * will properly fail the originating pending checkpoint and clean up the completed checkpoint. Review Comment: restore the typo. :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
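The sequential-versus-parallel discard logic discussed in this thread can be sketched in a self-contained form. `Discardable` and `discardAll` are hypothetical stand-ins for Flink's state-handle types and `CompletedCheckpoint`'s discard path, not the actual API:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

public class DiscardSketch {

    // Hypothetical stand-in for a discardable state object.
    interface Discardable {
        void discard();
    }

    static void discardAll(List<Discardable> discardables, Executor ioExecutor) {
        if (ioExecutor == null) {
            // Sequential fallback: discard one handle after another.
            for (Discardable d : discardables) {
                d.discard();
            }
        } else {
            // Parallel mode: fan each discard out to the I/O executor, wait for all.
            CompletableFuture.allOf(
                            discardables.stream()
                                    .map(d -> CompletableFuture.runAsync(d::discard, ioExecutor))
                                    .toArray(CompletableFuture[]::new))
                    .join();
        }
    }
}
```

A direct executor (`Runnable::run`) degenerates to sequential execution, which is convenient for testing the parallel branch deterministically.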
[jira] [Updated] (FLINK-32471) IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH
[ https://issues.apache.org/jira/browse/FLINK-32471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-32471: --- Labels: pull-request-available stale-major (was: pull-request-available) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Major but is unassigned, and neither it nor its Sub-Tasks have been updated for 60 days. I have gone ahead and added a "stale-major" label to the issue. If this ticket is a Major, please either assign yourself or give an update. Afterwards, please remove the label, or in 7 days the issue will be deprioritized. > IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH > -- > > Key: FLINK-32471 > URL: https://issues.apache.org/jira/browse/FLINK-32471 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Reporter: grandfisher >Priority: Major > Labels: pull-request-available, stale-major > > According to FLINK-31273: > The reason for the error is that other filters conflict with IS_NULL, but in > fact they won't conflict with IS_NOT_NULL, because operators in > SUITABLE_FILTER_TO_PUSH such as 'SqlKind.GREATER_THAN' have an implicit > 'IS_NOT_NULL' filter according to SQL semantics. > > So we think it is feasible to add IS_NOT_NULL to the SUITABLE_FILTER_TO_PUSH > list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33149) Bump snappy-java to 1.1.10.4
[ https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768878#comment-17768878 ] Matthias Pohl commented on FLINK-33149: --- I think my previous investigation is not enough: It seems to be used by {{flink-avro}}, {{flink-parquet}} and {{flink-presto}}, as well. > Bump snappy-java to 1.1.10.4 > > > Key: FLINK-33149 > URL: https://issues.apache.org/jira/browse/FLINK-33149 > Project: Flink > Issue Type: Bug > Components: API / Core, Connectors / AWS, Connectors / HBase, > Connectors / Kafka, Stateful Functions >Affects Versions: 1.18.0, 1.16.3, 1.17.2 >Reporter: Ryan Skraba >Assignee: Ryan Skraba >Priority: Major > Labels: pull-request-available > > Xerial published a security alert for a Denial of Service attack that [exists > on > 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv]. > This is included in flink-dist, but also in flink-statefun, and several > connectors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33149) Bump snappy-java to 1.1.10.4
[ https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768867#comment-17768867 ] Matthias Pohl commented on FLINK-33149: --- Thanks for looking into it. I did a code investigation to see where we use snappy in flink core. Snappy was introduced for the state backend and used in [SnappyStreamCompressionDecorator.java:25-26|https://github.com/apache/flink/blob/116f297478f2d443178510565b1cd5a2f387e241/flink-runtime/src/main/java/org/apache/flink/runtime/state/SnappyStreamCompressionDecorator.java#L25]. The classes that are affected by this vulnerability ({{SnappyInputStream}} and {{SnappyOutputStream}}) are not used. Flink uses {{SnappyFramedInputStream}} and {{SnappyFramedOutputStream}}. Therefore, it's not critical and priority Major makes sense. But it's still good to have this fixed considering the alerts that might pop up in security scanners. I also did a brief analysis of a few connector implementations:
{code}
➜ workspace for c in $(ls -d flink-connector*); do echo $c; grep --include=pom.xml -Hirn snappy $c; done
flink-connector-aws
flink-connector-aws/pom.xml:254: org.xerial.snappy
flink-connector-aws/pom.xml:255: snappy-java
flink-connector-cassandra
flink-connector-elasticsearch
flink-connector-gcp-pubsub
flink-connector-hbase
flink-connector-hbase/pom.xml:245: org.xerial.snappy
flink-connector-hbase/pom.xml:246: snappy-java
flink-connector-hive
flink-connector-jdbc
flink-connector-kafka
flink-connector-kafka/pom.xml:70: 1.1.8.3
flink-connector-kafka/pom.xml:231: org.xerial.snappy
flink-connector-kafka/pom.xml:232: snappy-java
flink-connector-kafka/pom.xml:233: ${snappy-java.version}
flink-connector-mongodb
flink-connector-opensearch
flink-connector-pulsar
flink-connector-rabbitmq
flink-connector-redis-streams
{code}
Only {{flink-connector-kafka}} and {{flink-connector-aws}} have this dependency listed. 
None of them actually uses any classes from within the {{xerial}} package:
{code}
for c in $(ls -d flink-connector*); do echo $c; grep --include="*java" -Hirn xerial $c; done
flink-connector-aws
flink-connector-cassandra
flink-connector-elasticsearch
flink-connector-gcp-pubsub
flink-connector-hbase
flink-connector-hive
flink-connector-jdbc
flink-connector-kafka
flink-connector-mongodb
flink-connector-opensearch
flink-connector-pulsar
flink-connector-rabbitmq
flink-connector-redis-streams
{code}
Would it be worth removing the dependency from the connectors entirely? WDYT? > Bump snappy-java to 1.1.10.4 > > > Key: FLINK-33149 > URL: https://issues.apache.org/jira/browse/FLINK-33149 > Project: Flink > Issue Type: Bug > Components: API / Core, Connectors / AWS, Connectors / HBase, > Connectors / Kafka, Stateful Functions >Affects Versions: 1.18.0, 1.16.3, 1.17.2 >Reporter: Ryan Skraba >Assignee: Ryan Skraba >Priority: Major > Labels: pull-request-available > > Xerial published a security alert for a Denial of Service attack that [exists > on > 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv]. > This is included in flink-dist, but also in flink-statefun, and several > connectors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-connector-aws] dependabot[bot] opened a new pull request, #98: Bump org.xerial.snappy:snappy-java from 1.1.10.1 to 1.1.10.4
dependabot[bot] opened a new pull request, #98: URL: https://github.com/apache/flink-connector-aws/pull/98 Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.1 to 1.1.10.4.

Release notes (sourced from org.xerial.snappy:snappy-java's releases):

v1.1.10.4
- Security Fix: Fixed SnappyInputStream so as not to allocate too large memory when decompressing data with an extremely large chunk size, by @tunnelshade. This does not affect users only using Snappy.compress/uncompress methods.
- Features: Upgrade the internal snappy version to 1.1.10 (1.1.8 was wrongly used before) by @xerial in xerial/snappy-java#508; support JDK21 (no internal change).
- Dependency Updates: scalafmt-core to 3.7.11, 3.7.12, and 3.7.14; sbt to 1.9.3, 1.9.4, and 1.9.6; actions/checkout from 3 to 4; updated native libraries (by @xerial-bot, @dependabot, and @github-actions).
- Internal Updates: airframe-log to 23.7.4, 23.8.0, 23.8.6, 23.9.1, and 23.9.2; sbt-scalafmt to 2.5.1 and 2.5.2 (by @xerial-bot).
- Other Changes: Update NOTICE by @imsudiproy in xerial/snappy-java#492.
- Full Changelog: https://github.com/xerial/snappy-java/compare/v1.1.10.3...v1.1.10.4

v1.1.10.3
- Bug Fixes: Fix the GLIBC_2.32 not found issue of libsnappyjava.so in certain Linux distributions on s390x by @kun-lu20 in xerial/snappy-java#481.
- Dependency Updates: scalafmt-core to 3.7.10 by @xerial-bot in xerial/snappy-java#480; updated native libraries by @github-actions in xerial/snappy-java#482.
- New Contributors: @kun-lu20 made their first contribution in xerial/snappy-java#481.

... (truncated)

Commits: 9f8c3cf Merge pull request from GHSA-55g7-9cwv-5qfv; 49d7001 Update
[GitHub] [flink-statefun] dependabot[bot] opened a new pull request, #340: Bump org.xerial.snappy:snappy-java from 1.1.10.1 to 1.1.10.4 in /statefun-flink
dependabot[bot] opened a new pull request, #340: URL: https://github.com/apache/flink-statefun/pull/340 Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.1 to 1.1.10.4. (The attached release notes are identical to those in apache/flink-connector-aws#98 above.)
[jira] [Assigned] (FLINK-33149) Bump snappy-java to 1.1.10.4
[ https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl reassigned FLINK-33149: - Assignee: Ryan Skraba > Bump snappy-java to 1.1.10.4 > > > Key: FLINK-33149 > URL: https://issues.apache.org/jira/browse/FLINK-33149 > Project: Flink > Issue Type: Bug > Components: API / Core, Connectors / AWS, Connectors / HBase, > Connectors / Kafka, Stateful Functions >Affects Versions: 1.18.0, 1.16.3, 1.17.2 >Reporter: Ryan Skraba >Assignee: Ryan Skraba >Priority: Major > Labels: pull-request-available > > Xerial published a security alert for a Denial of Service attack that [exists > on > 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv]. > This is included in flink-dist, but also in flink-statefun, and several > connectors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] XComp commented on pull request #23461: [BP-1.18][FLINK-33149][build] Bump snappy to 1.1.10.4
XComp commented on PR #23461: URL: https://github.com/apache/flink/pull/23461#issuecomment-1734177907 ``` 16:26:18.071 [ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.11.0:install-node-and-npm (install node and npm) on project flink-runtime-web: Could not extract the Node archive: Could not extract archive: '/__w/2/.m2/repository/com/github/eirslett/node/16.13.2/node-16.13.2-linux-x64.tar.gz': EOFException -> [Help 1] 16:26:18.071 [ERROR] 16:26:18.071 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 16:26:18.071 [ERROR] Re-run Maven using the -X switch to enable full debug logging. ``` Azure retry initiated because of the error in compile stage. It looks like an infrastructure issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #23461: [BP-1.18][FLINK-33149][build] Bump snappy to 1.1.10.4
flinkbot commented on PR #23461: URL: https://github.com/apache/flink/pull/23461#issuecomment-1734057845 ## CI report: * 71309fb4cba1e09de71b03a2ad190137e76eab2d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #23460: [BP-1.17][FLINK-33149][build] Bump snappy to 1.1.10.4
flinkbot commented on PR #23460: URL: https://github.com/apache/flink/pull/23460#issuecomment-1734056213 ## CI report: * 3907cb98c6ec3a08e5c1c6ba86fd8f69e774e08d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #23459: [BP-1.16][FLINK-33149][build] Bump snappy to 1.1.10.4
flinkbot commented on PR #23459: URL: https://github.com/apache/flink/pull/23459#issuecomment-1734054047 ## CI report: * ea9b0a0cff9ae7889e246989b2e5d27779f9eb0e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] RyanSkraba opened a new pull request, #23461: [BP-1.18][FLINK-33149][build] Bump snappy to 1.1.10.4
RyanSkraba opened a new pull request, #23461: URL: https://github.com/apache/flink/pull/23461 ## What is the purpose of the change Backport of https://github.com/apache/flink/pull/23458 to `release-1.18` Bump the version of snappy to address a vulnerability. ## Brief change log ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **yes** - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **no** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **no** - If yes, how is the feature documented? **not applicable** -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] RyanSkraba opened a new pull request, #23460: [FLINK-33149][build] Bump snappy to 1.1.10.4
RyanSkraba opened a new pull request, #23460: URL: https://github.com/apache/flink/pull/23460 ## What is the purpose of the change Backport of https://github.com/apache/flink/pull/23458 to `release-1.17` Bump the version of snappy to address a vulnerability. ## Brief change log ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **yes** - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **no** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **no** - If yes, how is the feature documented? **not applicable** -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] RyanSkraba opened a new pull request, #23459: [BP-1.16][FLINK-33149][build] Bump snappy to 1.1.10.4
RyanSkraba opened a new pull request, #23459: URL: https://github.com/apache/flink/pull/23459 ## What is the purpose of the change Backport of https://github.com/apache/flink/pull/23458 to `release-1.16` Bump the version of snappy to address a vulnerability. ## Brief change log ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **yes** - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **no** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **no** - If yes, how is the feature documented? **not applicable** -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process
[ https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768751#comment-17768751 ] Matthias Schwalbe edited comment on FLINK-26585 at 9/25/23 3:41 PM: Thanks [~masteryhx] , ... Jira somehow swallowed my responses (me confused :)), I'll take care of it again tomorrow (likely) Thias was (Author: matthias schwalbe): Thanks [~masteryhx] , ... Jira somehow swallowed my responses (me confused :)), I'll take of it again tomorrow (likely) Thias > State Processor API: Loading a state set buffers the whole state set in > memory before starting to process > - > > Key: FLINK-26585 > URL: https://issues.apache.org/jira/browse/FLINK-26585 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 1.13.0, 1.14.0, 1.15.0 >Reporter: Matthias Schwalbe >Assignee: Matthias Schwalbe >Priority: Major > Labels: pull-request-available > Attachments: MultiStateKeyIteratorNoStreams.java > > > * When loading a state, MultiStateKeyIterator loads and buffers the whole > state in memory before it even processes a single data point > ** This is absolutely no problem for small state (hence the unit tests work > fine) > ** MultiStateKeyIterator ctor sets up a java Stream that iterates all state > descriptors and flattens all datapoints contained within > ** The java.util.stream.Stream#flatMap function causes the buffering of the > whole data set when enumerated later on > ** See call stack [1] > *** In our case this is 150e6 data points (> 1GiB just for the pointers to > the data, let alone the data itself ~30GiB) > ** I’m not aware of any instrumentation of Stream in order to avoid the > problem, hence > ** I coded an alternative implementation of MultiStateKeyIterator that > avoids using java Stream, > ** I can contribute our implementation (MultiStateKeyIteratorNoStreams) > [1] > Streams call stack: > hasNext:77, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator)
> next:82, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator) > forEachRemaining:116, Iterator (java.util) > forEachRemaining:1801, Spliterators$IteratorSpliterator (java.util) > forEach:580, ReferencePipeline$Head (java.util.stream) > accept:270, ReferencePipeline$7$1 (java.util.stream) > # Stream flatMap(final Function Stream> var1) > accept:373, ReferencePipeline$11$1 (java.util.stream) > # Stream peek(final Consumer var1) > accept:193, ReferencePipeline$3$1 (java.util.stream) > # Stream map(final Function > var1) > tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util) > lambda$initPartialTraversalState$0:294, > StreamSpliterators$WrappingSpliterator (java.util.stream) > getAsBoolean:-1, 1528195520 > (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57) > fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream) > hasNext:681, Spliterators$1Adapter (java.util) > hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input) > hasNext:162, KeyedStateReaderOperator$NamespaceDecorator > (org.apache.flink.state.api.input.operator) > reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input) > invoke:191, DataSourceTask (org.apache.flink.runtime.operators) > doRun:776, Task (org.apache.flink.runtime.taskmanager) > run:563, Task (org.apache.flink.runtime.taskmanager) > run:748, Thread (java.lang) -- This message was sent by Atlassian Jira (v8.20.10#820010)
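The buffering behavior described above is reproducible with the plain JDK, independent of Flink; a minimal sketch (class and method names are mine, not Flink's `MultiStateKeyIterator`):

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class FlatMapBuffering {

    /**
     * Builds a stream of one outer element (think: one state descriptor)
     * flatMapped to {@code innerSize} inner elements (think: that state's
     * data points), pulls exactly one element through the iterator adapter,
     * and returns how many inner elements were materialized to serve it.
     */
    public static int producedAfterFirstNext(int innerSize) {
        AtomicInteger produced = new AtomicInteger();
        Iterator<Integer> it = Stream.of(0)
                .flatMap(descriptor -> IntStream.range(0, innerSize)
                        .peek(i -> produced.incrementAndGet())
                        .boxed())
                .iterator();
        it.next(); // ask for a single element
        // The iterator's wrapping spliterator buffers the entire inner
        // stream of the current outer element before handing over the
        // first value -- with a single descriptor, that is the whole set.
        return produced.get();
    }

    public static void main(String[] args) {
        System.out.println(producedAfterFirstNext(1_000));
    }
}
```

A hand-written iterator that advances the underlying key iterators one element at a time (the approach of `MultiStateKeyIteratorNoStreams`) avoids this, because nothing in the chain requires a buffered flatMap.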
[jira] [Commented] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process
[ https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768751#comment-17768751 ] Matthias Schwalbe commented on FLINK-26585: --- Thanks [~masteryhx] , ... Jira somehow swallowed my responses (me confused :)), I'll take care of it again tomorrow (likely) Thias > State Processor API: Loading a state set buffers the whole state set in > memory before starting to process > - > > Key: FLINK-26585 > URL: https://issues.apache.org/jira/browse/FLINK-26585 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 1.13.0, 1.14.0, 1.15.0 >Reporter: Matthias Schwalbe >Assignee: Matthias Schwalbe >Priority: Major > Labels: pull-request-available > Attachments: MultiStateKeyIteratorNoStreams.java > > > * When loading a state, MultiStateKeyIterator loads and buffers the whole > state in memory before it even processes a single data point > ** This is absolutely no problem for small state (hence the unit tests work > fine) > ** MultiStateKeyIterator ctor sets up a java Stream that iterates all state > descriptors and flattens all datapoints contained within > ** The java.util.stream.Stream#flatMap function causes the buffering of the > whole data set when enumerated later on > ** See call stack [1] > *** In our case this is 150e6 data points (> 1GiB just for the pointers to > the data, let alone the data itself ~30GiB) > ** I’m not aware of any instrumentation of Stream in order to avoid the > problem, hence > ** I coded an alternative implementation of MultiStateKeyIterator that > avoids using java Stream, > ** I can contribute our implementation (MultiStateKeyIteratorNoStreams) > [1] > Streams call stack: > hasNext:77, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator) > next:82, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator) > forEachRemaining:116, Iterator (java.util) > forEachRemaining:1801, 
Spliterators$IteratorSpliterator (java.util) > forEach:580, ReferencePipeline$Head (java.util.stream) > accept:270, ReferencePipeline$7$1 (java.util.stream) > # Stream flatMap(final Function Stream> var1) > accept:373, ReferencePipeline$11$1 (java.util.stream) > # Stream peek(final Consumer var1) > accept:193, ReferencePipeline$3$1 (java.util.stream) > # Stream map(final Function > var1) > tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util) > lambda$initPartialTraversalState$0:294, > StreamSpliterators$WrappingSpliterator (java.util.stream) > getAsBoolean:-1, 1528195520 > (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57) > fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream) > hasNext:681, Spliterators$1Adapter (java.util) > hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input) > hasNext:162, KeyedStateReaderOperator$NamespaceDecorator > (org.apache.flink.state.api.input.operator) > reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input) > invoke:191, DataSourceTask (org.apache.flink.runtime.operators) > doRun:776, Task (org.apache.flink.runtime.taskmanager) > run:563, Task (org.apache.flink.runtime.taskmanager) > run:748, Thread (java.lang) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33150) add the processing logic for the long type
[ https://issues.apache.org/jira/browse/FLINK-33150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser updated FLINK-33150: --- Fix Version/s: (was: 1.15.4) > add the processing logic for the long type > --- > > Key: FLINK-33150 > URL: https://issues.apache.org/jira/browse/FLINK-33150 > Project: Flink > Issue Type: New Feature > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.15.4 >Reporter: wenhao.yu >Priority: Minor > > The AvroToRowDataConverters class has a convertToDate method that reports > an error when it encounters time data represented as a long, so handling > for the long type should be added. -- This message was sent by Atlassian Jira (v8.20.10#820010)
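For context, a hedged sketch of what such handling could look like. This is not the actual AvroToRowDataConverters code; the class name and the interpretation of the long as epoch milliseconds are assumptions for illustration (Avro's `date` logical type itself is an int of days since the epoch):

```java
import java.time.LocalDate;

public class DateConverterSketch {

    private static final long MILLIS_PER_DAY = 86_400_000L;

    /**
     * Converts an Avro-decoded date value to a LocalDate.
     * Integer: Avro 'date' logical type, days since 1970-01-01.
     * Long: assumed here to be epoch milliseconds, as written by some
     * producers (the case the issue describes failing).
     */
    public static LocalDate convertToDate(Object avroObject) {
        if (avroObject instanceof Integer) {
            return LocalDate.ofEpochDay((Integer) avroObject);
        } else if (avroObject instanceof Long) {
            return LocalDate.ofEpochDay((Long) avroObject / MILLIS_PER_DAY);
        }
        throw new IllegalArgumentException(
                "Unexpected object type for DATE: " + avroObject.getClass());
    }
}
```

For example, `convertToDate(0)` yields 1970-01-01 for both encodings, while `convertToDate(86_400_000L)` yields 1970-01-02.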
[GitHub] [flink] jgagnon1 commented on pull request #23453: [FLINK-33128] Add converter.open() method call on TestValuesRuntimeFunctions
jgagnon1 commented on PR #23453: URL: https://github.com/apache/flink/pull/23453#issuecomment-1733850068 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] snuyanzin commented on pull request #23458: [FLINK-33149][build] Bump snappy to 1.1.10.4
snuyanzin commented on PR #23458: URL: https://github.com/apache/flink/pull/23458#issuecomment-1733693205 @RyanSkraba thanks for the contribution, looks ok from my side. Could you please also provide backports to 1.16.x, 1.17.x, 1.18.x? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
[ https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768679#comment-17768679 ] Yang Wang commented on FLINK-33155: --- Thanks for your comments. > Changing the default behavior from file to UGI can be a breaking change to > users which are depending on that some way What I mean is to get the delegation token from UGI instead of reading from file, just like we have already done in the {{{}YarnClusterDescriptor{}}}[1]. I am not sure why this will be a breaking change because the tokens in the {{ContainerLaunchContext}} are just same. [1]. [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L1334] > Flink ResourceManager continuously fails to start TM container on YARN when > Kerberos enabled > > > Key: FLINK-33155 > URL: https://issues.apache.org/jira/browse/FLINK-33155 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Reporter: Yang Wang >Priority: Major > > When Kerberos enabled(with key tab) and after one day(the container token > expired), Flink fails to create the TaskManager container on YARN due to the > following exception. > > {code:java} > 2023-09-25 16:48:50,030 INFO > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: > Container container_1695106898104_0003_01_69 was invalid. 
Diagnostics: > [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN > owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, > renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, > sequenceNumber=12, masterKeyId=3) can't be found in cache > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, > realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, > masterKeyId=3) can't be found in cache > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545) > at org.apache.hadoop.ipc.Client.call(Client.java:1491) > at org.apache.hadoop.ipc.Client.call(Client.java:1388) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) > at 
com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) > at > org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411) > at >
[jira] [Commented] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions
[ https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768680#comment-17768680 ] Martijn Visser commented on FLINK-28303: [~tanjialiang] It would be good to double check this in the latest Kafka connector code, perhaps it's already addressed since the work was done on https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source > Kafka SQL Connector loses data when restoring from a savepoint with a topic > with empty partitions > - > > Key: FLINK-28303 > URL: https://issues.apache.org/jira/browse/FLINK-28303 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.4 >Reporter: Robert Metzger >Priority: Major > > Steps to reproduce: > - Set up a Kafka topic with 10 partitions > - produce records 0-9 into the topic > - take a savepoint and stop the job > - produce records 10-19 into the topic > - restore the job from the savepoint. > The job will be missing usually 2-4 records from 10-19. > My assumption is that if a partition never had data (which is likely with 10 > partitions and 10 records), the savepoint will only contain offsets for > partitions with data. > While the job was offline (and we've written record 10-19 into the topic), > all partitions got filled. Now, when Kafka comes online again, it will use > the "latest" offset for those partitions, skipping some data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
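The failure mode described in the issue can be modeled without Kafka at all; a toy sketch (all names hypothetical, not connector code) of why falling back to "latest" for partitions missing from the savepoint skips the records written while the job was down:

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetRestoreSketch {

    /**
     * Returns the offset each partition starts reading from after restore:
     * the saved offset if the savepoint contains one, otherwise the
     * partition's current end offset ("latest") -- which silently skips
     * everything produced to that partition while the job was offline.
     */
    public static Map<Integer, Long> restoreStartOffsets(
            Map<Integer, Long> savedOffsets,
            Map<Integer, Long> endOffsetsAtRestore) {
        Map<Integer, Long> start = new HashMap<>();
        for (Map.Entry<Integer, Long> e : endOffsetsAtRestore.entrySet()) {
            // Partitions that were empty at savepoint time have no saved
            // offset, so they fall back to the end offset.
            start.put(e.getKey(), savedOffsets.getOrDefault(e.getKey(), e.getValue()));
        }
        return start;
    }
}
```

If partition 3 was empty at savepoint time and two records arrive while the job is down, the restore starts partition 3 at offset 2 and both records are lost; starting missing partitions at "earliest" (or committed offsets) instead avoids the gap.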
[jira] [Comment Edited] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
[ https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768665#comment-17768665 ] Gabor Somogyi edited comment on FLINK-33155 at 9/25/23 12:20 PM: - Not updating UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION is a known limitation of YARN. If the mentioned code runs on the JM side and delegation tokens are enabled then it makes sense since the JM keeps it's tokens up-to-date all the time. Couple of notes: * Changing the default behavior from file to UGI can be a breaking change to users which are depending on that some way * DT handling is a single threaded operation but as I see TM creation uses multiple threads which may end-up in undefined behavior was (Author: gaborgsomogyi): Not updating UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION is a known limitation of YARN. If the mentioned code runs on the JM side and delegation tokens are enabled then it makes sense since the JM keeps it's tokens up-to-date all the time. Couple of notes: * Changing the default behavior from file to UGI can be a breaking change to users which are depending on that some way... > Flink ResourceManager continuously fails to start TM container on YARN when > Kerberos enabled > > > Key: FLINK-33155 > URL: https://issues.apache.org/jira/browse/FLINK-33155 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Reporter: Yang Wang >Priority: Major > > When Kerberos enabled(with key tab) and after one day(the container token > expired), Flink fails to create the TaskManager container on YARN due to the > following exception. > > {code:java} > 2023-09-25 16:48:50,030 INFO > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: > Container container_1695106898104_0003_01_69 was invalid. 
Diagnostics: > [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN > owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, > renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, > sequenceNumber=12, masterKeyId=3) can't be found in cache > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, > realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, > masterKeyId=3) can't be found in cache > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545) > at org.apache.hadoop.ipc.Client.call(Client.java:1491) > at org.apache.hadoop.ipc.Client.call(Client.java:1388) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) > at 
com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) > at > org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414) > at
[jira] [Commented] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
[ https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768665#comment-17768665 ] Gabor Somogyi commented on FLINK-33155: --- Not updating UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION is a known limitation of YARN. If the mentioned code runs on the JM side and delegation tokens are enabled then it makes sense since the JM keeps it's tokens up-to-date all the time. Couple of notes: * Changing the default behavior from file to UGI can be a breaking change to users which are depending on that some way... > Flink ResourceManager continuously fails to start TM container on YARN when > Kerberos enabled > > > Key: FLINK-33155 > URL: https://issues.apache.org/jira/browse/FLINK-33155 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Reporter: Yang Wang >Priority: Major > > When Kerberos enabled(with key tab) and after one day(the container token > expired), Flink fails to create the TaskManager container on YARN due to the > following exception. > > {code:java} > 2023-09-25 16:48:50,030 INFO > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: > Container container_1695106898104_0003_01_69 was invalid. 
Diagnostics: > [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN > owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, > renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, > sequenceNumber=12, masterKeyId=3) can't be found in cache > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, > realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, > masterKeyId=3) can't be found in cache > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545) > at org.apache.hadoop.ipc.Client.call(Client.java:1491) > at org.apache.hadoop.ipc.Client.call(Client.java:1388) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) > at 
com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) > at > org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243) > at >
[jira] [Updated] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
[ https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-33155: -- Description: When Kerberos enabled(with key tab) and after one day(the container token expired), Flink fails to create the TaskManager container on YARN due to the following exception. {code:java} 2023-09-25 16:48:50,030 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: Container container_1695106898104_0003_01_69 was invalid. Diagnostics: [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in cache org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in cache at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545) at org.apache.hadoop.ipc.Client.call(Client.java:1491) at org.apache.hadoop.ipc.Client.call(Client.java:1388) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) {code} The root cause might be that we are reading the delegation token from JM local file[1]. It will expire after one day. When the old TaskManager container crashes and ResourceManager tries to create a new one, the YARN NodeManager will use the expired token to localize the resources for TaskManager and then fail. Instead, we could read the latest valid token from
[jira] [Updated] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
[ https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-33155: -- Description: When Kerberos enabled(with key tab) and after one day(the container token expired), Flink fails to create the TaskManager container on YARN due to the following exception. {code:java} 2023-09-25 16:48:50,030 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: Container container_1695106898104_0003_01_69 was invalid. Diagnostics: [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in cache org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in cache at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545) at org.apache.hadoop.ipc.Client.call(Client.java:1491) at org.apache.hadoop.ipc.Client.call(Client.java:1388) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) {code} The root cause might be that we are reading the delegation token from JM local file[1]. It will expire after one day. When the old TaskManager container crashes and ResourceManager tries to create a new one, the YARN NodeManager will use the expired token to localize the resources for TaskManager and then fail. [1].
[jira] [Updated] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
[ https://issues.apache.org/jira/browse/FLINK-33155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-33155: -- Component/s: Deployment / YARN > Flink ResourceManager continuously fails to start TM container on YARN when > Kerberos enabled > > > Key: FLINK-33155 > URL: https://issues.apache.org/jira/browse/FLINK-33155 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Reporter: Yang Wang >Priority: Major > > When Kerberos enabled(with key tab) and after one day(the container token > expired), Flink fails to create the TaskManager container on YARN due to the > following exception. > > {code:java} > 2023-09-25 16:48:50,030 INFO > org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - > Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: > Container container_1695106898104_0003_01_69 was invalid. Diagnostics: > [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN > owner=, renewer=, realUser=, issueDate=1695196431487, > maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in > cacheorg.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (token for hadoop: HDFS_DELEGATION_TOKEN > owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, > renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, > sequenceNumber=12, masterKeyId=3) can't be found in cacheat > org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)at > org.apache.hadoop.ipc.Client.call(Client.java:1491)at > org.apache.hadoop.ipc.Client.call(Client.java:1388)at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498)at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) > at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)at > org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) > at > org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)at > org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)at > java.security.AccessController.doPrivileged(Native Method)at > javax.security.auth.Subject.doAs(Subject.java:422)at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at 
org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224) > at java.util.concurrent.FutureTask.run(FutureTask.java:266)at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)at >
[jira] [Created] (FLINK-33155) Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled
Yang Wang created FLINK-33155: - Summary: Flink ResourceManager continuously fails to start TM container on YARN when Kerberos enabled Key: FLINK-33155 URL: https://issues.apache.org/jira/browse/FLINK-33155 Project: Flink Issue Type: Bug Reporter: Yang Wang When Kerberos enabled(with key tab) and after one day(the container token expired), Flink fails to create the TaskManager container on YARN due to the following exception. {code:java} 2023-09-25 16:48:50,030 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Worker container_1695106898104_0003_01_69 is terminated. Diagnostics: Container container_1695106898104_0003_01_69 was invalid. Diagnostics: [2023-09-25 16:48:45.710]token (token for hadoop: HDFS_DELEGATION_TOKEN owner=, renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in cacheorg.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for hadoop: HDFS_DELEGATION_TOKEN owner=hadoop/master-1-1.c-5ee7bdc598b6e1cc.cn-beijing.emr.aliyuncs@emr.c-5ee7bdc598b6e1cc.com, renewer=, realUser=, issueDate=1695196431487, maxDate=1695801231487, sequenceNumber=12, masterKeyId=3) can't be found in cacheat org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1545)at org.apache.hadoop.ipc.Client.call(Client.java:1491)at org.apache.hadoop.ipc.Client.call(Client.java:1388)at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:907) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498)at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1666)at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1573) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1588) at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)at java.security.AccessController.doPrivileged(Native Method)at javax.security.auth.Subject.doAs(Subject.java:422)at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236) at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224) at java.util.concurrent.FutureTask.run(FutureTask.java:266)at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)at java.util.concurrent.FutureTask.run(FutureTask.java:266)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) {code} The root cause might be that we are reading the delegation token from JM local file[1]. It will expire after one day. When the old TaskManager container crashes and ResourceManager tries to create a new one,
[jira] [Commented] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions
[ https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768637#comment-17768637 ] tanjialiang commented on FLINK-28303: - Maybe this is the reason? [FLINK-33153|https://issues.apache.org/jira/browse/FLINK-33153] > Kafka SQL Connector loses data when restoring from a savepoint with a topic > with empty partitions > - > > Key: FLINK-28303 > URL: https://issues.apache.org/jira/browse/FLINK-28303 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.4 >Reporter: Robert Metzger >Priority: Major > > Steps to reproduce: > - Set up a Kafka topic with 10 partitions > - produce records 0-9 into the topic > - take a savepoint and stop the job > - produce records 10-19 into the topic > - restore the job from the savepoint. > The job will be missing usually 2-4 records from 10-19. > My assumption is that if a partition never had data (which is likely with 10 > partitions and 10 records), the savepoint will only contain offsets for > partitions with data. > While the job was offline (and we've written record 10-19 into the topic), > all partitions got filled. Now, when Kafka comes online again, it will use > the "latest" offset for those partitions, skipping some data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
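The restore behavior suspected in the assumption above can be sketched as follows. This is a hypothetical model, not the actual connector code (`OffsetRestoreSketch` and `resolveStartOffsets` are illustrative names): partitions absent from the savepoint fall back to their latest offset, so records produced while the job was offline are skipped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the suspected restore behavior: a savepoint only
// contains offsets for partitions that had data, and partitions missing
// from it are resumed from "latest", skipping records written meanwhile.
public class OffsetRestoreSketch {

    // savedOffsets: offsets captured in the savepoint (only partitions that had data).
    // latestOffsets: current end offsets of every partition in the topic.
    static Map<Integer, Long> resolveStartOffsets(
            Map<Integer, Long> savedOffsets, Map<Integer, Long> latestOffsets) {
        Map<Integer, Long> start = new HashMap<>();
        for (Map.Entry<Integer, Long> e : latestOffsets.entrySet()) {
            // A partition absent from the savepoint falls back to its latest
            // offset, so anything produced while the job was stopped is skipped.
            start.put(e.getKey(), savedOffsets.getOrDefault(e.getKey(), e.getValue()));
        }
        return start;
    }
}
```

With 10 partitions and 10 records, several partitions are likely empty at savepoint time, which matches the observed loss of 2-4 records after restore.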
[jira] [Created] (FLINK-33154) flink on k8s,An error occurred during consuming rocketmq
Monody created FLINK-33154: -- Summary: flink on k8s,An error occurred during consuming rocketmq Key: FLINK-33154 URL: https://issues.apache.org/jira/browse/FLINK-33154 Project: Flink Issue Type: Technical Debt Components: Kubernetes Operator Affects Versions: kubernetes-operator-1.1.0 Environment: flink-kubernetes-operator:https://github.com/apache/flink-kubernetes-operator#current-api-version-v1beta1 rocketmq-flink:https://github.com/apache/rocketmq-flink Reporter: Monody The following error occurs when Flink consumes RocketMQ. The Flink job is running on k8s, using the projects listed in the Environment field. The same job runs normally on YARN, and no abnormality is found on the RocketMQ server. Why does this happen, and how can it be solved? !https://user-images.githubusercontent.com/47728686/265662530-231c500c-fd64-4679-9b0f-ff4a025dd766.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33153) Kafka using latest-offset maybe missing data
[ https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768631#comment-17768631 ] tanjialiang commented on FLINK-33153: - It relates to https://issues.apache.org/jira/browse/FLINK-28303, so I am closing this issue. > Kafka using latest-offset maybe missing data > > > Key: FLINK-33153 > URL: https://issues.apache.org/jira/browse/FLINK-33153 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: kafka-4.1.0 >Reporter: tanjialiang >Priority: Minor > > When Kafka start with the latest-offset strategy, it does not fetch the > latest snapshot offset and specify it for consumption. Instead, it sets the > startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET, which makes > currentOffset = -1, and call the KafkaConsumer's seekToEnd API). The > currentOffset is only set to the consumed offset + 1 when the task consumes > data, and this currentOffset is stored in the state during checkpointing. If > there are very few messages in Kafka and a partition has not consumed any > data, and I stop the task with a savepoint, then write data to that > partition, and start the task with the savepoint, the task will resume from > the saved state. Due to the startingOffset in the state being -1, it will > cause the task to miss the data that was written before the recovery point. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33153) Kafka using latest-offset maybe missing data
[ https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanjialiang updated FLINK-33153: External issue URL: (was: https://issues.apache.org/jira/browse/FLINK-28303) > Kafka using latest-offset maybe missing data > > > Key: FLINK-33153 > URL: https://issues.apache.org/jira/browse/FLINK-33153 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: kafka-4.1.0 >Reporter: tanjialiang >Priority: Minor > > When Kafka start with the latest-offset strategy, it does not fetch the > latest snapshot offset and specify it for consumption. Instead, it sets the > startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET, which makes > currentOffset = -1, and call the KafkaConsumer's seekToEnd API). The > currentOffset is only set to the consumed offset + 1 when the task consumes > data, and this currentOffset is stored in the state during checkpointing. If > there are very few messages in Kafka and a partition has not consumed any > data, and I stop the task with a savepoint, then write data to that > partition, and start the task with the savepoint, the task will resume from > the saved state. Due to the startingOffset in the state being -1, it will > cause the task to miss the data that was written before the recovery point. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33153) Kafka using latest-offset maybe missing data
[ https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanjialiang updated FLINK-33153: External issue URL: https://issues.apache.org/jira/browse/FLINK-28303 Release Note: (was: realte to https://issues.apache.org/jira/browse/FLINK-28303) > Kafka using latest-offset maybe missing data > > > Key: FLINK-33153 > URL: https://issues.apache.org/jira/browse/FLINK-33153 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: kafka-4.1.0 >Reporter: tanjialiang >Priority: Minor > > When Kafka start with the latest-offset strategy, it does not fetch the > latest snapshot offset and specify it for consumption. Instead, it sets the > startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET, which makes > currentOffset = -1, and call the KafkaConsumer's seekToEnd API). The > currentOffset is only set to the consumed offset + 1 when the task consumes > data, and this currentOffset is stored in the state during checkpointing. If > there are very few messages in Kafka and a partition has not consumed any > data, and I stop the task with a savepoint, then write data to that > partition, and start the task with the savepoint, the task will resume from > the saved state. Due to the startingOffset in the state being -1, it will > cause the task to miss the data that was written before the recovery point. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33153) Kafka using latest-offset maybe missing data
[ https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tanjialiang closed FLINK-33153. --- Release Note: realte to https://issues.apache.org/jira/browse/FLINK-28303 Resolution: Duplicate > Kafka using latest-offset maybe missing data > > > Key: FLINK-33153 > URL: https://issues.apache.org/jira/browse/FLINK-33153 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: kafka-4.1.0 >Reporter: tanjialiang >Priority: Minor > > When Kafka start with the latest-offset strategy, it does not fetch the > latest snapshot offset and specify it for consumption. Instead, it sets the > startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET, which makes > currentOffset = -1, and call the KafkaConsumer's seekToEnd API). The > currentOffset is only set to the consumed offset + 1 when the task consumes > data, and this currentOffset is stored in the state during checkpointing. If > there are very few messages in Kafka and a partition has not consumed any > data, and I stop the task with a savepoint, then write data to that > partition, and start the task with the savepoint, the task will resume from > the saved state. Due to the startingOffset in the state being -1, it will > cause the task to miss the data that was written before the recovery point. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] FangYongs commented on a diff in pull request #23455: [FLINK-25015][Table SQL/Client] change job name to sql when submitting queries using client
FangYongs commented on code in PR #23455: URL: https://github.com/apache/flink/pull/23455#discussion_r1335703134 ## flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java: ## @@ -1050,10 +1052,15 @@ private TableResultInternal executeInternal( } private TableResultInternal executeQueryOperation(QueryOperation operation) { +String querySql = null; +if (operation instanceof QuerySqlOperation) { +querySql = ((QuerySqlOperation) operation).getQuerySql(); +} CollectModifyOperation sinkOperation = new CollectModifyOperation(operation); List> transformations = translate(Collections.singletonList(sinkOperation)); -final String defaultJobName = "collect"; +final String defaultJobName = +StringUtils.isNullOrWhitespaceOnly(querySql) ? "collect" : querySql; Review Comment: Add unit tests to check that the job name is modified -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] FangYongs commented on a diff in pull request #23455: [FLINK-25015][Table SQL/Client] change job name to sql when submitting queries using client
FangYongs commented on code in PR #23455: URL: https://github.com/apache/flink/pull/23455#discussion_r1335701445 ## flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java: ## @@ -1050,10 +1052,15 @@ private TableResultInternal executeInternal( } private TableResultInternal executeQueryOperation(QueryOperation operation) { +String querySql = null; +if (operation instanceof QuerySqlOperation) { +querySql = ((QuerySqlOperation) operation).getQuerySql(); +} CollectModifyOperation sinkOperation = new CollectModifyOperation(operation); List> transformations = translate(Collections.singletonList(sinkOperation)); -final String defaultJobName = "collect"; +final String defaultJobName = Review Comment: Job name will be used in the checkpoint path, is it ok when there're some special characters in the sql statement? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
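One way to address the reviewer's concern about special characters ending up in checkpoint paths would be to normalize the SQL text before using it as a job name. This is a hypothetical sketch, not code from the PR (`JobNameSanitizer` and `toJobName` are made-up names): it collapses whitespace and caps the length, falling back to the previous default when no SQL text is available.

```java
// Hypothetical helper illustrating one way to make a SQL statement safe
// to use as a job name: collapse whitespace and cap the length.
public class JobNameSanitizer {

    static String toJobName(String querySql, int maxLength) {
        if (querySql == null || querySql.trim().isEmpty()) {
            return "collect"; // fall back to the previous default job name
        }
        // Collapse newlines, tabs, and runs of spaces into single spaces.
        String normalized = querySql.trim().replaceAll("\\s+", " ");
        // Truncate long statements so the job name stays path-friendly.
        return normalized.length() <= maxLength
                ? normalized
                : normalized.substring(0, maxLength) + "...";
    }
}
```

A stricter variant could also strip characters that are unsafe in file-system paths, which is the case the review comment raises.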
[jira] [Created] (FLINK-33153) Kafka using latest-offset maybe missing data
tanjialiang created FLINK-33153: --- Summary: Kafka using latest-offset maybe missing data Key: FLINK-33153 URL: https://issues.apache.org/jira/browse/FLINK-33153 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: kafka-4.1.0 Reporter: tanjialiang When Kafka starts with the latest-offset strategy, it does not fetch the latest snapshot offset and specify it for consumption. Instead, it sets the startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which makes currentOffset = -1 and calls the KafkaConsumer's seekToEnd API. The currentOffset is only set to the consumed offset + 1 when the task consumes data, and this currentOffset is stored in the state during checkpointing. If there are very few messages in Kafka and a partition has not consumed any data, and I stop the task with a savepoint, then write data to that partition, and start the task with the savepoint, the task will resume from the saved state. Because the startingOffset in the state is still -1, the task will miss the data that was written before the recovery point. -- This message was sent by Atlassian Jira (v8.20.10#820010)
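The reported mechanism can be modeled with a small sketch. This is illustrative only and not the actual KafkaPartitionSplit API (`LatestOffsetSketch` and its methods are made-up names): the -1 sentinel survives checkpointing for a partition that never consumed data, so on restore the consumer seeks to the end again and jumps past anything written while the job was stopped.

```java
// Hypothetical model of the reported bug: an idle partition checkpoints the
// LATEST_OFFSET sentinel instead of a concrete offset, so restoring from
// that state seeks to the end again and skips records produced in between.
public class LatestOffsetSketch {
    static final long LATEST_OFFSET = -1L; // sentinel meaning "seekToEnd"

    // The offset written into the checkpoint: for a partition that never
    // consumed a record, currentOffset was never advanced past the sentinel.
    static long checkpointedOffset(long currentOffset) {
        return currentOffset; // stays -1 for an idle partition
    }

    // On restore, a sentinel is interpreted as "latest" rather than as a
    // concrete resume position, so the gap written while stopped is lost.
    static long restoreStartOffset(long checkpointed, long endOffsetAtRestore) {
        return checkpointed == LATEST_OFFSET ? endOffsetAtRestore : checkpointed;
    }
}
```

Capturing the concrete end offset at startup (instead of storing the sentinel) would make the restore position well-defined, which is the fix direction the description implies.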
[jira] [Commented] (FLINK-33127) HeapKeyedStateBackend: use buffered I/O to speed up local recovery
[ https://issues.apache.org/jira/browse/FLINK-33127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768610#comment-17768610 ] Hangxiang Yu commented on FLINK-33127: -- Actually, I did a second review last month but have not received your response until now. Of course, I'm fine with focusing on FLINK-26585 first. Just kindly pinging about the duplicated ticket. > HeapKeyedStateBackend: use buffered I/O to speed up local recovery > -- > > Key: FLINK-33127 > URL: https://issues.apache.org/jira/browse/FLINK-33127 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: Yangyang ZHANG >Assignee: Yangyang ZHANG >Priority: Major > Attachments: thread_dump.png > > > Recently, I observed a slow restore case in local recovery using hashmap > statebackend. > It took 147 seconds to restore from a 467MB snapshot, 9 times slower than > that (16s) when restore from remote fs. > The thread dump show that It read local snapshot file directly by unbuffered > FileInputStream / fs.local.LocalDataInputStream. > !thread_dump.png! > Maybe we can wrap with BufferInputStream to speed up local recovery. -- This message was sent by Atlassian Jira (v8.20.10#820010)
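The proposed fix amounts to wrapping the raw stream before reading. A minimal sketch, assuming a byte-at-a-time restore loop (`readAll` is an illustrative helper, not Flink code): with `BufferedInputStream`, each small `read()` is served from an in-memory buffer instead of issuing a syscall per byte, which is what makes the unbuffered local restore so slow.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the proposed fix: wrap the raw (unbuffered) local snapshot
// stream in a BufferedInputStream so fine-grained reads during restore
// hit an in-memory buffer instead of the file system on every call.
public class BufferedRestoreSketch {

    static byte[] readAll(InputStream raw) throws IOException {
        // 64 KiB buffer; most read() calls below never reach the OS.
        try (InputStream in = new BufferedInputStream(raw, 64 * 1024);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            int b;
            while ((b = in.read()) != -1) { // byte-at-a-time, as a restore loop might do
                out.write(b);
            }
            return out.toByteArray();
        }
    }
}
```

The wrapping changes no semantics, only the granularity of the underlying I/O, which matches the 147s-vs-16s gap observed in the thread dump.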
[jira] [Commented] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process
[ https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768607#comment-17768607 ] Hangxiang Yu commented on FLINK-26585: -- [~Matthias Schwalbe] I took a second review pass last month but have not received a response from you. You can check my newest comments from last month. > State Processor API: Loading a state set buffers the whole state set in > memory before starting to process > - > > Key: FLINK-26585 > URL: https://issues.apache.org/jira/browse/FLINK-26585 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 1.13.0, 1.14.0, 1.15.0 >Reporter: Matthias Schwalbe >Assignee: Matthias Schwalbe >Priority: Major > Labels: pull-request-available > Attachments: MultiStateKeyIteratorNoStreams.java > > > * When loading a state, MultiStateKeyIterator load and bufferes the whole > state in memory before it event processes a single data point > ** This is absolutely no problem for small state (hence the unit tests work > fine) > ** MultiStateKeyIterator ctor sets up a java Stream that iterates all state > descriptors and flattens all datapoints contained within > ** The java.util.stream.Stream#flatMap function causes the buffering of the > whole data set when enumerated later on > ** See call stack [1] > *** I our case this is 150e6 data points (> 1GiB just for the pointers to > the data, let alone the data itself ~30GiB) > ** I’m not aware of some instrumentation of Stream in order to avoid the > problem, hence > ** I coded an alternative implementation of MultiStateKeyIterator that > avoids using java Stream, > ** I can contribute our implementation (MultiStateKeyIteratorNoStreams) > [1] > Streams call stack: > hasNext:77, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator) > next:82, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator) > forEachRemaining:116, Iterator (java.util) > forEachRemaining:1801, 
Spliterators$IteratorSpliterator (java.util) > forEach:580, ReferencePipeline$Head (java.util.stream) > accept:270, ReferencePipeline$7$1 (java.util.stream) > # Stream flatMap(final Function Stream> var1) > accept:373, ReferencePipeline$11$1 (java.util.stream) > # Stream peek(final Consumer var1) > accept:193, ReferencePipeline$3$1 (java.util.stream) > # Stream map(final Function > var1) > tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util) > lambda$initPartialTraversalState$0:294, > StreamSpliterators$WrappingSpliterator (java.util.stream) > getAsBoolean:-1, 1528195520 > (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57) > fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream) > hasNext:681, Spliterators$1Adapter (java.util) > hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input) > hasNext:162, KeyedStateReaderOperator$NamespaceDecorator > (org.apache.flink.state.api.input.operator) > reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input) > invoke:191, DataSourceTask (org.apache.flink.runtime.operators) > doRun:776, Task (org.apache.flink.runtime.taskmanager) > run:563, Task (org.apache.flink.runtime.taskmanager) > run:748, Thread (java.lang) -- This message was sent by Atlassian Jira (v8.20.10#820010)
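The buffering shown in the call stack above can be reproduced with a small standalone demo (illustrative code, not Flink's): when a pipeline containing `flatMap` is consumed through `Stream.iterator()`, the wrapping spliterator fills a buffer on its first advance, so pulling a single element forces the entire inner stream through `peek`. Behavior as observed on JDK 8; the class and method names below are made up for the demo.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Demonstrates the flatMap buffering behind FLINK-26585: consuming a
// flatMap pipeline via iterator() drains the whole inner stream into a
// buffer on the first advance, instead of yielding elements lazily.
public class FlatMapBufferingDemo {

    // Returns how many inner elements were produced after pulling just ONE element.
    static int producedAfterFirstNext(int innerSize) {
        AtomicInteger produced = new AtomicInteger();
        Iterator<Integer> it = Stream.of(0)
                .flatMap(x -> IntStream.range(0, innerSize)
                        .boxed()
                        .peek(i -> produced.incrementAndGet())) // count enumeration
                .iterator();
        it.next(); // pull a single element through the pipeline
        return produced.get(); // the entire inner stream was already enumerated
    }
}
```

With one state descriptor per outer element, the inner stream is the whole state set, which is why the iterator-based rewrite (MultiStateKeyIteratorNoStreams) avoids `Stream` entirely.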
[jira] [Commented] (FLINK-26585) State Processor API: Loading a state set buffers the whole state set in memory before starting to process
[ https://issues.apache.org/jira/browse/FLINK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768604#comment-17768604 ] Matthias Schwalbe commented on FLINK-26585: --- Hi [~masteryhx] , The PR [#23239|https://github.com/apache/flink/pull/23239] seems to be stuck for a while now, awaiting second review. May I kindly ask you to take care of that. I believe I answered all open questions... Sincere greeting Thias > State Processor API: Loading a state set buffers the whole state set in > memory before starting to process > - > > Key: FLINK-26585 > URL: https://issues.apache.org/jira/browse/FLINK-26585 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 1.13.0, 1.14.0, 1.15.0 >Reporter: Matthias Schwalbe >Assignee: Matthias Schwalbe >Priority: Major > Labels: pull-request-available > Attachments: MultiStateKeyIteratorNoStreams.java > > > * When loading a state, MultiStateKeyIterator load and bufferes the whole > state in memory before it event processes a single data point > ** This is absolutely no problem for small state (hence the unit tests work > fine) > ** MultiStateKeyIterator ctor sets up a java Stream that iterates all state > descriptors and flattens all datapoints contained within > ** The java.util.stream.Stream#flatMap function causes the buffering of the > whole data set when enumerated later on > ** See call stack [1] > *** I our case this is 150e6 data points (> 1GiB just for the pointers to > the data, let alone the data itself ~30GiB) > ** I’m not aware of some instrumentation of Stream in order to avoid the > problem, hence > ** I coded an alternative implementation of MultiStateKeyIterator that > avoids using java Stream, > ** I can contribute our implementation (MultiStateKeyIteratorNoStreams) > [1] > Streams call stack: > hasNext:77, RocksStateKeysIterator > (org.apache.flink.contrib.streaming.state.iterator) > next:82, RocksStateKeysIterator > 
(org.apache.flink.contrib.streaming.state.iterator) > forEachRemaining:116, Iterator (java.util) > forEachRemaining:1801, Spliterators$IteratorSpliterator (java.util) > forEach:580, ReferencePipeline$Head (java.util.stream) > accept:270, ReferencePipeline$7$1 (java.util.stream) > # Stream flatMap(final Function Stream> var1) > accept:373, ReferencePipeline$11$1 (java.util.stream) > # Stream peek(final Consumer var1) > accept:193, ReferencePipeline$3$1 (java.util.stream) > # Stream map(final Function > var1) > tryAdvance:1359, ArrayList$ArrayListSpliterator (java.util) > lambda$initPartialTraversalState$0:294, > StreamSpliterators$WrappingSpliterator (java.util.stream) > getAsBoolean:-1, 1528195520 > (java.util.stream.StreamSpliterators$WrappingSpliterator$$Lambda$57) > fillBuffer:206, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > doAdvance:161, StreamSpliterators$AbstractWrappingSpliterator > (java.util.stream) > tryAdvance:300, StreamSpliterators$WrappingSpliterator (java.util.stream) > hasNext:681, Spliterators$1Adapter (java.util) > hasNext:83, MultiStateKeyIterator (org.apache.flink.state.api.input) > hasNext:162, KeyedStateReaderOperator$NamespaceDecorator > (org.apache.flink.state.api.input.operator) > reachedEnd:215, KeyedStateInputFormat (org.apache.flink.state.api.input) > invoke:191, DataSourceTask (org.apache.flink.runtime.operators) > doRun:776, Task (org.apache.flink.runtime.taskmanager) > run:563, Task (org.apache.flink.runtime.taskmanager) > run:748, Thread (java.lang) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33127) HeapKeyedStateBackend: use buffered I/O to speed up local recovery
[ https://issues.apache.org/jira/browse/FLINK-33127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768602#comment-17768602 ] Matthias Schwalbe commented on FLINK-33127: --- [~masteryhx] : I actually want to finish FLINK-26585 first before I can start FLINK-26586 (capacity) FLINK-26585 is somewhat stuck in PR approval without any progress for a couple of weeks. (will ping you on that ticket in a second) Thias > HeapKeyedStateBackend: use buffered I/O to speed up local recovery > -- > > Key: FLINK-33127 > URL: https://issues.apache.org/jira/browse/FLINK-33127 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: Yangyang ZHANG >Assignee: Yangyang ZHANG >Priority: Major > Attachments: thread_dump.png > > > Recently, I observed a slow restore case in local recovery using hashmap > statebackend. > It took 147 seconds to restore from a 467MB snapshot, 9 times slower than > the 16s needed to restore from remote fs. > The thread dump shows that it reads the local snapshot file directly via an unbuffered > FileInputStream / fs.local.LocalDataInputStream. > !thread_dump.png! > Maybe we can wrap it with a BufferedInputStream to speed up local recovery. -- This message was sent by Atlassian Jira (v8.20.10#820010)
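The fix the ticket suggests is essentially a one-line wrap of the raw local stream. A hedged sketch (the class and method names below are hypothetical, not the actual Flink restore code path):

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedRestoreRead {

    /**
     * Opens a local snapshot file through a BufferedInputStream, so the many small
     * reads issued during state deserialization hit the in-memory buffer instead of
     * each becoming a syscall, as they would on an unbuffered FileInputStream.
     */
    static DataInputStream openBuffered(Path snapshot, int bufferBytes) throws IOException {
        InputStream raw = Files.newInputStream(snapshot); // unbuffered, like LocalDataInputStream
        return new DataInputStream(new BufferedInputStream(raw, bufferBytes));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("snapshot", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        int sum;
        try (DataInputStream in = openBuffered(tmp, 64 * 1024)) {
            sum = in.readByte() + in.readByte();
        }
        Files.delete(tmp);
        System.out.println(sum); // prints "3"
    }
}
```

The buffer size is a tuning knob; the 64 KiB above is an illustrative value, not one taken from the ticket.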
[jira] [Resolved] (FLINK-33151) Prometheus Sink Connector - Create Github Repo
[ https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer resolved FLINK-33151. --- Resolution: Done > Prometheus Sink Connector - Create Github Repo > -- > > Key: FLINK-33151 > URL: https://issues.apache.org/jira/browse/FLINK-33151 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Prometheus >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > Fix For: prometheus-connector-1.0.0 > > > Create the \{{flink-connector-prometheus}} repo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33151) Prometheus Sink Connector - Create Github Repo
[ https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768596#comment-17768596 ] Danny Cranmer commented on FLINK-33151: --- https://github.com/apache/flink-connector-prometheus > Prometheus Sink Connector - Create Github Repo > -- > > Key: FLINK-33151 > URL: https://issues.apache.org/jira/browse/FLINK-33151 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Prometheus >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > Fix For: prometheus-connector-1.0.0 > > > Create the \{{flink-connector-prometheus}} repo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33152) Prometheus Sink Connector - Integration tests
[ https://issues.apache.org/jira/browse/FLINK-33152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33152: -- Component/s: Connectors / Prometheus > Prometheus Sink Connector - Integration tests > - > > Key: FLINK-33152 > URL: https://issues.apache.org/jira/browse/FLINK-33152 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Prometheus >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > > Integration tests against containerised Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31757) FLIP-370: Support Balanced Tasks Scheduling
[ https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Fan updated FLINK-31757: Description: This is an umbrella JIRA of [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw]. (was: This is a umbrella JIRA of [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw]. Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A is 100, and the others are 5. If each {{TaskManager}} only have one slot, then we need 100 TMs. There will be 5 slots with 21 sub-tasks, and the others will only have one sub-task of A. Does this mean we have to make a trade-off between wasted resources and insufficient resources? >From a resource utilization point of view, we expect all subtasks to be evenly >distributed on each TM.) > FLIP-370: Support Balanced Tasks Scheduling > --- > > Key: FLINK-31757 > URL: https://issues.apache.org/jira/browse/FLINK-31757 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: RocMarshal >Assignee: RocMarshal >Priority: Major > Labels: pull-request-available > Attachments: image-2023-04-13-08-04-04-667.png > > > This is an umbrella JIRA of > [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33139) Prometheus Sink Connector - Table API support
[ https://issues.apache.org/jira/browse/FLINK-33139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33139: -- Fix Version/s: prometheus-connector-1.0.0 > Prometheus Sink Connector - Table API support > - > > Key: FLINK-33139 > URL: https://issues.apache.org/jira/browse/FLINK-33139 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33152) Prometheus Sink Connector - Integration tests
[ https://issues.apache.org/jira/browse/FLINK-33152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33152: -- Fix Version/s: prometheus-connector-1.0.0 > Prometheus Sink Connector - Integration tests > - > > Key: FLINK-33152 > URL: https://issues.apache.org/jira/browse/FLINK-33152 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > > Integration tests against containerised Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33151) Prometheus Sink Connector - Create Github Repo
[ https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33151: -- Component/s: Connectors / Prometheus > Prometheus Sink Connector - Create Github Repo > -- > > Key: FLINK-33151 > URL: https://issues.apache.org/jira/browse/FLINK-33151 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Prometheus >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > Fix For: prometheus-connector-1.0.0 > > > Create the \{{flink-connector-prometheus}} repo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33151) Prometheus Sink Connector - Create Github Repo
[ https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33151: -- Fix Version/s: prometheus-connector-1.0.0 > Prometheus Sink Connector - Create Github Repo > -- > > Key: FLINK-33151 > URL: https://issues.apache.org/jira/browse/FLINK-33151 > Project: Flink > Issue Type: Sub-task >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > Fix For: prometheus-connector-1.0.0 > > > Create the \{{flink-connector-prometheus}} repo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33141) Promentheus Sink Connector - Amazon Managed Prometheus Request Signer
[ https://issues.apache.org/jira/browse/FLINK-33141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33141: -- Fix Version/s: (was: prometheus-connector-1.0.0) > Promentheus Sink Connector - Amazon Managed Prometheus Request Signer > - > > Key: FLINK-33141 > URL: https://issues.apache.org/jira/browse/FLINK-33141 > Project: Flink > Issue Type: Sub-task > Components: Connectors / AWS >Reporter: Lorenzo Nicora >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33139) Prometheus Sink Connector - Table API support
[ https://issues.apache.org/jira/browse/FLINK-33139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33139: -- Component/s: Connectors / Prometheus > Prometheus Sink Connector - Table API support > - > > Key: FLINK-33139 > URL: https://issues.apache.org/jira/browse/FLINK-33139 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Prometheus >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33141) Promentheus Sink Connector - Amazon Managed Prometheus Request Signer
[ https://issues.apache.org/jira/browse/FLINK-33141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33141: -- Fix Version/s: prometheus-connector-1.0.0 > Promentheus Sink Connector - Amazon Managed Prometheus Request Signer > - > > Key: FLINK-33141 > URL: https://issues.apache.org/jira/browse/FLINK-33141 > Project: Flink > Issue Type: Sub-task > Components: Connectors / AWS >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33138) Prometheus Connector Sink - DataStream API implementation
[ https://issues.apache.org/jira/browse/FLINK-33138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33138: -- Fix Version/s: prometheus-connector-1.0.0 > Prometheus Connector Sink - DataStream API implementation > - > > Key: FLINK-33138 > URL: https://issues.apache.org/jira/browse/FLINK-33138 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33138) Prometheus Connector Sink - DataStream API implementation
[ https://issues.apache.org/jira/browse/FLINK-33138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33138: -- Component/s: Connectors / Prometheus > Prometheus Connector Sink - DataStream API implementation > - > > Key: FLINK-33138 > URL: https://issues.apache.org/jira/browse/FLINK-33138 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Prometheus >Reporter: Lorenzo Nicora >Priority: Major > Fix For: prometheus-connector-1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33137) FLIP-312: Prometheus Sink Connector
[ https://issues.apache.org/jira/browse/FLINK-33137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33137: -- Component/s: Connectors / Prometheus > FLIP-312: Prometheus Sink Connector > --- > > Key: FLINK-33137 > URL: https://issues.apache.org/jira/browse/FLINK-33137 > Project: Flink > Issue Type: New Feature > Components: Connectors / Prometheus >Reporter: Lorenzo Nicora >Assignee: Lorenzo Nicora >Priority: Major > Labels: Connector > Fix For: prometheus-connector-1.0.0 > > > Umbrella Jira for implementation of Prometheus Sink Connector > https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33137) FLIP-312: Prometheus Sink Connector
[ https://issues.apache.org/jira/browse/FLINK-33137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33137: -- Fix Version/s: prometheus-connector-1.0.0 > FLIP-312: Prometheus Sink Connector > --- > > Key: FLINK-33137 > URL: https://issues.apache.org/jira/browse/FLINK-33137 > Project: Flink > Issue Type: New Feature >Reporter: Lorenzo Nicora >Assignee: Lorenzo Nicora >Priority: Major > Labels: Connector > Fix For: prometheus-connector-1.0.0 > > > Umbrella Jira for implementation of Prometheus Sink Connector > https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #23458: [FLINK-33149][build] Bump snappy to 1.1.10.4
flinkbot commented on PR #23458: URL: https://github.com/apache/flink/pull/23458#issuecomment-1733193724 ## CI report: * 4debb4036b86485b39be61cb284e7ba6e8daaba7 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-31757) FLIP-370: Support Balanced Tasks Scheduling
[ https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Fan updated FLINK-31757: Summary: FLIP-370: Support Balanced Tasks Scheduling (was: Support Balanced Tasks Scheduling) > FLIP-370: Support Balanced Tasks Scheduling > --- > > Key: FLINK-31757 > URL: https://issues.apache.org/jira/browse/FLINK-31757 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: RocMarshal >Assignee: RocMarshal >Priority: Major > Labels: pull-request-available > Attachments: image-2023-04-13-08-04-04-667.png > > > This is a umbrella JIRA of > [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw]. > > Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A > is 100, and the others are 5. If each {{TaskManager}} only have one slot, > then we need 100 TMs. > There will be 5 slots with 21 sub-tasks, and the others will only have one > sub-task of A. Does this mean we have to make a trade-off between wasted > resources and insufficient resources? > From a resource utilization point of view, we expect all subtasks to be > evenly distributed on each TM. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31757) Support Balanced Tasks Scheduling
[ https://issues.apache.org/jira/browse/FLINK-31757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Fan updated FLINK-31757: Description: This is a umbrella JIRA of [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw]. Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A is 100, and the others are 5. If each {{TaskManager}} only have one slot, then we need 100 TMs. There will be 5 slots with 21 sub-tasks, and the others will only have one sub-task of A. Does this mean we have to make a trade-off between wasted resources and insufficient resources? >From a resource utilization point of view, we expect all subtasks to be evenly >distributed on each TM. was: Supposed we have a Job with 21 {{JobVertex}}. The parallelism of vertex A is 100, and the others are 5. If each {{TaskManager}} only have one slot, then we need 100 TMs. There will be 5 slots with 21 sub-tasks, and the others will only have one sub-task of A. Does this mean we have to make a trade-off between wasted resources and insufficient resources? >From a resource utilization point of view, we expect all subtasks to be evenly >distributed on each TM. > Support Balanced Tasks Scheduling > - > > Key: FLINK-31757 > URL: https://issues.apache.org/jira/browse/FLINK-31757 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: RocMarshal >Assignee: RocMarshal >Priority: Major > Labels: pull-request-available > Attachments: image-2023-04-13-08-04-04-667.png > > > This is a umbrella JIRA of > [FLIP-370|https://cwiki.apache.org/confluence/x/U56zDw]. > > Supposed we have a Job with 21 {{{}JobVertex{}}}. The parallelism of vertex A > is 100, and the others are 5. If each {{TaskManager}} only have one slot, > then we need 100 TMs. > There will be 5 slots with 21 sub-tasks, and the others will only have one > sub-task of A. Does this mean we have to make a trade-off between wasted > resources and insufficient resources? 
> From a resource utilization point of view, we expect all subtasks to be > evenly distributed on each TM. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33149) Bump snappy-java to 1.1.10.4
[ https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33149: --- Labels: pull-request-available (was: ) > Bump snappy-java to 1.1.10.4 > > > Key: FLINK-33149 > URL: https://issues.apache.org/jira/browse/FLINK-33149 > Project: Flink > Issue Type: Bug > Components: API / Core, Connectors / AWS, Connectors / HBase, > Connectors / Kafka, Stateful Functions >Affects Versions: 1.18.0, 1.16.3, 1.17.2 >Reporter: Ryan Skraba >Priority: Major > Labels: pull-request-available > > Xerial published a security alert for a Denial of Service attack that [exists > on > 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv]. > This is included in flink-dist, but also in flink-statefun, and several > connectors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] RyanSkraba opened a new pull request, #23458: [FLINK-33149][build] Bump snappy to 1.1.10.4
RyanSkraba opened a new pull request, #23458: URL: https://github.com/apache/flink/pull/23458 ## What is the purpose of the change Bump the version of snappy to address a vulnerability. ## Brief change log ## Verifying this change This change is already covered by existing tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **yes** - The public API, i.e., is any changed class annotated with **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **no** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **no** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **no** - If yes, how is the feature documented? **not applicable** -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33152) Prometheus Sink Connector - Integration tests
Lorenzo Nicora created FLINK-33152: -- Summary: Prometheus Sink Connector - Integration tests Key: FLINK-33152 URL: https://issues.apache.org/jira/browse/FLINK-33152 Project: Flink Issue Type: Sub-task Reporter: Lorenzo Nicora Integration tests against containerised Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example on AWS
[ https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenzo Nicora updated FLINK-33140: --- Summary: Prometheus Sink Connector - E2E example on AWS (was: Prometheus Sink Connector - E2E example) > Prometheus Sink Connector - E2E example on AWS > -- > > Key: FLINK-33140 > URL: https://issues.apache.org/jira/browse/FLINK-33140 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > > End-to-end example application, to be deployed on Amazon Managed Service for > Apache Flink, and writing to Amazon Managed Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example on AWS
[ https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenzo Nicora updated FLINK-33140: --- Description: End-to-end example application, deployable on Amazon Managed Service for Apache Flink, and writing to Amazon Managed Prometheus (was: End-to-end example application, to be deployed on Amazon Managed Service for Apache Flink, and writing to Amazon Managed Prometheus) > Prometheus Sink Connector - E2E example on AWS > -- > > Key: FLINK-33140 > URL: https://issues.apache.org/jira/browse/FLINK-33140 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > > End-to-end example application, deployable on Amazon Managed Service for > Apache Flink, and writing to Amazon Managed Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example
[ https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenzo Nicora updated FLINK-33140: --- Summary: Prometheus Sink Connector - E2E example (was: Prometheus Sink Connector - E2E test) > Prometheus Sink Connector - E2E example > --- > > Key: FLINK-33140 > URL: https://issues.apache.org/jira/browse/FLINK-33140 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33140) Prometheus Sink Connector - E2E example
[ https://issues.apache.org/jira/browse/FLINK-33140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lorenzo Nicora updated FLINK-33140: --- Description: End-to-end example application, to be deployed on Amazon Managed Service for Apache Flink, and writing to Amazon Managed Prometheus > Prometheus Sink Connector - E2E example > --- > > Key: FLINK-33140 > URL: https://issues.apache.org/jira/browse/FLINK-33140 > Project: Flink > Issue Type: Sub-task >Reporter: Lorenzo Nicora >Priority: Major > > End-to-end example application, to be deployed on Amazon Managed Service for > Apache Flink, and writing to Amazon Managed Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33151) Prometheus Sink Connector - Create Github Repo
Danny Cranmer created FLINK-33151: - Summary: Prometheus Sink Connector - Create Github Repo Key: FLINK-33151 URL: https://issues.apache.org/jira/browse/FLINK-33151 Project: Flink Issue Type: Sub-task Reporter: Danny Cranmer Create the \{{flink-connector-prometheus}} repo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33151) Prometheus Sink Connector - Create Github Repo
[ https://issues.apache.org/jira/browse/FLINK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer reassigned FLINK-33151: - Assignee: Danny Cranmer > Prometheus Sink Connector - Create Github Repo > -- > > Key: FLINK-33151 > URL: https://issues.apache.org/jira/browse/FLINK-33151 > Project: Flink > Issue Type: Sub-task >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > > Create the \{{flink-connector-prometheus}} repo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33150) add the processing logic for the long type
wenhao.yu created FLINK-33150: - Summary: add the processing logic for the long type Key: FLINK-33150 URL: https://issues.apache.org/jira/browse/FLINK-33150 Project: Flink Issue Type: New Feature Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.15.4 Reporter: wenhao.yu Fix For: 1.15.4 The AvroToRowDataConverters class has a convertToDate method that reports an error when it encounters time data represented by the long type, so add code to handle the long type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
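For illustration, a hedged sketch of the kind of branch the ticket asks for. The method name and shape here are hypothetical and simplified; the real `AvroToRowDataConverters#convertToDate` differs. It shows a date converter that accepts an int day count and a `LocalDate`, plus the new case: a long value, instead of failing on it.

```java
import java.time.LocalDate;

public class DateConverterSketch {

    /**
     * Converts an Avro-decoded date value to days since the epoch, in the spirit of
     * AvroToRowDataConverters#convertToDate. The Long branch is the kind of addition
     * the ticket proposes; without it, long-encoded values raise an error.
     */
    static int toEpochDays(Object avroValue) {
        if (avroValue instanceof LocalDate) {
            return (int) ((LocalDate) avroValue).toEpochDay();
        } else if (avroValue instanceof Integer) {
            return (Integer) avroValue; // Avro 'date' logical type: days since epoch as int
        } else if (avroValue instanceof Long) {
            return Math.toIntExact((Long) avroValue); // added branch: long-encoded day count
        }
        throw new IllegalArgumentException("Unexpected date value: " + avroValue);
    }
}
```

Whether a long should be read as a day count or, say, epoch milliseconds depends on the writer schema; the sketch assumes a day count and fails fast on overflow via `Math.toIntExact`.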
[jira] [Assigned] (FLINK-33137) FLIP-312: Prometheus Sink Connector
[ https://issues.apache.org/jira/browse/FLINK-33137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Liang Teoh reassigned FLINK-33137: --- Assignee: Lorenzo Nicora > FLIP-312: Prometheus Sink Connector > --- > > Key: FLINK-33137 > URL: https://issues.apache.org/jira/browse/FLINK-33137 > Project: Flink > Issue Type: New Feature >Reporter: Lorenzo Nicora >Assignee: Lorenzo Nicora >Priority: Major > Labels: Connector > > Umbrella Jira for implementation of Prometheus Sink Connector > https://cwiki.apache.org/confluence/display/FLINK/FLIP-312:+Prometheus+Sink+Connector -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33149) Bump snappy-java to 1.1.10.4
Ryan Skraba created FLINK-33149: --- Summary: Bump snappy-java to 1.1.10.4 Key: FLINK-33149 URL: https://issues.apache.org/jira/browse/FLINK-33149 Project: Flink Issue Type: Bug Components: API / Core, Connectors / AWS, Connectors / HBase, Connectors / Kafka, Stateful Functions Affects Versions: 1.18.0, 1.16.3, 1.17.2 Reporter: Ryan Skraba Xerial published a security alert for a Denial of Service attack that [exists on 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv]. This is included in flink-dist, but also in flink-statefun, and several connectors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] echauchot commented on pull request #23443: [FLINK-33059] Support transparent compression for file-connector for all file input formats
echauchot commented on PR #23443: URL: https://github.com/apache/flink/pull/23443#issuecomment-1733119379 Hi @rmetzger, I saw you authored parts of this code, can you please do a review or point me to another reviewer? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-33107) Update stable-spec upgrade mode on reconciled-spec change
[ https://issues.apache.org/jira/browse/FLINK-33107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora closed FLINK-33107. -- Fix Version/s: kubernetes-operator-1.7.0 Resolution: Fixed merged to main 662fa612a8ab352e43ab8a99fa61aadfbe41e4d7 > Update stable-spec upgrade mode on reconciled-spec change > - > > Key: FLINK-33107 > URL: https://issues.apache.org/jira/browse/FLINK-33107 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.0, kubernetes-operator-1.7.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.7.0 > > > Since now the rollback mechanism uses the regular upgrade flow, we need to > ensure that the lastStableSpec upgrade mode is kept in sync with the > lastReconciled spec to ensure correct stateful upgrades. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-kubernetes-operator] gyfora merged pull request #681: [FLINK-33107] Use correct upgrade mode when executing rollback, simpl…
gyfora merged PR #681: URL: https://github.com/apache/flink-kubernetes-operator/pull/681 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org