[jira] [Commented] (FLINK-35039) Create Profiling JobManager/TaskManager Instance failed
[ https://issues.apache.org/jira/browse/FLINK-35039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838951#comment-17838951 ]

Yun Tang commented on FLINK-35039:
--
[~wczhu] Already assigned to you.

> Create Profiling JobManager/TaskManager Instance failed
> ---
>
> Key: FLINK-35039
> URL: https://issues.apache.org/jira/browse/FLINK-35039
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Web Frontend
> Affects Versions: 1.19.0
> Environment: Hadoop 3.2.2, Flink 1.19
> Reporter: ude
> Assignee: ude
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2024-04-08-10-21-31-066.png, image-2024-04-08-10-21-48-417.png, image-2024-04-08-10-30-16-683.png
>
> I'm testing the "async-profiler" feature in 1.19, but when I submit a job in YARN per-job mode, I get an error when I click "Create Profiling Instance" on the Flink Web UI.
> !image-2024-04-08-10-21-31-066.png!
> !image-2024-04-08-10-21-48-417.png!
> The error message indicates that the YARN proxy server does not support *POST* calls. I checked the code of _*WebAppProxyServlet.java*_ and found that the *POST* method is indeed not supported, so I changed the request to the *PUT* method and the call succeeded.
> !image-2024-04-08-10-30-16-683.png!

-- This message was sent by Atlassian Jira (v8.20.10#820010)
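The proxy behavior reported above can be illustrated with a minimal sketch (hypothetical names, not Hadoop's actual WebAppProxyServlet code): a dispatcher that only forwards a fixed set of HTTP methods and answers 405 Method Not Allowed for everything else. Which methods are forwarded here is an assumption for illustration only.

```java
import java.util.Set;

// Simplified sketch (hypothetical, not the real WebAppProxyServlet): a proxy
// that forwards only the HTTP methods it implements and answers
// 405 Method Not Allowed for the rest.
public class ProxyMethodCheck {
    // Assumption for illustration: the proxy forwards GET and PUT only.
    private static final Set<String> FORWARDED = Set.of("GET", "PUT");

    static int dispatch(String method) {
        // 405 for anything the proxy does not implement
        return FORWARDED.contains(method) ? 200 : 405;
    }

    public static void main(String[] args) {
        System.out.println("PUT -> " + dispatch("PUT"));   // PUT -> 200
        System.out.println("POST -> " + dispatch("POST")); // POST -> 405
    }
}
```

This matches the symptom in the screenshots: the same request succeeds once the Web UI's call is switched to a method the proxy actually forwards.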
[jira] [Assigned] (FLINK-35039) Create Profiling JobManager/TaskManager Instance failed
[ https://issues.apache.org/jira/browse/FLINK-35039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-35039: Assignee: ude
[jira] [Updated] (FLINK-35111) Modify the spelling mistakes in the taskmanager html
[ https://issues.apache.org/jira/browse/FLINK-35111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-35111: - Affects Version/s: (was: 1.19.0) > Modify the spelling mistakes in the taskmanager html > > > Key: FLINK-35111 > URL: https://issues.apache.org/jira/browse/FLINK-35111 > Project: Flink > Issue Type: Improvement > Components: Runtime / Web Frontend >Reporter: ude >Assignee: Yun Tang >Priority: Major > Fix For: 1.19.0 > > > Fix the spelling error from "profiler" to "profiling" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35111) Modify the spelling mistakes in the taskmanager html
[ https://issues.apache.org/jira/browse/FLINK-35111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-35111: - Fix Version/s: 1.19.0 (was: 1.20.0)
[jira] [Commented] (FLINK-35039) Create Profiling JobManager/TaskManager Instance failed
[ https://issues.apache.org/jira/browse/FLINK-35039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838867#comment-17838867 ] Yun Tang commented on FLINK-35039: -- [~wczhu] Thanks for finding this problem in the YARN environment. I'm just curious: why does YARN not support POST?
[jira] [Assigned] (FLINK-33934) Flink SQL Source use raw format maybe lead to data lost
[ https://issues.apache.org/jira/browse/FLINK-33934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yun Tang reassigned FLINK-33934:
--
Assignee: Yuan Kui

> Flink SQL Source use raw format maybe lead to data lost
> ---
>
> Key: FLINK-33934
> URL: https://issues.apache.org/jira/browse/FLINK-33934
> Project: Flink
> Issue Type: Bug
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / Runtime
> Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.19.0
> Reporter: Cai Liuyang
> Assignee: Yuan Kui
> Priority: Major
>
> In our product we encountered a case that led to data loss. The job:
> 1. A Flink SQL job reads data from a message queue (our internal MQ) and writes to Hive (only the value field is selected; no metadata fields).
> 2. The source table uses the raw format.
>
> However, if we select the value field and a metadata field at the same time, the data loss does not occur.
>
> After reviewing the code, we found the cause is the object reuse of the raw format (see [RawFormatDeserializationSchema|https://github.com/apache/flink/blob/master/flink-table/flink-table-runtime/src/main/java/org/apache/flink/formats/raw/RawFormatDeserializationSchema.java#L62]). Why object reuse leads to this problem (taking Kafka as an example):
> 1. RawFormatDeserializationSchema runs in the fetcher thread of the SourceOperator; the fetcher thread reads and deserializes data from a Kafka partition, then puts it into the ElementQueue (see [SourceOperator FetcherTask|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/fetcher/FetchTask.java#L64]).
> 2. The SourceOperator's main thread pulls data from the ElementQueue (which is shared with the fetcher thread) and processes it (see [SourceOperator main thread|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SourceReaderBase.java#L188]).
> 3. RawFormatDeserializationSchema's deserialize function always returns the same object (the [reused RowData object|https://github.com/apache/flink/blob/master/flink-table/flink-table-runtime/src/main/java/org/apache/flink/formats/raw/RawFormatDeserializationSchema.java#L62]).
> 4. So, if the ElementQueue still holds elements that have not been consumed, the fetcher thread can overwrite the fields of the reused RowData that RawFormatDeserializationSchema::deserialize returned, which leads to data loss.
>
> The reason selecting value and metadata fields together avoids the loss: when a metadata field is selected, a new RowData object is returned (see [DynamicKafkaDeserializationSchema deserialize with metadata field|https://github.com/apache/flink-connector-kafka/blob/main/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/DynamicKafkaDeserializationSchema.java#L249]); when only the value field is selected, the RowData object returned by the format deserialization schema is reused (see [DynamicKafkaDeserializationSchema deserialize only with value field|https://github.com/apache/flink-connector-kafka/blob/main/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/DynamicKafkaDeserializationSchema.java#L113]).
>
> To solve this problem, I think we should remove the object reuse in RawFormatDeserializationSchema.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
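The hazard described in steps 1-4 can be reproduced without threads at all. Below is a minimal sketch (hypothetical names, not Flink code) in which a deserializer returns the same mutable object on every call while a queue buffers "records" before they are consumed, standing in for the reused RowData and the ElementQueue:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch of the object-reuse hazard (hypothetical names, not Flink code).
class ReusingDeserializer {
    private final int[] reused = new int[1]; // one shared, mutable "row"

    int[] deserialize(int value) {
        reused[0] = value;
        return reused; // always the same object, like the reused RowData
    }
}

public class ObjectReuseDemo {
    public static void main(String[] args) {
        ReusingDeserializer schema = new ReusingDeserializer();
        Queue<int[]> elementQueue = new ArrayDeque<>();

        // The "fetcher" enqueues two records before the "main thread" drains the queue.
        elementQueue.add(schema.deserialize(1));
        elementQueue.add(schema.deserialize(2)); // overwrites record 1 in place

        // The consumer now sees value 2 twice: record 1 is lost.
        System.out.println(elementQueue.poll()[0]); // prints 2, not 1
        System.out.println(elementQueue.poll()[0]); // prints 2
    }
}
```

Returning a fresh object per `deserialize` call (as the metadata path does) makes each queue entry independent, which is why selecting a metadata field masks the bug.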
[jira] [Updated] (FLINK-33934) Flink SQL Source use raw format maybe lead to data lost
[ https://issues.apache.org/jira/browse/FLINK-33934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33934: - Affects Version/s: 1.19.0 1.18.0 1.17.0 1.16.0 1.15.0 1.14.0 1.13.0 1.12.0
[jira] [Commented] (FLINK-33934) Flink SQL Source use raw format maybe lead to data lost
[ https://issues.apache.org/jira/browse/FLINK-33934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834765#comment-17834765 ] Yun Tang commented on FLINK-33934: -- I think it's easy to lose data when using the {{raw}} format with another customized connector. Moreover, the object-reuse semantics are hidden in the {{raw}} format description. From my point of view, this is a potential bug, cc [~jark].
[jira] [Closed] (FLINK-34967) Update Website copyright footer to 2024
[ https://issues.apache.org/jira/browse/FLINK-34967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang closed FLINK-34967. Resolution: Fixed > Update Website copyright footer to 2024 > --- > > Key: FLINK-34967 > URL: https://issues.apache.org/jira/browse/FLINK-34967 > Project: Flink > Issue Type: Improvement > Components: Project Website >Reporter: Yun Tang >Assignee: Yu Chen >Priority: Major > > It's already 2024; we should update the copyright footer. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34968) Update flink-web copyright to 2024
[ https://issues.apache.org/jira/browse/FLINK-34968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34968. -- Assignee: Yu Chen Resolution: Fixed merged in asf-site: 3cda9946df98f0d2abda396ac87f7bc192ec70e0 > Update flink-web copyright to 2024 > -- > > Key: FLINK-34968 > URL: https://issues.apache.org/jira/browse/FLINK-34968 > Project: Flink > Issue Type: Improvement > Components: Project Website >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34976) LD_PRELOAD environment may not be effective after su to flink user
[ https://issues.apache.org/jira/browse/FLINK-34976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832742#comment-17832742 ] Yun Tang commented on FLINK-34976: -- I don't think the running Flink process would drop the {{LD_PRELOAD}} environment. Please use `pmap` to check whether {{jemalloc.so}} is actually mapped into your process. > LD_PRELOAD environment may not be effective after su to flink user > -- > > Key: FLINK-34976 > URL: https://issues.apache.org/jira/browse/FLINK-34976 > Project: Flink > Issue Type: New Feature > Components: flink-docker >Affects Versions: 1.19.0 >Reporter: xiaogang zhou >Priority: Major > > I am not sure if LD_PRELOAD still takes effect after drop_privs_cmd. Should > we create a .bashrc file in the home directory of the flink user and export > LD_PRELOAD there? > > [https://github.com/apache/flink-docker/blob/627987997ca7ec86bcc3d80b26df58aa595b91af/1.17/scala_2.12-java11-ubuntu/docker-entrypoint.sh#L92] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
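The suggested `pmap` check can be sketched as follows; the PID 12345 is hypothetical, and on systems without `pmap` the same information is available from `/proc/<pid>/maps`:

```shell
# With a real TaskManager PID (12345 here is hypothetical), you could run:
#   pmap 12345 | grep jemalloc
# Equivalent check against the current process: every shared object mapped
# into a process appears in /proc/<pid>/maps, so a preloaded jemalloc.so
# would be listed here.
grep -o '/[^ ]*\.so[^ ]*' /proc/self/maps | sort -u
```

If jemalloc does not show up in this list for the Flink JVM process, `LD_PRELOAD` was not in effect when that process started.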
[jira] [Created] (FLINK-34967) Update Website copyright footer to 2024
Yun Tang created FLINK-34967: Summary: Update Website copyright footer to 2024 Key: FLINK-34967 URL: https://issues.apache.org/jira/browse/FLINK-34967 Project: Flink Issue Type: Improvement Components: Project Website Reporter: Yun Tang Assignee: Yu Chen It's already 2024; we should update the copyright footer. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32299) Upload python jar when sql contains python udf jar
[ https://issues.apache.org/jira/browse/FLINK-32299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-32299: - Fix Version/s: 1.19.0 > Upload python jar when sql contains python udf jar > -- > > Key: FLINK-32299 > URL: https://issues.apache.org/jira/browse/FLINK-32299 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Gateway, Table SQL / Runtime >Reporter: Shengkai Fang >Assignee: Yangze Guo >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Currently, the SQL gateway always uploads the Python jar when submitting jobs. > However, it's not required for every SQL job. We should add the Python jar > into PipelineOptions.JARS only when the user job contains a Python UDF. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34617) Correct the Javadoc of org.apache.flink.api.common.time.Time
[ https://issues.apache.org/jira/browse/FLINK-34617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34617. -- Fix Version/s: 1.20.0 1.19.1 Resolution: Fixed merged master: 9617598de33b2b23b97ddb84887392659070c344 release-1.19: c6d96b7f7c07faad363779a5175d5772140891a5 > Correct the Javadoc of org.apache.flink.api.common.time.Time > > > Key: FLINK-34617 > URL: https://issues.apache.org/jira/browse/FLINK-34617 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.19.0 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Major > Labels: pull-request-available > Fix For: 1.20.0, 1.19.1 > > > The current Javadoc of {{org.apache.flink.api.common.time.Time}} said it will > fully replace {{org.apache.flink.streaming.api.windowing.time.Time}} in Flink > 2.0. However, the {{Time}} class has been deprecated, and we should remove > the description. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34622) Typo of execution_mode configuration name in Chinese document
[ https://issues.apache.org/jira/browse/FLINK-34622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34622. -- Fix Version/s: 1.18.2 1.20.0 1.19.1 Assignee: Yu Chen Resolution: Fixed merged master: 17487c0c944c3925b89b26eadf38169da35410f7 release-1.19: 0a85a08303ced5715437ead15ce60203c70aa58d release-1.18: ff256ef85f4edf4e86ce2bd73e4bdef8b7e07fbb > Typo of execution_mode configuration name in Chinese document > - > > Key: FLINK-34622 > URL: https://issues.apache.org/jira/browse/FLINK-34622 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.18.2, 1.20.0, 1.19.1 > > Attachments: image-2024-03-08-14-46-34-859.png > > > !image-2024-03-08-14-46-34-859.png|width=794,height=380! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34617) Correct the Javadoc of org.apache.flink.api.common.time.Time
Yun Tang created FLINK-34617: Summary: Correct the Javadoc of org.apache.flink.api.common.time.Time Key: FLINK-34617 URL: https://issues.apache.org/jira/browse/FLINK-34617 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.19.0 Reporter: Yun Tang Assignee: Yun Tang The current Javadoc of {{org.apache.flink.api.common.time.Time}} said it will fully replace {{org.apache.flink.streaming.api.windowing.time.Time}} in Flink 2.0. However, the {{Time}} class has been deprecated, and we should remove the description. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34522) StateTtlConfig#cleanupInRocksdbCompactFilter still uses the deprecated Time class
[ https://issues.apache.org/jira/browse/FLINK-34522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822754#comment-17822754 ] Yun Tang commented on FLINK-34522: -- merged in release-1.19: 161defe0bb2dc8136133e07699b6ac433d52dc65 ... 7618bdeeab06c09219136a04a62262148c677134 > StateTtlConfig#cleanupInRocksdbCompactFilter still uses the deprecated Time > class > - > > Key: FLINK-34522 > URL: https://issues.apache.org/jira/browse/FLINK-34522 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Rui Fan >Assignee: Rui Fan >Priority: Blocker > Labels: pull-request-available > Fix For: 1.19.0, 1.20.0 > > > FLINK-32570 deprecated the Time class and refactored all Public or > PublicEvolving APIs to use Java's Duration. > StateTtlConfig.Builder#cleanupInRocksdbCompactFilter is still using the Time > class. In general, we expect: > * Mark {{cleanupInRocksdbCompactFilter(long, Time)}} as {{@Deprecated}} > * Provide a new cleanupInRocksdbCompactFilter(long, Duration) > Note: This is exactly what FLINK-32570 does, so I guess FLINK-32570 missed > cleanupInRocksdbCompactFilter. > But I found this method was introduced in 1.19 (FLINK-30854), so a better > solution may be to only provide cleanupInRocksdbCompactFilter(long, Duration) > and not use Time. > A deprecated API should be kept for 2 minor versions. IIUC, we cannot remove > the Time-related classes in Flink 2.0 if we don't deprecate them in 1.19. If so, I > think it's better to merge this JIRA in 1.19.0 as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-33436) Documentation on the built-in Profiler
[ https://issues.apache.org/jira/browse/FLINK-33436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-33436. -- Fix Version/s: 1.19.0 1.20.0 Resolution: Fixed master: 5f06ce765256b375945b9e69db2f16123b53f194 release-1.19: 12ea64c0e2a56da3c5f6a656b23a2f2ac54f19d5 > Documentation on the built-in Profiler > -- > > Key: FLINK-33436 > URL: https://issues.apache.org/jira/browse/FLINK-33436 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0, 1.20.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34500) Release Testing: Verify FLINK-33261 Support Setting Parallelism for Table/SQL Sources
[ https://issues.apache.org/jira/browse/FLINK-34500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34500. -- Resolution: Fixed > Release Testing: Verify FLINK-33261 Support Setting Parallelism for Table/SQL > Sources > - > > Key: FLINK-34500 > URL: https://issues.apache.org/jira/browse/FLINK-34500 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Parent, Table SQL / API >Affects Versions: 1.19.0 >Reporter: SuDewei >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > > This issue aims to verify > [FLIP-367|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263429150]. > Volunteers can verify it by following the [doc > changes|https://github.com/apache/flink/pull/24234]. Since currently only the > pre-defined DataGen connector and user-defined connector supports setting > source parallelism, volunteers can verify it through DataGen Connector. > The basic steps include: > 1. Start a Flink cluster and submit a Flink SQL Job to the cluster. > 2. In this Flink Job, use the DataGen SQL Connector to generate data. > 3. Specify the parameter scan.parallelism in DataGen connector options as > user-defined parallelism instead of default parallelism. > 4. Observe whether the parallelism of the source has changed on the job graph > of the Flink Application UI, and whether the shuffle mode is correct. > If everything is normal, you will see that the parallelism of the source > operator is indeed different from that of downstream, and the shuffle mode is > rebalanced by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34500) Release Testing: Verify FLINK-33261 Support Setting Parallelism for Table/SQL Sources
[ https://issues.apache.org/jira/browse/FLINK-34500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821299#comment-17821299 ] Yun Tang commented on FLINK-34500: -- I started a local standalone cluster and submitted the SQL queries via sql-client. With a DataGen source whose parallelism differs from `parallelism.default`, whether larger or smaller, the source operator runs with exactly the parallelism configured via `scan.parallelism`. If the `scan.parallelism` option is not set, the source parallelism falls back to `parallelism.default`.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
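The verification described in the comment above can be reproduced with a short sql-client script. The table name and columns below are made up for illustration; `datagen` and `scan.parallelism` are the documented connector name and the option under test:

```sql
-- Run in the Flink SQL client against a standalone cluster.
SET 'parallelism.default' = '4';

CREATE TABLE gen_source (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'datagen',
  'scan.parallelism' = '2'   -- source should run with 2, downstream with 4
);

SELECT * FROM gen_source;
```

On the job graph in the Web UI, the source vertex should show parallelism 2 while downstream operators show 4, with a rebalance edge between them, matching the expected behavior in the ticket description.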
[jira] [Assigned] (FLINK-34500) Release Testing: Verify FLINK-33261 Support Setting Parallelism for Table/SQL Sources
[ https://issues.apache.org/jira/browse/FLINK-34500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34500: Assignee: Yun Tang
[jira] [Resolved] (FLINK-34388) Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone and native K8s application mode
[ https://issues.apache.org/jira/browse/FLINK-34388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34388. -- Resolution: Fixed Thanks for [~Yu Chen]'s work! I'll mark this ticket as resolved first, and we can continue discussing the remaining question, [~ferenc-csaky]. > Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone > and native K8s application mode > --- > > Key: FLINK-34388 > URL: https://issues.apache.org/jira/browse/FLINK-34388 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Metrics >Affects Versions: 1.19.0 >Reporter: Ferenc Csaky >Assignee: Yu Chen >Priority: Blocker > Labels: release-testing > Fix For: 1.19.0 > > > This ticket covers testing FLINK-28915. More details and the added docs are > accessible on the [PR|https://github.com/apache/flink/pull/24065] > Test 1: Pass {{local://}} job jar in standalone mode, check the artifacts are > not actually copied. > Test 2: Pass multiple artifacts in standalone mode. > Test 3: Pass a non-local job jar in native k8s mode. [1] > Test 4: Pass additional remote artifacts in native k8s mode. > Available config options: > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#artifact-fetching > [1] Custom docker image build instructions: > https://github.com/apache/flink-docker/tree/dev-master > Note: The docker build instructions also contain a web server example that > can be used to serve HTTP artifacts. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34390) Release Testing: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34390. -- Resolution: Fixed > Release Testing: Verify FLINK-33325 Built-in cross-platform powerful java > profiler > -- > > Key: FLINK-34390 > URL: https://issues.apache.org/jira/browse/FLINK-34390 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yun Tang >Assignee: junzhong qin >Priority: Major > Labels: release-testing > Fix For: 1.19.0 > > Attachments: image-2024-02-08-10-43-27-679.png, > image-2024-02-08-10-44-55-401.png, image-2024-02-08-10-45-13-951.png, > image-2024-02-08-10-45-31-564.png > > > See https://issues.apache.org/jira/browse/FLINK-34310 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34390) Release Testing: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818228#comment-17818228 ] Yun Tang commented on FLINK-34390: -- [~easonqin] Thanks for the testing! > Release Testing: Verify FLINK-33325 Built-in cross-platform powerful java > profiler > -- > > Key: FLINK-34390 > URL: https://issues.apache.org/jira/browse/FLINK-34390 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yun Tang >Assignee: junzhong qin >Priority: Major > Labels: release-testing > Fix For: 1.19.0 > > Attachments: image-2024-02-08-10-43-27-679.png, > image-2024-02-08-10-44-55-401.png, image-2024-02-08-10-45-13-951.png, > image-2024-02-08-10-45-31-564.png > > > See https://issues.apache.org/jira/browse/FLINK-34310 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34388) Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone and native K8s application mode
[ https://issues.apache.org/jira/browse/FLINK-34388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34388: Assignee: Yu Chen > Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone > and native K8s application mode > --- > > Key: FLINK-34388 > URL: https://issues.apache.org/jira/browse/FLINK-34388 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Metrics >Affects Versions: 1.19.0 >Reporter: Ferenc Csaky >Assignee: Yu Chen >Priority: Blocker > Labels: release-testing > Fix For: 1.19.0 > > > This ticket covers testing FLINK-28915. More details and the added docs are > accessible on the [PR|https://github.com/apache/flink/pull/24065] > Test 1: Pass {{local://}} job jar in standalone mode, check the artifacts are > not actually copied. > Test 2: Pass multiple artifacts in standalone mode. > Test 3: Pass a non-local job jar in native k8s mode. [1] > Test 4: Pass additional remote artifacts in native k8s mode. > Available config options: > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#artifact-fetching > [1] Custom docker image build instructions: > https://github.com/apache/flink-docker/tree/dev-master > Note: The docker build instructions also contains a web server example that > can be used to serve HTTP artifacts. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33644) FLIP-393: Make QueryOperations SQL serializable
[ https://issues.apache.org/jira/browse/FLINK-33644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33644: - Description: https://cwiki.apache.org/confluence/display/FLINK/FLIP-393%3A+Make+QueryOperations+SQL+serializable (was: https://cwiki.apache.org/confluence/x/4guZE) > FLIP-393: Make QueryOperations SQL serializable > --- > > Key: FLINK-33644 > URL: https://issues.apache.org/jira/browse/FLINK-33644 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Dawid Wysakowicz >Assignee: Dawid Wysakowicz >Priority: Major > Fix For: 1.19.0 > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-393%3A+Make+QueryOperations+SQL+serializable -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34355) Release Testing: Verify FLINK-34054 Support named parameters for functions and procedures
[ https://issues.apache.org/jira/browse/FLINK-34355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815192#comment-17815192 ] Yun Tang commented on FLINK-34355: -- [~xu_shuai_] Already assigned to you, please go ahead. > Release Testing: Verify FLINK-34054 Support named parameters for functions > and procedures > - > > Key: FLINK-34355 > URL: https://issues.apache.org/jira/browse/FLINK-34355 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Shuai Xu >Priority: Blocker > Labels: release-testing > Fix For: 1.19.0 > > > Test suggestion: > 1. Implement a test UDF or Procedure and support Named Parameters. > 2. When calling a function or procedure, use named parameters to verify if > the results are as expected. > You can test the following scenarios: > 1. Normal usage of named parameters, fully specifying each parameter. > 2. Omitting unnecessary parameters. > 3. Omitting necessary parameters to confirm if an error is reported. -- This message was sent by Atlassian Jira (v8.20.10#820010)
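The three scenarios in the test suggestion can be sketched as one SQL script. The function `my_greet` and its parameters are hypothetical; the `name => value` syntax is the named-argument form this ticket verifies, and registering the UDF beforehand is assumed. The script is only written locally here, not submitted.

```shell
# Compose the named-parameter checks; run them via the SQL client on a cluster
# where a my_greet(name STRING, greeting STRING) UDF (hypothetical) is registered.
cat > /tmp/named_params_check.sql <<'SQL'
-- 1. Normal usage: fully specify each parameter by name.
SELECT my_greet(name => 'Flink', greeting => 'Hello');
-- 2. Omit an unnecessary (optional) parameter.
SELECT my_greet(name => 'Flink');
-- 3. Omit a necessary parameter: this statement should fail with an error.
SELECT my_greet(greeting => 'Hello');
SQL
echo "wrote /tmp/named_params_check.sql"
```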
[jira] [Assigned] (FLINK-34355) Release Testing: Verify FLINK-34054 Support named parameters for functions and procedures
[ https://issues.apache.org/jira/browse/FLINK-34355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34355: Assignee: Shuai Xu > Release Testing: Verify FLINK-34054 Support named parameters for functions > and procedures > - > > Key: FLINK-34355 > URL: https://issues.apache.org/jira/browse/FLINK-34355 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Shuai Xu >Priority: Blocker > Labels: release-testing > Fix For: 1.19.0 > > > Test suggestion: > 1. Implement a test UDF or Procedure and support Named Parameters. > 2. When calling a function or procedure, use named parameters to verify if > the results are as expected. > You can test the following scenarios: > 1. Normal usage of named parameters, fully specifying each parameter. > 2. Omitting unnecessary parameters. > 3. Omitting necessary parameters to confirm if an error is reported. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34390) Release Testing: Verify FLINK-33325 Built-in cross-platform powerful java profiler
Yun Tang created FLINK-34390: Summary: Release Testing: Verify FLINK-33325 Built-in cross-platform powerful java profiler Key: FLINK-34390 URL: https://issues.apache.org/jira/browse/FLINK-34390 Project: Flink Issue Type: Sub-task Reporter: Yun Tang Assignee: Rui Fan -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34310. -- Resolution: Fixed > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png > > > Instructions: > 1. For the default case, it will print a hint telling users how to enable > this feature. > !screenshot-2.png! > 2. After we add {{rest.profiling.enabled: true}} to the configuration, we > can use this feature, and the default mode should be {{ITIMER}}. > !screenshot-3.png! > 3. We cannot create another profiling while one is running. > !screenshot-4.png! > 4. We can get at most 10 profiling snapshots by default, and the oldest ones > will be deleted automatically. > !screenshot-5.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
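For step 2 of the instructions above, the only required change is one config key. A minimal sketch — the snippet is written to /tmp here for illustration, but in a real setup it belongs in the cluster's Flink configuration file (conf/flink-conf.yaml on 1.19-era layouts; an assumption about your deployment):

```shell
# The key below is quoted from the testing instructions; everything else about
# the deployment (file location, restart procedure) is assumed.
cat > /tmp/profiling-conf-snippet.yaml <<'EOF'
rest.profiling.enabled: true
EOF
cat /tmp/profiling-conf-snippet.yaml
# After appending this to the cluster config and restarting, the JM/TM pages in
# the Web UI expose the "Create Profiling Instance" action (default mode ITIMER).
```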
[jira] [Updated] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34310: - Description: Instructions: 1. For the default case, it will print a hint telling users how to enable this feature. !screenshot-2.png! 2. After we add {{rest.profiling.enabled: true}} to the configuration, we can use this feature, and the default mode should be {{ITIMER}}. !screenshot-3.png! 3. We cannot create another profiling while one is running. !screenshot-4.png! 4. We can get at most 10 profiling snapshots by default, and the oldest ones will be deleted automatically. !screenshot-5.png! > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png > > > Instructions: > 1. For the default case, it will print a hint telling users how to enable > this feature. > !screenshot-2.png! > 2. After we add {{rest.profiling.enabled: true}} to the configuration, we > can use this feature, and the default mode should be {{ITIMER}}. > !screenshot-3.png! > 3. We cannot create another profiling while one is running. > !screenshot-4.png! > 4. We can get at most 10 profiling snapshots by default, and the oldest ones > will be deleted automatically. > !screenshot-5.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34310: - Attachment: screenshot-5.png > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34310: - Attachment: screenshot-4.png > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34310: - Attachment: screenshot-3.png > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34310: - Attachment: screenshot-2.png > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, > screenshot-2.png > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34310: Assignee: Yun Tang (was: Yu Chen) > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yun Tang >Priority: Blocker > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814666#comment-17814666 ] Yun Tang commented on FLINK-34310: -- I could help to provide the testing instructions, and I will assign it to [~fanrui] later. > Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform > powerful java profiler > --- > > Key: FLINK-34310 > URL: https://issues.apache.org/jira/browse/FLINK-34310 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: lincoln lee >Assignee: Yu Chen >Priority: Blocker > Labels: release-testing > Fix For: 1.19.0 > > Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34007) Flink Job stuck in suspend state after losing leadership in HA Mode
[ https://issues.apache.org/jira/browse/FLINK-34007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813138#comment-17813138 ] Yun Tang commented on FLINK-34007: -- It seems we have had a long discussion; does this problem also exist in Flink 1.17? [~wangyang0918] [~mapohl] > Flink Job stuck in suspend state after losing leadership in HA Mode > --- > > Key: FLINK-34007 > URL: https://issues.apache.org/jira/browse/FLINK-34007 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.19.0, 1.18.1, 1.18.2 >Reporter: Zhenqiu Huang >Assignee: Matthias Pohl >Priority: Blocker > Labels: pull-request-available > Fix For: 1.19.0 > > Attachments: Debug.log, LeaderElector-Debug.json, job-manager.log > > > The observation is that the JobManager goes into the suspended state, with a failed > container unable to register itself with the ResourceManager before the timeout. > JM log: see attached. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33325) FLIP-375: Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33325: - Fix Version/s: 1.19.0 > FLIP-375: Built-in cross-platform powerful java profiler > > > Key: FLINK-33325 > URL: https://issues.apache.org/jira/browse/FLINK-33325 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Fix For: 1.19.0 > > > This is an umbrella JIRA of > [FLIP-375|https://cwiki.apache.org/confluence/x/64lEE] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34029) Support different profiling mode on Flink WEB
[ https://issues.apache.org/jira/browse/FLINK-34029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34029. -- Fix Version/s: 1.19.0 Assignee: Yu Chen Resolution: Fixed merged in master: 4db6e72ed766791d25ee0379c7c29d1b4e2c08df > Support different profiling mode on Flink WEB > - > > Key: FLINK-34029 > URL: https://issues.apache.org/jira/browse/FLINK-34029 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33696) FLIP-385: Add OpenTelemetryTraceReporter and OpenTelemetryMetricReporter
[ https://issues.apache.org/jira/browse/FLINK-33696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809320#comment-17809320 ] Yun Tang commented on FLINK-33696: -- [~pnowojski], is this ticket done for FLIP-385? > FLIP-385: Add OpenTelemetryTraceReporter and OpenTelemetryMetricReporter > > > Key: FLINK-33696 > URL: https://issues.apache.org/jira/browse/FLINK-33696 > Project: Flink > Issue Type: New Feature > Components: Runtime / Metrics >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Major > Fix For: 1.19.0 > > > h1. Motivation > [FLIP-384|https://cwiki.apache.org/confluence/display/FLINK/FLIP-384%3A+Introduce+TraceReporter+and+use+it+to+create+checkpointing+and+recovery+traces] > is adding TraceReporter interface. However with > [FLIP-384|https://cwiki.apache.org/confluence/display/FLINK/FLIP-384%3A+Introduce+TraceReporter+and+use+it+to+create+checkpointing+and+recovery+traces] > alone, Log4jTraceReporter would be the only available implementation of > TraceReporter interface, which is not very helpful. > In this FLIP I’m proposing to contribute both MetricExporter and > TraceReporter implementation using OpenTelemetry. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34164) [Benchmark] Compilation error since Jan. 16th
[ https://issues.apache.org/jira/browse/FLINK-34164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34164: - Fix Version/s: 1.19.0 > [Benchmark] Compilation error since Jan. 16th > - > > Key: FLINK-34164 > URL: https://issues.apache.org/jira/browse/FLINK-34164 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Reporter: Zakelly Lan >Assignee: Junrui Li >Priority: Critical > Fix For: 1.19.0 > > > An error occurred during the benchmark compilation: > {code:java} > 13:17:40 [ERROR] > /mnt/jenkins/workspace/flink-main-benchmarks/flink-benchmarks/warning:[options] > bootstrap class path not set in conjunction with -source 8 > 13:17:40 > /mnt/jenkins/workspace/flink-main-benchmarks/flink-benchmarks/src/main/java/org/apache/flink/benchmark/StreamGraphUtils.java:38:19: > error: cannot find symbol {code} > It seems related to FLINK-33980 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34164) [Benchmark] Compilation error since Jan. 16th
[ https://issues.apache.org/jira/browse/FLINK-34164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34164: - Priority: Critical (was: Major) > [Benchmark] Compilation error since Jan. 16th > - > > Key: FLINK-34164 > URL: https://issues.apache.org/jira/browse/FLINK-34164 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Reporter: Zakelly Lan >Assignee: Junrui Li >Priority: Critical > > An error occurred during the benchmark compilation: > {code:java} > 13:17:40 [ERROR] > /mnt/jenkins/workspace/flink-main-benchmarks/flink-benchmarks/warning:[options] > bootstrap class path not set in conjunction with -source 8 > 13:17:40 > /mnt/jenkins/workspace/flink-main-benchmarks/flink-benchmarks/src/main/java/org/apache/flink/benchmark/StreamGraphUtils.java:38:19: > error: cannot find symbol {code} > It seems related to FLINK-33980 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34148) Potential regression (Jan. 13): stringWrite with Java8
[ https://issues.apache.org/jira/browse/FLINK-34148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34148: - Priority: Critical (was: Major) > Potential regression (Jan. 13): stringWrite with Java8 > -- > > Key: FLINK-34148 > URL: https://issues.apache.org/jira/browse/FLINK-34148 > Project: Flink > Issue Type: Improvement > Components: API / Type Serialization System >Reporter: Zakelly Lan >Priority: Critical > Fix For: 1.19.0 > > > Significant performance drop in stringWrite with Java8 from commit > [881062f352|https://github.com/apache/flink/commit/881062f352f8bf8c21ab7cbea95e111fd82fdf20] > to > [5d9d8748b6|https://github.com/apache/flink/commit/5d9d8748b64ff1a75964a5cd2857ab5061312b51] > . It only involves relatively short strings (lengths 128 and 4). > stringWrite.128.ascii(Java8) baseline=1089.107756 current_value=754.52452 > stringWrite.128.chinese(Java8) baseline=504.244575 current_value=295.358989 > stringWrite.128.russian(Java8) baseline=655.582639 current_value=421.030188 > stringWrite.4.chinese(Java8) baseline=9598.791964 current_value=6627.929927 > stringWrite.4.russian(Java8) baseline=11070.666415 current_value=8289.95767 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34148) Potential regression (Jan. 13): stringWrite with Java8
[ https://issues.apache.org/jira/browse/FLINK-34148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-34148: - Fix Version/s: 1.19.0 > Potential regression (Jan. 13): stringWrite with Java8 > -- > > Key: FLINK-34148 > URL: https://issues.apache.org/jira/browse/FLINK-34148 > Project: Flink > Issue Type: Improvement > Components: API / Type Serialization System >Reporter: Zakelly Lan >Priority: Major > Fix For: 1.19.0 > > > Significant performance drop in stringWrite with Java8 from commit > [881062f352|https://github.com/apache/flink/commit/881062f352f8bf8c21ab7cbea95e111fd82fdf20] > to > [5d9d8748b6|https://github.com/apache/flink/commit/5d9d8748b64ff1a75964a5cd2857ab5061312b51] > . It only involves relatively short strings (lengths 128 and 4). > stringWrite.128.ascii(Java8) baseline=1089.107756 current_value=754.52452 > stringWrite.128.chinese(Java8) baseline=504.244575 current_value=295.358989 > stringWrite.128.russian(Java8) baseline=655.582639 current_value=421.030188 > stringWrite.4.chinese(Java8) baseline=9598.791964 current_value=6627.929927 > stringWrite.4.russian(Java8) baseline=11070.666415 current_value=8289.95767 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-33434) Support invoke async-profiler on Taskmanager through REST API
[ https://issues.apache.org/jira/browse/FLINK-33434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-33434. -- Fix Version/s: 1.19.0 Resolution: Fixed merged in master: 525f6bc818eb7f15a19fa81421584920de8f8876 ... 4bee4e6e8ddb41ae9933d04bf21183223db6c2de > Support invoke async-profiler on Taskmanager through REST API > - > > Key: FLINK-33434 > URL: https://issues.apache.org/jira/browse/FLINK-33434 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-34072) Use JAVA_RUN in shell scripts
[ https://issues.apache.org/jira/browse/FLINK-34072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34072. -- Resolution: Fixed merged in master: e7c8cd1562ebd45c1f7b48f519a11c6cd4fdf100 > Use JAVA_RUN in shell scripts > - > > Key: FLINK-34072 > URL: https://issues.apache.org/jira/browse/FLINK-34072 > Project: Flink > Issue Type: Improvement > Components: Deployment / Scripts >Reporter: Yun Tang >Assignee: Yu Chen >Priority: Minor > Labels: pull-request-available > Fix For: 1.19.0 > > > We should call {{JAVA_RUN}} in all cases when we launch the {{java}} command, > otherwise we might not be able to run {{java}} if JAVA_HOME is not set, > such as: > {code:java} > flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT/bin/config.sh: line 339: > 17 : > syntax error: operand expected (error token is "> 17 ") > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-18255) Add API annotations to RocksDB user-facing classes
[ https://issues.apache.org/jira/browse/FLINK-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807535#comment-17807535 ] Yun Tang commented on FLINK-18255: -- I think we need a FLIP, just like https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465498 [~lijinzhong], already assigned to you. > Add API annotations to RocksDB user-facing classes > -- > > Key: FLINK-18255 > URL: https://issues.apache.org/jira/browse/FLINK-18255 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.11.0 >Reporter: Nico Kruber >Assignee: Jinzhong Li >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor > Fix For: 1.19.0 > > > Several user-facing classes in {{flink-statebackend-rocksdb}} don't have any > API annotations, not even {{@PublicEvolving}}. These should be added to > clarify their usage. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-18255) Add API annotations to RocksDB user-facing classes
[ https://issues.apache.org/jira/browse/FLINK-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-18255: Assignee: Jinzhong Li > Add API annotations to RocksDB user-facing classes > -- > > Key: FLINK-18255 > URL: https://issues.apache.org/jira/browse/FLINK-18255 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.11.0 >Reporter: Nico Kruber >Assignee: Jinzhong Li >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor > Fix For: 1.19.0 > > > Several user-facing classes in {{flink-statebackend-rocksdb}} don't have any > API annotations, not even {{@PublicEvolving}}. These should be added to > clarify their usage. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-18255) Add API annotations to RocksDB user-facing classes
[ https://issues.apache.org/jira/browse/FLINK-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-18255: - Priority: Major (was: Not a Priority) > Add API annotations to RocksDB user-facing classes > -- > > Key: FLINK-18255 > URL: https://issues.apache.org/jira/browse/FLINK-18255 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.11.0 >Reporter: Nico Kruber >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor > > Several user-facing classes in {{flink-statebackend-rocksdb}} don't have any > API annotations, not even {{@PublicEvolving}}. These should be added to > clarify their usage. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-18255) Add API annotations to RocksDB user-facing classes
[ https://issues.apache.org/jira/browse/FLINK-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-18255: - Fix Version/s: 1.19.0 > Add API annotations to RocksDB user-facing classes > -- > > Key: FLINK-18255 > URL: https://issues.apache.org/jira/browse/FLINK-18255 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.11.0 >Reporter: Nico Kruber >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor > Fix For: 1.19.0 > > > Several user-facing classes in {{flink-statebackend-rocksdb}} don't have any > API annotations, not even {{@PublicEvolving}}. These should be added to > clarify their usage. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34072) Use JAVA_RUN in shell scripts
[ https://issues.apache.org/jira/browse/FLINK-34072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17806474#comment-17806474 ] Yun Tang commented on FLINK-34072: -- [~Yu Chen] Already assigned, please go ahead. > Use JAVA_RUN in shell scripts > - > > Key: FLINK-34072 > URL: https://issues.apache.org/jira/browse/FLINK-34072 > Project: Flink > Issue Type: Improvement > Components: Deployment / Scripts >Reporter: Yun Tang >Assignee: Yu Chen >Priority: Minor > Fix For: 1.19.0 > > > We should call {{JAVA_RUN}} in all cases when we launch the {{java}} command, > otherwise we might not be able to run {{java}} if JAVA_HOME is not set, > such as: > {code:java} > flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT/bin/config.sh: line 339: > 17 : > syntax error: operand expected (error token is "> 17 ") > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34072) Use JAVA_RUN in shell scripts
[ https://issues.apache.org/jira/browse/FLINK-34072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34072: Assignee: Yu Chen > Use JAVA_RUN in shell scripts > - > > Key: FLINK-34072 > URL: https://issues.apache.org/jira/browse/FLINK-34072 > Project: Flink > Issue Type: Improvement > Components: Deployment / Scripts >Reporter: Yun Tang >Assignee: Yu Chen >Priority: Minor > Fix For: 1.19.0 > > > We should call {{JAVA_RUN}} in all cases when we launch the {{java}} command, > otherwise we might not be able to run {{java}} if JAVA_HOME is not set, > such as: > {code:java} > flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT/bin/config.sh: line 339: > 17 : > syntax error: operand expected (error token is "> 17 ") > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34072) Use JAVA_RUN in shell scripts
Yun Tang created FLINK-34072: Summary: Use JAVA_RUN in shell scripts Key: FLINK-34072 URL: https://issues.apache.org/jira/browse/FLINK-34072 Project: Flink Issue Type: Improvement Components: Deployment / Scripts Reporter: Yun Tang Fix For: 1.19.0 We should call {{JAVA_RUN}} in all cases when we launch the {{java}} command, otherwise we might not be able to run {{java}} if JAVA_HOME is not set, such as: {code:java} flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT/bin/config.sh: line 339: > 17 : syntax error: operand expected (error token is "> 17 ") {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
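The pattern this ticket generalizes can be sketched in a few lines of shell. This mirrors the resolution logic in Flink's bin/config.sh only approximately — the function name and exact shape here are assumptions, not the script's actual code.

```shell
# Resolve which java binary to launch: prefer $JAVA_HOME/bin/java when
# JAVA_HOME is set, otherwise fall back to whatever `java` is on the PATH.
resolve_java_run() {
  if [ -n "${JAVA_HOME:-}" ]; then
    echo "$JAVA_HOME/bin/java"
  else
    echo "java"
  fi
}

JAVA_RUN=$(resolve_java_run)

# Every launch site should then use "$JAVA_RUN" ... instead of a bare `java`,
# so version probes and JVM startup don't break on hosts without JAVA_HOME.
echo "$JAVA_RUN"
```

The ticket's point is simply that every `java` invocation in the shell scripts should go through this one indirection rather than assuming `java` resolves on its own.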
[jira] [Resolved] (FLINK-34013) ProfilingServiceTest.testRollingDeletion is unstable on AZP
[ https://issues.apache.org/jira/browse/FLINK-34013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-34013. -- Fix Version/s: 1.19.0 Resolution: Fixed merged in master: c09f07a406398bc4b2320e9b5ae0a8f5f27a00dc > ProfilingServiceTest.testRollingDeletion is unstable on AZP > --- > > Key: FLINK-34013 > URL: https://issues.apache.org/jira/browse/FLINK-34013 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Assignee: Yu Chen >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.19.0 > > > This build > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56073=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8258 > fails as > {noformat} > Jan 06 02:09:28 org.opentest4j.AssertionFailedError: expected: <2> but was: > <3> > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145) > Jan 06 02:09:28 at > org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531) > Jan 06 02:09:28 at > org.apache.flink.runtime.util.profiler.ProfilingServiceTest.verifyRollingDeletionWorks(ProfilingServiceTest.java:167) > Jan 06 02:09:28 at > org.apache.flink.runtime.util.profiler.ProfilingServiceTest.testRollingDeletion(ProfilingServiceTest.java:117) > Jan 06 02:09:28 at java.lang.reflect.Method.invoke(Method.java:498) > Jan 06 02:09:28 at > java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189) > Jan 06 02:09:28 at > 
java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
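The failing assertion above (expected 2 files, found 3) checks that rolling deletion leaves exactly the configured number of profiling result files behind. A minimal stdlib-only sketch of the keep-newest-N idea (a hypothetical helper for illustration, not Flink's actual ProfilingService code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Keep only the newest `historySize` regular files in `dir`,
// deleting the older ones. Recency is judged by last-modified time.
public class RollingDeletion {
    public static void rollDelete(Path dir, int historySize) throws IOException {
        List<Path> files;
        try (Stream<Path> s = Files.list(dir)) {
            files = s.filter(Files::isRegularFile)
                     // newest first, by last-modified timestamp
                     .sorted(Comparator.comparingLong(
                                 (Path p) -> p.toFile().lastModified()).reversed())
                     .collect(Collectors.toList());
        }
        // everything beyond the first `historySize` entries is stale
        for (Path stale : files.subList(Math.min(historySize, files.size()), files.size())) {
            Files.delete(stale);
        }
    }
}
```

The instability the ticket describes is consistent with an off-by-one or a race around exactly this invariant: after a roll, the directory must contain at most `historySize` files.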
[jira] [Assigned] (FLINK-34013) ProfilingServiceTest.testRollingDeletion is unstable on AZP
[ https://issues.apache.org/jira/browse/FLINK-34013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34013: Assignee: Yu Chen (was: Yu Chen) > ProfilingServiceTest.testRollingDeletion is unstable on AZP > --- > > Key: FLINK-34013 > URL: https://issues.apache.org/jira/browse/FLINK-34013 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Assignee: Yu Chen >Priority: Critical > Labels: pull-request-available, test-stability > > This build > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56073=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8258 > fails as > {noformat} > Jan 06 02:09:28 org.opentest4j.AssertionFailedError: expected: <2> but was: > <3> > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145) > Jan 06 02:09:28 at > org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531) > Jan 06 02:09:28 at > org.apache.flink.runtime.util.profiler.ProfilingServiceTest.verifyRollingDeletionWorks(ProfilingServiceTest.java:167) > Jan 06 02:09:28 at > org.apache.flink.runtime.util.profiler.ProfilingServiceTest.testRollingDeletion(ProfilingServiceTest.java:117) > Jan 06 02:09:28 at java.lang.reflect.Method.invoke(Method.java:498) > Jan 06 02:09:28 at > java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > Jan 06 02:09:28 at > 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34013) ProfilingServiceTest.testRollingDeletion is unstable on AZP
[ https://issues.apache.org/jira/browse/FLINK-34013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-34013: Assignee: Yu Chen > ProfilingServiceTest.testRollingDeletion is unstable on AZP > --- > > Key: FLINK-34013 > URL: https://issues.apache.org/jira/browse/FLINK-34013 > Project: Flink > Issue Type: Bug > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Assignee: Yu Chen >Priority: Critical > Labels: pull-request-available, test-stability > > This build > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56073=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8258 > fails as > {noformat} > Jan 06 02:09:28 org.opentest4j.AssertionFailedError: expected: <2> but was: > <3> > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) > Jan 06 02:09:28 at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145) > Jan 06 02:09:28 at > org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531) > Jan 06 02:09:28 at > org.apache.flink.runtime.util.profiler.ProfilingServiceTest.verifyRollingDeletionWorks(ProfilingServiceTest.java:167) > Jan 06 02:09:28 at > org.apache.flink.runtime.util.profiler.ProfilingServiceTest.testRollingDeletion(ProfilingServiceTest.java:117) > Jan 06 02:09:28 at java.lang.reflect.Method.invoke(Method.java:498) > Jan 06 02:09:28 at > java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > Jan 06 02:09:28 at > 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) > Jan 06 02:09:28 at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-33433) Support invoke async-profiler on Jobmanager through REST API
[ https://issues.apache.org/jira/browse/FLINK-33433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-33433. -- Fix Version/s: 1.19.0 Resolution: Fixed merged in master: 240494fd6169cb98b47808a003ee00804a780360...3efe9d2b09bedde89322594f0f3927004b6b1adf > Support invoke async-profiler on Jobmanager through REST API > > > Key: FLINK-33433 > URL: https://issues.apache.org/jira/browse/FLINK-33433 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30535) Introduce TTL state based benchmarks
[ https://issues.apache.org/jira/browse/FLINK-30535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800407#comment-17800407 ] Yun Tang commented on FLINK-30535: -- Some work done on Flink side: master: 0c82f8af859a4f463a07f5dfb35648970c1c3425 > Introduce TTL state based benchmarks > > > Key: FLINK-30535 > URL: https://issues.apache.org/jira/browse/FLINK-30535 > Project: Flink > Issue Type: New Feature > Components: Benchmarks >Reporter: Yun Tang >Assignee: Zakelly Lan >Priority: Major > Labels: pull-request-available > > This ticket is inspired by https://issues.apache.org/jira/browse/FLINK-30088 > which wants to optimize the TTL state performance. I think it would be useful > to introduce state benchmarks based on TTL as Flink has some overhead to > support TTL. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30535) Introduce TTL state based benchmarks
[ https://issues.apache.org/jira/browse/FLINK-30535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796562#comment-17796562 ] Yun Tang commented on FLINK-30535: -- [~Zakelly] would you like to take this ticket? > Introduce TTL state based benchmarks > > > Key: FLINK-30535 > URL: https://issues.apache.org/jira/browse/FLINK-30535 > Project: Flink > Issue Type: New Feature > Components: Benchmarks >Reporter: Yun Tang >Priority: Major > > This ticket is inspired by https://issues.apache.org/jira/browse/FLINK-30088 > which wants to optimize the TTL state performance. I think it would be useful > to introduce state benchmarks based on TTL as Flink has some overhead to > support TTL. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31752) SourceOperatorStreamTask increments numRecordsOut twice
[ https://issues.apache.org/jira/browse/FLINK-31752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-31752: - Fix Version/s: 1.17.1 > SourceOperatorStreamTask increments numRecordsOut twice > --- > > Key: FLINK-31752 > URL: https://issues.apache.org/jira/browse/FLINK-31752 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.17.0 >Reporter: huwh >Assignee: Yunfeng Zhou >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.17.1 > > Attachments: image-2023-04-07-15-51-44-304.png > > > The counter of numRecordsOut was introduced into ChainingOutput to reduce the > function call stack depth in > https://issues.apache.org/jira/browse/FLINK-30536 > But SourceOperatorStreamTask.AsyncDataOutputToOutput increments the counter > of numRecordsOut too. This results in the source operator's numRecordsOut being > doubled. > We should delete the numRecordsOut.inc in > SourceOperatorStreamTask.AsyncDataOutputToOutput. > [~xtsong][~lindong] Could you please take a look at this? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31752) SourceOperatorStreamTask increments numRecordsOut twice
[ https://issues.apache.org/jira/browse/FLINK-31752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-31752: - Fix Version/s: 1.18.0 > SourceOperatorStreamTask increments numRecordsOut twice > --- > > Key: FLINK-31752 > URL: https://issues.apache.org/jira/browse/FLINK-31752 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.17.0 >Reporter: huwh >Assignee: Yunfeng Zhou >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > Attachments: image-2023-04-07-15-51-44-304.png > > > The counter of numRecordsOut was introduced into ChainingOutput to reduce the > function call stack depth in > https://issues.apache.org/jira/browse/FLINK-30536 > But SourceOperatorStreamTask.AsyncDataOutputToOutput increments the counter > of numRecordsOut too. This results in the source operator's numRecordsOut being > doubled. > We should delete the numRecordsOut.inc in > SourceOperatorStreamTask.AsyncDataOutputToOutput. > [~xtsong][~lindong] Could you please take a look at this? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-23346) RocksDBStateBackend may core dump in flink_compactionfilterjni.cc
[ https://issues.apache.org/jira/browse/FLINK-23346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795654#comment-17795654 ] Yun Tang commented on FLINK-23346: -- [~Zakelly] the Flink community is currently trying to release a new FRocksDB version due to https://issues.apache.org/jira/browse/FLINK-8. Do you think you can fix it in this version? > RocksDBStateBackend may core dump in flink_compactionfilterjni.cc > - > > Key: FLINK-23346 > URL: https://issues.apache.org/jira/browse/FLINK-23346 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.14.0, 1.13.1, 1.12.4 >Reporter: Congxian Qiu >Priority: Major > > The code in [flink_compactionfilterjni.cc > |https://github.com/ververica/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/java/rocksjni/flink_compactionfilterjni.cc#L21] > {code:cpp} > inline void CheckAndRethrowException(JNIEnv* env) const { > if (env->ExceptionCheck()) { > env->ExceptionDescribe(); > env->Throw(env->ExceptionOccurred()); > } > } > {code} > may core dump in some scenarios; please see more information here [1][2][3] > We can fix it by changing this to > {code:cpp} > inline void CheckAndRethrowException(JNIEnv* env) const { > if (env->ExceptionCheck()) { > env->Throw(env->ExceptionOccurred()); > } > } > {code} > or > {code:cpp} >inline void CheckAndRethrowException(JNIEnv* env) const { > if (env->ExceptionCheck()) { > jobject obj = env->ExceptionOccurred(); > env->ExceptionDescribe(); > env->Throw(obj); > } > } > {code} > [1] > [https://stackoverflow.com/questions/30971068/does-jniexceptiondescribe-implicitily-clear-the-exception-trace-of-the-jni-env] > [2] [https://bugs.openjdk.java.net/browse/JDK-4067541] > [3] [https://bugs.openjdk.java.net/browse/JDK-8051947] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-33246) Add RescalingIT case that uses checkpoints and resource requests
[ https://issues.apache.org/jira/browse/FLINK-33246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-33246. -- Fix Version/s: 1.19.0 Resolution: Fixed merged in master 98e4610f09f35a942e55472b5d358ebe113b0dba > Add RescalingIT case that uses checkpoints and resource requests > > > Key: FLINK-33246 > URL: https://issues.apache.org/jira/browse/FLINK-33246 > Project: Flink > Issue Type: Improvement > Components: Tests >Reporter: Stefan Richter >Assignee: Stefan Richter >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > RescalingITCase currently uses savepoints and cancel/restart for rescaling. > We should add a test that also tests rescaling from checkpoints under > changing resource requirements, i.e. without cancellation of the job. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33341) Use available local keyed state for rescaling
[ https://issues.apache.org/jira/browse/FLINK-33341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33341: - Fix Version/s: 1.19.0 > Use available local keyed state for rescaling > - > > Key: FLINK-33341 > URL: https://issues.apache.org/jira/browse/FLINK-33341 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: Stefan Richter >Assignee: Stefan Richter >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Local state is currently only used for recovery. However, it would make sense > to also use available local state in rescaling scenarios to reduce the amount > of data to download from remote storage. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33741) Exposed Rocksdb statistics in Flink metrics and introduce 2 Rocksdb statistic related configuration
[ https://issues.apache.org/jira/browse/FLINK-33741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795142#comment-17795142 ] Yun Tang commented on FLINK-33741: -- [~zhoujira86] I think there is valuable information in the RocksDB statistics; assigned to you, please go ahead. > Exposed Rocksdb statistics in Flink metrics and introduce 2 Rocksdb statistic > related configuration > --- > > Key: FLINK-33741 > URL: https://issues.apache.org/jira/browse/FLINK-33741 > Project: Flink > Issue Type: New Feature >Reporter: xiaogang zhou >Assignee: xiaogang zhou >Priority: Major > > I think we can also parse the multi-line string of the RocksDB statistics. > {code:java} > // code placeholder > /** > * DB implementations can export properties about their state > * via this method on a per column family level. > * > * If {@code property} is a valid property understood by this DB > * implementation, fills {@code value} with its current value and > * returns true. Otherwise returns false. > * > * Valid property names include: > * > * "rocksdb.num-files-at-levelN" - return the number of files at > * level N, where N is an ASCII representation of a level > * number (e.g. "0"). > * "rocksdb.stats" - returns a multi-line string that describes statistics > * about the internal operation of the DB. > * "rocksdb.sstables" - returns a multi-line string that describes all > *of the sstables that make up the db contents. > * > * > * @param columnFamilyHandle {@link org.rocksdb.ColumnFamilyHandle} > * instance, or null for the default column family. > * @param property to be fetched. See above for examples > * @return property value > * > * @throws RocksDBException thrown if error happens in underlying > *native library. > */ > public String getProperty( > /* @Nullable */ final ColumnFamilyHandle columnFamilyHandle, > final String property) throws RocksDBException { {code} > > Then we can directly export these RT latency numbers in metrics. 
> > I'd like to introduce two RocksDB statistics-related configuration options. > Then we can customize the stats: > {code:java} > // code placeholder > Statistics s = new Statistics(); > s.setStatsLevel(EXCEPT_TIME_FOR_MUTEX); > currentOptions.setStatsDumpPeriodSec(internalGetOption(RocksDBConfigurableOptions.STATISTIC_DUMP_PERIOD)) > .setStatistics(s); {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33741) Exposed Rocksdb statistics in Flink metrics and introduce 2 Rocksdb statistic related configuration
[ https://issues.apache.org/jira/browse/FLINK-33741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33741: Assignee: xiaogang zhou > Exposed Rocksdb statistics in Flink metrics and introduce 2 Rocksdb statistic > related configuration > --- > > Key: FLINK-33741 > URL: https://issues.apache.org/jira/browse/FLINK-33741 > Project: Flink > Issue Type: New Feature >Reporter: xiaogang zhou >Assignee: xiaogang zhou >Priority: Major > > I think we can also parse the multi-line string of the RocksDB statistics. > {code:java} > // code placeholder > /** > * DB implementations can export properties about their state > * via this method on a per column family level. > * > * If {@code property} is a valid property understood by this DB > * implementation, fills {@code value} with its current value and > * returns true. Otherwise returns false. > * > * Valid property names include: > * > * "rocksdb.num-files-at-levelN" - return the number of files at > * level N, where N is an ASCII representation of a level > * number (e.g. "0"). > * "rocksdb.stats" - returns a multi-line string that describes statistics > * about the internal operation of the DB. > * "rocksdb.sstables" - returns a multi-line string that describes all > *of the sstables that make up the db contents. > * > * > * @param columnFamilyHandle {@link org.rocksdb.ColumnFamilyHandle} > * instance, or null for the default column family. > * @param property to be fetched. See above for examples > * @return property value > * > * @throws RocksDBException thrown if error happens in underlying > *native library. > */ > public String getProperty( > /* @Nullable */ final ColumnFamilyHandle columnFamilyHandle, > final String property) throws RocksDBException { {code} > > Then we can directly export these RT latency numbers in metrics. > > I'd like to introduce two RocksDB statistics-related configuration options. 
> Then we can customize the stats: > {code:java} > // code placeholder > Statistics s = new Statistics(); > s.setStatsLevel(EXCEPT_TIME_FOR_MUTEX); > currentOptions.setStatsDumpPeriodSec(internalGetOption(RocksDBConfigurableOptions.STATISTIC_DUMP_PERIOD)) > .setStatistics(s); {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
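The first half of the proposal above is to parse RocksDB's multi-line statistics string into individual metrics that a Flink reporter could export. A stdlib-only sketch of such a parser; the `NAME COUNT : n` ticker line shape is an assumption modeled on RocksDB's textual statistics dump and may differ between versions:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for RocksDB ticker lines of the assumed form
// "rocksdb.block.cache.hit COUNT : 42". Histogram lines and anything
// else that does not match the four-token shape are skipped.
public class RocksDbStatsParser {
    public static Map<String, Long> parseTickers(String stats) {
        Map<String, Long> tickers = new LinkedHashMap<>();
        for (String line : stats.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            // expected shape: <name> COUNT : <value>
            if (parts.length == 4 && "COUNT".equals(parts[1]) && ":".equals(parts[2])) {
                try {
                    tickers.put(parts[0], Long.parseLong(parts[3]));
                } catch (NumberFormatException ignored) {
                    // skip malformed values rather than fail the whole dump
                }
            }
        }
        return tickers;
    }
}
```

Each parsed entry could then be registered as a gauge in Flink's metric group, which is what exporting the statistics in metrics would amount to.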
[jira] [Updated] (FLINK-24819) Higher APIServer cpu load after using SharedIndexInformer replaced naked Kubernetes watch
[ https://issues.apache.org/jira/browse/FLINK-24819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-24819: - Fix Version/s: 1.19.0 1.18.1 1.17.3 > Higher APIServer cpu load after using SharedIndexInformer replaced naked > Kubernetes watch > - > > Key: FLINK-24819 > URL: https://issues.apache.org/jira/browse/FLINK-24819 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.14.0 >Reporter: Yang Wang >Priority: Major > Fix For: 1.19.0, 1.18.1, 1.17.3 > > > In FLINK-22054, Flink used a shared informer for ConfigMaps to replace the > naked K8s watch. Since then, each Flink JVM process (JM/TM) only needs one > connection to the APIServer for ConfigMap watching. It aims to reduce the network > pressure on the K8s APIServer. > > However, in our recent tests, we found that the CPU and memory costs of the > APIServer doubled while running the same Flink workloads. After digging > into more details in K8s, I think the root cause might be that ETCD does not > have indexes for labels. It means the APIServer needs to pull all the events from > ETCD for each watch and then filter by the specified labels (e.g. > app=xxx,type=flink-native-kubernetes,configmap-type=high-availability) > internally. Before FLINK-22054, we started a dedicated connection for > watching each ConfigMap. And it seems that the APIServer only needs to pull the events > for the specified ConfigMap name. 
> > Watch URL example(Before): > [https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?metadata.name=job-009d4f51-ca02-4793-a49b-a3344538719b-resourcemanager-leader=true|https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?labelSelector=app%3Dk8s-ha-app-1-1636077491-23461%2Ctype%3Dflink-native-kubernetes%2Cconfigmap-type%3Dhigh-availability=1153687321=true] > > Watch URL example(After): > [https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?labelSelector=app%3Dk8s-ha-app-1-1636077491-23461%2Ctype%3Dflink-native-kubernetes%2Cconfigmap-type%3Dhigh-availability=1153687321=true] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-24819) Higher APIServer cpu load after using SharedIndexInformer replaced naked Kubernetes watch
[ https://issues.apache.org/jira/browse/FLINK-24819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-24819. -- Resolution: Fixed > Higher APIServer cpu load after using SharedIndexInformer replaced naked > Kubernetes watch > - > > Key: FLINK-24819 > URL: https://issues.apache.org/jira/browse/FLINK-24819 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.14.0 >Reporter: Yang Wang >Priority: Major > > In FLINK-22054, Flink used a shared informer for ConfigMaps to replace the > naked K8s watch. Since then, each Flink JVM process (JM/TM) only needs one > connection to the APIServer for ConfigMap watching. It aims to reduce the network > pressure on the K8s APIServer. > > However, in our recent tests, we found that the CPU and memory costs of the > APIServer doubled while running the same Flink workloads. After digging > into more details in K8s, I think the root cause might be that ETCD does not > have indexes for labels. It means the APIServer needs to pull all the events from > ETCD for each watch and then filter by the specified labels (e.g. > app=xxx,type=flink-native-kubernetes,configmap-type=high-availability) > internally. Before FLINK-22054, we started a dedicated connection for > watching each ConfigMap. And it seems that the APIServer only needs to pull the events > for the specified ConfigMap name. 
> > Watch URL example(Before): > [https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?metadata.name=job-009d4f51-ca02-4793-a49b-a3344538719b-resourcemanager-leader=true|https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?labelSelector=app%3Dk8s-ha-app-1-1636077491-23461%2Ctype%3Dflink-native-kubernetes%2Cconfigmap-type%3Dhigh-availability=1153687321=true] > > Watch URL example(After): > [https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?labelSelector=app%3Dk8s-ha-app-1-1636077491-23461%2Ctype%3Dflink-native-kubernetes%2Cconfigmap-type%3Dhigh-availability=1153687321=true] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-32611) Redirect to Apache Paimon's link instead of legacy flink table store
[ https://issues.apache.org/jira/browse/FLINK-32611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-32611. -- Fix Version/s: 1.18.1 Resolution: Fixed merged in flink-web: 4cd8ab2fa927f48d74ed53e79a5e83efa674a720 > Redirect to Apache Paimon's link instead of legacy flink table store > > > Key: FLINK-32611 > URL: https://issues.apache.org/jira/browse/FLINK-32611 > Project: Flink > Issue Type: Improvement > Components: Documentation, Project Website >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Major > Labels: pull-request-available, stale-assigned > Fix For: 1.19.0, 1.18.1 > > > Current Flink's official web site would always point to the legacy flink > table store. However, we should point to the new Apache Paimon website and > docs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33707) Verify the snapshot migration on Java17
Yun Tang created FLINK-33707: Summary: Verify the snapshot migration on Java17 Key: FLINK-33707 URL: https://issues.apache.org/jira/browse/FLINK-33707 Project: Flink Issue Type: Improvement Components: Runtime / Checkpointing Reporter: Yun Tang This task is similar to FLINK-33699; I think we could introduce a StatefulJobSnapshotMigrationITCase-like test to restore snapshots containing Scala code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33699) Verify the snapshot migration on Java21
[ https://issues.apache.org/jira/browse/FLINK-33699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791592#comment-17791592 ] Yun Tang commented on FLINK-33699: -- I think it's better to introduce new tests to cover this problem. I checked the [CI|https://dev.azure.com/snuyanzin/flink/_build/results?buildId=2620=logs=0a15d512-44ac-5ba5-97ab-13a5d066c22c=9a028d19-6c4b-5a4e-d378-03fca149d0b1] you triggered before and noticed that the {{StatefulJobSnapshotMigrationITCase}}-related tests have passed, which proves what I guessed before: most checkpoints/savepoints should be restored successfully. > Verify the snapshot migration on Java21 > --- > > Key: FLINK-33699 > URL: https://issues.apache.org/jira/browse/FLINK-33699 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing >Reporter: Yun Tang >Priority: Major > > In Java 21 builds, Scala is being bumped to 2.12.18, which causes > incompatibilities within Flink. > This could affect loading savepoints from a Java 8/11/17 build. We already > have tests extending {{SnapshotMigrationTestBase}} to verify the logic of > migrating snapshots generated by the older Flink version. I think we can also > introduce similar tests to verify the logic across different Java versions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33699) Verify the snapshot migration on Java21
Yun Tang created FLINK-33699: Summary: Verify the snapshot migration on Java21 Key: FLINK-33699 URL: https://issues.apache.org/jira/browse/FLINK-33699 Project: Flink Issue Type: Sub-task Components: Runtime / Checkpointing Reporter: Yun Tang In Java 21 builds, Scala is being bumped to 2.12.18, which causes incompatibilities within Flink. This could affect loading savepoints from a Java 8/11/17 build. We already have tests extending {{SnapshotMigrationTestBase}} to verify the logic of migrating snapshots generated by the older Flink version. I think we can also introduce similar tests to verify the logic across different Java versions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33395) The join hint doesn't work when appears in subquery
[ https://issues.apache.org/jira/browse/FLINK-33395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33395: - Fix Version/s: 1.17.3 (was: 1.17.2) > The join hint doesn't work when appears in subquery > --- > > Key: FLINK-33395 > URL: https://issues.apache.org/jira/browse/FLINK-33395 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.16.0, 1.17.0, 1.18.0 >Reporter: xuyang >Assignee: xuyang >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0, 1.18.1, 1.17.3 > > > See the existent test > 'NestLoopJoinHintTest#testJoinHintWithJoinHintInCorrelateAndWithAgg', the > test plan is > {code:java} > HashJoin(joinType=[LeftSemiJoin], where=[=(a1, EXPR$0)], select=[a1, b1], > build=[right], tryDistinctBuildRow=[true]) > :- Exchange(distribution=[hash[a1]]) > : +- TableSourceScan(table=[[default_catalog, default_database, T1]], > fields=[a1, b1]) > +- Exchange(distribution=[hash[EXPR$0]]) >+- LocalHashAggregate(groupBy=[EXPR$0], select=[EXPR$0]) > +- Calc(select=[EXPR$0]) > +- HashAggregate(isMerge=[true], groupBy=[a1], select=[a1, > Final_COUNT(count$0) AS EXPR$0]) > +- Exchange(distribution=[hash[a1]]) >+- LocalHashAggregate(groupBy=[a1], select=[a1, > Partial_COUNT(a2) AS count$0]) > +- NestedLoopJoin(joinType=[InnerJoin], where=[=(a2, a1)], > select=[a2, a1], build=[right]) > :- TableSourceScan(table=[[default_catalog, > default_database, T2, project=[a2], metadata=[]]], fields=[a2], > hints=[[[ALIAS options:[T2) > +- Exchange(distribution=[broadcast]) > +- TableSourceScan(table=[[default_catalog, > default_database, T1, project=[a1], metadata=[]]], fields=[a1], > hints=[[[ALIAS options:[T1) {code} > but the NestedLoopJoin should broadcase left side. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31385) Introduce extended Assertj Matchers for completable futures
[ https://issues.apache.org/jira/browse/FLINK-31385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789347#comment-17789347 ] Yun Tang commented on FLINK-31385: -- Also cherry-picked into release-1.17 as 2a83c910eee711f8b5f9dd4697de60221f21fb9d for the pick of FLINK-33598. > Introduce extended Assertj Matchers for completable futures > --- > > Key: FLINK-31385 > URL: https://issues.apache.org/jira/browse/FLINK-31385 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: David Morávek >Assignee: David Morávek >Priority: Minor > Labels: pull-request-available > Fix For: 1.18.0, 1.17.3 > > > Introduce extended Assertj Matchers for completable futures that don't rely > on timeouts. > In general, we want to avoid relying on timeouts in the Flink test suite to > get additional context (thread dump) in case something gets stuck. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31385) Introduce extended Assertj Matchers for completable futures
[ https://issues.apache.org/jira/browse/FLINK-31385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-31385: - Fix Version/s: 1.17.3 > Introduce extended Assertj Matchers for completable futures > --- > > Key: FLINK-31385 > URL: https://issues.apache.org/jira/browse/FLINK-31385 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: David Morávek >Assignee: David Morávek >Priority: Minor > Labels: pull-request-available > Fix For: 1.18.0, 1.17.3 > > > Introduce extended Assertj Matchers for completable futures that don't rely > on timeouts. > In general, we want to avoid relying on timeouts in the Flink test suite to > get additional context (thread dump) in case something gets stuck. -- This message was sent by Atlassian Jira (v8.20.10#820010)
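The idea behind such matchers — failing fast on an incomplete future instead of arming a timeout, so a stuck test surfaces with context rather than a timeout exception — can be sketched as follows. The class and method names here are hypothetical illustrations, not Flink's or AssertJ's actual API:

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of a timeout-free assertion helper for completable futures. */
public final class FutureAsserts {
    private FutureAsserts() {}

    // Fails immediately (no blocking, no timeout) if the future has not completed,
    // so a stuck test shows up as a plain assertion error; a real matcher could
    // dump thread stacks here to provide the "additional context" mentioned above.
    public static <T> T assertCompletedNormally(CompletableFuture<T> future) {
        if (!future.isDone()) {
            throw new AssertionError("Future has not completed yet");
        }
        if (future.isCompletedExceptionally()) {
            // handle() extracts the failure cause without throwing.
            throw new AssertionError("Future completed exceptionally",
                    future.handle((value, error) -> error).join());
        }
        return future.join();
    }
}
```

Such an assertion only makes sense at a point where the future is expected to already be complete; for genuinely asynchronous completion, production code would still need some form of synchronization before asserting.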
[jira] [Resolved] (FLINK-33598) Watch HA configmap via name instead of labels to reduce pressure on API server
[ https://issues.apache.org/jira/browse/FLINK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-33598. -- Resolution: Fixed Merged master: 608546e090f5d41c6a8b9af2c264467279181027 ... b7e8b792c086c3c445ee8429fbcfe035097a878c release-1.18: 6f30c6e427251dd4b2e4ad03f89bed06a519b05f release-1.17: 18d5a4696eccac3b5e7fe1d579547feef4537c08 > Watch HA configmap via name instead of lables to reduce pressure on APIserver > -- > > Key: FLINK-33598 > URL: https://issues.apache.org/jira/browse/FLINK-33598 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.18.0, 1.17.1 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Critical > Labels: pull-request-available > Fix For: 1.19.0, 1.18.1, 1.17.3 > > > As FLINK-24819 described, the k8s API server would receive more pressure when > HA is enabled, due to the configmap watching being achieved via filter with > labels instead of just querying the configmap name. This could be done after > FLINK-24038, which reduced the number of configmaps to only one as > {{-cluster-config-map}}. > This ticket would not touch {{--config-map}}, which stores > the checkpoint information, as that configmap is directly accessed by JM and > not watched by taskmanagers. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33149) Bump snappy-java to 1.1.10.4
[ https://issues.apache.org/jira/browse/FLINK-33149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788969#comment-17788969 ] Yun Tang commented on FLINK-33149: -- [~mapohl] When can we close this ticket? > Bump snappy-java to 1.1.10.4 > > > Key: FLINK-33149 > URL: https://issues.apache.org/jira/browse/FLINK-33149 > Project: Flink > Issue Type: Bug > Components: API / Core, Connectors / AWS, Connectors / HBase, > Connectors / Kafka, Stateful Functions >Affects Versions: 1.18.0, 1.16.3, 1.17.2 >Reporter: Ryan Skraba >Assignee: Ryan Skraba >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, kafka-4.0.0, 1.16.3, 1.17.2 > > > Xerial published a security alert for a Denial of Service attack that [exists > on > 1.1.10.1|https://github.com/xerial/snappy-java/security/advisories/GHSA-55g7-9cwv-5qfv]. > This is included in flink-dist, but also in flink-statefun, and several > connectors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33598) Watch HA configmap via name instead of labels to reduce pressure on API server
[ https://issues.apache.org/jira/browse/FLINK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33598: - Description: As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one as {{-cluster-config-map}}. This ticket would not touch {{--config-map}}, which stores the checkpoint information, as that configmap is directly accessed by JM and not watched by taskmanagers. was: As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one as {{-cluster-config-map}}. This ticket would not touch {{--config-map}}, which stores the checkpoint information, as that configmap is only used by JM and not watched by taskmanagers. > Watch HA configmap via name instead of lables to reduce pressure on APIserver > -- > > Key: FLINK-33598 > URL: https://issues.apache.org/jira/browse/FLINK-33598 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.18.0, 1.17.1 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Critical > Labels: pull-request-available > Fix For: 1.19.0, 1.18.1, 1.17.3 > > > As FLINK-24819 described, the k8s API server would receive more pressure when > HA is enabled, due to the configmap watching being achieved via filter with > labels instead of just querying the configmap name. This could be done after > FLINK-24038, which reduced the number of configmaps to only one as > {{-cluster-config-map}}. 
> This ticket would not touch {{--config-map}}, which stores > the checkpoint information, as that configmap is directly accessed by JM and > not watched by taskmanagers. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33598) Watch HA configmap via name instead of labels to reduce pressure on API server
[ https://issues.apache.org/jira/browse/FLINK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33598: - Description: As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one as {{-cluster-config-map}}. This ticket would not touch {{--config-map}}, which stores the checkpoint information, as that configmap is only used by JM and not watched by taskmanagers. was:As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one as {{-cluster-config-map}}. > Watch HA configmap via name instead of lables to reduce pressure on APIserver > -- > > Key: FLINK-33598 > URL: https://issues.apache.org/jira/browse/FLINK-33598 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.18.0, 1.17.1 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Critical > Labels: pull-request-available > Fix For: 1.19.0, 1.18.1, 1.17.3 > > > As FLINK-24819 described, the k8s API server would receive more pressure when > HA is enabled, due to the configmap watching being achieved via filter with > labels instead of just querying the configmap name. This could be done after > FLINK-24038, which reduced the number of configmaps to only one as > {{-cluster-config-map}}. > This ticket would not touch {{--config-map}}, which stores > the checkpoint information, as that configmap is only used by JM and not > watched by taskmanagers. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33598) Watch HA configmap via name instead of labels to reduce pressure on API server
[ https://issues.apache.org/jira/browse/FLINK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-33598: - Description: As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one as {{-cluster-config-map}}. (was: As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one.) > Watch HA configmap via name instead of lables to reduce pressure on APIserver > -- > > Key: FLINK-33598 > URL: https://issues.apache.org/jira/browse/FLINK-33598 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.18.0, 1.17.1 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Critical > Labels: pull-request-available > Fix For: 1.19.0, 1.18.1, 1.17.3 > > > As FLINK-24819 described, the k8s API server would receive more pressure when > HA is enabled, due to the configmap watching being achieved via filter with > labels instead of just querying the configmap name. This could be done after > FLINK-24038, which reduced the number of configmaps to only one as > {{-cluster-config-map}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33598) Watch HA configmap via name instead of labels to reduce pressure on API server
Yun Tang created FLINK-33598: Summary: Watch HA configmap via name instead of labels to reduce pressure on API server Key: FLINK-33598 URL: https://issues.apache.org/jira/browse/FLINK-33598 Project: Flink Issue Type: Improvement Components: Deployment / Kubernetes Affects Versions: 1.17.1, 1.18.0 Reporter: Yun Tang Assignee: Yun Tang Fix For: 1.19.0, 1.18.1, 1.17.3 As FLINK-24819 described, the k8s API server would receive more pressure when HA is enabled, due to the configmap watching being achieved via filter with labels instead of just querying the configmap name. This could be done after FLINK-24038, which reduced the number of configmaps to only one. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33263) Implement ParallelismProvider for sources in the table planner
[ https://issues.apache.org/jira/browse/FLINK-33263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786219#comment-17786219 ] Yun Tang commented on FLINK-33263: -- [~Zhanghao Chen] Thanks for the update. Looking forward to the PR. > Implement ParallelismProvider for sources in the table planner > -- > > Key: FLINK-33263 > URL: https://issues.apache.org/jira/browse/FLINK-33263 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Zhanghao Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-33263) Implement ParallelismProvider for sources in Blink planner
[ https://issues.apache.org/jira/browse/FLINK-33263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784448#comment-17784448 ] Yun Tang edited comment on FLINK-33263 at 11/15/23 4:01 AM: [~Zhanghao Chen] Do we still have some specific planner called {{Blink}} planner currently? There is only one table planner now. was (Author: yunta): Do we still have some specific planner called {{Blink}} planner currently? There is only one table planner now. > Implement ParallelismProvider for sources in Blink planner > -- > > Key: FLINK-33263 > URL: https://issues.apache.org/jira/browse/FLINK-33263 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Zhanghao Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33263) Implement ParallelismProvider for sources in Blink planner
[ https://issues.apache.org/jira/browse/FLINK-33263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784448#comment-17784448 ] Yun Tang commented on FLINK-33263: -- Do we still have some specific planner called {{Blink}} planner currently? There is only one table planner now. > Implement ParallelismProvider for sources in Blink planner > -- > > Key: FLINK-33263 > URL: https://issues.apache.org/jira/browse/FLINK-33263 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Zhanghao Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-20672) notifyCheckpointAborted RPC failure can fail JM
[ https://issues.apache.org/jira/browse/FLINK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784035#comment-17784035 ] Yun Tang commented on FLINK-20672: -- [~Zakelly] Thanks for the information. If so, I have another question: do we really need the {{io-executor}} to work with {{FatalExitExceptionHandler}}? From my point of view, if we do not delete the Savepoint correctly (as this is also executed on the {{io-executor}}), do we need to fail the whole JobManager? If the correct behavior of the {{io-executor}}'s exception handler is not to exit fatally, I think we should correct that behavior first. [~Zakelly], [~roman], [~srichter] WDYT? > notifyCheckpointAborted RPC failure can fail JM > --- > > Key: FLINK-20672 > URL: https://issues.apache.org/jira/browse/FLINK-20672 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.3, 1.12.0 >Reporter: Roman Khachatryan >Assignee: Zakelly Lan >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor, > pull-request-available > > Introduced in FLINK-8871, aborted RPC notifications are done asynchronously: > > {code} > private void sendAbortedMessages(long checkpointId, long timeStamp) { > // send notification of aborted checkpoints asynchronously. > executor.execute(() -> { > // send the "abort checkpoint" messages to necessary > vertices. > // .. > }); > } > {code} > However, the executor that eventually executes this request is created as > follows > {code} > final ScheduledExecutorService futureExecutor = > Executors.newScheduledThreadPool( > Hardware.getNumberCPUCores(), > new ExecutorThreadFactory("jobmanager-future")); > {code} > ExecutorThreadFactory uses UncaughtExceptionHandler that exits JVM on error. > cc: [~yunta] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-20672) notifyCheckpointAborted RPC failure can fail JM
[ https://issues.apache.org/jira/browse/FLINK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783868#comment-17783868 ] Yun Tang edited comment on FLINK-20672 at 11/8/23 3:01 AM: --- [~Zakelly] Thanks for picking up the stale tickets. However, I think this is not true after FLINK-23654 is resolved. was (Author: yunta): [~Zakelly] Thanks for picking up the stale tickets. However, I think this is not true after FLINK-20672 is resolved. > notifyCheckpointAborted RPC failure can fail JM > --- > > Key: FLINK-20672 > URL: https://issues.apache.org/jira/browse/FLINK-20672 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.3, 1.12.0 >Reporter: Roman Khachatryan >Assignee: Zakelly Lan >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor, > pull-request-available > > Introduced in FLINK-8871, aborted RPC notifications are done asynchonously: > > {code} > private void sendAbortedMessages(long checkpointId, long timeStamp) { > // send notification of aborted checkpoints asynchronously. > executor.execute(() -> { > // send the "abort checkpoint" messages to necessary > vertices. > // .. > }); > } > {code} > However, the executor that eventually executes this request is created as > follows > {code} > final ScheduledExecutorService futureExecutor = > Executors.newScheduledThreadPool( > Hardware.getNumberCPUCores(), > new ExecutorThreadFactory("jobmanager-future")); > {code} > ExecutorThreadFactory uses UncaughtExceptionHandler that exits JVM on error. > cc: [~yunta] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-20672) notifyCheckpointAborted RPC failure can fail JM
[ https://issues.apache.org/jira/browse/FLINK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783868#comment-17783868 ] Yun Tang commented on FLINK-20672: -- [~Zakelly] Thanks for picking up the stale tickets. However, I think this is not true after FLINK-20672 is resolved. > notifyCheckpointAborted RPC failure can fail JM > --- > > Key: FLINK-20672 > URL: https://issues.apache.org/jira/browse/FLINK-20672 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.11.3, 1.12.0 >Reporter: Roman Khachatryan >Assignee: Zakelly Lan >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor, > pull-request-available > > Introduced in FLINK-8871, aborted RPC notifications are done asynchonously: > > {code} > private void sendAbortedMessages(long checkpointId, long timeStamp) { > // send notification of aborted checkpoints asynchronously. > executor.execute(() -> { > // send the "abort checkpoint" messages to necessary > vertices. > // .. > }); > } > {code} > However, the executor that eventually executes this request is created as > follows > {code} > final ScheduledExecutorService futureExecutor = > Executors.newScheduledThreadPool( > Hardware.getNumberCPUCores(), > new ExecutorThreadFactory("jobmanager-future")); > {code} > ExecutorThreadFactory uses UncaughtExceptionHandler that exits JVM on error. > cc: [~yunta] -- This message was sent by Atlassian Jira (v8.20.10#820010)
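The thread above turns on which uncaught-exception handler the executor's threads carry: one that exits the JVM versus one that merely records the failure. The contrast can be shown with a self-contained sketch; `HandlerDemo` and its names are hypothetical, this is only a simplified model of the behavior being debated, not Flink's `ExecutorThreadFactory`:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public final class HandlerDemo {
    private HandlerDemo() {}

    // Runs a task on a single-thread executor whose threads carry an
    // uncaught-exception handler. Here the handler records the error instead of
    // exiting the JVM, illustrating the non-fatal alternative discussed above.
    // Returns the caught throwable, or null if nothing reached the handler.
    public static Throwable catchUncaught(Runnable task) {
        AtomicReference<Throwable> caught = new AtomicReference<>();
        CountDownLatch handled = new CountDownLatch(1);
        ThreadFactory factory = runnable -> {
            Thread thread = new Thread(runnable, "demo-io-thread");
            thread.setUncaughtExceptionHandler((t, error) -> {
                caught.set(error);
                handled.countDown();
            });
            return thread;
        };
        ExecutorService executor = Executors.newSingleThreadExecutor(factory);
        // execute(), unlike submit(), lets the exception reach the thread's handler
        // rather than being captured inside a Future.
        executor.execute(task);
        executor.shutdown();
        try {
            // Wait for the handler itself rather than executor termination: the
            // handler fires as the worker thread dies, which can happen after the
            // pool already reports terminated.
            handled.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return caught.get();
    }
}
```

Swapping the recording handler for one that calls `System.exit` reproduces the "RPC failure can fail JM" symptom: any task that throws on such a thread takes the whole process down.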
[jira] [Resolved] (FLINK-33474) ShowPlan throws undefined exception In Flink Web Submit Page
[ https://issues.apache.org/jira/browse/FLINK-33474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-33474. -- Fix Version/s: 1.17.2 1.19.0 1.18.1 Resolution: Fixed Merged master: 008e1916e8bbeb18c1d06c74e2797da5a439cd47 release-1.18: 2409184456aa2d07c5bbc580916370802fb3ae8e release-1.17: 89cbd394a6cbfce1ca685362bf9ce4cf476bca7d > ShowPlan throws undefined exception In Flink Web Submit Page > > > Key: FLINK-33474 > URL: https://issues.apache.org/jira/browse/FLINK-33474 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.17.2, 1.19.0, 1.18.1 > > Attachments: image-2023-11-07-13-53-08-216.png > > > The exception is shown in the figure below; meanwhile, the job plan cannot be > displayed properly. > > The root cause is that the dagreComponent is located in the nz-drawer and is > only loaded when the drawer is visible, so we need to wait for the drawer to > finish loading and then render the job plan. > !image-2023-11-07-13-53-08-216.png|width=400,height=190! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33474) ShowPlan throws undefined exception In Flink Web Submit Page
[ https://issues.apache.org/jira/browse/FLINK-33474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33474: Assignee: Yu Chen > ShowPlan throws undefined exception In Flink Web Submit Page > > > Key: FLINK-33474 > URL: https://issues.apache.org/jira/browse/FLINK-33474 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Attachments: image-2023-11-07-13-53-08-216.png > > > The exception as shown in the figure below, meanwhile, the job plan cannot be > displayed properly. > > The root cause is that the dagreComponent is located in the nz-drawer and is > only loaded when the drawer is visible, so we need to wait for the drawer to > finish loading and then render the job plan. > !image-2023-11-07-13-53-08-216.png|width=400,height=190! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33433) Support invoke async-profiler on Jobmanager through REST API
[ https://issues.apache.org/jira/browse/FLINK-33433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33433: Assignee: Yu Chen > Support invoke async-profiler on Jobmanager through REST API > > > Key: FLINK-33433 > URL: https://issues.apache.org/jira/browse/FLINK-33433 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33325) FLIP-375: Built-in cross-platform powerful java profiler
[ https://issues.apache.org/jira/browse/FLINK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33325: Assignee: Yu Chen > FLIP-375: Built-in cross-platform powerful java profiler > > > Key: FLINK-33325 > URL: https://issues.apache.org/jira/browse/FLINK-33325 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST, Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > > This is an umbrella JIRA of > [FLIP-375|https://cwiki.apache.org/confluence/x/64lEE] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33436) Documentation on the built-in Profiler
[ https://issues.apache.org/jira/browse/FLINK-33436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33436: Assignee: Yu Chen > Documentation on the built-in Profiler > -- > > Key: FLINK-33436 > URL: https://issues.apache.org/jira/browse/FLINK-33436 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33434) Support invoke async-profiler on Taskmanager through REST API
[ https://issues.apache.org/jira/browse/FLINK-33434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33434: Assignee: Yu Chen > Support invoke async-profiler on Taskmanager through REST API > - > > Key: FLINK-33434 > URL: https://issues.apache.org/jira/browse/FLINK-33434 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33435) The visualization and download capabilities of profiling history
[ https://issues.apache.org/jira/browse/FLINK-33435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang reassigned FLINK-33435: Assignee: Yu Chen > The visualization and download capabilities of profiling history > - > > Key: FLINK-33435 > URL: https://issues.apache.org/jira/browse/FLINK-33435 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Yu Chen >Assignee: Yu Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33355) can't reduce the parallelism from 'n' to '1' when recovering through a savepoint.
[ https://issues.apache.org/jira/browse/FLINK-33355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779364#comment-17779364 ] Yun Tang commented on FLINK-33355: -- I think this is because you forgot to set a uid for each operator. Since the `windowAll` operator can only run with parallelism 1, all operators would chain together once you change the parallelism to 1. Please assign operator IDs as described in the docs: https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/state/savepoints/#assigning-operator-ids > can't reduce the parallelism from 'n' to '1' when recovering through a > savepoint. > - > > Key: FLINK-33355 > URL: https://issues.apache.org/jira/browse/FLINK-33355 > Project: Flink > Issue Type: Bug > Components: API / Core > Environment: flink 1.17.1 >Reporter: zhang >Priority: Major > > If the program includes operators with window, it is not possible to reduce > the parallelism of the operators from n to 1 when restarting from a > savepoint, and it will result in an error: > {code:java} > //IllegalStateException: Failed to rollback to checkpoint/savepoint > Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint > state and new program. Cannot map operator 0e059b9f403cf6f35592ab773c9408d4 > with max parallelism 128 to new program with max parallelism 1. This > indicates that the program has been changed in a non-compatible way after the > checkpoint/savepoint. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
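The uid advice matters because, without explicit IDs, operator IDs are derived from the job topology, and changing the parallelism can change operator chaining and therefore the generated IDs. A toy model of that coupling (this is a deliberately simplified illustration with hypothetical names, not Flink's actual StreamGraphHasher algorithm):

```java
import java.util.List;
import java.util.Objects;

public final class OperatorIdDemo {
    private OperatorIdDemo() {}

    // Toy stand-in for an auto-generated operator id: derived from the chain the
    // operator ends up in, so it changes whenever chaining changes (e.g. when a
    // parallelism change lets a windowAll operator chain with its neighbors).
    public static int autoId(List<String> chainedOperatorNames) {
        return Objects.hash(chainedOperatorNames);
    }

    // An explicitly assigned uid is independent of chaining, so saved state can
    // still be matched to the operator after the topology regroups.
    public static String explicitId(String uid) {
        return uid;
    }
}
```

In the toy model, the "auto" id of a `map` operator differs depending on whether it is chained alone or together with a downstream `windowAll`, while an explicit uid stays constant across both layouts.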
[jira] [Commented] (FLINK-33355) can't reduce the parallelism from 'n' to '1' when recovering through a savepoint.
[ https://issues.apache.org/jira/browse/FLINK-33355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779358#comment-17779358 ] Yun Tang commented on FLINK-33355: -- [~edmond_j] How did you assign the parallelism, by setting the configuration of `parallelism.default`? > can't reduce the parallelism from 'n' to '1' when recovering through a > savepoint. > - > > Key: FLINK-33355 > URL: https://issues.apache.org/jira/browse/FLINK-33355 > Project: Flink > Issue Type: Bug > Components: API / Core > Environment: flink 1.17.1 >Reporter: zhang >Priority: Major > > If the program includes operators with window, it is not possible to reduce > the parallelism of the operators from n to 1 when restarting from a > savepoint, and it will result in an error: > {code:java} > //IllegalStateException: Failed to rollback to checkpoint/savepoint > Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint > state and new program. Cannot map operator 0e059b9f403cf6f35592ab773c9408d4 > with max parallelism 128 to new program with max parallelism 1. This > indicates that the program has been changed in a non-compatible way after the > checkpoint/savepoint. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33355) can't reduce the parallelism from 'n' to '1' when recovering through a savepoint.
[ https://issues.apache.org/jira/browse/FLINK-33355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779340#comment-17779340 ] Yun Tang commented on FLINK-33355: -- [~edmond_j] could you please share the code to reproduce this problem? > can't reduce the parallelism from 'n' to '1' when recovering through a > savepoint. > - > > Key: FLINK-33355 > URL: https://issues.apache.org/jira/browse/FLINK-33355 > Project: Flink > Issue Type: Bug > Components: API / Core > Environment: flink 1.17.1 >Reporter: zhang >Priority: Major > > If the program includes operators with window, it is not possible to reduce > the parallelism of the operators from n to 1 when restarting from a > savepoint, and it will result in an error: > {code:java} > //IllegalStateException: Failed to rollback to checkpoint/savepoint > Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint > state and new program. Cannot map operator 0e059b9f403cf6f35592ab773c9408d4 > with max parallelism 128 to new program with max parallelism 1. This > indicates that the program has been changed in a non-compatible way after the > checkpoint/savepoint. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33355) can't reduce the parallelism from 'n' to '1' when recovering through a savepoint.
[ https://issues.apache.org/jira/browse/FLINK-33355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779315#comment-17779315 ] Yun Tang commented on FLINK-33355: -- Changing the max parallelism (as opposed to the parallelism) breaks checkpoint compatibility by design. You can refer to https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/execution/parallel/#setting-the-maximum-parallelism for more details. > can't reduce the parallelism from 'n' to '1' when recovering through a > savepoint. > - > > Key: FLINK-33355 > URL: https://issues.apache.org/jira/browse/FLINK-33355 > Project: Flink > Issue Type: Bug > Components: API / Core > Environment: flink 1.17.1 >Reporter: zhang >Priority: Major > > If the program includes operators with window, it is not possible to reduce > the parallelism of the operators from n to 1 when restarting from a > savepoint, and it will result in an error: > {code:java} > //IllegalStateException: Failed to rollback to checkpoint/savepoint > Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint > state and new program. Cannot map operator 0e059b9f403cf6f35592ab773c9408d4 > with max parallelism 128 to new program with max parallelism 1. This > indicates that the program has been changed in a non-compatible way after the > checkpoint/savepoint. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
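The reason max parallelism is baked into a checkpoint, as the error message in the report shows, comes from how keyed state is partitioned: every key is assigned to one of maxParallelism key groups, and key groups are then distributed over the actual subtasks. The arithmetic below mirrors the formulas in Flink's KeyGroupRangeAssignment, with a plain `hashCode` standing in where Flink additionally applies murmur hashing, so it is a sketch rather than the exact implementation:

```java
public final class KeyGroupDemo {
    private KeyGroupDemo() {}

    // Key -> key group: the modulus is the MAX parallelism, so state written with
    // maxParallelism=128 cannot be reinterpreted under maxParallelism=1 — the
    // key-to-group mapping itself would change.
    public static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Key group -> subtask index for the CURRENT parallelism. Because this step
    // only depends on the (fixed) max parallelism and the (changeable) runtime
    // parallelism, rescaling between 1..maxParallelism is safe.
    public static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }
}
```

With maxParallelism fixed at 128, the same 128 key groups can be shared out to 4 subtasks or to a single subtask without touching the key-to-group mapping; only changing the 128 itself invalidates existing state, which is exactly the mismatch the exception complains about.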
[jira] [Closed] (FLINK-33355) can't reduce the parallelism from 'n' to '1' when recovering through a savepoint.
[ https://issues.apache.org/jira/browse/FLINK-33355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang closed FLINK-33355. Resolution: Information Provided > can't reduce the parallelism from 'n' to '1' when recovering through a > savepoint. > - > > Key: FLINK-33355 > URL: https://issues.apache.org/jira/browse/FLINK-33355 > Project: Flink > Issue Type: Bug > Components: API / Core > Environment: flink 1.17.1 >Reporter: zhang >Priority: Major > > If the program includes operators with window, it is not possible to reduce > the parallelism of the operators from n to 1 when restarting from a > savepoint, and it will result in an error: > {code:java} > //IllegalStateException: Failed to rollback to checkpoint/savepoint > Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint > state and new program. Cannot map operator 0e059b9f403cf6f35592ab773c9408d4 > with max parallelism 128 to new program with max parallelism 1. This > indicates that the program has been changed in a non-compatible way after the > checkpoint/savepoint. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)