[jira] [Comment Edited] (FLINK-25646) Document buffer debloating issues with high parallelism

2022-01-18 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477658#comment-17477658 ] Piotr Nowojski edited comment on FLINK-25646 at 1/18/22, 8:39 AM: --

[jira] [Created] (FLINK-25688) Resolve performance degradation with high parallelism when using buffer debloating

2022-01-18 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25688: -- Summary: Resolve performance degradation with high parallelism when using buffer debloating Key: FLINK-25688 URL: https://issues.apache.org/jira/browse/FLINK-25688

[jira] [Comment Edited] (FLINK-25646) Document buffer debloating issues with high parallelism

2022-01-18 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477658#comment-17477658 ] Piotr Nowojski edited comment on FLINK-25646 at 1/18/22, 8:34 AM: --

[jira] [Closed] (FLINK-25646) Document buffer debloating issues with high parallelism

2022-01-18 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25646. -- Fix Version/s: 1.15.0 Resolution: Fixed merged commit 25f3706 into apache:master >

[jira] [Commented] (FLINK-22643) Too many TCP connections among TaskManagers for large scale jobs

2022-01-14 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476009#comment-17476009 ] Piotr Nowojski commented on FLINK-22643: Thanks for the analysis! Ok, sounds good to me, as we

[jira] [Closed] (FLINK-25407) Network stack deadlock when cancellation happens during initialisation

2022-01-13 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25407. -- Resolution: Fixed Thanks [~kevin.cyj] for the fix! Merged to master as 1ea2a7a5c90^ and

[jira] [Closed] (FLINK-24918) Support to specify the data dir for state benchmark

2022-01-13 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-24918. -- Fix Version/s: 1.15.0 Resolution: Fixed > Support to specify the data dir for state

[jira] [Closed] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2022-01-13 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25441. -- Assignee: Lijie Wang Resolution: Fixed merged commit f957e3f into apache:master now >

[jira] [Commented] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2022-01-11 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472719#comment-17472719 ] Piotr Nowojski commented on FLINK-25441: I think your suggestion [~kevin.cyj] is the right thing

[jira] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2022-01-11 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441 ] Piotr Nowojski deleted comment on FLINK-25441: was (Author: pnowojski): [~kevin.cyj] what do you think about this question of mine? {quote} is the correct thing to do. A better question

[jira] [Commented] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2022-01-11 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472715#comment-17472715 ] Piotr Nowojski commented on FLINK-25441: [~kevin.cyj] what do you think about this question of

[jira] [Closed] (FLINK-24694) Translate "Checkpointing under backpressure" page into Chinese

2022-01-11 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-24694. -- Resolution: Fixed Thanks for the contribution [~zlzhang0122]! merged commit d1cc458 into

[jira] [Commented] (FLINK-25427) SavepointITCase.testTriggerSavepointAndResumeWithNoClaim fails on AZP

2022-01-11 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472597#comment-17472597 ] Piotr Nowojski commented on FLINK-25427: {quote} What are the TMs doing that should provide the

[jira] [Closed] (FLINK-25414) Provide metrics to measure how long task has been blocked

2022-01-02 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25414. -- Fix Version/s: 1.15.0 Resolution: Fixed Merged to master as bc22d2b90cb..1166d11f61a

[jira] [Comment Edited] (FLINK-25296) [state.checkpoints.num-retained ]The default value does not take effect

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467258#comment-17467258 ] Piotr Nowojski edited comment on FLINK-25296 at 12/31/21, 2:15 PM: --- Hi

[jira] [Commented] (FLINK-25296) [state.checkpoints.num-retained ]The default value does not take effect

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467258#comment-17467258 ] Piotr Nowojski commented on FLINK-25296: Hi [~libra_816]. What do you mean by

[jira] [Closed] (FLINK-21186) RecordWriterOutput swallows interrupt state when interrupted.

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-21186. -- Fix Version/s: 1.15.0 Resolution: Fixed Merged a small hotfix change as commit ef839ff

[jira] [Updated] (FLINK-25105) Enables final checkpoint by default

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25105: --- Release Note: In 1.15 we enabled the support of checkpoints after part of tasks finished by

[jira] [Closed] (FLINK-25454) Negative time in throughput calculator

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25454. -- Resolution: Fixed > Negative time in throughput calculator >

[jira] [Updated] (FLINK-25454) Negative time in throughput calculator

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25454: --- Affects Version/s: 1.15.0 (was: 1.14.0) > Negative time in

[jira] [Updated] (FLINK-25454) Negative time in throughput calculator

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25454: --- Fix Version/s: 1.15.0 > Negative time in throughput calculator >

[jira] [Commented] (FLINK-25454) Negative time in throughput calculator

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467187#comment-17467187 ] Piotr Nowojski commented on FLINK-25454: Merged to master as 7ccd525a15c^..7ccd525a15c >

[jira] [Commented] (FLINK-22643) Too many TCP connections among TaskManagers for large scale jobs

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467186#comment-17467186 ] Piotr Nowojski commented on FLINK-22643: Great [~fanrui], I've assigned the ticket to you :) I

[jira] [Assigned] (FLINK-22643) Too many TCP connections among TaskManagers for large scale jobs

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-22643: -- Assignee: fanrui > Too many TCP connections among TaskManagers for large scale jobs

[jira] [Closed] (FLINK-24657) Add metric of the total real size of input/output buffers queue

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-24657. -- Fix Version/s: 1.15.0 Resolution: Fixed Merged to master as 60aa2ac2979^^..60aa2ac2979

[jira] [Comment Edited] (FLINK-22643) Too many TCP connections among TaskManagers for large scale jobs

2021-12-31 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466423#comment-17466423 ] Piotr Nowojski edited comment on FLINK-22643 at 12/31/21, 8:59 AM: ---

[jira] [Closed] (FLINK-25424) Checkpointing is currently not supported for operators that implement InputSelectable

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25424. -- Resolution: Duplicate Hi [~ghandzhipeng]. Thanks for raising this issue. This is a well known

[jira] [Updated] (FLINK-25307) Resuming Savepoint (hashmap, async, no parallelism change) end-to-end test timeout on azure

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25307: --- Component/s: Runtime / Coordination (was: Runtime / Checkpointing) >

[jira] [Commented] (FLINK-25026) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466888#comment-17466888 ] Piotr Nowojski commented on FLINK-25026: Ok. In that case let's disregard those latter two

[jira] [Updated] (FLINK-25026) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25026: --- Component/s: Runtime / Checkpointing (was: Runtime / Coordination) >

[jira] [Assigned] (FLINK-25026) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25026: -- Assignee: (was: Anton Kalashnikov) >

[jira] [Commented] (FLINK-20928) KafkaSourceReaderTest.testOffsetCommitOnCheckpointComplete:189->pollUntil:270 » Timeout

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466872#comment-17466872 ] Piotr Nowojski commented on FLINK-20928:

[jira] [Commented] (FLINK-25026) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466852#comment-17466852 ] Piotr Nowojski commented on FLINK-25026: [~trohrmann] our best guess is/was that this, or at the

[jira] [Assigned] (FLINK-25026) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25026: -- Assignee: Anton Kalashnikov >

[jira] [Commented] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2021-12-30 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466731#comment-17466731 ] Piotr Nowojski commented on FLINK-25441: > because the producer (upstream task) was FINISHED.

[jira] [Closed] (FLINK-5241) Introduce AbstractStreamOperator get CloseableRegistry

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-5241. - Resolution: Won't Fix I don't see a clear motivation to provide this improvement, so I'm closing

[jira] [Comment Edited] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466466#comment-17466466 ] Piotr Nowojski edited comment on FLINK-25441 at 12/29/21, 2:03 PM: ---

[jira] [Commented] (FLINK-25441) ProducerFailedException will cause task status switch from RUNNING to CANCELED, which will cause the job to hang.

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466466#comment-17466466 ] Piotr Nowojski commented on FLINK-25441: [~wanglijie95], do you mean that [exactly this

[jira] [Updated] (FLINK-25026) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25026: --- Component/s: Runtime / Coordination (was: Runtime / Checkpointing) >

[jira] [Updated] (FLINK-25426) UnalignedCheckpointRescaleITCase.shouldRescaleUnalignedCheckpoint fails on AZP because it cannot allocate enough network buffers

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25426: --- Component/s: Runtime / Coordination (was: Runtime / Checkpointing) >

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Component/s: Runtime / State Backends (was: Runtime / Coordination) >

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Component/s: Runtime / Coordination (was: Runtime / State Backends)

[jira] [Commented] (FLINK-22643) Too many TCP connections among TaskManagers for large scale jobs

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466423#comment-17466423 ] Piotr Nowojski commented on FLINK-22643: Currently it looks like nobody is working on this

[jira] [Closed] (FLINK-25417) Too many connections for TM

2021-12-29 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25417. -- Resolution: Duplicate > Too many connections for TM > --- > >

[jira] [Updated] (FLINK-25414) Provide metrics to measure how long task has been blocked

2021-12-22 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25414: --- Description: Currently back pressured/busy metrics tell the user whether task is

[jira] [Assigned] (FLINK-25414) Provide metrics to measure how long task has been blocked

2021-12-22 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25414: -- Assignee: Piotr Nowojski > Provide metrics to measure how long task has been blocked

[jira] [Assigned] (FLINK-25407) Network stack deadlock when cancellation happens during initialisation

2021-12-22 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25407: -- Assignee: Yingjie Cao > Network stack deadlock when cancellation happens during

[jira] [Created] (FLINK-25414) Provide metrics to measure how long task has been blocked

2021-12-22 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25414: -- Summary: Provide metrics to measure how long task has been blocked Key: FLINK-25414 URL: https://issues.apache.org/jira/browse/FLINK-25414 Project: Flink

[jira] [Updated] (FLINK-25395) FileNotFoundException during recovery caused by Incremental shared state being discarded by TM

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25395: --- Priority: Blocker (was: Critical) > FileNotFoundException during recovery caused by

[jira] [Comment Edited] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463325#comment-17463325 ] Piotr Nowojski edited comment on FLINK-25185 at 12/21/21, 4:03 PM: ---

[jira] [Commented] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463325#comment-17463325 ] Piotr Nowojski commented on FLINK-25185: After an offline discussion with [~roman] and some

[jira] [Updated] (FLINK-25407) Network stack deadlock when cancellation happens during initialisation

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25407: --- Description: This issue was extracted from and initially reported in FLINK-25185. It is

[jira] [Created] (FLINK-25407) Network stack deadlock when cancellation happens during initialisation

2021-12-21 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25407: -- Summary: Network stack deadlock when cancellation happens during initialisation Key: FLINK-25407 URL: https://issues.apache.org/jira/browse/FLINK-25407 Project:

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Affects Version/s: 1.13.3 > StreamFaultToleranceTestBase hangs on AZP >

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Affects Version/s: (was: 1.13.3) > StreamFaultToleranceTestBase hangs on AZP >

[jira] [Updated] (FLINK-25395) FileNotFoundException during recovery caused by Incremental shared state being discarded by TM

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25395: --- Summary: FileNotFoundException during recovery caused by Incremental shared state being

[jira] [Updated] (FLINK-25395) Incremental shared state might be discarded by TM

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25395: --- Description: Extracting from [FLINK-25185

[jira] [Updated] (FLINK-25395) Incremental shared state might be discarded by TM

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25395: --- Description: Extracting from [FLINK-25185

[jira] [Commented] (FLINK-25399) AZP fails with exit code 137 when running checkpointing test cases

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463269#comment-17463269 ] Piotr Nowojski commented on FLINK-25399: The {{FileNotFoundException}} indicates this might be a

[jira] [Commented] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463093#comment-17463093 ] Piotr Nowojski commented on FLINK-25185: {quote} I don't think so: the decision whether to

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Priority: Blocker (was: Critical) > StreamFaultToleranceTestBase hangs on AZP >

[jira] [Assigned] (FLINK-21186) RecordWriterOutput swallows interrupt state when interrupted.

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-21186: -- Assignee: Piotr Nowojski > RecordWriterOutput swallows interrupt state when

[jira] [Commented] (FLINK-21186) RecordWriterOutput swallows interrupt state when interrupted.

2021-12-21 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463075#comment-17463075 ] Piotr Nowojski commented on FLINK-21186: I still think there is no issue in this code. It's

[jira] [Assigned] (FLINK-25194) Implement an API for duplicating artefacts

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25194: -- Assignee: Piotr Nowojski (was: Dawid Wysakowicz) > Implement an API for duplicating

[jira] [Commented] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17462658#comment-17462658 ] Piotr Nowojski commented on FLINK-25185: {quote} When a checkpoint is aborted, TM will try to

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Component/s: Runtime / State Backends (was: Runtime / Coordination) >

[jira] [Commented] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17462554#comment-17462554 ] Piotr Nowojski commented on FLINK-25185: It looks like those tests were stuck in an endless loop

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Component/s: Runtime / Coordination (was: Runtime / Checkpointing) >

[jira] [Closed] (FLINK-25382) Failure in "Upload Logs" task

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-25382. -- Resolution: Duplicate > Failure in "Upload Logs" task > - > >

[jira] [Updated] (FLINK-22090) Upload logs fails

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-22090: --- Priority: Critical (was: Not a Priority) > Upload logs fails > - > >

[jira] [Commented] (FLINK-22090) Upload logs fails

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17462503#comment-17462503 ] Piotr Nowojski commented on FLINK-22090: Other instances:

[jira] [Created] (FLINK-25382) Failure in "Upload Logs" task

2021-12-20 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25382: -- Summary: Failure in "Upload Logs" task Key: FLINK-25382 URL: https://issues.apache.org/jira/browse/FLINK-25382 Project: Flink Issue Type: Bug

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Summary: StreamFaultToleranceTestBase hangs on AZP (was:

[jira] [Updated] (FLINK-25185) UdfStreamOperatorCheckpointingITCase (StreamFaultToleranceTestBase) hangs on AZP

2021-12-20 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25185: --- Summary: UdfStreamOperatorCheckpointingITCase (StreamFaultToleranceTestBase) hangs on AZP

[jira] [Assigned] (FLINK-25199) fromValues does not emit final MAX watermark

2021-12-17 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25199: -- Assignee: Marios Trivyzas > fromValues does not emit final MAX watermark >

[jira] [Closed] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-17 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-24846. -- Resolution: Fixed > AsyncWaitOperator fails during stop-with-savepoint >

[jira] [Comment Edited] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-17 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459916#comment-17459916 ] Piotr Nowojski edited comment on FLINK-24846 at 12/17/21, 11:37 AM:

[jira] [Commented] (FLINK-25318) Improvement of scheduler and execution for Flink OLAP

2021-12-16 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460615#comment-17460615 ] Piotr Nowojski commented on FLINK-25318: Hi all. Thanks for taking up this interesting

[jira] [Comment Edited] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-16 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459916#comment-17459916 ] Piotr Nowojski edited comment on FLINK-24846 at 12/16/21, 8:08 AM: ---

[jira] [Comment Edited] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-15 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459916#comment-17459916 ] Piotr Nowojski edited comment on FLINK-24846 at 12/15/21, 1:57 PM: ---

[jira] [Updated] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-15 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-24846: --- Fix Version/s: 1.13.6 > AsyncWaitOperator fails during stop-with-savepoint >

[jira] [Commented] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-15 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459916#comment-17459916 ] Piotr Nowojski commented on FLINK-24846: merged commit 4065bfb into apache:master merged as

[jira] [Updated] (FLINK-24846) AsyncWaitOperator fails during stop-with-savepoint

2021-12-15 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-24846: --- Fix Version/s: 1.15.0 1.14.3 > AsyncWaitOperator fails during

[jira] [Assigned] (FLINK-25199) fromValues does not emit final MAX watermark

2021-12-15 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-25199: -- Assignee: (was: Dawid Wysakowicz) > fromValues does not emit final MAX watermark

[jira] [Assigned] (FLINK-18808) Task-level numRecordsOut metric may be underestimated

2021-12-15 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-18808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-18808: -- Assignee: Lijie Wang > Task-level numRecordsOut metric may be underestimated >

[jira] [Commented] (FLINK-18808) Task-level numRecordsOut metric may be underestimated

2021-12-14 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-18808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459274#comment-17459274 ] Piotr Nowojski commented on FLINK-18808: That would be great :) would you like me to assign the

[jira] [Commented] (FLINK-6755) Allow triggering Checkpoints through command line client

2021-12-13 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17458438#comment-17458438 ] Piotr Nowojski commented on FLINK-6755: --- The motivation behind this feature request will be covered

[jira] [Closed] (FLINK-12619) Support TERMINATE/SUSPEND Job with Checkpoint

2021-12-13 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-12619. -- Resolution: Won't Do FLINK-25276 should address the motivation behind this feature, while the

[jira] [Created] (FLINK-25276) Support native and incremental savepoints

2021-12-13 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25276: -- Summary: Support native and incremental savepoints Key: FLINK-25276 URL: https://issues.apache.org/jira/browse/FLINK-25276 Project: Flink Issue Type:

[jira] [Closed] (FLINK-11748) Optimize savepoint: Remove the task states and KeyedState into increments.

2021-12-13 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-11748. -- Resolution: Abandoned The PR has been closed. It looks like this ticket should have been

[jira] [Created] (FLINK-25275) Weighted KeyGroup assignment

2021-12-13 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25275: -- Summary: Weighted KeyGroup assignment Key: FLINK-25275 URL: https://issues.apache.org/jira/browse/FLINK-25275 Project: Flink Issue Type: New Feature

[jira] [Commented] (FLINK-25256) Savepoints do not work with ExternallyInducedSources

2021-12-10 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457073#comment-17457073 ] Piotr Nowojski commented on FLINK-25256: This is related to supporting/handling "force full

[jira] [Updated] (FLINK-25256) Savepoints do not work with ExternallyInducedSources

2021-12-10 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-25256: --- Description: It is not possible to take a proper savepoint with {{ExternallyInducedSource}}

[jira] [Commented] (FLINK-18808) Task-level numRecordsOut metric may be underestimated

2021-12-10 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-18808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457071#comment-17457071 ] Piotr Nowojski commented on FLINK-18808: [~wanglijie95] as far as I remember, counting calls to

[jira] [Created] (FLINK-25255) Consider/design implementing State Processor API (FC)

2021-12-10 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-25255: -- Summary: Consider/design implementing State Processor API (FC) Key: FLINK-25255 URL: https://issues.apache.org/jira/browse/FLINK-25255 Project: Flink

[jira] [Commented] (FLINK-18808) Task-level numRecordsOut metric may be underestimated

2021-12-09 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-18808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17456235#comment-17456235 ] Piotr Nowojski commented on FLINK-18808: I'm not sure. Maybe it would be actually better to pick

[jira] [Updated] (FLINK-18647) How to handle processing time timers with bounded input

2021-12-09 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-18647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-18647: --- Priority: Not a Priority (was: Minor) > How to handle processing time timers with bounded

[jira] [Commented] (FLINK-25167) Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform

2021-12-09 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-25167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17456224#comment-17456224 ] Piotr Nowojski commented on FLINK-25167: Until we overhaul {{ProcessFunction}} I think exposing

[jira] [Commented] (FLINK-24149) Make checkpoint self-contained and relocatable

2021-12-08 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17455237#comment-17455237 ] Piotr Nowojski commented on FLINK-24149: [~Feifan Wang] I've noticed that you opened a PR for

[jira] [Assigned] (FLINK-24086) Do not re-register SharedStateRegistry to reduce the recovery time of the job

2021-12-08 Thread Piotr Nowojski (Jira)
[ https://issues.apache.org/jira/browse/FLINK-24086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-24086: -- Assignee: Roman Khachatryan (was: ming li) > Do not re-register SharedStateRegistry
