[jira] [Closed] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-07-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22677. --- Resolution: Done Done via 0d099b79fddc5e254884e44f2167c625744079a4 0b28fadccfb6b0d2a85592ced9e98b03a0c2d3bf

[jira] [Assigned] (FLINK-22672) Some enhancements for pluggable shuffle service framework

2021-07-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22672: --- Assignee: Jin Xing > Some enhancements for pluggable shuffle service framework >

[jira] [Closed] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-07-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22676. --- Resolution: Done Done via 62a342b647fc1eac7f87769be92fda798649d6d4 > The partition tracker should support

[jira] [Updated] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-07-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22676: Affects Version/s: (was: 1.4) 1.14.0 > The partition tracker should support

[jira] [Updated] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-07-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22676: Fix Version/s: 1.14.0 > The partition tracker should support remote shuffle properly >

[jira] [Assigned] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-07-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22676: --- Assignee: Jin Xing > The partition tracker should support remote shuffle properly >

[jira] [Updated] (FLINK-22676) The partition tracker should support remote shuffle properly

2021-07-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22676: Affects Version/s: 1.4 > The partition tracker should support remote shuffle properly >

[jira] [Issue Comment Deleted] (FLINK-22017) Regions may never be scheduled when there are cross-region blocking edges

2021-07-15 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22017: Comment: was deleted (was: I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I

[jira] [Issue Comment Deleted] (FLINK-22017) Regions may never be scheduled when there are cross-region blocking edges

2021-07-15 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22017: Comment: was deleted (was: This critical issue is unassigned and itself and all of its Sub-Tasks have

[jira] [Issue Comment Deleted] (FLINK-22017) Regions may never be scheduled when there are cross-region blocking edges

2021-07-15 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22017: Comment: was deleted (was: I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I

[jira] [Issue Comment Deleted] (FLINK-22017) Regions may never be scheduled when there are cross-region blocking edges

2021-07-15 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22017: Comment: was deleted (was: This issue was labeled "stale-critical" 7 days ago and has not received any

[jira] [Closed] (FLINK-22017) Regions may never be scheduled when there are cross-region blocking edges

2021-07-15 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22017. --- Assignee: Zhilong Hong Resolution: Fixed Fixed via d2005268b1eeb0fe928b69c5e56ca54862fbf508

[jira] [Commented] (FLINK-11634) Translate "State Backends" page into Chinese

2021-07-13 Thread Shen Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380275#comment-17380275 ] Shen Zhu commented on FLINK-11634: -- Hey [~jark], In the latest version of flink, seems

[jira] [Commented] (FLINK-11627) Translate the "JobManager High Availability (HA)" page into Chinese

2021-07-12 Thread Shen Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379579#comment-17379579 ] Shen Zhu commented on FLINK-11627: -- Hey [~jark] , in the latest version, 

[jira] [Commented] (FLINK-11627) Translate the "JobManager High Availability (HA)" page into Chinese

2021-07-10 Thread Shen Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378554#comment-17378554 ] Shen Zhu commented on FLINK-11627: -- Hey [~jark], could you please assign this ticket to me? I can work

[jira] [Commented] (FLINK-23218) Distribute the ShuffleDescriptors via blob server

2021-07-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377963#comment-17377963 ] Zhu Zhu commented on FLINK-23218: - I gave it another thought and 10GB sounds good to me now. If we always

[jira] [Commented] (FLINK-23218) Distribute the ShuffleDescriptors via blob server

2021-07-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377961#comment-17377961 ] Zhu Zhu commented on FLINK-23218: - 10GB looks a bit too large to limit the blob size by default. If we

[jira] [Commented] (FLINK-23218) Distribute the ShuffleDescriptors via blob server

2021-07-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377811#comment-17377811 ] Zhu Zhu commented on FLINK-23218: - 1. To not affect existing users, I prefer the limit to not be too small

[jira] [Closed] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-07-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-15031. --- Fix Version/s: (was: 1.12.0) 1.14.0 Resolution: Fixed Done via

[jira] [Commented] (FLINK-23262) FileReadingWatermarkITCase.testWatermarkEmissionWithChaining fails on azure

2021-07-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375376#comment-17375376 ] Zhu Zhu commented on FLINK-23262: - another instance:

[jira] [Commented] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-07-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375218#comment-17375218 ] Zhu Zhu commented on FLINK-22677: - One thing that needs mentioning is that I did not change

[jira] [Commented] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-07-05 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375217#comment-17375217 ] Zhu Zhu commented on FLINK-22677: - Problems below could happen if enabling partition registration is

[jira] [Assigned] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-07-02 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22677: --- Assignee: Zhu Zhu > Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real

[jira] [Updated] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-07-02 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22677: Affects Version/s: 1.14.0 > Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real

[jira] [Updated] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-07-02 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22677: Fix Version/s: 1.14.0 > Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real >

[jira] [Closed] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22945. --- Resolution: Fixed Fixed via: master: 5badc356abdcbb3d5cae1fe3f00f1ec18f414d98 1.13:

[jira] [Updated] (FLINK-23172) Links of restart strategy in configuration page is broken

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-23172: Priority: Major (was: Minor) > Links of restart strategy in configuration page is broken >

[jira] [Updated] (FLINK-23172) Links of restart strategy in configuration page is broken

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-23172: Issue Type: Bug (was: Technical Debt) > Links of restart strategy in configuration page is broken >

[jira] [Closed] (FLINK-23078) Scheduler Benchmarks not compiling

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-23078. --- Resolution: Fixed Fixed via flink: 439dbfa48122df164780f55da2cb05f64669a247

[jira] [Comment Edited] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371290#comment-17371290 ] Zhu Zhu edited comment on FLINK-15031 at 6/29/21, 10:52 AM: Discussed with

[jira] [Commented] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371290#comment-17371290 ] Zhu Zhu commented on FLINK-15031: - Discussed with Till offline. His concern was that the network

[jira] [Comment Edited] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371207#comment-17371207 ] Zhu Zhu edited comment on FLINK-15031 at 6/29/21, 8:18 AM: --- I think it should

[jira] [Commented] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371207#comment-17371207 ] Zhu Zhu commented on FLINK-15031: - I think it should be an advanced and experimental config. It can be

[jira] [Commented] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370577#comment-17370577 ] Zhu Zhu commented on FLINK-15031: - Thanks for reviving this discussion! This improvement is necessary

[jira] [Updated] (FLINK-15031) Automatically calculate required network memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Summary: Automatically calculate required network memory for fine-grained jobs (was: Automatically

[jira] [Reopened] (FLINK-15031) Automatically calculate required shuffle memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reopened FLINK-15031: - Assignee: Jin Xing (was: Zhu Zhu) > Automatically calculate required shuffle memory for fine-grained

[jira] [Updated] (FLINK-15031) Automatically calculate required shuffle memory for fine-grained jobs

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-15031: Summary: Automatically calculate required shuffle memory for fine-grained jobs (was: Calculate required

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Fix Version/s: 1.13.2 1.14.0 > StackOverflowException can happen when a large scale

[jira] [Assigned] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22945: --- Assignee: Gen Luo (was: Luo Gen) > StackOverflowException can happen when a large scale job is

[jira] [Assigned] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22945: --- Assignee: Luo Gen > StackOverflowException can happen when a large scale job is CANCELING/FAILING

[jira] [Commented] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370462#comment-17370462 ] Zhu Zhu commented on FLINK-22945: - [~pltbkd] I have assigned you the ticket. Feel free to open a fix for

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Priority: Critical (was: Major) > StackOverflowException can happen when a large scale job is

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Labels: (was: auto-deprioritized-critical) > StackOverflowException can happen when a large scale job

[jira] [Commented] (FLINK-23153) Benchmark not compiling

2021-06-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369262#comment-17369262 ] Zhu Zhu commented on FLINK-23153: - Thanks for reporting this issue [~Thesharing]. I have assigned you

[jira] [Assigned] (FLINK-23153) Benchmark not compiling

2021-06-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-23153: --- Assignee: Zhilong Hong > Benchmark not compiling > --- > >

[jira] [Commented] (FLINK-23005) Optimize the deployment of tasks

2021-06-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366577#comment-17366577 ] Zhu Zhu commented on FLINK-23005: - Thanks [~Thesharing] for looking into the problem and proposing an

[jira] [Assigned] (FLINK-23005) Optimize the deployment of tasks

2021-06-21 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-23005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-23005: --- Assignee: Zhilong Hong > Optimize the deployment of tasks > > >

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Description: The pending requests in ExecutionSlotAllocator are not cleared when a job transitions to

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Component/s: Runtime / Coordination > StackOverflowException can happen when a large scale job is

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELING/FAILING

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Summary: StackOverflowException can happen when a large scale job is CANCELING/FAILING (was:

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Description: The pending requests in ExecutionSlotAllocator are not cleared when a job transitions to

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Priority: Critical (was: Major) > StackOverflowException can happen when a large scale job is

[jira] [Updated] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22945: Issue Type: Bug (was: Improvement) > StackOverflowException can happen when a large scale job is

[jira] [Created] (FLINK-22945) StackOverflowException can happen when a large scale job is CANCELED/FAILED

2021-06-09 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-22945: --- Summary: StackOverflowException can happen when a large scale job is CANCELED/FAILED Key: FLINK-22945 URL: https://issues.apache.org/jira/browse/FLINK-22945 Project: Flink

[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-06-08 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-19142: Labels: pull-request-available (was: pull-request-available stale-assigned) > Investigate slot hijacking

[jira] [Closed] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-04 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22863. --- Resolution: Fixed Fixed via master: 739a12add50c90e020e4b9aaafc1cc45465fa937 release-1.13:

[jira] [Comment Edited] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357036#comment-17357036 ] Zhu Zhu edited comment on FLINK-22115 at 6/4/21, 3:09 AM: -- Close this ticket

[jira] [Closed] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22115. --- Resolution: Won't Fix Close this ticket since it may be already fixed and has been inactive for too long.

[jira] [Commented] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356274#comment-17356274 ] Zhu Zhu commented on FLINK-22863: - Thanks for reporting this issue. [~Thesharing] It is indeed a

[jira] [Updated] (FLINK-22863) ArrayIndexOutOfBoundsException may happen when building rescale edges

2021-06-03 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22863: Priority: Blocker (was: Critical) > ArrayIndexOutOfBoundsException may happen when building rescale

[jira] [Comment Edited] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354757#comment-17354757 ] Zhu Zhu edited comment on FLINK-16069 at 6/1/21, 2:54 AM: -- Even if the main

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354757#comment-17354757 ] Zhu Zhu commented on FLINK-16069: - Even if the main thread can have the highest priority, GC problem can

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-31 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354420#comment-17354420 ] Zhu Zhu commented on FLINK-16069: - Yes a dedicated {{serializationExecutor}} is an alternative. One

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-27 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352943#comment-17352943 ] Zhu Zhu commented on FLINK-16069: - Thanks for the suggestion and sorry for the late reply! [~trohrmann]

[jira] [Updated] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-05-27 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-16069: Labels: (was: stale-major) > Creation of TaskDeploymentDescriptor can block main thread for long time >

[jira] [Commented] (FLINK-22677) Scheduler should invoke ShuffleMaster#registerPartitionWithProducer by a real asynchronous fashion

2021-05-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347310#comment-17347310 ] Zhu Zhu commented on FLINK-22677: - I will take a look to see how we can improve the partition

[jira] [Assigned] (FLINK-22305) Improve log messages of sort-merge blocking shuffle

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22305: --- Assignee: Yingjie Cao > Improve log messages of sort-merge blocking shuffle >

[jira] [Closed] (FLINK-14327) Getting "Could not forward element to next operator" error

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-14327. --- Fix Version/s: (was: 1.9.4) Resolution: Invalid Close it because the ticket has been inactive

[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-19142: Labels: pull-request-available (was: auto-unassigned pull-request-available) > Investigate slot

[jira] [Assigned] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-06 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-19142: --- Assignee: Zhu Zhu > Investigate slot hijacking from preceding pipelined regions after failover >

[jira] [Comment Edited] (FLINK-17726) Scheduler should take care of tasks directly canceled by TaskManager

2021-04-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332834#comment-17332834 ] Zhu Zhu edited comment on FLINK-17726 at 4/27/21, 1:50 AM: --- I think it is a

[jira] [Commented] (FLINK-17726) Scheduler should take care of tasks directly canceled by TaskManager

2021-04-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332834#comment-17332834 ] Zhu Zhu commented on FLINK-17726: - I think it is a potential issue and is not a real production problem

[jira] [Commented] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-04-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331685#comment-17331685 ] Zhu Zhu commented on FLINK-22115: - Hi [~wym_maozi], I could not reproduce this problem (or related

[jira] [Commented] (FLINK-22115) JobManager dies with IllegalStateException SharedSlot (physical request SlotRequestId{%}) has been released

2021-04-22 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17327150#comment-17327150 ] Zhu Zhu commented on FLINK-22115: - Thanks for reporting this issue. [~wym_maozi] I will take a look. The

[jira] [Assigned] (FLINK-14510) Remove the lazy vertex attaching mechanism from ExecutionGraph

2021-04-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-14510: --- Assignee: (was: Zhu Zhu) > Remove the lazy vertex attaching mechanism from ExecutionGraph >

[jira] [Assigned] (FLINK-12138) Limit input split count of each source task for better failover experience

2021-04-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-12138: --- Assignee: (was: Zhu Zhu) > Limit input split count of each source task for better failover

[jira] [Updated] (FLINK-14510) Remove the lazy vertex attaching mechanism from ExecutionGraph

2021-04-18 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-14510: Fix Version/s: (was: 1.13.0) > Remove the lazy vertex attaching mechanism from ExecutionGraph >

[jira] [Closed] (FLINK-22037) Remove the redundant blocking queue from DeployingTasksBenchmarkBase

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22037. --- Resolution: Fixed Fixed via b89736b1dd6f350d16529def539f1a9ebac909f1 > Remove the redundant blocking queue

[jira] [Updated] (FLINK-22037) Remove the redundant blocking queue from DeployingTasksBenchmarkBase

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22037: Component/s: (was: Runtime / Coordination) > Remove the redundant blocking queue from

[jira] [Commented] (FLINK-16069) Creation of TaskDeploymentDescriptor can block main thread for long time

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311960#comment-17311960 ] Zhu Zhu commented on FLINK-16069: - From what I can see, heartbeat timeout happens because the scheduled

[jira] [Updated] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-22007: Fix Version/s: (was: 1.13.0) > PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing >

[jira] [Closed] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-22007. --- Resolution: Duplicate > PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing >

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311456#comment-17311456 ] Zhu Zhu commented on FLINK-22007: - PartitionReleaseInBatchJobBenchmark is working

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311426#comment-17311426 ] Zhu Zhu commented on FLINK-22007: - I'd like to wait a bit for the next run of the scheduler

[jira] [Assigned] (FLINK-22037) Remove the redundant blocking queue from DeployingTasksBenchmarkBase

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-22037: --- Assignee: Zhilong Hong > Remove the redundant blocking queue from DeployingTasksBenchmarkBase >

[jira] [Closed] (FLINK-20757) Optimize data broadcast for sort-merge shuffle

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-20757. --- Resolution: Fixed Done via ae0a615f4490c548fcb53b15d2f6f0595371d303 > Optimize data broadcast for

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311313#comment-17311313 ] Zhu Zhu commented on FLINK-22007: - [~pnowojski] FLINK-21332 is merged and hopefully

[jira] [Closed] (FLINK-21332) Optimize releasing result partitions in RegionPartitionReleaseStrategy

2021-03-30 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21332. --- Resolution: Fixed Done via 9951be845e14026b17518373c73e28796e63407d > Optimize releasing result partitions

[jira] [Closed] (FLINK-21330) Optimize the performance of PipelinedRegionSchedulingStrategy

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21330. --- Resolution: Fixed Done via 5f0c76f2e87326cf844d9914e8b8f6cd7f311c8f > Optimize the performance of

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310496#comment-17310496 ] Zhu Zhu commented on FLINK-22007: - Hopefully we can get the fix merged tomorrow. I will disable the

[jira] [Closed] (FLINK-19938) Implement shuffle data read scheduling for sort-merge blocking shuffle

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-19938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-19938. --- Resolution: Fixed Done via f1e69bbde05cb834e6726e88b3c354299922ed46 > Implement shuffle data read

[jira] [Commented] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-29 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310470#comment-17310470 ] Zhu Zhu commented on FLINK-22007: - Hi [~pnowojski], thanks for reporting this problem! We also noticed

[jira] [Closed] (FLINK-21731) Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-28 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21731. --- Resolution: Fixed The benchmarks are enabled in flink-benchmark via

[jira] [Assigned] (FLINK-21850) Improve document and config description of sort-merge blocking shuffle

2021-03-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21850: --- Assignee: Yingjie Cao > Improve document and config description of sort-merge blocking shuffle >

[jira] [Closed] (FLINK-20740) Use managed memory to avoid direct memory OOM error for sort-merge shuffle (introduce a separated buffer pool)

2021-03-26 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-20740. --- Resolution: Fixed done via a1f079d1968ae286bd8d91b48801a732b88b0bc7 > Use managed memory to avoid direct

[jira] [Closed] (FLINK-21331) Optimize calculating tasks to restart in RestartPipelinedRegionFailoverStrategy

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21331. --- Resolution: Fixed done via 9c95cc19bed1a8c9dddcfa3969614474ee4934c2 > Optimize calculating tasks to

[jira] [Commented] (FLINK-21117) KafkaProducerExactlyOnceITCase fails with "Exceeded checkpoint tolerable failure threshold."

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17309124#comment-17309124 ] Zhu Zhu commented on FLINK-21117: - another instance:

[jira] [Closed] (FLINK-21975) Remove hamcrest dependency from SchedulerBenchmarkBase

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-21975. --- Resolution: Fixed Fixed via 94ce6f9f638f7e346344e6e078ecbdd8933b44d6 > Remove hamcrest dependency from

[jira] [Commented] (FLINK-20329) Elasticsearch7DynamicSinkITCase hangs

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17309121#comment-17309121 ] Zhu Zhu commented on FLINK-20329: - Another instance:

[jira] [Assigned] (FLINK-21975) Remove hamcrest dependency from SchedulerBenchmarkBase

2021-03-25 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-21975: --- Assignee: Zhilong Hong > Remove hamcrest dependency from SchedulerBenchmarkBase >
