[jira] [Closed] (FLINK-35130) Simplify AvailabilityNotifierImpl to support speculative scheduler and improve performance

2024-05-21 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-35130.
-
Fix Version/s: 1.20.0
   Resolution: Fixed

> Simplify AvailabilityNotifierImpl to support speculative scheduler and 
> improve performance
> --
>
> Key: FLINK-35130
> URL: https://issues.apache.org/jira/browse/FLINK-35130
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> The AvailabilityNotifierImpl in SingleInputGate keeps maps that store the channel 
> ids, but the map key is the result partition id, which changes with the attempt 
> number when speculation is enabled. This can be resolved by using 
> `inputChannels` to look up the channel, since the key of `inputChannels` does 
> not vary across attempts.
> In addition, using that map instead also improves performance for large-scale 
> jobs because no extra maps are created.
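As an illustration of the issue and the fix direction, here is a minimal, hedged sketch; the class, record, and method names are assumptions for the example and are not the actual SingleInputGate/AvailabilityNotifierImpl code.

{code:java}
// Minimal sketch (assumed names, not the real Flink implementation) of why a
// notifier map keyed by ResultPartitionID breaks under speculative execution,
// and how routing lookups through the gate's existing inputChannels map avoids it.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AvailabilityNotifierSketch {

    // Stable key: intermediate partition + subpartition index. Unlike a
    // ResultPartitionID, it does not change when a speculative attempt starts.
    record ChannelKey(String intermediatePartitionId, int subpartitionIndex) {}

    interface Channel {
        void notifyDataAvailable();
    }

    // Reuse of the gate's channel map; no extra attempt-sensitive map is kept.
    private final Map<ChannelKey, Channel> inputChannels = new ConcurrentHashMap<>();

    void register(ChannelKey key, Channel channel) {
        inputChannels.put(key, channel);
    }

    void notifyAvailable(ChannelKey key) {
        Channel channel = inputChannels.get(key);
        if (channel != null) {
            channel.notifyDataAvailable();
        }
    }
}
{code}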





[jira] [Commented] (FLINK-35130) Simplify AvailabilityNotifierImpl to support speculative scheduler and improve performance

2024-05-21 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848153#comment-17848153
 ] 

Yuxin Tan commented on FLINK-35130:
---

master(1.20.0): 46aaea8083047fc86c35491336d795ddcd565128

> Simplify AvailabilityNotifierImpl to support speculative scheduler and 
> improve performance
> --
>
> Key: FLINK-35130
> URL: https://issues.apache.org/jira/browse/FLINK-35130
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> The AvailabilityNotifierImpl in SingleInputGate keeps maps that store the channel 
> ids, but the map key is the result partition id, which changes with the attempt 
> number when speculation is enabled. This can be resolved by using 
> `inputChannels` to look up the channel, since the key of `inputChannels` does 
> not vary across attempts.
> In addition, using that map instead also improves performance for large-scale 
> jobs because no extra maps are created.





[jira] [Updated] (FLINK-35214) Update result partition id for remote input channel when unknown input channel is updated

2024-04-23 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-35214:
--
Fix Version/s: 1.20.0

> Update result partition id for remote input channel when unknown input 
> channel is updated
> -
>
> Key: FLINK-35214
> URL: https://issues.apache.org/jira/browse/FLINK-35214
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> In [FLINK-29768|https://issues.apache.org/jira/browse/FLINK-29768], the 
> result partition in the local input channel has been updated to support 
> speculation. It is necessary to similarly update the result partition ID in 
> the remote input channel.





[jira] [Assigned] (FLINK-35214) Update result partition id for remote input channel when unknown input channel is updated

2024-04-22 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-35214:
-

Assignee: Yuxin Tan

> Update result partition id for remote input channel when unknown input 
> channel is updated
> -
>
> Key: FLINK-35214
> URL: https://issues.apache.org/jira/browse/FLINK-35214
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> In [FLINK-29768|https://issues.apache.org/jira/browse/FLINK-29768], the 
> result partition in the local input channel has been updated to support 
> speculation. It is necessary to similarly update the result partition ID in 
> the remote input channel.





[jira] [Created] (FLINK-35214) Update result partition id for remote input channel when unknown input channel is updated

2024-04-22 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-35214:
-

 Summary: Update result partition id for remote input channel when 
unknown input channel is updated
 Key: FLINK-35214
 URL: https://issues.apache.org/jira/browse/FLINK-35214
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.20.0
Reporter: Yuxin Tan


In [FLINK-29768|https://issues.apache.org/jira/browse/FLINK-29768], the result 
partition in the local input channel has been updated to support speculation. 
It is necessary to similarly update the result partition ID in the remote input 
channel.
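As a rough, hypothetical sketch of the intended change (the type and method names below are assumptions, not the actual RemoteInputChannel code): when an unknown channel is turned into a remote channel, the partition id reported for the current producer attempt should replace the one captured when the gate was created.

{code:java}
// Hypothetical sketch only. Models updating an "unknown" channel to a remote one
// while adopting the result partition id of the (possibly speculative) attempt
// that will actually produce the data.
final class RemoteChannelUpdateSketch {

    static final class PartitionInfo {
        final String resultPartitionId;  // changes with every execution attempt
        final String producerAddress;

        PartitionInfo(String resultPartitionId, String producerAddress) {
            this.resultPartitionId = resultPartitionId;
            this.producerAddress = producerAddress;
        }
    }

    private String consumedResultPartitionId;
    private String producerAddress;

    // Called when the scheduler resolves an unknown channel to a remote channel.
    void updateToRemote(PartitionInfo latestAttempt) {
        // The key point of the fix: refresh the partition id instead of keeping
        // the stale id recorded when the input gate was built.
        this.consumedResultPartitionId = latestAttempt.resultPartitionId;
        this.producerAddress = latestAttempt.producerAddress;
    }
}
{code}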





[jira] [Assigned] (FLINK-35166) Improve the performance of Hybrid Shuffle when enabling memory decoupling

2024-04-21 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-35166:
-

Assignee: Jiang Xin

> Improve the performance of Hybrid Shuffle when enabling memory decoupling
> ---
>
> Key: FLINK-35166
> URL: https://issues.apache.org/jira/browse/FLINK-35166
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Reporter: Jiang Xin
>Assignee: Jiang Xin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> Currently, the tiered result partition creates the SortBufferAccumulator with 
> the number of expected buffers set to min(numSubpartitions+1, 512), so the 
> SortBufferAccumulator may obtain very few buffers when the parallelism is 
> small. We can simply make the number of expected buffers 512 by default to 
> achieve better performance when buffers are sufficient.
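A minimal sketch of the sizing change described above; the constant and method names are assumptions rather than the exact SortBufferAccumulator code.

{code:java}
// Illustrative only: moving the expected buffer count of the sort accumulator
// from a parallelism-dependent value to a fixed default of 512.
final class AccumulatorBufferSizing {

    static final int DEFAULT_EXPECTED_BUFFERS = 512;

    // Previous behaviour: tiny at low parallelism, e.g. only 5 buffers
    // when there are 4 subpartitions.
    static int expectedBuffersOld(int numSubpartitions) {
        return Math.min(numSubpartitions + 1, DEFAULT_EXPECTED_BUFFERS);
    }

    // Proposed behaviour: always ask for 512; the accumulator can still run
    // with fewer buffers when the pool cannot provide that many.
    static int expectedBuffersNew(int numSubpartitions) {
        return DEFAULT_EXPECTED_BUFFERS;
    }
}
{code}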





[jira] [Created] (FLINK-35169) Recycle buffers to freeSegments before releasing data buffer for sort accumulator

2024-04-18 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-35169:
-

 Summary: Recycle buffers to freeSegments before releasing data 
buffer for sort accumulator
 Key: FLINK-35169
 URL: https://issues.apache.org/jira/browse/FLINK-35169
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.20.0
Reporter: Yuxin Tan
Assignee: Yuxin Tan


When using the SortBufferAccumulator, we should recycle the buffers to freeSegments 
before releasing the data buffer. The reason is that reading buffers from the 
DataBuffer may require more buffers than are currently available in freeSegments. 
Consequently, to ensure the DataBuffer has adequate buffers, the flushed and 
recycled buffers should also be added back to freeSegments for reuse.
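The ordering can be illustrated with a self-contained sketch in which Flink's DataBuffer and MemorySegment types are modelled with plain queues; only the recycle-before-release ordering is the point, and the names are otherwise assumptions.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the ordering described above: every buffer flushed out
// of the data buffer is returned to freeSegments before the data buffer is
// considered released, so reading the remaining data never runs out of segments.
final class RecycleBeforeReleaseSketch {

    static void flushAndRelease(Deque<byte[]> dataBuffer, Deque<byte[]> freeSegments) {
        while (!dataBuffer.isEmpty()) {
            byte[] segment = dataBuffer.poll(); // take the next flushed buffer
            write(segment);                     // ship it to the storage tier
            freeSegments.add(segment);          // recycle to the free pool first
        }
        // Only now is the data buffer released; all of its segments are reusable.
    }

    private static void write(byte[] segment) {
        // stand-in for writing the buffer to disk or a remote tier
    }

    public static void main(String[] args) {
        Deque<byte[]> dataBuffer = new ArrayDeque<>();
        dataBuffer.add(new byte[32]);
        Deque<byte[]> freeSegments = new ArrayDeque<>();
        flushAndRelease(dataBuffer, freeSegments);
        System.out.println("free segments after release: " + freeSegments.size());
    }
}
{code}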





[jira] [Assigned] (FLINK-35130) Simplify AvailabilityNotifierImpl to support speculative scheduler and improve performance

2024-04-17 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-35130:
-

Assignee: Yuxin Tan

> Simplify AvailabilityNotifierImpl to support speculative scheduler and 
> improve performance
> --
>
> Key: FLINK-35130
> URL: https://issues.apache.org/jira/browse/FLINK-35130
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> The AvailabilityNotifierImpl in SingleInputGate keeps maps that store the channel 
> ids, but the map key is the result partition id, which changes with the attempt 
> number when speculation is enabled. This can be resolved by using 
> `inputChannels` to look up the channel, since the key of `inputChannels` does 
> not vary across attempts.





[jira] [Created] (FLINK-35130) Simplify AvailabilityNotifierImpl to support speculative scheduler and improve performance

2024-04-17 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-35130:
-

 Summary: Simplify AvailabilityNotifierImpl to support speculative 
scheduler and improve performance
 Key: FLINK-35130
 URL: https://issues.apache.org/jira/browse/FLINK-35130
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.20.0
Reporter: Yuxin Tan


The AvailabilityNotifierImpl in SingleInputGate keeps maps that store the channel 
ids, but the map key is the result partition id, which changes with the attempt 
number when speculation is enabled. This can be resolved by using `inputChannels` 
to look up the channel, since the key of `inputChannels` does not vary across 
attempts.





[jira] [Updated] (FLINK-35130) Simplify AvailabilityNotifierImpl to support speculative scheduler and improve performance

2024-04-17 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-35130:
--
Description: 
The AvailabilityNotifierImpl in SingleInputGate has maps storing the channel 
ids. But the map key is the result partition id, which will change according to 
the different attempt numbers when speculation is enabled.  This can be 
resolved by using `inputChannels` to get channel and the map key of 
inputChannels will not vary with the attempts. 
In addition, using that map instead can also improve performance for large 
scale jobs because no extra maps are created.

  was:The AvailabilityNotifierImpl in SingleInputGate has maps storing the 
channel ids. But the map key is the result partition id, which will change 
according to the different attempt numbers when speculation is enabled.  This 
can be resolved by using `inputChannels` to get channel and the map key of 
inputChannels will not vary with the attempts. In addition, using that map 
instead can also improve performance for large scale jobs because 


> Simplify AvailabilityNotifierImpl to support speculative scheduler and 
> improve performance
> --
>
> Key: FLINK-35130
> URL: https://issues.apache.org/jira/browse/FLINK-35130
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> The AvailabilityNotifierImpl in SingleInputGate keeps maps that store the channel 
> ids, but the map key is the result partition id, which changes with the attempt 
> number when speculation is enabled. This can be resolved by using 
> `inputChannels` to look up the channel, since the key of `inputChannels` does 
> not vary across attempts.
> In addition, using that map instead also improves performance for large-scale 
> jobs because no extra maps are created.





[jira] [Updated] (FLINK-35130) Simplify AvailabilityNotifierImpl to support speculative scheduler and improve performance

2024-04-17 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-35130:
--
Description: The AvailabilityNotifierImpl in SingleInputGate has maps 
storing the channel ids. But the map key is the result partition id, which will 
change according to the different attempt numbers when speculation is enabled.  
This can be resolved by using `inputChannels` to get channel and the map key of 
inputChannels will not vary with the attempts. In addition, using that map 
instead can also improve performance for large scale jobs because   (was: The 
AvailabilityNotifierImpl in SingleInputGate has maps storing the channel ids. 
But the map key is the result partition id, which will change according to the 
different attempt numbers when speculation is enabled.  This can be resolved by 
using `inputChannels` to get channel and the map key of inputChannels will not 
vary with the attempts.)

> Simplify AvailabilityNotifierImpl to support speculative scheduler and 
> improve performance
> --
>
> Key: FLINK-35130
> URL: https://issues.apache.org/jira/browse/FLINK-35130
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> The AvailabilityNotifierImpl in SingleInputGate has maps storing the channel 
> ids. But the map key is the result partition id, which will change according 
> to the different attempt numbers when speculation is enabled.  This can be 
> resolved by using `inputChannels` to get channel and the map key of 
> inputChannels will not vary with the attempts. In addition, using that map 
> instead can also improve performance for large scale jobs because 





[jira] [Closed] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-34544.
-
Resolution: Won't Fix

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing the tiered result partition, a checkState failure may be thrown 
> because the downstream consumes the partition too quickly. The bug should be fixed.





[jira] [Reopened] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reopened FLINK-34544:
---

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing the tiered result partition, a checkState failure may be thrown 
> because the downstream consumes the partition too quickly. The bug should be fixed.





[jira] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


[ https://issues.apache.org/jira/browse/FLINK-34544 ]


Yuxin Tan deleted comment on FLINK-34544:
---

was (Author: tanyuxin):
It is already fixed in https://issues.apache.org/jira/browse/FLINK-33743.

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing the tiered result partition, a checkState failure may be thrown 
> because the downstream consumes the partition too quickly. The bug should be fixed.





[jira] [Closed] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-34544.
-
Resolution: Fixed

It is already fixed in https://issues.apache.org/jira/browse/FLINK-33743.

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing the tiered result partition, a checkState failure may be thrown 
> because the downstream consumes the partition too quickly. The bug should be fixed.





[jira] [Commented] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822842#comment-17822842
 ] 

Yuxin Tan commented on FLINK-34544:
---

I noticed that the issue has been fixed in 
https://issues.apache.org/jira/browse/FLINK-33743 by changing the order of the 
checkState check and the broadcast of the end-of-partition event. Closing this issue.

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing the tiered result partition, a checkState failure may be thrown 
> because the downstream consumes the partition too quickly. The bug should be fixed.





[jira] [Updated] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Description: When finishing the tiered result partition, it may throw a 
check state issue because the downstream consumes it too quickly. The bug 
should be fixed.  (was: When finishing )

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing the tiered result partition, a checkState failure may be thrown 
> because the downstream consumes the partition too quickly. The bug should be fixed.





[jira] [Updated] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Description: When finishing   (was: When releasing `TieredResultPartition`, 
it is not protected by a lock(just like SortMergeResultPartition), then it may 
throw illegal state exceptions occasionally. We should fix it.)

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When finishing 





[jira] [Updated] (FLINK-34544) Fix tiered result partition check state issue when finishing partition

2024-03-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Summary: Fix tiered result partition check state issue when finishing 
partition  (was: Tiered result partition should be released with lock)

> Fix tiered result partition check state issue when finishing partition
> --
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When releasing `TieredResultPartition`, it is not protected by a lock the way 
> SortMergeResultPartition is, so it may occasionally throw illegal state 
> exceptions. We should fix it.





[jira] [Updated] (FLINK-34544) Tiered result partition should be released with lock

2024-02-29 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Description: When releasing `TieredResultPartition`, it is not protected by 
a lock(just like SortMergeResultPartition), then it may throw illegal state 
exceptions occasionally. We should fix it.  (was: When releasing 
`TieredResultPartition`, it is not protected by a lock(just like 
SortMergeResultPartition), then it may throw iIlegal state exception 
occasionally. We should fix it.)

> Tiered result partition should be released with lock
> 
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When releasing `TieredResultPartition`, it is not protected by a lock the way 
> SortMergeResultPartition is, so it may occasionally throw illegal state 
> exceptions. We should fix it.





[jira] [Updated] (FLINK-34544) Tiered result partition should be released with lock

2024-02-29 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Description: When releasing `TieredResultPartition`, it is not protected by 
a lock(just like SortMergeResultPartition), then it may throw iIlegal state 
exception occasionally. We should fix it.  (was: When releasing 
`TieredResultPartition`, it is not protected by a lock, then it may throw 
iIlegal state exception occasionally. We should fix it.)

> Tiered result partition should be released with lock
> 
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When releasing `TieredResultPartition`, it is not protected by a lock the way 
> SortMergeResultPartition is, so it may occasionally throw illegal state 
> exceptions. We should fix it.





[jira] [Updated] (FLINK-34544) Tiered result partition should be released with lock

2024-02-28 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Description: When releasing `TieredResultPartition`, it is not protected by 
a lock, then it may throw iIlegal state exception occasionally. We should fix 
it.  (was: When releasing `TieredResultPartition`, it is not protected by a 
lock, then it may thrown iIlegal state exception occasionally. We should fix 
it.)

> Tiered result partition should be released with lock
> 
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> When releasing `TieredResultPartition`, it is not protected by a lock, so it may 
> occasionally throw an illegal state exception. We should fix it.





[jira] [Updated] (FLINK-34544) Tiered result partition should be released with lock

2024-02-28 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Description: When releasing `TieredResultPartition`, it is not protected by 
a lock, then it may thrown iIlegal state exception occasionally. We should fix 
it.  (was: The `TieredStorageMemoryManagerImpl` has a `numRequestedBuffers` 
check when releasing resources. However, this check is performed in the task 
thread, while the buffer recycle may occur in the Netty thread. As a result, it 
may incorrectly throw an exception when the release is too quick for the 
vertex, which has almost no data.
We should fix it.)

> Tiered result partition should be released with lock
> 
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> When releasing `TieredResultPartition`, it is not protected by a lock, so it may 
> occasionally throw an illegal state exception. We should fix it.





[jira] [Updated] (FLINK-34544) Tiered result partition should be released with lock

2024-02-28 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Summary: Tiered result partition should be released with lock  (was: The 
release check bug in tiered memory manager of hybrid shuffle)

> Tiered result partition should be released with lock
> 
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> The `TieredStorageMemoryManagerImpl` has a `numRequestedBuffers` check when 
> releasing resources. However, this check is performed in the task thread, 
> while the buffer recycle may occur in the Netty thread. As a result, it may 
> incorrectly throw an exception when the release happens too quickly for a 
> vertex that has almost no data.
> We should fix it.





[jira] [Updated] (FLINK-34544) The release check bug in tiered memory manager of hybrid shuffle

2024-02-28 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-34544:
--
Affects Version/s: 1.20.0
   (was: 1.19.0)

> The release check bug in tiered memory manager of hybrid shuffle
> 
>
> Key: FLINK-34544
> URL: https://issues.apache.org/jira/browse/FLINK-34544
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.20.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> The `TieredStorageMemoryManagerImpl` has a `numRequestedBuffers` check when 
> releasing resources. However, this check is performed in the task thread, 
> while the buffer recycle may occur in the Netty thread. As a result, it may 
> incorrectly throw an exception when the release happens too quickly for a 
> vertex that has almost no data.
> We should fix it.





[jira] [Created] (FLINK-34544) The release check bug in tiered memory manager of hybrid shuffle

2024-02-28 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-34544:
-

 Summary: The release check bug in tiered memory manager of hybrid 
shuffle
 Key: FLINK-34544
 URL: https://issues.apache.org/jira/browse/FLINK-34544
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.19.0
Reporter: Yuxin Tan
Assignee: Yuxin Tan


The `TieredStorageMemoryManagerImpl` has a `numRequestedBuffers` check when 
releasing resources. However, this check is performed in the task thread, while 
the buffer recycle may occur in the Netty thread. As a result, it may 
incorrectly throw an exception when the release happens too quickly for a 
vertex that has almost no data.
We should fix it.
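The race being described can be sketched as follows; the field and method names are assumptions, not the actual TieredStorageMemoryManagerImpl code.

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the race: the task thread runs the release-time check
// while a buffer recycle on a Netty thread may not have decremented the counter
// yet, so the check can fail spuriously for vertices that finish almost
// immediately and carry almost no data.
final class ReleaseCheckRaceSketch {

    private final AtomicInteger numRequestedBuffers = new AtomicInteger();

    void requestBuffer() {            // task thread
        numRequestedBuffers.incrementAndGet();
    }

    void recycleBuffer() {            // may run on a Netty thread
        numRequestedBuffers.decrementAndGet();
    }

    void release() {                  // task thread, during partition release
        // Throws if a recycle is still in flight on another thread.
        if (numRequestedBuffers.get() != 0) {
            throw new IllegalStateException(
                    "Leaking buffers: " + numRequestedBuffers.get() + " still requested");
        }
    }
}
{code}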





[jira] [Commented] (FLINK-34424) BoundedBlockingSubpartitionWriteReadTest#testRead10ConsumersConcurrent times out

2024-02-19 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818615#comment-17818615
 ] 

Yuxin Tan commented on FLINK-34424:
---

[~mapohl] [~pnowojski] Sorry for the late reply. I tried to reproduce the 
issue, but it cannot be reproduced in my local environment. I think it may be 
an occasional case with a low probability. [~yunfengzhou] and I will continue 
investigating the cause.

> BoundedBlockingSubpartitionWriteReadTest#testRead10ConsumersConcurrent times 
> out
> 
>
> Key: FLINK-34424
> URL: https://issues.apache.org/jira/browse/FLINK-34424
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Yunfeng Zhou
>Priority: Critical
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57446=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9151
> {code}
> Feb 11 13:55:29 "ForkJoinPool-50-worker-25" #414 daemon prio=5 os_prio=0 
> tid=0x7f19503af800 nid=0x284c in Object.wait() [0x7f191b6db000]
> Feb 11 13:55:29java.lang.Thread.State: WAITING (on object monitor)
> Feb 11 13:55:29   at java.lang.Object.wait(Native Method)
> Feb 11 13:55:29   at java.lang.Thread.join(Thread.java:1252)
> Feb 11 13:55:29   - locked <0xe2e019a8> (a 
> org.apache.flink.runtime.io.network.partition.BoundedBlockingSubpartitionWriteReadTest$LongReader)
> Feb 11 13:55:29   at 
> org.apache.flink.core.testutils.CheckedThread.trySync(CheckedThread.java:104)
> Feb 11 13:55:29   at 
> org.apache.flink.core.testutils.CheckedThread.sync(CheckedThread.java:92)
> Feb 11 13:55:29   at 
> org.apache.flink.core.testutils.CheckedThread.sync(CheckedThread.java:81)
> Feb 11 13:55:29   at 
> org.apache.flink.runtime.io.network.partition.BoundedBlockingSubpartitionWriteReadTest.testRead10ConsumersConcurrent(BoundedBlockingSubpartitionWriteReadTest.java:177)
> Feb 11 13:55:29   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...]
> {code}





[jira] [Assigned] (FLINK-33743) Support consuming multiple subpartitions on a single channel

2023-12-04 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33743:
-

Assignee: (was: Yuxin Tan)

> Support consuming multiple subpartitions on a single channel
> 
>
> Key: FLINK-33743
> URL: https://issues.apache.org/jira/browse/FLINK-33743
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yuxin Tan
>Priority: Major
>
> In Flink jobs that use the AdaptiveBatchScheduler and enable adaptive 
> parallelism, a downstream operator might consume multiple subpartitions from 
> an upstream operator. In Flink's current implementation, downstream operators 
> create an InputChannel for each upstream subpartition, so the many 
> InputChannels created in this situation may consume more memory than needed, 
> affecting the usability of Hybrid Shuffle and the AdaptiveBatchScheduler. To 
> solve this problem, we plan to allow one InputChannel to consume multiple 
> subpartitions.
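A back-of-the-envelope illustration of the memory effect; the parallelism, subpartition, and buffers-per-channel figures below are assumptions for the example, not Flink constants.

{code:java}
// Illustrative arithmetic only: channel counts before and after allowing a
// single InputChannel to consume several subpartitions of one upstream task.
final class ChannelCountSketch {

    public static void main(String[] args) {
        int upstreamParallelism = 1000;
        int subpartitionsPerUpstreamTask = 8; // adaptive parallelism can map several
                                              // subpartitions onto one downstream task
        int buffersPerChannel = 2;            // illustrative figure

        int channelsPerSubpartition = upstreamParallelism * subpartitionsPerUpstreamTask;
        int channelsPerUpstreamTask = upstreamParallelism; // one channel covers all 8

        System.out.printf(
                "channels: %d -> %d, exclusive buffers: %d -> %d%n",
                channelsPerSubpartition, channelsPerUpstreamTask,
                channelsPerSubpartition * buffersPerChannel,
                channelsPerUpstreamTask * buffersPerChannel);
    }
}
{code}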





[jira] [Assigned] (FLINK-33743) Support consuming multiple subpartitions on a single channel

2023-12-04 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33743:
-

Assignee: Yunfeng Zhou

> Support consuming multiple subpartitions on a single channel
> 
>
> Key: FLINK-33743
> URL: https://issues.apache.org/jira/browse/FLINK-33743
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yuxin Tan
>Assignee: Yunfeng Zhou
>Priority: Major
>
> In Flink jobs that use the AdaptiveBatchScheduler and enable adaptive 
> parallelism, a downstream operator might consume multiple subpartitions from 
> an upstream operator. In Flink's current implementation, downstream operators 
> create an InputChannel for each upstream subpartition, so the many 
> InputChannels created in this situation may consume more memory than needed, 
> affecting the usability of Hybrid Shuffle and the AdaptiveBatchScheduler. To 
> solve this problem, we plan to allow one InputChannel to consume multiple 
> subpartitions.





[jira] [Assigned] (FLINK-33743) Support consuming multiple subpartitions on a single channel

2023-12-04 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33743:
-

Assignee: Yuxin Tan

> Support consuming multiple subpartitions on a single channel
> 
>
> Key: FLINK-33743
> URL: https://issues.apache.org/jira/browse/FLINK-33743
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> In Flink jobs that use the AdaptiveBatchScheduler and enable adaptive 
> parallelism, a downstream operator might consume multiple subpartitions from 
> an upstream operator. In Flink's current implementation, downstream operators 
> create an InputChannel for each upstream subpartition, so the many 
> InputChannels created in this situation may consume more memory than needed, 
> affecting the usability of Hybrid Shuffle and the AdaptiveBatchScheduler. To 
> solve this problem, we plan to allow one InputChannel to consume multiple 
> subpartitions.





[jira] [Updated] (FLINK-33743) Support consuming multiple subpartitions on a single channel

2023-12-04 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33743:
--
Description: In Flink jobs that use the AdaptiveBatchScheduler and enable 
adaptive parallelism, a downstream operator might consume multiple 
subpartitions from an upstream operator. While downstream operators would 
create an InputChannel for each upstream subpartition in Flink's current 
implementation, The many InputChannels created in this situation may consume 
more memory resources than needed, affecting the usability of Hybrid Shuffle 
and AdaptiveBatchScheduler. In order to solve this problem, we plan to allow 
one InputChannel to consume multiple subpartitions.  (was: At present, a 
downstream channel is limited to consuming data from a single subpartition, a 
constraint that can lead to increased memory consumption. Addressing this issue 
is also a critical step in ensuring that Hybrid Shuffle functions effectively 
with Adaptive Query Execution (AQE). )

> Support consuming multiple subpartitions on a single channel
> 
>
> Key: FLINK-33743
> URL: https://issues.apache.org/jira/browse/FLINK-33743
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yuxin Tan
>Priority: Major
>
> In Flink jobs that use the AdaptiveBatchScheduler and enable adaptive 
> parallelism, a downstream operator might consume multiple subpartitions from 
> an upstream operator. In Flink's current implementation, downstream operators 
> create an InputChannel for each upstream subpartition, so the many 
> InputChannels created in this situation may consume more memory than needed, 
> affecting the usability of Hybrid Shuffle and the AdaptiveBatchScheduler. To 
> solve this problem, we plan to allow one InputChannel to consume multiple 
> subpartitions.





[jira] [Created] (FLINK-33746) More precise dynamic selection of Hybrid Shuffle or AQE

2023-12-04 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33746:
-

 Summary: More precise dynamic selection of Hybrid Shuffle or AQE
 Key: FLINK-33746
 URL: https://issues.apache.org/jira/browse/FLINK-33746
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination, Runtime / Network
Affects Versions: 1.19.0
Reporter: Yuxin Tan


We can even adopt more precise and intelligent strategies to select between 
Hybrid Shuffle and AQE. For instance, the choice could be made based on the 
edge type between tasks, or we could leverage historical job performance data 
and other metrics to inform our decision. Such tailored strategies would enable 
us to utilize each feature where it is most beneficial.





[jira] [Created] (FLINK-33745) Dynamically choose hybrid shuffle or AQE in a job level

2023-12-04 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33745:
-

 Summary: Dynamically choose hybrid shuffle or AQE in a job level
 Key: FLINK-33745
 URL: https://issues.apache.org/jira/browse/FLINK-33745
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Affects Versions: 1.19.0
Reporter: Yuxin Tan


To enhance the initial integration of Hybrid Shuffle with Adaptive Query 
Execution (AQE), we could implement a coarse-grained mode selection strategy. 
For instance, we can opt for either Hybrid Shuffle or AQE at the granularity 
level of an entire job. This approach would allow us to better align the two 
features in the early stages of adoption.





[jira] [Created] (FLINK-33744) Hybrid shuffle avoids restarting the whole job on failover

2023-12-04 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33744:
-

 Summary: Hybrid shuffle avoids restarting the whole job on 
failover
 Key: FLINK-33744
 URL: https://issues.apache.org/jira/browse/FLINK-33744
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Affects Versions: 1.19.0
Reporter: Yuxin Tan


If Hybrid shuffle is enabled, the whole job will be restarted on failover. 
This is a critical issue for large-scale jobs. We should improve the logic to 
avoid restarting the whole job on failover.





[jira] [Updated] (FLINK-33742) Hybrid Shuffle should work well with Adaptive Query Execution

2023-12-04 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33742:
--
Description: 
At present, Hybrid Shuffle and Adaptive Query Execution (AQE), which includes 
features such as Dynamic Partition Pruning (DPP), Runtime Filter, and Adaptive 
Batch Scheduler, are not fully compatible. While they can be used concurrently 
at the same time, the activation of AQE inhibits the key capability of Hybrid 
Shuffle to perform simultaneous reading and writing. This limitation arises 
because AQE dictates that downstream tasks may only initiate once upstream 
tasks have finished, a requirement that is inconsistent with the simultaneous 
read-write process facilitated by Hybrid Shuffle. In addition, Hybrid Shuffle 
will restart the whole job when failover, which is also an essential issue for 
production usage.

To harness the full potential of Hybrid Shuffle and AQE, it is essential to 
refine their integration. By doing so, we can capitalize on each feature's 
distinct advantages and enhance overall system performance.

  was:
At present, Hybrid Shuffle and Adaptive Query Execution (AQE), which includes 
features such as Dynamic Partition Pruning (DPP), Runtime Filter, and Adaptive 
Batch Scheduler, are not fully compatible. While they can be used concurrently 
at the same time, the activation of AQE inhibits the key capability of Hybrid 
Shuffle to perform simultaneous reading and writing. This limitation arises 
because AQE dictates that downstream tasks may only initiate once upstream 
tasks have finished, a requirement that is inconsistent with the simultaneous 
read-write process facilitated by Hybrid Shuffle.

To harness the full potential of Hybrid Shuffle and AQE, it is essential to 
refine their integration. By doing so, we can capitalize on each feature's 
distinct advantages and enhance overall system performance.


> Hybrid Shuffle should work well with Adaptive Query Execution
> -
>
> Key: FLINK-33742
> URL: https://issues.apache.org/jira/browse/FLINK-33742
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Yuxin Tan
>Priority: Major
>  Labels: Umbrella
>
> At present, Hybrid Shuffle and Adaptive Query Execution (AQE), which includes 
> features such as Dynamic Partition Pruning (DPP), Runtime Filter, and the 
> Adaptive Batch Scheduler, are not fully compatible. While they can be used 
> together, the activation of AQE inhibits the key capability of Hybrid Shuffle 
> to perform simultaneous reading and writing. This limitation arises because 
> AQE dictates that downstream tasks may only start once upstream tasks have 
> finished, a requirement that is inconsistent with the simultaneous read-write 
> process facilitated by Hybrid Shuffle. In addition, Hybrid Shuffle will 
> restart the whole job on failover, which is also a significant issue for 
> production usage.
> To harness the full potential of Hybrid Shuffle and AQE, it is essential to 
> refine their integration. By doing so, we can capitalize on each feature's 
> distinct advantages and enhance overall system performance.





[jira] [Created] (FLINK-33743) Support consuming multiple subpartitions on a single channel

2023-12-04 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33743:
-

 Summary: Support consuming multiple subpartitions on a single 
channel
 Key: FLINK-33743
 URL: https://issues.apache.org/jira/browse/FLINK-33743
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Yuxin Tan


At present, a downstream channel is limited to consuming data from a single 
subpartition, a constraint that can lead to increased memory consumption. 
Addressing this issue is also a critical step in ensuring that Hybrid Shuffle 
functions effectively with Adaptive Query Execution (AQE). 





[jira] [Created] (FLINK-33742) Hybrid Shuffle should work well with Adaptive Query Execution

2023-12-04 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33742:
-

 Summary: Hybrid Shuffle should work well with Adaptive Query 
Execution
 Key: FLINK-33742
 URL: https://issues.apache.org/jira/browse/FLINK-33742
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination, Runtime / Network
Affects Versions: 1.19.0
Reporter: Yuxin Tan


At present, Hybrid Shuffle and Adaptive Query Execution (AQE), which includes 
features such as Dynamic Partition Pruning (DPP), Runtime Filter, and the 
Adaptive Batch Scheduler, are not fully compatible. While they can be used 
together, the activation of AQE inhibits the key capability of Hybrid Shuffle to 
perform simultaneous reading and writing. This limitation arises because AQE 
dictates that downstream tasks may only start once upstream tasks have finished, 
a requirement that is inconsistent with the simultaneous read-write process 
facilitated by Hybrid Shuffle.

To harness the full potential of Hybrid Shuffle and AQE, it is essential to 
refine their integration. By doing so, we can capitalize on each feature's 
distinct advantages and enhance overall system performance.





[jira] [Commented] (FLINK-33635) Some connectors can not compile in 1.19-SNAPSHOT

2023-11-23 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789300#comment-17789300
 ] 

Yuxin Tan commented on FLINK-33635:
---

The failure is caused by the API change made in 
[FLINK-25857|https://issues.apache.org/jira/browse/FLINK-25857]. This change 
may affect every connector's implementation. I think we should fix the issue 
with high priority.

> Some connectors can not compile in 1.19-SNAPSHOT
> 
>
> Key: FLINK-33635
> URL: https://issues.apache.org/jira/browse/FLINK-33635
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Major
>
> The sink API compatibility was broken in FLINK-25857. 
> org.apache.flink.api.connector.sink2.Sink#createWriter(InitContext) was 
> changed to 
> org.apache.flink.api.connector.sink2.Sink#createWriter(WriterInitContext).
> All external connector sinks cannot compile because of this change.
> For example:
> es: 
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/6976181890/job/18984287421
> aws: 
> https://github.com/apache/flink-connector-aws/actions/runs/6975253086/job/18982104160
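The incompatibility can be summarized by the signature change, paraphrased below from memory of FLINK-25857; the exact generics and checked exceptions of the real org.apache.flink.api.connector.sink2.Sink interface may differ.

{code:java}
// Paraphrased signatures only, not copied from the Flink sources.
public interface Sink<InputT> {

    // Before FLINK-25857, external connectors implemented:
    //   SinkWriter<InputT> createWriter(InitContext context) throws IOException;

    // After FLINK-25857, the parameter type changed, so connector sinks compiled
    // against the old signature no longer build:
    //   SinkWriter<InputT> createWriter(WriterInitContext context) throws IOException;
}
{code}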





[jira] [Assigned] (FLINK-33612) The table plan of hybrid shuffle may introduce additional blocking edges occasionally

2023-11-21 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33612:
-

Assignee: Yuxin Tan

> The table plan of hybrid shuffle may introduce additional blocking edges 
> occasionally
> -
>
> Key: FLINK-33612
> URL: https://issues.apache.org/jira/browse/FLINK-33612
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> To enhance the performance of hybrid shuffle, it is imperative to address the 
> inconsistency between hybrid shuffle mode and blocking shuffle mode in 
> certain query plans of TPC-DS (such as q88.sql, q14a.sql, q14b.sql, etc). 
> In hybrid shuffle mode, these plans introduce additional blocking shuffle 
> edges and result in increased shuffle times, potentially impacting overall 
> efficiency. 





[jira] [Created] (FLINK-33612) The table plan of hybrid shuffle may introduce additional blocking edges occasionally

2023-11-21 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33612:
-

 Summary: The table plan of hybrid shuffle may introduce additional 
blocking edges occasionally
 Key: FLINK-33612
 URL: https://issues.apache.org/jira/browse/FLINK-33612
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Yuxin Tan


To enhance the performance of hybrid shuffle, it is imperative to address the 
inconsistency between hybrid shuffle mode and blocking shuffle mode in certain 
query plans of TPC-DS (such as q88.sql, q14a.sql, q14b.sql, etc). 
In hybrid shuffle mode, these plans introduce additional blocking shuffle edges 
and result in increased shuffle times, potentially impacting overall 
efficiency. 





[jira] [Updated] (FLINK-33582) Update flink-shaded version

2023-11-20 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33582:
--
Affects Version/s: 1.17.3
   (was: 1.17.2)

> Update flink-shaded version
> ---
>
> Key: FLINK-33582
> URL: https://issues.apache.org/jira/browse/FLINK-33582
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: 1.17.3
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.3
>
>
> This is a follow-up task for 
> https://issues.apache.org/jira/browse/FLINK-33417. 
> After flink-shaded 16.2 is released, we should update the flink-shaded 
> version for Flink 1.17 to resolve the issue thoroughly.
> Note the update is only for 1.17.x, because 1.18.x and 1.19.x have been 
> updated and the issue does not exist.





[jira] [Commented] (FLINK-33593) Left title bar of the official documentation for the master branch is misaligned

2023-11-20 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787936#comment-17787936
 ] 

Yuxin Tan commented on FLINK-33593:
---

[~lsy] Hi, Dalong, the issue has been resolved at 
https://issues.apache.org/jira/browse/FLINK-33356. I have personally checked 
the web page on my local machine and everything appears to be functioning 
normally. Have you attempted to refresh your web browser's cache?

> Left title bar of the official documentation for the master branch is 
> misaligned
> 
>
> Key: FLINK-33593
> URL: https://issues.apache.org/jira/browse/FLINK-33593
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: dalongliu
>Priority: Major
> Attachments: image-2023-11-20-11-34-07-084.png
>
>
> The left title bar of the official documentation for the Master branch is 
> misaligned, but the 1.18 branch is normal, so I'm guessing there's something 
> wrong with this and we should fix it.
> !image-2023-11-20-11-34-07-084.png!





[jira] [Assigned] (FLINK-33582) Update flink-shaded version

2023-11-17 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33582:
-

Assignee: Yuxin Tan

> Update flink-shaded version
> ---
>
> Key: FLINK-33582
> URL: https://issues.apache.org/jira/browse/FLINK-33582
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: 1.17.2
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> This is a follow-up task for 
> https://issues.apache.org/jira/browse/FLINK-33417. 
> After flink-shaded 16.2 is released, we should update the flink-shaded 
> version for Flink 1.17 to resolve the issue thoroughly.
> Note the update is only for 1.17.x, because 1.18.x and 1.19.x have been 
> updated and the issue does not exist.





[jira] [Updated] (FLINK-33582) Update flink-shaded version

2023-11-17 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33582:
--
Description: 
This is a follow-up task for https://issues.apache.org/jira/browse/FLINK-33417. 

After flink-shaded 16.2 is released, we should update the flink-shaded version 
for Flink 1.17 to resolve the issue thoroughly.

Note the update is only for 1.17.x, because 1.18.x and 1.19.x have been updated 
and the issue does not exist.

  was:
This is a follow-up task for https://issues.apache.org/jira/browse/FLINK-33417. 

After flink-shaded 16.2 is released, we should update the flink-shaded version 
for Flink 1.17 to resolve the issue thoroughly.


> Update flink-shaded version
> ---
>
> Key: FLINK-33582
> URL: https://issues.apache.org/jira/browse/FLINK-33582
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: 1.17.2
>Reporter: Yuxin Tan
>Priority: Major
>
> This is a follow-up task for 
> https://issues.apache.org/jira/browse/FLINK-33417. 
> After flink-shaded 16.2 is released, we should update the flink-shaded 
> version for Flink 1.17 to resolve the issue thoroughly.
> Note the update is only for 1.17.x, because 1.18.x and 1.19.x have been 
> updated and the issue does not exist.





[jira] [Created] (FLINK-33582) Update flink-shaded version

2023-11-17 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33582:
-

 Summary: Update flink-shaded version
 Key: FLINK-33582
 URL: https://issues.apache.org/jira/browse/FLINK-33582
 Project: Flink
  Issue Type: Bug
  Components: BuildSystem / Shaded
Affects Versions: 1.17.2
Reporter: Yuxin Tan


This is a follow-up task for https://issues.apache.org/jira/browse/FLINK-33417. 

After flink-shaded 16.2 is released, we should update the flink-shaded version 
for Flink 1.17 to resolve the issue thoroughly.





[jira] [Updated] (FLINK-33537) Docs version nav is missing 1.18 as an option

2023-11-14 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33537:
--
Affects Version/s: 1.19.0

> Docs version nav is missing 1.18 as an option
> -
>
> Key: FLINK-33537
> URL: https://issues.apache.org/jira/browse/FLINK-33537
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: Robin Moffatt
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-11-13-17-51-58-830.png
>
>
> The docs site is missing 1.18 from the version navigation in the bottom-left  
> !image-2023-11-13-17-51-58-830.png! 





[jira] [Commented] (FLINK-33537) Docs version nav is missing 1.18 as an option

2023-11-14 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786187#comment-17786187
 ] 

Yuxin Tan commented on FLINK-33537:
---

[~rmoff] Thanks for reporting this issue. I will take a look.

> Docs version nav is missing 1.18 as an option
> -
>
> Key: FLINK-33537
> URL: https://issues.apache.org/jira/browse/FLINK-33537
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Robin Moffatt
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-11-13-17-51-58-830.png
>
>
> The docs site is missing 1.18 from the version navigation in the bottom-left  
> !image-2023-11-13-17-51-58-830.png! 





[jira] [Commented] (FLINK-33512) Update download link in doc of Kafka connector

2023-11-14 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786144#comment-17786144
 ] 

Yuxin Tan commented on FLINK-33512:
---

Fixed in https://issues.apache.org/jira/browse/FLINK-33401.

> Update download link in doc of Kafka connector
> --
>
> Key: FLINK-33512
> URL: https://issues.apache.org/jira/browse/FLINK-33512
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Documentation
>Affects Versions: kafka-3.0.1
>Reporter: Qingsheng Ren
>Priority: Major
>
> Currently the download links of the Kafka connector in the documentation point 
> to a non-existent version `1.18.0`:
> DataStream API: 
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> Table API Kafka: 
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/kafka/]
> Table API Upsert Kafka: 
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/upsert-kafka/]
> The latest versions should be 3.0.1-1.17 and 3.0.1-1.18.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33401) Kafka connector has broken version

2023-11-10 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784724#comment-17784724
 ] 

Yuxin Tan edited comment on FLINK-33401 at 11/10/23 8:47 AM:
-

After merging the Flink 1.19 hotfix (fa1036c73e3bcd66b57d835c7859572ca4b2250d, 
Remove Kafka documentation for SQL/Table API, since this is now externalized), 
I conducted tests on the Flink 1.19 version, and it shows the correct version.

I noticed that the hotfix has also been backported to the Flink release-1.18. 
Once this fix is merged into the Kafka connector repository, then 1.18 
documentation will display the accurate version.

 !screenshot-1.png! 


was (Author: tanyuxin):
After merging the Flink 1.19 hotfix (fa1036c73e3bcd66b57d835c7859572ca4b2250d, 
Remove Kafka documentation for SQL/Table API, since this is now externalized), 
I conducted tests on the Flink 1.19 version, and it showes the correct version.

I noticed that the hotfix has also been backported to the Flink release-1.18. 
Once this fix is merged into the Kafka connector repository, then 1.18 
documentation will display the accurate version.

 !screenshot-1.png! 

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33401) Kafka connector has broken version

2023-11-10 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784724#comment-17784724
 ] 

Yuxin Tan edited comment on FLINK-33401 at 11/10/23 8:47 AM:
-

After merging the Flink 1.19 hotfix (fa1036c73e3bcd66b57d835c7859572ca4b2250d, 
Remove Kafka documentation for SQL/Table API, since this is now externalized), 
I conducted tests on the Flink 1.19 version, and it showes the correct version.

I noticed that the hotfix has also been backported to the Flink release-1.18. 
Once this fix is merged into the Kafka connector repository, then 1.18 
documentation will display the accurate version.

 !screenshot-1.png! 


was (Author: tanyuxin):
After merging the Flink 1.19 hotfix (fa1036c73e3bcd66b57d835c7859572ca4b2250d, 
Remove Kafka documentation for SQL/Table API, since this is now externalized), 
I conducted tests on the Flink 1.19 version, and it reflected the correct 
version.

I observed that the hotfix has also been backported to the Flink release-1.18. 
Once this fix is merged into the Kafka connector repository, the 1.18 
documentation will display the accurate version.

 !screenshot-1.png! 

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33401) Kafka connector has broken version

2023-11-10 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784724#comment-17784724
 ] 

Yuxin Tan commented on FLINK-33401:
---

After merging the Flink 1.19 hotfix (fa1036c73e3bcd66b57d835c7859572ca4b2250d, 
Remove Kafka documentation for SQL/Table API, since this is now externalized), 
I conducted tests on the Flink 1.19 version, and it reflected the correct 
version.

I observed that the hotfix has also been backported to the Flink release-1.18. 
Once this fix is merged into the Kafka connector repository, the 1.18 
documentation will display the accurate version.

 !screenshot-1.png! 

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33401) Kafka connector has broken version

2023-11-10 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33401:
--
Attachment: screenshot-1.png

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33493) Elasticsearch connector ElasticsearchWriterITCase test failed

2023-11-09 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784674#comment-17784674
 ] 

Yuxin Tan commented on FLINK-33493:
---

[~martijnvisser] Yeah. I will take a look at this issue.

> Elasticsearch connector ElasticsearchWriterITCase test failed
> -
>
> Key: FLINK-33493
> URL: https://issues.apache.org/jira/browse/FLINK-33493
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Reporter: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> When I ran tests, the test failed. The failed reason is
> {code:java}
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
>  cannot find symbol
>   symbol:   method 
> mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
>   location: class 
> org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
>  cannot find symbol
>   symbol:   method mock(org.apache.flink.metrics.MetricGroup)
> {code}
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.
> ElasticsearchWriterITCase called Flink "InternalSinkWriterMetricGroup#mock", 
> and it is renamed in https://github.com/apache/flink/pull/23541 
> ([FLINK-33295|https://issues.apache.org/jira/browse/FLINK-33295] in Flink 
> 1.19). So the test failed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33493) Elasticsearch connector ElasticsearchWriterITCase test failed

2023-11-09 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784361#comment-17784361
 ] 

Yuxin Tan commented on FLINK-33493:
---

In addition, the affected version cannot be set to 4.0.x, because the version 
es-4.0.x does not exist yet.

> Elasticsearch connector ElasticsearchWriterITCase test failed
> -
>
> Key: FLINK-33493
> URL: https://issues.apache.org/jira/browse/FLINK-33493
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Reporter: Yuxin Tan
>Priority: Major
>
> When I ran tests, the test failed. The failed reason is
> {code:java}
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
>  cannot find symbol
>   symbol:   method 
> mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
>   location: class 
> org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
>  cannot find symbol
>   symbol:   method mock(org.apache.flink.metrics.MetricGroup)
> {code}
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.
> ElasticsearchWriterITCase called Flink "InternalSinkWriterMetricGroup#mock", 
> and it is renamed in https://github.com/apache/flink/pull/23541 
> ([FLINK-33295|https://issues.apache.org/jira/browse/FLINK-33295] in Flink 
> 1.19). So the test failed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33493) Elasticsearch connector ElasticsearchWriterITCase test failed

2023-11-09 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33493:
--
Description: 
When I ran the tests, the test failed. The failure reason is

{code:java}
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
 cannot find symbol
  symbol:   method 
mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
  location: class 
org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
 cannot find symbol
  symbol:   method mock(org.apache.flink.metrics.MetricGroup)
{code}

https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.

ElasticsearchWriterITCase calls Flink "InternalSinkWriterMetricGroup#mock", 
which was renamed in https://github.com/apache/flink/pull/23541 
(https://issues.apache.org/jira/browse/FLINK-33295, in Flink 1.19). So the test 
fails to compile.

  was:
When I ran tests, the test failed. The failed reason is

{code:java}
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
 cannot find symbol
  symbol:   method 
mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
  location: class 
org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
 cannot find symbol
  symbol:   method mock(org.apache.flink.metrics.MetricGroup)
{code}

https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.

ElasticsearchWriterITCase called Flink "InternalSinkWriterMetricGroup#mock", 
and it is renamed in https://github.com/apache/flink/pull/23541 (in Flink 
1.19). So the test failed.


> Elasticsearch connector ElasticsearchWriterITCase test failed
> -
>
> Key: FLINK-33493
> URL: https://issues.apache.org/jira/browse/FLINK-33493
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Reporter: Yuxin Tan
>Priority: Major
>
> When I ran tests, the test failed. The failed reason is
> {code:java}
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
>  cannot find symbol
>   symbol:   method 
> mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
>   location: class 
> org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
>  cannot find symbol
>   symbol:   method mock(org.apache.flink.metrics.MetricGroup)
> {code}
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.
> ElasticsearchWriterITCase called Flink "InternalSinkWriterMetricGroup#mock", 
> and it is renamed in https://github.com/apache/flink/pull/23541 
> (https://issues.apache.org/jira/browse/FLINK-33295 in Flink 1.19). So the 
> test failed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33493) Elasticsearch connector ElasticsearchWriterITCase test failed

2023-11-09 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33493:
--
Description: 
When I ran the tests, the test failed. The failure reason is

{code:java}
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
 cannot find symbol
  symbol:   method 
mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
  location: class 
org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
 cannot find symbol
  symbol:   method mock(org.apache.flink.metrics.MetricGroup)
{code}

https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.

ElasticsearchWriterITCase calls Flink "InternalSinkWriterMetricGroup#mock", 
which was renamed in https://github.com/apache/flink/pull/23541 
([FLINK-33295|https://issues.apache.org/jira/browse/FLINK-33295] in Flink 
1.19). So the test fails to compile.

  was:
When I ran tests, the test failed. The failed reason is

{code:java}
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
 cannot find symbol
  symbol:   method 
mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
  location: class 
org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
 cannot find symbol
  symbol:   method mock(org.apache.flink.metrics.MetricGroup)
{code}

https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.

ElasticsearchWriterITCase called Flink "InternalSinkWriterMetricGroup#mock", 
and it is renamed in https://github.com/apache/flink/pull/23541 
(https://issues.apache.org/jira/browse/FLINK-33295 in Flink 1.19). So the test 
failed.


> Elasticsearch connector ElasticsearchWriterITCase test failed
> -
>
> Key: FLINK-33493
> URL: https://issues.apache.org/jira/browse/FLINK-33493
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Reporter: Yuxin Tan
>Priority: Major
>
> When I ran tests, the test failed. The failed reason is
> {code:java}
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
>  cannot find symbol
>   symbol:   method 
> mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
>   location: class 
> org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
> Error:  
> /home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
>  cannot find symbol
>   symbol:   method mock(org.apache.flink.metrics.MetricGroup)
> {code}
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.
> ElasticsearchWriterITCase called Flink "InternalSinkWriterMetricGroup#mock", 
> and it is renamed in https://github.com/apache/flink/pull/23541 
> ([FLINK-33295|https://issues.apache.org/jira/browse/FLINK-33295] in Flink 
> 1.19). So the test failed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33493) Elasticsearch connector ElasticsearchWriterITCase test failed

2023-11-09 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33493:
-

 Summary: Elasticsearch connector ElasticsearchWriterITCase test 
failed
 Key: FLINK-33493
 URL: https://issues.apache.org/jira/browse/FLINK-33493
 Project: Flink
  Issue Type: Bug
  Components: Connectors / ElasticSearch
Reporter: Yuxin Tan


When I ran the tests, the test failed. The failure reason is

{code:java}
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[197,46]
 cannot find symbol
  symbol:   method 
mock(org.apache.flink.metrics.MetricGroup,org.apache.flink.metrics.groups.OperatorIOMetricGroup)
  location: class 
org.apache.flink.runtime.metrics.groups.InternalSinkWriterMetricGroup
Error:  
/home/runner/work/flink-connector-elasticsearch/flink-connector-elasticsearch/flink-connector-elasticsearch-base/src/test/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchWriterITCase.java:[273,46]
 cannot find symbol
  symbol:   method mock(org.apache.flink.metrics.MetricGroup)
{code}

https://github.com/apache/flink-connector-elasticsearch/actions/runs/6809899863/job/18517273714?pr=77#step:13:134.

ElasticsearchWriterITCase calls Flink "InternalSinkWriterMetricGroup#mock", 
which was renamed in https://github.com/apache/flink/pull/23541 (in Flink 
1.19). So the test fails to compile.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33401) Kafka connector has broken version

2023-11-06 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783166#comment-17783166
 ] 

Yuxin Tan commented on FLINK-33401:
---

[~pavelhp] I noticed that it has been released. You can find the release thread 
here: https://lists.apache.org/thread/dn8dw8551ckrm5cw6rs73v4b4zm0vy05. From 
the given link, the connector's download link can be found: 
https://www.apache.org/dyn/closer.lua/flink/flink-connector-kafka-3.0.1/flink-connector-kafka-3.0.1-src.tgz.

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-03 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782463#comment-17782463
 ] 

Yuxin Tan commented on FLINK-33417:
---

[~mapohl] Ok. Thanks for tracking this. The error occurs on our test machine, not on 
a local Mac. The error can be reproduced, and the reproduced log `run.log` is 
attached.

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 09DD8FAA-7701-4AA0-85F9-4ABA6C5117DF.png, 
> BE207321-9677-4721-9415-BD3312C29824.png, run.log
>
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
>  (The BE20xxx pic is the error when using 4.1.82. The 09DDxx pic is the pic 
> of compiling successfully after using 4.1.83.)
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-03 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33417:
--
Attachment: run.log

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 09DD8FAA-7701-4AA0-85F9-4ABA6C5117DF.png, 
> BE207321-9677-4721-9415-BD3312C29824.png, run.log
>
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
>  (The BE20xxx pic is the error when using 4.1.82. The 09DDxx pic is the pic 
> of compiling successfully after using 4.1.83.)
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (FLINK-33445) Translate DataSet migration guideline to Chinese

2023-11-03 Thread Yuxin Tan (Jira)


[ https://issues.apache.org/jira/browse/FLINK-33445 ]


Yuxin Tan deleted comment on FLINK-33445:
---

was (Author: tanyuxin):
[~liyubin117] Thanks for helping, assigned to you.

> Translate DataSet migration guideline to Chinese
> 
>
> Key: FLINK-33445
> URL: https://issues.apache.org/jira/browse/FLINK-33445
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation
>Affects Versions: 1.19.0
>Reporter: Wencong Liu
>Assignee: Yubin Li
>Priority: Major
>  Labels: starter
> Fix For: 1.19.0
>
>
> The [FLINK-33041|https://issues.apache.org/jira/browse/FLINK-33041] change, which 
> adds an introduction on how to migrate the DataSet API to DataStream, has 
> been merged into the master branch. Here is the 
> [LINK|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/dataset_migration/]
>  in the Flink website.
> According to the [contribution 
> guidelines|https://flink.apache.org/how-to-contribute/contribute-documentation/#chinese-documentation-translation],
>  we should add an identical markdown file in {{content.zh/}} and translate it 
> to Chinese. Any community volunteers are welcome to take this task.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-02 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782399#comment-17782399
 ] 

Yuxin Tan commented on FLINK-33417:
---

[~mapohl] Actually, since I don't have an in-depth understanding of Netty, I 
haven't determined the internal cause of the error or why this update resolves the 
issue. But it does work.

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 09DD8FAA-7701-4AA0-85F9-4ABA6C5117DF.png, 
> BE207321-9677-4721-9415-BD3312C29824.png
>
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
>  (The BE20xxx pic is the error when using 4.1.82. The 09DDxx pic is the pic 
> of compiling successfully after using 4.1.83.)
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33445) Translate DataSet migration guideline to Chinese

2023-11-02 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782385#comment-17782385
 ] 

Yuxin Tan commented on FLINK-33445:
---

[~liyubin117] Thanks for helping, assigned to you.

> Translate DataSet migration guideline to Chinese
> 
>
> Key: FLINK-33445
> URL: https://issues.apache.org/jira/browse/FLINK-33445
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation
>Affects Versions: 1.19.0
>Reporter: Wencong Liu
>Assignee: Yubin Li
>Priority: Major
>  Labels: starter
> Fix For: 1.19.0
>
>
> The [FLINK-33041|https://issues.apache.org/jira/browse/FLINK-33041] change, which 
> adds an introduction on how to migrate the DataSet API to DataStream, has 
> been merged into the master branch. Here is the 
> [LINK|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/dataset_migration/]
>  in the Flink website.
> According to the [contribution 
> guidelines|https://flink.apache.org/how-to-contribute/contribute-documentation/#chinese-documentation-translation],
>  we should add an identical markdown file in {{content.zh/}} and translate it 
> to Chinese. Any community volunteers are welcome to take this task.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33445) Translate DataSet migration guideline to Chinese

2023-11-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33445:
-

Assignee: Yubin Li

> Translate DataSet migration guideline to Chinese
> 
>
> Key: FLINK-33445
> URL: https://issues.apache.org/jira/browse/FLINK-33445
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation
>Affects Versions: 1.19.0
>Reporter: Wencong Liu
>Assignee: Yubin Li
>Priority: Major
>  Labels: starter
> Fix For: 1.19.0
>
>
> The [FLINK-33041|https://issues.apache.org/jira/browse/FLINK-33041] change, which 
> adds an introduction on how to migrate the DataSet API to DataStream, has 
> been merged into the master branch. Here is the 
> [LINK|https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/dataset_migration/]
>  in the Flink website.
> According to the [contribution 
> guidelines|https://flink.apache.org/how-to-contribute/contribute-documentation/#chinese-documentation-translation],
>  we should add an identical markdown file in {{content.zh/}} and translate it 
> to Chinese. Any community volunteers are welcome to take this task.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33417:
--
Description: 
In our ARM environment, we encounter a compile error when using Flink 1.17.
 (The BE20xxx picture shows the error when using 4.1.82. The 09DDxx picture shows 
a successful compile after switching to 4.1.83.)

Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
flink-shaded 16.1 fails to compile in the ARM environment. As a result, we are 
unable to compile Flink 1.17 due to this issue.

We have tested compiling flink-shaded using netty 4.1.83 or a later version in 
ARM env, and it can compile successfully.

Taking into consideration the previous discussions regarding compatibility and 
the dependency of external connectors on this version, I propose addressing the 
bug by only updating flink-shaded's netty to a minor version (e.g., 4.1.83) 
rather than backporting FLINK-32032. 

To implement the update, a new release of flink-shaded 16.2 may be needed.

The discussion details are at 
https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.


  was:
In our ARM environment, we encounter a compile error when using Flink 1.17.

Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
flink-shaded 16.1 fails to compile in the ARM environment. As a result, we are 
unable to compile Flink 1.17 due to this issue.

We have tested compiling flink-shaded using netty 4.1.83 or a later version in 
ARM env, and it can compile successfully.

Taking into consideration the previous discussions regarding compatibility and 
the dependency of external connectors on this version, I propose addressing the 
bug by only updating flink-shaded's netty to a minor version (e.g., 4.1.83) 
rather than backporting FLINK-32032. 

To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
released.

The discussion details is at 
https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 09DD8FAA-7701-4AA0-85F9-4ABA6C5117DF.png, 
> BE207321-9677-4721-9415-BD3312C29824.png
>
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
>  (The BE20xxx pic is the error when using 4.1.82. The 09DDxx pic is the pic 
> of compiling successfully after using 4.1.83.)
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33417:
--
Attachment: 09DD8FAA-7701-4AA0-85F9-4ABA6C5117DF.png

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 09DD8FAA-7701-4AA0-85F9-4ABA6C5117DF.png, 
> BE207321-9677-4721-9415-BD3312C29824.png
>
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-02 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33417:
--
Attachment: BE207321-9677-4721-9415-BD3312C29824.png

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Attachments: BE207321-9677-4721-9415-BD3312C29824.png
>
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-01 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33417:
--
Affects Version/s: shaded-16.1

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Affects Versions: shaded-16.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-11-01 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33417:
-

Assignee: Yuxin Tan

> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33401) Kafka connector has broken version

2023-10-31 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781563#comment-17781563
 ] 

Yuxin Tan edited comment on FLINK-33401 at 11/1/23 12:24 AM:
-

[~pavelhp] If you wish to try it with Flink 1.18, you will need to wait for the 
release of the new connector version. (Currently, a connector version that 
supports Flink 1.18 has not been released yet.)


was (Author: tanyuxin):
[~pavelhp] If you wish to try it in Flink 1.18, you may need to wait for the 
new release of the new version connector.

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33401) Kafka connector has broken version

2023-10-31 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781563#comment-17781563
 ] 

Yuxin Tan commented on FLINK-33401:
---

[~pavelhp] If you wish to try it with Flink 1.18, you may need to wait for the 
release of the new connector version.

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-10-31 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33417:
--
Description: 
In our ARM environment, we encounter a compile error when using Flink 1.17.

Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
flink-shaded 16.1 fails to compile in the ARM environment. As a result, we are 
unable to compile Flink 1.17 due to this issue.

We have tested compiling flink-shaded using netty 4.1.83 or a later version in 
ARM env, and it can compile successfully.

Taking into consideration the previous discussions regarding compatibility and 
the dependency of external connectors on this version, I propose addressing the 
bug by only updating flink-shaded's netty to a minor version (e.g., 4.1.83) 
rather than backporting FLINK-32032. 

To implement the update, a new release of flink-shaded 16.2 may be needed.

The discussion details are at 
https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.


  was:
In our ARM environment, we encounter a compile error when
using Flink 1.17.

Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82.
However, flink-shaded 16.1 fails to compile in the ARM
environment. As a result, we are unable to compile Flink 1.17
due to this issue.

We have tested compiling flink-shaded using netty 4.1.83 or
a later version in ARM env, and it can compile successfully.

Taking into consideration the previous discussions regarding
compatibility and the dependency of external connectors on
this version, I propose addressing the bug by only updating
flink-shaded's netty to a minor version (e.g., 4.1.83) rather than
backporting FLINK-32032. 

To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
released.

The discussion details is at 
https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



> Update netty version to 4.1.83 for flink-shaded
> ---
>
> Key: FLINK-33417
> URL: https://issues.apache.org/jira/browse/FLINK-33417
> Project: Flink
>  Issue Type: Bug
>  Components: BuildSystem / Shaded
>Reporter: Yuxin Tan
>Priority: Major
>
> In our ARM environment, we encounter a compile error when using Flink 1.17.
> Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82. However, 
> flink-shaded 16.1 fails to compile in the ARM environment. As a result, we 
> are unable to compile Flink 1.17 due to this issue.
> We have tested compiling flink-shaded using netty 4.1.83 or a later version 
> in ARM env, and it can compile successfully.
> Taking into consideration the previous discussions regarding compatibility 
> and the dependency of external connectors on this version, I propose 
> addressing the bug by only updating flink-shaded's netty to a minor version 
> (e.g., 4.1.83) rather than backporting FLINK-32032. 
> To implement the update, maybe a new release of flink-shaded 16.2 needs to be 
> released.
> The discussion details is at 
> https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33417) Update netty version to 4.1.83 for flink-shaded

2023-10-31 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33417:
-

 Summary: Update netty version to 4.1.83 for flink-shaded
 Key: FLINK-33417
 URL: https://issues.apache.org/jira/browse/FLINK-33417
 Project: Flink
  Issue Type: Bug
  Components: BuildSystem / Shaded
Reporter: Yuxin Tan


In our ARM environment, we encounter a compile error when
using Flink 1.17.

Flink 1.17 depends on flink-shaded 16.1, which uses netty 4.1.82.
However, flink-shaded 16.1 fails to compile in the ARM
environment. As a result, we are unable to compile Flink 1.17
due to this issue.

We have tested compiling flink-shaded using netty 4.1.83 or
a later version in ARM env, and it can compile successfully.

Taking into consideration the previous discussions regarding
compatibility and the dependency of external connectors on
this version, I propose addressing the bug by only updating
flink-shaded's netty to a minor version (e.g., 4.1.83) rather than
backporting FLINK-32032. 

To implement the update, a new release of flink-shaded 16.2 may be needed.

The discussion details are at 
https://lists.apache.org/thread/y1c8545bcsx2836d9pgfdzj65knvw7kb.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33401) Kafka connector has broken version

2023-10-30 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781203#comment-17781203
 ] 

Yuxin Tan commented on FLINK-33401:
---

[~pavelhp] Thanks for reporting this issue. I will take a look at this.

Since the new version of the Kafka connector has not been released yet, it will 
not be available even once the bug is fixed. You can use the old version 
(https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/datastream/kafka/)
 until the new connector version is released. 

> Kafka connector has broken version
> --
>
> Key: FLINK-33401
> URL: https://issues.apache.org/jira/browse/FLINK-33401
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: Pavel Khokhlov
>Priority: Major
>  Labels: pull-request-available
>
> Trying to run Flink 1.18 with Kafka Connector
> but official documentation has a bug  
> [https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/]
> {noformat}
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-connector-kafka</artifactId>
> <version>-1.18</version>
> {noformat}
> Basically version *-1.18* doesn't exist.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33370) Simplify validateAndParseHostsString in Elasticsearch connector's configuration

2023-10-26 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33370:
-

 Summary: Simplify validateAndParseHostsString in Elasticsearch 
connector's configuration
 Key: FLINK-33370
 URL: https://issues.apache.org/jira/browse/FLINK-33370
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / ElasticSearch
Reporter: Yuxin Tan
Assignee: Yuxin Tan


Currently, the validateAndParseHostsString method exists in each configuration 
file (repeated three times), but the method logic is exactly the same. We can 
simplify this by introducing a common util.
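
A minimal sketch of what such a common util could look like, assuming the option 
value is a semicolon-separated list of host URLs (e.g. 
"http://host1:9200;http://host2:9200") as in the existing connector options; the 
class name HostsParsingUtil, the exception type, and the error message below are 
illustrative, not the actual implementation:

{code:java}
import org.apache.http.HttpHost;

import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared helper so the three configuration classes reuse a single parser. */
public final class HostsParsingUtil {

    private HostsParsingUtil() {}

    /** Parses a value such as "http://host1:9200;http://host2:9200" into HttpHost instances. */
    public static List<HttpHost> validateAndParseHostsString(String hosts) {
        List<HttpHost> result = new ArrayList<>();
        for (String host : hosts.split(";")) {
            String trimmed = host.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            HttpHost httpHost = HttpHost.create(trimmed);
            if (httpHost.getHostName() == null || httpHost.getPort() < 1) {
                throw new IllegalArgumentException(
                        "Could not parse host '" + trimmed
                                + "'; expected format 'http://host_name:port'.");
            }
            result.add(httpHost);
        }
        return result;
    }
}
{code}

Each of the three configuration classes could then delegate to this util instead of 
keeping its own copy of the parsing logic.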



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33323) HybridShuffleITCase fails with produced an uncaught exception in FatalExitExceptionHandler

2023-10-26 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779779#comment-17779779
 ] 

Yuxin Tan commented on FLINK-33323:
---

[~Wencong Liu] Could you please take a look at this issue?

> HybridShuffleITCase fails with produced an uncaught exception in 
> FatalExitExceptionHandler
> --
>
> Key: FLINK-33323
> URL: https://issues.apache.org/jira/browse/FLINK-33323
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: mvn-3.zip
>
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53853=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9166
> fails with
> {noformat}
> 01:15:38,516 [blocking-shuffle-io-thread-4] ERROR 
> org.apache.flink.util.FatalExitExceptionHandler  [] - FATAL: 
> Thread 'blocking-shuffle-io-thread-4' produced an uncaught exception. 
> Stopping the process...
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@4275bb45[Not
>  completed, task = 
> java.util.concurrent.Executors$RunnableAdapter@488dd035[Wrapped task = 
> org.apache.fl
> ink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskIOScheduler$$Lambda$2561/0x000801a2f728@464a3754]]
>  rejected from 
> java.util.concurrent.ScheduledThreadPoolExecutor@22747816[Shutting down, pool 
> size = 10, active threads = 9,
>  queued tasks = 1, completed tasks = 1]
> at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2065)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833) 
> ~[?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340)
>  ~[?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562)
>  ~[?:?]
> at 
> org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskIOScheduler.run(DiskIOScheduler.java:151)
>  ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
> at 
> org.apache.flink.runtime.io.network.partition.hybrid.tiered.tier.disk.DiskIOScheduler.lambda$triggerScheduling$0(DiskIOScheduler.java:308)
>  ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  [?:?]
> at java.lang.Thread.run(Thread.java:833) [?:?]
> {noformat}
> also logs are attached
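
The trace above suggests a shutdown race: DiskIOScheduler re-schedules its next read 
round on the shared blocking-shuffle I/O executor while that executor is already 
shutting down (e.g. during partition release or test teardown), so the late 
schedule() call is rejected. Below is a minimal stand-alone sketch of that generic 
race, with nothing Flink-specific in it; the class and variable names are illustrative:

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduleAfterShutdownRace {

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService ioExecutor = Executors.newScheduledThreadPool(1);

        // Stands in for DiskIOScheduler#run scheduling the next round of disk reads.
        Runnable nextRound =
                () -> {
                    try {
                        ioExecutor.schedule(() -> {}, 1, TimeUnit.MILLISECONDS);
                    } catch (RejectedExecutionException e) {
                        // This is the exception seen in the log above; in the failing test
                        // it is not handled inside the task, so it is reported by
                        // FatalExitExceptionHandler and stops the process.
                        System.out.println("schedule() rejected: " + e.getMessage());
                    }
                };

        // A round is already queued with a small delay...
        ioExecutor.schedule(nextRound, 5, TimeUnit.MILLISECONDS);

        // ...and the executor is shut down before it runs (the release side of the race).
        // By default the queued delayed task still executes after shutdown, and its
        // inner schedule() call is then rejected.
        ioExecutor.shutdown();
        ioExecutor.awaitTermination(1, TimeUnit.SECONDS);
    }
}
{code}

A likely direction for a fix is to make the scheduling path tolerate an executor that 
has already been shut down, but that would need to be confirmed against the actual 
DiskIOScheduler code.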



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33280) tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to timeout

2023-10-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33280:
--
Component/s: Runtime / Network

> tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to 
> timeout
> ---
>
> Key: FLINK-33280
> URL: https://issues.apache.org/jira/browse/FLINK-33280
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
> Fix For: 1.19.0
>
>
> [https://github.com/XComp/flink/actions/runs/6529754573/job/17728223099#step:12:9120]
> {code:java}
>  Oct 16 09:13:15 java.lang.AssertionError: 
> org.apache.flink.runtime.JobException: org.apache.flink.runtime.JobException: 
> Recovery is suppressed by 
> FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10, 
> backoffTimeMS=0)
> 8959Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.JobGraphRunningUtil.execute(JobGraphRunningUtil.java:59)
> 8960Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.BatchShuffleITCaseBase.executeJob(BatchShuffleITCaseBase.java:118)
> 8961Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.HybridShuffleITCase.testHybridSelectiveExchangesRestart(HybridShuffleITCase.java:77)
> 8962Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 8963Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 8964Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 8965Oct 16 09:13:15   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 8966Oct 16 09:13:15   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> 8967Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> 8968Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> 8969Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> 8970Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> 8971Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:94)
> 8972Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> 8973Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> 8974Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> 8975Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> 8976Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> 8977Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> 8978Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
> 8979Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
> 8980Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
> 8981Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8982Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
> 8983Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
> 8984Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
> 8985Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> 8986Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8987Oct 16 09:13:15   at 
> 

[jira] [Updated] (FLINK-33280) tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to timeout

2023-10-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33280:
--
Fix Version/s: 1.19.0

> tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to 
> timeout
> ---
>
> Key: FLINK-33280
> URL: https://issues.apache.org/jira/browse/FLINK-33280
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
> Fix For: 1.19.0
>
>
> [https://github.com/XComp/flink/actions/runs/6529754573/job/17728223099#step:12:9120]
> {code:java}
>  Oct 16 09:13:15 java.lang.AssertionError: 
> org.apache.flink.runtime.JobException: org.apache.flink.runtime.JobException: 
> Recovery is suppressed by 
> FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10, 
> backoffTimeMS=0)
> 8959Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.JobGraphRunningUtil.execute(JobGraphRunningUtil.java:59)
> 8960Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.BatchShuffleITCaseBase.executeJob(BatchShuffleITCaseBase.java:118)
> 8961Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.HybridShuffleITCase.testHybridSelectiveExchangesRestart(HybridShuffleITCase.java:77)
> 8962Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 8963Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 8964Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 8965Oct 16 09:13:15   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 8966Oct 16 09:13:15   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> 8967Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> 8968Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> 8969Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> 8970Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> 8971Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:94)
> 8972Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> 8973Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> 8974Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> 8975Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> 8976Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> 8977Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> 8978Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
> 8979Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
> 8980Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
> 8981Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8982Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
> 8983Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
> 8984Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
> 8985Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> 8986Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8987Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> 8988Oct 16 09:13:15   

[jira] [Updated] (FLINK-33280) tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to timeout

2023-10-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-33280:
--
Affects Version/s: 1.19.0

> tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to 
> timeout
> ---
>
> Key: FLINK-33280
> URL: https://issues.apache.org/jira/browse/FLINK-33280
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
> Fix For: 1.19.0
>
>
> [https://github.com/XComp/flink/actions/runs/6529754573/job/17728223099#step:12:9120]
> {code:java}
>  Oct 16 09:13:15 java.lang.AssertionError: 
> org.apache.flink.runtime.JobException: org.apache.flink.runtime.JobException: 
> Recovery is suppressed by 
> FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10, 
> backoffTimeMS=0)
> 8959Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.JobGraphRunningUtil.execute(JobGraphRunningUtil.java:59)
> 8960Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.BatchShuffleITCaseBase.executeJob(BatchShuffleITCaseBase.java:118)
> 8961Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.HybridShuffleITCase.testHybridSelectiveExchangesRestart(HybridShuffleITCase.java:77)
> 8962Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 8963Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 8964Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 8965Oct 16 09:13:15   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 8966Oct 16 09:13:15   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> 8967Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> 8968Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> 8969Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> 8970Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> 8971Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:94)
> 8972Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> 8973Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> 8974Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> 8975Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> 8976Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> 8977Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> 8978Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
> 8979Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
> 8980Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
> 8981Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8982Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
> 8983Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
> 8984Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
> 8985Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> 8986Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8987Oct 16 09:13:15   at 
> 

[jira] [Closed] (FLINK-33280) tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to timeout

2023-10-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-33280.
-
Resolution: Fixed

> tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to 
> timeout
> ---
>
> Key: FLINK-33280
> URL: https://issues.apache.org/jira/browse/FLINK-33280
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
>
> [https://github.com/XComp/flink/actions/runs/6529754573/job/17728223099#step:12:9120]
> {code:java}
>  Oct 16 09:13:15 java.lang.AssertionError: 
> org.apache.flink.runtime.JobException: org.apache.flink.runtime.JobException: 
> Recovery is suppressed by 
> FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10, 
> backoffTimeMS=0)
> 8959Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.JobGraphRunningUtil.execute(JobGraphRunningUtil.java:59)
> 8960Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.BatchShuffleITCaseBase.executeJob(BatchShuffleITCaseBase.java:118)
> 8961Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.HybridShuffleITCase.testHybridSelectiveExchangesRestart(HybridShuffleITCase.java:77)
> 8962Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 8963Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 8964Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 8965Oct 16 09:13:15   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 8966Oct 16 09:13:15   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> 8967Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> 8968Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> 8969Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> 8970Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> 8971Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:94)
> 8972Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> 8973Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> 8974Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> 8975Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> 8976Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> 8977Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> 8978Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
> 8979Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
> 8980Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
> 8981Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8982Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
> 8983Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
> 8984Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
> 8985Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> 8986Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8987Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> 8988Oct 16 09:13:15   at 
> 

[jira] [Commented] (FLINK-33280) tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to timeout

2023-10-18 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776628#comment-17776628
 ] 

Yuxin Tan commented on FLINK-33280:
---

The reason is the same as https://issues.apache.org/jira/browse/FLINK-33185. 
The fix has been merged.

Feel free to re-open this issue if it is still reproducible.

> tests stage: HybridShuffleITCase.testHybridSelectiveExchangesRestart due to 
> timeout
> ---
>
> Key: FLINK-33280
> URL: https://issues.apache.org/jira/browse/FLINK-33280
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: test-stability
>
> [https://github.com/XComp/flink/actions/runs/6529754573/job/17728223099#step:12:9120]
> {code:java}
>  Oct 16 09:13:15 java.lang.AssertionError: 
> org.apache.flink.runtime.JobException: org.apache.flink.runtime.JobException: 
> Recovery is suppressed by 
> FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10, 
> backoffTimeMS=0)
> 8959Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.JobGraphRunningUtil.execute(JobGraphRunningUtil.java:59)
> 8960Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.BatchShuffleITCaseBase.executeJob(BatchShuffleITCaseBase.java:118)
> 8961Oct 16 09:13:15   at 
> org.apache.flink.test.runtime.HybridShuffleITCase.testHybridSelectiveExchangesRestart(HybridShuffleITCase.java:77)
> 8962Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 8963Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 8964Oct 16 09:13:15   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 8965Oct 16 09:13:15   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 8966Oct 16 09:13:15   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> 8967Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> 8968Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> 8969Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> 8970Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> 8971Oct 16 09:13:15   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:94)
> 8972Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> 8973Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> 8974Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> 8975Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> 8976Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> 8977Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> 8978Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
> 8979Oct 16 09:13:15   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
> 8980Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
> 8981Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8982Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
> 8983Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
> 8984Oct 16 09:13:15   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
> 8985Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> 8986Oct 16 09:13:15   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 8987Oct 16 

[jira] [Assigned] (FLINK-33185) HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool on AZP

2023-10-17 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33185:
-

Assignee: Yuxin Tan

> HybridShuffleITCase fails with TimeoutException: Pending slot request timed 
> out in slot pool on AZP
> ---
>
> Key: FLINK-33185
> URL: https://issues.apache.org/jira/browse/FLINK-33185
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Assignee: Yuxin Tan
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53519=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=8641
> fails as 
> {noformat}
> Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.util.FlinkException: Pending slot request with 
> SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released.
> Sep 29 05:13:54   at 
> org.apache.flink.runtime.scheduler.DefaultExecutionDeployer.lambda$assignResource$4(DefaultExecutionDeployer.java:226)
> Sep 29 05:13:54   ... 36 more
> Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.util.FlinkException: Pending slot request with 
> SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released.
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> Sep 29 05:13:54   ... 34 more
> Sep 29 05:13:54 Caused by: org.apache.flink.util.FlinkException: 
> org.apache.flink.util.FlinkException: Pending slot request with 
> SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released.
> Sep 29 05:13:54   at 
> org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.releaseSlot(DeclarativeSlotPoolBridge.java:373)
> Sep 29 05:13:54   ... 30 more
> Sep 29 05:13:54 Caused by: java.util.concurrent.TimeoutException: 
> java.util.concurrent.TimeoutException: Pending slot request timed out in slot 
> pool.
> Sep 29 05:13:54   ... 30 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33185) HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool on AZP

2023-10-17 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776471#comment-17776471
 ] 

Yuxin Tan commented on FLINK-33185:
---

[~Sergey Nuyanzin] Sorry for the late reply. I will take a look.

> HybridShuffleITCase fails with TimeoutException: Pending slot request timed 
> out in slot pool on AZP
> ---
>
> Key: FLINK-33185
> URL: https://issues.apache.org/jira/browse/FLINK-33185
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53519=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=8641
> fails as 
> {noformat}
> Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.util.FlinkException: Pending slot request with 
> SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released.
> Sep 29 05:13:54   at 
> org.apache.flink.runtime.scheduler.DefaultExecutionDeployer.lambda$assignResource$4(DefaultExecutionDeployer.java:226)
> Sep 29 05:13:54   ... 36 more
> Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.util.FlinkException: Pending slot request with 
> SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released.
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
> Sep 29 05:13:54   at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> Sep 29 05:13:54   ... 34 more
> Sep 29 05:13:54 Caused by: org.apache.flink.util.FlinkException: 
> org.apache.flink.util.FlinkException: Pending slot request with 
> SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released.
> Sep 29 05:13:54   at 
> org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.releaseSlot(DeclarativeSlotPoolBridge.java:373)
> Sep 29 05:13:54   ... 30 more
> Sep 29 05:13:54 Caused by: java.util.concurrent.TimeoutException: 
> java.util.concurrent.TimeoutException: Pending slot request timed out in slot 
> pool.
> Sep 29 05:13:54   ... 30 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32193) AWS connector removes the dependency on flink-shaded

2023-10-12 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774368#comment-17774368
 ] 

Yuxin Tan commented on FLINK-32193:
---

Duplicate of https://issues.apache.org/jira/browse/FLINK-33194

> AWS connector removes the dependency on flink-shaded
> 
>
> Key: FLINK-32193
> URL: https://issues.apache.org/jira/browse/FLINK-32193
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / AWS
>Affects Versions: aws-connector-4.2.0
>Reporter: Yuxin Tan
>Priority: Major
>
> The AWS connector depends on flink-shaded. With the externalization of the 
> connector, connectors shouldn't rely on flink-shaded.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32193) AWS connector removes the dependency on flink-shaded

2023-10-12 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-32193.
-
Resolution: Duplicate

> AWS connector removes the dependency on flink-shaded
> 
>
> Key: FLINK-32193
> URL: https://issues.apache.org/jira/browse/FLINK-32193
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / AWS
>Affects Versions: aws-connector-4.2.0
>Reporter: Yuxin Tan
>Priority: Major
>
> The AWS connector depends on flink-shaded. With the externalization of the 
> connector, connectors shouldn't rely on flink-shaded.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32060) Migrate subclasses of BatchAbstractTestBase in table and other modules to JUnit5

2023-09-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-32060.
-

> Migrate subclasses of BatchAbstractTestBase in table and other modules to 
> JUnit5
> 
>
> Key: FLINK-32060
> URL: https://issues.apache.org/jira/browse/FLINK-32060
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
> Environment: Migrate subclasses of BatchAbstractTestBase in table and 
> other modules to JUnit5.
>Reporter: Yuxin Tan
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32058) Migrate subclasses of BatchAbstractTestBase in runtime.batch.sql to JUnit5

2023-09-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-32058.
-

> Migrate subclasses of BatchAbstractTestBase in runtime.batch.sql to JUnit5
> --
>
> Key: FLINK-32058
> URL: https://issues.apache.org/jira/browse/FLINK-32058
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Migrate subclasses of BatchAbstractTestBase in runtime.batch.sql to JUnit5.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32055) Migrate all subclasses of BatchAbstractTestBase to JUnit5

2023-09-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-32055:
-

Assignee: Yuxin Tan

> Migrate all subclasses of BatchAbstractTestBase to JUnit5
> -
>
> Key: FLINK-32055
> URL: https://issues.apache.org/jira/browse/FLINK-32055
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: Umbrella
> Fix For: 1.18.0
>
>
> After [FLINK-30815|https://issues.apache.org/jira/browse/FLINK-30815], we 
> should also migrate all the subclasses of BatchAbstractTestBase to JUnit5.
> Because there are too many classes to migrate, we can split these classes 
> into 3 sub-tasks.
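
For illustration, a minimal, hypothetical before/after sketch of the kind of change
each sub-task involves when moving a test from JUnit 4 to JUnit 5; the class and
method names are made up and do not refer to actual Flink tests.

{code:java}
// Hypothetical example only; not an actual Flink test class.

// JUnit 4 style (before migration):
// import org.junit.Before;
// import org.junit.Rule;
// import org.junit.Test;
// import org.junit.rules.TemporaryFolder;
//
// public class ExampleBatchSqlITCase extends BatchAbstractTestBase {
//     @Rule public TemporaryFolder tmp = new TemporaryFolder();
//     @Before public void setUp() { /* ... */ }
//     @Test public void testQuery() { /* ... */ }
// }

// JUnit 5 style (after migration):
import java.nio.file.Path;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

class ExampleBatchSqlITCase {

    @TempDir Path tmp; // replaces the TemporaryFolder @Rule

    @BeforeEach
    void setUp() { /* ... */ } // @Before becomes @BeforeEach

    @Test
    void testQuery() { /* ... */ } // org.junit.Test becomes org.junit.jupiter.api.Test
}
{code}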



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33103) Hybrid shuffle ITCase supports the new mode

2023-09-18 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-33103:
-

Assignee: Yuxin Tan

> Hybrid shuffle ITCase supports the new mode
> ---
>
> Key: FLINK-33103
> URL: https://issues.apache.org/jira/browse/FLINK-33103
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.18.1
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the Hybrid shuffle ITCase only supports the legacy mode. The new 
> mode should also be verified, so we should improve it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33103) Hybrid shuffle ITCase supports the new mode

2023-09-18 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33103:
-

 Summary: Hybrid shuffle ITCase supports the new mode
 Key: FLINK-33103
 URL: https://issues.apache.org/jira/browse/FLINK-33103
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.18.1
Reporter: Yuxin Tan


Currently, the Hybrid shuffle ITCase only supports the legacy mode. The new 
mode should also be verified, so we should improve it.
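
A minimal sketch of how the ITCase could cover both modes by running the same job
once per mode. The option key "taskmanager.network.hybrid-shuffle.enable-new-mode"
and the helper method are assumptions for illustration; the actual ITCase wiring
differs.

{code:java}
// Sketch only: the option key and helper are assumed, not the real HybridShuffleITCase code.
import org.apache.flink.configuration.Configuration;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

class HybridShuffleModeSketch {

    @ParameterizedTest
    @ValueSource(booleans = {true, false})
    void testHybridShuffle(boolean enableNewMode) throws Exception {
        Configuration conf = new Configuration();
        conf.setString("execution.batch-shuffle-mode", "ALL_EXCHANGES_HYBRID_SELECTIVE");
        // Assumed option key for toggling the new (tiered storage) mode.
        conf.setBoolean("taskmanager.network.hybrid-shuffle.enable-new-mode", enableNewMode);
        runBatchJob(conf);
    }

    private void runBatchJob(Configuration conf) throws Exception {
        // Placeholder: the real ITCase would build a JobGraph and run it on a MiniCluster.
    }
}
{code}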



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33088) Fix NullPointerException in RemoteTierConsumerAgent of tiered storage

2023-09-15 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan closed FLINK-33088.
-
Resolution: Fixed

> Fix NullPointerException in RemoteTierConsumerAgent of tiered storage
> -
>
> Key: FLINK-33088
> URL: https://issues.apache.org/jira/browse/FLINK-33088
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> Currently, when getting a buffer from RemoteTierConsumerAgent of tiered 
> storage, a NullPointerException may be thrown; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33088) Fix NullPointerException in RemoteTierConsumerAgent of tiered storage

2023-09-15 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765530#comment-17765530
 ] 

Yuxin Tan commented on FLINK-33088:
---

master(1.19): 1774b84a74f5d348a94bf7dbd5694122150881dd
release-1.18: 0501d03658495b86b33ea57913c4baa131287388

> Fix NullPointerException in RemoteTierConsumerAgent of tiered storage
> -
>
> Key: FLINK-33088
> URL: https://issues.apache.org/jira/browse/FLINK-33088
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> Currently, when getting a buffer from RemoteTierConsumerAgent of tiered 
> storage, a NullPointerException may be thrown; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33088) Fix NullPointerException in RemoteTierConsumerAgent of tiered storage

2023-09-14 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33088:
-

 Summary: Fix NullPointerException in RemoteTierConsumerAgent of 
tiered storage
 Key: FLINK-33088
 URL: https://issues.apache.org/jira/browse/FLINK-33088
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.18.0
Reporter: Yuxin Tan
Assignee: Yuxin Tan


Currently, when getting a buffer from RemoteTierConsumerAgent of tiered 
storage, a NullPointerException may be thrown; we should fix it.
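
For illustration, a minimal sketch of the kind of defensive handling this implies;
the class, the helper method, and the null-return condition are placeholders, not
the actual RemoteTierConsumerAgent code or the actual fix.

{code:java}
// Illustrative sketch only; not the real RemoteTierConsumerAgent.
import java.util.Optional;

import org.apache.flink.runtime.io.network.buffer.Buffer;

class RemoteReadSketch {

    Optional<Buffer> getNextBuffer() {
        // The remote read may legitimately return null, e.g. when the requested
        // segment is not yet available in remote storage. Wrapping the result
        // avoids dereferencing null further up the call chain.
        Buffer buffer = readBufferFromRemote();
        return Optional.ofNullable(buffer);
    }

    private Buffer readBufferFromRemote() {
        return null; // placeholder for reading from the remote tier
    }
}
{code}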



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33044) Reduce the frequency of triggering flush for the disk tier of the tiered storage

2023-09-06 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33044:
-

 Summary: Reduce the frequency of triggering flush for the disk 
tier of the tiered storage
 Key: FLINK-33044
 URL: https://issues.apache.org/jira/browse/FLINK-33044
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.18.0
Reporter: Yuxin Tan
Assignee: Yuxin Tan


The disk cache of tiered storage flushes at the end of each subpartition's 
segment, which is too frequent and hurts performance. We should improve this 
with a better flushing strategy, e.g. flushing buffers in batches.
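
A minimal sketch of the batching idea, with hypothetical names and an assumed batch
threshold; the actual disk tier cache and its thresholds may look quite different.

{code:java}
// Sketch only: names and the threshold are assumptions, not the real disk tier code.
import java.util.ArrayList;
import java.util.List;

class BatchedFlusherSketch {

    private static final int FLUSH_BATCH_NUM_BUFFERS = 128; // assumed threshold

    private final List<byte[]> pendingBuffers = new ArrayList<>();

    /** Called at each segment boundary; flushes only once enough buffers have piled up. */
    void onSegmentFinished(List<byte[]> segmentBuffers) {
        pendingBuffers.addAll(segmentBuffers);
        if (pendingBuffers.size() >= FLUSH_BATCH_NUM_BUFFERS) {
            flush();
        }
    }

    void flush() {
        // Write all pending buffers to the partition file in one IO operation,
        // instead of one flush per subpartition segment.
        pendingBuffers.clear();
    }
}
{code}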



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32779) Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage

2023-08-23 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-32779:
--
Description: 
This ticket aims to verify https://issues.apache.org/jira/browse/FLINK-31634.

This verification mainly contains two parts.

Part 1. Run without remote storage.
This part verifies that the new mode can use the Memory tier and the Disk 
tier dynamically when shuffling.
Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
ALL_EXCHANGES_HYBRID_SELECTIVE) and run a simple job, for example tpcds 
q1.sql. When resources are sufficient, the upstream and the downstream can 
run at the same time.

Part 2. Run with remote storage.
This part verifies that the new mode can use the Memory tier, the Disk tier, 
and the Remote tier dynamically when shuffling.
   2.1 Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
ALL_EXCHANGES_HYBRID_SELECTIVE).
   2.2 Set the remote storage path with the 
option (taskmanager.network.hybrid-shuffle.remote.path: 
oss://flink-runtime/runtime/shuffle; note that the path 
oss://flink-runtime/runtime/shuffle must already exist in OSS).
   2.3 Modify the 
option TieredStorageConfiguration#DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION to 
1, compile the package, then run a simple job, for example tpcds q1.sql. 
Check that the shuffle data is written to the remote storage under the path 
oss://flink-runtime/runtime/shuffle.
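
For reference, a minimal sketch of setting the two options above programmatically
instead of in flink-conf.yaml; the option keys and the OSS path are the ones from
this ticket, while the class name and the local-environment setup are illustrative
only.

{code:java}
// Sketch only; submit the actual test job (e.g. tpcds q1.sql) in place of the comment.
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HybridShuffleRemoteStorageExample {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setString("execution.batch-shuffle-mode", "ALL_EXCHANGES_HYBRID_SELECTIVE");
        conf.setString(
                "taskmanager.network.hybrid-shuffle.remote.path",
                "oss://flink-runtime/runtime/shuffle"); // path must already exist in OSS

        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(conf);
        // Build and execute the test job here, then check that shuffle data
        // appears under oss://flink-runtime/runtime/shuffle.
    }
}
{code}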


> Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage
> 
>
> Key: FLINK-32779
> URL: https://issues.apache.org/jira/browse/FLINK-32779
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
>
> This ticket aims to verify https://issues.apache.org/jira/browse/FLINK-31634.
> This verification mainly contains two parts.
> Part 1. Run without remote storage.
> This part verifies that the new mode can use the Memory tier and the Disk 
> tier dynamically when shuffling.
> Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
> ALL_EXCHANGES_HYBRID_SELECTIVE) and run a simple job, for example tpcds 
> q1.sql. When resources are sufficient, the upstream and the downstream 
> can run at the same time.
> Part 2. Run with remote storage.
> This part verifies that the new mode can use the Memory tier, the Disk 
> tier, and the Remote tier dynamically when shuffling.
>    2.1 Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
> ALL_EXCHANGES_HYBRID_SELECTIVE).
>    2.2 Set the remote storage path with the 
> option (taskmanager.network.hybrid-shuffle.remote.path: 
> oss://flink-runtime/runtime/shuffle; note that the path 
> oss://flink-runtime/runtime/shuffle must already exist in OSS).
>    2.3 Modify the 
> option TieredStorageConfiguration#DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION to 
> 1, compile the package, then run a simple job, for example tpcds q1.sql. 
> Check that the shuffle data is written to the remote storage under the path 
> oss://flink-runtime/runtime/shuffle.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32779) Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage

2023-08-23 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-32779:
-

Assignee: Weijie Guo  (was: Weijie Tong)

> Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage
> 
>
> Key: FLINK-32779
> URL: https://issues.apache.org/jira/browse/FLINK-32779
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Assignee: Weijie Guo
>Priority: Major
> Fix For: 1.18.0
>
>
> This ticket aims to verify https://issues.apache.org/jira/browse/FLINK-31634.
> This verification mainly contains two parts.
> Part 1. Run without remote storage.
> This part verifies that the new mode can use the Memory tier and the Disk 
> tier dynamically when shuffling.
> Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
> ALL_EXCHANGES_HYBRID_SELECTIVE) and run a simple job, for example tpcds 
> q1.sql. When resources are sufficient, the upstream and the downstream 
> can run at the same time.
> Part 2. Run with remote storage.
> This part verifies that the new mode can use the Memory tier, the Disk 
> tier, and the Remote tier dynamically when shuffling.
>    2.1 Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
> ALL_EXCHANGES_HYBRID_SELECTIVE).
>    2.2 Set the remote storage path with the 
> option (taskmanager.network.hybrid-shuffle.remote.path: 
> oss://flink-runtime/runtime/shuffle; note that the path 
> oss://flink-runtime/runtime/shuffle must already exist in OSS).
>    2.3 Modify the 
> option TieredStorageConfiguration#DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION to 
> 1, compile the package, then run a simple job, for example tpcds q1.sql. 
> Check that the shuffle data is written to the remote storage under the path 
> oss://flink-runtime/runtime/shuffle.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32779) Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage

2023-08-23 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan reassigned FLINK-32779:
-

Assignee: Weijie Tong

> Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage
> 
>
> Key: FLINK-32779
> URL: https://issues.apache.org/jira/browse/FLINK-32779
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Assignee: Weijie Tong
>Priority: Major
> Fix For: 1.18.0
>
>
> This ticket aims to verify https://issues.apache.org/jira/browse/FLINK-31634.
> This verification mainly contains two parts.
> Part 1. Run without remote storage.
> This part verifies that the new mode can use the Memory tier and the Disk 
> tier dynamically when shuffling.
> Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
> ALL_EXCHANGES_HYBRID_SELECTIVE) and run a simple job, for example tpcds 
> q1.sql. When resources are sufficient, the upstream and the downstream 
> can run at the same time.
> Part 2. Run with remote storage.
> This part verifies that the new mode can use the Memory tier, the Disk 
> tier, and the Remote tier dynamically when shuffling.
>    2.1 Set the mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: 
> ALL_EXCHANGES_HYBRID_SELECTIVE).
>    2.2 Set the remote storage path with the 
> option (taskmanager.network.hybrid-shuffle.remote.path: 
> oss://flink-runtime/runtime/shuffle; note that the path 
> oss://flink-runtime/runtime/shuffle must already exist in OSS).
>    2.3 Modify the 
> option TieredStorageConfiguration#DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION to 
> 1, compile the package, then run a simple job, for example tpcds q1.sql. 
> Check that the shuffle data is written to the remote storage under the path 
> oss://flink-runtime/runtime/shuffle.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32779) Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage

2023-08-22 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757308#comment-17757308
 ] 

Yuxin Tan commented on FLINK-32779:
---

[~renqs] Thanks for adding the tests. I will add some instructions and drive 
the tests.

> Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage
> 
>
> Key: FLINK-32779
> URL: https://issues.apache.org/jira/browse/FLINK-32779
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32870) Reading multiple small buffers by reading and slicing one large buffer for tiered storage

2023-08-22 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-32870:
--
Description: 
Currently, when the file reader of tiered storage loads data from the disk 
file, it reads data at buffer granularity. Before compression, each buffer is 
32 KB by default. After compression, the size becomes smaller (possibly less than 
5 KB), which is quite small for the network buffer and the file IO. 
We should read multiple small buffers by reading and slicing one large buffer 
to decrease buffer competition and file IO, leading to better 
performance.
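
As an illustration of the slicing approach described above, a minimal sketch using
plain NIO types; the single-read assumption, the buffer sizes, and all names are
illustrative and not the actual tiered storage reader.

{code:java}
// Sketch only: a production reader would loop until all bytes are read and would
// work on Flink network buffers rather than plain ByteBuffers.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

class SliceReaderSketch {

    /** Reads {@code totalBytes} with one IO call and slices the result into per-buffer views. */
    static List<ByteBuffer> readAndSlice(
            FileChannel channel, long position, int totalBytes, int[] bufferSizes)
            throws IOException {
        ByteBuffer large = ByteBuffer.allocateDirect(totalBytes);
        channel.read(large, position); // one large read instead of many small ones
        large.flip();

        List<ByteBuffer> slices = new ArrayList<>();
        for (int size : bufferSizes) {
            ByteBuffer slice = large.slice(); // view starting at the current position
            slice.limit(size);
            slices.add(slice);
            large.position(large.position() + size); // advance past this buffer
        }
        return slices;
    }
}
{code}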

  was:
Currently, when the file reader of tiered storage loads data from the disk 
file, it reads data in buffer granularity. Before compression, each buffer is 
32K by default, after compression the size will become smaller (may less than 
5K), which is pretty small for the network buffer and the file IO. 
We should merge the multiple small buffers into a larger one to decrease the 
buffer competition and the file IO, leading to better performance.


> Reading multiple small buffers by reading and slicing one large buffer for 
> tiered storage
> -
>
> Key: FLINK-32870
> URL: https://issues.apache.org/jira/browse/FLINK-32870
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> Currently, when the file reader of tiered storage loads data from the disk 
> file, it reads data at buffer granularity. Before compression, each buffer is 
> 32 KB by default. After compression, the size becomes smaller (possibly less than 
> 5 KB), which is quite small for the network buffer and the file IO. 
> We should read multiple small buffers by reading and slicing one large buffer 
> to decrease buffer competition and file IO, leading to better 
> performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32870) Reading multiple small buffers by reading and slicing one large buffer for tiered storage

2023-08-22 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-32870:
--
Summary: Reading multiple small buffers by reading and slicing one large 
buffer for tiered storage  (was: Merge multiple small buffers into one large 
buffer when loading data for tiered storage)

> Reading multiple small buffers by reading and slicing one large buffer for 
> tiered storage
> -
>
> Key: FLINK-32870
> URL: https://issues.apache.org/jira/browse/FLINK-32870
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>
> Currently, when the file reader of tiered storage loads data from the disk 
> file, it reads data in buffer granularity. Before compression, each buffer is 
> 32K by default, after compression the size will become smaller (may less than 
> 5K), which is pretty small for the network buffer and the file IO. 
> We should merge the multiple small buffers into a larger one to decrease the 
> buffer competition and the file IO, leading to better performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


  1   2   3   4   >