[ 
https://issues.apache.org/jira/browse/FLINK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717456#comment-17717456
 ] 

Weijie Guo commented on FLINK-31080:
------------------------------------

Thanks for picking this up. I didn't fix it right away because I found that 
this code path (releasing idle slots under the adaptive scheduler) should 
never have been triggered before. After FLINK-31399 it can be triggered, so 
let's fix it now.

> I think we need to put timestamp maintenance inside the declarativeSlotPool 
> to maintain uniform timestamp semantics.

Yes, ideally they should share the same clock, and we can also pass the clock 
to the {{Scheduler}}.

> If you are fine with it, can I share the patch in flink-runtime?

Sure, but I'm leaning toward unifying the clocks between {{SlotPoolService}} 
and {{Scheduler}}, like Weihua said. What do you think?
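
A minimal sketch of the shared-clock idea, assuming a plain {{LongSupplier}} as the clock. The class names {{Allocator}} and {{Pool}} are hypothetical stand-ins, not the actual Flink classes, and the timeout check is an assumed form of the comparison:

```java
import java.util.function.LongSupplier;

public class SharedClockSketch {

    /** Hypothetical stand-in for the component that frees reserved slots. */
    public static final class Allocator {
        private final LongSupplier relativeTimeMillis;

        public Allocator(LongSupplier relativeTimeMillis) {
            this.relativeTimeMillis = relativeTimeMillis;
        }

        /** Returns the timestamp recorded when a slot becomes idle. */
        public long freeSlot() {
            return relativeTimeMillis.getAsLong();
        }
    }

    /** Hypothetical stand-in for the pool that checks the idle timeout. */
    public static final class Pool {
        private final LongSupplier relativeTimeMillis;
        private final long idleTimeoutMillis;

        public Pool(LongSupplier relativeTimeMillis, long idleTimeoutMillis) {
            this.relativeTimeMillis = relativeTimeMillis;
            this.idleTimeoutMillis = idleTimeoutMillis;
        }

        /** Assumed check: release once the slot has been idle for the timeout. */
        public boolean shouldRelease(long idleSinceMillis) {
            return relativeTimeMillis.getAsLong() - idleSinceMillis >= idleTimeoutMillis;
        }
    }

    public static void main(String[] args) {
        // One monotonic clock instance handed to both components.
        LongSupplier clock = () -> System.nanoTime() / 1_000_000;
        Allocator allocator = new Allocator(clock);
        Pool pool = new Pool(clock, 50_000L);

        long idleSince = allocator.freeSlot();
        System.out.println("release now? " + pool.shouldRelease(idleSince));
    }
}
```

Because both sides read the same supplier, the timestamp written on free and the timestamp read during the timeout check are always on the same time base.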

> Idle slots are not released due to a mismatch in time between 
> DeclarativeSlotPoolService and SlotSharingSlotAllocator
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-31080
>                 URL: https://issues.apache.org/jira/browse/FLINK-31080
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.17.0, 1.16.1
>            Reporter: Prabhu Joseph
>            Assignee: Weijie Guo
>            Priority: Major
>              Labels: pull-request-available
>
> Due to a timing mismatch between {{DeclarativeSlotPoolService}} and 
> {{SlotSharingSlotAllocator}}, idle slots are not released.
> {{DeclarativeSlotPoolService}} uses {{SystemClock#relativeTimeMillis}}, 
> i.e., {{System.nanoTime() / 1_000_000}}, when offering a slot, whereas 
> {{SlotSharingSlotAllocator}} uses {{System.currentTimeMillis()}} when 
> freeing the reserved slot. 
> The idle timeout check wrongly fails because {{System.currentTimeMillis()}} 
> returns a far larger value than {{SystemClock#relativeTimeMillis}}.
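
To illustrate the mismatch described above, here is a minimal sketch with made-up timestamps. The method {{isIdleTimedOut}}, the 50 second timeout, and all numeric values are assumptions chosen for illustration, not Flink's actual code or configuration:

```java
public class ClockMismatchDemo {

    /** Assumed form of the idle check: time out once (now - idleSince) >= timeout. */
    public static boolean isIdleTimedOut(long idleSinceMillis, long nowMillis, long timeoutMillis) {
        return nowMillis - idleSinceMillis >= timeoutMillis;
    }

    public static void main(String[] args) {
        long timeoutMillis = 50_000L;

        // Illustrative values only: a wall-clock reading (epoch based) and a
        // monotonic reading (arbitrary origin) taken at the same instant.
        long wallClockAtFree = 1_683_000_000_000L;
        long relativeAtFree = 3_600_000L;
        long elapsedMillis = 60_000L; // the slot has been idle longer than the timeout

        // Mismatch: idle-since stamped with the wall clock, checked against the
        // relative clock. The difference is hugely negative, so it never times out.
        boolean mixed = isIdleTimedOut(
                wallClockAtFree, relativeAtFree + elapsedMillis, timeoutMillis);

        // Same time base on both sides: the check behaves as intended.
        boolean consistent = isIdleTimedOut(
                relativeAtFree, relativeAtFree + elapsedMillis, timeoutMillis);

        System.out.println("mixed clocks timed out:      " + mixed);
        System.out.println("consistent clocks timed out: " + consistent);
    }
}
```

Note that {{System.nanoTime()}} has an arbitrary origin, so the exact size of the gap varies per JVM. The point is only that the two readings are not comparable.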



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
