Hi Jingsong,
Good point!
1. If it doesn't matter which task performs the finalize work, then I think
task-0 suggested by Jark is a very good solution.
2. If it requires the last finished task to perform the finalize work, then we
have to consider other solutions.
WRT fault-tolerant of StreamingRuntimeContext#getGlobalAggregateManager,
AFAIK, there is no built-in support.
1) Regarding to TM failover, I think it's not a problem. We can use an
accumulator i.e. finish_count and it is increased by 1 when a sub-task is
finished(i.e. close() method is called).
When finish_count == RuntimeContext.getNumberOfParallelSubtasks() for
some sub-task, then we can know that it's the last finished sub-task. This
holds true even in case of TM failover.
2) Regarding to JM failover, I have no idea how to work around it so far.
Maybe @Jamie Grier who is the author of this feature could share more thoughts.
Not sure if there is already solution/plan to support JM failover or this
feature is not designed for this kind of use case?
Regards,
Dian
> 在 2019年9月9日,下午3:08,shimin yang <[email protected]> 写道:
>
> Hi Jingsong,
>
> Although it would be nice if the accumulators in GlobalAggregateManager is
> fault-tolerant, we could still take advantage of managed state to guarantee
> the semantic and use the accumulators to implement distributed barrier or
> lock to solve the distributed access problem.
>
> Best,
> Shimin
>
> JingsongLee <[email protected]> 于2019年9月9日周一 下午1:33写道:
>
>> Thanks jark and dian:
>> 1.jark's approach: do the work in task-0. Simple way.
>> 2.dian's approach: use StreamingRuntimeContext#getGlobalAggregateManager
>> Can do more operation. But these accumulators are not fault-tolerant?
>>
>> Best,
>> Jingsong Lee
>>
>>
>> ------------------------------------------------------------------
>> From:shimin yang <[email protected]>
>> Send Time:2019年9月6日(星期五) 15:21
>> To:dev <[email protected]>
>> Subject:Re: [DISCUSS] Support notifyOnMaster for notifyCheckpointComplete
>>
>> Hi Fu,
>>
>> That'll be nice.
>>
>> Thanks.
>>
>> Best,
>> Shimin
>>
>> Dian Fu <[email protected]> 于2019年9月6日周五 下午3:17写道:
>>
>>> Hi Shimin,
>>>
>>> It can be guaranteed to be an atomic operation. This is ensured by the
>> RPC
>>> framework. You could take a look at RpcEndpoint for more details.
>>>
>>> Regards,
>>> Dian
>>>
>>>> 在 2019年9月6日,下午2:35,shimin yang <[email protected]> 写道:
>>>>
>>>> Hi Fu,
>>>>
>>>> Thank you for the remind. I think it would work in my case as long as
>>> it's
>>>> an atomic operation.
>>>>
>>>> Dian Fu <[email protected]> 于2019年9月6日周五 下午2:22写道:
>>>>
>>>>> Hi Jingsong,
>>>>>
>>>>> Thanks for bring up this discussion. You can try to look at the
>>>>> GlobalAggregateManager to see if it can meet your requirements. It can
>>> be
>>>>> got via StreamingRuntimeContext#getGlobalAggregateManager().
>>>>>
>>>>> Regards,
>>>>> Dian
>>>>>
>>>>>> 在 2019年9月6日,下午1:39,shimin yang <[email protected]> 写道:
>>>>>>
>>>>>> Hi Jingsong,
>>>>>>
>>>>>> Big fan of this idea. We faced the same problem and resolved by
>> adding
>>> a
>>>>>> distributed lock. It would be nice to have this feature in JobMaster,
>>>>> which
>>>>>> can replace the lock.
>>>>>>
>>>>>> Best,
>>>>>> Shimin
>>>>>>
>>>>>> JingsongLee <[email protected]> 于2019年9月6日周五
>> 下午12:20写道:
>>>>>>
>>>>>>> Hi devs:
>>>>>>>
>>>>>>> I try to implement streaming file sink for table[1] like
>>>>> StreamingFileSink.
>>>>>>> If the underlying is a HiveFormat, or a format that updates
>> visibility
>>>>>>> through a metaStore, I have to update the metaStore in the
>>>>>>> notifyCheckpointComplete, but this operation occurs on the task
>> side,
>>>>>>> which will lead to distributed access to the metaStore, which will
>>>>>>> lead to bottleneck.
>>>>>>>
>>>>>>> So I'm curious if we can support notifyOnMaster for
>>>>>>> notifyCheckpointComplete like FinalizeOnMaster.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>> [1]
>>>>>>>
>>>>>
>>>
>> https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
>>>>>>>
>>>>>>> Best,
>>>>>>> Jingsong Lee
>>>>>
>>>>>
>>>
>>>
>>