Re: Saga, TCC overhead

赵俊 Thu, 01 Nov 2018 20:49:02 -0700

Hi Zheng

I think Willem has designed recovery mechanisms like the XA ones we discussed 
last week.
Omega can also write recent transactions to the file system, so 
when Omega restarts we can compare the Omega transaction log with the Alpha 
transaction events to do the recovery.
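
As a rough illustration of that comparison, here is a minimal Java sketch. 
Everything in it is hypothetical: the class name, the "globalTxId,status" 
record format and the method names are assumptions made for this example, 
not existing Omega or Alpha code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Omega or Alpha code base.
class OmegaRecoverySketch {

    // Append one "globalTxId,status" record to Omega's local transaction log.
    static void append(Path log, String globalTxId, String status) throws IOException {
        String line = globalTxId + "," + status + System.lineSeparator();
        Files.write(log, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Replay the log after a restart; the last record for each transaction wins.
    static Map<String, String> load(Path log) throws IOException {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String line : Files.readAllLines(log, StandardCharsets.UTF_8)) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                latest.put(parts[0], parts[1]);
            }
        }
        return latest;
    }

    // Return the transactions whose local status is missing from, or differs
    // from, the status Alpha has recorded. These are the recovery candidates.
    static List<String> findMismatches(Map<String, String> omegaLog,
                                       Map<String, String> alphaEvents) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, String> entry : omegaLog.entrySet()) {
            if (!entry.getValue().equals(alphaEvents.get(entry.getKey()))) {
                mismatches.add(entry.getKey());
            }
        }
        return mismatches;
    }
}
```

On restart, Omega would call load() on its log file, fetch the matching 
events from Alpha (that query is not shown, since its interface is still 
under discussion in this thread), and re-send or compensate whatever 
findMismatches() returns.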


> On Nov 1, 2018, at 5:33 PM, Willem Jiang <willem.ji...@gmail.com> wrote:
> 
> If we expose the transaction query interface between Omega and
> Alpha, we need to let Omega store all the transactions; maybe we
> just need to let Omega store the most recent transaction status.
> 
> Willem Jiang
> 
> Twitter: willemjiang
> Weibo: 姜宁willem
> 
> On Thu, Nov 1, 2018 at 3:37 PM Zheng Feng <zh.f...@gmail.com> wrote:
>> 
>> OK, it makes sense to have such a status query interface. Should this be 
>> introduced through the Java annotations?
>> 
>> On Thu, Nov 1, 2018 at 3:27 PM Willem Jiang <willem.ji...@gmail.com> wrote:
>>> 
>>> Hi Zheng,
>>> 
>>> If Omega fails to send the Saga event to Alpha, Alpha could
>>> think the transaction has failed and start the compensation process.
>>> To avoid this kind of situation, Omega needs to provide a query
>>> interface so that Alpha can verify the state of the transaction.
>>> 
>>> If we let the Omega (transaction initiator) know about the timeout,
>>> Omega can help Alpha do more, since the transaction initiator
>>> Omega knows the invocation status of all the other services. In this way
>>> Alpha just needs to deal with the case where the transaction initiator
>>> Omega crashes.
>>> 
>>> 
>>> Willem Jiang
>>> 
>>> Twitter: willemjiang
>>> Weibo: 姜宁willem
>>> 
>>> On Thu, Nov 1, 2018 at 12:13 PM Zheng Feng <zh.f...@gmail.com> wrote:
>>>> 
>>>> +1. I think we had some implementation code before that used the 
>>>> Executors. @Willem Jiang can you recall why we moved the timeout handling to 
>>>> the Alpha server? Is there any particular reason? I think it may be one 
>>>> of the following:
>>>> 1) Omega fails to send the cancel message when the timeout happens, due to 
>>>> a network error? Then the compensation will not happen.
>>>> 2) Omega crashes and cannot recover to send the message when it 
>>>> restarts?
>>>> 
>>>> On Thu, Nov 1, 2018 at 11:24 AM Willem Jiang <willem.ji...@gmail.com> wrote:
>>>>> 
>>>>> +1 to using a POC to show us the facts :)
>>>>> Now I'm thinking of making Omega smarter[1] by letting it do the timeout
>>>>> monitoring itself, to reduce the complexity of Alpha.
>>>>> In this way Alpha just needs to store the messages and respond to the
>>>>> requests from Omega.
>>>>> 
>>>>> [1]https://issues.apache.org/jira/browse/SCB-1000
>>>>> 
>>>>> Willem Jiang
>>>>> 
>>>>> Twitter: willemjiang
>>>>> Weibo: 姜宁willem
>>>>> On Thu, Nov 1, 2018 at 11:13 AM 赵俊 <zhaoju...@jd.com> wrote:
>>>>>> 
>>>>>> We can write a simple demo to prove that reactive or plain Netty can 
>>>>>> improve throughput with the Omega/Alpha architecture.
>>>>>> 
>>>>>> 
>>>>>>> On Nov 1, 2018, at 8:29 AM, Willem Jiang <willem.ji...@gmail.com> wrote:
>>>>>>> 
>>>>>>> I'm thinking of using actors to do the reactive work, but it looks like we
>>>>>>> could make Alpha simpler by implementing some logic on the Omega
>>>>>>> side, such as the timeout function.
>>>>>>> 
>>>>>>> Willem Jiang
>>>>>>> 
>>>>>>> Twitter: willemjiang
>>>>>>> Weibo: 姜宁willem
>>>>>>> 
>>>>>>> On Thu, Nov 1, 2018 at 1:57 AM wjm wjm <zzz...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> async is not enough, better to be reactive.
>>>>>>>> 
>>>>>>>> On Wed, Oct 31, 2018 at 5:07 PM 赵俊 <zhaoju...@jd.com> wrote:
>>>>>>>> 
>>>>>>>>> Hi, Willem
>>>>>>>>> 
>>>>>>>>> I think making the last invocation async leaves only limited room for 
>>>>>>>>> performance tuning,
>>>>>>>>> as a blocking gRPC invocation also works asynchronously internally and
>>>>>>>>> only blocks in futureTask.get().
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Oct 30, 2018, at 4:51 PM, Willem Jiang <willem.ji...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Thanks for the feedback.
>>>>>>>>>> I used just one participant to show the simplest way of service
>>>>>>>>>> interaction.
>>>>>>>>>> I have just added some words about the "initial service" and the
>>>>>>>>>> "participant service".
>>>>>>>>>> 
>>>>>>>>>> Now we could think about how to reduce the overhead of the
>>>>>>>>>> distributed transaction.  I think we can make the last invocation
>>>>>>>>>> async to speed up the processing, but it could be a challenge for us
>>>>>>>>>> to leverage async remote invocation without introducing the risk of
>>>>>>>>>> losing messages.
>>>>>>>>>> 
>>>>>>>>>> Any thoughts?
>>>>>>>>>> 
>>>>>>>>>> Willem Jiang
>>>>>>>>>> 
>>>>>>>>>> Twitter: willemjiang
>>>>>>>>>> Weibo: 姜宁willem
>>>>>>>>>> On Tue, Oct 30, 2018 at 4:37 PM Zheng Feng <zh.f...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Great work! It would be clearer if you marked the invocation arrows
>>>>>>>>>>> with the step numbers. Also, a distributed transaction usually has two
>>>>>>>>>>> or more participants,
>>>>>>>>>>> so you need to improve the sequence diagram to show these actors.
>>>>>>>>>>> 
>>>>>>>>>>> It would also be helpful to describe what the "initial service" and the
>>>>>>>>>>> "participant service" are.
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Oct 30, 2018 at 4:23 PM Willem Jiang <willem.ji...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi Team,
>>>>>>>>>>>> 
>>>>>>>>>>>> I wrote a page[1] to analyze the overheads that Saga or TCC could
>>>>>>>>>>>> introduce.
>>>>>>>>>>>> Please check it out and let me know what you think.
>>>>>>>>>>>> You can either reply to this mail or just add a comment on the wiki 
>>>>>>>>>>>> page.
>>>>>>>>>>>> 
>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/SERVICECOMB/Distributed+Transaction+Coordinator+Overhead
>>>>>>>>>>>> 
>>>>>>>>>>>> Willem Jiang
>>>>>>>>>>>> 
>>>>>>>>>>>> Twitter: willemjiang
>>>>>>>>>>>> Weibo: 姜宁willem
>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>
