Normally, if Omega hits an exception that it cannot handle itself, we just let the exception propagate, so the application can decide whether it is OK to continue processing. For the TCCEnd event, it's important that Alpha is notified; otherwise Alpha could think the transaction has timed out. It would be better if we introduced some kind of transaction lookup mechanism that lets Alpha query the state of a transaction from Omega.
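To make the lookup idea a bit more concrete, here is a rough sketch of how it might work. All of the names here (TxState, Action, OmegaStateRegistry, resolveOnTimeout) are made up for illustration and are not existing Saga APIs, the transport between Alpha and Omega is left out, and the mapping from state to action is only one possible policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TxLookupSketch {
  // Hypothetical transaction states as seen by Omega.
  enum TxState { TRYING, CONFIRMED, CANCELLED, UNKNOWN }

  // Hypothetical decisions Alpha could take after a lookup.
  enum Action { MARK_ENDED, KEEP_WAITING, TREAT_AS_TIMEOUT }

  // Omega side: remembers the last known state of each global transaction.
  static class OmegaStateRegistry {
    private final Map<String, TxState> states = new ConcurrentHashMap<>();

    void record(String globalTxId, TxState state) {
      states.put(globalTxId, state);
    }

    // The query Alpha would issue, over whatever transport is chosen.
    TxState lookup(String globalTxId) {
      return states.getOrDefault(globalTxId, TxState.UNKNOWN);
    }
  }

  // Alpha side: called when no TCCEnd event arrived before the timeout.
  static Action resolveOnTimeout(OmegaStateRegistry omega, String globalTxId) {
    switch (omega.lookup(globalTxId)) {
      case CONFIRMED:
      case CANCELLED:
        // Omega already finished; only the TCCEnd event was lost.
        return Action.MARK_ENDED;
      case TRYING:
        // Omega is still working, so the transaction is not really stuck.
        return Action.KEEP_WAITING;
      default:
        // Omega does not know the transaction; fall back to timeout handling.
        return Action.TREAT_AS_TIMEOUT;
    }
  }

  public static void main(String[] args) {
    OmegaStateRegistry omega = new OmegaStateRegistry();
    omega.record("tx-1", TxState.CONFIRMED);
    System.out.println(resolveOnTimeout(omega, "tx-1")); // prints MARK_ENDED
    System.out.println(resolveOnTimeout(omega, "tx-2")); // prints TREAT_AS_TIMEOUT
  }
}
```

With something like this, a lost TCCEnd event would no longer be mistaken for a timeout, because Alpha asks Omega before it decides.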
For the event scanner, it could be much easier to resolve the concurrency issue by introducing a sharding algorithm.

Willem Jiang

Twitter: willemjiang
Weibo: 姜宁willem

On Tue, Sep 11, 2018 at 2:46 PM cherrylzhao <[email protected]> wrote:
>
> Hi, all
>
> I have faced some HA issue when implementing TCC workflow, this is our design
> document [1].
> HA issue is following.
>
> 1. Omega finished try logic, but sending participate event failed, how about
> the retry mechanisms should we design?
> 2. Omega finished try logic, but sending participate event failed, should
> omega invoke cancel method automatically?
> 3. Omega finished try logic, alpha received participate event and persistence
> success, but sending ACK to omega failed,
> should alpha do rollback automatically, when omega received failed
> feedback, also invoke cancel method automatically?
> 4. When sending TCCEnd event to alpha failed, how about alpha do compensation
> recovery?
> Maybe event scanner is necessary for this scenario to do recovery, but we
> need to do detail design for this.
> 5. If we introduce event scanner, how to handle concurrency from TCCEnd
> command?
>
> Please feel free to give some advices.
>
> [1]
> https://github.com/apache/incubator-servicecomb-saga/blob/master/docs/design.md#workflow-tcc
>
> Best Wishes & Regards
