Hi All,

I have completed the acceptance test for the state machine and pushed it to branch SCB-1321, and CI passes. See more feature progress here [1].

In the acceptance test, timeout handling differs from the previous one. When a timeout occurs, the transaction enters the suspended state, because we cannot be sure whether the sub-transactions have completed or when they will complete. For example, when booking times out, we do not know the execution status of car or hotel. If car or hotel sends TxEndedEvent after compensation, it will not be compensated.

Alpha
[x] State machine design document
[x] State machine prototype
[x] State machine prototype unit test
[x] Receive saga events using the internal message bus
[x] State machine integration test
[x] Enable state machine support via parameters
[x] Verify Akka persistence
[ ] Verify Akka cluster reliability
[ ] Save the terminated transaction data to the database
[ ] Support for in-process nested global transactions
[ ] Support for cross-process nested global transactions
[ ] Support querying terminated transaction data via RESTful API
[ ] Support querying running transaction data via RESTful API
[ ] Support querying suspended global transactions via RESTful API
[ ] Support compensating failed sub-transactions via RESTful API

Omega Components
[x] Enable state machine support via parameters
[x] State machine calls omega side compensation
[x] @SagaStart supports thread termination after the timeout

Alpha & Omega
[x] Acceptance-pack-akka-spring-demo pass
[ ] Add sub-transaction timeout exception for akka acceptance test
[ ] Add compensation failure for akka acceptance test
[ ] Add compensation retry success for akka acceptance test
[ ] Alpha single node benchmark performance test
[ ] Alpha cluster benchmark performance test

Tools
[ ] Alpha Benchmark tools

Do Next:
1. State machine metrics collection
2. Alpha Benchmark tools
3. Single alpha benchmark performance test
4. Verify Akka cluster reliability

[1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm

Lei Zhang

> On Jun 28, 2019, at 5:50 PM, Zhang Lei <[email protected]> wrote:
>
> Hi, All
>
> alpha-fsm has been pushed to the branch SCB-1321
>
> Completed:
> 1. State machine design document[1]
> 2. State machine prototype
> 3. State machine test case
> 4. Receive saga events using the internal message bus
>
> Key emphasis of next stage in work:
> In order to carry out the feasibility verification as soon as possible, I
> will not consider the reliability issue for the time being.
> 1. Refactor Omega components, add SagaAbortedEvent, SagaTimeoutEvent,
> TxComponsitedEvent
> 2. Save compensation method parameters in Actor and trigger compensation in
> Actor
> 3. Do not use Kafka and only verify single node alpha. The Alpha server
> receives the saga event and puts it into the internal message bus.
>
> Planning:
> 1. Persist actor data to the database when it terminates
> 2. Integrate Kafka
> 3. Support WAL[2] recovery mode
> 4. Verify Akka cluster reliability
>
> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
> [2] https://en.wikipedia.org/wiki/Write-ahead_logging
>
> If you have other comments, please let us know.
>
> Thanks,
> Lei Zhang
>
>> On Jun 27, 2019, at 9:50 AM, Willem Jiang <[email protected]> wrote:
>>
>> We just leverage the message broker to make sure Alpha gets the
>> transaction event from Omega.
>> In most cases Alpha doesn't need to talk back to Omega, we just need to
>> make sure all the transaction messages are stored (Alpha can process them
>> later).
>>
>> If Omega cannot talk to the message broker, Omega should abort the
>> transaction processing with a transport exception.
>>
>> Willem Jiang
>>
>> Twitter: willemjiang
>> Weibo: 姜宁willem
>>
>> On Tue, Jun 25, 2019 at 8:42 AM Zhang Lei <[email protected]> wrote:
>>>
>>> Hi, Zhao Jun
>>>
>>>> I just cared about the recovery scan thread design.
>>>> Kafka can ensure event message can be consumed by alpha exactly, but
>>>> recovery need know all the participated transaction response to decide
>>>> rollback or commit, so I think scan thread is also necessary.
>>>
>>> I am not sure, but I think Akka's persistence can solve the problem you
>>> care about.
>>> Of course, this ability needs to be verified.
>>>
>>> Thanks,
>>> Zhang Lei
>>>
>>>> On Jun 24, 2019, at 10:46 AM, 赵俊 <[email protected]> wrote:
>>>>
>>>> Hi, Zhang Lei
>>>>
>>>>> A1 : I think we only need to ensure that the message can be reliably
>>>>> delivered to the state machine. The state machine is only a synchronous
>>>>> record state transition when the transaction is executed normally. At
>>>>> present, the compensation method based on table scan is also
>>>>> asynchronous. I am not sure if I have answered your question, or you can
>>>>> give me more information.
>>>>
>>>> If we have a mechanism that ensures the main service can collect all the
>>>> participated transaction responses from alpha correctly before
>>>> commit/rollback, it is OK.
>>>>
>>>>> Q2 : Also we should consider about recovery, it seems that recovery is as
>>>>> same as before based on database.
>>>>> A2 : I think the question you care about is how to recover when the alpha
>>>>> is down, this is a little different from the current version.
>>>>> 1. We can base on Kafka's reliability and control the offset of the
>>>>> topic, one message at a time
>>>>> 2. Of course, we can also do some extra design for it, such as logging
>>>>> the data log file locally after receiving the Kafka message. Resume the
>>>>> message by reading the data log file when the alpha machine restarts
>>>>
>>>> I just cared about the recovery scan thread design.
>>>> Kafka can ensure event message can be consumed by alpha exactly, but
>>>> recovery need know all the participated transaction response to decide
>>>> rollback or commit, so I think scan thread is also necessary.
>>>>
>>>>> On Jun 23, 2019, at 1:04 PM, Zhang Lei <[email protected]> wrote:
>>>>>
>>>>> Hi, Zhao Jun
>>>>>
>>>>> Thank you for your reply!
>>>>>
>>>>> This design document does not elaborate on reliability aspects.
>>>>>
>>>>> My initial thought is this
>>>>>
>>>>> Q1 : It seems that omega should hold on after consuming the event message
>>>>> from Kafka instead of completing pushing message
>>>>> A1 : I think we only need to ensure that the message can be reliably
>>>>> delivered to the state machine. The state machine is only a synchronous
>>>>> record state transition when the transaction is executed normally. At
>>>>> present, the compensation method based on table scan is also
>>>>> asynchronous. I am not sure if I have answered your question, or you can
>>>>> give me more information.
>>>>>
>>>>> Q2 : Also we should consider about recovery, it seems that recovery is as
>>>>> same as before based on database.
>>>>> A2 : I think the question you care about is how to recover when the alpha
>>>>> is down, this is a little different from the current version.
>>>>> 1. We can base on Kafka's reliability and control the offset of the
>>>>> topic, one message at a time
>>>>> 2. Of course, we can also do some extra design for it, such as logging
>>>>> the data log file locally after receiving the Kafka message. Resume the
>>>>> message by reading the data log file when the alpha machine restarts
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Lei Zhang
>>>>>
>>>>>> On Jun 23, 2019, at 7:08 AM, zhaojun <[email protected]> wrote:
>>>>>>
>>>>>> I have some questions about the design.
>>>>>> 1. It seems that omega should hold on after consuming the event message
>>>>>> from Kafka instead of completing pushing message.
>>>>>> 2. Also we should consider about recovery, it seems that recovery is as
>>>>>> same as before based on database.
>>>>>>
>>>>>> ------------------
>>>>>> Zhao Jun
>>>>>> Apache Sharding-Sphere & ServiceComb
>>>>>>
>>>>>>> On Jun 21, 2019, at 6:41 PM, Zhang Lei <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have created the alpha-fsm module on branch SCB-1321 and submitted
>>>>>>> the design documentation, state machine prototype and test cases.
>>>>>>> If there is any problem, please let me know.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Lei Zhang
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
>>>>>>>
>>>>>>>> On Jun 20, 2019, at 3:25 PM, Zheng Feng <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Yeah, I think Willem has created one [1] before, and do you mind if I
>>>>>>>> assign this issue to you?
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/SCB-1258
>>>>>>>>
>>>>>>>> On Thu, Jun 20, 2019 at 2:34 PM Zhang Lei <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi, Zheng Feng
>>>>>>>>>
>>>>>>>>> Thanks for your advice, I will create a JIRA first and start with the
>>>>>>>>> design documentation.
>>>>>>>>>
>>>>>>>>> Lei Zhang
>>>>>>>>>
>>>>>>>>>> On Jun 19, 2019, at 8:09 PM, Zheng Feng <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Thanks a lot for sharing this information! I think this state
>>>>>>>>>> machine could be very experimental, so it would be helpful to create
>>>>>>>>>> an experimental branch to add this module but not in the master
>>>>>>>>>> branch.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 19, 2019 at 5:42 PM Zhang Lei <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I have completed some of the design and prototype in my github.
>>>>>>>>>>>
>>>>>>>>>>> In the design document [1] my original idea was that a transaction
>>>>>>>>>>> consisted of a SagaActor and several TxActors, and later TxActor was
>>>>>>>>>>> removed to reduce implementation complexity.
>>>>>>>>>>> I haven't had time to modify the documentation yet, but the
>>>>>>>>>>> SagaActor state machine [2] is up to date.
>>>>>>>>>>> Here you can see the test cases of SagaActor [3]
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://github.com/coolbeevip/playground/tree/master/state_machine_demo/saga-akkafsm
>>>>>>>>>>> [2]
>>>>>>>>>>> https://github.com/coolbeevip/playground/blob/master/state_machine_demo/saga-akkafsm/assets/saga_state_diagram.png
>>>>>>>>>>> [3]
>>>>>>>>>>> https://github.com/coolbeevip/playground/blob/master/state_machine_demo/saga-akkafsm/src/test/java/coolbeevip/playgroud/statemachine/saga/SagaActorTest.java
>>>>>>>>>>>
>>>>>>>>>>> Lei Zhang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On Jun 19, 2019, at 2:34 PM, zhaojun <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> If we use AKKA, how can we design the actors, and how can we
>>>>>>>>>>>> guarantee that omega will receive the messages synchronously?
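
For readers following the timeout discussion in the first message of this thread, here is a minimal sketch of the idea in plain Java. It is not the alpha-fsm code (which is built on Akka); the class name SagaTimeoutSketch, the State and Event enums, and the on() method are all assumptions made up for illustration, and compensation is reduced to a single state change. It only shows that a timeout moves the transaction to a suspended state instead of compensating, and that a late TxEndedEvent does not change that.

```java
// Hypothetical sketch, not the alpha-fsm implementation.
class SagaTimeoutSketch {
  // Assumed, simplified states; the real state machine has more.
  enum State { IDLE, ACTIVE, COMMITTED, COMPENSATED, SUSPENDED }

  // Assumed event names, loosely modeled on the events named in this thread.
  enum Event { SAGA_STARTED, TX_ENDED, SAGA_ENDED, SAGA_ABORTED, SAGA_TIMEOUT }

  private State state = State.IDLE;

  State state() {
    return state;
  }

  // Applies one event and returns the resulting state.
  State on(Event event) {
    switch (state) {
      case IDLE:
        if (event == Event.SAGA_STARTED) {
          state = State.ACTIVE;
        }
        break;
      case ACTIVE:
        if (event == Event.SAGA_ENDED) {
          state = State.COMMITTED;
        } else if (event == Event.SAGA_ABORTED) {
          // Simplification: compensation is modeled as an immediate transition.
          state = State.COMPENSATED;
        } else if (event == Event.SAGA_TIMEOUT) {
          // Sub-transaction status is unknown, so do not compensate; suspend.
          state = State.SUSPENDED;
        }
        // TX_ENDED keeps the saga ACTIVE.
        break;
      default:
        // COMMITTED, COMPENSATED and SUSPENDED ignore further events here,
        // so a late TX_ENDED leaves a suspended saga suspended.
        break;
    }
    return state;
  }

  public static void main(String[] args) {
    SagaTimeoutSketch booking = new SagaTimeoutSketch();
    booking.on(Event.SAGA_STARTED);
    booking.on(Event.SAGA_TIMEOUT); // booking times out
    booking.on(Event.TX_ENDED);     // car or hotel reports late
    System.out.println(booking.state());
  }
}
```

In this sketch a suspended transaction simply stays suspended; resolving it would be the job of something like the RESTful query and compensation operations listed as open items in the checklist.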
