Thanks Lei - the performance results look incredible! It would also be very helpful to describe how Akka improves the performance. I think it may be related to the concurrency of the state machine and the persistence of the transaction status.
Looking forward to the new release of servicecomb-pack!

Zhang Lei <zhang_...@boco.com.cn> wrote on Tue, Jul 9, 2019 at 10:10 PM:

> Hi, all
>
> State machine-based Alpha improves performance by an order of magnitude. I have pushed Alpha's benchmark report [1] and added the stress test tool module alpha-benchmark [2].
> Next I will merge branch SCB-1321 to the master branch.
>
> If you have any questions, please tell me.
>
> [1] https://github.com/apache/servicecomb-pack/blob/SCB-1321/alpha/alpha-fsm/Benchmark.md
> [2] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-benchmark
>
> Lei Zhang
>
> > On Jul 9, 2019, at 8:50 AM, Willem Jiang <willem.ji...@gmail.com> wrote:
> >
> > Hi Zhanglei,
> >
> > I agree with you, in the timeout scenario, it's hard to tell if the
> > transaction is finished successfully or not.
> > I think we could provide an extension or plugin to let the user do
> > some extra actions to do the compensation work after verifying the
> > transaction states.
> >
> > BTW, as there are some changes happening in the master, maybe you can
> > consider merging the SCB-1321 branch to the master branch.
> >
> > Willem Jiang
> >
> > Twitter: willemjiang
> > Weibo: 姜宁willem
> >
> > On Tue, Jul 9, 2019 at 8:21 AM Zhang Lei <zhang_...@boco.com.cn> wrote:
> >>
> >> Hi All
> >>
> >> I have completed the acceptance test for the state machine and pushed it to branch SCB-1321, and CI passes. See more feature progress here [1].
> >>
> >> In the acceptance test, the timeout behaves differently from the previous one. When the timeout occurs, the transaction will enter the suspended state because we are not sure whether the sub-transaction is completed and when it is completed.
> >>
> >> For example:
> >> when booking times out, we are not sure about the execution status of car or hotel. If car or hotel sends TxEndedEvent after compensation, they will not be compensated.
> >>
> >> Alpha
> >> [x] State machine design document
> >> [x] State machine prototype
> >> [x] State machine prototype unit test
> >> [x] Receive saga events using the internal message bus
> >> [x] State machine integration test
> >> [x] Enable state machine support via parameters
> >> [x] Verify Akka persistence
> >> [ ] Verify Akka cluster reliability
> >> [ ] Save the terminated transaction data to the database
> >> [ ] Support for in-process nested global transactions
> >> [ ] Support for cross-process nested global transactions
> >> [ ] Support for query terminated transaction data by RESTful API
> >> [ ] Support for query running transaction data by RESTful API
> >> [ ] Support for query suspended global transaction by RESTful API
> >> [ ] Support for compensate failed sub-transaction by RESTful API
> >>
> >> Omega Components
> >> [x] Enable state machine support via parameters
> >> [x] State machine calls omega side compensation
> >> [x] @SagaStart supports thread termination after the timeout
> >>
> >> Alpha & Omega
> >> [x] Acceptance-pack-akka-spring-demo pass
> >> [ ] Add sub-transaction timeout exception for akka acceptance test
> >> [ ] Add compensation failure for akka acceptance test
> >> [ ] Add compensation retry success for akka acceptance test
> >> [ ] Alpha single node benchmark performance test
> >> [ ] Alpha cluster benchmark performance test
> >>
> >> Tools
> >> [ ] Alpha Benchmark tools
> >>
> >> Do Next:
> >> 1. State machine metrics collection
> >> 2. Alpha Benchmark tools
> >> 3. Single alpha benchmark performance test
> >> 4. Verify Akka cluster reliability
> >>
> >> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
> >>
> >> Lei Zhang
> >>
> >>> On Jun 28, 2019, at 5:50 PM, Zhang Lei <zhang_...@boco.com.cn> wrote:
> >>>
> >>> Hi, All
> >>>
> >>> alpha-fsm has been pushed to the branch SCB-1321
> >>>
> >>> Completed:
> >>> 1. State machine design document [1]
> >>> 2. State machine prototype
> >>> 3. State machine test case
> >>> 4. Receive saga events using the internal message bus
> >>>
> >>> Key focus of the next stage of work:
> >>> In order to carry out the feasibility verification as soon as possible, I will not consider the reliability issue for the time being.
> >>> 1. Refactor Omega components, add SagaAbortedEvent, SagaTimeoutEvent, TxComponsitedEvent
> >>> 2. Save compensation method parameters in Actor and trigger compensation in Actor
> >>> 3. Do not use Kafka and only verify single-node Alpha; the Alpha server receives the saga event and puts it into the internal message bus.
> >>>
> >>> Planning:
> >>> 1. Persist actor data to the database when it terminates
> >>> 2. Integrate Kafka
> >>> 3. Support WAL [2] recovery mode
> >>> 4. Verify Akka cluster reliability
> >>>
> >>> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
> >>> [2] https://en.wikipedia.org/wiki/Write-ahead_logging
> >>>
> >>> If you have other comments, please let us know.
> >>>
> >>> Thanks,
> >>> Lei Zhang
> >>>
> >>>> On Jun 27, 2019, at 9:50 AM, Willem Jiang <willem.ji...@gmail.com> wrote:
> >>>>
> >>>> We just leverage the message broker to make sure Alpha gets the
> >>>> transaction event from Omega.
> >>>> In most cases Alpha doesn't need to talk back to Omega, we just need to
> >>>> make sure all the transaction messages are stored (Alpha can process them
> >>>> later).
> >>>>
> >>>> If Omega cannot talk to the message broker, Omega should abort the
> >>>> transaction processing with a transport exception.
> >>>>
> >>>> Willem Jiang
> >>>>
> >>>> Twitter: willemjiang
> >>>> Weibo: 姜宁willem
> >>>>
> >>>> On Tue, Jun 25, 2019 at 8:42 AM Zhang Lei <zhang_...@boco.com.cn> wrote:
> >>>>>
> >>>>> Hi, Zhang jun
> >>>>>
> >>>>>> I just cared about the recovery scan thread design.
> >>>>>> Kafka can ensure event messages are consumed by Alpha exactly, but recovery needs to know all the participating transaction responses to decide rollback or commit, so I think the scan thread is also necessary.
> >>>>>
> >>>>> I am not sure, but I think Akka's persistence can solve the problem you care about.
> >>>>> Of course, this ability needs to be verified.
> >>>>>
> >>>>> Thanks,
> >>>>> Zhang Lei
> >>>>>
> >>>>>> On Jun 24, 2019, at 10:46 AM, 赵俊 <zhaoju...@jd.com> wrote:
> >>>>>>
> >>>>>> Hi, Zhang Lei
> >>>>>>
> >>>>>>> A1 : I think we only need to ensure that the message can be reliably delivered to the state machine. The state machine only synchronously records state transitions when the transaction is executed normally. At present, the compensation method based on table scan is also asynchronous. I am not sure if I have answered your question, or you can give me more information.
> >>>>>>
> >>>>>> If we have a mechanism that ensures the main service can collect all the participating transaction responses from Alpha correctly before commit/rollback, it is OK.
> >>>>>>
> >>>>>>> Q2 : Also we should consider recovery; it seems that recovery is the same as before, based on the database.
> >>>>>>> A2 : I think the question you care about is how to recover when the alpha is down; this is a little different from the current version.
> >>>>>>> 1. We can rely on Kafka's reliability and control the offset of the topic, one message at a time
> >>>>>>> 2. Of course, we can also do some extra design for it, such as logging the data to a local log file after receiving the Kafka message, then resuming the message by reading the data log file when the alpha machine restarts
> >>>>>>
> >>>>>> I just cared about the recovery scan thread design.
> >>>>>> Kafka can ensure event messages are consumed by Alpha exactly, but recovery needs to know all the participating transaction responses to decide rollback or commit, so I think the scan thread is also necessary.
> >>>>>>
> >>>>>>> On Jun 23, 2019, at 1:04 PM, Zhang Lei <zhang_...@boco.com.cn> wrote:
> >>>>>>>
> >>>>>>> Hi, Zhao Jun
> >>>>>>>
> >>>>>>> Thank you for your reply!
> >>>>>>>
> >>>>>>> This design document does not elaborate on reliability aspects.
> >>>>>>>
> >>>>>>> My initial thought is this:
> >>>>>>>
> >>>>>>> Q1 : It seems that omega should hold on after consuming the event message from Kafka instead of completing pushing message
> >>>>>>> A1 : I think we only need to ensure that the message can be reliably delivered to the state machine. The state machine only synchronously records state transitions when the transaction is executed normally. At present, the compensation method based on table scan is also asynchronous. I am not sure if I have answered your question, or you can give me more information.
> >>>>>>>
> >>>>>>> Q2 : Also we should consider recovery; it seems that recovery is the same as before, based on the database.
> >>>>>>> A2 : I think the question you care about is how to recover when the alpha is down; this is a little different from the current version.
> >>>>>>> 1. We can rely on Kafka's reliability and control the offset of the topic, one message at a time
> >>>>>>> 2. Of course, we can also do some extra design for it, such as logging the data to a local log file after receiving the Kafka message, then resuming the message by reading the data log file when the alpha machine restarts
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Lei Zhang
> >>>>>>>
> >>>>>>>> On Jun 23, 2019, at 7:08 AM, zhaojun <zhaoju...@126.com> wrote:
> >>>>>>>>
> >>>>>>>> I have some questions about the design.
> >>>>>>>> 1. It seems that omega should hold on after consuming the event message from Kafka instead of completing pushing message.
> >>>>>>>> 2. Also we should consider recovery; it seems that recovery is the same as before, based on the database.
> >>>>>>>>
> >>>>>>>> ------------------
> >>>>>>>> Zhao Jun
> >>>>>>>> Apache Sharding-Sphere & ServiceComb
> >>>>>>>>
> >>>>>>>>> On Jun 21, 2019, at 6:41 PM, Zhang Lei <zhang_...@boco.com.cn> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I have created the alpha-fsm module on branch SCB-1321 [1] and submitted the design documentation, state machine prototype and test cases.
> >>>>>>>>> If there is any problem, please let me know.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Lei Zhang
> >>>>>>>>>
> >>>>>>>>> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
> >>>>>>>>>
> >>>>>>>>>> On Jun 20, 2019, at 3:25 PM, Zheng Feng <zh.f...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Yeah, I think Willem has created one [1] before; do you mind if I assign
> >>>>>>>>>> this issue to you?
> >>>>>>>>>>
> >>>>>>>>>> [1] https://issues.apache.org/jira/browse/SCB-1258
> >>>>>>>>>>
> >>>>>>>>>> Zhang Lei <zhang_...@boco.com.cn> wrote on Thu, Jun 20, 2019 at 2:34 PM:
> >>>>>>>>>>
> >>>>>>>>>>> Hi, Zheng Feng
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for your advice, I will create a JIRA first and start with the
> >>>>>>>>>>> design documentation.
> >>>>>>>>>>>
> >>>>>>>>>>> Lei Zhang
> >>>>>>>>>>>
> >>>>>>>>>>>> On Jun 19, 2019, at 8:09 PM, Zheng Feng <zh.f...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks a lot for sharing this information! I think this state machine
> >>>>>>>>>>>> could be very experimental, so it would be helpful to create an experimental
> >>>>>>>>>>>> branch to add this module rather than putting it in the master branch.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Zhang Lei <cool...@qq.com> wrote on Wed, Jun 19, 2019 at 5:42 PM:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I have completed some of the design and prototype in my GitHub.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In the design document [1] my original idea was that a transaction
> >>>>>>>>>>>>> consisted of a SagaActor and several TxActors, and later TxActor was
> >>>>>>>>>>>>> removed to reduce implementation complexity.
> >>>>>>>>>>>>> I haven't had time to modify the documentation yet, but the SagaActor
> >>>>>>>>>>>>> state machine [2] is up to date.
> >>>>>>>>>>>>> Here you can see the test cases of SagaActor [3]
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1] https://github.com/coolbeevip/playground/tree/master/state_machine_demo/saga-akkafsm
> >>>>>>>>>>>>> [2] https://github.com/coolbeevip/playground/blob/master/state_machine_demo/saga-akkafsm/assets/saga_state_diagram.png
> >>>>>>>>>>>>> [3] https://github.com/coolbeevip/playground/blob/master/state_machine_demo/saga-akkafsm/src/test/java/coolbeevip/playgroud/statemachine/saga/SagaActorTest.java
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Lei Zhang
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Jun 19, 2019, at 2:34 PM, zhaojun <zhaoju...@126.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If we use Akka, how can we design the actors, and how can we guarantee that omega will receive the message synchronously?
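
As a reading aid for the timeout behaviour discussed in this thread (a timed-out transaction becomes suspended instead of being compensated automatically, and events that arrive afterwards do not trigger compensation), here is a minimal plain-Java sketch. It is not the alpha-fsm implementation and does not use Akka. The class name, the state names, and the simplified event enum are all assumptions made for illustration; only the idea of saga ended/aborted/timeout events comes from the messages above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "suspend on timeout" rule; NOT the real alpha-fsm code. */
final class SagaTimeoutSketch {

    /** Assumed state names, chosen only for this illustration. */
    enum State { ACTIVE, COMMITTED, COMPENSATED, SUSPENDED }

    /** Simplified stand-ins for the saga events mentioned in the thread. */
    enum Event { TX_ENDED, SAGA_ENDED, SAGA_ABORTED, SAGA_TIMEOUT }

    private State state = State.ACTIVE;
    private final List<String> endedSubTransactions = new ArrayList<>();
    private final List<String> compensated = new ArrayList<>();

    State state() { return state; }

    List<String> compensated() { return compensated; }

    /** Applies one event; subTx is only meaningful for TX_ENDED. */
    void onEvent(Event event, String subTx) {
        if (state != State.ACTIVE) {
            // Once the saga has left ACTIVE (including SUSPENDED), late events
            // are ignored here, so they never trigger compensation.
            return;
        }
        switch (event) {
            case TX_ENDED:
                endedSubTransactions.add(subTx);
                break;
            case SAGA_ENDED:
                state = State.COMMITTED;
                break;
            case SAGA_ABORTED:
                // Known failure: compensate every sub-transaction that finished.
                compensated.addAll(endedSubTransactions);
                state = State.COMPENSATED;
                break;
            case SAGA_TIMEOUT:
                // Unknown outcome: do not compensate, wait for a manual decision.
                state = State.SUSPENDED;
                break;
        }
    }

    public static void main(String[] args) {
        SagaTimeoutSketch booking = new SagaTimeoutSketch();
        booking.onEvent(Event.TX_ENDED, "car");
        booking.onEvent(Event.SAGA_TIMEOUT, null);
        booking.onEvent(Event.TX_ENDED, "hotel"); // arrives after the timeout
        System.out.println(booking.state() + " compensated=" + booking.compensated());
    }
}
```

In this sketch an explicit abort compensates the finished sub-transactions, while a timeout leaves everything untouched so that a user-supplied extension or an operator can check the real status of car and hotel first, which is the approach suggested earlier in the thread.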