I assume each alpha server has a different name, so we can store the name of the alpha server in the TxEvent record. When an alpha server scans the TxEvent records for timeout handling, it selects only the records that match its own name. That way it looks like we don't need a lock here, but we do have to make sure each alpha server name is unique.
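
To make the idea concrete, here is a rough sketch in plain Java. Everything in it is an assumption for illustration only: the simplified `TxEvent` class, the `alphaName` field, and the `findTimeoutEvents` helper are hypothetical and not the actual Saga entities or API. In a real implementation the filter would presumably be part of the database query rather than an in-memory loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AlphaTimeoutScan {

  // Hypothetical, simplified stand-in for a TxEvent row; only the fields
  // needed for this illustration are included.
  static final class TxEvent {
    final String globalTxId;
    final String alphaName;  // assumed new column: name of the alpha that owns the event
    final long expireTime;   // epoch millis after which the event counts as timed out

    TxEvent(String globalTxId, String alphaName, long expireTime) {
      this.globalTxId = globalTxId;
      this.alphaName = alphaName;
      this.expireTime = expireTime;
    }
  }

  // Returns only the expired events owned by the given alpha. Because every
  // alpha filters on its own unique name, no two alphas select the same event.
  static List<TxEvent> findTimeoutEvents(List<TxEvent> events, String alphaName, long now) {
    List<TxEvent> result = new ArrayList<>();
    for (TxEvent event : events) {
      if (event.alphaName.equals(alphaName) && event.expireTime <= now) {
        result.add(event);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<TxEvent> events = Arrays.asList(
        new TxEvent("tx-1", "alpha-1", 1000L),
        new TxEvent("tx-2", "alpha-2", 1000L),
        new TxEvent("tx-3", "alpha-1", 5000L));

    // alpha-1 scans at time 2000: tx-1 is expired and owned by alpha-1,
    // tx-2 belongs to alpha-2, and tx-3 has not expired yet.
    for (TxEvent event : findTimeoutEvents(events, "alpha-1", 2000L)) {
      System.out.println(event.globalTxId);  // prints "tx-1"
    }
  }
}
```

Note that the sketch only works if the names really are unique: two alphas started with the same name would select the same events again.
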
2018-01-27 14:36 GMT+08:00 郑扬勇 <[email protected]>:

> It seems all the solutions need to introduce a "lock".
> If one event can only be handled by one alpha at a time, we may need an
> election mechanism?
>
> ------------------ Original Message ------------------
> From: "Eric Lee" <[email protected]>
> Sent: Friday, January 26, 2018, 10:58 AM
> To: "dev" <[email protected]>
> Subject: [Discussion] How to make sure events are handled only once
> among different stateless Saga pack alphas
>
> Background
> Currently, the transaction timeout is controlled by omega, which makes omega
> stateful. Being stateful makes omega recovery rely heavily on the
> previous states. Hence, we need to move the timeout management from omega
> to alpha to simplify the implementation of omega. After that, omega will be a
> stateless agent.
>
> Difficulty
> How do we make sure each timeout record is handled only once globally by
> multiple alpha servers? Each alpha server is also stateless. All states are
> stored in the database. Alpha periodically scans the timeout events and
> handles them one by one. Different alphas may process the same event at the
> same time, which should be avoided because each event should be handled only
> once.
>
> Possible Solutions:
> 1. Add an expireTime column to the TxEvent entity. Then lock the access to the
> timeout event to avoid concurrent access to the same event. Since TxEvent
> may be involved in many operations, adding the lock may introduce latency in
> other transactions.
> 2. Create a new entity like the Command entity. Then lock the access to
> this entity and update the status asynchronously when it is done.
> 3. Register timeout settings to alpha whenever omega starts. Then query the
> TxEvent and ServiceConfig tables to find timeout events. This way still
> cannot make sure each event is handled once, as the range of the lock is
> too wide to target a specific event.
>
> However, the above solutions are still not perfect for the problem, because the
> lock becomes invalid as soon as the query is done, and another alpha may
> query the database and process the same event before the timeout event
> has been processed by the previous alpha.
>
> Current implementation details can be found at
> https://github.com/apache/incubator-servicecomb-saga/pull/122 .
>
> Any suggestion is welcome.
>
> Best Regards!
> Eric Lee
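
For reference, the race described in the quoted mail can be reproduced with a small in-memory simulation. This is only an illustrative sketch: the `Map` stands in for the event table, and the names `queryPending` and `simulate` are made up for this example and do not exist in Saga. The interleaving is fixed by hand so that both alphas finish their query before either one marks the event as handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TimeoutRaceDemo {

  // Hypothetical query: returns the ids of events not yet marked as handled.
  // Any lock held during this query is released as soon as it returns.
  static List<String> queryPending(Map<String, Boolean> table) {
    List<String> pending = new ArrayList<>();
    for (Map.Entry<String, Boolean> entry : table.entrySet()) {
      if (!entry.getValue()) {
        pending.add(entry.getKey());
      }
    }
    return pending;
  }

  // Replays the problematic interleaving and returns how many times
  // the single timeout event ends up being handled.
  static int simulate() {
    Map<String, Boolean> table = new HashMap<>();
    table.put("timeout-1", false);  // one unhandled timeout event

    // Both alphas query before either has processed anything.
    List<String> seenByAlpha1 = queryPending(table);
    List<String> seenByAlpha2 = queryPending(table);

    int handledCount = 0;
    for (String id : seenByAlpha1) {  // alpha 1 handles the event
      table.put(id, true);
      handledCount++;
    }
    for (String id : seenByAlpha2) {  // alpha 2 still holds its stale result
      table.put(id, true);
      handledCount++;
    }
    return handledCount;
  }

  public static void main(String[] args) {
    System.out.println(simulate());  // prints 2: the event was handled twice
  }
}
```
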
