Current implementation to resolve conflicts on same snapshotEdge is overwhelmingly complicated. I am suggesting refactor this part so we can understand what is going on for consistency.
More importantly, with positive fail probability for RPC(hbase.fail.prob=0.01), test cases actually fail. we can reproduce failure by running `sbt "project s2core" -Dconfig.file=s2rest_play/conf/test.conf test`. What we intended initially here is guarantee to resolve conflicts even some RPC into storage fail. I was fixing this issue and quickly realize current code base is way too complicated and not efficient. here is brief explanation( https://steamshon.gitbooks.io/s2graph-book/content/request_state_diagram.html ). the big picture for this process goes like below. - retry on failure -- fetch snapshotEdge as oldSN. --- merge oldSN and requesting edges and build newSN containing updated state. --- lock this snapshotEdge by storing (oldSN, requesting edge as pendingEdge) using checkAndSet(statusCode 0 -> 1). --- fire RPCs for indexEdge insert + delete mutations(statusCode 1 -> 2). --- fire RPCs for indexEdge degree increment mutations(statusCode 2 -> 3). --- release lock on this snapshotEdge by storing (newSN, None as pendingEdge) using CheckAndSet(statusCode 3 -> 0). I think there is no need to use CheckAndSet on release lock phase since it is guaranteed for only one thread can success to lock snapshotEdge. Also fetch snapshotEdge on ever retry produce lots of RPC on same row. Actually, I think there is no need to fetch snapshotEdge over and over on every retry, once this snapshotEdge is locked. Of course threads that failed to acquire lock, still need to fetch snapshotEdge on retry, but thread who own lock on this snapshotEdge do not need to fetch it again. it is guaranteed that no other thread changed snapshotEdge while thread who own lock is retrying on partial failure. Also current implementation use Thread.sleep on Future and throw exception inside Future(not return Future.failed(ex)) which does not seems normal in asynchronous manner. Finally there is no comments and it is very hard to follow control flow for this resolving logic. I think it is worthwhile to refactor conflict resolving logic. What do you guys think? Best Regards. DOYUNG YOON
