Hi guys, I am writing to you because I am facing an issue that I need to solve in order to complete the implementation of the RexsterOutputFormat API.
Based on a short discussion I had with Nitay and Claudio, I am trying to implement a barrier inside the RexsterOutputFormat so that the vertices are guaranteed to be saved before the edges are sent to the Rexster endpoint. This is needed because, when an edge is saved, both its source and its destination vertex must already be present in the database; otherwise Blueprints cannot save the edge consistently.

The naive implementation based on the new EdgeOutputFormat API is already working, but it does not have any global, cluster-wide barrier. This means that it works in a pseudo-distributed environment (because there saving the edges is ordered after saving the vertices) but cannot work on an actual cluster.

At this point the implementation of the barrier would be straightforward if I could get either the number of workers currently up and running while the vertices are being written, or the global number of vertices. The second piece of information looks better to me, because that way I would not have to deal with workers dying. I would just need to check that all the vertices accounted for at the end of the final superstep have been saved. I could use a znode where each worker records the number of vertices it has saved, and the barrier can be released when the total number of saved vertices equals the expected global count.

Now, my problem is: how do I access this information from the OutputFormat? I checked around, and it looks to me like the global state is only accessible in the service scope. This would mean that I should add something to the BspServiceWorker to pass this information to the RexsterVertexOutputFormat.

Do you have any idea how I could achieve this while keeping the approach consistent with the Giraph code base? Do you have any other suggestions or possible solutions that I could implement to achieve the same goal, namely saving the vertices before the edges at the cluster level?

Thanks!
Armando
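
P.S. To make the barrier idea a bit more concrete, here is a rough sketch of the counting logic I have in mind. It is only an in-memory stand-in: the names VertexWriteBarrier, reportSaved and awaitAllSaved are made up for illustration and are not part of Giraph, ZooKeeper or Rexster. In the real implementation the counter would presumably live in a znode, and the expected count is exactly the global information I do not yet know how to obtain from the OutputFormat.

```java
import java.util.concurrent.TimeUnit;

/**
 * In-memory stand-in for the shared counter znode (hypothetical class,
 * not Giraph or ZooKeeper API). Workers report how many vertices they
 * have saved; edge writers wait until the total reaches the expected count.
 */
class VertexWriteBarrier {
    private final long expectedVertices; // assumed to come from the global state
    private long savedVertices = 0;

    VertexWriteBarrier(long expectedVertices) {
        this.expectedVertices = expectedVertices;
    }

    /** Called by a worker after it has flushed a batch of vertices. */
    synchronized void reportSaved(long count) {
        savedVertices += count;
        if (savedVertices >= expectedVertices) {
            notifyAll(); // release everybody waiting to write edges
        }
    }

    /**
     * Blocks until all expected vertices have been reported or the timeout
     * expires. Returns true only if the barrier was released.
     */
    synchronized boolean awaitAllSaved(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (savedVertices < expectedVertices) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // timed out, vertices still missing
                }
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}

class BarrierDemo {
    public static void main(String[] args) throws InterruptedException {
        VertexWriteBarrier barrier = new VertexWriteBarrier(5);
        // Two simulated workers saving 2 and 3 vertices respectively.
        Thread w1 = new Thread(() -> barrier.reportSaved(2));
        Thread w2 = new Thread(() -> barrier.reportSaved(3));
        w1.start();
        w2.start();
        boolean released = barrier.awaitAllSaved(1000);
        w1.join();
        w2.join();
        System.out.println("barrier released: " + released);
    }
}
```

Counting vertices rather than workers means a restarted worker does not change the exit condition, although a real version would still have to avoid counting the same vertices twice after a restart.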
