-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6013/#review9845
-----------------------------------------------------------


Couple of minor things.  Looks pretty good to me though.  Will be a lot cleaner 
once Hadoop RPC is gone!


http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClient.java
<https://reviews.apache.org/r/6013/#comment20890>

    So the first request is for the vertices and the second one is for 
messages...



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClient.java
<https://reviews.apache.org/r/6013/#comment20889>

    Why is this check here?  We are using netty if we got here.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClient.java
<https://reviews.apache.org/r/6013/#comment20891>

    Would be good to generally try to avoid line breaks on '.' whenever 
possible.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java
<https://reviews.apache.org/r/6013/#comment20892>

    Would be good to generally try to avoid line breaks on '.' whenever 
possible.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java
<https://reviews.apache.org/r/6013/#comment20893>

    Please try to keep <I, M> on the same line.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java
<https://reviews.apache.org/r/6013/#comment20895>

    Nice to get rid of this!



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/RPCCommunications.java
<https://reviews.apache.org/r/6013/#comment20896>

    Should throw an exception here, right?  Shouldn't ever happen.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStore.java
<https://reviews.apache.org/r/6013/#comment20900>

    map is probably not a great name.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStoreByPartition.java
<https://reviews.apache.org/r/6013/#comment20902>

    prefer split on = rather than .



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStoreByPartition.java
<https://reviews.apache.org/r/6013/#comment20901>

    please join '.' and split on : instead.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
<https://reviews.apache.org/r/6013/#comment20903>

    another break on '.'



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java
<https://reviews.apache.org/r/6013/#comment20904>

    Wow, can't wait to get rid of Hadoop RPC.


- Avery Ching


On Aug. 1, 2012, 7:25 p.m., Maja Kabiljo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6013/
> -----------------------------------------------------------
> 
> (Updated Aug. 1, 2012, 7:25 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> This patch introduces out-of-core messages support for Giraph. Some ideas are 
> taken from discussion in https://issues.apache.org/jira/browse/GIRAPH-45. 
> 
> We keep messages in MessageStore, from which they will be flushed to disk 
> when necessary. Messages are separated by partition. In moments we only flush 
> single partition to disk so we would still keep things in memory in case it's 
> time for next superstep. When flushing to disk, we write in the following 
> format:
> numberOfVertices
> vertexId1 numberOfMessages1 messagesForVertexId1
> vertexId2 numberOfMessages2 messagesForVertexId2
> ...
> Vertex ids are sorted. We don't require enough memory to fit all the messages 
> for partition, but we require that messages for a single vertex fit in 
> memory. 
> In the end we potentially have several files for each partition. When reading 
> messages, all the files are read sequentially.
> 
> DiskBackedMessageStoreByPartition handles all messages, 
> DiskBackedMessageStore is then used for a single partition, and 
> SequentialFileMessageStore handles single file.
> There is also SimpleMessageStore which doesn't use disk at all.
> 
> Options available to user:
> - whether or not to use out-of-core messaging
> - number of messages to keep in memory - this should probably be changed 
> (explained below)
> - size of buffer when reading from and writing to disk
> 
> ServerData now has two instances of message stores: one which is consumed in 
> current superstep with messages from previous superstep, and one in which it 
> will keep incoming messages for next superstep.
> 
> Other things which had to be changed:
> - Checkpointing - since messages are not kept in the vertex anymore, they 
> need to be stored separately.
> - Partition exchange between workers - same reasons as above - added 
> SendMessagesRequest
> - Messages are not assigned to vertex, they are just passed in compute
> - compute methods are now executed in order of vertex id inside of partition, 
> so we could have fast reading from disk
> 
> For memory check I only have the number of messages which I allow in memory. 
> This should be done better, but there is a problem since Alessandro's patch 
> for out-of-core graph also has memory checks. We don't want one of those 
> parts to use all the memory and leave too little space for the other, but I'm 
> not aware of a way to separately check memory usage of different data 
> structures.
> 
> I didn't integrate this with RPC, that's why there are some checks for 
> useNetty, those can be removed once the RPC is removed. Also, since vertex 
> doesn't keep messages in itself anymore, once RPC is removed we should also 
> remove getMessages/putMessages/getNumMessages from vertex, change initialize 
> to (id, value, edges, hasMessages) and just give messages to vertex when 
> calling compute.
> 
> I'll fix the part when partitions are sent around before superstep, since 
> that's the only part now which requires that all the messages for single 
> partition fit in memory.
> 
> 
> This addresses bug GIRAPH-45.
>     https://issues.apache.org/jira/browse/GIRAPH-45
> 
> 
> Diffs
> -----
> 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyServer.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClient.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClientServer.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/RPCCommunications.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/RequestRegistry.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendPartitionMessagesRequest.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/ServerData.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/WorkerServer.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/BasicMessageStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStoreByPartition.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/FlushableMessageStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/MessageStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/MessageStoreByPartition.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/MessageStoreFactory.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/SendPartitionCurrentMessagesRequest.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/SequentialFileMessageStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/SimpleMessageStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/messages/package-info.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BasicVertexResolver.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexResolver.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/Partition.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/utils/CollectionUtils.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/RequestTest.java
>  1366303 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/TestMessageStores.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/utils/MockUtils.java
>  1366303 
> 
> Diff: https://reviews.apache.org/r/6013/diff/
> 
> 
> Testing
> -------
> 
> Run mvn verify and tests in pseudo-distributed mode, all apart from this one 
> https://issues.apache.org/jira/browse/GIRAPH-259 pass.
> 
> 
> Thanks,
> 
> Maja Kabiljo
> 
>

Reply via email to