-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5987/#review10579
-----------------------------------------------------------


Nice work, Alessandro.

Have you checked from which places in the code getPartition is called? For 
example, from the quick look, one of the things which worries me is in 
NettyWorkerServer.prepareSuperstep, line 118. service.getVertex() ends up 
calling it, and the order in which vertices are visited here is not sorted by 
partition. So you could possibly end up reading and writing the same partition 
to disk many times. Did you maybe do some testing to see how many times are the 
partitions actually swapped to/from disk during a superstep? 
(the one which I mentioned could be fixed by adding hasVertexId() to 
PartitionStore, since we just check if the vertex exists at that place)


http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
<https://reviews.apache.org/r/5987/#comment22646>

    This is not used anymore.



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java
<https://reviews.apache.org/r/5987/#comment22648>

    I think you want to do this before you put it in the map. Like this someone 
can take it and lock it before the thread creating it does.
    Could you also say which assumptions are you making for concurrency in this 
code so it would be easier to review (i.e. which operations can happen in 
parallel and which can't)?



http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java
<https://reviews.apache.org/r/5987/#comment22647>

    Typo.


- Maja Kabiljo


On Aug. 21, 2012, 11:16 a.m., Alessandro Presta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5987/
> -----------------------------------------------------------
> 
> (Updated Aug. 21, 2012, 11:16 a.m.)
> 
> 
> Review request for giraph and Avery Ching.
> 
> 
> Description
> -------
> 
> I gave this another shot. This time it plays nicely with input superstep: I 
> replaced both the temporary partitions and the worker partition map with a 
> common data structure, PartitionStore, held in ServerData. This is similar to 
> what we now have for messages.
> 
> Partition is now thread-safe so that we can have concurrent calls to 
> putVertex().
> 
> SimplePartitionStore is backed by a concurrent hash map (nothing new here, 
> except that we skip copying partitions to the worker).
> 
> DiskBackedPartitionStore can hold up to a user-defined number of partitions 
> in memory, and spill the remaining ones to disk. Each partition is stored in 
> a separate file.
> Adding vertices to an out-of-core partition consists in appending them to the 
> file, which makes processing vertex requests relatively fast.
> We use a ReadWriteLock for each partition: performing operations on a 
> partition held in memory only requires a read-lock (since Partition is 
> thread-safe), while creating a new partition, moving it in and out of core or 
> appending vertices requires a write-lock (we can't have concurrent writes).
> 
> Also note that this breaks Hadoop RPC: I preferred to keep it clean (this 
> also shows what code we get rid of) instead of working around it. I suppose 
> the Netty security patch will be completed soon. If not, I will restore RPC 
> compatibility.
> 
> More here: 
> https://issues.apache.org/jira/browse/GIRAPH-249?focusedCommentId=13435280&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13435280
> 
> 
> This addresses bug GIRAPH-249.
>     https://issues.apache.org/jira/browse/GIRAPH-249
> 
> 
> Diffs
> -----
> 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClientServer.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendVertexRequest.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/ServerData.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/WorkerServer.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/Partition.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/SimplePartitionStore.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/WorkerGraphPartitioner.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/TestMutateGraphVertex.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/RequestTest.java
>  1375453 
>   
> http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/graph/partition/TestPartitionStores.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/5987/diff/
> 
> 
> Testing
> -------
> 
> mvn verify and pseudo-distributed mode tests with both SimplePartitionStore 
> and DiskBackedPartitionStore
> 
> 
> Thanks,
> 
> Alessandro Presta
> 
>

Reply via email to