> On Aug. 22, 2012, 8:55 a.m., Maja Kabiljo wrote: > > Alessandro, can you please try the same test with a lot of vertices (you > > only had 10 vertices here)? If vertices are visited in random order in the > > prepareSuperstep, I think you should have much more disk operations. > > > > > Also, addPartition() would then have to read all vertex ids even when the > > > partition is in memory, which would make it way slower in the standard > > > use case. > > What do you mean by this?
Yeah, 10 was an unfortunate choice (more partitions than vertices!), I guess last night I was really too tired :P Here's what I see with 1000 vertices, 999 edges/vertex (I also tried 10 edges/vertex and got the same pattern): http://pastebin.com/raw.php?i=jGBzaZA8 So we're loading each out-of-core partition twice. I get this same result with different numbers of in-memory partitions. I added some logging and it looks like MessageStore#getDestinationVertices() is returning vertices grouped by partition. Do you have any idea why? I wonder if it's because of hashing (messages are stored in a hash-map indexed by vertex id, and partitions are formed by hashing vertex ids). An adversarial configuration could make us load partitions back and forth in a random fashion. Regarding addPartition, I mean that whenever we add a partition in memory, we currently simply move the reference (fast), whereas if we need to keep track of vertex ids we would have to copy them all in a global map. Anyway hold on, I'll see if I can do something about this. I'm mainly concerned with code calling Partition#putVertex() directly though, I see no way to disallow it. > On Aug. 22, 2012, 8:55 a.m., Maja Kabiljo wrote: > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java, > > lines 235-236 > > <https://reviews.apache.org/r/5987/diff/8/?file=143644#file143644line235> > > > > You could end up with more than maxInMemoryPartitions. I see, we could have two threads concurrently write to inMemoryPartitions. Fixed by synchronizing on inMemoryPartitions and returning early upon successful put(). - Alessandro ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5987/#review10614 ----------------------------------------------------------- On Aug. 21, 2012, 11:02 p.m., Alessandro Presta wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/5987/ > ----------------------------------------------------------- > > (Updated Aug. 21, 2012, 11:02 p.m.) > > > Review request for giraph and Avery Ching. > > > Description > ------- > > I gave this another shot. This time it plays nicely with input superstep: I > replaced both the temporary partitions and the worker partition map with a > common data structure, PartitionStore, held in ServerData. This is similar to > what we now have for messages. > > Partition is now thread-safe so that we can have concurrent calls to > putVertex(). > > SimplePartitionStore is backed by a concurrent hash map (nothing new here, > except that we skip copying partitions to the worker). > > DiskBackedPartitionStore can hold up to a user-defined number of partitions > in memory, and spill the remaining ones to disk. Each partition is stored in > a separate file. > Adding vertices to an out-of-core partition consists in appending them to the > file, which makes processing vertex requests relatively fast. > We use a ReadWriteLock for each partition: performing operations on a > partition held in memory only requires a read-lock (since Partition is > thread-safe), while creating a new partition, moving it in and out of core or > appending vertices requires a write-lock (we can't have concurrent writes). > > Also note that this breaks Hadoop RPC: I preferred to keep it clean (this > also shows what code we get rid of) instead of working around it. I suppose > the Netty security patch will be completed soon. If not, I will restore RPC > compatibility. > > More here: > https://issues.apache.org/jira/browse/GIRAPH-249?focusedCommentId=13435280&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13435280 > > > This addresses bug GIRAPH-249. > https://issues.apache.org/jira/browse/GIRAPH-249 > > > Diffs > ----- > > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClientServer.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendVertexRequest.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/ServerData.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/WorkerServer.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/Partition.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/SimplePartitionStore.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/WorkerGraphPartitioner.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/TestMutateGraphVertex.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/RequestTest.java > 1375453 > > http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/graph/partition/TestPartitionStores.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/5987/diff/ > > > Testing > ------- > > mvn verify and pseudo-distributed mode tests with both SimplePartitionStore > and DiskBackedPartitionStore > > > Thanks, > > Alessandro Presta > >
