-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/
-----------------------------------------------------------
Review request for giraph.
Bugs: GIRAPH-752
https://issues.apache.org/jira/browse/GIRAPH-752
Repository: giraph-git
Description
-------
We have seen jobs crash when a vertex receives a lot of messages and no
combiner is used. This happens because the total size of the serialized
messages for that vertex exceeds the maximum allowed size of an array.
We should implement an OutputStream which can handle data of arbitrary size,
and add an option to use that kind of stream for messages.
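To illustrate the idea, here is a minimal sketch of a stream that spreads its
data over several fixed-size byte arrays, so that the total size is tracked as
a long and is not bounded by the length of a single array. The class name
ChunkedByteOutput, its methods and the chunk size parameter are hypothetical
and only for illustration; the classes in this patch (e.g. BigDataOutput) may
be structured quite differently.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch, not the class from this patch: an OutputStream backed
 * by a list of fixed-size byte arrays instead of one growing array.
 */
class ChunkedByteOutput extends OutputStream {
  /** Size of every backing array (assumed parameter) */
  private final int chunkSize;
  /** Arrays which are already completely filled */
  private final List<byte[]> fullChunks = new ArrayList<>();
  /** Array currently being written to */
  private byte[] current;
  /** Next free position in the current array */
  private int pos;

  ChunkedByteOutput(int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    this.chunkSize = chunkSize;
    this.current = new byte[chunkSize];
  }

  @Override
  public void write(int b) {
    if (pos == chunkSize) {
      rollOver();
    }
    current[pos++] = (byte) b;
  }

  @Override
  public void write(byte[] src, int off, int len) {
    if (off < 0 || len < 0 || len > src.length - off) {
      throw new IndexOutOfBoundsException();
    }
    // Copy piece by piece, starting a new array whenever the current one fills
    while (len > 0) {
      if (pos == chunkSize) {
        rollOver();
      }
      int n = Math.min(len, chunkSize - pos);
      System.arraycopy(src, off, current, pos, n);
      pos += n;
      off += n;
      len -= n;
    }
  }

  /** Store the filled array and start a fresh one */
  private void rollOver() {
    fullChunks.add(current);
    current = new byte[chunkSize];
    pos = 0;
  }

  /** Total number of bytes written; a long, so it can exceed array limits */
  long size() {
    return (long) fullChunks.size() * chunkSize + pos;
  }

  /** Byte at the given absolute position */
  byte byteAt(long index) {
    if (index < 0 || index >= size()) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    int chunk = (int) (index / chunkSize);
    int offset = (int) (index % chunkSize);
    byte[] data = chunk < fullChunks.size() ? fullChunks.get(chunk) : current;
    return data[offset];
  }

  /** Read everything back as one stream, without copying into a single array */
  InputStream toInputStream() {
    List<InputStream> parts = new ArrayList<>();
    for (byte[] chunk : fullChunks) {
      parts.add(new ByteArrayInputStream(chunk));
    }
    parts.add(new ByteArrayInputStream(current, 0, pos));
    return new SequenceInputStream(Collections.enumeration(parts));
  }
}
```

In a sketch like this, the extra bookkeeping on every write and the chained
reads are where some overhead would be expected, which is one reason to keep
such a stream behind an option rather than making it the default.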
Diffs
-----
giraph-core/src/main/java/org/apache/giraph/comm/messages/ByteArrayMessagesPerVertexStore.java
6518da6
giraph-core/src/main/java/org/apache/giraph/comm/messages/MessagesIterable.java
a466a8d
giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/PartitionDiskBackedMessageStore.java
7b3e548
giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/SequentialFileMessageStore.java
64031c3
giraph-core/src/main/java/org/apache/giraph/comm/messages/primitives/IntByteArrayMessageStore.java
597e7af
giraph-core/src/main/java/org/apache/giraph/comm/messages/primitives/LongByteArrayMessageStore.java
3fe6356
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 604729a
giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java
2506c21
giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayIterable.java
cf2c187
giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayIterator.java
76ed789
giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdMessages.java
56cc01c
giraph-core/src/main/java/org/apache/giraph/utils/Factory.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/utils/RepresentativeByteArrayIterable.java
e3992ed
giraph-core/src/main/java/org/apache/giraph/utils/RepresentativeByteArrayIterator.java
b6151c5
giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInput.java
PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInputOutput.java
PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java
PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/utils/io/DataInputOutput.java
PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java
PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/utils/io/package-info.java
PRE-CREATION
Diff: https://reviews.apache.org/r/13909/diff/
Testing
-------
Ran a job which fails with the original code, and also when the new option is
not used, and verified that it works properly when the option is used.
Also compared the performance with and without the change: with the option off
it's the same, and with the option turned on it seems to add about 5% overhead.
mvn clean verify
Thanks,
Maja Kabiljo