Avery Ching commented on GIRAPH-185:
Thanks for the opinion, Avery.
Yep, the space for ConcurrentHashMap is not a concern. I estimate the space
overhead for those empty entries is not too large though. An empty entry is
around 69 bytes, just the size of a couple of messages. Statistically
speaking, most vertices will receive one or more messages, for example, in
PageRank. Actually, each Vertex object also has an internal messageList
structure of the same size, whether it receives a message or not. With
pre-population the time for entry creation and insertion can be saved as
well as the time spent on garbage collection.
Do you think it's worth the trade-off? If not, I am pretty open to using
Bo, I added your comments from the email to the jira. I think there are two
possible improvements here.
1) Converting the HashMap to ConcurrentHashMap will increase concurrency.
- This seems to be agreed on.
2) Making sure that all the possible destinations have a message list.
- I do agree that many applications will likely have most of the vertices
receiving the messages. That being said, memory is one of the limitations of
Giraph. Doubling up on the message lists is not likely to be worth the benefit in
performance (I could be wrong if you have benchmarks that say otherwise).
Perhaps it might be possible to have the vertex not have a message list and
instead use the one from inMessages. This would be a nice memory savings.
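
To make the two points above concrete, here is a minimal sketch of what
per-destination locking with lazy list creation could look like. It is not
the attached patch and not Giraph's actual code: the class name
InMessageStoreSketch and the methods messagesFor and entryCount are
hypothetical, and the putMsg / putMsgList signatures are assumed for
illustration only. The map is not pre-populated, so a destination that
receives no messages costs no entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the incoming-message store; not Giraph's class. */
class InMessageStoreSketch<I, M> {
    // One list per destination vertex; entries are created on first use.
    private final ConcurrentMap<I, List<M>> inMessages =
        new ConcurrentHashMap<I, List<M>>();

    /** Returns the list for a destination, creating it atomically if absent. */
    private List<M> listFor(I destVertexId) {
        List<M> msgs = inMessages.get(destVertexId);
        if (msgs == null) {
            List<M> fresh = new ArrayList<M>();
            // putIfAbsent returns the existing list if another thread won the race.
            msgs = inMessages.putIfAbsent(destVertexId, fresh);
            if (msgs == null) {
                msgs = fresh;
            }
        }
        return msgs;
    }

    /** Adds one message, locking only the destination's own list. */
    void putMsg(I destVertexId, M msg) {
        List<M> msgs = listFor(destVertexId);
        synchronized (msgs) {
            msgs.add(msg);
        }
    }

    /** Adds a batch of messages under a single per-destination lock. */
    void putMsgList(I destVertexId, List<M> batch) {
        List<M> msgs = listFor(destVertexId);
        synchronized (msgs) {
            msgs.addAll(batch);
        }
    }

    /** Returns a snapshot copy of the messages for a destination. */
    List<M> messagesFor(I destVertexId) {
        List<M> msgs = inMessages.get(destVertexId);
        if (msgs == null) {
            return Collections.emptyList();
        }
        synchronized (msgs) {
            return new ArrayList<M>(msgs);
        }
    }

    /** Number of destinations that currently hold a list. */
    int entryCount() {
        return inMessages.size();
    }
}
```

With this shape, writers to different destinations never contend on a shared
lock, and the extra cost compared with pre-population is one putIfAbsent call
the first time a destination is seen. Whether that cost matters in practice
would need the benchmarks mentioned above.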
> Improve concurrency of putMsg / putMsgList
> Key: GIRAPH-185
> URL: https://issues.apache.org/jira/browse/GIRAPH-185
> Project: Giraph
> Issue Type: Improvement
> Components: graph
> Affects Versions: 0.2.0
> Reporter: Bo Wang
> Assignee: Bo Wang
> Fix For: 0.2.0
> Attachments: GIRAPH-185.patch
> Original Estimate: 2h
> Remaining Estimate: 2h
> Currently in putMsg / putMsgList, a synchronized closure is used to protect
> the whole transientInMessages when adding the new message. This lock prevents
> other concurrent calls to putMsg/putMsgList and increases the response time.
> We should use fine-grained locks to allow high concurrency in message