[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-26 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262526#comment-13262526
 ] 

Claudio Martella commented on GIRAPH-185:
-

I think we showed there shouldn't be any additional memory with this patch. 
Let's see what performance gain we get. I also expect very little impact, but 
let's see.

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-26 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262709#comment-13262709
 ] 

Bo Wang commented on GIRAPH-185:


I did a small test (2 machines, each with 2 cores) and the performance
change is relatively small. I am now scheduling a time to run it on a
larger environment so that there are more conflicts. There are always too
many concurrent jobs on the cluster and I need to wait for a time slot.
Sorry for having your waiting.




 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261774#comment-13261774
 ] 

Claudio Martella commented on GIRAPH-185:
-

The performance of concurrentlinkedqueue is going to be faster than a 
synchronized block as it's just a CAS operation on the tail pointer, at least 
for the add() method which adds to the tail of the queue. Also, arrayList in 
any case should be slower on adding elements as it requires the memory 
expansion and copying when the allocated memory is exhausted.
Iteration could indeed be a bit slower than an arrayList because of cache.

The memory overhead of each entry of the queue is indeed something that should 
be investigated. Worst case, one might think of copying the 
concurrentlinkedqueue implementation and remove the prev pointer which we 
don't need.

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261787#comment-13261787
 ] 

Claudio Martella commented on GIRAPH-185:
-

Actually I checked the source and there's no prev pointer, each Node has just a 
pointer to the payload and to the next node. The memory overhead should be 
small.

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261830#comment-13261830
 ] 

Bo Wang commented on GIRAPH-185:


I checked the source and found the same thing. I think LinkedList should be ok 
in terms of space. ArrayList also has to keep empty space in the array to 
future insertion. Should we close this issue?

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261854#comment-13261854
 ] 

Claudio Martella commented on GIRAPH-185:
-

Personally, I'd like to see some benchmarking on this issue. If we commit this, 
we should have an idea of the impact.

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261906#comment-13261906
 ] 

Avery Ching commented on GIRAPH-185:


I agree that a benchmark should be done, although I expect the impact to be 
very small.  We should at least show it's not slower. =)

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Hyunsik Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262278#comment-13262278
 ] 

Hyunsik Choi commented on GIRAPH-185:
-

If there is a trade-off relationship between the performance and memory 
consumption, the memory consumption seems more important in the current giraph 
implementation. Also, I agree that some benchmarks are necessary.

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-24 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260275#comment-13260275
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/
---

Review request for giraph.


Summary
---

Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to 
message map. The concurrencyLevel of ConcurrentHashMap uses the default value. 
There may be some performance gain by tuning this value.


This addresses bug GIRAPH-185.
https://issues.apache.org/jira/browse/GIRAPH-185


Diffs
-

  
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 

Diff: https://reviews.apache.org/r/4852/diff


Testing
---


Thanks,

Bo



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-24 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260416#comment-13260416
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7169
---


Looks good to me, wound't hurt to see some stress test to check performance, 
although I wouldn't expect this to be slower than the synchronized block. Also, 
I'd agree that moving the messages directly from the inMessages to the Vertex 
could be a memory win.

- Claudio


On 2012-04-24 06:11:38, Bo Wang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4852/
bq.  ---
bq.  
bq.  (Updated 2012-04-24 06:11:38)
bq.  
bq.  
bq.  Review request for giraph.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess 
to message map. The concurrencyLevel of ConcurrentHashMap uses the default 
value. There may be some performance gain by tuning this value.
bq.  
bq.  
bq.  This addresses bug GIRAPH-185.
bq.  https://issues.apache.org/jira/browse/GIRAPH-185
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 
bq.  
bq.  Diff: https://reviews.apache.org/r/4852/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Bo
bq.  
bq.



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-24 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261004#comment-13261004
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
---



http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
https://reviews.apache.org/r/4852/#comment15860

Bo, I'm a little leery about converting the List and ArrayList to 
LinkedList and ConcurrentLinkedList.  I believe that linked list's will use 
more memory than the array list due to the double links (forward and backward). 
 Also, is ConcurrentLinkedList supposted to outperform a synchronized 
ArrayList?  I haven't seen much on that.

The concurrenthashmap changes look good.


- Avery


On 2012-04-24 06:11:38, Bo Wang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4852/
bq.  ---
bq.  
bq.  (Updated 2012-04-24 06:11:38)
bq.  
bq.  
bq.  Review request for giraph.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess 
to message map. The concurrencyLevel of ConcurrentHashMap uses the default 
value. There may be some performance gain by tuning this value.
bq.  
bq.  
bq.  This addresses bug GIRAPH-185.
bq.  https://issues.apache.org/jira/browse/GIRAPH-185
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 
bq.  
bq.  Diff: https://reviews.apache.org/r/4852/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Bo
bq.  
bq.



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-24 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261029#comment-13261029
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--



bq.  On 2012-04-24 20:53:33, Avery Ching wrote:
bq.   
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java,
 lines 776-777
bq.   https://reviews.apache.org/r/4852/diff/1/?file=104059#file104059line776
bq.  
bq.   Bo, I'm a little leery about converting the List and ArrayList to 
LinkedList and ConcurrentLinkedList.  I believe that linked list's will use 
more memory than the array list due to the double links (forward and backward). 
 Also, is ConcurrentLinkedList supposted to outperform a synchronized 
ArrayList?  I haven't seen much on that.
bq.   
bq.   The concurrenthashmap changes look good.

Avery, thanks for the comments. I just measured the sizes of these classes and 
below are an estimation. 

java.util.ArrayList: 149 bytes
java.util.LinkedList: 101 bytes
java.util.concurrent.ConcurrentLinkedQueue: 118 bytes

The tool I was using is a program from the link below.
http://www.javapractices.com/topic/TopicAction.do?Id=83

In terms of performance, here is a benchmark.
http://www.javacodegeeks.com/2010/09/java-best-practices-queue-battle-and.html

In its test #1 (adding element), ConcurrentLinkedQueue performed slightly 
better than LinkedList. In test #3 (iterator), LinkedList outperformed 
ConcurrentLinkedQueue. I think the most time consuming part is add, while 
iteration is also heavily used but no concurrent accesses. 


- Bo


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
---


On 2012-04-24 06:11:38, Bo Wang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4852/
bq.  ---
bq.  
bq.  (Updated 2012-04-24 06:11:38)
bq.  
bq.  
bq.  Review request for giraph.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess 
to message map. The concurrencyLevel of ConcurrentHashMap uses the default 
value. There may be some performance gain by tuning this value.
bq.  
bq.  
bq.  This addresses bug GIRAPH-185.
bq.  https://issues.apache.org/jira/browse/GIRAPH-185
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 
bq.  
bq.  Diff: https://reviews.apache.org/r/4852/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Bo
bq.  
bq.



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-24 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261046#comment-13261046
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--



bq.  On 2012-04-24 20:53:33, Avery Ching wrote:
bq.   
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java,
 lines 776-777
bq.   https://reviews.apache.org/r/4852/diff/1/?file=104059#file104059line776
bq.  
bq.   Bo, I'm a little leery about converting the List and ArrayList to 
LinkedList and ConcurrentLinkedList.  I believe that linked list's will use 
more memory than the array list due to the double links (forward and backward). 
 Also, is ConcurrentLinkedList supposted to outperform a synchronized 
ArrayList?  I haven't seen much on that.
bq.   
bq.   The concurrenthashmap changes look good.
bq.  
bq.  Bo Wang wrote:
bq.  Avery, thanks for the comments. I just measured the sizes of these 
classes and below are an estimation. 
bq.  
bq.  java.util.ArrayList: 149 bytes
bq.  java.util.LinkedList: 101 bytes
bq.  java.util.concurrent.ConcurrentLinkedQueue: 118 bytes
bq.  
bq.  The tool I was using is a program from the link below.
bq.  http://www.javapractices.com/topic/TopicAction.do?Id=83
bq.  
bq.  In terms of performance, here is a benchmark.
bq.  
http://www.javacodegeeks.com/2010/09/java-best-practices-queue-battle-and.html
bq.  
bq.  In its test #1 (adding element), ConcurrentLinkedQueue performed 
slightly better than LinkedList. In test #3 (iterator), LinkedList outperformed 
ConcurrentLinkedQueue. I think the most time consuming part is add, while 
iteration is also heavily used but no concurrent accesses. 
bq.  
bq.

Thanks for the response Bo.

Those numbers are for the empty data structures I'm assuming.  I was referring 
to the incremental cost of adding elements (messages) to the data structures.  
The performance isn't a a concern to me (unless we call size() somewhere).


- Avery


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
---


On 2012-04-24 06:11:38, Bo Wang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4852/
bq.  ---
bq.  
bq.  (Updated 2012-04-24 06:11:38)
bq.  
bq.  
bq.  Review request for giraph.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess 
to message map. The concurrencyLevel of ConcurrentHashMap uses the default 
value. There may be some performance gain by tuning this value.
bq.  
bq.  
bq.  This addresses bug GIRAPH-185.
bq.  https://issues.apache.org/jira/browse/GIRAPH-185
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 
bq.  
bq.  Diff: https://reviews.apache.org/r/4852/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Bo
bq.  
bq.



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-24 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261054#comment-13261054
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--



bq.  On 2012-04-24 20:53:33, Avery Ching wrote:
bq.   
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java,
 lines 776-777
bq.   https://reviews.apache.org/r/4852/diff/1/?file=104059#file104059line776
bq.  
bq.   Bo, I'm a little leery about converting the List and ArrayList to 
LinkedList and ConcurrentLinkedList.  I believe that linked list's will use 
more memory than the array list due to the double links (forward and backward). 
 Also, is ConcurrentLinkedList supposted to outperform a synchronized 
ArrayList?  I haven't seen much on that.
bq.   
bq.   The concurrenthashmap changes look good.
bq.  
bq.  Bo Wang wrote:
bq.  Avery, thanks for the comments. I just measured the sizes of these 
classes and below are an estimation. 
bq.  
bq.  java.util.ArrayList: 149 bytes
bq.  java.util.LinkedList: 101 bytes
bq.  java.util.concurrent.ConcurrentLinkedQueue: 118 bytes
bq.  
bq.  The tool I was using is a program from the link below.
bq.  http://www.javapractices.com/topic/TopicAction.do?Id=83
bq.  
bq.  In terms of performance, here is a benchmark.
bq.  
http://www.javacodegeeks.com/2010/09/java-best-practices-queue-battle-and.html
bq.  
bq.  In its test #1 (adding element), ConcurrentLinkedQueue performed 
slightly better than LinkedList. In test #3 (iterator), LinkedList outperformed 
ConcurrentLinkedQueue. I think the most time consuming part is add, while 
iteration is also heavily used but no concurrent accesses. 
bq.  
bq.
bq.  
bq.  Avery Ching wrote:
bq.  Thanks for the response Bo.
bq.  
bq.  Those numbers are for the empty data structures I'm assuming.  I was 
referring to the incremental cost of adding elements (messages) to the data 
structures.  The performance isn't a a concern to me (unless we call size() 
somewhere).

By the incremental cost, I mean the memory cost, sorry.


- Avery


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
---


On 2012-04-24 06:11:38, Bo Wang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4852/
bq.  ---
bq.  
bq.  (Updated 2012-04-24 06:11:38)
bq.  
bq.  
bq.  Review request for giraph.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess 
to message map. The concurrencyLevel of ConcurrentHashMap uses the default 
value. There may be some performance gain by tuning this value.
bq.  
bq.  
bq.  This addresses bug GIRAPH-185.
bq.  https://issues.apache.org/jira/browse/GIRAPH-185
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 
bq.  
bq.  Diff: https://reviews.apache.org/r/4852/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Bo
bq.  
bq.



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch, GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-23 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259408#comment-13259408
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-185:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4840/
---

Review request for giraph.


Summary
---

Improve concurrency of putMsg / putMsgList.


This addresses bug GIRAPH-185.
https://issues.apache.org/jira/browse/GIRAPH-185


Diffs
-

  
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
 1328747 
  
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/WorkerCommunications.java
 1328747 
  
http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
 1328747 

Diff: https://reviews.apache.org/r/4840/diff


Testing
---

All test passed. Wait for review. Thanks.


Thanks,

Bo



 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-23 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259412#comment-13259412
 ] 

Avery Ching commented on GIRAPH-185:


Thanks for looking at this Bo.  I have a couple of questions/comments.

1)  Do you have any idea what kind of performance gain there is?  Can you run a 
few experiments?

2)  The one downside is memory related.  By pre-allocating a list for every 
vertex, we are going to use memory, whether the vertex will receive a message 
or not.

I thought you might be looking into a higher level concurrency by using 
ConcurrentHashMap or something like that for transientInMessages?

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-23 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259704#comment-13259704
 ] 

Claudio Martella commented on GIRAPH-185:
-

I agree with Avery. Last time I modified this code, I populated the 
transientInMessages datastructure so that it contained lists only for vertices 
that were receiving messages.
You could do this by using a ConcurrentHashMap and by using the putIfAbsent() 
method.

public void add(Object key, Object val) {
Queue q = map.get(key);
if (q == null) {
q = new ConcurrentLinkedQueue();
Queue temp = map.putIfAbsent(q);
if (temp != null)
q = temp;
}
q.add(val);
}

This will populate the datastructure only for vertices that have messages AND 
by using the ConcurrentLinkedQueue we'd also get rid of the other 
synchronization block.

What do you think?

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-23 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259798#comment-13259798
 ] 

Bo Wang commented on GIRAPH-185:


Thanks for the comments, Avery and Claudio.

I am measuring the performance, but I found the #Sent messages (reported by
counter) is always zero in both the original and changed version. Is that a
bug? In the output of Quick Start Guide, it is also zero.

I think ConcurrentHashMap is a good way to go. But for the current
approach, I think the memory overhead won't be much since most of the
vertices will receive messages except for those isolated ones and the one
whose neighbors are all inactive. In comparison, ConcurrentHashMap is much
larger than HashMap. An empty ConcurrentHashMap takes around 1673 bytes
while an empty HashMap only takes 137 bytes.

Another reason to use pre-population is to avoid the time spent on
allocating new message lists and inserting them into the map. We just need
to clear the list before the next super-step.

What's your opinion?

Thanks,
Bo




 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-23 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259841#comment-13259841
 ] 

Avery Ching commented on GIRAPH-185:


Since we only allocate one ConcurrentHashMap per worker, the empty overhead 
isn't a concern.  If however, the per element memory cost of a entry into the 
concurrent hash map is much more expensive then I would definitely be worried.  
We can also tune the concurrency level (default 16) to a reasonable tradeoff.

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
Assignee: Bo Wang
 Fix For: 0.2.0

 Attachments: GIRAPH-185.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-17 Thread Bo Wang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255933#comment-13255933
 ] 

Bo Wang commented on GIRAPH-185:


putVertexIdMessagesList can be improved together

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
 Fix For: 0.2.0

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-17 Thread Claudio Martella (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255952#comment-13255952
 ] 

Claudio Martella commented on GIRAPH-185:
-

totally agree with that. I actually addressed that in GIRAPH-45, but it hasn't 
seen the light yet... :) So go on!

 Improve concurrency of putMsg / putMsgList
 --

 Key: GIRAPH-185
 URL: https://issues.apache.org/jira/browse/GIRAPH-185
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.2.0
Reporter: Bo Wang
 Fix For: 0.2.0

   Original Estimate: 2h
  Remaining Estimate: 2h

 Currently in putMsg / putMsgList, a synchronized closure is used to protect 
 the whole transientInMessages when adding the new message. This lock prevents 
 other concurrent calls to putMsg/putMsgList and increases the response time. 
 We should use fine-grain locks to allow high concurrency in message 
 communication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira