[ 
https://issues.apache.org/jira/browse/HAMA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256753#comment-13256753
 ] 

Thomas Jungblut commented on HAMA-521:
--------------------------------------

Thanks for the stack trace, Edward. I was quite overworked back then and just 
hacked a few lines together. I have fixed it now and it works.
I also added a test case.

Please review it again; maybe we can get this into 0.5.0, and perhaps the 
sorted queue as well. This is a cool feature and a neat implementation.

But please check on your cluster; I only tested in pseudo-distributed mode.

{noformat}
thomasjungblut@ubuntu:~/workspace/hama-trunk$ /usr/local/hama/bin/hama jar /usr/local/hama/hama-examples-0.5.0-incubating.jar sssp 1 /tmp/sssp-in /tmp/sssp-out
12/04/18 19:55:30 INFO bsp.FileInputFormat: Total input paths to process : 1
12/04/18 19:55:31 INFO bsp.FileInputFormat: Total # of splits: 8
12/04/18 19:55:33 INFO bsp.FileInputFormat: Total input paths to process : 8
12/04/18 19:55:34 INFO bsp.BSPJobClient: Running job: job_201204181918_0003
12/04/18 19:55:37 INFO bsp.BSPJobClient: Current supersteps number: 0
12/04/18 19:55:40 INFO bsp.BSPJobClient: Current supersteps number: 6
12/04/18 19:55:43 INFO bsp.BSPJobClient: Current supersteps number: 96
12/04/18 19:55:46 INFO bsp.BSPJobClient: Current supersteps number: 243
12/04/18 19:55:46 INFO bsp.BSPJobClient: The total number of supersteps: 243
12/04/18 19:55:46 INFO bsp.BSPJobClient: Counters: 10
12/04/18 19:55:46 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
12/04/18 19:55:46 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=8
12/04/18 19:55:46 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
12/04/18 19:55:46 INFO bsp.BSPJobClient:     SUPERSTEPS=243
12/04/18 19:55:46 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=1944
12/04/18 19:55:46 INFO bsp.BSPJobClient:     MESSAGE_BYTES_TRANSFERED=158336
12/04/18 19:55:46 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=37868
12/04/18 19:55:46 INFO bsp.BSPJobClient:     IO_BYTES_READ=3411067
12/04/18 19:55:46 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=2184
12/04/18 19:55:46 INFO bsp.BSPJobClient:     TASK_INPUT_RECORDS=100000
12/04/18 19:55:46 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=2184
12/04/18 19:55:46 INFO bsp.BSPJobClient:     MESSAGE_BYTES_RECEIVED=158336
Job Finished in 15.236 seconds
{noformat}
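
A quick sanity check of the counters above, as a small Java sketch. It assumes (this is a reading of the numbers, not something the log states) that SUPERSTEP_SUM is the per-task superstep count summed over all launched tasks. The class and method names are made up for illustration.

```java
// Hypothetical helper, not part of Hama: cross-checks the counters printed above.
class CounterCheck {

  // Assumption: SUPERSTEP_SUM = SUPERSTEPS * LAUNCHED_TASKS.
  static boolean superstepSumMatches(long supersteps, long tasks, long superstepSum) {
    return supersteps * tasks == superstepSum;
  }

  // Average payload size per message, in bytes.
  static double bytesPerMessage(long bytes, long messages) {
    return (double) bytes / messages;
  }

  public static void main(String[] args) {
    // Values copied from the job output above.
    long supersteps = 243, tasks = 8, superstepSum = 1944;
    long bytesSent = 158336, bytesReceived = 158336;
    long messagesSent = 2184, messagesReceived = 2184;

    System.out.println("superstep sum consistent: "
        + superstepSumMatches(supersteps, tasks, superstepSum));
    System.out.println("nothing lost in transfer: "
        + (bytesSent == bytesReceived && messagesSent == messagesReceived));
    System.out.println("avg bytes per message: "
        + bytesPerMessage(bytesSent, messagesSent));
  }
}
```

Under that assumption, 243 supersteps on 8 tasks gives exactly 1944, and the sent and received counters agree.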
                
> Improve message buffering to save memory
> ----------------------------------------
>
>                 Key: HAMA-521
>                 URL: https://issues.apache.org/jira/browse/HAMA-521
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>         Attachments: HAMA-521.patch, HAMA-521_1.patch, HAMA-521_2.patch
>
>
> Suraj and I had a bit of discussion about incoming and outgoing message 
> buffering and scalability.
> Currently everything lies on the heap, causing huge amounts of GC and wasted 
> memory. We can do better.
> Therefore we need to extract an abstract Messenger class that sits directly 
> under the interface but above the compressor class.
> It should abstract the use of the queues in the back (currently a lot of 
> duplicated code) and it should be backed by a sequence file on local disk.
> Once sync() starts, it should return a message iterator for combining; the 
> combined messages are then put into a message bundle, which is sent over RPC.
> On the other side we get a bundle and loop over it, putting everything onto 
> the heap, which makes the heap much larger than it needs to be. Here we can 
> also flush to disk because we only expose a queue-like method to the user side.
> Plus points:
> In case we have enough heap (see our new metric system), we can also 
> implement a buffering technique that does not flush everything to disk.
> Open questions:
> I don't know how much slower the whole system gets, but it would save a lot of 
> memory. Maybe we should first evaluate whether it is really needed.
> In any case, the duplicated code in the messengers needs to be refactored.
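
To make the buffering idea in the description above more concrete, here is a minimal sketch in plain Java. It is not the code from the attached patches: the class name SpillingMessageBuffer and its methods are hypothetical, and a simple length-prefixed temp file stands in for the sequence file mentioned in the issue. It keeps up to a fixed number of messages on the heap and writes the rest to local disk, then streams everything back in insertion order.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.function.Consumer;

// Hypothetical sketch: a message buffer that spills to local disk once a heap limit is hit.
class SpillingMessageBuffer implements AutoCloseable {
  private final int maxInMemory;                 // messages kept on the heap
  private final ArrayDeque<byte[]> memory = new ArrayDeque<>();
  private final File spillFile;                  // stand-in for a sequence file
  private DataOutputStream spillOut;             // opened lazily on first spill
  private int spilled;                           // messages currently on disk

  SpillingMessageBuffer(int maxInMemory) throws IOException {
    this.maxInMemory = maxInMemory;
    this.spillFile = File.createTempFile("msg-spill", ".bin");
    this.spillFile.deleteOnExit();
  }

  // Buffers one serialized message, on the heap if there is room, else on disk.
  void add(byte[] message) throws IOException {
    if (spilled == 0 && memory.size() < maxInMemory) {
      memory.add(message);
      return;
    }
    if (spillOut == null) {
      // Opening a FileOutputStream truncates any data left from an earlier round.
      spillOut = new DataOutputStream(
          new BufferedOutputStream(new FileOutputStream(spillFile)));
    }
    spillOut.writeInt(message.length);           // length prefix, then payload
    spillOut.write(message);
    spilled++;
  }

  int size() {
    return memory.size() + spilled;
  }

  // Hands every message to the sink in insertion order and empties the buffer.
  // Spilled messages are read back one at a time, so they never sit on the heap together.
  void drain(Consumer<byte[]> sink) throws IOException {
    byte[] m;
    while ((m = memory.poll()) != null) {
      sink.accept(m);
    }
    if (spillOut != null) {
      spillOut.close();
      spillOut = null;
      try (DataInputStream in = new DataInputStream(
          new BufferedInputStream(new FileInputStream(spillFile)))) {
        for (int i = 0; i < spilled; i++) {
          byte[] buf = new byte[in.readInt()];
          in.readFully(buf);
          sink.accept(buf);
        }
      }
      spilled = 0;
    }
  }

  @Override
  public void close() throws IOException {
    if (spillOut != null) {
      spillOut.close();
      spillOut = null;
    }
    spillFile.delete();
  }

  public static void main(String[] args) throws IOException {
    try (SpillingMessageBuffer buffer = new SpillingMessageBuffer(2)) {
      for (int i = 0; i < 5; i++) {
        buffer.add(("msg-" + i).getBytes(StandardCharsets.UTF_8));
      }
      // The first two messages come from the heap, the remaining three from disk.
      buffer.drain(b -> System.out.println(new String(b, StandardCharsets.UTF_8)));
    }
  }
}
```

With a large enough limit nothing is ever written to disk, which corresponds to the "enough heap" case listed under the plus points. How much the disk path costs at runtime is exactly the open question raised in the description, and this sketch does not try to answer it.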

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
