[ https://issues.apache.org/jira/browse/HAMA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suraj Menon updated HAMA-559:
-----------------------------
    Attachment: spilling_buffer_profile_timesplit_text_write.png
                spilling_buffer_profile_cpu_graph_test_write.png
                spilling_buffer_cpu_usage_text_write.png

Attached are the profiling results for the following code. It writes a file close to 4 GB in size. With Text.write it took close to half an hour.

{noformat}
SpillingBuffer buffer = new SpillingBuffer();
// super(new SpillingStream(2, 1 << 24, 1 << 24, true,
// //super(new SpillingStream( 2, 1 << 20, 1 << 8, true,
// System.getProperty("java.io.tmpdir") + File.separatorChar
// + new BigInteger(128, new SecureRandom()).toString(32)));
Text t = new Text("Testing the spillage of spilling buffer");
for (int i = 0; i < 100000000; ++i)
  t.write(buffer);
buffer.close();
{noformat}

> Add a spilling message queue
> ----------------------------
>
>                 Key: HAMA-559
>                 URL: https://issues.apache.org/jira/browse/HAMA-559
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp core
>    Affects Versions: 0.5.0
>            Reporter: Thomas Jungblut
>            Assignee: Suraj Menon
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: HAMA-559.patch-v1,
>                      spilling_buffer_cpu_usage_text_write.png,
>                      spilling_buffer_profile_cpu_graph_test_write.png,
>                      spilling_buffer_profile_timesplit_text_write.png
>
>
> After HAMA-521 is done, we can add a spilling queue which holds only the
> messages that fit into the heap space in RAM. The rest can be flushed to disk.
> We may call this a HybridQueue or something like that.
> The benefit should be that we don't have to flush to disk so often and get
> faster. However, we may have more GC, so it is not always faster overall.
> The requirements for this queue also include:
> - The message object, once written to the queue (after returning from the
>   write call), could be modified, but the changes should not be reflected in
>   the messages stored in the queue.
> - For now let's implement a queue that does not support concurrent reading
>   and writing.
> This feature is needed when we implement asynchronous communication.
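
The behaviour asked for in the issue description (keep what fits in RAM, flush the rest to disk, and make stored messages independent of later changes to the message object) can be illustrated with a small sketch. This is not the SpillingBuffer from the HAMA-559 patch: the class name SpillingSketch, its threshold parameter, and the hasSpilled and openForRead methods are hypothetical and use only java.io. Messages are serialized to bytes at write time, which is what keeps them isolated from later modification. As in the issue, reading and writing are not concurrent: the reader is opened only after the writer is closed.

```java
import java.io.*;

/** Hypothetical sketch of a spilling byte sink; not the HAMA-559 implementation. */
class SpillingSketch extends OutputStream {
  private final int threshold;                 // max bytes kept in memory (assumed knob)
  private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
  private File spillFile;                      // created lazily on the first spill
  private OutputStream spill;                  // stays null until the threshold is hit

  SpillingSketch(int threshold) {
    this.threshold = threshold;
  }

  @Override
  public void write(int b) throws IOException {
    if (memory.size() < threshold) {
      memory.write(b);                         // still fits in RAM
      return;
    }
    if (spill == null) {                       // first overflow: open the spill file
      spillFile = File.createTempFile("spill", ".bin");
      spillFile.deleteOnExit();
      spill = new BufferedOutputStream(new FileOutputStream(spillFile));
    }
    spill.write(b);                            // everything past the threshold goes to disk
  }

  boolean hasSpilled() {
    return spill != null;
  }

  @Override
  public void close() throws IOException {
    if (spill != null) {
      spill.close();                           // flushes buffered bytes to the file
    }
  }

  /** Call after close(): replays all bytes in write order, memory part first. */
  InputStream openForRead() throws IOException {
    InputStream mem = new ByteArrayInputStream(memory.toByteArray());
    if (spillFile == null) {
      return mem;
    }
    InputStream disk = new BufferedInputStream(new FileInputStream(spillFile));
    return new SequenceInputStream(mem, disk);
  }

  public static void main(String[] args) throws IOException {
    SpillingSketch queue = new SpillingSketch(16); // tiny threshold to force a spill
    StringBuilder message = new StringBuilder("hello");
    try (DataOutputStream out = new DataOutputStream(queue)) {
      for (int i = 0; i < 3; i++) {
        out.writeUTF(message.toString());      // bytes are copied at write time
      }
    }
    message.append(" (changed after write)");  // must not affect stored messages
    try (DataInputStream in = new DataInputStream(queue.openForRead())) {
      for (int i = 0; i < 3; i++) {
        System.out.println(in.readUTF());
      }
    }
    System.out.println("spilled to disk: " + queue.hasSpilled());
  }
}
```

The sketch writes byte by byte for clarity. A real implementation would presumably override the bulk write method and size its buffers much larger, along the lines of the 1 << 24 values in the commented-out constructor call above.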