[ 
https://issues.apache.org/jira/browse/HAMA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459495#comment-13459495
 ] 

Apurv Verma commented on HAMA-559:
----------------------------------

My mistake, I should have written this in JIRA. Anyway, copy-pasting my mail.

Yeah, ehcache would be random access. But the good thing about it is that it can 
make the disk spilling process transparent and seamless: you just specify the 
size you want to keep in memory and the rest goes to disk. With Apache 
DirectMemory coming up behind ehcache, these messages could even be flushed to 
off-heap memory first and then to disk. That would be really fast, but only 
once it becomes available.
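To make the "keep N in memory, rest goes to disk" idea concrete, here is a rough sketch in plain Java. It is not the Ehcache API and not Hama's DiskQueue; the class name SpillingQueue, the String message type and the temp-file layout are all assumptions made up for illustration.

```java
import java.io.*;
import java.util.ArrayDeque;

/** Hypothetical sketch: not Hama's DiskQueue and not the Ehcache API. */
class SpillingQueue implements AutoCloseable {
    private final int maxInMemory;                 // messages kept on the heap
    private final ArrayDeque<String> memory = new ArrayDeque<>();
    private final File spillFile;
    private DataOutputStream spillOut;             // opened on first spill
    private DataInputStream spillIn;               // opened on first disk read
    private int spilled;                           // on disk, not yet read

    SpillingQueue(int maxInMemory) {
        this.maxInMemory = maxInMemory;
        this.spillFile = newTempFile();
    }

    private static File newTempFile() {
        try {
            File f = File.createTempFile("spill", ".bin");
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Keeps the message on the heap until the limit is hit, then spills. */
    void add(String msg) {
        try {
            // Once spilling has started, everything goes to disk to keep FIFO order.
            if (spillOut == null && memory.size() < maxInMemory) {
                memory.add(msg);
                return;
            }
            if (spillOut == null) {
                spillOut = new DataOutputStream(
                        new BufferedOutputStream(new FileOutputStream(spillFile)));
            }
            spillOut.writeUTF(msg);                // writeUTF caps a message at 64 KB
            spilled++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the oldest message, or null if the queue is empty. */
    String poll() {
        if (!memory.isEmpty()) return memory.poll();
        if (spilled == 0) return null;
        try {
            spillOut.flush();                      // make spilled bytes visible to the reader
            if (spillIn == null) {
                spillIn = new DataInputStream(
                        new BufferedInputStream(new FileInputStream(spillFile)));
            }
            spilled--;
            return spillIn.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            if (spillOut != null) spillOut.close();
            if (spillIn != null) spillIn.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            spillFile.delete();
        }
    }
}
```

The caller only picks the in-memory limit, e.g. new SpillingQueue(1000), and never sees where a message actually lives. An off-heap tier would slot in between the heap deque and the spill file.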

For now, if we are not using EhCache, I think our current queue 
implementation needs to be cleaned up slightly. I also have another idea: 
building a SegmentedQueue on top of the normal DiskQueue to speed it up.
In a segmented queue, instead of storing messages in just one queue, you store 
them in multiple segments. Then, at read time, you spawn multiple threads 
to read from the segments and do the message send phase. 
This should give us significant parallelization benefits. I will try to do the 
AsyncSend and SegmentedQueue this weekend and the next. It's pretty active in 
my head ;)
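A rough sketch of the segmented idea, just to show the shape of it. The segments here are in-memory ArrayDeques standing in for DiskQueues, and the method names, the round-robin placement and the Consumer used as the "send" step are all assumptions, not existing Hama code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical sketch; each in-memory segment stands in for a DiskQueue. */
class SegmentedQueue {
    private final List<Queue<String>> segments = new ArrayList<>();
    private int next;                              // next segment to write to

    SegmentedQueue(int numSegments) {
        for (int i = 0; i < numSegments; i++) segments.add(new ArrayDeque<>());
    }

    /** Write phase (single-threaded here): spread messages round-robin. */
    void add(String msg) {
        segments.get(next).add(msg);
        next = (next + 1) % segments.size();
    }

    /** Read phase: one thread per segment drains it and hands messages to the sender. */
    void drainInParallel(Consumer<String> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(segments.size());
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Queue<String> segment : segments) {
                futures.add(pool.submit(() -> {
                    String msg;
                    while ((msg = segment.poll()) != null) sender.accept(msg);
                }));
            }
            for (Future<?> f : futures) f.get();   // wait for every segment to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that message order is only kept within a segment, not across segments, and the sender has to be thread-safe since it is called from several threads at once.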
                
> Add a caching message queue
> ---------------------------
>
>                 Key: HAMA-559
>                 URL: https://issues.apache.org/jira/browse/HAMA-559
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp core
>    Affects Versions: 0.5.0
>            Reporter: Thomas Jungblut
>            Priority: Minor
>             Fix For: 0.6.0
>
>
> After HAMA-521 is done, we can add a caching queue which just holds the 
> messages in RAM that fit into the heap space. The rest can be flushed to disk.
> We may call this a HybridQueue or something like that.
> The benefit should be that we don't have to flush to disk as often, so we 
> get faster. However, we may see more GC, so it may not always be faster overall.

