Re: Discussion for memory and scalability issues of Graph package.

Edward J. Yoon Thu, 31 Jan 2013 00:17:05 -0800

P.S., IMO, our future default queue should be a spilling queue, and
the message merge-sort function should be optional.
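
To make the spilling queue idea concrete, here is a minimal sketch in
Python (not Hama's Java API). The class name SpillingQueue, the
max_in_memory parameter and the length-prefixed pickle record format are
all assumptions made for illustration. The queue keeps a bounded number
of messages in memory and appends the overflow to a temporary file, while
still returning messages in FIFO order.

```python
import os
import pickle
import struct
import tempfile
from collections import deque


class SpillingQueue:
    """FIFO queue holding at most `max_in_memory` items in RAM; the rest go to disk.

    Hypothetical illustration only; this is not the Hama implementation.
    """

    def __init__(self, max_in_memory=1000):
        self.max_in_memory = max_in_memory
        self.memory = deque()
        self.spill = tempfile.TemporaryFile()  # overflow messages live here
        self.read_pos = 0   # file offset of the next unread spilled record
        self.spilled = 0    # number of spilled records not yet read back

    def put(self, message):
        # Once something has spilled, newer messages must also go to disk,
        # otherwise they would overtake older spilled messages.
        if self.spilled == 0 and len(self.memory) < self.max_in_memory:
            self.memory.append(message)
            return
        data = pickle.dumps(message)
        self.spill.seek(0, os.SEEK_END)
        self.spill.write(struct.pack(">I", len(data)))  # 4-byte length prefix
        self.spill.write(data)
        self.spilled += 1

    def get(self):
        # In-memory messages are always older than spilled ones.
        if self.memory:
            return self.memory.popleft()
        if self.spilled:
            self.spill.seek(self.read_pos)
            (size,) = struct.unpack(">I", self.spill.read(4))
            message = pickle.loads(self.spill.read(size))
            self.read_pos = self.spill.tell()
            self.spilled -= 1
            return message
        raise IndexError("queue is empty")

    def __len__(self):
        return len(self.memory) + self.spilled
```

With max_in_memory=3, putting ten messages keeps the first three in
memory and writes the other seven to the temporary file; get() then
returns all ten in the order they were put. A real implementation would
also want to reclaim disk space and batch its I/O, which this sketch
leaves out.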


On Thu, Jan 31, 2013 at 5:12 PM, Edward J. Yoon <[email protected]> wrote:
> If I remember correctly, in Hadoop's case, the MR framework merges and
> sorts intermediate data files by key between the map and reduce functions.
> If we provide this function, I think we can solve the disk queue,
> message grouping and message sorting all at once.
>
> BTW, can we specify the queue type per job?
>
> On Thu, Jan 31, 2013 at 4:20 PM, Suraj Menon <[email protected]> wrote:
>> Thanks for bringing up our discussion online.
>>
>> For 1. Let's implement something within bsp-core that could be reused by
>> the graph package. [HAMA-724]
>>
>> For 2. For a sorted queue, it would be expensive to do all the sorting on the
>> sender side. We need a send protocol and a receive protocol
>> (merge sort). [HAMA-722][HAMA-723]
>>
>> Regards,
>> Suraj
>>
>> On Wed, Jan 30, 2013 at 3:05 AM, Edward J. Yoon <[email protected]> wrote:
>>
>>> Hi devs,
>>>
>>> As you know, many people report OOM problems with graph algorithms.
>>> The cause is message handling. Roughly, every vertex can
>>> send or receive as many messages as it has outgoing or incoming
>>> links. For example, Barack Obama has 26,000,000+
>>> followers.
>>>
>>> I believe the message queue issue will be fixed by adding a spilling
>>> queue. Another issue is grouping messages by vertex ID [1]. To
>>> solve this issue, I'm thinking about two ways: 1) Support a grouping
>>> function for key-value pair messages in the BSP framework (like
>>> Map/Reduce). 2) Write messages to local disk and sort them by vertex ID
>>> (external merge sort).
>>>
>>> If you have any ideas or suggestions, please let me know.
>>>
>>> 1. https://issues.apache.org/jira/browse/HAMA-704
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
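
For reference, the "sort by vertex ID on local disk (external merge
sort)" option quoted above could look roughly like the Python sketch
below. It is only an illustration under stated assumptions: messages are
(vertex_id, value) tuples, and the function names write_sorted_runs,
read_run and grouped_by_vertex as well as the run_size parameter are
hypothetical, not part of Hama. Each run is sorted in memory and written
to its own temporary file; the runs are then k-way merged and grouped so
that only one vertex's messages are held in memory at a time.

```python
import heapq
import itertools
import pickle
import tempfile
from operator import itemgetter


def write_sorted_runs(messages, run_size):
    """Split (vertex_id, value) messages into sorted runs, one temp file per run."""
    runs = []
    it = iter(messages)
    while True:
        chunk = list(itertools.islice(it, run_size))
        if not chunk:
            return runs
        chunk.sort(key=itemgetter(0))  # in-memory sort of a single run
        f = tempfile.TemporaryFile()
        for message in chunk:
            pickle.dump(message, f)
        f.seek(0)  # rewind so the run can be read back later
        runs.append(f)


def read_run(f):
    """Yield the messages of one run in the order they were stored."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


def grouped_by_vertex(runs):
    """K-way merge the sorted runs and yield (vertex_id, [values]) per vertex."""
    merged = heapq.merge(*(read_run(f) for f in runs), key=itemgetter(0))
    for vertex_id, group in itertools.groupby(merged, key=itemgetter(0)):
        yield vertex_id, [value for _, value in group]
```

For example, grouped_by_vertex(write_sorted_runs(msgs, run_size=2)) walks
the vertices in ascending ID order. Memory use is bounded by run_size
while the runs are written, and by the largest single group while
merging, so a vertex with millions of incoming messages would still need
its group streamed rather than collected into a list.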
