Re: Decomposing large graphs

Dmitriy Lyubimov Sat, 17 Dec 2011 19:27:57 -0800

also make sure you are not hitting swap. If you specifiy 3G per task,
and you have typical quad core commodity nodes, it probably means that
you have at least 4 map / 4 reduce tasks configured. that's 8 task
total, unless you have ~32G ram on these machines, i think you are in
danger of hitting swap with settings that high when GC gets all
excited and all. I never though i would say that, but maybe you can
run it a little bit more conservatively.


my cluster is cdh3u0.

-Dmitriy

On Sat, Dec 17, 2011 at 7:24 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Sebastian,
>
> I think my commit got screwed somehow. Here's the benchmark very close
> to yours (I also added broadcast option for B thru distributed cache
> which seems to have been a 30% win in my situation, ok, i leave it on
> by default).
>
> input: 4.5M x 4.5M sparse matrix with 40 non-zero elements per row, total size
>
> -rw-r--r--   3 dmitriy supergroup 2353201134 2011-12-17 17:17
> /user/dmitriy/drm-sparse.seq
>
> This is probably a little smaller than your wikipedia test in terms of
> dimensions but not in size.
>
> ssvd -i /user/dmitriy/drm-sparse.seq -o /user/dmitriy/SSVD-OUT
> --tempDir /user/dmitriy/temp -ow -br false -k 10 -p 1 -t 20 -q 1
> -Dmapred.child.java.opts="-Xmx800m"
>
> so this is for 10 useful eigenvalues and U,V with 1 power iteration:
>
> with broadcast option:
> QJob 1 min 11s
> BtJob 1 min 58s
> AB' Job 2 min 6s
> B'Job 1 min 51 s
>
> U Job - 0m 12 s, V job - 20 s (U and V run in parallel, not sequential)
>
> without broadcast option (only affects AB' step)
>
> QJob 1 min 12s
> BtJob 1 min 59s
> AB' Job 3 min 3s
> B'Job 2 min 0 s
>
> 10 nodes 4 cpu each. (I had 35 splits of original input, so everything
> got allocated). It also runs our QA stuff starting something else
> almost every minute. frequent other jobs so while not loaded, job/task
> trackers are fairly busy setting up/tearing down.
>
> I'll commit shortly.
>
> On Sat, Dec 17, 2011 at 2:58 PM, Sebastian Schelter <[email protected]> wrote:
>> On 17.12.2011 17:27, Dmitriy Lyubimov wrote:
>>> Interesting.
>>>
>>> Well so how did your decomposing go?
>>
>> I tested the decomposition of the wikipedia pagelink graph (130M edges,
>> 5.6M vertices making approx. quarter of a billion non-zeros in the
>> symmetric adjacency matrix) on a 6 machine hadoop cluster.
>>
>> Got these running times for k = 10, p = 5 and one power-iteration:
>>
>> Q-job    1mins, 41sec
>> Bt-job   9mins, 30sec
>> ABt-job  37mins, 22sec
>> Bt-job   9mins, 41sec
>> U-job    30sec
>>
>> I think I'd need a couple more machines to handle the twitter graph
>> though...
>>
>> --sebastian
>>
>>
>>> On Dec 17, 2011 6:00 AM, "Sebastian Schelter" <[email protected]>
>>> wrote:
>>>
>>>> Hi there,
>>>>
>>>> I played with Mahout to decompose the adjacency matrices of large graphs
>>>> lately. I stumbled on a paper of Christos Faloutsos that describes a
>>>> variation of the Lanczos algorithm they use for this on top of Hadoop.
>>>> They even explicitly mention Mahout:
>>>>
>>>> "Very recently(March 2010), the Mahout project [2] provides
>>>> SVD on top of HADOOP. Due to insufficient documentation, we were not
>>>> able to find the input format and run a head-to-head comparison. But,
>>>> reading the source code, we discovered that Mahout suffers from two
>>>> major issues: (a) it assumes that the vector (b, with n=O(billion)
>>>> entries) fits in the memory of a single machine, and (b) it implements
>>>> the full re-orthogonalization which is inefficient."
>>>>
>>>> http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
>>>>
>>>> --sebastian
>>>>
>>>
>>

Re: Decomposing large graphs

Reply via email to