Hi Jae,

Thanks so much for pointing out that it wasn't directed. I made the changes
and made a directed graph, and connected components now works :)
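
For anyone who finds this thread later, here is a minimal sketch of this kind
of conversion. It is an illustration under assumptions, not the exact program
used: the function name edges_to_adjacency_lines is hypothetical, the input is
assumed to hold one "src dst" pair per line with "#" header comments, and
fields are separated by spaces as in the snippets quoted below. Each edge is
added in both directions so that every vertex lists all of its neighbours, and
each vertex value is initialised to the vertex id.

```python
from collections import defaultdict


def edges_to_adjacency_lines(edge_lines):
    """Turn 'src dst' edge lines into 'id value neighbour...' lines.

    Every edge is recorded in both directions, so the result is symmetric.
    Blank lines and lines starting with '#' (assumed header comments)
    are skipped.
    """
    neighbours = defaultdict(set)
    for line in edge_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = (int(tok) for tok in line.split()[:2])
        neighbours[src].add(dst)
        neighbours[dst].add(src)  # reverse direction of the same edge
    # The vertex value is set to the vertex id, as in the quoted dataset.
    return [
        " ".join(str(x) for x in [v, v, *sorted(neighbours[v])])
        for v in sorted(neighbours)
    ]


if __name__ == "__main__":
    # Edge list for the small 4-vertex test graph from the original message.
    sample = ["# FromNodeId ToNodeId", "0 1", "1 2", "1 3", "2 3"]
    for out in edges_to_adjacency_lines(sample):
        print(out)
```

On the four-edge sample this reproduces the small test dataset quoted below.
Because every neighbour is also written out as a vertex with its own line, the
check Jae suggested (does each neighbour id actually exist?) holds by
construction. The sketch keeps the whole graph in memory, which may or may not
suit a file of this size.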

Thanks,
Ghufran


On Wed, Apr 16, 2014 at 7:31 PM, Yu, Jaewook <[email protected]> wrote:

>  Ghufran,
>
>
>
> The Youtube community dataset
> (com-youtube.ungraph.txt.gz<https://snap.stanford.edu/data/bigdata/communities/com-youtube.ungraph.txt.gz>)
> [1] is formatted as a directed graph although the description says it’s
> an undirected graph. With some minor changes in your conversion program, you
> should be able to generate a proper undirected adjacency list.
>
>
>
> Hope this will help.
>
>
>
> Thanks,
>
> Jae
>
>
>
> [1] https://snap.stanford.edu/data/com-Youtube.html
>
>
>
> *From:* Yu, Jaewook [mailto:[email protected]]
> *Sent:* Wednesday, April 16, 2014 11:00 AM
> *To:* [email protected]
> *Subject:* RE: Running ConnectedComponents in a cluster.
>
>
>
> Hi Ghufran,
>
>
>
> Have you verified the neighbors of each vertex actually exist? From your
> adjacency list, for example, 278447 278447 532613, is the neighbor’s vertex
> id 532613 valid?
>
>
>
> Thanks,
>
> Jae
>
>
>
>
>
> *From:* ghufran malik 
> [mailto:[email protected]<[email protected]>]
>
> *Sent:* Wednesday, April 16, 2014 9:22 AM
> *To:* [email protected]
> *Subject:* Running ConnectedComponents in a cluster.
>
>
>
> Hi,
>
> I have set up Giraph on my university cluster of computers (Giraph
> 1.1.0-SNAPSHOT-for-hadoop-2.0.0-cdh4.3.1). I've successfully run the
> connected components algorithm on a very small test dataset using 4 workers
> and it produced the expected output.
>
>
> dataset:
>
> vertex id, vertex value, neighbours....
>
> 0 0 1
> 1 1 0 2 3
> 2 2 1 3
> 3 3 1 2
>
> output:
> 1    0
> 0    0
> 3    0
> 2    0
>
>
>
> However, when I tried to run this algorithm on a larger dataset
> (a reformatted version of com-youtube.ungraph from Stanford SNAP to match the
> IntIntNullTextVertexInputFormat), it successfully completes but incorrect
> output is produced. It seems to just output each vertex id with its original
> value (I set each vertex's original value to its vertex id).
>
> A snippet of the dataset is provided:
>
> vertex id, vertex value, neighbours....
> .......
> 278447 278447 532613
> 278449 278449 305447 324115 414238
> 83899 83899 153460 172614 176613 211448
> 773749 773749 845366
> 773748 773748 960388
> .......
>
> output produced:
> .............
> 73132    73132
> 831308    831308
> 199788    199788
> 763644    763644
> 300572    300572
> .............
>
> There's not one vertex value that isn't the same as its original vertex
> ID.
>
> The computation also stops after superstep 0 is done and goes no further,
> whereas on my smaller dataset it completes 3 supersteps.
>
> Does anyone have an idea as to why this is?
>
> Kind regards,
>
> Ghufran
>
>
>
