Re: sample data for pagerank?

2014-03-18 Thread ankurdave
The examples in graphx/data are meant to show the input data format, but if
you want to play around with larger and more interesting datasets, we've
been using the following ones, among others:

- SNAP's web-Google dataset (5M edges):
https://snap.stanford.edu/data/web-Google.html
- SNAP's soc-LiveJournal1 dataset (69M edges):
https://snap.stanford.edu/data/soc-LiveJournal1.html

These come in edge list format and, after decompression, can be loaded
directly using GraphLoader.edgeListFile.
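
For anyone unfamiliar with the edge list format, here is a minimal sketch in plain Python (not GraphX, and not how GraphLoader is implemented). It assumes the SNAP layout: comment lines starting with '#', and otherwise one "source destination" pair of integer vertex IDs per line. The function names, the four-edge sample, and the textbook power-iteration PageRank (ranks normalized to sum to 1, damping 0.85) are all illustrative assumptions; GraphX's own PageRank scales its ranks differently.

```python
def parse_edge_list(lines):
    """Parse SNAP-style edge list lines into (src, dst) integer pairs.

    Lines starting with '#' are treated as comments and skipped.
    """
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank; ranks sum to 1 (illustrative only)."""
    nodes = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices with no out-links is spread evenly over all vertices.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical four-edge sample in the same layout as the SNAP files.
sample = """# FromNodeId ToNodeId
0 1
0 2
1 2
2 0
"""

edges = parse_edge_list(sample.splitlines())
ranks = pagerank(edges)
for vertex in sorted(ranks, key=ranks.get, reverse=True):
    print(vertex, round(ranks[vertex], 4))
```

This is only practical for toy inputs; the real datasets above are the reason to load them through GraphLoader on Spark instead.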

Ankur





Re: sample data for pagerank?

2014-03-13 Thread Mo
You can find it here:
https://github.com/apache/incubator-spark/tree/master/graphx/data


On Thu, Mar 13, 2014 at 10:13 AM, Diana Carroll <dcarr...@cloudera.com> wrote:

 I'd like to play around with the PageRank example included with Spark, but
 I can't find any sample data included to work with.  Am I missing
 it?  Does anyone have a sample file to share?

  Thanks,
 Diana