The examples in graphx/data are meant to show the input data format, but if
you want to play around with larger and more interesting datasets, we've
been using the following ones, among others:
- SNAP's web-Google dataset (5M edges):
https://snap.stanford.edu/data/web-Google.html
- SNAP's soc-LiveJournal1 dataset (69M edges):
https://snap.stanford.edu/data/soc-LiveJournal1.html
These come in edge list format and, after decompression, can directly be
loaded using GraphLoader.
Ankur
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/sample-data-for-pagerank-tp2655p2839.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.