The examples in graphx/data are meant to show the input data format, but if you want to play around with larger and more interesting datasets, we've been using the following ones, among others:
- SNAP's web-Google dataset (5M edges): https://snap.stanford.edu/data/web-Google.html
- SNAP's soc-LiveJournal1 dataset (69M edges): https://snap.stanford.edu/data/soc-LiveJournal1.html

These come in edge list format and, after decompression, can be loaded directly using GraphLoader.

Ankur
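For anyone following along, loading one of these files might look roughly like the sketch below. It assumes the web-Google archive has been decompressed to a local file; the path `data/web-Google.txt`, the object name `LoadSnapGraph`, and the PageRank tolerance of 0.0001 are illustrative assumptions, not settings taken from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.GraphLoader

// Hypothetical driver object; the name is only for illustration.
object LoadSnapGraph {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("LoadSnapGraph")
      .setIfMissing("spark.master", "local[*]") // local mode unless a master is supplied
    val sc = new SparkContext(conf)

    // Assumed location of the decompressed edge list (one "srcId dstId" pair per line).
    // GraphLoader.edgeListFile skips lines that begin with '#', which covers the
    // comment header at the top of the SNAP files.
    val path = "data/web-Google.txt"
    val graph = GraphLoader.edgeListFile(sc, path)

    println(s"vertices: ${graph.vertices.count()}, edges: ${graph.edges.count()}")

    // Run PageRank until convergence; the tolerance is an arbitrary example value.
    val ranks = graph.pageRank(0.0001).vertices

    // Print the five highest-ranked vertex ids with their scores.
    ranks.sortBy(_._2, ascending = false).take(5).foreach(println)

    sc.stop()
  }
}
```

The same code should work for soc-LiveJournal1 by changing the path, though at 69M edges you will probably want more memory or a real cluster rather than local mode.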