Hello~
I have been running some PageRank tests with GraphX on my 8-node cluster. I
allocated 32 GB of memory and 8 CPU cores to each worker. The LiveJournal
dataset took 370 s, which seems reasonable to me. But when I tried the
com-Friendster data ( http://snap.stanford.edu/data/com-Friendster.html ),
with 65608366 nodes and 1806067135 edges, the job has been running for more
than 70 hours and still has not finished. I'm not sure what causes this: is
it the graph's structure, or some property of GraphX that I'm not aware of?
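
For a sense of scale, here is a rough back-of-envelope sketch in Python using only the figures quoted above. The 16 bytes per edge is an assumption (two 64-bit vertex IDs, ignoring edge attributes, JVM object overhead, and anything GraphX replicates or caches), so the result is at best a lower bound and not a measurement of what the job actually uses.

```python
# Figures quoted in the post.
NUM_VERTICES = 65_608_366
NUM_EDGES = 1_806_067_135
NUM_WORKERS = 8
MEM_PER_WORKER_GB = 32

# Assumption (not from the post): every edge needs at least two
# 64-bit vertex IDs, i.e. 16 bytes, before any runtime overhead.
BYTES_PER_EDGE = 16


def raw_edge_bytes(num_edges, bytes_per_edge=BYTES_PER_EDGE):
    """Lower bound on the bytes needed just to hold the edge list."""
    return num_edges * bytes_per_edge


def edges_per_vertex(num_edges, num_vertices):
    """Average number of edges per vertex in the dataset."""
    return num_edges / num_vertices


total_mem_gb = NUM_WORKERS * MEM_PER_WORKER_GB
edge_list_gb = raw_edge_bytes(NUM_EDGES) / 1e9

print(f"cluster memory:        {total_mem_gb} GB")
print(f"raw edge list (>=):    {edge_list_gb:.1f} GB")
print(f"edges per vertex:      {edges_per_vertex(NUM_EDGES, NUM_VERTICES):.1f}")
```

The raw edge list alone is a noticeable fraction of the cluster's total memory under this assumption, and the real in-memory footprint is likely to be several times larger.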
Thanks~
 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Long-running-time-for-GraphX-pagerank-in-dataset-com-Friendster-tp4511.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
