RE: Graphx hangs and crashes on EdgeRDD creation

2015-10-06 Thread William Saar
); graph.connectedComponents().vertices

From: Robin East [mailto:robin.e...@xense.co.uk]
Sent: 5 October 2015 19:07
To: William Saar <william.s...@king.com>; user@spark.apache.org
Subject: Re: Graphx hangs and crashes on EdgeRDD creation

Have you tried using Graph.partitionBy? e.g.

Graphx hangs and crashes on EdgeRDD creation

2015-10-05 Thread William Saar
Hi, I am trying to run a GraphX job on 20 million edges with Spark 1.5.1, but the job seems to hang for 30 minutes on a single executor when creating the graph, and it eventually crashes with "IllegalArgumentException: Size exceeds Integer.MAX_VALUE". I suspect this is because of partitioning
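
A rough sketch of how the partitioning suspicion could be checked, not the code from this thread: in Spark releases of this era, "Size exceeds Integer.MAX_VALUE" usually means a single partition or block has grown past 2 GB, so spreading the edges over more partitions before and after graph construction is a common first thing to try. The object name, the partition count of 200, and the three-edge stand-in data are all assumptions made for illustration; only Graph.fromEdges, Graph.partitionBy and connectedComponents are taken from the GraphX API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}
import org.apache.spark.rdd.RDD

// Hypothetical driver object, named only for this sketch.
object PartitionedComponents {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("PartitionedComponents")
      .setMaster("local[*]") // local master so the sketch runs standalone
    val sc = new SparkContext(conf)

    // Assumed value: pick it so that no single partition approaches 2 GB.
    val numPartitions = 200

    // Tiny stand-in edge list; the real job would load ~20 million edges.
    val edges: RDD[Edge[Int]] = sc.parallelize(
      Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(4L, 5L, 1)),
      numPartitions)

    // Build the graph from the already-spread edge RDD, then repartition
    // the edges with a 2D strategy across the same number of partitions.
    val graph = Graph
      .fromEdges(edges, 0)
      .partitionBy(PartitionStrategy.EdgePartition2D, numPartitions)

    // Each vertex is labelled with the smallest vertex id in its component.
    val components = graph.connectedComponents().vertices
    components.collect().sorted.foreach(println)

    sc.stop()
  }
}
```

If the job still stalls on a single executor with more partitions, the skew is probably in how the input is read rather than in GraphX itself, which would be worth checking in the Spark UI before tuning the partition strategy further.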