A graph is just vertices and edges. What else are you expecting to save/load? You
could save/load the triplets, but reconstructing the graph from triplets is
actually more work than saving the vertices and edges separately.
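A minimal sketch of that point, in plain Python rather than GraphX (the data shapes here are illustrative assumptions, not GraphX's actual types): vertices and edges round-trip as-is, while triplets duplicate vertex attributes and so must be deduplicated on load.

```python
def rebuild_from_triplets(triplets):
    # triplets: [((src_id, src_attr), edge_attr, (dst_id, dst_attr))]
    # Vertex attributes are repeated across triplets, so recovering the
    # vertex set means deduplicating them -- the extra work mentioned above.
    vertices, edges = {}, []
    for (src_id, src_attr), edge_attr, (dst_id, dst_attr) in triplets:
        vertices[src_id] = src_attr
        vertices[dst_id] = dst_attr
        edges.append((src_id, dst_id, edge_attr))
    return vertices, edges

# Saving vertices and edges separately needs no reconstruction at all:
vertices = {1: "a", 2: "b", 3: "c"}
edges = [(1, 2, "x"), (2, 3, "y"), (1, 3, "z")]

# Round-tripping via triplets instead requires the rebuild step:
triplets = [((s, vertices[s]), attr, (d, vertices[d])) for s, d, attr in edges]
v2, e2 = rebuild_from_triplets(triplets)
assert v2 == vertices and e2 == edges
```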
Dave
From: Gaurav Kumar [mailto:gauravkuma...@gmail.com]
Sent: Friday, November 13, 2015
I have verified that this error exists on my system as well, and the suggested
workaround also works.
Spark version: 1.5.1; 1.5.2
Mesos version: 0.21.1
CDH version: 4.7
I have set up the spark-env.sh to contain HADOOP_CONF_DIR pointing to the
correct place, and I have also linked in the hdfs-si
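For reference, the spark-env.sh fragment for this looks roughly like the following; the CDH-style path is only an example, and it should point at whichever directory actually holds core-site.xml and hdfs-site.xml:

```shell
# conf/spark-env.sh -- illustrative path; adjust to your Hadoop install
export HADOOP_CONF_DIR=/etc/hadoop/conf
```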
It would be helpful if the AMP Lab or Databricks maintained a set of benchmarks
on the web showing how much each successive version of Spark improved.
Dave
From: Madabhattula Rajesh Kumar [mailto:mrajaf...@gmail.com]
Sent: Monday, January 12, 2015 9:24 PM
To: Buttler, David
Subject: Re: GraphX vs
Hi,
I am building a graph from a large CSV file. Each record contains a couple of
nodes and about 10 edges. When I load a large portion of the graph using
multiple partitions, I get inconsistent edge counts between runs. However, if I
use a single partitio
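As a hypothetical sketch of the setup (the record layout here, one source node, one destination node, then edge labels, is an assumption, not the poster's actual schema), one cheap sanity check for the inconsistent-count symptom is comparing the raw edge count against the distinct edge count, since duplicate edges would make the total sensitive to partitioning:

```python
import csv
import io

def parse_records(text):
    # Assumed layout per row: src, dst, label1, label2, ...
    vertices, edges = set(), []
    for row in csv.reader(io.StringIO(text)):
        src, dst, *labels = row
        vertices.update((src, dst))
        edges.extend((src, dst, label) for label in labels)
    return vertices, edges

vertices, edges = parse_records("a,b,e1,e2\nb,c,e1\n")
# If these differ, duplicates exist and counts can vary with partitioning.
assert len(edges) == len(set(edges))
```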
@spark.apache.org
Cc: user@spark.apache.org
Subject: Re: K-means with large K
David,
Just curious to know what kind of use cases demand such a large number of clusters
Chester
Sent from my iPhone
On Apr 28, 2014, at 9:19 AM, "Buttler, David"
<mailto:buttl...@llnl.gov> wrote:
Hi,
I am trying to run the K-means code in mllib, and it works very nicely with
small K (less than 1000). However, when I try for a larger K (I am looking for
2000-4000 clusters), it seems like the code gets part way through (perhaps just
the initialization step) and freezes. The compute nodes
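For intuition about why initialization is the likely culprit, here is a toy Lloyd's k-means in plain Python (not MLlib; the function names and 1-D data are purely illustrative). The assignment step alone costs O(n·k) per iteration, and, as I understand it, MLlib's default k-means|| seeding adds several extra passes over the data on top of that, so a large k tends to stall there first; plain random seeding like the sketch below is a single cheap sample.

```python
import random

def kmeans(points, k, iters=10, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random init: O(k), no passes over data
    for _ in range(iters):
        # Assignment step: O(n * k) distance computations per iteration.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[i].append(p)
        # Update step: recompute each center; keep old center if cluster empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers
```

With two well-separated 1-D clusters, e.g. `kmeans([0.0, 0.1, 10.0, 10.1], k=2)`, the centers settle near 0.05 and 10.05 within a few iterations.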
This sounds like a configuration issue: either you have not set the MASTER
correctly, or possibly another process is using up all of the cores.
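For a standalone cluster of that era, the master was typically supplied when launching the shell, along the lines below (the host and port are placeholders; a master left at "local" or a cluster with no free cores both look like a hang):

```shell
# Placeholder host/port -- use the URL shown on your master's web UI
MASTER=spark://master-host:7077 ./bin/spark-shell
```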
Dave
From: ge ko [mailto:koenig@gmail.com]
Sent: Sunday, April 13, 2014 12:51 PM
To: user@spark.apache.org
Subject:
Hi,
I'm still going to start w