Hello,
I am trying to debug a PySpark program and quite frankly, I am stumped.
I see the following error in the logs. I verified the input parameters, and all
appear to be in order. The driver and executors also look healthy: only about
3 MB of the 7 GB available is in use on each node.
I do see that the DAG plan that I
Hi, I am training a GMM with 10 Gaussians on a 4 GB dataset (720,000 rows x
760 features). The Spark (1.3.1) job is allocated 120 executors with 6 GB
each, and the driver also has 6 GB.
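For what it's worth, the quoted sizes are mutually consistent. Assuming the 760 features are stored as 8-byte doubles (an assumption; the post doesn't state the storage format), the raw data works out to roughly 4 GB:

```python
# Sanity check of the dataset size quoted above.
# Assumes each of the 760 features is an 8-byte double (not stated in the post).
rows, cols, bytes_per_double = 720_000, 760, 8
total_bytes = rows * cols * bytes_per_double
print(total_bytes)          # 4377600000 bytes
print(total_bytes / 2**30)  # ~4.08 GiB, matching the "4 GB" figure
```

Note that Spark's in-memory representation (deserialized objects, per-row overhead) can be several times larger than the raw size, which matters when sizing the 6 GB executors.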
Spark config params:

.set("spark.hadoop.validateOutputSpecs", "false")
.set("spark.dynamicAllocation.enabled", "false")
.set("spark.
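For reference, a minimal sketch of the setup described above, using the Spark 1.3.x MLlib API. The app name, input path, and record format are placeholders (the post doesn't give them), and only the two config keys visible in the post are reproduced; the final `.set()` call was cut off and is left out rather than guessed at:

```python
# Hedged sketch of the GMM setup described above (Spark 1.3.x MLlib API).
# App name, path, and parsing are assumptions; only the two config keys
# shown in the post are reproduced.
from pyspark import SparkConf, SparkContext
from pyspark.mllib.clustering import GaussianMixture

conf = (SparkConf()
        .setAppName("gmm-training")  # placeholder name
        .set("spark.hadoop.validateOutputSpecs", "false")
        .set("spark.dynamicAllocation.enabled", "false"))
sc = SparkContext(conf=conf)

# Each record: 760 whitespace-separated feature values (format assumed).
data = (sc.textFile("hdfs:///path/to/features")  # placeholder path
          .map(lambda line: [float(x) for x in line.split()]))

# 10 Gaussians, as in the post; other parameters left at their defaults.
model = GaussianMixture.train(data, k=10)
print(model.weights)
</ignore>
```

This is a setup sketch only; running it requires a Spark 1.3.x cluster with the resources described in the post.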