I didn't realize I do get a nice stack trace if not running in debug mode.
Basically, I believe Document has to be serializable.
But since the question has already been asked: are there other requirements for
objects within an RDD that I should be aware of? Serializable is very
understandable, but what about clone, hashCode, etc.?
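To make the Serializable point concrete, here is a minimal sketch of what the Document class might look like. The real class isn't shown in the thread, so the field names and constructor here are assumptions; the essential part is implementing java.io.Serializable, since collect() has to deserialize the objects back on the driver.

```java
import java.io.Serializable;
import java.util.List;

// Hypothetical sketch of the Document class from the thread.
// Implementing Serializable lets Spark ship instances between the
// driver and executors; without it, collect() fails at serialization.
public class Document implements Serializable {
    private static final long serialVersionUID = 1L; // keep stable across class changes

    private final String id;            // assumed field
    private final List<String> features; // assumed field backing getFeatures()

    public Document(String id, List<String> features) {
        this.id = id;
        this.features = features;
    }

    public String getId() {
        return id;
    }

    public List<String> getFeatures() {
        return features;
    }
}
```

clone and hashCode are not required for serialization itself; hashCode/equals only start to matter for key-based operations such as distinct() or groupByKey().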
From: ronalday...@live.com
To: user@spark.apache.org
Subject: collecting fails - requirements for collecting (clone, hashCode etc?)
Date: Wed, 3 Dec 2014 07:48:53 -0600
The following code is failing on the collect. If I don't collect and just keep
it as a JavaRDD, it works fine, except I really would like to collect.
At first I was getting an error regarding JDI threads and an index being 0.
Then it just started locking up. I'm running the Spark context locally on 8
cores.
long count = documents
    .filter(d -> d.getFeatures().size() > Parameters.MIN_CENTROID_FEATURES)
    .count();
List<Document> sampledDocuments = documents
    .filter(d -> d.getFeatures().size() > Parameters.MIN_CENTROID_FEATURES)
    .sample(false, samplingFraction(count))
    .collect();