MatrixFactorizationModel serialization

2014-11-07 Thread Dariusz Kobylarz
I am trying to persist a MatrixFactorizationModel (from the Collaborative Filtering example) and use it in another script to evaluate/apply it. This is the exception I get when I try to use a deserialized model instance: Exception in thread "main" java.lang.NullPointerException at

MLlib - Naive Bayes Java example bug

2014-11-03 Thread Dariusz Kobylarz
Hi, I noticed a bug in the sample Java code on the MLlib - Naive Bayes docs page: http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html In the filter: double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() { @Override public Boolean

saveAsHadoopFile into avro format

2014-09-08 Thread Dariusz Kobylarz
What is the right way to save a PairRDD in Avro output format? GraphArray extends SpecificRecord etc. I have the following Java RDD: JavaPairRDD<GraphArray, NullWritable> pairRDD = ... and want to save it to Avro format: org.apache.hadoop.mapred.JobConf jc = new