Spark 1.4 MLLib Bug?: Multiclass Classification requirement failed: sizeInBytes was negative

2015-07-03 Thread Danny Linden
Hi, I want to run a multiclass classification with 390 classes on 120k labeled points (tf-idf vectors), but I get the following exception. If I reduce the number of classes to ~20, everything works fine. How can I fix this? I use the LogisticRegressionWithLBFGS class for my classification on a 8
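
A rough back-of-the-envelope sketch of why the class count could matter here. Multinomial logistic regression in MLlib keeps one coefficient vector per class beyond the first, so the dense coefficient array grows as (numClasses - 1) × numFeatures. The feature dimension is not stated in the post; the value 2^20 below is an assumption (it is the default of MLlib's HashingTF), and the helper names are hypothetical. Whether this is the actual cause of the "sizeInBytes was negative" failure is not confirmed by the thread; the sketch only shows that, under this assumption, the byte size no longer fits into a signed 32-bit integer.

```python
INT32_MAX = 2**31 - 1


def coefficient_count(num_classes, num_features):
    """Number of doubles in a dense (num_classes - 1) x num_features matrix."""
    return (num_classes - 1) * num_features


def dense_bytes(count, bytes_per_value=8):
    """Size in bytes of `count` 64-bit floating point values."""
    return count * bytes_per_value


def as_int32(value):
    """Wrap a Python int the way a signed 32-bit integer would overflow."""
    return (value + 2**31) % 2**32 - 2**31


# Assumption: 2**20 features (HashingTF default); the post does not say.
num_features = 2**20

for num_classes in (20, 390):
    n = coefficient_count(num_classes, num_features)
    size = dense_bytes(n)
    print(
        f"classes={num_classes} coefficients={n} bytes={size} "
        f"fits_int32={size <= INT32_MAX} wrapped={as_int32(size)}"
    )
```

Under the same assumption, a smaller feature space (for example, passing a smaller numFeatures to HashingTF) would shrink the array proportionally; whether that is acceptable depends on the data.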

Re: which mllib algorithm for large multi-class classification?

2015-06-24 Thread Danny Linden
Hi, here is the stack trace, thanks for any help: 15/06/24 23:15:26 INFO DAGScheduler: Submitting ShuffleMapStage 19 (MapPartitionsRDD[49] at treeAggregate at LBFGS.scala:218), which has no missing parents [error] (dag-scheduler-event-loop) java.lang.OutOfMemoryError: Requested array size exceeds

New Spark Meetup group in Munich

2015-06-22 Thread Danny Linden
in special topics about Spark. It would be nice if someone could add our meetup group to the Spark website (http://spark.apache.org/community.html) :) You can find us here: http://www.meetup.com/de/Spark-Munich/ Thanks, Danny Linden