Re: Spark sql and csv data processing question

2015-05-16 Thread Don Drake
Your parentheses don't look right, as you're embedding the filter inside the Row.fromSeq() call. Try this:

    val trainRDD = rawTrainData
      .filter(!_.isEmpty)
      .map(rawRow => Row.fromSeq(rawRow.split(",")))
      .filter(_.length == 15)
      .map(_.toString).map(_.trim)

-Don

On Fri,
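The shape of that chain can be tried without a cluster. The following is a minimal sketch on plain Scala collections rather than an RDD, so it needs no Spark dependency. The object name `CsvFilterSketch`, the method `cleanRows`, and the sample lines are hypothetical and not from the thread. It keeps each step as its own call so that no filter ends up nested inside the row construction.

```scala
// Hypothetical helper, not from the thread: mirrors the filter/map chain
// on an ordinary Seq[String] instead of an RDD[String].
object CsvFilterSketch {

  // Drops empty lines, splits on commas, trims each field, and keeps
  // only rows with exactly `expectedFields` columns.
  def cleanRows(lines: Seq[String], expectedFields: Int): Seq[List[String]] =
    lines
      .filter(_.nonEmpty)                          // skip blank lines first
      .map(_.split(",", -1).map(_.trim).toList)    // -1 keeps trailing empty fields
      .filter(_.length == expectedFields)          // discard malformed rows

  def main(args: Array[String]): Unit = {
    // Made-up sample input: one good row, one blank line, one short row.
    val sample = Seq("a, b ,c", "", "x,y")
    cleanRows(sample, 3).foreach(row => println(row.mkString("|")))
  }
}
```

One design choice here is an assumption on my part: `split(",", -1)` is used instead of `split(",")` because the single-argument form drops trailing empty strings, so a line ending in empty columns would otherwise come out shorter than expected and be filtered away. Whether that matters depends on the data in question.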

Spark sql and csv data processing question

2015-05-15 Thread Mike Frampton
Hi, I'm getting the following error when trying to process a CSV-based data file. Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 10.0 failed 4 times, most recent failure: Lost task 1.3 in stage 10.0 (TID 262,