Hi all, I would like to apply a regression to my data. One step of the workflow is to prepare my data as a JavaRDD<LabeledPoint>, starting from a Dataset<Row> that has a header. This is what I did:
== Step 1: transform the Dataset<Row> into a JavaRDD<Row>

    JavaRDD<Row> dataPointsWithHeader = modelDS.toJavaRDD();

== Step 2: take the first row (I assumed it was the header)

    Row header = dataPointsWithHeader.first();

== Step 3: eliminate the header row

    JavaRDD<Row> dataPointsWithoutHeader = dataPointsWithHeader.filter((Row row) -> {
        return !row.equals(header);
    });

The issues with the above approach are:
a) the result of Step 2 is not the header row;
b) Step 3 is very inefficient if there is a way to access the header directly, since it compares every row against the header.

My question is: is there an efficient way to access the header and eliminate it?

Many thanks in advance for your help and suggestions.

Regards,
Carlo
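
A note on the two issues above, offered as a sketch rather than a definitive answer. In a Dataset<Row> the column names normally live in the schema (for example `modelDS.columns()`), not in a data row, which may explain why `first()` does not return the header. If the header really is stored as the first data line (for example, a single text file loaded without header handling), one common pattern is to drop the first element of partition 0 only, instead of comparing every row. The plain-Java helper below illustrates that per-partition logic without any Spark dependency; the class name `HeaderSkipSketch`, the method `dropFirstOfPartitionZero`, and the sample lines are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class HeaderSkipSketch {

    // Hypothetical helper: discards the first element of partition 0 and
    // leaves every other partition untouched. No row comparison is needed.
    static <T> Iterator<T> dropFirstOfPartitionZero(int partitionIndex, Iterator<T> rows) {
        if (partitionIndex == 0 && rows.hasNext()) {
            rows.next(); // assumed to be the header line
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two made-up "partitions"; only the first one starts with a header.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("label,f1,f2", "1.0,0.5,0.3"),
                Arrays.asList("0.0,0.1,0.9"));

        for (int i = 0; i < partitions.size(); i++) {
            Iterator<String> kept = dropFirstOfPartitionZero(i, partitions.get(i).iterator());
            while (kept.hasNext()) {
                System.out.println(kept.next());
            }
        }
    }
}
```

In Spark, the same logic could be passed to `JavaRDD.mapPartitionsWithIndex`, for example `rdd.mapPartitionsWithIndex((idx, it) -> dropFirstOfPartitionZero(idx, it), false)`. This assumes the header ends up as the first element of partition 0, which should be checked for the actual data source.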