Hi,

I am trying something like this:

    final Dataset<String> df = spark.read()
            .csv("src/main/resources/star2000.csv")
            .select("_c1")
            .as(Encoders.STRING());

    final Dataset<ArrayList> arrayListDataset = df.mapPartitions(
            new MapPartitionsFunction<String, ArrayList>() {
                @Override
                public Iterator<ArrayList> call(Iterator<String> iterator) throws Exception {
                    ArrayList<String> s = new ArrayList<>();
                    iterator.forEachRemaining(s::add);
                    return Iterators.singletonIterator(s);
                }
            }, Encoders.javaSerialization(ArrayList.class));

    JavaEsSparkSQL.saveToEs(arrayListDataset, "spark/docs");

Is there a better/more performant way of building arrayListDataset above?

Rohit
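To clarify what the mapPartitions function is doing: it just drains each partition's iterator into a single ArrayList and returns a one-element iterator. A minimal sketch of that step in isolation (plain Java, no Spark; `Collections.singletonList(...).iterator()` stands in for Guava's `Iterators.singletonIterator`) looks like:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

public class CollapsePartition {
    // Same logic as the MapPartitionsFunction above: collect every
    // element of the incoming iterator into one ArrayList, then
    // return an iterator over that single list.
    static Iterator<ArrayList<String>> call(Iterator<String> iterator) {
        ArrayList<String> s = new ArrayList<>();
        iterator.forEachRemaining(s::add);
        return Collections.singletonList(s).iterator();
    }

    public static void main(String[] args) {
        Iterator<ArrayList<String>> out =
                call(Arrays.asList("a", "b", "c").iterator());
        System.out.println(out.next()); // prints [a, b, c]
    }
}
```

So the dataset ends up with exactly one ArrayList row per partition.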