I'm trying to generate multiple rows from a single row. My input has the schema

    Name Id Date 0100 0200 0300 0400

and I would like to turn it into a vertical format with the schema

    Name Id Date Time

I have the code below, and it fails with this error:

    Caused by: java.lang.RuntimeException: org.apache.spark.sql.catalyst.expressions.GenericRow is not a valid external type for schema of string
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)

The code:

    StructType schemata = DataTypes.createStructType(
        new StructField[]{
            DataTypes.createStructField("Name", DataTypes.StringType, false),
            DataTypes.createStructField("Id", DataTypes.StringType, false),
            DataTypes.createStructField("Date", DataTypes.StringType, false),
            DataTypes.createStructField("Time", DataTypes.StringType, false)
        }
    );

    ExpressionEncoder<Row> encoder = RowEncoder.apply(schemata);

    Dataset<Row> modifiedRDD = intervalDF.flatMap(new FlatMapFunction<Row, Row>() {
        @Override
        public Iterator<Row> call(Row row) throws Exception {
            List<Row> rowList = new ArrayList<Row>();
            String[] timeList = {"0100", "0200", "0300", "0400"};
            for (String time : timeList) {
                Row r1 = RowFactory.create(
                    row.<String>getAs("sdp_id"),
                    "WGL",
                    row.<String>getAs("Name"),
                    row.<String>getAs("Id"),
                    row.<String>getAs("Date"),
                    timeList[0],
                    row.<String>getAs(timeList[0]));
                // updated row by creating new Row
                rowList.add(RowFactory.create(r1));
            }
            return rowList.iterator();
        }
    }, encoder);

    modifiedRDD.write().csv("file:///Users/mod/out");
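
For reference, the intended wide-to-long reshaping can be sketched in plain Java, without Spark. This is only an illustration under stated assumptions: the class name `UnpivotSketch`, the method `unpivot`, and the use of a `Map` to stand in for a row are all hypothetical, and it assumes the `Time` field should hold the time-column label (the post does not say whether it should be the label or the value stored in that column). The point it shows is that each output row carries exactly four plain string values, one per field of the four-field target schema. In the code above, by contrast, `RowFactory.create(r1)` wraps an already-built `Row` inside a new `Row`, so the first field is a `GenericRow` rather than a string, which appears to be what the error message is reporting; `r1` also holds seven values while the schema declares four.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Spark-free sketch of the wide-to-long reshaping.
public class UnpivotSketch {

    // Time columns named in the original schema.
    static final String[] TIME_COLUMNS = {"0100", "0200", "0300", "0400"};

    // Produces one output row per time column. Each output row has exactly
    // four values, in the order Name, Id, Date, Time.
    // Assumption: Time holds the column label, not the value in that column.
    static List<String[]> unpivot(Map<String, String> wideRow) {
        List<String[]> longRows = new ArrayList<>();
        for (String time : TIME_COLUMNS) {
            // Use the loop variable so every time column yields its own row.
            longRows.add(new String[] {
                wideRow.get("Name"),
                wideRow.get("Id"),
                wideRow.get("Date"),
                time
            });
        }
        return longRows;
    }

    public static void main(String[] args) {
        // Made-up sample row, for illustration only.
        Map<String, String> wideRow = new LinkedHashMap<>();
        wideRow.put("Name", "meter-a");
        wideRow.put("Id", "42");
        wideRow.put("Date", "2018-01-01");
        wideRow.put("0100", "1.0");
        wideRow.put("0200", "2.0");
        wideRow.put("0300", "3.0");
        wideRow.put("0400", "4.0");

        for (String[] longRow : unpivot(wideRow)) {
            System.out.println(String.join(",", longRow));
        }
    }
}
```

In the Spark version, the analogous change would be to build each output `Row` directly from four string values and add that row to the list, rather than passing an existing `Row` to `RowFactory.create` again.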