You have to take each row and zip the lists; each element of the result
then becomes one new row.
So write a method that turns
Row(List("A","B","null"), List("C","D","null"), List("E","null","null"))
into
List(List("A","C","E"), List("B","D","null"), List("null","null","null"))
and use flatMap with that method.
In Scala, this would read:
    // needs import spark.implicits._ in scope (automatic in spark-shell)
    df.flatMap { row =>
      (row.getSeq[String](0), row.getSeq[String](1), row.getSeq[String](2)).zipped.toIterable
    }.show()
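Since Java was preferred, the zipping step on its own could look roughly like the sketch below. It is plain Java without Spark, the names ZipRows and zipRow are made up for illustration, and it assumes all lists in a row have the same length.

```java
import java.util.ArrayList;
import java.util.List;

class ZipRows {
    // Hypothetical helper: takes the lists of one input row and returns the
    // output rows, where output row i holds the i-th element of every list.
    static List<List<String>> zipRow(List<List<String>> columns) {
        List<List<String>> rows = new ArrayList<>();
        if (columns.isEmpty()) {
            return rows;
        }
        // Assumption: every list is as long as the first one.
        int length = columns.get(0).size();
        for (int i = 0; i < length; i++) {
            List<String> row = new ArrayList<>();
            for (List<String> column : columns) {
                row.add(column.get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> input = List.of(
            List.of("A", "B", "null"),
            List.of("C", "D", "null"),
            List.of("E", "null", "null"));
        // Prints one zipped row per line.
        for (List<String> row : zipRow(input)) {
            System.out.println(row);
        }
    }
}
```

With Spark's Java API, a helper like this would presumably be called from inside Dataset.flatMap, reading the three lists with row.getList(i) and passing an explicit Encoder for the result type.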
Enrico
On 14.02.23 at 22:54, sam smith wrote:
Hello guys,
I have the following dataframe:
col1              col2              col3
["A","B","null"]  ["C","D","null"]  ["E","null","null"]
I want to explode it to the following dataframe:
col1    col2    col3
"A"     "C"     "E"
"B"     "D"     "null"
"null"  "null"  "null"
How can I do that (preferably in Java) using the explode() method?
Note that something like the following won't yield the correct output:
    for (String colName : dataset.columns())
        dataset = dataset.withColumn(colName, explode(dataset.col(colName)));