sql:   select inline(arrays_zip(col1, col2, col3)) as (c1, c2, c3) from t1


---- Replied Message ----
| From | Enrico Minack<i...@enrico.minack.dev> |
| Date | 02/16/2023 16:06 |
| To | <user@spark.apache.org>, sam smith<qustacksm2123...@gmail.com> |
| Subject | Re: How to explode array columns of a dataframe having the same length |
You have to take each row and zip the lists; each element of the result becomes 
one new row.


So write a method that turns

  Row(List("A","B","null"), List("C","D","null"), List("E","null","null"))

into
  List(List("A","C","E"), List("B","D","null"), List("null","null","null"))

and use flatMap with that method.
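
Since the question asked for Java, here is a minimal plain-Java sketch of just the zipping step, without any Spark dependency. The class name `ZipColumns` and the method `zipRow` are hypothetical names chosen for illustration, and the sketch assumes all lists in a row have the same length, as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ZipColumns {
    // Hypothetical helper: takes the lists of one input row (one list per column)
    // and returns one list per output row, pairing up elements by position.
    static List<List<String>> zipRow(List<List<String>> columns) {
        List<List<String>> rows = new ArrayList<>();
        if (columns.isEmpty()) {
            return rows;
        }
        int length = columns.get(0).size();
        // Assumption from the question: all arrays in a row have the same length.
        for (List<String> column : columns) {
            if (column.size() != length) {
                throw new IllegalArgumentException("all lists must have the same length");
            }
        }
        for (int i = 0; i < length; i++) {
            List<String> row = new ArrayList<>();
            for (List<String> column : columns) {
                row.add(column.get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> input = Arrays.asList(
                Arrays.asList("A", "B", "null"),
                Arrays.asList("C", "D", "null"),
                Arrays.asList("E", "null", "null"));
        // Prints [A, C, E], then [B, D, null], then [null, null, null].
        for (List<String> row : zipRow(input)) {
            System.out.println(row);
        }
    }
}
```

In a Spark job, a method like this would be the body of the function passed to flatMap, with each inner list converted to an output row.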



In Scala, this would read:


// Needs import spark.implicits._ in scope for the tuple encoder.
df.flatMap { row =>
  (row.getSeq[String](0), row.getSeq[String](1), row.getSeq[String](2)).zipped.toIterable
}.show()


Enrico




On 14.02.23 at 22:54, sam smith wrote:

Hello guys,


I have the following dataframe:


| col1 | col2 | col3 |
| ["A","B","null"] | ["C","D","null"] | ["E","null","null"] |

I want to explode it to the following dataframe:


| col1 | col2 | col3 |
| "A" | "C" | "E" |
| "B" | "D" | "null" |
| "null" | "null" | "null" |


How can I do that (preferably in Java) using the explode() method? Note that 
something like the following won't yield the correct output:


for (String colName: dataset.columns())
    dataset=dataset.withColumn(colName,explode(dataset.col(colName)));
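
The loop above fails because each explode() call multiplies the existing rows by the elements of the next array, so three arrays of length 3 produce 3 x 3 x 3 = 27 rows instead of 3. The following plain-Java sketch imitates that behaviour without Spark; the class `ExplodePerColumn` and the method `explodeEachColumn` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExplodePerColumn {
    // Plain-Java stand-in for exploding one array column after another:
    // each pass pairs every partial row built so far with every element
    // of the next column, which yields a cross product.
    static List<List<String>> explodeEachColumn(List<List<String>> columns) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(new ArrayList<>());
        for (List<String> column : columns) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> partial : rows) {
                for (String value : column) {
                    List<String> extended = new ArrayList<>(partial);
                    extended.add(value);
                    next.add(extended);
                }
            }
            rows = next;
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> input = Arrays.asList(
                Arrays.asList("A", "B", "null"),
                Arrays.asList("C", "D", "null"),
                Arrays.asList("E", "null", "null"));
        // Prints 27: every combination of elements, not the 3 position-aligned rows.
        System.out.println(explodeEachColumn(input).size());
    }
}
```

The result includes combinations such as ("A", "D", "E") that never appear in the desired output, which is why the arrays need to be zipped by position before any explode.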





