Re: Write custom JSON from DataFrame in PySpark

2023-05-04 Thread Marco Costantini
Hi Enrico,

What a great answer. Thank you. Seems like I need to get comfortable with the 'struct' and then I will be golden.

Thank you again, friend.

Marco.

On Thu, May 4, 2023 at 3:00 AM Enrico Minack wrote:
> Hi,
>
> You could rearrange the DataFrame so that writing the DataFrame as-is
> prod

Re: Write custom JSON from DataFrame in PySpark

2023-05-04 Thread Enrico Minack
Hi,

You could rearrange the DataFrame so that writing the DataFrame as-is produces your structure:

df = spark.createDataFrame([(1, "a1"), (2, "a2"), (3, "a3")], "id int, datA string")

+---+----+
| id|datA|
+---+----+
|  1|  a1|
|  2|  a2|
|  3|  a3|
+---+----+

df2 = df.select(df.id, struct(df.d

Write custom JSON from DataFrame in PySpark

2023-05-03 Thread Marco Costantini
Hello,

Let's say I have a very simple DataFrame, as below.

+---+----+
| id|datA|
+---+----+
|  1|  a1|
|  2|  a2|
|  3|  a3|
+---+----+

Let's say I have a requirement to write this to a bizarre JSON structure. For example:

{
  "id": 1,
  "stuff": {
    "datA": "a1"
  }
}

How can I achieve this