Re: Explode/Flatten Map type Data Using Pyspark
Hi Anbutech,

In that case you will have a variable number of columns in the output DataFrame, and therefore in the CSV, so a single CSV will not be the best format to read back.

On Fri, 15 Nov 2019 at 2:30 pm, anbutech wrote:
> Hello Guha,
>
> The number of keys will be different for each event id. For example, if
> event id 005 has 10 keys, then I have to flatten all 10 of those keys in
> the final output. There is no fixed number of keys per event id:
>
> 001 -> 2 keys
> 002 -> 4 keys
> 003 -> 5 keys
>
> Each event id has a different combination of keys, so I want to
> dynamically flatten the incoming data and write all the flattened keys to
> the output CSV file on S3.
>
> flatten.csv
>
> eve_id  k1   k2  k3
> 001     abc  x   y
>
> eve_id  k1  k2    k3    k4
> 002     12  jack  0.01  0998
>
> eve_id  k1   k2  k3      k4        k5
> 003     aaa      device  endpoint  -

--
Best Regards,
Ayan Guha
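One way around the ragged-schema problem Ayan raises (a sketch, not from the thread; df and eve_data are assumed from the original post, and the S3 path is a placeholder): keep the output in long key/value form, so every row has the same three columns no matter how many keys an event carries.

    from pyspark.sql import functions as F

    # Long format: one row per (eve_id, key). The CSV schema is fixed at
    # three columns regardless of how many keys each event has.
    long_df = df.select("eve_id", F.explode("eve_data").alias("key", "value"))
    long_df.write.mode("overwrite").csv("s3://bucket/flatten_long")  # placeholder path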
Re: Explode/Flatten Map type Data Using Pyspark
Hello Guha,

The number of keys will be different for each event id. For example, if event id 005 has 10 keys, then I have to flatten all 10 of those keys in the final output. There is no fixed number of keys per event id:

001 -> 2 keys
002 -> 4 keys
003 -> 5 keys

Each event id has a different combination of keys, so I want to dynamically flatten the incoming data and write all the flattened keys to the output CSV file on S3.

flatten.csv

eve_id  k1   k2  k3
001     abc  x   y

eve_id  k1  k2    k3    k4
002     12  jack  0.01  0998

eve_id  k1   k2  k3      k4        k5
003     aaa      device  endpoint  -
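A sketch of the dynamic flattening described above (assuming df and eve_data from the original post; the S3 path is a placeholder): collect the distinct keys first, then select one column per key, so an event that lacks a key simply gets null in that column.

    from pyspark.sql import functions as F

    # Collect every key that appears in any eve_data map. This assumes the
    # overall key set is small enough to bring to the driver.
    keys = sorted(
        row["k"]
        for row in df.select(F.explode(F.map_keys("eve_data")).alias("k"))
                     .distinct()
                     .collect()
    )

    # One wide frame covering all keys; missing keys become null.
    wide = df.select("eve_id", *[F.col("eve_data")[k].alias(k) for k in keys])
    wide.write.mode("overwrite").csv("s3://bucket/flatten")  # placeholder path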
Re: Explode/Flatten Map type Data Using Pyspark
Hi,

How do you want your final DataFrame to look? Is it with all 5 value columns? Do you have a finite set of columns?

On Fri, Nov 15, 2019 at 4:50 AM anbutech wrote:
> Hello Sir,
>
> I have a scenario where I need to flatten different combinations of map
> type (key/value) data in a column called eve_data, as below. How do we
> flatten the map type into proper columns using PySpark?
>
> 1) Source DataFrame with 2 columns (eve_id, eve_data):
>
> eve_id, eve_data
> 001,    "k1":"abc", "k2":"xyz", "k3":"10091"
>
> eve_id, eve_data
> 002,    "k1":"12", "k2":"jack", "k3":"0.01", "k4":"0998"
>
> eve_id, eve_data
> 003,    "k1":"aaa", "k2":"", "k3":"device", "k4":"endpoint", "k5":"-"
>
> Final output (flatten each event id's key values). The number of key
> values differs per event id, so I want to flatten the records for all the
> map-type key values as below:
>
> eve_id  k1   k2   k3
> 001     abc  xyz  10091
>
> eve_id  k1  k2    k3    k4
> 002     12  jack  0.01  0998
>
> eve_id  k1   k2  k3      k4        k5
> 003     aaa      device  endpoint  -
>
> Thanks,
> Anbu

--
Best Regards,
Ayan Guha
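If the answer to both questions is yes, the select can be static; a minimal sketch, assuming df and eve_data from the quoted post:

    from pyspark.sql import functions as F

    KEYS = ["k1", "k2", "k3", "k4", "k5"]  # taken from the example data

    # Pull each known key out of the map into its own column.
    flat = df.select("eve_id", *[F.col("eve_data")[k].alias(k) for k in KEYS])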
Explode/Flatten Map type Data Using Pyspark
Hello Sir,

I have a scenario where I need to flatten different combinations of map type (key/value) data in a column called eve_data, as below. How do we flatten the map type into proper columns using PySpark?

1) Source DataFrame with 2 columns (eve_id, eve_data):

eve_id, eve_data
001,    "k1":"abc", "k2":"xyz", "k3":"10091"

eve_id, eve_data
002,    "k1":"12", "k2":"jack", "k3":"0.01", "k4":"0998"

eve_id, eve_data
003,    "k1":"aaa", "k2":"", "k3":"device", "k4":"endpoint", "k5":"-"

Final output (flatten each event id's key values). The number of key values differs per event id, so I want to flatten the records for all the map-type key values as below:

eve_id  k1   k2   k3
001     abc  xyz  10091

eve_id  k1  k2    k3    k4
002     12  jack  0.01  0998

eve_id  k1   k2  k3      k4        k5
003     aaa      device  endpoint  -

Thanks,
Anbu
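A self-contained sketch of one common approach (not from the thread): rebuild the sample data as a map<string,string> column, then explode it into key/value rows and pivot the keys back into columns. This assumes the distinct key set across all events is small enough to pivot.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Sample data shaped like the post: eve_data is a map<string,string>.
    df = spark.createDataFrame(
        [("001", {"k1": "abc", "k2": "xyz", "k3": "10091"}),
         ("002", {"k1": "12", "k2": "jack", "k3": "0.01", "k4": "0998"}),
         ("003", {"k1": "aaa", "k2": "", "k3": "device", "k4": "endpoint", "k5": "-"})],
        "eve_id string, eve_data map<string,string>",
    )

    # Explode the map into (key, value) rows, then pivot keys into columns.
    flat = (df.select("eve_id", F.explode("eve_data").alias("key", "value"))
              .groupBy("eve_id")
              .pivot("key")
              .agg(F.first("value")))
    flat.show()

Events that lack a given key come back with null in that column, which matches the "different keys per event id" shape of the example.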