[ 
https://issues.apache.org/jira/browse/SPARK-23495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AIT OUFKIR updated SPARK-23495:
-------------------------------
                 Flags: Important
    Remaining Estimate: 4h
     Original Estimate: 4h

This issue can create major inconsistencies in data.

> Creating a json file using a dataframe Generates an issue
> ---------------------------------------------------------
>
>                 Key: SPARK-23495
>                 URL: https://issues.apache.org/jira/browse/SPARK-23495
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.0
>            Reporter: AIT OUFKIR
>            Priority: Major
>             Fix For: 2.1.0
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> The issue happens when trying to create a JSON file from a dataframe (see
> code below):
> from pyspark.sql import SQLContext
> a = ["a1","a2"]
> b = ["b1","b2","b3"]
> c = ["c1","c2","c3", "c4"]
> d = {'d1':1, 'd2':2}
> e = {'e1':1, 'e2':2, 'e3':3}
> f = ['f1','f2','f3']
> g = ['g1','g2','g3','g4']
> metadata_dump = dict(asi=a, basi=b, casi=c, dasi=d, fasi=f, gasi=g, easi=e)
> md = sqlContext.createDataFrame([metadata_dump]).collect()
> metadata = sqlContext.createDataFrame(md,['asi', 'basi', 
> 'casi','dasi','fasi', 'gasi', 'easi'])
> metadata_path = "/folder/fileNameErr"
> metadata.write.mode('overwrite').json(metadata_path)
> Resulting JSON (the values of e, f and g end up under the wrong keys):
> {"asi":["a1","a2"],"basi":["b1","b2","b3"],"casi":["c1","c2","c3","c4"],"dasi":{"d1":1,"d2":2},"fasi":{"e1":1,"e2":2,"e3":3},"gasi":["f1","f2","f3"],"easi":["g1","g2","g3","g4"]}
>  
> When the dictionary e is moved so that easi comes right after dasi, the
> output is correct:
>  
> metadata_dump = dict(asi=a, basi=b, casi=c, dasi=d, easi=e, fasi=f, gasi=g)
> md = sqlContext.createDataFrame([metadata_dump]).collect()
> metadata = sqlContext.createDataFrame(md,['asi', 'basi', 'casi','dasi',
> 'easi', 'fasi', 'gasi'])
> metadata_path = "/folder/fileNameCorr"
> metadata.write.mode('overwrite').json(metadata_path)
> {"asi":["a1","a2"],"basi":["b1","b2","b3"],"casi":["c1","c2","c3","c4"],"dasi":{"d1":1,"d2":2},"easi":{"e1":1,"e2":2,"e3":3},"fasi":["f1","f2","f3"],"gasi":["g1","g2","g3","g4"]}
>  
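
One possible reading of the two outputs quoted above, sketched in plain
Python without Spark. It rests on two assumptions that the report itself
does not confirm: that the fields inferred from a dict come back in sorted
key order, and that a list of column names passed as the second argument is
applied by position rather than by matching names. The helper
relabel_by_position is hypothetical and only mimics that behaviour.

```python
import json

def relabel_by_position(record, column_names):
    """Hypothetical stand-in for the two createDataFrame calls.

    Assumption 1: the fields come back in sorted key order.
    Assumption 2: column_names is paired with those fields by position.
    """
    values_in_sorted_key_order = [record[key] for key in sorted(record)]
    return dict(zip(column_names, values_in_sorted_key_order))

a = ["a1", "a2"]
b = ["b1", "b2", "b3"]
c = ["c1", "c2", "c3", "c4"]
d = {'d1': 1, 'd2': 2}
e = {'e1': 1, 'e2': 2, 'e3': 3}
f = ['f1', 'f2', 'f3']
g = ['g1', 'g2', 'g3', 'g4']
metadata_dump = dict(asi=a, basi=b, casi=c, dasi=d, fasi=f, gasi=g, easi=e)

# Column list from the first snippet: 'easi' is listed last.
mismatched = relabel_by_position(
    metadata_dump, ['asi', 'basi', 'casi', 'dasi', 'fasi', 'gasi', 'easi'])

# Column list from the second snippet: names are in sorted order.
matching = relabel_by_position(
    metadata_dump, ['asi', 'basi', 'casi', 'dasi', 'easi', 'fasi', 'gasi'])

print(json.dumps(mismatched, separators=(",", ":")))
print(json.dumps(matching, separators=(",", ":")))
```

Under these assumptions the first printed line carries the same shifted
labels as the fileNameErr output, and the second keeps every value under its
own key. If that reading holds, listing the column names in the same order as
the inferred fields would avoid the mismatch.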



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
