I think you could use a Python UDF to turn the properties struct into a JSON string:

    import simplejson
    import pyspark.sql.functions

    def to_json(row):
        return simplejson.dumps(row.asDict(recursive=True))

    to_json_udf = pyspark.sql.functions.udf(to_json)

    df.select("col_1", "col_2",
              to_json_udf(df.properties)) \
      .write.format("com.databricks.spark.csv").save(…)
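The core of that UDF is just dict-to-JSON serialization, which you can sanity-check without Spark. A minimal stand-alone sketch using the standard-library json module (the sample dict is made up, standing in for what row.asDict(recursive=True) would return):

    import json

    def to_json_str(row_dict):
        # Serialize a (possibly nested) dict of properties to a JSON string,
        # mirroring what the UDF does after row.asDict(recursive=True).
        return json.dumps(row_dict, sort_keys=True)

    sample = {"a": 1, "b": {"c": "x"}}  # hypothetical properties struct
    print(to_json_str(sample))  # → {"a": 1, "b": {"c": "x"}}

Since the UDF returns a plain string, the resulting column is a StringType that the CSV writer can handle.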
I am generating a set of tables in PySpark SQL from a JSON source dataset. I am
writing those tables to disk as CSVs using
df.write.format("com.databricks.spark.csv").save(…). I have a schema like:
root
|-- col_1: string (nullable = true)
|-- col_2: string (nullable = true)
|-- col_3: timestamp
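Once the nested column is serialized to a string, each output row is plain CSV. A stand-alone sketch with the standard-library csv module showing what one such row looks like (the column values are made up for illustration):

    import csv
    import io
    import json

    # Hypothetical values for col_1, col_2, and a JSON-serialized properties column.
    row = ("v1", "v2", json.dumps({"k": "v"}))

    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    print(buf.getvalue().strip())  # → v1,v2,"{""k"": ""v""}"

Note that the CSV writer quotes the JSON field and doubles its embedded quotes, so the string round-trips cleanly through a CSV reader.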