[jira] [Commented] (SPARK-17969) I think it's user unfriendly to process standard json file with DataFrame

2016-10-17 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581607#comment-15581607 ]

Apache Spark commented on SPARK-17969:
--

User 'codlife' has created a pull request for this issue:
https://github.com/apache/spark/pull/15511

> I think it's user unfriendly to process standard json file with DataFrame 
> --
>
> Key: SPARK-17969
> URL: https://issues.apache.org/jira/browse/SPARK-17969
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 2.0.1
>Reporter: Jianfei Wang
>Priority: Minor
>
> Currently, with the DataFrame API, we can't load a standard (pretty-printed, 
> multi-line) JSON file directly, because spark.read.json expects one JSON 
> object per line. Maybe we can provide an overloaded method to handle this; 
> the logic is roughly as below:
> ```
> // wholeTextFiles yields an RDD of (path, content) pairs, one per file
> val files = spark.sparkContext.wholeTextFiles("data/test.json")
> // keep only the file content; each element is one complete JSON document
> val json_rdd = files.map { case (_, content) => content }
> // spark.read.json(RDD[String]) parses each element as one JSON record
> val json_df = spark.read.json(json_rdd)
> ```
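
For a quick sanity check of the snippet above (a minimal sketch; it assumes data/test.json holds a single multi-line JSON object at the top level):

```
json_df.printSchema()            // schema inferred from the parsed document
json_df.show(truncate = false)   // typically one row per top-level JSON object
```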





[jira] [Commented] (SPARK-17969) I think it's user unfriendly to process standard json file with DataFrame

2016-10-17 Thread Jianfei Wang (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581357#comment-15581357 ]

Jianfei Wang commented on SPARK-17969:
--

I can take on this small job. Thank you!



[jira] [Commented] (SPARK-17969) I think it's user unfriendly to process standard json file with DataFrame

2016-10-16 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581273#comment-15581273 ]

Reynold Xin commented on SPARK-17969:
-

+1

It would be good to have a mode in which each file is a single JSON object.
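
A rough sketch of what such a whole-file mode could look like from the user's side. This is illustrative only: the multiLine option shown is what later Spark releases (2.2+) expose for reading one JSON document per file, and it is not available in the 2.0.1 line this issue targets:

```
// Illustrative sketch: whole-file JSON reading via the "multiLine" option
// (available in later Spark releases, not in 2.0.1).
val df = spark.read
  .option("multiLine", true)   // treat each input file as one JSON document
  .json("data/test.json")

df.printSchema()
df.show(truncate = false)
```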





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org