[ 
https://issues.apache.org/jira/browse/SPARK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193573#comment-14193573
 ] 

William Benton commented on SPARK-4190:
---------------------------------------

I'll take this, since I'm interested in working on it and it seems like a quick 
fix.  [~yhuai], would you be willing to review a WIP PR sometime soon?

> Allow users to provide transformation rules at JSON ingest
> ----------------------------------------------------------
>
>                 Key: SPARK-4190
>                 URL: https://issues.apache.org/jira/browse/SPARK-4190
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: William Benton
>
> It would be great if it were possible to provide transformation rules (to be 
> executed within jsonRDD or jsonFile) so that users could 
>    (1) deal with JSON files that confound schema inference or are otherwise 
> insufficiently disciplined, or
>    (2) simply perform arbitrary object transformations at ingest before a 
> schema is inferred.
> json4s, which Spark already uses, has nice interfaces for specifying 
> transformations as partial functions on objects and accessing nested 
> structures via path expressions.  (We might want to introduce an abstraction 
> atop json4s for a public API, but the json4s API seems like a good first 
> step.)  There are some examples of these transformations at 
> https://github.com/json4s/json4s and at 
> http://chapeau.freevariable.com/2014/10/fedmsg-and-spark.html
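
To make the proposal concrete, here is a minimal sketch of the idea in plain
Python, using only the standard library. It is not Spark or json4s code: the
names transform_fields, normalize, and ingest are hypothetical, and the sample
records are made up for illustration. A rule is a function that either returns
a replacement (key, value) pair or returns None when it does not apply, which
loosely mirrors a partial function over object fields. The rule runs on every
record before anything like schema inference would see it.

```python
import json

# Hypothetical helper names; none of this is Spark or json4s API.


def transform_fields(value, rule):
    """Recursively rebuild a parsed JSON value, giving `rule` a chance to
    rewrite each (key, value) pair of every object it contains."""
    if isinstance(value, dict):
        out = {}
        for key, child in value.items():
            child = transform_fields(child, rule)  # rewrite nested values first
            replaced = rule(key, child)
            new_key, new_child = replaced if replaced is not None else (key, child)
            out[new_key] = new_child
        return out
    if isinstance(value, list):
        return [transform_fields(item, rule) for item in value]
    return value  # scalars pass through untouched


def normalize(key, value):
    """Example rule: rename a key that is awkward as a column name and make
    a field that is sometimes a string, sometimes a number, uniformly a number."""
    if key == "user-id":
        return ("user_id", value)
    if key == "count" and isinstance(value, str):
        return ("count", int(value))
    return None  # rule does not apply; keep the pair unchanged


def ingest(lines, rule):
    """Parse each line, apply the rule, and re-serialize the result so that
    a later schema-inference step only sees cleaned records."""
    return [
        json.dumps(transform_fields(json.loads(line), rule), sort_keys=True)
        for line in lines
    ]


# Made-up records: "count" has inconsistent types across and within records.
records = [
    '{"user-id": 1, "count": "3"}',
    '{"user-id": 2, "count": 4, "meta": {"count": "7"}}',
]
cleaned = ingest(records, normalize)
```

After the rule runs, every "count" value is an integer and the hyphenated key
is gone, so both records agree on field names and types. In the proposal
itself, the equivalent hook would presumably live inside jsonRDD or jsonFile
and take a json4s partial function instead.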



