[ https://issues.apache.org/jira/browse/SPARK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193598#comment-14193598 ]
Yin Huai commented on SPARK-4190:
---------------------------------
Yeah, no problem
> Allow users to provide transformation rules at JSON ingest
> ----------------------------------------------------------
>
> Key: SPARK-4190
> URL: https://issues.apache.org/jira/browse/SPARK-4190
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.1.0, 1.2.0
> Reporter: William Benton
> Assignee: William Benton
>
> It would be great if it were possible to provide transformation rules (to be
> executed within jsonRDD or jsonFile) so that users could
> (1) deal with JSON files that confound schema inference or are otherwise
> insufficiently disciplined, or
> (2) simply perform arbitrary object transformations at ingest before a
> schema is inferred.
> json4s, which Spark already uses, has nice interfaces for specifying
> transformations as partial functions on objects and accessing nested
> structures via path expressions. (We might want to introduce an abstraction
> atop json4s for a public API, but the json4s API seems like a good first
> step.) There are some examples of these transformations at
> https://github.com/json4s/json4s and at
> http://chapeau.freevariable.com/2014/10/fedmsg-and-spark.html
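
For readers unfamiliar with the json4s interfaces the description refers to, here is a minimal sketch of a transformation written as a partial function, plus a path expression into a nested object. It is not a Spark API and not the reporter's proposed design. The object name `IngestRuleSketch`, the rule `idsAsStrings`, and the sample record are hypothetical, and the code assumes json4s 3.x with the Jackson backend on the classpath.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

object IngestRuleSketch {
  // Hypothetical rule: render every integer "id" field as a string, so that
  // records mixing numeric and string ids present a single type.
  val idsAsStrings: PartialFunction[JField, JField] = {
    case JField("id", JInt(n)) => JField("id", JString(n.toString))
  }

  // Apply the rule to every field in the tree; fields the partial function
  // does not match are left unchanged.
  def applyRule(json: JValue): JValue = json.transformField(idsAsStrings)

  def main(args: Array[String]): Unit = {
    // Hypothetical input record, not taken from the issue.
    val raw = parse("""{"id": 42, "nested": {"id": 7, "name": "x"}}""")
    val cleaned = applyRule(raw)

    // Path expression: select the "id" field inside "nested".
    println(compact(render(cleaned \ "nested" \ "id")))
    println(compact(render(cleaned)))
  }
}
```

Until something like this is supported inside jsonRDD or jsonFile, one possible workaround (an assumption, not something the issue prescribes) is to parse, transform, and re-render each line of an RDD[String] before handing it to jsonRDD.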