Hi,

as far as I understand, if you create an RDD with a relational structure from your JSON, you should be able to do much of that already today. For example, take lift-json's deserializer and do something like

    val json_table: RDD[MyCaseClass] = json_data.flatMap(json => json.extractOpt[MyCaseClass])

then I guess you can use Spark SQL on that. (Something like your likes[2] query won't work, though, I guess.)

Regards
Tobias

On Thu, May 22, 2014 at 5:32 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> Looking forward to that update!
>
> Given a table of JSON objects like this one:
>
> {
>   "name": "Nick",
>   "location": {
>     "x": 241.6,
>     "y": -22.5
>   },
>   "likes": ["ice cream", "dogs", "Vanilla Ice"]
> }
>
> It would be SUPER COOL if we could query that table in a way that is as
> natural as follows:
>
> SELECT DISTINCT name
> FROM json_table;
>
> SELECT MAX(location.x)
> FROM json_table;
>
> SELECT likes[2] -- Ice Ice Baby
> FROM json_table
> WHERE name = "Nick";
>
> Of course, this is just a hand-wavy suggestion of how I’d like to be able to
> query JSON (particularly that last example) using SQL. I’m interested in
> seeing what y’all come up with.
>
> A large part of what my team does is make it easy for analysts to explore
> and query JSON data using SQL. We have a fairly complex home-grown process
> to do that and are looking to replace it with something more out of the box.
> So if you’d like more input on how users might use this feature, I’d be glad
> to chime in.
>
> Nick
>
>
> On Wed, May 21, 2014 at 11:21 AM, Michael Armbrust <mich...@databricks.com>
> wrote:
>>
>> You can already extract fields from json data using Hive UDFs. We have an
>> intern working on better native support this summer. We will be sure to
>> post updates once there is a working prototype.
>>
>> Michael
>>
>>
>> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas <nicholas.cham...@gmail.com>
>> wrote:
>>>
>>> The Apache Drill home page has an interesting heading: "Liberate Nested
>>> Data".
>>>
>>> Is there any current or planned functionality in Spark SQL or Shark to
>>> enable SQL-like querying of complex JSON?
>>>
>>> Nick
>>>
>>> ________________________________
>>> View this message in context: Using Spark to analyze complex JSON
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
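
P.S. To make the flatMap/extractOpt idea from the top of this reply a bit more concrete, here is a minimal plain-Scala sketch. It is deliberately not Spark or lift-json code: a List stands in for the RDD, and the extractOpt function, the Person and Location case classes, the Map-based input and the "Ada" record are all made up for illustration. The point is only that an Option-returning extractor plus flatMap silently drops records that do not fit the case class, after which the three queries from the quoted mail are ordinary collection operations.

```scala
object JsonTableSketch {
  // Hypothetical record types mirroring the JSON object in the quoted mail.
  case class Location(x: Double, y: Double)
  case class Person(name: String, location: Location, likes: List[String])

  // Hypothetical stand-in for lift-json's `json.extractOpt[MyCaseClass]`:
  // Some(record) when the input has the expected shape, None otherwise.
  def extractOpt(raw: Map[String, Any]): Option[Person] =
    (raw.get("name"), raw.get("location"), raw.get("likes")) match {
      case (Some(n: String), Some((x: Double, y: Double)), Some(ls: List[_])) =>
        Some(Person(n, Location(x, y), ls.map(_.toString)))
      case _ => None
    }

  // Already-parsed input; a List stands in for the RDD of JSON values.
  val jsonData: List[Map[String, Any]] = List(
    Map("name" -> "Nick", "location" -> (241.6, -22.5),
        "likes" -> List("ice cream", "dogs", "Vanilla Ice")),
    Map("name" -> "Ada", "location" -> (10.0, 3.5), "likes" -> List("tea")),
    Map("name" -> "broken record") // missing fields, so extractOpt gives None
  )

  // Same shape as the one-liner above: flatMap keeps only the Some(...) results.
  val jsonTable: List[Person] = jsonData.flatMap(raw => extractOpt(raw))

  // SELECT DISTINCT name FROM json_table
  val distinctNames: List[String] = jsonTable.map(_.name).distinct

  // SELECT MAX(location.x) FROM json_table
  val maxX: Double = jsonTable.map(_.location.x).max

  // SELECT likes[2] FROM json_table WHERE name = "Nick"
  // (zero-based index; `lift` avoids an exception when the list is too short)
  val thirdLike: List[String] =
    jsonTable.filter(_.name == "Nick").flatMap(_.likes.lift(2))
}
```

With a real RDD the same flatMap line should carry over unchanged, since flatMap accepts an Option-returning function there as well; whether Spark SQL can then express the nested-field and array-index queries is a separate question that this sketch does not answer.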