Joe - I think that's a legit and useful thing to do. Do you want to give it a shot?
On Mon, May 4, 2015 at 12:36 AM, Joe Halliwell <joe.halliw...@gmail.com> wrote:
> I think Reynold’s argument shows the impossibility of the general case.
>
> But a “maximum object depth” hint could enable a new input format to do
> its job both efficiently and correctly in the common case where the input
> is an array of similarly structured objects! I’d certainly be interested in
> an implementation along those lines.
>
> Cheers,
> Joe
>
> http://www.joehalliwell.com
> @joehalliwell
>
>
> On Mon, May 4, 2015 at 7:55 AM, Reynold Xin <r...@databricks.com> wrote:
>
>> I took a quick look at that implementation. I'm not sure if it actually
>> handles JSON correctly, because it attempts to find the first { starting
>> from a random point. However, that random point could be in the middle of a
>> string, and thus the first { might just be part of a string, rather than a
>> real JSON object starting position.
>>
>>
>> On Sun, May 3, 2015 at 11:13 PM, Emre Sevinc <emre.sev...@gmail.com> wrote:
>>
>> > You can check out the following library:
>> >
>> > https://github.com/alexholmes/json-mapreduce
>> >
>> > --
>> > Emre Sevinç
>> >
>> >
>> > On Sun, May 3, 2015 at 10:04 PM, Olivier Girardot <
>> > o.girar...@lateral-thoughts.com> wrote:
>> >
>> > > Hi everyone,
>> > > Is there any way in Spark SQL to load multi-line JSON data efficiently, I
>> > > think there was in the mailing list a reference to
>> > > http://pivotal-field-engineering.github.io/pmr-common/ for its
>> > > JSONInputFormat
>> > >
>> > > But it's rather inaccessible considering the dependency is not available in
>> > > any public maven repo (If you know of one, I'd be glad to hear it).
>> > >
>> > > Is there any plan to address this or any public recommendation ?
>> > > (considering the documentation clearly states that sqlContext.jsonFile will
>> > > not work for multi-line json(s))
>> > >
>> > > Regards,
>> > >
>> > > Olivier.
>> > >
>> >
>> > --
>> > Emre Sevinc
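
For anyone following the thread, here is a minimal Python sketch of the failure mode Reynold describes above. It is only an illustration under stated assumptions, not code from json-mapreduce or Spark: the helper names (naive_record_start, is_inside_string), the sample records, and the split offsets are all made up for this example. It scans for the first { from an arbitrary offset and then checks, by scanning from the very beginning of the input, whether that brace actually sits inside a string.

```python
import json


def naive_record_start(data: str, offset: int) -> int:
    """Hypothetical split heuristic: index of the first '{' at or after offset."""
    return data.find("{", offset)


def is_inside_string(data: str, index: int) -> bool:
    """Report whether data[index] lies inside a JSON string literal.

    This needs a scan from the start of the input, which is exactly the
    context a reader starting at an arbitrary split offset does not have.
    """
    in_string = False
    escaped = False
    for ch in data[:index]:
        if escaped:
            escaped = False          # character after a backslash is literal
        elif ch == "\\":
            escaped = in_string      # backslashes only matter inside strings
        elif ch == '"':
            in_string = not in_string
    return in_string


# Made-up multi-line input: the first record has a '{' inside a string value.
records = [
    {"id": 1, "note": "a { inside a string"},
    {"id": 2, "note": "plain"},
]
data = json.dumps(records, indent=2)

# Split lands in the middle of record 1: the next '{' is part of a string.
bad_start = naive_record_start(data, data.index('"note"'))

# Split lands at the end of record 1: the next '{' really opens record 2.
good_start = naive_record_start(data, data.index("},"))

print(is_inside_string(data, bad_start))
print(is_inside_string(data, good_start))
```

The first candidate is a false record start and the second is a real one, but telling them apart required reading everything before the split, which is what makes the general case hard to do per split.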