@joe, I'd be glad to help if you need it.

On Mon, May 4, 2015 at 20:06, Matei Zaharia <matei.zaha...@gmail.com>
wrote:

> I don't know whether this is common, but we might also allow another
> separator for JSON objects, such as two blank lines.
>
> Matei
>
> > On May 4, 2015, at 2:28 PM, Reynold Xin <r...@databricks.com> wrote:
> >
> > Joe - I think that's a legit and useful thing to do. Do you want to give
> > it a shot?
> >
> > On Mon, May 4, 2015 at 12:36 AM, Joe Halliwell <joe.halliw...@gmail.com>
> > wrote:
> >
> >> I think Reynold’s argument shows the impossibility of the general case.
> >>
> >> But a “maximum object depth” hint could enable a new input format to do
> >> its job both efficiently and correctly in the common case where the
> >> input is an array of similarly structured objects! I’d certainly be
> >> interested in an implementation along those lines.
> >>
> >> Cheers,
> >> Joe
> >>
> >> http://www.joehalliwell.com
> >> @joehalliwell
> >>
> >>
> >> On Mon, May 4, 2015 at 7:55 AM, Reynold Xin <r...@databricks.com>
> >> wrote:
> >>
> >>> I took a quick look at that implementation. I'm not sure if it actually
> >>> handles JSON correctly, because it attempts to find the first { starting
> >>> from a random point. However, that random point could be in the middle
> >>> of a string, and thus the first { might just be part of a string, rather
> >>> than a real JSON object starting position.
> >>>
> >>>
> >>> On Sun, May 3, 2015 at 11:13 PM, Emre Sevinc <emre.sev...@gmail.com>
> >>> wrote:
> >>>
> >>>> You can check out the following library:
> >>>>
> >>>> https://github.com/alexholmes/json-mapreduce
> >>>>
> >>>> --
> >>>> Emre Sevinç
> >>>>
> >>>>
> >>>> On Sun, May 3, 2015 at 10:04 PM, Olivier Girardot <
> >>>> o.girar...@lateral-thoughts.com> wrote:
> >>>>
> >>>>> Hi everyone,
> >>>>> Is there any way in Spark SQL to load multi-line JSON data
> >>>>> efficiently? I think there was a reference on the mailing list to
> >>>>> http://pivotal-field-engineering.github.io/pmr-common/ for its
> >>>>> JSONInputFormat.
> >>>>>
> >>>>> But it's rather inaccessible, considering the dependency is not
> >>>>> available in any public Maven repo (if you know of one, I'd be glad
> >>>>> to hear about it).
> >>>>>
> >>>>> Is there any plan to address this, or any public recommendation?
> >>>>> (considering the documentation clearly states that sqlContext.jsonFile
> >>>>> will not work for multi-line JSON)
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Olivier.
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Emre Sevinc
> >>>>
> >>>
> >>
> >>
>
>
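
To make Reynold's objection in the quoted thread concrete, here is a minimal
Python sketch. It is not the code from the json-mapreduce library; the sample
document and the helper names (first_brace_from, parses_as_object) are made up
for illustration. It only shows that scanning for the first { from an
arbitrary offset can land inside a string value.

```python
import json

# Hypothetical sample: the first object's "note" value contains a '{'.
doc = '[{"note": "use {braces} here", "id": 1},\n {"note": "plain", "id": 2}]'

decoder = json.JSONDecoder()


def first_brace_from(text, offset):
    """Naive splitter step: index of the first '{' at or after offset."""
    return text.find("{", offset)


def parses_as_object(text, start):
    """True if a complete JSON object can be decoded beginning at start."""
    try:
        value, _ = decoder.raw_decode(text, start)
    except json.JSONDecodeError:
        return False
    return isinstance(value, dict)


# An offset that falls in the middle of a string value, as a split boundary might.
inside_string = doc.index("use")
candidate = first_brace_from(doc, inside_string)

# The candidate is the '{' of "{braces}", which is not the start of an object.
print(repr(doc[candidate:candidate + 8]), parses_as_object(doc, candidate))

# Scanning from the beginning of the document does find a real object start.
print(parses_as_object(doc, first_brace_from(doc, 0)))
```

In this sample the bad candidate at least fails to parse, so a reader could
retry from the next {. That is not guaranteed in general: a string value may
itself contain text that parses as a JSON object, which is why the general
case cannot be handled by scanning alone.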

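Matei's separator idea from the quoted thread can be sketched in the same
way. This is plain Python, not a Spark input format, and it rests on stated
assumptions: records are separated by exactly the two-blank-line convention
he mentions, the writer never emits blank lines inside a record, and the
names SEPARATOR and split_records are hypothetical.

```python
import json
import re

# Assumption: two blank lines (three consecutive newlines, optionally with
# trailing spaces or tabs on the blank lines) separate the records.
SEPARATOR = re.compile(r"\n[ \t]*\n[ \t]*\n")


def split_records(text):
    """Split text on the separator and parse each non-empty chunk as JSON."""
    return [json.loads(chunk) for chunk in SEPARATOR.split(text) if chunk.strip()]


# Hypothetical input: two pretty-printed, multi-line objects.
sample = (
    '{\n  "id": 1,\n  "tags": ["a", "b"]\n}\n'
    "\n\n"
    '{\n  "id": 2,\n  "tags": []\n}\n'
)

records = split_records(sample)
print(len(records), [r["id"] for r in records])
```

One reason this avoids the problem Reynold describes: a JSON string cannot
contain a raw newline (it has to be escaped as \n), so the separator can
never appear inside a string value. It can still appear between tokens if the
writer inserts blank lines within a record, hence the assumption above.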