I'll try to study that and get back to you. Regards, Olivier.
On Mon, May 4, 2015 at 04:05, Reynold Xin <r...@databricks.com> wrote:

> How does the Pivotal format decide where to split the files? It seems to
> me the challenge is deciding that, and off the top of my head the only way
> to do it is to scan from the beginning and parse the JSON properly, which
> makes it infeasible for large files (doable for a whole input made of many
> small files, though). If there is a better way, we should do it.
>
>
> On Sun, May 3, 2015 at 1:04 PM, Olivier Girardot <
> o.girar...@lateral-thoughts.com> wrote:
>
>> Hi everyone,
>> Is there any way in Spark SQL to load multi-line JSON data efficiently? I
>> think there was a reference on the mailing list to
>> http://pivotal-field-engineering.github.io/pmr-common/ for its
>> JSONInputFormat.
>>
>> But it's rather inaccessible, considering the dependency is not available
>> in any public Maven repo (if you know of one, I'd be glad to hear it).
>>
>> Is there any plan to address this, or any public recommendation?
>> (The documentation clearly states that sqlContext.jsonFile will
>> not work for multi-line JSON.)
>>
>> Regards,
>>
>> Olivier.
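For readers following along, the splitting problem Reynold describes can be illustrated with plain Python (a sketch only, not Spark code; the sample records are made up). Newline-delimited JSON can be parsed record by record, so an input split can begin at any newline boundary; a pretty-printed multi-line document reveals its record boundaries only by parsing from the start:

```python
import json

# Newline-delimited JSON: each line is a complete record, so a reader
# can start at any newline boundary without a global parse.
ndjson = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'
records = [json.loads(line) for line in ndjson.splitlines()]

# Multi-line (pretty-printed) JSON: the same data, but record boundaries
# are only known after parsing the whole document from the beginning.
multiline = '[\n  {"id": 1},\n  {"id": 2},\n  {"id": 3}\n]'
records2 = json.loads(multiline)  # must consume the entire document

print(records == records2)  # same data, very different splittability
```

A common workaround in Spark 1.x was to read each file whole with `sc.wholeTextFiles`, parse it on the driver-free executor side, and feed the resulting records to `sqlContext.jsonRDD` — which sidesteps splitting entirely, at the cost of one-file-per-task granularity.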