This limit is due to the underlying InputFormat implementation. You can always write your own InputFormat and then pass its class to Spark's newAPIHadoopFile API. You will have to place the jar file in the /lib location on all the nodes.
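To make the InputFormat idea concrete: the core of a custom record reader is splitting the input stream on something other than a newline, so that one multi-line JSON document becomes one record. Below is a minimal, stdlib-only Python sketch of that splitting logic; the function name and the record-separator character are hypothetical illustrations, not part of any Hadoop or Spark API (the real implementation would be a Java/Scala InputFormat, shown only in the trailing comment).

```python
import io

def read_records(stream, delimiter="\x1e"):
    """Yield records from a text stream, splitting on a custom delimiter
    instead of newlines, so records may themselves span multiple lines."""
    buffer = ""
    while True:
        chunk = stream.read(4096)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete record currently in the buffer.
        while delimiter in buffer:
            record, buffer = buffer.split(delimiter, 1)
            yield record
    if buffer:
        yield buffer  # trailing record with no final delimiter

# In Spark, the equivalent logic would live inside a custom InputFormat
# (hypothetical class name) registered via, e.g. (Scala):
#   sc.newAPIHadoopFile(path, classOf[MyJsonInputFormat],
#                       classOf[LongWritable], classOf[Text])
```

Each yielded record can then be handed to any JSON parser, since newlines inside a record no longer matter.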
Ashish

On Sun, May 8, 2016 at 4:02 PM, Hyukjin Kwon <gurwls...@gmail.com> wrote:

> I remember this JIRA: https://issues.apache.org/jira/browse/SPARK-7366.
> Parsing multi-line JSON is not supported by the JSON data source.
>
> Instead, this can be done with sc.wholeTextFiles(). I found some examples
> here:
> http://searchdatascience.com/spark-adventures-1-processing-multi-line-json-files
>
> Although this reads each file as a whole record, it should work.
>
> Thanks!
>
> On 9 May 2016 7:20 a.m., "KhajaAsmath Mohammed" <mdkhajaasm...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am working on parsing JSON in Spark, but most of the information
>> available online states that I need to have the entire JSON on a single
>> line.
>>
>> In my case, the JSON file is delivered in a complex structure and not on
>> a single line. Does anyone know how to process this in Spark?
>>
>> I used the Jackson jar to process JSON and was able to do it when it is
>> present on a single line. Any ideas?
>>
>> Thanks,
>> Asmath
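The wholeTextFiles() approach works because it yields (path, content) pairs where content is the entire file, so parsing reduces to calling a JSON parser on the whole string, newlines and all. A stdlib-only sketch of that parsing step is below (the file written here is a made-up example document; the actual Spark call is shown only in a comment, since it needs a SparkContext):

```python
import json
import tempfile

# A JSON document spread over several lines, as described in the question.
multi_line_json = """{
  "name": "Asmath",
  "nested": {"values": [1, 2, 3]}
}"""

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    f.write(multi_line_json)
    path = f.name

# Spark equivalent (untested sketch):
#   sc.wholeTextFiles(json_dir).map(lambda kv: json.loads(kv[1]))
with open(path) as f:
    record = json.loads(f.read())  # whole file parsed as one record

print(record["nested"]["values"])  # [1, 2, 3]
```

Note that because each file becomes a single record, this approach loses line-level parallelism: very large individual files will each be parsed by a single task.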