Re: About Error while reading large JSON file in Spark

2016-10-19 Thread Steve Loughran
On 18 Oct 2016, at 10:58, Chetan Khatri wrote: Dear Xi Shen, Thank you for getting back to the question. The approach I am following is as below: I have an MSSQL server as the enterprise data lake. 1. Run Java jobs that generate JSON files,

Re: About Error while reading large JSON file in Spark

2016-10-18 Thread Steve Loughran
On 18 Oct 2016, at 08:43, Chetan Khatri wrote: Hello community members, I am getting an error while reading a large JSON file in Spark; the underlying read code can't handle more than 2^31 bytes in a single line: if (bytesConsumed >
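To put that limit in concrete numbers (a quick sketch, not part of the thread): 2^31 bytes is 2 GiB, so a file of almost 6 GB written as one line is roughly three times larger than what a single-line read can consume.

```python
# The single-line ceiling Steve describes: the text line reader cannot
# consume more than 2^31 bytes on one line before giving up.
MAX_LINE_BYTES = 2 ** 31            # 2147483648 bytes = 2 GiB
file_size = 6 * 1024 ** 3           # a ~6 GB JSON document on a single line

print(MAX_LINE_BYTES)               # 2147483648
print(file_size > MAX_LINE_BYTES)   # True: the read fails with an IOException
```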

Re: About Error while reading large JSON file in Spark

2016-10-18 Thread Chetan Khatri
Dear Xi Shen, Thank you for getting back to the question. The approach I am following is as below: I have an MSSQL server as the enterprise data lake. 1. Run Java jobs that generate JSON files; every file is almost 6 GB. *Correct, Spark needs every JSON object on a separate line, so I did:* sed -e 's/}/}\n/g' -s
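One caveat with the sed one-liner above: it inserts a newline after *every* `}`, so it also splits inside nested objects. The same array-to-one-object-per-line conversion can be sketched in Python; this toy version loads the whole document into memory, so a genuinely 6 GB file would need a streaming JSON parser instead (an assumption on my part, not something discussed in the thread).

```python
import json

# Toy input: one JSON array on a single line, as the Java jobs produce it.
single_line = '[{"visitor": 1, "meta": {"x": 1}}, {"visitor": 2, "meta": {"x": 2}}]'

# Convert to one complete object per line, the layout Spark's JSON reader
# expects. json.dumps keeps nested braces intact, whereas
# sed -e 's/}/}\n/g' would also break the line after the inner "meta" object.
records = json.loads(single_line)
json_lines = "\n".join(json.dumps(r) for r in records)

print(json_lines)
# {"visitor": 1, "meta": {"x": 1}}
# {"visitor": 2, "meta": {"x": 2}}
```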

Re: About Error while reading large JSON file in Spark

2016-10-18 Thread Xi Shen
It is a plain Java IO error: your line is too long. You should alter your JSON layout so that each line is a small JSON object. Please do not concatenate all the objects into an array and then write the array on one line. You would have difficulty handling such a super-large JSON array in Spark anyway.
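The practical difference between the two layouts is whether each line parses on its own, which is what lets the input be split across tasks. A small check along these lines (`is_json_lines` is a hypothetical helper, not anything from Spark):

```python
import json

array_style = '[{"id": 1}, {"id": 2}]'   # whole dataset as one array on one line
lines_style = '{"id": 1}\n{"id": 2}'     # one small JSON object per line

def is_json_lines(text):
    """True if every non-empty line is a standalone JSON object."""
    try:
        return all(isinstance(json.loads(line), dict)
                   for line in text.splitlines() if line.strip())
    except ValueError:
        return False

print(is_json_lines(array_style))  # False: the file must be read as one giant line
print(is_json_lines(lines_style))  # True: readers can split between lines
```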

About Error while reading large JSON file in Spark

2016-10-18 Thread Chetan Khatri
Hello community members, I am getting an error while reading a large JSON file in Spark. *Code:* val landingVisitor = sqlContext.read.json("s3n://hist-ngdp/lvisitor/lvisitor-01-aug.json") *Error:* 16/10/18 07:30:30 ERROR Executor: Exception in task 8.0 in stage 0.0 (TID 8) java.io.IOException: Too