Hi Peter,

Thank you for the clarification.

We now need to store each JSON object on a single line. Is there any limit
on the length of a JSON object, so that an object never wraps onto the
next line?

What happens if a JSON object is very large? Will it still be stored on a
single line in HDFS?

What happens if a JSON object contains a BLOB/CLOB value? Is the entire
object still stored on a single line in HDFS?

What happens if a JSON object exceeds the HDFS block size? For example, a
single JSON object could be split across two different worker nodes. In
that case, how will Spark read the object?
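For reference, here is a minimal Scala sketch of the "one JSON object per line"
constraint we are trying to satisfy. This is only a rough structural check I put
together for illustration (the helper name and the check itself are my own, not
part of Spark), not a full JSON parser:

```scala
// Minimal sketch: check that every line of a file holds one complete
// JSON object, i.e. no object spills onto the next line.
// This is a rough structural check for illustration, not a full JSON parser.
def looksLikeOneObjectPerLine(lines: Seq[String]): Boolean =
  lines.forall { line =>
    val t = line.trim
    t.nonEmpty && t.startsWith("{") && t.endsWith("}")
  }

val good = Seq("""{"name":"Michael"}""", """{"name":"Andy","age":30}""")
val bad  = Seq("""{"name":""", """"Michael"}""")  // one object split across two lines

println(looksLikeOneObjectPerLine(good)) // true
println(looksLikeOneObjectPerLine(bad))  // false
```

The `bad` case is exactly the situation I am asking about: a single object
spread over two lines would not be parseable line by line.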

Could you please clarify the questions above?

Regards,
Rajesh


On Mon, Dec 15, 2014 at 6:52 PM, Peter Vandenabeele <pe...@vandenabeele.com>
wrote:
>
>
>
> On Sat, Dec 13, 2014 at 5:43 PM, Helena Edelson <
> helena.edel...@datastax.com> wrote:
>
>> One solution can be found here:
>> https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#json-datasets
>>
>>
> As far as I understand, the people.json file is not really a proper json
> file, but a file documented as:
>
>   "... JSON files where each line of the files is a JSON object.".
>
> This means that it is a file with multiple lines, but each line needs to
> contain a fully self-contained JSON object
> (initially confusing, this will not parse a standard multi-line JSON
> file). We are working to clarify this in
> https://github.com/apache/spark/pull/3517
>
> HTH,
>
> Peter
>
>
>
>
>> - Helena
>> @helenaedelson
>>
>> On Dec 13, 2014, at 11:18 AM, Madabhattula Rajesh Kumar <
>> mrajaf...@gmail.com> wrote:
>>
>> Hi Team,
>>
>> I have a large JSON file in Hadoop. Could you please let me know
>>
>> 1. How to read the JSON file
>> 2. How to parse the JSON file
>>
>> Please share any example program based on Scala
>>
>> Regards,
>> Rajesh
>>
>>
>>
>
>
> --
> Peter Vandenabeele
> http://www.allthingsdata.io
> http://www.linkedin.com/in/petervandenabeele
> https://twitter.com/peter_v
> gsm: +32-478-27.40.69
> e-mail: pe...@vandenabeele.com
> skype: peter_v_be
>