+1 

This would be a good value-add.
It should give higher throughput for input operators that need to
consume only selected JSON fields.
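
For the multi-line case raised in the quoted thread below, splitting on a
JSON record boundary could be sketched roughly as follows. This is only an
illustration in plain Java, not existing Malhar or Jackson code:
JsonRecordSplitter and split() are hypothetical names, and the sketch
assumes every record is a top-level JSON object ({ ... }) and that the
text is already available as a String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds complete top-level JSON objects in a block of text.
public class JsonRecordSplitter {

    /** Returns each complete top-level JSON object found in the text, in input order. */
    public static List<String> split(String text) {
        List<String> records = new ArrayList<>();
        int depth = 0;             // current nesting level of { }
        int start = -1;            // index where the current record began
        boolean inString = false;  // true while inside a JSON string literal
        boolean escaped = false;   // true right after a backslash inside a string

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                // Braces inside string values must not affect the depth count.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    // Record boundary: the outermost object just closed.
                    records.add(text.substring(start, i + 1));
                    start = -1;
                }
            }
        }
        // An unfinished trailing record is not returned; a real splitter
        // would have to carry it over to the next block.
        return records;
    }

    public static void main(String[] args) {
        String input = "{\n  \"id\": 1,\n  \"note\": \"brace } in text\"\n}\n"
                     + "{ \"id\": 2, \"tags\": { \"a\": \"b\" } }\n";
        for (String record : split(input)) {
            System.out.println("record: " + record.replace("\n", " "));
        }
    }
}
```

Each string returned by split() is one whole record, so it could be handed
to a parser partition as a unit regardless of how many lines it spans.
Top-level arrays and records that straddle two file blocks are left out of
this sketch.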

Thx,
Ashish

> On 23-Mar-2016, at 10:31 AM, Bhupesh Chawda <[email protected]> wrote:
> 
> A multi-line JSON format is very common and is usually the case with REST
> API results.
> I think this could be a valuable addition.
> 
> Regarding the issues that you mentioned, I think they can be solved by having
> a custom file splitter which takes care of splitting on a JSON record
> boundary.
> 
> +1 for a streaming JSON parser.
> 
> ~Bhupesh
> 
> On Wed, Mar 23, 2016 at 5:22 AM, Devendra Tagare <[email protected]>
> wrote:
> 
>> Hi All,
>> 
>> Starting this thread to get opinions on adding a streaming JSON parser for
>> converting JSON to a POJO. This parser would be in addition to the databind
>> parser (com.fasterxml.jackson.databind) we already have.
>> 
>> The advantages of a streaming JSON parser are:
>> 
>> 1. The parser need not parse the entire input to set the fields of the POJO.
>> 2. It can be used with multi-line JSON records, e.g. if a user is using the
>> AbstractFileInputOperator to read a file line by line and a JSON record spans
>> multiple lines, then the existing parser will not work even if the required
>> fields are all present in a single input line.
>> 3. These parsers have the least read/write overhead compared to databind
>> or tree-based parsers.
>> 
>> Please refer to http://wiki.fasterxml.com/JacksonStreamingApi for more
>> details.
>> 
>> The disadvantages are (from the documentation):
>> 
>> 1. All content to read/write has to be processed in exact same order as
>> input comes in (or output is to go out) -- for random access, you need to
>> use Data Binding or Tree Model (which both actually use Streaming Api for
>> actual JSON reading/writing).
>> [Dev] This could be tricky if one row of input goes to one partition of the
>> parser and the next row goes to another.
>> [Dev] This also means that we cannot use it with the existing file
>> splitter, since different splits may not go to the same partition of the
>> parser.
>> 
>> 2. No Java objects are created unless specifically requested; and even then
>> only very basic types are supported (Strings, byte[] for base64-encoded
>> binary content)
>> [Dev] Should be fine for the use-cases we are covering.
>> 
>> Please send across your inputs and comments.
>> 
>> Thanks,
>> Dev
>> 
