Thanks for the reply, Burak.
You are indeed right that JsonNode is built on standard Java Map and List
objects. Note, though, that each JsonNode implements a specialized
serialize(JsonGenerator, SerializerProvider) method, which I think is a much
better fit than reflection-based blind serialization. While this doesn't bring
any memory savings, it certainly yields better processing efficiency.
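For example, a tree-model round trip like the one below goes through those serialize() overrides rather than through reflection-based bean introspection (a minimal sketch; the class name is mine):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonNodeSerializeDemo {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // readTree() builds the tree model; no user classes are involved.
        JsonNode node = mapper.readTree("{\"id\":1,\"tags\":[\"a\",\"b\"]}");
        // Writing a JsonNode dispatches to each node's own serialize()
        // implementation instead of reflecting over bean properties.
        System.out.println(mapper.writeValueAsString(node));
    }
}
```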
Another potential advantage I was considering for JsonNode is object
pooling, along the following lines (in pseudocode):
    try (var sessionFactory = jsonNodeFactory.clone()) {
        JsonNode oldNode = sessionFactory.read(jsonText);
        JsonNode newNode = transform(sessionFactory, oldNode);
        byte[] newNodeBytes = sessionFactory.write(newNode);
        persist(newNodeBytes);
    }
Though I am not sure if this is possible at all.
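For comparison, the plain, non-pooled version of that flow with today's API would look roughly like this (the transform step is just a placeholder; ObjectMapper is thread-safe once configured, so a single shared instance is the usual baseline):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class PipelineSketch {

    // A single shared mapper; ObjectMapper is thread-safe after configuration.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // read -> transform -> write, without any pooling of intermediate nodes.
    static byte[] process(byte[] jsonBytes) throws Exception {
        JsonNode oldNode = MAPPER.readTree(jsonBytes);
        JsonNode newNode = transform(oldNode); // placeholder transform step
        return MAPPER.writeValueAsBytes(newNode);
    }

    // Identity transform, standing in for the real pipeline logic.
    static JsonNode transform(JsonNode node) {
        return node;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(process("{\"a\":1}".getBytes())));
    }
}
```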
You mentioned using Avro to represent JSON in memory. To the best of my
knowledge, Avro requires a schema, so maybe I am missing something. Would you
mind elaborating a little on your idea here?
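(To spell out the schema requirement I mean: as far as I can tell, even Avro's generic, class-free API needs a Schema object before any record can be built. A minimal sketch, record and field names mine:)

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class AvroSchemaDemo {
    public static void main(String[] args) {
        // A GenericRecord cannot be constructed without a Schema up front;
        // there is no schemaless record type in Avro.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Doc\",\"fields\":"
                + "[{\"name\":\"id\",\"type\":\"long\"}]}");
        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 42L);
        System.out.println(record);
    }
}
```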
Best.
On Wednesday, February 27, 2019 at 12:48:07 PM UTC+1, Burak Emre Kabakcı
wrote:
>
> 1. JsonNode doesn't actually have any magic under the hood. It uses
> Map<String, JsonNode> for objects, List<JsonNode> for arrays, and wrapper
> objects for primitive types. Therefore, it's not memory-efficient IMO, so it
> actually depends on your priorities:
> 1.1. If you care more about code complexity than about the memory
> bottleneck, or don't really have any schema, I would stick with JsonNode,
> as it's feature-rich and well-documented.
> 1.2. If you care about performance and don't really have the option to
> use class data binding, you may write your own parser and deserialize the
> JSON blob into a more compact representation such as Apache Avro rather
> than JsonNode. If you have a metadata store, you can basically parse the
> JSON blob, validate it, and convert it into Avro instances, which will
> occupy less heap memory. We prefer this approach as it also provides us a
> native way to validate the JSON blob.
>
> 2. AFAIK, JsonNodeFactory is thread-safe and stateless, but it would be
> better if someone who has experience with it could answer this question.
>
> On Tuesday, February 26, 2019 at 11:48:13 PM UTC+3, [email protected]
> wrote:
>>
>> Hello,
>>
>> In a code base I have been working on for more than a year, we receive
>> JSON (as byte[] from a queue), transform it, and persist it (to
>> Elasticsearch). In the initial design, tempted by its convenience, we made
>> the mistake of representing the JSON as java.lang.Object throughout this
>> pipeline: transformation functions receive an Object and spit out an
>> Object. This introduced other problems (e.g., certain types like Set<V> do
>> not map 1-to-1 to JSON, variable types do not properly communicate the
>> intent of the value, lost caching and object-pooling opportunities, etc.)
>> that revealed themselves at pretty late stages of development. If I were
>> starting from scratch today, I would pick Jackson's JsonNode as the only
>> allowed JSON type. Getting off the beaten path might, however, raise other
>> issues that we cannot foresee right now, and this is where my question
>> comes into play.
>>
>> 1. What are the best practices for representing JSON documents to allow
>> efficient read/write/update operations? (Note that in our case there is no
>> schema, hence no class data binding.)
>> 2. If one uses JsonNode, what are the best practices for using
>> JsonNodeFactory'ies? A single global factory? A thread-local factory? No
>> factory but explicit JsonNode::new calls?
>>
>> Thanks in advance.
>> Best.
>>
>
--
You received this message because you are subscribed to the Google Groups
"jackson-user" group.