On Wed, Feb 27, 2019 at 4:58 AM <[email protected]> wrote:
>
> Thanks for the reply Burak.
>
> You are indeed right that JsonNode employs standard Java Map, List objects.
> Though note that each JsonNode includes specialized serialize(JsonGenerator,
> SerializerProvider) methods which I think way better than reflection-based
> blind serialization. While this doesn't bring any memory efficiency, it
> certainly yields better processing efficiency. Another
For what it is worth, various processing models ("untyped" Objects,
`JsonNode` "trees") are benchmarked, for example, here:
https://github.com/FasterXML/jackson-benchmarks
and specifically
https://github.com/FasterXML/jackson-benchmarks/blob/master/results-pojo-2.9-home.txt
shows results.
Results for reading/writing speeds with the two approaches:
c.f.j.p.json.JsonStdReadVanilla.readUntypedMediaItem    thrpt  100  409050.888 ± 3215.011  ops/s
c.f.j.p.json.JsonStdReadVanilla.readNodeMediaItem       thrpt  100  408543.442 ± 4742.736  ops/s
c.f.j.p.json.JsonStdWriteVanilla.writeUntypedMediaItem  thrpt  100  743788.893 ± 5046.760  ops/s
c.f.j.p.json.JsonStdWriteVanilla.writeNodeMediaItem     thrpt  100  806293.628 ± 4913.396  ops/s
These numbers suggest that reading is essentially as fast in both cases, and
that for writing `JsonNode` is marginally faster (5-10%) for the test payload.
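
To make the two processing models concrete, here is a minimal sketch of
reading the same document both ways. The class name `ProcessingModels`, the
helper names and the sample JSON are made up for illustration and are not
taken from the benchmark project; only the `ObjectMapper` calls are actual
Jackson API.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

public class ProcessingModels {
    // ObjectMapper is thread-safe once configured, so one shared instance is enough.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // "Untyped" model: JSON objects become Maps, arrays become Lists,
    // scalars become String / Number / Boolean / null.
    static Object readUntyped(String json) throws Exception {
        return MAPPER.readValue(json, Object.class);
    }

    // Tree model: every value is wrapped in a JsonNode subtype.
    static JsonNode readTree(String json) throws Exception {
        return MAPPER.readTree(json);
    }

    public static void main(String[] args) throws Exception {
        // Illustrative payload only.
        String json = "{\"title\":\"demo\",\"width\":640}";

        Object untyped = readUntyped(json);
        JsonNode tree = readTree(json);

        // Untyped access needs casts; tree access does not.
        Object widthFromMap = ((Map<?, ?>) untyped).get("width");
        int widthFromTree = tree.path("width").asInt();
        System.out.println(widthFromMap + " " + widthFromTree);

        // Both representations can be written back with the same mapper.
        System.out.println(MAPPER.writeValueAsString(untyped));
        System.out.println(MAPPER.writeValueAsString(tree));
    }
}
```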
> potential advantage I was considering for JsonNode is taking advantage of its
> object pooling capabilities such as (in pseudo code)
>
> try (var sessionFactory = jsonNodeFactory.clone()) {
> JsonNode oldNode = sessionFactory.read(jsonText);
> JsonNode newNode = transform(sessionFactory, oldNode);
> byte[] newNodeBytes = sessionFactory.write(newNode);
> persist(newNodeBytes);
> }
>
> Though I am not sure if this is possible at all.
`JsonNodeFactory` is indeed thread-safe and does no object pooling:
the only reuse is for singleton values of `true`, `false` and `null`
(with `NullNode`).
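
Given that, the pseudo code above could be approximated with a single shared
`ObjectMapper` plus the global `JsonNodeFactory.instance`, with no per-session
factory at all. The following is only a sketch under that assumption: the
class name `SharedFactoryPipeline` and the `transform` logic are hypothetical
stand-ins for your own code, and persisting is left out.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.nio.charset.StandardCharsets;

public class SharedFactoryPipeline {
    // Both are thread-safe, so global instances can be shared freely.
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private static final JsonNodeFactory NODES = JsonNodeFactory.instance;

    // Hypothetical transformation: wraps the input in an envelope object.
    static JsonNode transform(JsonNode oldNode) {
        ObjectNode newNode = NODES.objectNode();
        newNode.set("payload", oldNode);
        newNode.put("processed", true);
        return newNode;
    }

    // byte[] in, byte[] out, as in the pipeline described in the question.
    static byte[] process(byte[] jsonBytes) throws Exception {
        JsonNode oldNode = MAPPER.readTree(jsonBytes);
        JsonNode newNode = transform(oldNode);
        return MAPPER.writeValueAsBytes(newNode);
    }

    public static void main(String[] args) throws Exception {
        byte[] in = "{\"id\":1}".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(process(in), StandardCharsets.UTF_8));

        // The only "reuse" the factory does: shared singletons for
        // true/false/null, so these identity comparisons hold.
        System.out.println(NODES.booleanNode(true) == NODES.booleanNode(true));
        System.out.println(NODES.nullNode() == NODES.nullNode());
    }
}
```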
As to memory use, `JsonNode` has an additional wrapper object for
`Number`s and `String`s compared to the `Object` approach, but that's
probably irrelevant for most usage.
I think the main question to me would be convenience: for traversal
and modifications `JsonNode` is superior to dealing with `Map`s and
`List`s, especially when using the null-safe `path()` and `at()` methods,
with which you can safely traverse paths (results include
`MissingNode` if there's no value at the specified location).
And with `at()` you can use `JsonPointer` which is both convenient and
efficient (you can reuse thread-safe pointer instances, although even
creating one from String is quite well optimized).
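
As a rough illustration of the traversal style I mean (the class name
`SafeTraversal`, the helper names and the document layout are invented for
the example; `path()`, `at()`, `isMissingNode()` and `JsonPointer.compile()`
are the actual Jackson methods):

```java
import com.fasterxml.jackson.core.JsonPointer;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class SafeTraversal {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Pointer instances are immutable, so compile once and share.
    private static final JsonPointer FIRST_TAG = JsonPointer.compile("/media/tags/0");

    static JsonNode parse(String json) throws Exception {
        return MAPPER.readTree(json);
    }

    // path() never returns null: a missing step yields MissingNode,
    // on which asInt(default) simply returns the default.
    static int width(JsonNode root) {
        return root.path("media").path("width").asInt(-1);
    }

    // at() also yields MissingNode when nothing matches the pointer.
    static String firstTag(JsonNode root) {
        JsonNode tag = root.at(FIRST_TAG);
        return tag.isMissingNode() ? null : tag.asText();
    }

    public static void main(String[] args) throws Exception {
        JsonNode full = parse("{\"media\":{\"width\":640,\"tags\":[\"a\",\"b\"]}}");
        JsonNode empty = parse("{}");

        System.out.println(width(full) + " " + firstTag(full));
        // No NullPointerException even though "media" is absent.
        System.out.println(width(empty) + " " + firstTag(empty));
    }
}
```

With `Map`s and `List`s the same lookups would need a null check and a cast
at every level.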
> You mentioned about using Avro to represent JSON in memory. To the best of my
> knowledge, Avro requires a schema. Maybe I am missing something. Would you
> mind elaborating a little bit on your idea here?
>
> Best.
-+ Tatu +-
>
> On Wednesday, February 27, 2019 at 12:48:07 PM UTC+1, Burak Emre Kabakcı
> wrote:
>>
>> 1. JsonNode doesn't actually have any magic under the hood. It uses
>> Map<String, JsonNode> for objects, List<JsonNode> for arrays and wrapper
>> objects for primitive types. Therefore, it's not memory efficient IMO so it
>> actually depends on your priorities:
>> 1.1. If you care about the code complexity compared to memory bottleneck
>> or doesn't really have any schema, I would stick with JsonNode as it's
>> feature-rich and well-documented.
>> 1.2. If you care about the performance and doesn't really have any option
>> to use class data binding, you may write your custom parser and deserialize
>> the JSON blob into a more compact representation such as Apache Avro rather
>> than JsonNode. If you have a metadata store, you can basically parse the
>> JSON blob, validate it and convert them into Avro instances which will
>> occupy less heap memory. We prefer this approach as it also provides us a
>> native way to validate the JSON blob.
>>
>> 2. AFAIK, JsonNodeFactory is thread-safe and stateless but it would be
>> better if someone who has experience with it could answer this question.
>>
>> On Tuesday, February 26, 2019 at 11:48:13 PM UTC+3, [email protected]
>> wrote:
>>>
>>> Hello,
>>>
>>> In a code base I have been working on for more than a year, we receive JSON
>>> (as byte[] from queue), we transform JSON, and we persist JSON (to
>>> Elasticsearch). In the initial design, tempted by its convenience, we made
>>> the mistake of representing the JSON as java.lang.Object in this JSON
>>> pipeline. Transformation functions receive Object and spit out Object.
>>> Though this introduced other problems (e.g., certain types like Set<V> do
>>> not map 1-to-1 to JSON, variable types do not properly communicate the
>>> intent of the value, lost caching and object pooling opportunities, etc.)
>>> revealed themselves in pretty late stages of the development. If I would
>>> have been starting from scratch today, I would pick Jackson's JsonNode as
>>> the only allowed JSON type. Though getting off the beaten path might raise
>>> other issues that we cannot oversee right now and this is where my question
>>> comes into play.
>>>
>>> What are the best practices to represent JSON documents to allow efficient
>>> read/write/update operations? (Note that in our case there is no schema,
>>> hence no class data binding.)
>>> If one would use JsonNode, what are the best practices for using
>>> JsonNodeFactory'ies? A single global factory? A thread-local factory? No
>>> factory but explicit JsonNode::new calls?
>>>
>>> Thanks in advance.
>>> Best.
>
> --
> You received this message because you are subscribed to the Google Groups
> "jackson-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> For more options, visit https://groups.google.com/d/optout.