On Thu, Apr 18, 2019 at 7:43 AM Artashes Aghajanyan <[email protected]> wrote:
> Thank you for the additional insight.
>
> In my specific case I used a different workaround as I know the size of
> the json object arriving, so I was able to just slice my original input
> buffer (create another view of the buffer without copying the bytes) and
> only then wrap it with an input stream. This way the problem goes away as
> it can't consume more from that sliced input stream than there is data in
> it.
>
> In a more generic case, just a small note that returning the data buffered
> but not parsed to the user with releaseBuffered() is not free - it
> involves copying data. I understand it's a trade off. But can we say that a
> more performant approach would be to not give all the data to the parser in
> the first place if you don't want it to read and parse everything from the
> stream?
>

I think that is true, although I am not sure memcpy of a partial
buffer is significant enough to worry about.
But yes in general it is more efficient not to do something at all in
the first place.

-+ Tatu +-

>
> On Wednesday, 17 April 2019 21:38:48 UTC-4, Tatu Saloranta wrote:
>>
>> On Wed, Apr 17, 2019 at 6:34 PM Artashes Aghajanyan
>> <[email protected]> wrote:
>> >
>> > I noticed the following behavior that feels like a bug to me, but I
>> > want to confirm if it's not by design before opening an issue in github.
>> >
>> > jackson-databind-2.4.3
>> >
>> > Consider the following code fragment:
>> >
>> > ByteBuf in = ... // a bytebuf containing multiple jsons, e.g. {"ack":"a1"}{"ack":"b1"}{"ack":"b3"}
>> > ByteBufInputStream inputStream = new ByteBufInputStream(in);
>> > Map<String, String> ackMap = mapper.readValue(inputStream, Map.class);
>> >
>> > After this readValue() call, inputStream becomes empty (nothing left to
>> > read) but only the first {"ack":"a1"} object is parsed and returned. I
>> > debugged it a bit, here's what's happening:
>> >
>> > ObjectMapper.readValue(InputStream src, Class<T> valueType) calls
>> > _jsonFactory.createParser(src) which calls
>> > ByteSourceJsonBootstrapper.detectEncoding() which calls
>> > ByteSourceJsonBootstrapper.ensureLoaded(4).
>> >
>> > ensureLoaded(4) basically tries to read 4k bytes from the stream.
>> >
>> > If the input stream contains multiple small (less than 4k?) json
>> > objects, it reads everything from the stream, just to detect the encoding!
>> >
>> > The problem with this approach is that once the data is read from the
>> > stream it is essentially lost for the user of object mapper, so if we have
>> > a stream that contains a series of small json strings, it'll read all of it
>> > just to detect the encoding but will only return the first json from
>> > readValue() call.
>> >
>> > Since ObjectMapper doesn't "own" the stream, one may expect that it
>> > won't consume more data from the stream than is necessary to parse one json
>> > object.
>> >
>> > I've also tried this with the latest 2.9.8 release and the behavior is
>> > the same.
>> >
>> > Is this a bug?
>>
>> No, it is by design.
>>
>> Decoding is most efficient directly accessing byte[] for content, and
>> overhead for reads from InputStream is non-trivial (depending on type
>> of stream). Reads request buffer full of content, although if stream
>> returns less whatever is available is consumed first before requesting
>> more.
>>
>> If content buffered needs to be recovered for some reason it is
>> available using one of 2 methods:
>>
>> releaseBuffered(OutputStream)
>> releaseBuffered(Writer)
>>
>> which will then pass buffered but unused content, if any; method to
>> call depends on kind of input source parser has been created with.
>>
>> -+ Tatu +-
>>
>
> --
> You received this message because you are subscribed to the Google Groups
> "jackson-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
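
For readers following along, the slicing workaround described in this thread can be sketched with JDK classes only. This is an illustration under stated assumptions, not the poster's actual code: ByteArrayInputStream stands in for Netty's ByteBufInputStream, greedyRead is a hypothetical stand-in for a parser that asks for a full buffer on its first read, and the 8000-byte buffer size is chosen only for the example. The length of each JSON object is assumed to be known in advance, as it was in the poster's case.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SlicedStreamSketch {

    // Hypothetical stand-in for a buffering parser: one read that requests
    // a full buffer and returns how many bytes the stream handed over.
    static int greedyRead(InputStream in) {
        byte[] buffer = new byte[8000]; // size is an assumption for this sketch
        try {
            int n = in.read(buffer, 0, buffer.length);
            return Math.max(n, 0); // treat end-of-stream (-1) as 0 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A bounded view over data[offset, offset + length). The stream shares
    // the backing array, so no bytes are copied.
    static InputStream sliceOf(byte[] data, int offset, int length) {
        return new ByteArrayInputStream(data, offset, length);
    }

    public static void main(String[] args) {
        byte[] data = "{\"ack\":\"a1\"}{\"ack\":\"b1\"}{\"ack\":\"b3\"}"
                .getBytes(StandardCharsets.US_ASCII);
        int objectLength = "{\"ack\":\"a1\"}".length(); // known up front

        // Unbounded stream: a single greedy read drains all three objects.
        System.out.println("unbounded: " + greedyRead(new ByteArrayInputStream(data)));

        // Sliced streams: each read is capped at one object's bytes.
        for (int offset = 0; offset < data.length; offset += objectLength) {
            System.out.println("slice at " + offset + ": "
                    + greedyRead(sliceOf(data, offset, objectLength)));
        }
    }
}
```

The caller, not the parser, keeps track of the offset into the original buffer, so whatever the parser buffers internally never extends past the slice it was given.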
