Re: Improve "direct" serialization

Vladimir Ozerov Sat, 15 Oct 2016 01:16:16 -0700

Suppose you have a 100-byte message. If it is split across two
buffers, you can write it as follows:


[Chunk 1]
65 - 65 bytes to follow
false - incomplete flag
data - 65 bytes

[Chunk 2]
35 - 35 bytes to follow
true - complete flag, this is the end
data - 35 bytes
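
A minimal Java sketch of this framing, for illustration only. It assumes a
4-byte length followed by a 1-byte completion flag (the 5 reserved bytes
mentioned earlier in the thread) and fixed-size buffers. The class and method
names (ChunkFraming, encode, decode) are hypothetical and are not Ignite API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class ChunkFraming {
    // Assumed header layout: 4-byte payload length + 1-byte "complete" flag.
    static final int HEADER_SIZE = 5;

    // Splits a message into chunks that each fit a buffer of the given capacity.
    static List<ByteBuffer> encode(byte[] message, int bufferCapacity) {
        if (bufferCapacity <= HEADER_SIZE)
            throw new IllegalArgumentException("buffer too small for header plus data");

        List<ByteBuffer> chunks = new ArrayList<>();
        int offset = 0;

        do {
            ByteBuffer buf = ByteBuffer.allocate(bufferCapacity);
            int len = Math.min(message.length - offset, bufferCapacity - HEADER_SIZE);
            boolean last = offset + len == message.length;

            buf.putInt(len);                 // N bytes to follow
            buf.put((byte) (last ? 1 : 0));  // 1 = complete, 0 = more chunks follow
            buf.put(message, offset, len);   // the payload itself
            buf.flip();

            chunks.add(buf);
            offset += len;
        } while (offset < message.length);

        return chunks;
    }

    // Reassembles a message; copying happens only because chunks are separate buffers.
    static byte[] decode(List<ByteBuffer> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        for (ByteBuffer buf : chunks) {
            int len = buf.getInt();
            boolean complete = buf.get() != 0;

            byte[] part = new byte[len];
            buf.get(part);
            out.write(part, 0, len);

            if (complete)
                return out.toByteArray();
        }

        throw new IllegalStateException("final chunk is missing");
    }

    public static void main(String[] args) {
        // 100-byte message, 70-byte buffers: 65 + 35 bytes of payload.
        for (ByteBuffer chunk : encode(new byte[100], 70))
            System.out.println(chunk.getInt(0) + " bytes, complete=" + (chunk.get(4) != 0));
    }
}
```

With 70-byte buffers, a 100-byte message is split into a 65-byte chunk and a
35-byte chunk, matching the layout above.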

On Oct 15, 2016 at 1:06, "Valentin Kulichenko" <
valentin.kuliche...@gmail.com> wrote:

> Vova,
>
> You still can't write the first buffer to channel before the whole message
> is serialized. Are you going to have multiple write buffers? I'm confused
> :)
>
> -Val
>
> On Fri, Oct 14, 2016 at 1:49 PM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
> > Val,
> >
> > No need to copy on write. You simply reserve 5 bytes before writing the
> > object, and fill them in after the object is finished.
> >
> > If the object is split between two buffers, you set a special marker,
> > meaning that the next part is to follow in the next chunk.
> >
> > Vladimir.
> >
> > On Oct 14, 2016 at 22:36, "Valentin Kulichenko" <
> > valentin.kuliche...@gmail.com> wrote:
> >
> > > Vova,
> > >
> > > I meant the copy on write. To know the length you need to fully marshal
> > > the message first. This means that you will always copy the array before
> > > writing to the channel. Unless I'm missing something, this defeats the
> > > purpose of direct serialization.
> > >
> > > -Val
> > >
> > > > On Thu, Oct 13, 2016 at 11:09 PM, Vladimir Ozerov <voze...@gridgain.com>
> > > > wrote:
> > >
> > > > Valya,
> > > >
> > > > > Yes, in this design we will copy data into a separate buffer on read.
> > > > > But what is important is that it will happen only for a message that
> > > > > is split between buffers.
> > > >
> > > > On Fri, Oct 14, 2016 at 2:33 AM, Valentin Kulichenko <
> > > > valentin.kuliche...@gmail.com> wrote:
> > > >
> > > > > Vladimir,
> > > > >
> > > > > > We don't write the length because we don't know it in advance. Sounds
> > > > > > like you're proposing to marshal the message first and then copy it
> > > > > > to the write buffer. But that's actually our previous implementation,
> > > > > > and the whole purpose of direct serialization was to avoid this copy.
> > > > >
> > > > > The optimization for writes sounds interesting, though.
> > > > >
> > > > > -Val
> > > > >
> > > > > > On Thu, Oct 13, 2016 at 3:51 AM, Vladimir Ozerov <voze...@gridgain.com>
> > > > > > wrote:
> > > > >
> > > > > > Writes can be optimized even further:
> > > > > > > 1) Write to *ByteBuffer* as long as there is room in it.
> > > > > > > 2) When it is full, invoke a callback which will submit it to the
> > > > > > > socket, reset position to 0, and continue marshaling.
> > > > > > >
> > > > > > > This way we can probably get rid of write "state" altogether.
> > > > > >
> > > > > > > On Thu, Oct 13, 2016 at 1:17 PM, Vladimir Ozerov <voze...@gridgain.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > Folks,
> > > > > > >
> > > > > > > > I went through our so-called "direct serialization" and it does
> > > > > > > > not appear very efficient to me. We never write the message
> > > > > > > > length. As a result, we have to constantly track what was written
> > > > > > > > and what was not, and whether we have room for the next write.
> > > > > > > > The same goes for the reader. As a result, even a single
> > > > > > > > "writeInt" is surrounded with multiple checks and writes.
> > > > > > >
> > > > > > > > It looks like we can make our algorithm much simpler, more
> > > > > > > > straightforward, and more efficient if we add two things to every
> > > > > > > > message:
> > > > > > > - Message length
> > > > > > > - Flag indicating whether it was written fully or not.
> > > > > > >
> > > > > > > > If the message was written fully to the buffer, we do not need to
> > > > > > > > perform any checks during deserialization. To read an int it is
> > > > > > > > enough to call *ByteBuffer.getInt()*. To read a byte array it is
> > > > > > > > enough to call *ByteBuffer.get(byte[])*, etc. Simple and fast.
> > > > > > >
> > > > > > > > And only if the message was split into pieces on either the send
> > > > > > > > or receive side, which should not happen often, would we fall
> > > > > > > > back to the current implementation. Or maybe we could copy such
> > > > > > > > messages to a separate buffer and still read them without any
> > > > > > > > boundary checks and "incrementStates".
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
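
The two write-side ideas from the quoted thread (reserve 5 bytes before
writing and fill them in afterwards; when the buffer is full, hand it to a
callback and continue marshaling from the start) can be combined roughly as
in the sketch below. This is an illustration under assumptions, not the Ignite
implementation: ChunkedWriter and its methods are hypothetical names, the
header is assumed to be a 4-byte length plus a 1-byte flag, and the callback
is assumed to consume the buffer synchronously because the buffer is reused.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

class ChunkedWriter {
    // Assumed header layout: 4-byte payload length + 1-byte "complete" flag.
    static final int HEADER_SIZE = 5;

    private final ByteBuffer buf;
    // Hypothetical callback that submits a filled buffer to the socket.
    private final Consumer<ByteBuffer> flush;

    ChunkedWriter(int capacity, Consumer<ByteBuffer> flush) {
        if (capacity <= HEADER_SIZE)
            throw new IllegalArgumentException("buffer too small for header plus data");

        this.buf = ByteBuffer.allocate(capacity);
        this.flush = flush;
    }

    // Reserve room for the header; it is filled in when the chunk is closed.
    void beginMessage() {
        buf.clear();
        buf.position(HEADER_SIZE);
    }

    // Marshal bytes straight into the buffer, flushing whenever it fills up.
    void write(byte[] src) {
        int off = 0;

        while (off < src.length) {
            if (!buf.hasRemaining())
                closeChunk(false); // more of this message follows in the next chunk

            int n = Math.min(buf.remaining(), src.length - off);
            buf.put(src, off, n);
            off += n;
        }
    }

    void endMessage() {
        closeChunk(true);
    }

    // Patch the reserved header, hand the buffer to the callback, then reuse it.
    private void closeChunk(boolean complete) {
        int len = buf.position() - HEADER_SIZE;

        buf.putInt(0, len);                      // absolute write into reserved space
        buf.put(4, (byte) (complete ? 1 : 0));

        buf.flip();
        flush.accept(buf);

        buf.clear();
        buf.position(HEADER_SIZE);
    }

    public static void main(String[] args) {
        ChunkedWriter w = new ChunkedWriter(70,
            b -> System.out.println(b.getInt(0) + " bytes, complete=" + (b.get(4) != 0)));

        w.beginMessage();
        w.write(new byte[100]);
        w.endMessage();
    }
}
```

Note that the message is never marshaled into a temporary array first: the
length written in each header is the length of that chunk only, which is
known as soon as the chunk is closed.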
