Re: Improve "direct" serialization

2016-10-15 Thread Vladimir Ozerov
Suppose you have a message of 100 bytes. If it is split between two
buffers, you can write it as follows:

[Chunk 1]
65 - 65 bytes to follow
false - incomplete flag
data - 65 bytes

[Chunk 2]
35 - 35 bytes to follow
true - complete flag, this is the end
data - 35 bytes
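
A minimal Java sketch of this framing, assuming each chunk carries a 4-byte
length plus a 1-byte "last chunk" flag; the ChunkWriter name and the header
layout are illustrative, not the actual Ignite wire format:

    import java.nio.ByteBuffer;

    public final class ChunkWriter {
        /** Writes one chunk of {@code msg} starting at {@code off}; returns the new offset. */
        public static int writeChunk(ByteBuffer buf, byte[] msg, int off) {
            int room = buf.remaining() - 5;             // Space left after the 5-byte header.

            if (room <= 0)
                return off;                             // Caller must flush the buffer first.

            int len = Math.min(room, msg.length - off); // Payload size of this chunk.
            boolean last = off + len == msg.length;     // Is this the final chunk?

            buf.putInt(len);                            // "N bytes to follow".
            buf.put((byte)(last ? 1 : 0));              // Complete flag.
            buf.put(msg, off, len);                     // Payload.

            return off + len;
        }
    }

For a 100-byte message and a buffer with room for 65 payload bytes this yields
exactly the two chunks above: [65, incomplete, data] and, after a flush,
[35, complete, data].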



Re: Improve "direct" serialization

2016-10-14 Thread Valentin Kulichenko
Vova,

You still can't write the first buffer to the channel before the whole
message is serialized. Are you going to have multiple write buffers? I'm
confused :)

-Val

On Fri, Oct 14, 2016 at 1:49 PM, Vladimir Ozerov 
wrote:

> Val,
>
> No need to copy on write. You simply reserve 5 bytes before writing the
> object, and put them in after the object is finished.
>
> If the object is split between two buffers, you set a special marker,
> meaning that the next part follows in the next chunk.
>
> Vladimir.
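
A minimal sketch of the "reserve 5 bytes" idea from the quoted message,
assuming the reserved header is a 4-byte length plus a 1-byte complete flag;
the class name and the Consumer used as a stand-in for the real message
writer are illustrative, not Ignite APIs:

    import java.nio.ByteBuffer;
    import java.util.function.Consumer;

    public final class HeaderPatchExample {
        /** Reserves the header, marshals the body in place, then patches length and flag. */
        public static void write(ByteBuffer buf, Consumer<ByteBuffer> body) {
            int hdrPos = buf.position();

            buf.position(hdrPos + 5);       // Reserve 4 bytes for the length + 1 byte for the flag.

            body.accept(buf);               // Marshal the object directly into the buffer, no copy.

            int len = buf.position() - hdrPos - 5;

            buf.putInt(hdrPos, len);        // Patch the length once it is known...
            buf.put(hdrPos + 4, (byte)1);   // ...and mark the message as complete.
        }
    }

If the buffer fills up before the object is finished, the flag byte would be
left at 0 and the remainder would continue in the next chunk, as described in
the quoted message.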


Re: Improve "direct" serialization

2016-10-14 Thread Valentin Kulichenko
Vova,

I meant the copy on write. To know the length you need to fully marshal the
message first. This means that you will always copy the array before
writing to the channel. Unless I'm missing something, this eliminates the
purpose of direct serialization.

-Val



Re: Improve "direct" serialization

2016-10-14 Thread Yakov Zhdanov
I already tried something like that when I tried to make marshalling parallel.

Here is how I approached it:
1. Allocate a set of direct buffers per session (connection). In fact it was
one large offheap array sliced into smaller buffers.
2. Each thread acquires chunks one by one from the session it wants to write
to.
3. Each chunk has a header: chunk length (31 bits) and a last flag (1 bit) -
see the packing sketch after the lists below.
4. A message can span several chunks.
5. All message chunks are passed to the NIO thread.
6. The NIO thread takes chunks from the queue until the queue is empty or the
total chunk length equals the socket buffer size, and then calls
java.nio.channels.SocketChannel#write(java.nio.ByteBuffer[], int, int).


On the reader side:
1. Read into a large byte buffer.
2. Scan it for chunks and messages that are fully read.
3. Unmarshal each message in the system pool, passing slices over the larger
buffer.
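
A possible way to pack such a chunk header, with the highest bit of a single
int carrying the "last" flag and the remaining 31 bits carrying the length;
the layout and names are assumptions for illustration, not the code that was
actually tried:

    public final class ChunkHeader {
        /** Packs a chunk length (31 bits) and the "last" flag (1 bit) into one int. */
        public static int pack(int len, boolean last) {
            assert len >= 0 : "length must fit into 31 bits";

            return last ? (len | 0x80000000) : len;   // Highest bit carries the flag.
        }

        /** Extracts the chunk length. */
        public static int length(int hdr) {
            return hdr & 0x7FFFFFFF;
        }

        /** Checks whether this is the last chunk of a message. */
        public static boolean isLast(int hdr) {
            return (hdr & 0x80000000) != 0;
        }
    }

The NIO thread could then pass the accumulated chunk buffers to
java.nio.channels.SocketChannel#write(java.nio.ByteBuffer[], int, int) as a
single gathering write, as in step 6 above.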

This approach worked a bit slower than the current solution, but brought a lot
of complex code, including more complex synchronization. So I decided not to
continue with it.

--Yakov



Re: Improve "direct" serialization

2016-10-14 Thread Vladimir Ozerov
Valya,

Yes, in this design we will copy data into a separate buffer on read. But
what is important, it will happen only for a message that is split between
buffers.

On Fri, Oct 14, 2016 at 2:33 AM, Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Vladimir,
>
> We don't write the length because we don't know it in advance. Sounds like
> you're proposing to marshal the message first and then copy it to the write
> buffer. But that's actually our previous implementation and the whole
> purpose of direct serialization was to avoid this copy.
>
> The optimization for writes sounds interesting, though.
>
> -Val


Re: Improve "direct" serialization

2016-10-13 Thread Vladimir Ozerov
Writes can be optimized even further:
1) Write to *ByteBuffer* as long as there is space in it.
2) When it is full, invoke a callback which will submit it to the socket,
reset the position to 0, and continue marshalling.

This way we can probably get rid of the write "state" altogether.
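
A rough sketch of how such a callback-driven writer could look; the class and
method names are made up, and the flush is shown as a simple blocking loop
rather than the actual NIO machinery:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    public final class FlushingWriter {
        private final SocketChannel ch;
        private final ByteBuffer buf;

        public FlushingWriter(SocketChannel ch, int size) {
            this.ch = ch;
            this.buf = ByteBuffer.allocateDirect(size);
        }

        /** Writes a byte, flushing the buffer to the socket first if it is full. */
        public void writeByte(byte b) throws IOException {
            if (!buf.hasRemaining())
                flush();                // The "callback": submit the full buffer and reset.

            buf.put(b);
        }

        private void flush() throws IOException {
            buf.flip();

            while (buf.hasRemaining())
                ch.write(buf);          // Drain the buffer to the socket.

            buf.clear();                // Reset position to 0 and continue marshalling.
        }
    }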



Improve "direct" serialization

2016-10-13 Thread Vladimir Ozerov
Folks,

I went through our so-called "direct serialization" and it appears to be not
very efficient to me. We never write the message length. As a result we have
to constantly track what was written and what was not, and whether we have
room for the next write. The same goes for the reader. As a result, even a
single "writeInt" is surrounded with multiple checks and writes.

It looks like we can make our algorithm much simpler, more straightforward
and more efficient if we add two things to every message:
- Message length
- Flag indicating whether it was written fully or not.

If the message was written fully to the buffer, we do not need to perform any
checks during deserialization. To read an int it is enough to call
*ByteBuffer.getInt()*. To read a byte array it is enough to call
*ByteBuffer.get(byte[])*, etc. Simple and fast.

And only if the message was split into pieces on either the send or receive
side, which should not happen often, we may want to fall back to the current
implementation. Or maybe we can copy such messages to a separate buffer and
still read them without any boundary checks and "incrementStates".

Thoughts?

Vladimir.
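
A sketch of the proposed reader fast path, assuming a hypothetical header of
a 4-byte length plus a 1-byte "written fully" flag; a null return stands for
"wait for more data or fall back to the stateful reader":

    import java.nio.ByteBuffer;

    public final class FastPathReader {
        /** Returns a slice holding exactly one complete message body, or null. */
        public static ByteBuffer nextMessage(ByteBuffer readBuf) {
            if (readBuf.remaining() < 5)
                return null;                              // Header has not fully arrived yet.

            int len = readBuf.getInt(readBuf.position());
            boolean complete = readBuf.get(readBuf.position() + 4) != 0;

            if (!complete || readBuf.remaining() < 5 + len)
                return null;                              // Split message: use the fallback path.

            readBuf.position(readBuf.position() + 5);     // Skip the header.

            ByteBuffer msg = readBuf.slice();             // No copy, no per-field boundary checks.
            msg.limit(len);

            readBuf.position(readBuf.position() + len);   // Advance past the message.

            return msg;
        }
    }

Fields of the message can then be read from the returned slice with plain
*ByteBuffer.getInt()*, *ByteBuffer.get(byte[])* and so on, exactly because the
whole message is known to be inside the buffer.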