[ 
https://issues.apache.org/jira/browse/IGNITE-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-17871:
-----------------------------------
    Description: 
h3. Problem

Currently, there are two places where {{Command}} instances are being 
serialized:
 * ActionRequest - here the command property is marked as Marshallable, meaning 
that it will be serialized using the User Object Serialization (UOS) approach
 * Listener - here command is explicitly serialized using JDKMarshaller, for 
further handling by RAFT. This is the data that will be written to the Log and 
deserialized on followers / learners

What are the problems?

For the ActionRequest message, the command is expected to be its largest part. 
Although writing a pre-serialized UOS byte array into a netty socket is faster 
than optimized marshalling, overall throughput is likely to be lower. The 
reason is that there's an extra step of converting the command into a byte 
array, and that step happens in the caller thread.

For serialization in listeners, using JDKMarshaller is both slow and 
inefficient in terms of space. An obvious example: network serialization of a 
SnapshotMeta object can be condensed to 8 bytes (assuming we optimize 
"writeShort" and change its message type), while JDKMarshaller produces 232 
bytes. Of course, most of the fields are null here and a real payload will be 
bigger, but JDKMarshaller will always produce more data simply because it has 
to store schema meta-information.
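
To illustrate the overhead of schema meta-information, here is a minimal sketch 
that compares plain JDK serialization with a hand-written fixed layout. The 
{{Meta}} class and both helper methods are hypothetical stand-ins, not Ignite's 
SnapshotMeta or JDKMarshaller, so the byte counts will differ from the ones 
quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializedSizeComparison {
    /** Hypothetical metadata object with two long fields (assumption, not Ignite's SnapshotMeta). */
    static final class Meta implements Serializable {
        private static final long serialVersionUID = 1L;

        final long lastIncludedIndex;
        final long lastIncludedTerm;

        Meta(long lastIncludedIndex, long lastIncludedTerm) {
            this.lastIncludedIndex = lastIncludedIndex;
            this.lastIncludedTerm = lastIncludedTerm;
        }
    }

    /** Standard JDK serialization: the stream carries a class descriptor next to the field values. */
    static byte[] jdkSerialize(Meta meta) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(meta);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Fixed layout: only the two field values are written, 8 bytes each. */
    static byte[] compactSerialize(Meta meta) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(meta.lastIncludedIndex);
            out.writeLong(meta.lastIncludedTerm);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The fixed layout always takes 16 bytes for this class, while the JDK stream 
adds a stream header and a class descriptor on top of the same 16 bytes of 
field data.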
h3. Solution

Making Command an implementation of NetworkMessage will solve both of these 
problems. ActionRequest will no longer need its "prepareMarshal" phase, and 
listeners will get a fast and space-efficient serialization algorithm.

Of course, there must be drawbacks. I'll try to explain what I see at the 
moment.
 * Currently, there's no explicit support for List properties, only Collection. 
It is easy to fix
 * CMG commands use classes like ClusterNode and IgniteProductVersion. We 
should introduce message alternatives
 * I saw some enums being used, they are not natively supported at the moment. 
There are two options:
 ** implement native support. I consider this a dangerous path
 ** store explicit ordinal where it's necessary
 * ByteBuffer support would be really nice to have natively. It should also be 
quick to implement
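
One possible reading of the "explicit ordinal" option is sketched below: the 
message stores a plain int, and the enum maps to and from it. The 
{{ConditionType}} enum and its constants are hypothetical and exist only for 
illustration. The sketch uses hand-assigned ids rather than {{ordinal()}}, 
which is my own assumption, so that reordering the constants does not change 
the persisted values.

```java
final class ExplicitEnumIds {
    /** Hypothetical enum; ids are assigned by hand and must never be reused or changed. */
    enum ConditionType {
        EQUAL(0),
        NOT_EQUAL(1),
        EXISTS(2);

        private final int id;

        ConditionType(int id) {
            this.id = id;
        }

        /** Value that a message would store in an int property. */
        int id() {
            return id;
        }

        /** Restores the enum constant from the stored int value. */
        static ConditionType fromId(int id) {
            for (ConditionType type : values()) {
                if (type.id == id) {
                    return type;
                }
            }
            throw new IllegalArgumentException("Unknown condition type id: " + id);
        }
    }
}
```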

One important note: there should be no Marshallable properties in commands, 
because we can't persist them. Information about class ids is stored in 
sessions and can change between sessions. The way to enforce this is to pass a 
"null" UOS context into the serializer.

Now about the serializer: we can have thread-local buffers to write data to. 
When a write is complete, the data is copied out as a byte[]. Reading will be 
done directly from the byte[].
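
A minimal sketch of the thread-local buffer idea, assuming a fixed 4096-byte 
buffer and a caller-supplied writer callback. The class and method names are 
hypothetical, and a real implementation would need a growth strategy for 
larger commands.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.function.Consumer;

final class ThreadLocalBufferSerializer {
    /** One reusable heap buffer per thread; the 4096-byte capacity is an arbitrary assumption. */
    private static final ThreadLocal<ByteBuffer> BUFFER =
            ThreadLocal.withInitial(() -> ByteBuffer.allocate(4096));

    /**
     * Lets the writer fill the thread-local buffer, then copies the written
     * bytes into a fresh array of exactly the right size.
     * Throws BufferOverflowException if the writer exceeds the buffer capacity.
     */
    static byte[] serialize(Consumer<ByteBuffer> writer) {
        ByteBuffer buffer = BUFFER.get();
        buffer.clear(); // reset position and limit left over from the previous call
        writer.accept(buffer);
        return Arrays.copyOf(buffer.array(), buffer.position());
    }

    /** Reading needs no intermediate buffer: the byte[] is wrapped directly. */
    static ByteBuffer reader(byte[] data) {
        return ByteBuffer.wrap(data);
    }
}
```

For example, {{serialize(b -> b.putInt(42).putLong(7L))}} returns a 12-byte 
array, and {{reader(...)}} on that array yields the same two values in order.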

Possible optimization for ByteBuffers: we can implement them as slices of the 
byte[] payload instead of copying sub-arrays. This will save some time and 
memory.
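
The slicing idea can be sketched with the standard {{java.nio.ByteBuffer}} 
API. The {{PayloadSlices}} class and {{sliceOf}} method are hypothetical names 
used for illustration only.

```java
import java.nio.ByteBuffer;

final class PayloadSlices {
    /**
     * Returns a buffer that views payload[offset .. offset + length) without
     * copying. The slice shares the backing array, so later changes to the
     * payload are visible through the slice.
     */
    static ByteBuffer sliceOf(byte[] payload, int offset, int length) {
        return ByteBuffer.wrap(payload, offset, length).slice();
    }
}
```

Because the slice shares the backing array, this only works if the payload is 
not modified or reused while the deserialized command is still alive.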
h3. Plan

Given the volume of changes, I suggest splitting the issue into several parts. 
There are multiple sets of commands in Ignite:
 * Table commands (5 commands at the moment of writing this text)
 * CMG commands (6 commands)
 * Metastorage (19)

This list goes in order of complexity. Table commands are very simple. CMG 
commands require additional messages for ClusterNodes and such.

Metastorage commands have complicated structures for conditional updates, and 
there are many of them.

When all commands are messages, we can safely make Command extend 
NetworkMessage and remove Marshallable from the ActionRequest's field. In 
total, this looks like 4 separate issues, the 4th being the current one. List 
& ByteBuffer support is already completed in IGNITE-17874.


> Use network serialization for RAFT commands
> -------------------------------------------
>
>                 Key: IGNITE-17871
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17871
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
