[ 
https://issues.apache.org/jira/browse/IGNITE-28520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Abashev updated IGNITE-28520:
----------------------------------
    Description: 
*Background / Problem Statement:*

After moving the marshalling methods (prepareMarshal / finishUnmarshal) into 
the NIO thread, two related issues emerged:

1. Performance degradation (IGNITE-28473). Marshalling of 
   CustomObject/CacheObject now happens in a single NIO worker, whereas 
   previously it was done in parallel across user threads.
2. Deadlock in Discovery. The marshaller broadcasts a class registration 
   message across the cluster and waits for acknowledgement from all nodes. 
   If marshalling happens on the Discovery thread, a deadlock occurs: the 
   thread waits for a response to a message it is supposed to process itself.

Root cause: the serializer invokes prepareMarshal / finishUnmarshal directly on 
the sending thread (NIO / Discovery), whereas these methods must be executed on 
a user thread.
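
The self-wait behind the Discovery deadlock can be reproduced outside Ignite. The sketch below is only an illustration: it uses a single-threaded executor as a stand-in for the Discovery thread, and the class and method names are hypothetical. A task that waits for an acknowledgement which only its own thread can process never receives it (a timeout is used so the demo terminates), whereas the same wait on a separate "user" thread completes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SelfWaitSketch {
    /**
     * Runs a task on {@code waiter} that asks {@code processor} for an
     * acknowledgement and blocks until it arrives or 200 ms pass.
     * Returns true if the acknowledgement arrived, false on timeout.
     */
    static boolean ackReceivedWhenWaitingOn(ExecutorService waiter, ExecutorService processor)
        throws Exception {
        Future<Boolean> outer = waiter.submit(() -> {
            // Stand-in for the class registration message: only `processor` can handle it.
            Future<String> ack = processor.submit(() -> "ack");
            try {
                ack.get(200, TimeUnit.MILLISECONDS);
                return true;
            }
            catch (TimeoutException e) {
                ack.cancel(true); // give up so the demo terminates instead of hanging
                return false;
            }
        });
        return outer.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService discovery = Executors.newSingleThreadExecutor(); // stand-in for the Discovery thread
        ExecutorService user = Executors.newSingleThreadExecutor();      // stand-in for a user thread
        try {
            // The waiter occupies the only thread that could process the acknowledgement.
            System.out.println("waiting on discovery thread, ack received: "
                + ackReceivedWhenWaitingOn(discovery, discovery));
            // The waiter is a different thread, so the discovery thread is free to answer.
            System.out.println("waiting on user thread, ack received: "
                + ackReceivedWhenWaitingOn(user, discovery));
        }
        finally {
            discovery.shutdownNow();
            user.shutdownNow();
        }
    }
}
```

Without the timeout the first call would block forever, which is the situation described for the Discovery thread.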

*Proposed Solution (Phase 1):*
Implement two-phase marshalling for CacheObject fields:

- Phase 1, on the thread that calls send (user thread): methods added to the 
  generated serializer recursively traverse all @Order-annotated fields, 
  locate CacheObject fields (including nested ones and those inside 
  collections), invoke prepareMarshal on each, and store the result in a 
  byte[].
- Phase 2, on the NIO sending thread: the serializer reads the pre-computed 
  byte[] and writes it to the socket. prepareMarshal is not called.
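
A minimal sketch of the two phases, assuming hand-written code in the shape the generator might emit. CacheObj, InnerMsg, OuterMsg and the prepare / write methods are hypothetical stand-ins, not Ignite classes or the actual generated serializer API, and the wire format is invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseMarshalSketch {
    /** Hypothetical stand-in for CacheObject, not the Ignite interface. */
    static final class CacheObj {
        final String val;
        byte[] bytes; // filled by prepareMarshal in phase 1

        CacheObj(String val) { this.val = val; }

        /** Stands in for the expensive, possibly blocking marshalling step. */
        void prepareMarshal() {
            if (bytes == null)
                bytes = val.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical messages; every field is imagined to carry @Order. */
    static final class InnerMsg {
        CacheObj key;
        List<CacheObj> vals = new ArrayList<>();
    }

    static final class OuterMsg {
        CacheObj topVal;
        InnerMsg nested;
    }

    /** Phase 1 (user thread): visit every CacheObj field, including nested messages. */
    static void prepare(OuterMsg m) {
        if (m.topVal != null) m.topVal.prepareMarshal();
        if (m.nested != null) prepare(m.nested);
    }

    /** Phase 1 for the nested message, including CacheObj elements of a collection. */
    static void prepare(InnerMsg m) {
        if (m.key != null) m.key.prepareMarshal();
        for (CacheObj o : m.vals)
            if (o != null) o.prepareMarshal();
    }

    /** Phase 2 (NIO thread): copy pre-computed bytes only, never marshal. */
    static void write(OuterMsg m, ByteBuffer buf) {
        writeObj(m.topVal, buf);
        buf.put((byte) (m.nested != null ? 1 : 0)); // presence flag for the nested message
        if (m.nested != null) {
            writeObj(m.nested.key, buf);
            buf.putInt(m.nested.vals.size());
            for (CacheObj o : m.nested.vals) writeObj(o, buf);
        }
    }

    /** Writes a length-prefixed byte[]; -1 marks a null object. */
    static void writeObj(CacheObj o, ByteBuffer buf) {
        if (o == null) { buf.putInt(-1); return; }
        if (o.bytes == null)
            throw new IllegalStateException("prepareMarshal was not run before sending");
        buf.putInt(o.bytes.length).put(o.bytes);
    }

    /** Builds a small message with a top-level, a nested, and two collection-held objects. */
    static OuterMsg sample() {
        OuterMsg m = new OuterMsg();
        m.topVal = new CacheObj("a");
        m.nested = new InnerMsg();
        m.nested.key = new CacheObj("k1");
        m.nested.vals.add(new CacheObj("v1"));
        m.nested.vals.add(new CacheObj("v22"));
        return m;
    }

    public static void main(String[] args) throws InterruptedException {
        OuterMsg msg = sample();
        prepare(msg); // phase 1 on the calling (user) thread

        ByteBuffer buf = ByteBuffer.allocate(256);
        Thread nio = new Thread(() -> write(msg, buf), "nio-sketch"); // phase 2 on a separate thread
        nio.start();
        nio.join();

        System.out.println("bytes written by the NIO stand-in: " + buf.position());
    }
}
```

In this sketch the write path fails fast when it meets an unprepared object, which is one way to catch a field that the phase 1 traversal missed.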

This ticket covers only CacheObject fields whose serialization code is 
generated from @Order annotations. Manually written marshalling for 
MarshallableMessage fields (e.g. GridJobExecuteResponse::marshallUserData) 
and encapsulation of byte[] fields are deferred to the next ticket.

*Out of scope (next ticket):*

- Handling MarshallableMessage fields that require manual code.
- Hiding / encapsulating byte[] fields inside messages.


*Acceptance Criteria:*

- prepareMarshal / finishUnmarshal for CacheObject fields are invoked only on 
  a user thread, never on NIO / Discovery threads.
- The NIO worker only reads pre-computed bytes and writes them to the socket.
- Recursive traversal of @Order-annotated fields correctly handles nested 
  CacheObject instances and collections.
- The Discovery deadlock when sending messages with CustomObject is no longer 
  reproducible.
- No performance degradation (confirmed by JMH benchmarks, IGNITE-28119).
- Existing tests pass.
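
One possible way to check the first criterion in tests is a guard that fails when marshalling code runs on a transport thread. The sketch below is an assumption, not existing Ignite code: the class, the method names, and the thread-name prefixes are hypothetical and would need to be replaced with the names the NIO and Discovery workers actually use.

```java
public class ThreadGuardSketch {
    /** Hypothetical thread-name prefixes for NIO and Discovery workers. */
    static final String[] TRANSPORT_PREFIXES = {"grid-nio-worker", "tcp-disco"};

    /** Returns true if the thread name looks like an NIO or Discovery worker. */
    static boolean isTransportThread(String threadName) {
        for (String prefix : TRANSPORT_PREFIXES)
            if (threadName.startsWith(prefix))
                return true;
        return false;
    }

    /** Call at the start of prepareMarshal / finishUnmarshal in a test build. */
    static void assertUserThread() {
        String name = Thread.currentThread().getName();
        if (isTransportThread(name))
            throw new IllegalStateException("marshalling invoked on transport thread: " + name);
    }

    public static void main(String[] args) throws InterruptedException {
        assertUserThread(); // the main thread counts as a user thread here

        Throwable[] err = new Throwable[1];
        Thread nio = new Thread(() -> {
            try {
                assertUserThread();
            }
            catch (IllegalStateException e) {
                err[0] = e; // the guard is expected to trip on this thread
            }
        }, "grid-nio-worker-demo-0");
        nio.start();
        nio.join();

        System.out.println("guard tripped on NIO-named thread: " + (err[0] != null));
    }
}
```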



> Move prepareMarshal / finishUnmarshal out of NIO communication thread — Phase 
> 1: CacheObjects
> ---------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28520
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28520
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Alex Abashev
>            Assignee: Alex Abashev
>            Priority: Minor
>              Labels: IEP-132, ise
>             Fix For: 2.19
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
