Hello, Igniters!

I found a bug with handling events when the client reconnects.

Discovery has a queue for events, which is processed by the event
thread. If we hold up event processing with a listener on the client
side and restart the cluster, the client will reconnect. After it
reconnects, it will continue processing events from the previous cluster.

This behavior causes bugs in MvccProcessor [1] and a hang of the
partition exchange [2] on the client side. The reason is that
discovery notifies components about the reconnection in the notifier
thread by calling the 'onLocalJoin' method. After that (or at the same
time), components can process events from the previous cluster in
their listeners, which breaks their logic.

The order of events is fine: after processing the events from the
previous cluster, the client processes the disconnection/reconnection
events and then the events from the new cluster.
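
To illustrate the scenario, here is a minimal sketch in plain Java. It
is not Ignite code: the class name, the event names, and the latches
are made up for the example, and the main thread stands in for the
notifier thread. A blocked listener holds up the event thread while
'onLocalJoin' for the new cluster is delivered from another thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

public class StaleEventDemo {
    /** Returns what a component observes, in order. All names are hypothetical. */
    public static List<String> run() {
        List<String> seen = new CopyOnWriteArrayList<>();
        BlockingQueue<String> evtQueue = new LinkedBlockingQueue<>();
        CountDownLatch listenerBlocked = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Stands in for the discovery event thread: one thread drains the queue.
        Thread evtThread = new Thread(() -> {
            try {
                while (true) {
                    String evt = evtQueue.take();
                    if (evt.equals("STOP"))
                        return;
                    if (evt.equals("old:NODE_JOINED")) {
                        listenerBlocked.countDown();
                        release.await(); // A slow listener holds up event processing.
                    }
                    seen.add(evt);
                }
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "event-thread");
        evtThread.start();

        try {
            // Events from the previous cluster are queued; the first one blocks.
            evtQueue.add("old:NODE_JOINED");
            evtQueue.add("old:NODE_LEFT");
            listenerBlocked.await();

            // The cluster restarts and the client reconnects while the queue is stuck.
            evtQueue.add("CLIENT_DISCONNECTED");
            evtQueue.add("CLIENT_RECONNECTED");
            seen.add("onLocalJoin(new)"); // Delivered outside the event thread.

            release.countDown(); // The listener finishes; old events are processed now.
            evtQueue.add("STOP");
            evtThread.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(seen);
    }
}
```

In this sketch the component sees 'onLocalJoin(new)' first and only
then the two 'old:' events, followed by the disconnect and reconnect
events.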

I see two possible solutions for the issue:

1. Fix the discovery logic. Guarantee that no event from the previous
cluster is processed after the client reconnects (that is, after
'onLocalJoin' is called). For example, wait until the client disconnect
event has been processed in the discovery event thread, and only then
start the reconnect attempt.

2. Add synchronization logic to the components to handle this case.
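
Both options can be sketched roughly as follows. This is plain Java,
not a patch: 'Component', the event names, and the per-cluster
"generation" counter are assumptions made up for the illustration, and
a single-thread executor stands in for the discovery event thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReconnectFixSketch {
    /** Hypothetical component; not an Ignite class. */
    static class Component {
        private int joinedGen; // Generation of the cluster last joined (assumed counter).
        private final List<String> handled = new ArrayList<>();

        synchronized void onLocalJoin(int gen) {
            joinedGen = gen;
            handled.add("onLocalJoin:" + gen);
        }

        /** Option 2: drop events that belong to a cluster older than the joined one. */
        synchronized void onEvent(String name, int clusterGen) {
            if (clusterGen < joinedGen)
                return;
            handled.add(name);
        }

        synchronized List<String> snapshot() {
            return new ArrayList<>(handled);
        }
    }

    /** Option 1: reconnect only after the event thread has processed the disconnect event. */
    public static List<String> reconnectAfterDisconnectProcessed() {
        ExecutorService evtThread = Executors.newSingleThreadExecutor();
        Component c = new Component();
        try {
            evtThread.execute(() -> c.onEvent("old:NODE_JOINED", 1));
            evtThread.execute(() -> c.onEvent("old:NODE_LEFT", 1));
            CompletableFuture<Void> disconnectProcessed =
                CompletableFuture.runAsync(() -> c.onEvent("CLIENT_DISCONNECTED", 1), evtThread);

            disconnectProcessed.join(); // Barrier: all old-cluster events are handled by now.
            c.onLocalJoin(2);           // Only then reconnect and notify the component.

            CompletableFuture.runAsync(() -> c.onEvent("new:NODE_JOINED", 2), evtThread).join();
        }
        finally {
            evtThread.shutdown();
        }
        return c.snapshot();
    }

    /** Option 2: 'onLocalJoin' arrives first; stale events arrive later and are ignored. */
    public static List<String> dropStaleEventsInComponent() {
        Component c = new Component();
        c.onLocalJoin(2);
        c.onEvent("old:NODE_LEFT", 1);   // Stale: from the previous cluster, dropped.
        c.onEvent("new:NODE_JOINED", 2); // Current cluster, handled.
        return c.snapshot();
    }
}
```

With option 1 the check lives in one place (discovery), but the
reconnect is delayed for as long as a listener blocks the event
thread. With option 2 every affected component has to carry its own
check.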

What is the correct fix for this issue? Any thoughts?

[1] https://issues.apache.org/jira/browse/IGNITE-11460
[2] https://github.com/NSAmelchev/ignite/pull/26/files

-- 
Best wishes,
Amelchev Nikita
