Re: IGNITE-1948 ClusterTopologyCheckedException can return null for retryReadyFuture()

Alexey Goncharuk Mon, 30 Jan 2017 05:56:19 -0800

In the example above there is no point of setting the retry future in the
exception because it will be serialized and sent to a remote node.


I see the only possible way to ensure this: make this be verified at
compile time. This looks like a big change, but the draft may look like so:
1) Introduce new exception type, like CacheTopology*Checked*Exception which
*must* take the minimum topology version to wait for
2) In all cases when a cache operation fails because of topology change,
provide the appropriate exception
3) When CacheTopologyException is being constructed, create a corresponding
local future based on wait version and throw the exception.

Even though this still does not give us 100% guarantee that we did not miss
anything, it should cover a lot more cases than now.

2017-01-30 14:20 GMT+03:00 Александр Меньшиков <[email protected]>:

> Alexey,
>
> For GridCacheIoManager#sendNoRetry it's not necessary because exception
> will be ignored (or will cast to String). Perhaps my message was unclear.
> I try to say that three methods can throw "ClusterTopologyCheckedException"
> without any problem.
>
> But you are right, in some methods I can't set "retryFuture". We must
> ensure that exception doesn't escape away. The best option is to ignore it
> (or cast to String).
>
> But any way, look here:
>
> IgniteTxHandler#sendReply(UUID nodeId, GridDhtTxFinishRequest req, boolean
> committed, GridCacheVersion nearTxId)
>
> Inside you can found next lines:
> __________________________
> ClusterTopologyCheckedException *cause* =
>     new ClusterTopologyCheckedException("Primary node left grid.");
>
> res.checkCommittedError(new IgniteTxRollbackCheckedException("Failed to
> commit transaction " +
>     "(transaction has been rolled back on backup node): " + req.version(),
> *cause*));
> __________________________
>
> How do you unsure that *"cause"* can't come to 
> GridCacheUtils#retryTopologySafe(final
> Callable<S> c) ?
>
>
> Perhaps I don't know some rules which you use to maintain it now?
>
>
> 2017-01-30 12:24 GMT+03:00 Alexey Goncharuk <[email protected]>:
>
>> Alexander,
>>
>> Do you have a use-case in mind which results in IgniteTopologyException
>> thrown from public API with retryFuture unset?
>>
>> retryFuture makes sense only for certain cache operations and may be set
>> only in certain places in the code where we already know what to wait for.
>> I do not see how you can initialize this future, for example, in
>>  GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage msg,
>> byte
>> plc)
>>
>> --AG
>>
>> 2017-01-27 15:45 GMT+03:00 Александр Меньшиков <[email protected]>:
>>
>> > I found a lot of using "ClusterTopologyCheckedException" exception. In
>> > most
>> > use-case these exceptions were just ignored. It's easier to see if
>> > issue-4612 will be applied (I'm waiting for code review by my team
>> leader)
>> > where I renamed all unused exceptions in the "ignored".
>> >
>> > It means that in some case "retryReadyFuture" can left unset. But there
>> are
>> > code where exception is being getting from "get()" method in some
>> "future"
>> > class and don't ignored (exception is sending to method
>> > "GridFutureAdapter#onDone()" for example).
>> >
>> > So I decided not to do a deep global analysis of code and make some
>> simple
>> > rules.
>> > 1. If some method "X" throws ClusterTopologyCheckedException and
>> ignores
>> > it
>> > (or converts to String, there are some variants to lose exception like
>> > object) in all methods that call the method "X", then we can don't set
>> > "retryReadyFuture" for optimization. But this should be clarified in
>> > javadoc.
>> > 2. But if exception escapes at least once like object: we must set
>> > "retryReadyFuture" in that method.
>> >
>> > A few times when call tree was simple, I went deeper.
>> >
>> > So I found only three methods which can throw
>> > "ClusterTopologyCheckedException" without setting "retryReadyFuture":
>> >
>> > 1. IgfsFragmentizerManager#sendWithRetries(UUID nodeId,
>> > IgfsCommunicationMessage msg)
>> > 2. GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage
>> msg,
>> > byte plc)
>> > 3. GridCacheIoManager#sendOrderedMessage(ClusterNode node, Object
>> topic,
>> > GridCacheMessage msg, byte plc, long timeout)
>> >
>> > In all other methods "ClusterTopologyCheckedException" escapes away too
>> > far.
>> >
>> > What do you think about this approach to make compromise between
>> > optimization and correctness?
>> >
>>
>
>

Re: IGNITE-1948 ClusterTopologyCheckedException can return null for retryReadyFuture()

Reply via email to