Alexey, I just remind you about issue. 2017-02-20 16:42 GMT+03:00 Александр Меньшиков <sharple...@gmail.com>:
> Alexey, > > I'm ready to make some conclusions. You can see result immediately here: > PR: https://github.com/apache/ignite/pull/1537/files > CI Tests: http://ci.ignite.apache.org/project.html?projectId= > IgniteTests&tab=projectOverview&branch_IgniteTests=pull/1537/head > > I fixed only "ClusterTopologyCheckedException", and didn't touched > "ClusterTopologyException". "ClusterTopologyException" has a same problem > like "ClusterTopologyCheckedException", but even now changes is huge (80 > files). So if you think fixing of "ClusterTopologyException" is necessary, > you could add different issue in JIRA (or I can do that). And I will fix it > in future. I can't implement your idea about using Runtime exception, > because right now a lot of logic tied to the fact that this is > "IgniteCheckedException". I real tried to mix it but I couldn't make it > work. > > So before my changes we have 3 affected classes: > 1. "ClusterTopologyCheckedException". > 2. "ClusterTopologyServerNotFoundException". > 3. "ClusterGroupEmptyCheckedException". > > Now we there are 5 affected classes: > > "ClusterTopologyCheckedException" split into 2 classes: > 1. "ClusterTopologyCheckedException" (with future). > 2. "ClusterTopologyLocalException" (without future, and it's parent for " > ClusterTopologyCheckedException"). > > "ClusterTopologyServerNotFoundException" rename to > 3. "ClusterTopologyServerNotFoundLocalException" (For consistency of > names, I didn't find any cases where "ClusterTopologyServerNotFoundException" > need his future). > > "ClusterGroupEmptyCheckedException" split into 2 classes: > 4. "ClusterGroupEmptyCheckedException". > 5. "ClusterGroupEmptyLocalException". > > Also in code "ready future" is using for waiting, and don't using for get > real value (just look at code, all values just ignored). In fact "ready > future" has type "IgniteFuture<?>". It suggests that result doesn't matter. > > 2017-02-10 11:06 GMT+03:00 Alexey Goncharuk <alexey.goncha...@gmail.com>: > >> Alexander, >> >> I do not see why you would need to overwrite the cause field. >> >> What I meant in the previous email is that instead of setting a retry >> ready future in the checked exception, you would set a correct affinity >> topology version here. Then, during a construction of CacheException >> (unchecked, which is guaranteed to be thrown locally and will not be >> serialized), you would create the retry ready future based on the topology >> version set. >> >> Hope this helps, >> AG >> >> 2017-02-07 17:18 GMT+03:00 Александр Меньшиков <sharple...@gmail.com>: >> >>> Alexey, I ran into some difficulties. >>> >>> Look at the method: GridNearTxFinishFuture.CheckBa >>> ckupMiniFuture#onDhtFinishResponse(GridDhtTxFinishResponse res) >>> >>> *Throwable err* = res.checkCommittedError(); >>> >>> if (*err* != null) { >>> if (*err* instanceof IgniteCheckedException) { >>> *ClusterTopologyCheckedException cause* = >>> ((IgniteCheckedException)*err*). >>> *getCause(ClusterTopologyCheckedException.class)*; >>> >>> if (*cause* != null) >>> *cause.retryReadyFuture(*cctx.ne >>> xtAffinityReadyFuture(tx.topologyVersion())); >>> * //^_____Here update the readyFut in >>> some first "cause". * >>> } >>> >>> onDone(*err*); >>> } >>> >>> >>> So I can't rewrite "cause" field in exception without reflection. It >>> means if I try to use two exception: one with "readyFut" and second without >>> "readyFut", I see no way to do it in this place. >>> >>> Perhaps I misunderstood your original idea. Or maybe this code some kind >>> of a crutch and should be removed. What do you think? >>> >>> 2017-01-30 16:54 GMT+03:00 Alexey Goncharuk <alexey.goncha...@gmail.com> >>> : >>> >>>> In the example above there is no point of setting the retry future in >>>> the >>>> exception because it will be serialized and sent to a remote node. >>>> >>>> I see the only possible way to ensure this: make this be verified at >>>> compile time. This looks like a big change, but the draft may look like >>>> so: >>>> 1) Introduce new exception type, like CacheTopology*Checked*Exception >>>> which >>>> *must* take the minimum topology version to wait for >>>> 2) In all cases when a cache operation fails because of topology change, >>>> provide the appropriate exception >>>> 3) When CacheTopologyException is being constructed, create a >>>> corresponding >>>> local future based on wait version and throw the exception. >>>> >>>> Even though this still does not give us 100% guarantee that we did not >>>> miss >>>> anything, it should cover a lot more cases than now. >>>> >>>> 2017-01-30 14:20 GMT+03:00 Александр Меньшиков <sharple...@gmail.com>: >>>> >>>> > Alexey, >>>> > >>>> > For GridCacheIoManager#sendNoRetry it's not necessary because >>>> exception >>>> > will be ignored (or will cast to String). Perhaps my message was >>>> unclear. >>>> > I try to say that three methods can throw >>>> "ClusterTopologyCheckedException" >>>> > without any problem. >>>> > >>>> > But you are right, in some methods I can't set "retryFuture". We must >>>> > ensure that exception doesn't escape away. The best option is to >>>> ignore it >>>> > (or cast to String). >>>> > >>>> > But any way, look here: >>>> > >>>> > IgniteTxHandler#sendReply(UUID nodeId, GridDhtTxFinishRequest req, >>>> boolean >>>> > committed, GridCacheVersion nearTxId) >>>> > >>>> > Inside you can found next lines: >>>> > __________________________ >>>> > ClusterTopologyCheckedException *cause* = >>>> > new ClusterTopologyCheckedException("Primary node left grid."); >>>> > >>>> > res.checkCommittedError(new IgniteTxRollbackCheckedException("Failed >>>> to >>>> > commit transaction " + >>>> > "(transaction has been rolled back on backup node): " + >>>> req.version(), >>>> > *cause*)); >>>> > __________________________ >>>> > >>>> > How do you unsure that *"cause"* can't come to >>>> GridCacheUtils#retryTopologySafe(final >>>> >>>> > Callable<S> c) ? >>>> > >>>> > >>>> > Perhaps I don't know some rules which you use to maintain it now? >>>> > >>>> > >>>> > 2017-01-30 12:24 GMT+03:00 Alexey Goncharuk < >>>> alexey.goncha...@gmail.com>: >>>> > >>>> >> Alexander, >>>> >> >>>> >> Do you have a use-case in mind which results in >>>> IgniteTopologyException >>>> >> thrown from public API with retryFuture unset? >>>> >> >>>> >> retryFuture makes sense only for certain cache operations and may be >>>> set >>>> >> only in certain places in the code where we already know what to >>>> wait for. >>>> >> I do not see how you can initialize this future, for example, in >>>> >> GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage >>>> msg, >>>> >> byte >>>> >> plc) >>>> >> >>>> >> --AG >>>> >> >>>> >> 2017-01-27 15:45 GMT+03:00 Александр Меньшиков <sharple...@gmail.com >>>> >: >>>> >> >>>> >> > I found a lot of using "ClusterTopologyCheckedException" >>>> exception. In >>>> >> > most >>>> >> > use-case these exceptions were just ignored. It's easier to see if >>>> >> > issue-4612 will be applied (I'm waiting for code review by my team >>>> >> leader) >>>> >> > where I renamed all unused exceptions in the "ignored". >>>> >> > >>>> >> > It means that in some case "retryReadyFuture" can left unset. But >>>> there >>>> >> are >>>> >> > code where exception is being getting from "get()" method in some >>>> >> "future" >>>> >> > class and don't ignored (exception is sending to method >>>> >> > "GridFutureAdapter#onDone()" for example). >>>> >> > >>>> >> > So I decided not to do a deep global analysis of code and make some >>>> >> simple >>>> >> > rules. >>>> >> > 1. If some method "X" throws ClusterTopologyCheckedException and >>>> >> ignores >>>> >> > it >>>> >> > (or converts to String, there are some variants to lose exception >>>> like >>>> >> > object) in all methods that call the method "X", then we can don't >>>> set >>>> >> > "retryReadyFuture" for optimization. But this should be clarified >>>> in >>>> >> > javadoc. >>>> >> > 2. But if exception escapes at least once like object: we must set >>>> >> > "retryReadyFuture" in that method. >>>> >> > >>>> >> > A few times when call tree was simple, I went deeper. >>>> >> > >>>> >> > So I found only three methods which can throw >>>> >> > "ClusterTopologyCheckedException" without setting >>>> "retryReadyFuture": >>>> >> > >>>> >> > 1. IgfsFragmentizerManager#sendWithRetries(UUID nodeId, >>>> >> > IgfsCommunicationMessage msg) >>>> >> > 2. GridCacheIoManager#sendNoRetry(ClusterNode node, >>>> GridCacheMessage >>>> >> msg, >>>> >> > byte plc) >>>> >> > 3. GridCacheIoManager#sendOrderedMessage(ClusterNode node, Object >>>> >> topic, >>>> >> > GridCacheMessage msg, byte plc, long timeout) >>>> >> > >>>> >> > In all other methods "ClusterTopologyCheckedException" escapes >>>> away too >>>> >> > far. >>>> >> > >>>> >> > What do you think about this approach to make compromise between >>>> >> > optimization and correctness? >>>> >> > >>>> >> >>>> > >>>> > >>>> >>> >>> >> >