Just some points looking questionable to me, although I realize the error handling style may be very opinionated:
- Would it make sense splitting the proposed composite error code (TBL-0001) into separate numeric code (0001) and scope/category ("TBL") to avoid parsing the code when an error handler needs to analyze only the category or the code? - "*The cause - short string description of an issue, readable by user.*". This terminology sounds confusing to me since that "cause" sounds like Java Throwable's Message to me and the "Cause" is a lower level exception. - "*The action - steps for a user to resolve error...*". The action is very important but do we want to make it part of the IgniteException? I do not think the recovery action text should be part of the exception. IgniteException may include a URL pointing to the corresponding documentation - this is discussable. - "*All public methods throw only unchecked IgniteException*" - I assume we still respect JCache specification and prefer using standard Java exception to signal about invalid parameters. - Why we do not follow the "classic" structured exception handling practices in Ignite: - Why do we not allow using checked exceptions? It seems to me sometimes forcing the user to handle an error serves as a hint and thus improves usability. For example, handling an optimistic/pessimistic transaction conflict/deadlock. Or handling a timeout for operations with timeouts. - Why single public IgniteException and no exception hierarchy? Java is optimized for structured exception handling instead of using IF-ELSE to analyze the codes. - Why no nested exceptions? Sometimes an error handler is interested only in high level exceptions (like Invalid Configuration) and sometimes details are needed (like specific configuration parser exceptions). - For async methods returning a Future we may have a universal rule on how to handle exceptions. For example, we may specify that any async method can throw only invalid argument exceptions. All other errors are reported via the exceptionally(IgniteException -> {}) callback even if the async method was executed synchronously. вт, 13 апр. 2021 г. в 12:08, Alexei Scherbakov <alexey.scherbak...@gmail.com >: > Igniters, > > I would like to start the discussion about error handling in Ignite 3 and > how we can improve it compared to Ignite 2. > > The error handling in Ignite 2 was not very good because of generic > CacheException thrown on almost any occasion, having deeply nested root > cause and often containing no useful information on further steps to fix > the issue. > > I aim to fix it by introducing some rules on error handling. > > *Public exception structure.* > > A public exception must have an error code, a cause, and an action. > > * The code - the combination of 2 byte scope id and 2 byte error number > within the module. This allows up to 2^16 errors for each scope, which > should be enough. The error code string representation can look like > RFT-0001 or TBL-0001 > * The cause - short string description of an issue, readable by user. This > can have dynamic parameters depending on the error type for better user > experience, like "Can't write a snapshot, no space left on device {0}" > * The action - steps for a user to resolve error situation described in the > documentation in the corresponding error section, for example "Clean up > disk space and retry the operation". > > Common errors should have their own scope, for example IGN-0001 > > All public methods throw only unchecked > org.apache.ignite.lang.IgniteException containing aforementioned fields. > Each public method must have a section in the javadoc with a list of all > possible error codes for this method. > > A good example with similar structure can be found here [1] > > *Async timeouts.* > > Because almost all API methods in Ignite 3 are async, they all will have a > configurable default timeout and can complete with timeout error if a > computation is not finished in time, for example if a response has not been > yet received. > I suggest to complete the async op future with TimeoutException in this > case to make it on par with synchronous execution using future.get, which > will throw java.util.concurrent.TimeoutException on timeout. > For reference, see java.util.concurrent.CompletableFuture#orTimeout > No special error code should be used for this scenario. > > *Internal exceptions hierarchy.* > > All internal exceptions should extend > org.apache.ignite.internal.lang.IgniteInternalException for checked > exceptions and > org.apache.ignite.internal.lang.IgniteInternalCheckedException for > unchecked exceptions. > > Thoughts ? > > [1] https://docs.oracle.com/cd/B10501_01/server.920/a96525/preface.htm > > -- > > Best regards, > Alexei Scherbakov > -- Best regards, Alexey