I do back the idea of having a glossary of common errors. There is even a ticket for it that I created a couple of years ago; I can search for it later.
Val, how about the 3.0 suggestion? Let's introduce error codes.

On Monday, January 4, 2021, Michael Cherkasov <michael.cherka...@gmail.com> wrote:

> Hi Ilya,
>
> It's about logs only; I don't think we need this at the API level.
> Error codes will make the solutions more searchable.
> Plus, we can build troubleshooting guides based on them, and it will
> help us gather information from the user list and StackOverflow.
>
> Even a solution for trivial cases will be helpful. I was once asked to
> join a call late in the evening because Ignite failed to copy a WAL
> file and there was simply no space left on the disk.
> While the error was obvious to me, it's not obvious to all users.
>
> Let's start with something simple: just assign error codes to
> absolutely all exceptions first. Then, in the next year or two, the
> user list will be full of error codes and solutions for them.
>
> Maybe it's a change for Ignite 3.0? @Val, I think you can help with
> this question.
>
> Any thoughts/comments?
>
> Thanks,
> Mike.
>
> On Sat, Jan 2, 2021 at 12:18, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
>
>> Hello!
>>
>> I don't think there's a direct link between an exception thrown in
>> the depths of Ignite code and the specific error which may be
>> reported to the user.
>>
>> A notorious example is CorruptedTreeException, which is known to be
>> thrown due to an incorrect field type in a binary object or a bad SQL
>> cast. So we could document it as "If you get IGN13 error this means
>> your persistence is corrupted beyond repair. This, or you have a typo
>> in your SQL." - of course it will not help anyone.
>>
>> This means we can't get to the desired result by applying step 1.
>>
>> There's got to be a different plan. First of all, we need to decide
>> what our target is. Is it the log, or is it the API?
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Fri, Jan 1, 2021 at 02:07, Michael Cherkasov <michael.cherka...@gmail.com> wrote:
>>
>>> Hi folks,
>>>
>>> I was thinking about how we can simplify troubleshooting of Ignite
>>> clusters. The best option, of course, is for the cluster to
>>> self-heal, e.g. cancel a transaction if it blocks exchange, or
>>> restart a node on an OOM error. However, sometimes those mechanisms
>>> don't work well or user interaction is required.
>>> Not all errors are obvious to users, and it's not always clear what
>>> actions are required to restore the cluster.
>>> If you google exceptions or error messages, the results can be
>>> ambiguous and uncertain, because different errors can have similar
>>> exceptions and you need to analyze the stack trace to distinguish
>>> them. So googling isn't a straightforward process in this case.
>>> Almost all major DBs have error codes [1][2][3].
>>> Let's do the same for Ignite: error codes are easy to google, so the
>>> user/dev list will be significantly more useful. We can have
>>> documentation with an error code registry and solutions for the
>>> errors.
>>>
>>> To implement this we need to do the following:
>>> 1. All error messages/exceptions must have a unique error code (so
>>>    no new PR may be accepted if any of its exceptions/errors lack
>>>    an error code).
>>> 2. To avoid error code duplication, all error codes will be stored
>>>    as files under some folder.
>>> 3. Those files can be a source of documentation for the error code.
>>>
>>> All these files can be empty at first. Later, if an exception
>>> appears on the user list and someone finds a solution, first, other
>>> people can easily google it by error code, and second, we can build
>>> documentation for this error code based on the user-list thread,
>>> StackOverflow, or another source.
>>>
>>> Any thoughts?
>>>
>>> [1] MySQL https://dev.mysql.com/doc/refman/8.0/en/error-message-elements.html
>>> [2] OracleDB https://docs.oracle.com/pls/db92/db92.error_search
>>> [3] PostgreSQL https://www.postgresql.org/docs/10/errcodes-appendix.html
>>>
>>> Thanks,
>>> Mike.
>>>
>>

--
-
Denis
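
For illustration only, here is a minimal Java sketch of what steps 1 and 2 of the quoted proposal might look like in code. Nothing here is existing Ignite API: ErrorCodeRegistry, CodedException and the code format "IGN-0001" are hypothetical names made up for this example, and an in-memory map stands in for the "files under some folder" idea. The code is put into the exception message so that it shows up in the logs, which is the target Mike names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry: every error code must be registered exactly once. */
final class ErrorCodeRegistry {
    /** Code -> short description; stands in for one file per code. */
    private static final Map<String, String> CODES = new ConcurrentHashMap<>();

    private ErrorCodeRegistry() {
        // Static holder, not meant to be instantiated.
    }

    /** Registers a code and returns it; fails fast if the code is already taken. */
    static String register(String code, String description) {
        String prev = CODES.putIfAbsent(code, description);

        if (prev != null)
            throw new IllegalStateException("Duplicate error code: " + code);

        return code;
    }

    /** Returns the description for a code, or null if the code is unknown. */
    static String describe(String code) {
        return CODES.get(code);
    }
}

/** Hypothetical exception that carries an error code and prefixes its message with it. */
class CodedException extends RuntimeException {
    private final String code;

    CodedException(String code, String msg, Throwable cause) {
        // The "[CODE] message" form keeps the code searchable in logs.
        super("[" + code + "] " + msg, cause);
        this.code = code;
    }

    String code() {
        return code;
    }
}
```

A caller would register a code once, e.g. ErrorCodeRegistry.register("IGN-0001", "Failed to copy WAL file"), and then throw new CodedException(code, "No space left on device", cause). As Ilya points out, this alone does not make a low-level exception map to a single user-facing problem; it only makes the message easier to search for.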