I do back this idea of having a glossary of common errors. There is even a
ticket for it that I created a couple of years ago; I can search for it later.

Val, how about the 3.0 suggestion? Let’s introduce error codes.

On Monday, January 4, 2021, Michael Cherkasov <michael.cherka...@gmail.com>
wrote:

> Hi Ilya,
>
> It's about logs only; I don't think we need this at the API level. Error
> codes will make the solutions more searchable.
> Plus, we can build troubleshooting guides based on them, which will help us
> gather information from the user list and StackOverflow.
>
> Even a solution for trivial cases will be helpful. Once I was asked to
> join a call late in the evening because Ignite had failed to copy a WAL file,
> and there was simply no space left on the disk.
> While the error was obvious to me, it's not obvious to all users.
>
> Let's start with something simple: just assign error codes to
> absolutely all exceptions first. Then, within the next year or two, the user
> list will be full of error codes and solutions for them.
>
> Maybe it's a change for Ignite 3.0? @Val, I think you can help with
> this question.
>
> Any thoughts/comments?
>
> Thanks,
> Mike.
>
> On Sat, Jan 2, 2021 at 12:18, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
>
>> Hello!
>>
>> I don't think there's a direct link between an exception thrown in the
>> depths of Ignite code and the specific error that may be reported to the user.
>>
>> A notorious example is CorruptedTreeException, which is known to be thrown
>> due to an incorrect field type in a binary object or a bad SQL cast. So we could
>> document it as "If you get the IGN13 error, this means your persistence is
>> corrupted beyond repair. Either that, or you have a typo in your SQL." Of course,
>> that will not help anyone.
>>
>> This means we can't get to the desired result by applying item 1 of your plan.
>>
>> There's got to be a different plan. First of all, we need to decide
>> what our target is. Is it the log, or is it the API?
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Fri, Jan 1, 2021 at 02:07, Michael Cherkasov <
>> michael.cherka...@gmail.com> wrote:
>>
>>> Hi folks,
>>>
>>> I was thinking about how we can simplify troubleshooting of Ignite clusters.
>>> The best option, of course, is for the cluster to heal itself, for example by
>>> cancelling a transaction if it blocks exchange, or by restarting a node on an
>>> OOM error. However, sometimes those mechanisms don't work well, or user
>>> interaction is required.
>>> Not all errors are obvious to users, and it's not always clear what actions
>>> are required to restore the cluster.
>>> If you google exceptions or error messages, the results can be ambiguous
>>> and uncertain, because different errors can have similar exceptions and you
>>> need to analyze the stack trace to distinguish them. So googling isn't a
>>> straightforward and easy process in this case.
>>> Almost all major DBs have error codes [1][2][3].
>>> Let's do the same for Ignite. Error codes are easy to google, so the user/dev
>>> lists will be significantly more useful. We can have documentation with an
>>> error code registry and solutions for the errors.
>>>
>>> To implement this we need to do the following:
>>> 1. All error messages/exceptions must have a unique error code (so a new
>>> PR must NOT be accepted if any of its exceptions/errors lack error
>>> codes).
>>> 2. To avoid error code duplication, all error codes will be stored as
>>> files under some folder.
>>> 3. Those files can be a source of documentation for the corresponding
>>> error code.
>>>
>>> All these files can be empty at first. Later, if an exception appears on the
>>> user list and someone finds a solution, then first, other people can easily
>>> google it by error code, and second, we can build documentation for this
>>> error code based on the user-list thread, StackOverflow, or another source.
>>>
>>> Any thoughts?
>>>
>>> [1] MySQL https://dev.mysql.com/doc/refman/8.0/en/error-message-elements.html
>>> [2] OracleDB https://docs.oracle.com/pls/db92/db92.error_search
>>> [3] PostgreSQL https://www.postgresql.org/docs/10/errcodes-appendix.html
>>>
>>> Thanks,
>>> Mike.
>>>
>>
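
To make steps 1 to 3 of Mike's original message a bit more concrete, here is a
rough sketch of what a file-based registry might look like. Everything in it is
an assumption for illustration only: the class name ErrorCodeRegistry, the
one-".txt"-file-per-code layout, and the "IGN<number>" code format (borrowed
from Ilya's IGN13 example) are hypothetical and not part of Ignite.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical registry: one file per error code, e.g. errors/IGN13.txt. */
public class ErrorCodeRegistry {
    /** Error code -> documentation text (may be empty). */
    private final Map<String, String> docs = new TreeMap<>();

    /**
     * Loads every *.txt file in the given folder. The file name without the
     * extension is the error code, so two identical codes cannot coexist.
     */
    public static ErrorCodeRegistry load(Path dir) throws IOException {
        ErrorCodeRegistry reg = new ErrorCodeRegistry();

        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                String code = name.substring(0, name.length() - ".txt".length());

                reg.docs.put(code, Files.readString(f).trim());
            }
        }

        return reg;
    }

    /** Returns true if a file exists for this code. */
    public boolean isRegistered(String code) {
        return docs.containsKey(code);
    }

    /** Prefixes a log message with its code; rejects codes that have no file. */
    public String format(String code, String msg) {
        if (!isRegistered(code))
            throw new IllegalArgumentException("Unregistered error code: " + code);

        return "[" + code + "] " + msg;
    }

    /** Returns the documented solution, or an empty string if none is written yet. */
    public String solution(String code) {
        return docs.getOrDefault(code, "");
    }
}
```

With a file IGN1.txt in the folder, format("IGN1", "Failed to copy WAL file")
would produce "[IGN1] Failed to copy WAL file", which is the searchable string
users would paste into Google or the user list. A check that rejects
unregistered codes could run in CI to enforce step 1, but that part is not
shown here.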

-- 
-
Denis
