[ 
https://issues.apache.org/jira/browse/IGNITE-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814588#comment-16814588
 ] 

Ivan Rakov commented on IGNITE-11392:
-------------------------------------

[~Denis Chudov],
1. I propose to throttle with lrtDumpLimiter only the logic that sends and 
prints the stack trace from the near node. The logic that prints the current 
state of LRTs should be kept as is.
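
A rough sketch of the split proposed in point 1, assuming a simple fixed-window limiter. The Limiter class, the dump method and the action strings are hypothetical stand-ins, not the API of lrtDumpLimiter from the patch; the only point is that the local state is always printed while the near node request goes through the limiter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the patch. */
class LrtDumpSketch {
    /** Stand-in limiter: grants at most {@code max} permits per fixed time window. */
    static final class Limiter {
        private final int max;
        private final long windowMs;
        private long windowStart;
        private int used;

        Limiter(int max, long windowMs) {
            this.max = max;
            this.windowMs = windowMs;
        }

        /** @return {@code true} if a permit is still available in the current window. */
        synchronized boolean tryAcquire(long nowMs) {
            if (nowMs - windowStart >= windowMs) {
                windowStart = nowMs; // Start a new window and reset the counter.
                used = 0;
            }

            if (used < max) {
                used++;

                return true;
            }

            return false;
        }
    }

    /** Records which actions would be taken for each long-running transaction. */
    static List<String> dump(List<String> lrts, Limiter limiter, long nowMs) {
        List<String> actions = new ArrayList<>();

        for (String tx : lrts) {
            actions.add("print-state:" + tx); // Local LRT state: never throttled.

            if (limiter.tryAcquire(nowMs))
                actions.add("request-near-dump:" + tx); // Near node dump: throttled.
        }

        return actions;
    }
}
```

With a limit of one request per window, three LRTs seen at the same moment still produce three state lines but only one near node request.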
2. Catch clauses should be on a separate line.
{code:java}
} catch (Exception e) { // Incorrect
{code}
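
For comparison, a minimal sketch of the layout point 2 asks for, with the catch clause starting its own line. The class and method are hypothetical and only illustrate the formatting.

```java
/** Hypothetical example class; only the placement of the catch clause matters here. */
class CatchStyleExample {
    /** Parses a number, falling back to a default value on malformed input. */
    static int parseOrDefault(String s, int dflt) {
        try {
            return Integer.parseInt(s);
        }
        catch (NumberFormatException e) { // Correct: catch starts on its own line.
            return dflt;
        }
    }
}
```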
3. IDEA suggests replacing "new StackTraceElement[0]" with a constant; I tend 
to agree.
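
One possible shape for such a constant is sketched below. The class, field and method names are hypothetical; sharing the array is safe because a zero-length array has no elements that could be modified.

```java
/** Hypothetical holder class; the names are illustrative only. */
class StackTraceConstants {
    /** Shared empty array, reused instead of allocating a new one on every call. */
    static final StackTraceElement[] EMPTY_STACK_TRACE = new StackTraceElement[0];

    /** Returns the stack trace of the thread, or the shared empty array if the thread is null. */
    static StackTraceElement[] stackTraceOf(Thread t) {
        return t == null ? EMPTY_STACK_TRACE : t.getStackTrace();
    }
}
```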
4. Warning message is not quite correct:
{code:java}
U.warn(diagnosticLog, "Could not receive dump from transaction owner near node: " +
    e.getMessage());
{code}
We can't "receive" anything in this try block, as we only trigger an 
asynchronous computation. "Can't send" would be more accurate.
5. Let's mention a transaction identifier (e.g. xid or nearXidVersion) in 
this message:
{code:java}
                                if (traceDump != null)
                                    U.warn(diagnosticLog, traceDump);
{code}
This would make trace dump messages easier to analyze in the log afterwards.
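
A sketch of what point 5 could look like, assuming the identifier is available as a string at this point. The helper, its parameters and the message wording are hypothetical, not taken from the patch.

```java
/** Hypothetical helper; the message wording is only an illustration. */
class TraceDumpMessage {
    /** Prefixes the dump with the transaction identifier so that log lines can be correlated. */
    static String format(String nearXidVer, String traceDump) {
        return "Dumping the near node thread that started transaction [nearXidVersion=" +
            nearXidVer + "]\n" + traceDump;
    }
}
```

The formatted string would then be passed to U.warn(diagnosticLog, ...) in place of the bare traceDump.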
6. Maybe use {@inheritDoc} instead of a blank javadoc?
{code:java}
    /** */
    @Override public void run() {
        ((IgniteEx)ignite)
            .context()
            .cache()
            .context()
            .tm()
            .setTxOwnerDumpRequestsAllowed(allowed);
{code}
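
A self-contained sketch of the suggested javadoc. The class below is a hypothetical stand-in for the closure in the patch, and a static flag replaces the real call chain ending in setTxOwnerDumpRequestsAllowed(allowed), so that the example compiles on its own.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the closure from the patch. */
class TxOwnerDumpRequestsToggle implements Runnable {
    /** Stand-in for the setting that the real closure changes on the transaction manager. */
    static final AtomicBoolean TX_OWNER_DUMP_REQUESTS_ALLOWED = new AtomicBoolean(true);

    /** Value to apply. */
    private final boolean allowed;

    /** @param allowed Whether dump requests to the transaction owner are allowed. */
    TxOwnerDumpRequestsToggle(boolean allowed) {
        this.allowed = allowed;
    }

    /** {@inheritDoc} */
    @Override public void run() {
        TX_OWNER_DUMP_REQUESTS_ALLOWED.set(allowed);
    }
}
```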


> Improve LRT diagnostic messages
> -------------------------------
>
>                 Key: IGNITE-11392
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11392
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Assignee: Denis Chudov
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently we print out only local information about long-running 
> transactions. This makes it hard to understand the cause of the LRT when an 
> ACTIVE transaction is initiated by a client in a large system and is not 
> being committed.
> Given that a primary node knows the near node ID, we can send a diagnostic 
> message to the near node for an ACTIVE transaction, find the thread that 
> started the transaction and dump its stack, so the server node logs can at 
> least give an idea why the transaction is not being committed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
