Hi George,

Thank you for your excellent analysis!
Please find my solution attached as a
patch. Basically I needed to change
PendingRequest subclass only.

Best regards,
Frank

George Tsaloukidis wrote:
Dear Frank,



I have tested the solution with the usage of pendingRetry flag,

but the problem still remains.



I am afraid the pendingRetry flag is raised too late. As I wrote in the first mail,

target object can be null in pduHandleAssigned() due to two different reasons:



a)         PendingRequest was constructed from the original PendingRequest

that had been canceled before the creation of the copy.

That means the thread of TransportMapping informed for the receive of the

response after the valuation of "retry" flag in PendingRequest.run() method and

main thread managed to call the cancel method on original PendingRequest

before the creation of the copy. So the pendingRetry should become true

when the retry flag is evaluated and not after.



       synchronized (pendingRequests) {

         retry = (!finished) && (retryCount > 0) && (!responseReceived);

         if(retry)

         {

           this.pendingRetry = true;

         }

       }



By placing the assignment of flag inside the synchronized block, we are sure that

the main thread is waiting for the notify call (so cancel cannot be called).



b)         cancel() method of the copy PendingRequest was called.

This could only happen if main thread found that into the pendingRequests list.

This means TimerThread managed to execute "pendingRequests.put(handle, this);"

in pduHandleAssigned() and main thread removed it from the pendingRequests list

and called the cancel() method before the access of target from TimerThread.



We can use the pendingRetry flag to protect us in this case too.

The flag has to be raised before the insertion of the copy into the pendingRequests list

and it can be turned again to false after the calculation of the "delay" local variable.



I am looking forward for your comments,


George Tsaloukidis.



----- Original Message ----- From: "Frank Fock" <[email protected]>
To: "George Tsaloukidis" <[email protected]>
Cc: <[email protected]>
Sent: Monday, April 06, 2009 1:08 AM
Subject: Re: [SNMP4J] NullPointerException while resending a GET request


Hi George,

Synchronizing the PendingRequest.cancel() method causes deadlock
problems. The source problem is the early object release
in that method. The retry creation phase can be protected by
a simple flag that is true (on) while the next retry is being
prepared and not yet sent.

Please replace the PendingRequest class in Snmp.java with the
code below (or the next nightly build) and let me know if
the NPE goes away:

Best regards,
Frank


  class PendingRequest extends TimerTask implements PduHandleCallback {

    private PduHandle key;
    protected int retryCount;
    protected ResponseListener listener;
    protected Object userObject;

    protected PDU pdu;
    protected Target target;
    protected TransportMapping transport;

    private int requestStatus = 0;
// Maximum request status - allows to receive up to two reports and then // send the original request again. A value of 0 is used for discovery.
    private int maxRequestStatus = 2;

    private volatile boolean finished = false;
    private volatile boolean responseReceived = false;

    private volatile boolean pendingRetry = false;


    public PendingRequest(ResponseListener listener,
                          Object userObject,
                          PDU pdu,
                          Target target,
                          TransportMapping transport) {
      this.userObject = userObject;
      this.listener = listener;
      this.retryCount = target.getRetries();
      this.pdu = pdu;
      this.target = (Target) target.clone();
      this.transport = transport;
    }

    private PendingRequest(PendingRequest other) {
      other.pendingRetry = true;
      this.userObject = other.userObject;
      this.listener = other.listener;
      this.retryCount = other.retryCount - 1;
      this.pdu = other.pdu;
      this.target = other.target;
      this.requestStatus = other.requestStatus;
      this.responseReceived = other.responseReceived;
      this.transport = other.transport;
    }

    protected void registerRequest(PduHandle handle) {

    }

    public void responseReceived() {
      this.responseReceived = true;
    }

public synchronized void pduHandleAssigned(PduHandle handle, Object pdu) {
      if (key == null) {
        key = handle;
        pendingRequests.put(handle, this);
        registerRequest(handle);
        if (logger.isDebugEnabled()) {
          logger.debug("Running pending "+
                       ((listener instanceof SyncResponseListener) ?
                        "sync" : "async")+
                       " request with handle " + handle+
                       " and retry count left "+retryCount);
        }
        long delay = timeoutModel.getRetryTimeout(target.getRetries() -
                                                  retryCount,
                                                  target.getRetries(),
                                                  target.getTimeout());
        if ((!finished) && (!responseReceived)) {
          timer.schedule(this, delay);
        }
      }
    }

    /**
     * Process retries of a pending request.
     */
    public synchronized void run() {
      PduHandle m_key = key;
      PDU m_pdu = pdu;
      Target m_target = target;
      TransportMapping m_transport = transport;
      ResponseListener m_listener = listener;
      Object m_userObject = userObject;

      if ((m_key == null) || (m_pdu == null) || (m_target == null) ||
          (m_listener == null)) {
        if (logger.isDebugEnabled()) {
logger.debug("PendingRequest canceled key="+m_key+", pdu="+m_pdu+ ", target="+m_target+", transport="+m_transport+", listener="+
              m_listener);
        }
        return;
      }

      try {
        boolean retry;
        synchronized (pendingRequests) {
          retry = (!finished) && (retryCount > 0) && (!responseReceived);
        }
        if (retry) {
          try {
            PendingRequest nextRetry = new PendingRequest(this);
            sendMessage(m_pdu, m_target, m_transport, nextRetry);
            this.pendingRetry = false;
          }
          catch (IOException ex) {
            finished = true;
            logger.error("Failed to send SNMP message to " + m_target +
                         ": " +
                         ex.getMessage());
messageDispatcher.releaseStateReference(m_target.getVersion(),
                m_key);
            listener.onResponse(new ResponseEvent(Snmp.this, null,
m_pdu, null, m_userObject,
                                                  ex));
          }
        }
        else if (!finished) {
          finished = true;
          pendingRequests.remove(m_key);

          if (!responseReceived) {
            // request timed out
            if (logger.isDebugEnabled()) {
logger.debug("Request timed out: " + m_key.getTransactionID());
            }
messageDispatcher.releaseStateReference(m_target.getVersion(),
                                                    m_key);
            listener.onResponse(new ResponseEvent(Snmp.this, null,
m_pdu, null, m_userObject));
          }
        }
        else {
// make sure pending request is removed even if response listener
          // failed to call Snmp.cancel
          pendingRequests.remove(m_key);
        }
      }
      catch (RuntimeException ex) {
        logger.error("Failed to process pending request " + m_key +
                     " because " + ex.getMessage(), ex);
        throw ex;
      }
      catch (Error er) {
        logger.fatal("Failed to process pending request " + m_key +
                     " because " + er.getMessage(), er);
        throw er;
      }
    }

    public boolean setFinished() {
      boolean currentState = finished;
      this.finished = true;
      return currentState;
    }

    public void setMaxRequestStatus(int maxRequestStatus) {
      this.maxRequestStatus = maxRequestStatus;
    }

    public int getMaxRequestStatus() {
      return maxRequestStatus;
    }

    public boolean isResponseReceived() {
      return responseReceived;
    }

    /**
     * Cancels the request and clears all internal fields by setting them
     * to <code>null</code>.
     * @return
     *    <code>true</code> if cancellation was successful.
     */
    public boolean cancel(){
        boolean result = super.cancel();

        // free objects early
        if (!pendingRetry) {
          key = null;
          pdu = null;
          target = null;
          transport = null;
          listener = null;
          userObject = null;
        }
        return result;
    }
  }


George Tsaloukidis wrote:
Dear Frank,

I am using snmp4j version 1.9.3d for executing synchronous snmp GET requests. My application has many threads, but each thread uses its own org.snmp4j.Snmp instance.
Sometimes I get the following exception:

java.lang.NullPointerException
at org.snmp4j.Snmp$PendingRequest.pduHandleAssigned(Unknown Source)
        at org.snmp4j.MessageDispatcherImpl.sendPdu(Unknown Source)
        at org.snmp4j.Snmp.sendMessage(Unknown Source)
        at org.snmp4j.Snmp$PendingRequest.run(Unknown Source)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
Exception in thread "Timer-4877612" java.lang.NullPointerException
at org.snmp4j.Snmp$PendingRequest.pduHandleAssigned(Unknown Source)
        at org.snmp4j.MessageDispatcherImpl.sendPdu(Unknown Source)
        at org.snmp4j.Snmp.sendMessage(Unknown Source)
        at org.snmp4j.Snmp$PendingRequest.run(Unknown Source)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)

After rebuilding SNMP4J.jar with debug information I saw that the problem was at line 1449 of Snmp.java file:

        long delay = timeoutModel.getRetryTimeout(target.getRetries() -
                                                  retryCount,
                                                  target.getRetries(),
                                                  target.getTimeout());

That means the target object was null while trying to retransmit the request. That can happen only for two reasons: a) PendingRequest of the retry action created just after the cancellation of the very first PendingRequest (line 1487) or,
the PendingRequest canceled while inside the pduHandleAssigned() method.
The second case can appear only if the main thread gets that object from the pendingRequests list (line 811). That means the Timer thread had previously managed to execute the 1440 line and suspended for a while.

The problem is that the send() method (line 793) can call the cancel() operation on a PendingRequest at any time, without taking into consideration if the TimerThread is executed.

A solution can be to first lock the PendingRequest before calling setFinished() and cancel() methods. Line 811 should also go after the cancellation of the first request or else if PendingRequest.run() has started, the newly created PendingRequest is left into the pendingRequests list for ever.
I have tested the above solution and I had no problem.

And something else. At line 810 the wait() method is called.
Sun advices such calls to be wrapped into a while loop
(see http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.8.1 and http://java.sun.com/javase/6/docs/api/java/lang/Object.html#wait() links)

while(syncResponse.response == null)
{
   syncResponse.wait();
}

Best regards,

George Tsaloukidis,
Senior Software Engineer
Systems Software Developement
Intracom Telecom S.A
_______________________________________________
SNMP4J mailing list
[email protected]
http://lists.agentpp.org/mailman/listinfo/snmp4j

--
AGENT++
http://www.agentpp.com
http://www.mibexplorer.com
http://www.mibdesigner.com



--
AGENT++
http://www.agentpp.com
http://www.snmp4j.com
http://www.mibexplorer.com
http://www.mibdesigner.com
Index: Snmp.java
===================================================================
--- Snmp.java   (revision 459)
+++ Snmp.java   (working copy)
@@ -804,7 +804,9 @@
           new PendingRequest(syncResponse, target, pdu, target, transport);
       handle = sendMessage(pdu, target, transport, request);
       try {
-        syncResponse.wait();
+        while (syncResponse.getResponse() == null) {
+          syncResponse.wait();
+        }
         retryRequest = (PendingRequest) pendingRequests.remove(handle);
         if (logger.isDebugEnabled()) {
           logger.debug("Removed pending request with handle: "+handle);
@@ -1415,7 +1417,6 @@
     }
 
     private PendingRequest(PendingRequest other) {
-      other.pendingRetry = true;
       this.userObject = other.userObject;
       this.listener = other.listener;
       this.retryCount = other.retryCount - 1;
@@ -1427,7 +1428,7 @@
     }
 
     protected void registerRequest(PduHandle handle) {
-
+      // overwritten by subclasses
     }
 
     public void responseReceived() {
@@ -1437,22 +1438,31 @@
     public synchronized void pduHandleAssigned(PduHandle handle, Object pdu) {
       if (key == null) {
         key = handle;
-        pendingRequests.put(handle, this);
-        registerRequest(handle);
-        if (logger.isDebugEnabled()) {
-          logger.debug("Running pending "+
-                       ((listener instanceof SyncResponseListener) ?
-                        "sync" : "async")+
-                       " request with handle " + handle+
-                       " and retry count left "+retryCount);
+        // get pointer to target before adding request to pending list
+        // to make sure that this retry is not being cancelled before we
+        // got the target pointer.
+        Target t = target;
+        if (t != null) {
+          pendingRequests.put(handle, this);
+          registerRequest(handle);
+          if (logger.isDebugEnabled()) {
+            logger.debug("Running pending " +
+                         ((listener instanceof SyncResponseListener) ?
+                          "sync" : "async") +
+                         " request with handle " + handle +
+                         " and retry count left " + retryCount);
+          }
+          long delay =
+              timeoutModel.getRetryTimeout(t.getRetries() - retryCount,
+                                           t.getRetries(),
+                                           t.getTimeout());
+          if ((!finished) && (!responseReceived)) {
+            timer.schedule(this, delay);
+          }
+          else {
+            pendingRequests.remove(handle);
+          }
         }
-        long delay = timeoutModel.getRetryTimeout(target.getRetries() -
-                                                  retryCount,
-                                                  target.getRetries(),
-                                                  target.getTimeout());
-        if ((!finished) && (!responseReceived)) {
-          timer.schedule(this, delay);
-        }
       }
     }
 
@@ -1478,11 +1488,11 @@
       }
 
       try {
-        boolean retry;
         synchronized (pendingRequests) {
-          retry = (!finished) && (retryCount > 0) && (!responseReceived);
+          this.pendingRetry =
+              (!finished) && (retryCount > 0) && (!responseReceived);
         }
-        if (retry) {
+        if (this.pendingRetry) {
           try {
             PendingRequest nextRetry = new PendingRequest(this);
             sendMessage(m_pdu, m_target, m_transport, nextRetry);
_______________________________________________
SNMP4J mailing list
[email protected]
http://lists.agentpp.org/mailman/listinfo/snmp4j

Reply via email to