[ 
https://issues.apache.org/jira/browse/DIRMINA-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133933#comment-15133933
 ] 

Michael Kohlsche edited comment on DIRMINA-1027 at 2/5/16 10:15 AM:
--------------------------------------------------------------------

I’m sorry again for not having been clear enough. The problem occurs in 
releases 2.0.9 and 2.0.11, and in my own build from the 2.0 branch. The method 
that I believe is problematic looks the same in all of these builds:

{code:title=SslHandler.java}
    /* no qualifier */void flushScheduledEvents() {
        // Fire events only when the lock is available for this handler.
        IoFilterEvent event;
        try {
            sslLock.lock();

            // We need synchronization here inevitably because filterWrite can be
            // called simultaneously and cause 'bad record MAC' integrity error.
            while ((event = filterWriteEventQueue.poll()) != null) {
                NextFilter nextFilter = event.getNextFilter();
                nextFilter.filterWrite(session, (WriteRequest) event.getParameter());
            }
        } finally {
            sslLock.unlock();
        }

        while ((event = messageReceivedEventQueue.poll()) != null) {
            NextFilter nextFilter = event.getNextFilter();
            nextFilter.messageReceived(session, event.getParameter());
        }
    }
{code}

After I changed this part to something like the fix.java that Terence Marks 
attached to ticket 
[DIRMINA-1019|https://issues.apache.org/jira/browse/DIRMINA-1019], the problem 
disappeared. My working solution looks like this:

{code:title=SslHandler.java}
  private final AtomicInteger scheduled_events = new AtomicInteger(0);
...

  /* no qualifier */void flushScheduledEvents() {

    scheduled_events.incrementAndGet();

    // Fire events only when the lock is available for this handler.
    if (sslLock.tryLock()) {

      IoFilterEvent event;

      // We need synchronization here inevitably because filterWrite can be
      // called simultaneously and cause 'bad record MAC' integrity error.
      try {
        do {
          while ((event = filterWriteEventQueue.poll()) != null) {
            final NextFilter nextFilter = event.getNextFilter();
            nextFilter.filterWrite(session, (WriteRequest)event.getParameter());
          }

          while ((event = messageReceivedEventQueue.poll()) != null) {
            final NextFilter nextFilter = event.getNextFilter();
            nextFilter.messageReceived(session, event.getParameter());
          }
        }
        while (scheduled_events.decrementAndGet() > 0);
      }
      finally {
        sslLock.unlock();
      }
    }
  }
{code}

I’m not sure if this fixes the real problem or if it just hides the symptoms.
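
To make the pattern easier to reason about outside of MINA, here is a minimal, self-contained sketch of the same tryLock-plus-counter idea. It is not the MINA code: the class FlushSketch, its plain String queues, and the delivered list are hypothetical stand-ins for SslHandler, the two event queues, and the next filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for SslHandler; only the flush logic is modelled.
class FlushSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicInteger scheduledEvents = new AtomicInteger(0);
    private final Queue<String> writeQueue = new ConcurrentLinkedQueue<>();
    private final Queue<String> receivedQueue = new ConcurrentLinkedQueue<>();
    // Stands in for the next filter; only touched while holding the lock.
    private final List<String> delivered = new ArrayList<>();

    void scheduleWrite(String msg) {
        writeQueue.add(msg);
    }

    void scheduleReceived(String msg) {
        receivedQueue.add(msg);
    }

    void flushScheduledEvents() {
        // Record the flush request before trying the lock, so that a thread
        // which fails to get the lock still leaves a trace for the holder.
        scheduledEvents.incrementAndGet();
        if (!lock.tryLock()) {
            return; // another thread is flushing right now
        }
        try {
            do {
                String msg;
                // Both queues are drained under the same lock, so writes and
                // received messages are never delivered concurrently.
                while ((msg = writeQueue.poll()) != null) {
                    delivered.add("write:" + msg);
                }
                while ((msg = receivedQueue.poll()) != null) {
                    delivered.add("received:" + msg);
                }
                // Loop again if another thread asked for a flush meanwhile.
            } while (scheduledEvents.decrementAndGet() > 0);
        } finally {
            lock.unlock();
        }
    }

    // Returns a copy of everything delivered so far.
    List<String> delivered() {
        lock.lock();
        try {
            return new ArrayList<>(delivered);
        } finally {
            lock.unlock();
        }
    }
}
```

As far as I can tell, one window remains in this pattern: if another thread increments the counter and fails tryLock after the holder's last decrementAndGet has returned 0 but before unlock, its events stay queued until the next call to flushScheduledEvents. That may be acceptable if flushes are frequent, but it is one reason the change might only hide the symptoms.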





was (Author: coulton):
I’m sorry again for not having been clear enough. The problem occurs in 
releases 2.0.9 and 2.0.11, and in my own build from the 2.0 branch. The method 
that I believe is problematic looks the same in all of these builds:

{code:title=SslHandler.java}
 /* no qualifier */void flushScheduledEvents() {
    // Fire events only when the lock is available for this handler.
    IoFilterEvent event;
    try {
      sslLock.lock();

      // We need synchronization here inevitably because filterWrite can be
      // called simultaneously and cause 'bad record MAC' integrity error.
      while ((event = filterWriteEventQueue.poll()) != null) {
        final NextFilter nextFilter = event.getNextFilter();
        nextFilter.filterWrite(session, (WriteRequest)event.getParameter());
      }

      while ((event = messageReceivedEventQueue.poll()) != null) {
        final NextFilter nextFilter = event.getNextFilter();
        nextFilter.messageReceived(session, event.getParameter());
      }

    }
    finally {
      sslLock.unlock();
    }
  }
{code}

After I changed this part to something like the fix.java that Terence Marks 
attached to ticket 
[DIRMINA-1019|https://issues.apache.org/jira/browse/DIRMINA-1019], the problem 
disappeared. My working solution looks like this:

{code:title=SslHandler.java}
  private final AtomicInteger scheduled_events = new AtomicInteger(0);
...

  /* no qualifier */void flushScheduledEvents() {

    scheduled_events.incrementAndGet();

    // Fire events only when the lock is available for this handler.
    if (sslLock.tryLock()) {

      IoFilterEvent event;

      // We need synchronization here inevitably because filterWrite can be
      // called simultaneously and cause 'bad record MAC' integrity error.
      try {
        do {
          while ((event = filterWriteEventQueue.poll()) != null) {
            final NextFilter nextFilter = event.getNextFilter();
            nextFilter.filterWrite(session, (WriteRequest)event.getParameter());
          }

          while ((event = messageReceivedEventQueue.poll()) != null) {
            final NextFilter nextFilter = event.getNextFilter();
            nextFilter.messageReceived(session, event.getParameter());
          }
        }
        while (scheduled_events.decrementAndGet() > 0);
      }
      finally {
        sslLock.unlock();
      }
    }
  }
{code}

I’m not sure if this fixes the real problem or if it just hides the symptoms.




> SSLHandler writes corrupt messages under heavy load
> ---------------------------------------------------
>
>                 Key: DIRMINA-1027
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1027
>             Project: MINA
>          Issue Type: Bug
>          Components: SSL
>    Affects Versions: 2.0.11
>            Reporter: Michael Kohlsche
>            Priority: Critical
>
> I’m facing a critical problem in my project with a MINA stack that includes 
> SSL. My protocol IoFilterAdapter receives corrupt messages under heavy load 
> (a JMeter benchmark with 300 threads and roughly 5-10 MINA calls per 
> thread). The problem disappears without SSL, so I debugged for some days, 
> and I think it was introduced by the changes made to fix 
> issue [DIRMINA-1019|https://issues.apache.org/jira/browse/DIRMINA-1019] 
> ([commit|http://git-wip-us.apache.org/repos/asf/mina/commit/9b5c07f9]). I 
> think the problem is that the SSLHandler calls messageReceived while 
> also writing data into the next filter, which is writing inside the 
> messageReceivedEventQueue. With the fix provided in the 
> race-condition ticket, it works like a charm. So I think the answer to 
> the question "Why is the second loop also protected by the lock?" is: because 
> otherwise the messageReceivedEventQueue gets corrupted data...
> (I hope I’m not wrong, and that anybody can follow my thoughts and my 
> possibly bad English.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
