[ 
https://issues.apache.org/jira/browse/DIRMINA-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728492#comment-15728492
 ] 

Fabio Soares commented on DIRMINA-1055:
---------------------------------------

Hey Guys!
I have an update!
To reduce the number of logs I did this and it helped a lot:

{code:java}
@Override
public void messageReceived(IoSession session, Object message) {
        if (message == null) {
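                errorCounter++;
                // Log the first null message, then only every 100,000th one.
                // (After the increment the counter is never 0, so compare with 1.)
                if (errorCounter == 1 || errorCounter % 100000 == 0) {
                        logger.error("Error on Antenna: " + this.source.getName()
                                        + " (ID:" + this.source.getId() + "): Message is null");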
                        //session.closeNow();
                }
                return;
        }
        ....
}
{code}

After I deployed this, the next time we had this error the load didn't increase 
much from normal levels and the log file size was reduced from 45GB to 22MB. 
This helps mitigate the risk of the machine running out of space and high load 
but doesn't solve the problem.
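
For anyone who wants to reuse the throttling idea, here is a minimal 
standalone sketch of it. The class name LogThrottle and its methods are 
hypothetical (they are not part of MINA or of my handler), and the AtomicLong 
is only an assumption in case the handler is ever called from more than one 
thread:

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of MINA): decides whether the n-th
 * occurrence of a repeated error should be written to the log.
 */
final class LogThrottle {
    private final AtomicLong count = new AtomicLong();
    private final long interval;

    LogThrottle(long interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
    }

    /** Returns true for the 1st occurrence and for every interval-th one. */
    boolean shouldLog() {
        long n = count.incrementAndGet();
        return n == 1 || n % interval == 0;
    }

    /** Total number of occurrences seen so far, logged or not. */
    long total() {
        return count.get();
    }

    public static void main(String[] args) {
        LogThrottle throttle = new LogThrottle(100_000);
        int logged = 0;
        // Simulate 250,000 null messages arriving in a tight loop.
        for (int i = 0; i < 250_000; i++) {
            if (throttle.shouldLog()) {
                logged++;
            }
        }
        // Occurrences 1, 100000 and 200000 are logged.
        System.out.println(logged + " of " + throttle.total() + " occurrences logged");
        // prints: 3 of 250000 occurrences logged
    }
}
```

In the handler, the null branch would then only call logger.error(...) when 
shouldLog() returns true. Like the counter above, this only limits the log 
volume; it does nothing about the loop that produces the null messages.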

I was able to unbind and dispose the NioDatagramAcceptor, but the session didn't 
close and it seems to be stuck in a loop that just generates null messages. And 
once again the shutdown hook isn't triggered when I try to restart the process.

From what I can see, the high load was caused by the large amount of log 
output being generated and not by MINA itself.

Like I said before, this only happens with 3 UDP connections from China. In 
total we have 116 UDP connections from all over the world.
When we had this incident, I ran tcpdump to check how the incoming data 
looked and it was normal, but the amount of data received doesn't compare to 
the amount of log output being generated. Something I tried was to drop all 
data coming in on that port, to see whether the log would still be generated 
without incoming data, but I wasn't able to make it work.

Another interesting 'symptom' is that sometimes we receive data and it's 
not processed. 
I run tcpdump and I can see the data coming in, but it is as if it were 
ignored. This also happened when we were using MINA 2.0.7.

I will keep trying to diagnose this problem but I'm running out of ideas.
Thank you.

> High cpu load on messageReceived after null message
> ---------------------------------------------------
>
>                 Key: DIRMINA-1055
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1055
>             Project: MINA
>          Issue Type: Bug
>    Affects Versions: 2.0.16
>         Environment: Ubuntu 16.04 LTS
> 4.4.0-28-generic #47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016 x86_64 x86_64 
> x86_64 GNU/Linux
> java version "1.8.0_91"
> Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
>            Reporter: Fabio Soares
>
> Here is the code where I initialize the Acceptor
> {code:java}
>   private NioDatagramAcceptor startUdpAcceptor(Antenna source) throws 
> Exception {
>     NioDatagramAcceptor acceptor = new NioDatagramAcceptor();
>     initAcceptor(acceptor, source);
>     return acceptor;
>   }
>   private void initAcceptor(IoAcceptor acceptor, Antenna antenna) throws 
> Exception {
>     acceptor.getFilterChain().addLast("codec", new ProtocolCodecFilter(new 
> TextLineCodecFactory()));
>     FormatHandler handler = FormatHandlerFactory.getFormatHandler(antenna);
>     acceptor.setHandler(handler);
>     acceptor.bind(new InetSocketAddress(antenna.getRawPort()));
>   }
> {code}
> My format handler just extends IoHandlerAdapter and overrides some of the 
> other methods to log information. 
> On my method _public void messageReceived(IoSession session, Object message)_ 
> I have the following:
> {code:java}
> @Override
> public void messageReceived(IoSession session, Object message) {
>     //Sometimes happens that the message at this point is null
>     if (message == null) {
>          logger.error("Error on Antenna: " + this.source.getName() + " (ID:" 
> + this.source.getId() + "): Message is null");
>          return;
>     }
>     ... just process the message
> }
> {code}
> On my _public void exceptionCaught(IoSession session, Throwable cause)_ 
> I just close the session.
> Sometimes this session somehow gets locked: my log shows only this error 
> message, as if in a loop, and the server load spikes. All the other 
> sessions (around 1200 sessions) keep working in the background, as the data 
> keeps flowing, but certain other threads stop responding. There's a thread 
> that calls an API to check whether new sessions need to be created 
> or stopped, and once the session gets locked this API call stops happening.
> I've had this happening for a few hours, and it generated a 45GB log file 
> each hour with just this message:
> {panel}
> 2016-11-23 07:10:22,913 ERROR [NioDatagramAcceptor-441] 
> com.vesseltracker.ais_proxy.logic.formatHandlers.NMEAFormatHandler: Error on 
> Antenna: ChinaSource (ID:2405): Message is null
> {panel}
> Closing the session doesn't stop the problem.
> This error only stops when we restart the process, and the process only dies 
> when we use kill -5 or -9.
> Before, we were using Ubuntu 12 and MINA 2.0.7 and we didn't have this 
> problem. It started when we upgraded the server and the software running on it.
> After the update we started with MINA 2.0.13 and have followed all the 
> new releases since. 
> The funny thing is that this only happens with 3 UDP connections coming from 
> China (not simultaneously), and after the restart it works normally.
> It's running on:
> Mina 2.0.16
> Ubuntu 16.04 LTS
> 4.4.0-28-generic #47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016 x86_64 x86_64 
> x86_64 GNU/Linux
> java version "1.8.0_91"
> Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
> And the server:
> Intel® Xeon® Processor E5-2650 v4 2,2 GHz, 12-Core
> 128GB DDR4-2133 RAM
> 500GB DC S3500 Series SSD
> I would like to provide more information, but I don't know what you guys 
> would need, so please let me know and I will try to get it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
