[ 
https://issues.apache.org/jira/browse/DIRMINA-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Lecharny updated DIRMINA-766:
--------------------------------------
    Description: 
When we read data from the channel, we store what we read into a buffer. This
buffer is allocated just beforehand, using a configured size:

{code}
AbstractPollingIoProcessor {
    ...
    private void read(T session) {
        IoSessionConfig config = session.getConfig();
        // This size is configured when the server is initialized.
        // It defaults to 2048.
        IoBuffer buf = IoBuffer.allocate(config.getReadBufferSize());
        ...
        while ((ret = read(session, buf)) > 0) {
            readBytes += ret;

            if (!buf.hasRemaining()) {
                break;
            }
        }
        ...
{code}

The problem is that the buffer is either too big or too small.

1) The buffer is too big:
We allocate far too much memory, and the GC kicks in too often. With
thousands of sessions and small messages, we can hit an OOM quite quickly.

2) The buffer is too small:
If all the messages fit in the buffer, this is not an issue. But if a message
is bigger, the loop stops as soon as the buffer is full, so many iterations
are needed to read the full message. And since we leave the loop each time the
buffer fills up, every iteration costs a whole new select() cycle before we
can read again. That's overkill!

Suggestion: use small buffers (letting the user define the buffer size, as is
already the case), but assemble them until there is no more data to read from
the socket. This way we can handle big messages efficiently, and small
messages without wasting memory.
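The assembling idea can be sketched with plain java.nio (MINA's IoBuffer also offers an auto-expand mode that would hide the growing). The ChunkedRead class, the doubling growth strategy, and the chunk size here are illustrative assumptions, not MINA code:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ChunkedRead {

    // Drain the channel using a small fixed-size chunk, assembling the
    // chunks into one result buffer that grows only on demand.
    static ByteBuffer readAll(ReadableByteChannel channel, int chunkSize)
            throws IOException {
        ByteBuffer result = ByteBuffer.allocate(chunkSize);
        ByteBuffer chunk = ByteBuffer.allocate(chunkSize);
        int ret;
        while ((ret = channel.read(chunk)) > 0) {
            chunk.flip();
            if (result.remaining() < chunk.remaining()) {
                // Grow the result (doubling) instead of allocating one
                // huge buffer up front for every session.
                ByteBuffer bigger = ByteBuffer.allocate(Math.max(
                        result.capacity() * 2,
                        result.position() + chunk.remaining()));
                result.flip();
                bigger.put(result);
                result = bigger;
            }
            result.put(chunk);
            chunk.clear();
        }
        result.flip();
        return result;
    }

    public static void main(String[] args) throws IOException {
        // A 10 000-byte "message" read through small 2 KiB chunks.
        byte[] message = new byte[10000];
        ReadableByteChannel ch =
                Channels.newChannel(new ByteArrayInputStream(message));
        ByteBuffer assembled = readAll(ch, 2048);
        System.out.println(assembled.remaining()); // prints 10000
    }
}
```

With a real non-blocking socket, read() returning 0 (instead of -1 here) would end the loop the same way, so we only go back to select() once the socket has truly been drained.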

One pitfall we want to avoid, though: if the message is really huge
(megabytes), we may not want to keep all of it in memory. We have to define a
threshold for the incoming data, or decide that once a message grows beyond a
certain size, we switch to a file-backed message.
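The file-backed fallback could look something like this sketch (plain java.nio; the SpillingAccumulator class, its threshold, and the temp-file naming are hypothetical, not existing MINA API):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SpillingAccumulator {
    private final int threshold;   // above this many bytes, spill to disk
    private ByteBuffer memory;     // in-memory accumulation below threshold
    private FileChannel spill;     // file-backed storage once over threshold
    private long total;

    SpillingAccumulator(int threshold) {
        this.threshold = threshold;
        this.memory = ByteBuffer.allocate(threshold);
    }

    void append(ByteBuffer chunk) throws IOException {
        total += chunk.remaining();
        if (spill == null && total <= threshold) {
            memory.put(chunk);     // still small: keep it in memory
            return;
        }
        if (spill == null) {
            // Crossed the threshold: move what we have so far to a file.
            Path file = Files.createTempFile("mina-msg-", ".buf");
            spill = FileChannel.open(file, StandardOpenOption.WRITE);
            memory.flip();
            while (memory.hasRemaining()) {
                spill.write(memory);
            }
            memory = null;
        }
        while (chunk.hasRemaining()) {
            spill.write(chunk);
        }
    }

    long size() { return total; }
    boolean onDisk() { return spill != null; }

    public static void main(String[] args) throws IOException {
        SpillingAccumulator acc = new SpillingAccumulator(4096);
        acc.append(ByteBuffer.wrap(new byte[3000])); // stays in memory
        boolean before = acc.onDisk();
        acc.append(ByteBuffer.wrap(new byte[3000])); // crosses the threshold
        System.out.println(before + " " + acc.onDisk() + " " + acc.size());
        // prints: false true 6000
    }
}
```

The threshold plays the same role as the configured read buffer size above: small messages never touch the disk, and huge ones never blow up the heap.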

In any case, we have many ways to improve the current implementation.




> Read does not exploit buffer optimally
> --------------------------------------
>
>                 Key: DIRMINA-766
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-766
>             Project: MINA
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-RC1
>            Reporter: Emmanuel Lecharny
>             Fix For: 2.0.8
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
