[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-11 Thread Goldstein Lyor (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081972#comment-16081972 ]

Goldstein Lyor commented on SSHD-754:
-------------------------------------

Looks like a good idea - I believe, though, that a better solution would be to 
implement such "throttling" at the *channel* level instead of the *session* 
level - after all, we might want to control each channel separately, i.e., in 
{{ChannelOutputStream/ChannelAsyncOutputStream - write/flush}}. Furthermore, 
the "throttle" rate should be *configurable* - where zero (the default) means 
"no throttling".

> OOM in sending data for channel
> -------------------------------
>
> Key: SSHD-754
> URL: https://issues.apache.org/jira/browse/SSHD-754
> Project: MINA SSHD
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Eugene Petrenko
>
> I have an implementation of an SSH server based on the library. It sends 
> gigabytes (e.g. 5GB) of data as command output. 
> Starting from PuTTY plink 0.68 (this also includes plink 0.69) we started to 
> get OOM errors. Checking memory dumps showed that most of the memory is 
> consumed by the function 
> org.apache.sshd.common.session.AbstractSession#writePacket(org.apache.sshd.common.util.buffer.Buffer).
> In the hprof dump I see thousands of PendingWriteFuture objects (by the way, 
> each holds a reference to a logger instance), and those objects are only 
> created from this function. 
> It is clear the session is running through a rekey; I can see the kexState 
> indicating the progress. 
> Is there a way to artificially limit the sending queue, even if the related 
> remote window allows sending that enormous amount of data? By my estimation, 
> the window was reported to be around 1.5 GB or more. Maybe such a huge 
> window size was caused by the arithmetic overflow that was fixed in SSHD-701.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-09 Thread Goldstein Lyor (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079673#comment-16079673 ]

Goldstein Lyor commented on SSHD-754:
-------------------------------------

I have not had time to really think through the scenario you presented, and 
nowadays it will take me a while to find some free time to review it. Please 
feel free to go ahead and publish a pull request once you have a proposal in 
mind - it will be easier for us to review it and to better understand the 
problem and its possible fix.

That being said, off the top of my head - I believe one can control the window 
size via properties and/or window-adjust SSH messages (I don't remember the 
exact message exchange offhand). AFAIK, the window-size mechanism provides a 
way to throttle the data rate, so maybe that can be a way to address this 
issue as well.
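
For reference, a sketch of capping the window via server properties, assuming 
the FactoryManager.WINDOW_SIZE property as in SSHD 1.x. Note this controls the 
window *we* advertise (i.e. how fast the peer may send to us), so by itself it 
would not bound the server's outgoing queue against a client-advertised 2GB 
window:

{code}
import org.apache.sshd.common.FactoryManager
import org.apache.sshd.server.SshServer

fun configure(): SshServer {
    val sshd = SshServer.setUpDefaultServer()
    // Advertise e.g. a 256 KiB receive window instead of the default.
    sshd.properties[FactoryManager.WINDOW_SIZE] = 256 * 1024L
    return sshd
}
{code}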



[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-09 Thread Eugene Petrenko (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079668#comment-16079668 ]

Eugene Petrenko commented on SSHD-754:
--------------------------------------

It is indeed a non-trivial problem that I have discovered. First, I want to 
make sure the proposed fix is good enough to be converted into a pull request. 

The problem I have on the SSH server is:
1) the channel remote window is set to 2GB
2) the client is slow to receive data (data is generated faster than it is 
consumed by the network/client), e.g. because of:
2a) a slow network connection
2b) a running rekey

Because of 1) and 2) we get OOM - too many data chunks are queued.

The proposed solution is to limit the send queue only for DATA and 
EXTENDED_DATA messages (is that correct?) by blocking an overly active sender 
(a deadlock is possible if we block a NIO/callback thread - see the sketch 
below).
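
To illustrate the deadlock concern: if the blocked thread is the same 
NIO/callback thread that would complete the pending write futures, the permits 
are never released. A timed acquire that degrades to "no throttling" is one 
defensive option (a hypothetical sketch; the names are not SSHD API):

{code}
import java.util.concurrent.Semaphore
import java.util.concurrent.TimeUnit

// Returns true if a permit was taken (the caller must release it from the
// IoWriteFuture listener); false means "skip throttling for this write".
fun acquireOrSkip(pendingWrites: Semaphore, isIoThread: Boolean): Boolean {
    if (isIoThread) {
        // Never block an I/O thread - it must stay free to complete writes.
        return false
    }
    // Bounded wait instead of a potentially unbounded hang.
    return pendingWrites.tryAcquire(30, TimeUnit.SECONDS)
}
{code}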

An alternative solution could be to implement similar logic in 
org.apache.sshd.common.channel.ChannelOutputStream and 
org.apache.sshd.common.channel.ChannelAsyncOutputStream. However, there might 
be other usages of Channel/Session, so fixing those two classes may not be 
enough, and it is tricky to avoid code duplication.

What would you say?




[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-09 Thread Goldstein Lyor (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079638#comment-16079638 ]

Goldstein Lyor commented on SSHD-754:
-------------------------------------

If indeed you have a fix / improvement in mind, a pull-request is the best way 
to have it evaluated and eventually merged...



[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-05 Thread Eugene Petrenko (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075469#comment-16075469 ]

Eugene Petrenko commented on SSHD-754:
--------------------------------------

After reproducing the problem in tests, I was able to come up with the 
following patch (in Kotlin) for an inheritor of ServerSessionImpl that fixes 
the test:

{code}
// Imports assume the Apache MINA SSHD 1.x package layout.
import java.util.concurrent.Semaphore
import org.apache.sshd.common.SshConstants
import org.apache.sshd.common.future.SshFutureListener
import org.apache.sshd.common.io.IoSession
import org.apache.sshd.common.io.IoWriteFuture
import org.apache.sshd.common.util.buffer.Buffer
import org.apache.sshd.server.ServerFactoryManager
import org.apache.sshd.server.session.ServerSessionImpl

// The enclosing class declaration was elided in the original snippet;
// the name here is arbitrary.
private class ThrottlingServerSession(server: ServerFactoryManager, ioSession: IoSession) :
    ServerSessionImpl(server, ioSession) {

  // Blocks a writer once 100 writes are pending; a permit is returned by the
  // listener when the corresponding write completes.
  private class PressureLock {
    private val semaphore = Semaphore(100)

    private val listener = object : SshFutureListener<IoWriteFuture> {
      override fun operationComplete(future: IoWriteFuture) {
        semaphore.release()
      }
    }

    fun acquire(): SshFutureListener<IoWriteFuture> {
      semaphore.acquire()
      return listener
    }
  }

  private val CHANNEL_STDOUT_LOCK = PressureLock()
  private val CHANNEL_STDERR_LOCK = PressureLock()

  override fun writePacket(buffer: Buffer): IoWriteFuture {
    // The workaround for VCS-797
    // and https://issues.apache.org/jira/browse/SSHD-754
    // The trick is to block the writer thread once there are more
    // than 100 messages in either the rekey wait queue or the NIO write queue.
    val lock = when (buffer.array()[buffer.rpos()]) {
      SshConstants.SSH_MSG_CHANNEL_DATA -> CHANNEL_STDOUT_LOCK
      SshConstants.SSH_MSG_CHANNEL_EXTENDED_DATA -> CHANNEL_STDERR_LOCK
      else -> null
    }?.acquire()

    val future = super.writePacket(buffer)

    if (lock != null) {
      future.addListener(lock)
    }
    return future
  }
}
{code}




[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-05 Thread Eugene Petrenko (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075341#comment-16075341 ]

Eugene Petrenko commented on SSHD-754:
--------------------------------------

It looks like, starting from April 2016, plink enables 'simple' mode for SSH. 
It was done in commit b22c0b6f3e6f5254270a89f86df3edfc4da829d2 
(https://git.tartarus.org/?p=simon/putty.git;a=commit;h=b22c0b6f3e6f5254270a89f86df3edfc4da829d2).

Starting from 0.68 it sends 2GB (0x7FFFFFFF) as the receive window size. That 
makes the SSHD library vulnerable to OOM.

The simplified steps to reproduce are as follows. We need a command that 
returns a huge amount of data, say bigger than the heap size. Then it is 
enough to have a client that is slow to read data (e.g. over a slow 
connection): the server will easily queue too many packets and run out of 
memory on that. 

It looks like org.apache.sshd.common.channel.ChannelOutputStream and similar 
classes should take into account not only the window size but also their own 
write queue.
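
A sketch of that combined bound, where the allowance for the next write 
considers both the remote window and a cap on locally queued bytes 
(illustrative names, not actual ChannelOutputStream fields):

{code}
import java.util.concurrent.atomic.AtomicLong

class WriteBudget(private val maxQueuedBytes: Long) {
    private val queued = AtomicLong()

    // Bytes we may send now: limited by the remote window AND the local
    // queue cap, never negative.
    fun allowance(remoteWindow: Long): Long =
        minOf(remoteWindow, maxQueuedBytes - queued.get()).coerceAtLeast(0L)

    fun onEnqueued(n: Long) { queued.addAndGet(n) }  // before session.writePacket
    fun onWritten(n: Long) { queued.addAndGet(-n) }  // from the IoWriteFuture listener
}
{code}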



[jira] [Commented] (SSHD-754) OOM in sending data for channel

2017-07-05 Thread Eugene Petrenko (JIRA)

[ https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074753#comment-16074753 ]

Eugene Petrenko commented on SSHD-754:
--------------------------------------

I tried 1.4.0, where SSHD-701 is closed. Same issue. I get an enormous remote 
window size, so everything fits into the window, generating an endless 
pending-write queue. 
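
For a sense of scale (assuming the default 32 KiB SSH packet size): a 
0x7FFFFFFF-byte (~2 GiB) remote window admits roughly 2 GiB / 32 KiB = 65,536 
queued writes before the window is exhausted - about 2 GiB of buffered payload 
plus a PendingWriteFuture per packet, which easily exceeds a typical server 
heap.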
