[jira] [Commented] (CASSANDRA-8457) nio MessagingService

Sylvain Lebresne (JIRA) Thu, 13 Apr 2017 06:14:52 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967568#comment-15967568
 ]


Sylvain Lebresne commented on CASSANDRA-8457:
---------------------------------------------

bq. I think it's important that a single slow node or network issue resulting 
in a socket that isn't writable shouldn't allow an arbitrary amount of data to 
collect on the heap. Right now there is nothing that can drop the data in that 
scenario.

I don't necessarily disagree with that somewhat general statement, but I'm far 
from convinced that checking for expired messages is the right tool for the job 
in the first place. Expiration is time-based and default timeouts are measured 
in seconds, which leaves plenty of time for messages to accumulate and blow the 
heap before any of them becomes droppable. On top of that, not all messages 
have timeouts, which actually makes sense: a message timeout isn't a 
back-pressure mechanism, it's about how long we're willing to wait for an 
answer to a request message, and hence one-way messages have no reason to have 
such a timeout. And that's part of the point: I dislike using a concept that 
isn't meant to be related to back-pressure to do back-pressure, especially when 
it's as flawed as this one. Users shouldn't have to worry that nodes could OOM 
because they set a high write timeout; it's just not intuitive.
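
To put rough numbers on the accumulation argument, here is a back-of-the-envelope 
sketch. The outbound rate and the timeout values are assumptions chosen purely 
for illustration, and the class and method names are hypothetical; nothing here 
is measured from a real cluster.

```java
// Illustrative arithmetic only: all rates and timeouts below are assumed values.
final class HeapGrowthEstimate {

    /**
     * Bytes that can queue up behind an unwritable socket before the oldest
     * queued message reaches its timeout and becomes droppable.
     */
    static long bytesBeforeFirstExpiry(long bytesPerSecond, long timeoutMillis) {
        return bytesPerSecond * timeoutMillis / 1000;
    }

    public static void main(String[] args) {
        long rate = 50L * 1024 * 1024; // assumed: 50 MiB/s destined for the stuck peer
        for (long timeoutMillis : new long[] {2_000, 10_000, 60_000}) {
            long mib = bytesBeforeFirstExpiry(rate, timeoutMillis) / (1024 * 1024);
            System.out.println(timeoutMillis + " ms timeout: up to " + mib
                    + " MiB queued before anything can expire");
        }
    }
}
```

Under those assumptions the bound on heap usage scales linearly with the 
configured timeout, and messages without a timeout are not bounded at all.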

Don't get me wrong, I don't disagree that some back-pressure mechanism should 
be added for that problem, but it should be based on the amount of message 
data (or, at the very least, the number of such messages) in the Netty queue. 
Surely we're not the only ones facing this problem, though: doesn't Netty 
already have a standard way to deal with messages piling up in its queues?
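
As a minimal sketch of what size-based back-pressure could look like, the class 
below tracks pending bytes against a low and a high water mark. It is loosely 
modeled on the writability idea Netty exposes through Channel.isWritable() and 
WriteBufferWaterMark, but it is plain Java and not Netty code. The class name, 
its methods, and the choice to reject messages while unwritable are all 
assumptions made for illustration (Netty itself leaves it to the caller to 
check writability before writing).

```java
import java.util.ArrayDeque;

// Hypothetical illustration of byte-based back-pressure; not Cassandra or Netty code.
final class ByteBoundedOutboundQueue {
    private final ArrayDeque<byte[]> queue = new ArrayDeque<>();
    private final long lowWaterMark;
    private final long highWaterMark;
    private long pendingBytes;
    private boolean writable = true;

    ByteBoundedOutboundQueue(long lowWaterMark, long highWaterMark) {
        if (lowWaterMark < 0 || highWaterMark < lowWaterMark) {
            throw new IllegalArgumentException("need 0 <= low <= high");
        }
        this.lowWaterMark = lowWaterMark;
        this.highWaterMark = highWaterMark;
    }

    /** Enqueues the message, or returns false without enqueueing while unwritable. */
    synchronized boolean offer(byte[] message) {
        if (!writable) {
            return false; // caller decides: drop, block, or push back on the client
        }
        queue.addLast(message);
        pendingBytes += message.length;
        if (pendingBytes > highWaterMark) {
            writable = false; // crossed the high mark: stop accepting new data
        }
        return true;
    }

    /** Removes the oldest message (e.g. once flushed to the socket), or returns null. */
    synchronized byte[] poll() {
        byte[] message = queue.pollFirst();
        if (message != null) {
            pendingBytes -= message.length;
            if (pendingBytes < lowWaterMark) {
                writable = true; // drained below the low mark: accept data again
            }
        }
        return message;
    }

    synchronized boolean isWritable() {
        return writable;
    }

    synchronized long pendingBytes() {
        return pendingBytes;
    }
}
```

The point of the two marks is hysteresis: the queue does not flip between 
writable and unwritable on every message, and the heap used per peer is bounded 
by roughly the high water mark plus one message, independent of any timeout.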

> nio MessagingService
> --------------------
>
>                 Key: CASSANDRA-8457
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: netty, performance
>             Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
