[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968472#comment-15968472
 ] 

Jason Brown commented on CASSANDRA-8457:
----------------------------------------

So, [~aweisberg] and I spent some time talking offline about the expiring 
messages on the outbound side, and came up with the following: 

1. run a periodic, scheduled task in each channel that checks to make sure the 
channel is making progress wrt sending bytes. If we fail to see any progress 
being made after some number of seconds, we should close the connection/socket 
and throw away the messages.
2. repurpose the high/low water mark (and arguably use it more correctly) to 
indicate when we should stop writing messages to the channel (at the 
{{ChannelWriter}} layer). Currently, I'm just using the water mark to indicate 
when we should flush, but a simple check elsewhere would accomplish the same 
thing. Instead, the water marks should indicate when we really shouldn't write 
to the channel anymore, and either queue up those messages in something like 
{{OutboundMessageConnection#backlog}} or perhaps drop them (I'd prefer to 
queue).
3. When we've exceeded the high water mark, we can disable reading incoming 
messages from the same peer (achievable by disabling auto read for the 
channel). This would prevent the current node from executing more work on 
behalf of a peer to which we cannot send any data. Then when the channel drops 
below the low water mark (and the channel is 'writable'), we re-enable netty 
auto read on the read channels for the peer.
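
For illustration only, item 2 could look roughly like the sketch below. It is 
plain Java rather than Netty code, and the names ({{WaterMarkGate}}, 
{{offer}}, {{onFlushed}}) are hypothetical and not taken from the patch. It 
assumes the same semantics as the channel's writability flag: writes stop once 
pending bytes exceed the high water mark and resume only after they fall below 
the low water mark, with overflow messages queued rather than dropped.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch: gate writes on high/low water marks and queue the overflow. */
class WaterMarkGate {
    private final long lowWaterMark;
    private final long highWaterMark;
    private final Queue<byte[]> backlog = new ArrayDeque<>();
    private long pendingBytes;       // bytes written to the channel but not yet flushed
    private boolean writable = true;

    WaterMarkGate(long lowWaterMark, long highWaterMark) {
        if (lowWaterMark < 0 || lowWaterMark > highWaterMark)
            throw new IllegalArgumentException("need 0 <= low <= high");
        this.lowWaterMark = lowWaterMark;
        this.highWaterMark = highWaterMark;
    }

    /** Returns true if the message went to the channel, false if it was queued. */
    synchronized boolean offer(byte[] message) {
        if (!writable) {
            backlog.add(message);
            return false;
        }
        write(message);
        return true;
    }

    /** Call when the socket reports that some bytes were flushed. */
    synchronized void onFlushed(long bytes) {
        pendingBytes = Math.max(0, pendingBytes - bytes);
        if (!writable && pendingBytes < lowWaterMark)
            writable = true;         // item 3 would re-enable auto read here
        while (writable && !backlog.isEmpty())
            write(backlog.poll());
    }

    private void write(byte[] message) {
        pendingBytes += message.length;
        if (pendingBytes > highWaterMark)
            writable = false;        // item 3 would disable auto read here
    }

    synchronized boolean isWritable() { return writable; }
    synchronized int backlogSize() { return backlog.size(); }
    synchronized long pendingBytes() { return pendingBytes; }
}
```

The two commented transitions are where item 3 would hook in, which is also 
where the races mentioned for it would have to be handled.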

1 and 2 are reasonably easy to do (and I'll do them asap), but I'd prefer to 
defer 3 until later as it has a lot of races and other complexities/subtleties 
I'd like to keep out of the scope of this ticket (especially as sockets are not 
bidirectional yet). Thoughts?

Note: items 1 & 2 are significantly simpler than my earlier comments wrt 
message expiration, so please disregard them for now.
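
As a rough illustration of item 1, the progress check could be sketched as 
below. This is an assumption-laden example and not the patch itself: 
{{ProgressWatchdog}}, {{recordSent}} and {{check}} are hypothetical names, 
the timeout is left to the caller, and the clock is passed in explicitly so 
the logic can be exercised without a real scheduler. In practice {{check}} 
would be invoked from a periodic task, e.g. via 
{{ScheduledExecutorService#scheduleWithFixedDelay}} with {{System.nanoTime()}}.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: detect a channel that has stopped sending bytes. */
class ProgressWatchdog {
    private final AtomicLong bytesSent = new AtomicLong();
    private final long timeoutNanos;
    private final Runnable onStall;  // e.g. close the socket and drop queued messages
    private long lastSeenBytes;
    private long lastProgressNanos;
    private boolean closed;

    ProgressWatchdog(long timeoutNanos, long nowNanos, Runnable onStall) {
        this.timeoutNanos = timeoutNanos;
        this.lastProgressNanos = nowNanos;
        this.onStall = onStall;
    }

    /** Called by the writer whenever bytes actually reach the socket. */
    void recordSent(long bytes) {
        bytesSent.addAndGet(bytes);
    }

    /**
     * Periodic check. Returns true once the channel has been declared stalled.
     * An idle channel (nothing pending) counts as healthy.
     */
    synchronized boolean check(long nowNanos, boolean hasPendingMessages) {
        if (closed)
            return true;
        long sent = bytesSent.get();
        if (sent != lastSeenBytes || !hasPendingMessages) {
            lastSeenBytes = sent;
            lastProgressNanos = nowNanos;
            return false;
        }
        if (nowNanos - lastProgressNanos >= timeoutNanos) {
            closed = true;
            onStall.run();           // runs at most once
            return true;
        }
        return false;
    }
}
```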


> nio MessagingService
> --------------------
>
>                 Key: CASSANDRA-8457
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: netty, performance
>             Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
