[
https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Goffinet updated CASSANDRA-1632:
--------------------------------------
Description:
Here are some thoughts I wanted to write down; we need to run serious
benchmarks to see the benefits:
1) All thread pools for our stages use a shared queue per stage. For some
stages we could move to a model where each thread has its own queue. This would
reduce lock contention on the shared queue. This only suits stages whose
workload has little variance; otherwise you run into thread starvation. One
stage where this might work: ROW-MUTATION.
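A minimal sketch of the per-thread-queue idea (class and sizing are illustrative, not Cassandra code): each worker owns its queue, producers pick a queue by hashing the request key, so producers contend on one queue rather than a stage-wide lock. A uniform workload like ROW-MUTATION spreads evenly; a skewed one would starve some threads, as noted above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: one bounded queue per worker instead of a single shared queue
// for the stage. Names and capacity are hypothetical.
final class PerThreadQueues {
    private final BlockingQueue<Runnable>[] queues;

    @SuppressWarnings("unchecked")
    PerThreadQueues(int numThreads) {
        queues = new BlockingQueue[numThreads];
        for (int i = 0; i < numThreads; i++) {
            queues[i] = new ArrayBlockingQueue<>(1024);
        }
    }

    // Each producer contends only on the one queue its key hashes to.
    void submit(Object key, Runnable task) throws InterruptedException {
        queues[Math.floorMod(key.hashCode(), queues.length)].put(task);
    }

    // Worker i consumes only from its own queue.
    Runnable takeFor(int threadIndex) throws InterruptedException {
        return queues[threadIndex].take();
    }
}
```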
2) Set CPU affinity for each thread in each stage. If we can pin threads to
specific cores, and control the workflow of a message from Thrift down through
each stage, we should see a reduction in L1 cache misses. We would need to
build a JNI extension (to set CPU affinity), as I could not find it exposed
anywhere in the JDK.
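A sketch of how the stage side of this could look, assuming a JNI extension exists: `AffinitySetter` stands in for the native hook (a real implementation would call sched_setaffinity(2) on Linux), and a thread factory pins each new stage thread to a core round-robin. Everything here is hypothetical scaffolding, not an existing API.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the JNI extension; the native implementation would call
// sched_setaffinity(2). Interface and names are hypothetical.
interface AffinitySetter {
    void pinCurrentThread(int cpuId);
}

// Pins each stage thread to a core, round-robin over the available CPUs,
// before the thread runs any stage work.
final class PinnedThreadFactory implements ThreadFactory {
    private final AtomicInteger nextCpu = new AtomicInteger();
    private final int numCpus;
    private final AffinitySetter setter;

    PinnedThreadFactory(int numCpus, AffinitySetter setter) {
        this.numCpus = numCpus;
        this.setter = setter;
    }

    @Override
    public Thread newThread(Runnable task) {
        final int cpu = nextCpu.getAndIncrement() % numCpus;
        return new Thread(() -> {
            setter.pinCurrentThread(cpu); // pin before doing any stage work
            task.run();
        });
    }
}
```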
3) Batch the delivery of requests across stage boundaries. Peter Schuller
hasn't looked deeply enough into the JDK yet, but he thinks there may be
significant improvements to be had there, especially in high-throughput
situations, if on each consumption you drain everything in the queue rather
than implying a synchronization point between each request.
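The batched-consumption idea can be sketched with the JDK's existing `BlockingQueue.drainTo`: block for the first request, then drain whatever else is queued, so the consumer crosses the queue's synchronization point once per batch instead of once per request. The class name is illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of batched delivery across a stage boundary: one take() for the
// first request, one drainTo() for everything else already queued.
final class BatchingConsumer {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    void submit(Runnable task) {
        queue.add(task);
    }

    // Processes one batch; returns how many requests it contained.
    int consumeBatch() throws InterruptedException {
        List<Runnable> batch = new ArrayList<>();
        batch.add(queue.take()); // block for at least one request
        queue.drainTo(batch);    // grab the rest without per-item handoff
        for (Runnable task : batch) {
            task.run();
        }
        return batch.size();
    }
}
```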
> Thread workflow and cpu affinity
> --------------------------------
>
> Key: CASSANDRA-1632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1632
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Chris Goffinet
> Fix For: 0.7.1
>
>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.