[ 
https://issues.apache.org/jira/browse/KAFKA-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569670#comment-13569670
 ] 

Neha Narkhede commented on KAFKA-736:
-------------------------------------

Benchmarked the draft and v2 patches for producer throughput. Here are the 
results:

Message size is 1K in all the tests

batch size 1, producer threads 1

kafka-736-v2   - 13 MB/s

kafka-736-draft  - 30 MB/s

batch size 100, producer threads 1

kafka-736-v2   - 48.4 MB/s

kafka-736-draft  - 61.5 MB/s

batch size 100, producer threads 20

kafka-736-v2   - 11.6 MB/s

kafka-736-draft - 81.6 MB/s

I looked into the cause of this performance degradation on the v2 patch. What's 
happening is that setting the selection key's interest bits to READ in 
processNewResponses is not reflected in the following select() operation for 
all BUT the first network thread (id 0). I tried the producer performance test 
with varying numbers of producer threads and network threads on the server, and 
I consistently see this result. Because of this, all the producer connections 
handled by network threads with ids > 0 see very low throughput, since the next 
request is not read until 300 ms after the previous request has finished 
processing. I also confirmed that the producer had sent a lot of data on those 
low-throughput connections; the server was just reading it 300 ms later. I read 
up a little on concurrency and selection keys and found this:

"Generally, SelectionKey objects are thread-safe, but it's important to know 
that operations that modify the interest set are synchronized by Selector 
objects. This could cause calls to the interestOps( ) method to block for an 
indeterminate amount of time. The specific locking policy used by a selector, 
such as whether the locks are held throughout the selection process, is 
implementation-dependent."

Overall, it seems that Java NIO doesn't behave the way we want with respect to 
having the updated interest bits take effect in the next select operation. This 
makes the v2 approach even trickier to reason about.
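
For anyone who wants to see the general NIO behavior outside of Kafka, here is 
a minimal sketch. It is not the SocketServer code. It assumes the interest set 
is changed from a thread other than the one blocked in select(); per the 
SelectionKey javadoc in recent JDKs, such a change has no effect on a selection 
operation already in progress and is only seen by the next one. The class and 
method names are made up for illustration, and the 300 ms timeout is chosen only 
to mirror the delay described above. Whether this is what actually happens in 
the v2 patch is not established.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class, not part of Kafka.
class InterestOpsWakeupSketch {

    // Returns roughly how many milliseconds pass before a channel that already
    // has data pending is reported readable, when OP_READ is enabled by another
    // thread while this thread is blocked in select(300).
    static long millisUntilReadable(boolean callWakeup) throws Exception {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open()) {
            pipe.source().configureBlocking(false);
            // Register with an empty interest set, i.e. reads are "muted".
            SelectionKey key = pipe.source().register(selector, 0);
            // Data is already waiting, so only the interest set delays the read.
            pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));

            Thread updater = new Thread(() -> {
                try {
                    // Give the selecting thread time to block in select().
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    return;
                }
                key.interestOps(SelectionKey.OP_READ);
                if (callWakeup) {
                    // Forces the in-progress select() to return so that the
                    // next select() picks up the new interest set.
                    selector.wakeup();
                }
            });

            long start = System.nanoTime();
            updater.start();
            // Loop until the key is reported ready; a return value of 0 means
            // the call timed out or was woken up.
            while (selector.select(300) == 0) {
                // retry
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            updater.join();
            return elapsedMs;
        } finally {
            pipe.sink().close();
            pipe.source().close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("with wakeup:    " + millisUntilReadable(true) + " ms");
        System.out.println("without wakeup: " + millisUntilReadable(false) + " ms");
    }
}
```

Without the wakeup() call, the read should only be noticed once the 300 ms 
select timeout expires; with it, the read should be noticed shortly after the 
interest set changes. Exact timings depend on the platform and JDK.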
                
> Add an option to the 0.8 producer to mimic 0.7 producer behavior
> ----------------------------------------------------------------
>
>                 Key: KAFKA-736
>                 URL: https://issues.apache.org/jira/browse/KAFKA-736
>             Project: Kafka
>          Issue Type: Improvement
>          Components: producer 
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: p2, replication-performance
>         Attachments: check-message-ordering.py, kafka-736-draft.patch, 
> kafka-736-v1.patch, kafka-736-v2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I profiled a producer throughput benchmark between a producer and a remote 
> broker. It turns out that the background send thread spends ~97% of its time 
> waiting to read the acknowledgement from the broker.
> I propose we change the current behavior of request.required.acks=0 to mean 
> no acknowledgement from the broker. This will mimic the 0.7 producer behavior 
> and will enable tuning the producer for very high throughput.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
