from:"Tim Brooks"

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616158#comment-14616158
]

Tim Brooks commented on KAFKA-2102:
---

I can probably take another look at this later this week and see if it makes a
difference in the cases you raised.

Remove unnecessary synchronization when managing metadata
-

Key: KAFKA-2102
URL: https://issues.apache.org/jira/browse/KAFKA-2102
Project: Kafka
Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch,
KAFKA-2102_2015-04-15_19:55:45.patch, KAFKA-2102_2015-04-26_17:25:21.patch,
eight-threads-patch.txt, eight-threads-trunk.txt, five-threads-patch.txt,
five-threads-trunk.txt

Usage of the org.apache.kafka.clients.Metadata class is synchronized. It
seems like the current functionality could be maintained without
synchronizing the whole class.
I have been working on improving this by moving to finer grained locks and
using atomic operations. My initial benchmarking of the producer is that this
will improve latency (using HDRHistogram) on submitting messages.
I have produced an initial patch. I do not necessarily believe this is
complete. And I want to definitely produce some more benchmarks. However, I
wanted to get early feedback because this change could be deceptively tricky.
I am interested in knowing if this is:
1. Something that is of interest to the maintainers/community.
2. Along the right track
3. If there are any gotchas that make my current approach naive.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2129) Consumer could make multiple concurrent metadata requests


[ 
https://issues.apache.org/jira/browse/KAFKA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616136#comment-14616136
 ] 

Tim Brooks commented on KAFKA-2129:
---

Since post-2168 it seems to be explicitly documented that the consumer is not 
thread safe, and every public method (besides wakeup) requires a thread to gain 
ownership before entering the consumer, this ticket seems to be resolved.

- Tim

 Consumer could make multiple concurrent metadata requests
 -

 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2129.patch


 The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
 This protects against multiple metadata requests being made and is read on 
 poll() on the NetworkClient. It is written to when a request is initiated.
 This is fine for the producer. Which seems to have one thread writing. The 
 KafkaConsumer's poll()  method is synchronized, so there will not be more 
 than one writer entering from there. However, the NetworkClient's poll() 
 method is also accessed on the Consumer's partitionsFor() method. Which could 
 be access by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (KAFKA-2311) Consumer's ensureNotClosed method not thread safe

Tim Brooks created KAFKA-2311:
-

 Summary: Consumer's ensureNotClosed method not thread safe
 Key: KAFKA-2311
 URL: https://issues.apache.org/jira/browse/KAFKA-2311
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks


When a call is to the consumer is made, the first check is to see that the 
consumer is not closed. This variable is not volatile so there is no guarantee 
previous stores will be visible before a read.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Review Request 36242: Patch for KAFKA-2311

2015-07-06 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36242/
---

Review request for kafka.


Bugs: KAFKA-2311
https://issues.apache.org/jira/browse/KAFKA-2311


Repository: kafka


Description
---

Make closed flag atomic on consumer


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
1f0e51557c4569f0980b72652846b250d00e05d6 

Diff: https://reviews.apache.org/r/36242/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2311) Consumer's ensureNotClosed method not thread safe


[ 
https://issues.apache.org/jira/browse/KAFKA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616155#comment-14616155
 ] 

Tim Brooks commented on KAFKA-2311:
---

Created reviewboard https://reviews.apache.org/r/36242/diff/
 against branch origin/trunk

 Consumer's ensureNotClosed method not thread safe
 -

 Key: KAFKA-2311
 URL: https://issues.apache.org/jira/browse/KAFKA-2311
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
 Attachments: KAFKA-2311.patch


 When a call is to the consumer is made, the first check is to see that the 
 consumer is not closed. This variable is not volatile so there is no 
 guarantee previous stores will be visible before a read.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2311) Consumer's ensureNotClosed method not thread safe


 [ 
https://issues.apache.org/jira/browse/KAFKA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2311:
--
Assignee: Tim Brooks
  Status: Patch Available  (was: Open)

 Consumer's ensureNotClosed method not thread safe
 -

 Key: KAFKA-2311
 URL: https://issues.apache.org/jira/browse/KAFKA-2311
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2311.patch


 When a call is to the consumer is made, the first check is to see that the 
 consumer is not closed. This variable is not volatile so there is no 
 guarantee previous stores will be visible before a read.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2311) Consumer's ensureNotClosed method not thread safe


 [ 
https://issues.apache.org/jira/browse/KAFKA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2311:
--
Attachment: KAFKA-2311.patch

 Consumer's ensureNotClosed method not thread safe
 -

 Key: KAFKA-2311
 URL: https://issues.apache.org/jira/browse/KAFKA-2311
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
 Attachments: KAFKA-2311.patch


 When a call is to the consumer is made, the first check is to see that the 
 consumer is not closed. This variable is not volatile so there is no 
 guarantee previous stores will be visible before a read.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (KAFKA-2312) Use AtomicLong opposed to AtomicReference to store currentThread in consumer

Tim Brooks created KAFKA-2312:
-

 Summary: Use AtomicLong opposed to AtomicReference to store 
currentThread in consumer
 Key: KAFKA-2312
 URL: https://issues.apache.org/jira/browse/KAFKA-2312
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Tim Brooks
Priority: Minor


When a thread id is returned by Thread.currentThread().getId() it is a 
primitive. Storing it in an AtomicReference requires boxing and additional 
indirection.

An AtomicLong seems more natural to store a long. 

The current implementation relies on knowing that null means no owner. Since 
thread ids are always positive (specified in javadoc), it is possible to create 
a constant NO_CURRENT_THREAD for -1. Which allows the usage of an AtomicLong 
and makes the functionality explicit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2312) Use AtomicLong opposed to AtomicReference to store currentThread in consumer


 [ 
https://issues.apache.org/jira/browse/KAFKA-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2312:
--
Attachment: KAFKA-2312.patch

 Use AtomicLong opposed to AtomicReference to store currentThread in consumer
 

 Key: KAFKA-2312
 URL: https://issues.apache.org/jira/browse/KAFKA-2312
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2312.patch


 When a thread id is returned by Thread.currentThread().getId() it is a 
 primitive. Storing it in an AtomicReference requires boxing and additional 
 indirection.
 An AtomicLong seems more natural to store a long. 
 The current implementation relies on knowing that null means no owner. Since 
 thread ids are always positive (specified in javadoc), it is possible to 
 create a constant NO_CURRENT_THREAD for -1. Which allows the usage of an 
 AtomicLong and makes the functionality explicit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2312) Use AtomicLong opposed to AtomicReference to store currentThread in consumer


 [ 
https://issues.apache.org/jira/browse/KAFKA-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2312:
--
Assignee: Tim Brooks
  Status: Patch Available  (was: Open)

 Use AtomicLong opposed to AtomicReference to store currentThread in consumer
 

 Key: KAFKA-2312
 URL: https://issues.apache.org/jira/browse/KAFKA-2312
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2312.patch


 When a thread id is returned by Thread.currentThread().getId() it is a 
 primitive. Storing it in an AtomicReference requires boxing and additional 
 indirection.
 An AtomicLong seems more natural to store a long. 
 The current implementation relies on knowing that null means no owner. Since 
 thread ids are always positive (specified in javadoc), it is possible to 
 create a constant NO_CURRENT_THREAD for -1. Which allows the usage of an 
 AtomicLong and makes the functionality explicit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Review Request 36244: Patch for KAFKA-2312

2015-07-06 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36244/
---

Review request for kafka.


Bugs: KAFKA-2312
https://issues.apache.org/jira/browse/KAFKA-2312


Repository: kafka


Description
---

Use an atomic long for the 'light lock' opposed to an atomic reference.


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
1f0e51557c4569f0980b72652846b250d00e05d6 

Diff: https://reviews.apache.org/r/36244/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2312) Use AtomicLong opposed to AtomicReference to store currentThread in consumer


[ 
https://issues.apache.org/jira/browse/KAFKA-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616179#comment-14616179
 ] 

Tim Brooks commented on KAFKA-2312:
---

Created reviewboard https://reviews.apache.org/r/36244/diff/
 against branch origin/trunk

 Use AtomicLong opposed to AtomicReference to store currentThread in consumer
 

 Key: KAFKA-2312
 URL: https://issues.apache.org/jira/browse/KAFKA-2312
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2312.patch


 When a thread id is returned by Thread.currentThread().getId() it is a 
 primitive. Storing it in an AtomicReference requires boxing and additional 
 indirection.
 An AtomicLong seems more natural to store a long. 
 The current implementation relies on knowing that null means no owner. Since 
 thread ids are always positive (specified in javadoc), it is possible to 
 create a constant NO_CURRENT_THREAD for -1. Which allows the usage of an 
 AtomicLong and makes the functionality explicit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Checking in on 2101, 2102, 2129

2015-06-15 Thread Tim Brooks

I wanted to check in on these three tickets I opened in JIRA. I have not seen 
responses to them in a while. 


2101 and 2129 seem like straightforward issues that need to be resolved (2129 - 
related the consumer not entirely being thread safe; 2101 related to a full 
successful update back off on a metadata refresh even if the last update failed 
and slightly misleading metrics).


I understand if 2102 has been determined to be unnecessary by the core team. 
But it seems like the other two need resolutions. 


Is there something I need to change about my patches?


Thanks,
Tim

Review Request 34865: Patch for KAFKA-2101

2015-05-31 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34865/
---

Review request for kafka.


Bugs: KAFKA-2101
https://issues.apache.org/jira/browse/KAFKA-2101


Repository: kafka


Description
---

Create distinction between successful update and unsuccessfull update for 
metrics and backoff


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/Metadata.java 
07f1cdb1fe920b0c7a5f2d101ddc40c689e1b247 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
1e943d621732889a1c005b243920dc32cea7af66 
  clients/src/test/java/org/apache/kafka/clients/MetadataTest.java 
928087d29deb80655ca83726c1ebc45d76468c1f 

Diff: https://reviews.apache.org/r/34865/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2101) Metric metadata-age is reset on a failed update

[
https://issues.apache.org/jira/browse/KAFKA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14566770#comment-14566770
]

Tim Brooks commented on KAFKA-2101:
---

As I mentioned earlier, this overlaps some with the changes in Kafka-2102.
However, if that patch is not accepted, the patch I submitted here will resolve
this ticket.

A second long is created for last successful update. This is used in the
metrics. And it is used when considering when next to update based on
metadataExpireMs. Whereas right now, if there is a failed update - the backoff
will be for the entire metadataExpireMs opposed to just the refreshBackoffMs
time period.

Metric metadata-age is reset on a failed update
---

Key: KAFKA-2101
URL: https://issues.apache.org/jira/browse/KAFKA-2101
Project: Kafka
Issue Type: Bug
Reporter: Tim Brooks
Assignee: Tim Brooks
Attachments: KAFKA-2101.patch

In org.apache.kafka.clients.Metadata there is a lastUpdate() method that
returns the time the metadata was lasted updated. This is only called by
metadata-age metric.
However the lastRefreshMs is updated on a failed update (when
MetadataResponse has not valid nodes). This is confusing since the metric's
name suggests that it is a true reflection of the age of the current
metadata. But the age might be reset by a failed update.
Additionally, lastRefreshMs is not reset on a failed update due to no node
being available. This seems slightly inconsistent, since one failure
condition resets the metrics, but another one does not. Especially since both
failure conditions do trigger the backoff (for the next attempt).
I have not implemented a patch yet, because I am unsure what expected
behavior is.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2101) Metric metadata-age is reset on a failed update


 [ 
https://issues.apache.org/jira/browse/KAFKA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2101:
--
Attachment: KAFKA-2101.patch

 Metric metadata-age is reset on a failed update
 ---

 Key: KAFKA-2101
 URL: https://issues.apache.org/jira/browse/KAFKA-2101
 Project: Kafka
  Issue Type: Bug
Reporter: Tim Brooks
 Attachments: KAFKA-2101.patch


 In org.apache.kafka.clients.Metadata there is a lastUpdate() method that 
 returns the time the metadata was lasted updated. This is only called by 
 metadata-age metric. 
 However the lastRefreshMs is updated on a failed update (when 
 MetadataResponse has not valid nodes). This is confusing since the metric's 
 name suggests that it is a true reflection of the age of the current 
 metadata. But the age might be reset by a failed update. 
 Additionally, lastRefreshMs is not reset on a failed update due to no node 
 being available. This seems slightly inconsistent, since one failure 
 condition resets the metrics, but another one does not. Especially since both 
 failure conditions do trigger the backoff (for the next attempt).
 I have not implemented a patch yet, because I am unsure what expected 
 behavior is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2101) Metric metadata-age is reset on a failed update


[ 
https://issues.apache.org/jira/browse/KAFKA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14566767#comment-14566767
 ] 

Tim Brooks commented on KAFKA-2101:
---

Created reviewboard https://reviews.apache.org/r/34865/diff/
 against branch origin/trunk

 Metric metadata-age is reset on a failed update
 ---

 Key: KAFKA-2101
 URL: https://issues.apache.org/jira/browse/KAFKA-2101
 Project: Kafka
  Issue Type: Bug
Reporter: Tim Brooks
 Attachments: KAFKA-2101.patch


 In org.apache.kafka.clients.Metadata there is a lastUpdate() method that 
 returns the time the metadata was lasted updated. This is only called by 
 metadata-age metric. 
 However the lastRefreshMs is updated on a failed update (when 
 MetadataResponse has not valid nodes). This is confusing since the metric's 
 name suggests that it is a true reflection of the age of the current 
 metadata. But the age might be reset by a failed update. 
 Additionally, lastRefreshMs is not reset on a failed update due to no node 
 being available. This seems slightly inconsistent, since one failure 
 condition resets the metrics, but another one does not. Especially since both 
 failure conditions do trigger the backoff (for the next attempt).
 I have not implemented a patch yet, because I am unsure what expected 
 behavior is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2101) Metric metadata-age is reset on a failed update


 [ 
https://issues.apache.org/jira/browse/KAFKA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2101:
--
Assignee: Tim Brooks
  Status: Patch Available  (was: Open)

 Metric metadata-age is reset on a failed update
 ---

 Key: KAFKA-2101
 URL: https://issues.apache.org/jira/browse/KAFKA-2101
 Project: Kafka
  Issue Type: Bug
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2101.patch


 In org.apache.kafka.clients.Metadata there is a lastUpdate() method that 
 returns the time the metadata was lasted updated. This is only called by 
 metadata-age metric. 
 However the lastRefreshMs is updated on a failed update (when 
 MetadataResponse has not valid nodes). This is confusing since the metric's 
 name suggests that it is a true reflection of the age of the current 
 metadata. But the age might be reset by a failed update. 
 Additionally, lastRefreshMs is not reset on a failed update due to no node 
 being available. This seems slightly inconsistent, since one failure 
 condition resets the metrics, but another one does not. Especially since both 
 failure conditions do trigger the backoff (for the next attempt).
 I have not implemented a patch yet, because I am unsure what expected 
 behavior is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Review Request 33634: Patch for KAFKA-2129

2015-04-28 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33634/
---

Review request for kafka.


Bugs: KAFKA-2129
https://issues.apache.org/jira/browse/KAFKA-2129


Repository: kafka


Description
---

Synchronize method


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
d301be4709f7b112e1f3a39f3c04cfa65f00fa60 

Diff: https://reviews.apache.org/r/33634/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2129) Consumer could make multiple concurrent metadata requests


[ 
https://issues.apache.org/jira/browse/KAFKA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517557#comment-14517557
 ] 

Tim Brooks commented on KAFKA-2129:
---

Created reviewboard https://reviews.apache.org/r/33634/diff/
 against branch origin/trunk

 Consumer could make multiple concurrent metadata requests
 -

 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2129.patch


 The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
 This protects against multiple metadata requests being made and is read on 
 poll() on the NetworkClient. It is written to when a request is initiated.
 This is fine for the producer. Which seems to have one thread writing. The 
 KafkaConsumer's poll()  method is synchronized, so there will not be more 
 than one writer entering from there. However, the NetworkClient's poll() 
 method is also accessed on the Consumer's partitionsFor() method. Which could 
 be access by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2129) Consumer could make multiple concurrent metadata requests


 [ 
https://issues.apache.org/jira/browse/KAFKA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2129:
--
Priority: Major  (was: Minor)

 Consumer could make multiple concurrent metadata requests
 -

 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2129.patch


 The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
 This protects against multiple metadata requests being made and is read on 
 poll() on the NetworkClient. It is written to when a request is initiated.
 This is fine for the producer. Which seems to have one thread writing. The 
 KafkaConsumer's poll()  method is synchronized, so there will not be more 
 than one writer entering from there. However, the NetworkClient's poll() 
 method is also accessed on the Consumer's partitionsFor() method. Which could 
 be access by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2129) Consumer could make multiple concurrent metadata requests


 [ 
https://issues.apache.org/jira/browse/KAFKA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2129:
--
Attachment: KAFKA-2129.patch

 Consumer could make multiple concurrent metadata requests
 -

 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2129.patch


 The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
 This protects against multiple metadata requests being made and is read on 
 poll() on the NetworkClient. It is written to when a request is initiated.
 This is fine for the producer. Which seems to have one thread writing. The 
 KafkaConsumer's poll()  method is synchronized, so there will not be more 
 than one writer entering from there. However, the NetworkClient's poll() 
 method is also accessed on the Consumer's partitionsFor() method. Which could 
 be access by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2129) Consumer could make multiple concurrent metadata requests


 [ 
https://issues.apache.org/jira/browse/KAFKA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2129:
--
Assignee: Tim Brooks
  Status: Patch Available  (was: Open)

 Consumer could make multiple concurrent metadata requests
 -

 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2129.patch


 The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
 This protects against multiple metadata requests being made and is read on 
 poll() on the NetworkClient. It is written to when a request is initiated.
 This is fine for the producer. Which seems to have one thread writing. The 
 KafkaConsumer's poll()  method is synchronized, so there will not be more 
 than one writer entering from there. However, the NetworkClient's poll() 
 method is also accessed on the Consumer's partitionsFor() method. Which could 
 be access by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2129) Consumer could make multiple concurrent metadata requests


[ 
https://issues.apache.org/jira/browse/KAFKA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517553#comment-14517553
 ] 

Tim Brooks commented on KAFKA-2129:
---

It kind of looks like this is a larger issue than maybe I realized when I first 
opened this ticket. Let's say that thread 1 is calling poll() on the 
KafkaClient. And then thread 2 calls partionsFor() a topic that is not locally 
known on the KafkaClient.

Both threads will make it into the poll() method on the Selector since 
partitionsFor() is not synchronized. If Thread 1 is in the middle of a poll(), 
tons of intermediate state will be lost when Thread 2 calls the clear() method 
on the selector:

this.completedSends.clear();
this.completedReceives.clear();
this.connected.clear();
this.disconnected.clear();
this.disconnected.addAll(this.failedSends);
this.failedSends.clear();

I can generate a number of failing integration tests by adding:

scala.concurrent.future {
  Thread.sleep(30)
  consumer.partitionsFor(weird-topic)
}

in consumeRecords() in the ConsumerTest right before the call is made to poll().

If I add the synchronize keyword to the partionsFor() method these errors go 
away. Is this the correct approach to this ticket? Obviously those errors are 
an issue since the KafkaConsumer documentation indicates that the class is 
threadsafe.

But adding synchronize to the method means that calling partitionsFor() will be 
blocked on a poll() that is in progress. And hopefully, the majority of the 
time partitionsFor() will not require a network call.

Anyway, I added a patch to synchronize that method. But if the we are 
interested in a non synchronized method to get locally-known partitions for 
that topic, we will need a different change.

 Consumer could make multiple concurrent metadata requests
 -

 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Priority: Minor

 The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
 This protects against multiple metadata requests being made and is read on 
 poll() on the NetworkClient. It is written to when a request is initiated.
 This is fine for the producer. Which seems to have one thread writing. The 
 KafkaConsumer's poll()  method is synchronized, so there will not be more 
 than one writer entering from there. However, the NetworkClient's poll() 
 method is also accessed on the Consumer's partitionsFor() method. Which could 
 be access by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2101) Metric metadata-age is reset on a failed update

2015-04-27 Thread Tim Brooks (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516263#comment-14516263
]

Tim Brooks commented on KAFKA-2101:
---

Okay. Well I made a number of changes related to Metadata in Kafka-2102. One of
which would resolve this issue. Essentially split last successful update from
last update attempt.

I guess I will wait to see if that patch is accepted before submitting one for
this card.

Metric metadata-age is reset on a failed update
---

Key: KAFKA-2101
URL: https://issues.apache.org/jira/browse/KAFKA-2101
Project: Kafka
Issue Type: Bug
Reporter: Tim Brooks

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 32937: Patch for KAFKA-2102

2015-04-26 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32937/
---

(Updated April 27, 2015, 12:26 a.m.)


Review request for kafka.


Bugs: KAFKA-2102
https://issues.apache.org/jira/browse/KAFKA-2102


Repository: kafka


Description (updated)
---

Method does not need to be synchronized


Do not synchronize contains topic method


Continue removing the need to synchronize the metadata object


Store both last refresh and need to refresh in same variable


Fix synchronize issue


Version needs to be volatile


rework how signally happens


remove unnecessary creation of new set


initialize 0 at the field level


Fix the build


Start moving synchronization of metadata to different class


Start moving synchronization work to new class


Remove unused code


Functionality works. Not threadsafe


move version into metadata synchronizer


Make version volatile


Rename classes


move to finergrained locking


Use locks in bookkeeper


Only use atomic variabled


use successful metadata in metrics


Change these things back to trunk


Address issues with patch


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/clients/Metadata.java 
07f1cdb1fe920b0c7a5f2d101ddc40c689e1b247 
  clients/src/main/java/org/apache/kafka/clients/MetadataBookkeeper.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
b7ae595f2cc46e5dfe728bc3ce6082e9cd0b6d36 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
42b12928781463b56fc4a45d96bb4da2745b6d95 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
b2db91ca14bbd17fef5ce85839679144fff3f689 

Diff: https://reviews.apache.org/r/32937/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

2015-04-26 Thread Tim Brooks (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513346#comment-14513346
]

Tim Brooks commented on KAFKA-2102:
---

Updated reviewboard https://reviews.apache.org/r/32937/diff/
against branch origin/trunk

Remove unnecessary synchronization when managing metadata
-

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

2015-04-26 Thread Tim Brooks (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tim Brooks updated KAFKA-2102:
--
Attachment: KAFKA-2102_2015-04-26_17:25:21.patch

Remove unnecessary synchronization when managing metadata
-

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 32937: Patch for KAFKA-2102

2015-04-17 Thread Tim Brooks



 On April 16, 2015, 5:34 a.m., Ewen Cheslack-Postava wrote:
  clients/src/main/java/org/apache/kafka/clients/MetadataBookkeeper.java, 
  line 25
  https://reviews.apache.org/r/32937/diff/3/?file=931249#file931249line25
 
  This patch will definitely need a comment somewhere explaining the 
  locking strategy and the reasoning behind it. It's won't be obvious even to 
  someone familiar with other client code why this all works the way it does.

I can add some documentation soon.


 On April 16, 2015, 5:34 a.m., Ewen Cheslack-Postava wrote:
  clients/src/main/java/org/apache/kafka/clients/MetadataBookkeeper.java, 
  line 27
  https://reviews.apache.org/r/32937/diff/3/?file=931249#file931249line27
 
  Is there any benefit to using a Lock + Condition instead of the monitor 
  for the object? Seems like it would make the code a bit simpler - you'd get 
  rid of all the try/finally blocks.

It does not seem like it does currently. I tend towards locks because I like 
the explicitness, accessibility to source, privateness, and distinction between 
moniter and condition. But if synchronized in preferred by the project, I can 
switch it to that. 

The only piece of functionality the lock would provide, would be if we wanted a 
timeout on the acquire in awaitUpdate.


 On April 16, 2015, 5:34 a.m., Ewen Cheslack-Postava wrote:
  clients/src/main/java/org/apache/kafka/clients/MetadataBookkeeper.java, 
  line 69
  https://reviews.apache.org/r/32937/diff/3/?file=931249#file931249line69
 
  Are the lazySets accomplishing anything performance-wise? I doubt most 
  people even know the method exists, let alone the implications.
  
  Some seem like they're probably ok, e.g. none of the code that uses the 
  version number relies on seeing the updated value immediately. But others 
  I'm less sure about (and the lack of documentation on lazySet makes it hard 
  to be certain). For example, using updateRequested.lazySet() might work, 
  but could give incorrect results if the old value is returned to a 
  different thread. I think this only affects the network thread in the 
  producer, but the consumer can potentially be called from different 
  threads. Are we sure the lazySet works as expected in that case?

Reasonable question on the performance. I'll take a look and if there is not 
much of a difference, move them all to .sets(). As to correctness, my 
understanding is that since the writes happen in the lock, releasing the lock 
ensures visability.


- Tim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32937/#review80298
---


On April 16, 2015, 12:56 a.m., Tim Brooks wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32937/
 ---
 
 (Updated April 16, 2015, 12:56 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2102
 https://issues.apache.org/jira/browse/KAFKA-2102
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Method does not need to be synchronized
 
 
 Do not synchronize contains topic method
 
 
 Continue removing the need to synchronize the metadata object
 
 
 Store both last refresh and need to refresh in same variable
 
 
 Fix synchronize issue
 
 
 Version needs to be volatile
 
 
 rework how signally happens
 
 
 remove unnecessary creation of new set
 
 
 initialize 0 at the field level
 
 
 Fix the build
 
 
 Start moving synchronization of metadata to different class
 
 
 Start moving synchronization work to new class
 
 
 Remove unused code
 
 
 Functionality works. Not threadsafe
 
 
 move version into metadata synchronizer
 
 
 Make version volatile
 
 
 Rename classes
 
 
 move to finergrained locking
 
 
 Use locks in bookkeeper
 
 
 Only use atomic variabled
 
 
 use successful metadata in metrics
 
 
 Change these things back to trunk
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/Metadata.java 
 07f1cdb1fe920b0c7a5f2d101ddc40c689e1b247 
   clients/src/main/java/org/apache/kafka/clients/MetadataBookkeeper.java 
 PRE-CREATION 
   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
 b7ae595f2cc46e5dfe728bc3ce6082e9cd0b6d36 
   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
 b91e2c52ed0acb1faa85915097d97bafa28c413a 
 
 Diff: https://reviews.apache.org/r/32937/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Tim Brooks

[jira] [Created] (KAFKA-2129) Consumer could make multiple concurrent metadata requests

2015-04-16 Thread Tim Brooks (JIRA)

Tim Brooks created KAFKA-2129:
-

 Summary: Consumer could make multiple concurrent metadata requests
 Key: KAFKA-2129
 URL: https://issues.apache.org/jira/browse/KAFKA-2129
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Tim Brooks
Priority: Minor


The NetworkClient's metadataFetchInProgress is neither volatile nor atomic. 
This protects against multiple metadata requests being made and is read on 
poll() on the NetworkClient. It is written to when a request is initiated.

This is fine for the producer. Which seems to have one thread writing. The 
KafkaConsumer's poll()  method is synchronized, so there will not be more than 
one writer entering from there. However, the NetworkClient's poll() method is 
also accessed on the Consumer's partitionsFor() method. Which could be access 
by a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tim Brooks updated KAFKA-2102:
--
Attachment: KAFKA-2102_2015-04-15_19:55:45.patch

Remove unnecessary synchronization when managing metadata
-

Key: KAFKA-2102
URL: https://issues.apache.org/jira/browse/KAFKA-2102
Project: Kafka
Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch,
KAFKA-2102_2015-04-15_19:55:45.patch, eight-threads-patch.txt,
eight-threads-trunk.txt, five-threads-patch.txt, five-threads-trunk.txt

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tim Brooks updated KAFKA-2102:
--
Status: Patch Available (was: In Progress)

Remove unnecessary synchronization when managing metadata
-

Key: KAFKA-2102
URL: https://issues.apache.org/jira/browse/KAFKA-2102
Project: Kafka
Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch,
KAFKA-2102_2015-04-15_19:55:45.patch, eight-threads-patch.txt,
eight-threads-trunk.txt, five-threads-patch.txt, five-threads-trunk.txt

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497396#comment-14497396
]

Tim Brooks commented on KAFKA-2102:
---

Updated reviewboard https://reviews.apache.org/r/32937/diff/
against branch origin/trunk

Remove unnecessary synchronization when managing metadata
-

Key: KAFKA-2102
URL: https://issues.apache.org/jira/browse/KAFKA-2102
Project: Kafka
Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch,
KAFKA-2102_2015-04-15_19:55:45.patch, eight-threads-patch.txt,
eight-threads-trunk.txt, five-threads-patch.txt, five-threads-trunk.txt

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 32937: Patch for KAFKA-2102

2015-04-15 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32937/
---

(Updated April 16, 2015, 12:56 a.m.)


Review request for kafka.


Bugs: KAFKA-2102
https://issues.apache.org/jira/browse/KAFKA-2102


Repository: kafka


Description (updated)
---

Method does not need to be synchronized


Do not synchronize contains topic method


Continue removing the need to synchronize the metadata object


Store both last refresh and need to refresh in same variable


Fix synchronize issue


Version needs to be volatile


rework how signally happens


remove unnecessary creation of new set


initialize 0 at the field level


Fix the build


Start moving synchronization of metadata to different class


Start moving synchronization work to new class


Remove unused code


Functionality works. Not threadsafe


move version into metadata synchronizer


Make version volatile


Rename classes


move to finergrained locking


Use locks in bookkeeper


Only use atomic variabled


use successful metadata in metrics


Change these things back to trunk


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/clients/Metadata.java 
07f1cdb1fe920b0c7a5f2d101ddc40c689e1b247 
  clients/src/main/java/org/apache/kafka/clients/MetadataBookkeeper.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
b7ae595f2cc46e5dfe728bc3ce6082e9cd0b6d36 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
b91e2c52ed0acb1faa85915097d97bafa28c413a 

Diff: https://reviews.apache.org/r/32937/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497392#comment-14497392
]

Tim Brooks commented on KAFKA-2102:
---

I added an updated patch. This patch includes a few things:

1. I moved to using a finer locking strategy opposed to attempting to use all
atomic instructions. None of the methods are synchronized.
2. I delegated the synchronization code and data about when the last update
was, etc to a new MetadataBookkeeper. When I was first reading the old code I
had some issues parsing the mixture of cluster state, topic state, state about
when to do the next update, and state about when the last update had been
completed. Maybe my changes make this easier to parse. Maybe they don't.
3. I moved lastNoNodeAvailableMs in the NetworkClient state into the
MetadataBookkeeper. Since this variable was essentially a failed attempt to
update metadata, and it was not accessed in any different way for distinct
metrics, it seemed like it would be nicer to keep state about when the next
metadata update should happen together.
4. No one has responded to KAFKA-2101. But it was highly relevant to what I was
working on, so it is affected by this patch. I created a distinction between
successful metadata update and a metadata update attempt. The metadata-age
metric only uses the last successful update in its reports. This seemed like
the correct approach based on the name of that metric. Since a failed update
does not make the metadata any younger.

The performance improvements are primarily in the 90+ percentile. I ran a
producer test with both five and eight threads pushing 10,000 messages to
kafka. And I repeated it ten times. I recorded the time with HDRHistogram.

The improvements were somewhere between 4-30% reduced latency in the 90+%. For
example at the 0.99062500 percentile on the five thread test the latency
was reduced from 14.223 microseconds to 9.775 (31%). At the 0.9000
percentile the latency was reduced from 2.947 to 2.837 (3.9%) So certainly not
a lot. But pretty consistently across the higher percentiles, the latency is
improved.

In the five thread test the mean decreased 4.8%. In the eight thread test the
mean decreased 7.8%.

The code for the latency test can be found here:

https://github.com/tbrooks8/kafka-latency-test

Remove unnecessary synchronization when managing metadata
-

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tim Brooks updated KAFKA-2102:
--
Attachment: five-threads-trunk.txt
five-threads-patch.txt
eight-threads-trunk.txt
eight-threads-patch.txt

HDRHistogram results

Remove unnecessary synchronization when managing metadata
-

Key: KAFKA-2102
URL: https://issues.apache.org/jira/browse/KAFKA-2102
Project: Kafka
Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch,
eight-threads-patch.txt, eight-threads-trunk.txt, five-threads-patch.txt,
five-threads-trunk.txt

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

2015-04-08 Thread Tim Brooks (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485494#comment-14485494
]

Tim Brooks commented on KAFKA-2102:
---

Thanks for the feedback.

Unless I misunderstand you I think that I fixed number 1 in the updated
patched. Issue number 2 you raised is definitely the kind of feedback I was
looking for. And something that I'll look into.

Generally, I saw pretty reasonable latency improvements in the 90+ percentiles.
Enough that I committed some time to seeing if I could find a workable solution.

I too am concerned about the complexity. I'm hoping to be able to come up with
a solution that is more intuitive (and does not depend on lastRefreshMs never
being 0).

I'll continue to explore this. If I find some way to do this (or part of this)
in a way to provides an noticeable performance improvement, I'll definitely
submit those changes and some benchmark numbers to justify them.

Remove unnecessary synchronization when managing metadata
-

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (KAFKA-2101) Metric metadata-age is reset on a failed update

Tim Brooks created KAFKA-2101:
-

 Summary: Metric metadata-age is reset on a failed update
 Key: KAFKA-2101
 URL: https://issues.apache.org/jira/browse/KAFKA-2101
 Project: Kafka
  Issue Type: Bug
Reporter: Tim Brooks


In org.apache.kafka.clients.Metadata there is a lastUpdate() method that 
returns the time the metadata was lasted updated. This is only called by 
metadata-age metric. 

However the lastRefreshMs is updated on a failed update (when MetadataResponse 
has not valid nodes). This is confusing since the metric's name suggests that 
it is a true reflection of the age of the current metadata. But the age might 
be reset by a failed update. 

Additionally, lastRefreshMs is not reset on a failed update due to no node 
being available. This seems slightly inconsistent, since one failure condition 
resets the metrics, but another one does not. Especially since both failure 
conditions do trigger the backoff (for the next attempt).

I have not implemented a patch yet, because I am unsure what expected behavior 
is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (KAFKA-2102) Remove unnecessary synchronization when managing metadata

Tim Brooks created KAFKA-2102:
-

 Summary: Remove unnecessary synchronization when managing metadata
 Key: KAFKA-2102
 URL: https://issues.apache.org/jira/browse/KAFKA-2102
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks


Usage of the org.apache.kafka.clients.Metadata class is synchronized. It seems 
like the current functionality could be maintained without synchronizing the 
whole class.

I have been working on improving this by moving to finer grained locks and 
using atomic operations. My initial benchmarking of the producer is that this 
will improve latency (using HDRHistogram) on submitting messages.

I have produced an initial patch. I do not necessarily believe this is 
complete. And I want to definitely produce some more benchmarks. However, I 
wanted to get early feedback because this change could be deceptively tricky.

I am interested in knowing if this is:
1. Something that is of interest to the maintainers/community.
2. Along the right track
3. If there are any gotchas that make my current approach naive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Review Request 32937: Patch for KAFKA-2102

2015-04-07 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32937/
---

Review request for kafka.


Bugs: KAFKA-2102
https://issues.apache.org/jira/browse/KAFKA-2102


Repository: kafka


Description
---

Method does not need to be synchronized


Do not synchronize contains topic method


Continue removing the need to synchronize the metadata object


Store both last refresh and need to refresh in same variable


Fix synchronize issue


Version needs to be volatile


rework how signally happens


remove unnecessary creation of new set


initialize 0 at the field level


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/Metadata.java 
07f1cdb1fe920b0c7a5f2d101ddc40c689e1b247 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
b91e2c52ed0acb1faa85915097d97bafa28c413a 

Diff: https://reviews.apache.org/r/32937/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Updated] (KAFKA-2102) Remove unnecessary synchronization when managing metadata


 [ 
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2102:
--
Attachment: KAFKA-2102.patch

 Remove unnecessary synchronization when managing metadata
 -

 Key: KAFKA-2102
 URL: https://issues.apache.org/jira/browse/KAFKA-2102
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
 Attachments: KAFKA-2102.patch


 Usage of the org.apache.kafka.clients.Metadata class is synchronized. It 
 seems like the current functionality could be maintained without 
 synchronizing the whole class.
 I have been working on improving this by moving to finer grained locks and 
 using atomic operations. My initial benchmarking of the producer is that this 
 will improve latency (using HDRHistogram) on submitting messages.
 I have produced an initial patch. I do not necessarily believe this is 
 complete. And I want to definitely produce some more benchmarks. However, I 
 wanted to get early feedback because this change could be deceptively tricky.
 I am interested in knowing if this is:
 1. Something that is of interest to the maintainers/community.
 2. Along the right track
 3. If there are any gotchas that make my current approach naive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2102) Remove unnecessary synchronization when managing metadata


 [ 
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2102:
--
Assignee: Tim Brooks
  Status: Patch Available  (was: Open)

 Remove unnecessary synchronization when managing metadata
 -

 Key: KAFKA-2102
 URL: https://issues.apache.org/jira/browse/KAFKA-2102
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2102.patch


 Usage of the org.apache.kafka.clients.Metadata class is synchronized. It 
 seems like the current functionality could be maintained without 
 synchronizing the whole class.
 I have been working on improving this by moving to finer grained locks and 
 using atomic operations. My initial benchmarking of the producer is that this 
 will improve latency (using HDRHistogram) on submitting messages.
 I have produced an initial patch. I do not necessarily believe this is 
 complete. And I want to definitely produce some more benchmarks. However, I 
 wanted to get early feedback because this change could be deceptively tricky.
 I am interested in knowing if this is:
 1. Something that is of interest to the maintainers/community.
 2. Along the right track
 3. If there are any gotchas that make my current approach naive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata


[ 
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484054#comment-14484054
 ] 

Tim Brooks commented on KAFKA-2102:
---

Created reviewboard https://reviews.apache.org/r/32937/diff/
 against branch origin/trunk

 Remove unnecessary synchronization when managing metadata
 -

 Key: KAFKA-2102
 URL: https://issues.apache.org/jira/browse/KAFKA-2102
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
 Attachments: KAFKA-2102.patch


 Usage of the org.apache.kafka.clients.Metadata class is synchronized. It 
 seems like the current functionality could be maintained without 
 synchronizing the whole class.
 I have been working on improving this by moving to finer grained locks and 
 using atomic operations. My initial benchmarking of the producer is that this 
 will improve latency (using HDRHistogram) on submitting messages.
 I have produced an initial patch. I do not necessarily believe this is 
 complete. And I want to definitely produce some more benchmarks. However, I 
 wanted to get early feedback because this change could be deceptively tricky.
 I am interested in knowing if this is:
 1. Something that is of interest to the maintainers/community.
 2. Along the right track
 3. If there are any gotchas that make my current approach naive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 32937: Patch for KAFKA-2102

2015-04-07 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32937/
---

(Updated April 8, 2015, 5:21 a.m.)


Review request for kafka.


Bugs: KAFKA-2102
https://issues.apache.org/jira/browse/KAFKA-2102


Repository: kafka


Description (updated)
---

Method does not need to be synchronized


Do not synchronize contains topic method


Continue removing the need to synchronize the metadata object


Store both last refresh and need to refresh in same variable


Fix synchronize issue


Version needs to be volatile


rework how signally happens


remove unnecessary creation of new set


initialize 0 at the field level


Fix the build


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/clients/Metadata.java 
07f1cdb1fe920b0c7a5f2d101ddc40c689e1b247 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
b91e2c52ed0acb1faa85915097d97bafa28c413a 
  clients/src/test/java/org/apache/kafka/clients/MetadataTest.java 
928087d29deb80655ca83726c1ebc45d76468c1f 

Diff: https://reviews.apache.org/r/32937/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata


[ 
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484737#comment-14484737
 ] 

Tim Brooks commented on KAFKA-2102:
---

I updated my WIP patch to ensure that all tests are passing.

 Remove unnecessary synchronization when managing metadata
 -

 Key: KAFKA-2102
 URL: https://issues.apache.org/jira/browse/KAFKA-2102
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch


 Usage of the org.apache.kafka.clients.Metadata class is synchronized. It 
 seems like the current functionality could be maintained without 
 synchronizing the whole class.
 I have been working on improving this by moving to finer grained locks and 
 using atomic operations. My initial benchmarking of the producer is that this 
 will improve latency (using HDRHistogram) on submitting messages.
 I have produced an initial patch. I do not necessarily believe this is 
 complete. And I want to definitely produce some more benchmarks. However, I 
 wanted to get early feedback because this change could be deceptively tricky.
 I am interested in knowing if this is:
 1. Something that is of interest to the maintainers/community.
 2. Along the right track
 3. If there are any gotchas that make my current approach naive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2102) Remove unnecessary synchronization when managing metadata


[ 
https://issues.apache.org/jira/browse/KAFKA-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484735#comment-14484735
 ] 

Tim Brooks commented on KAFKA-2102:
---

Updated reviewboard https://reviews.apache.org/r/32937/diff/
 against branch origin/trunk

 Remove unnecessary synchronization when managing metadata
 -

 Key: KAFKA-2102
 URL: https://issues.apache.org/jira/browse/KAFKA-2102
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
Assignee: Tim Brooks
 Attachments: KAFKA-2102.patch, KAFKA-2102_2015-04-08_00:20:33.patch


 Usage of the org.apache.kafka.clients.Metadata class is synchronized. It 
 seems like the current functionality could be maintained without 
 synchronizing the whole class.
 I have been working on improving this by moving to finer grained locks and 
 using atomic operations. My initial benchmarking of the producer is that this 
 will improve latency (using HDRHistogram) on submitting messages.
 I have produced an initial patch. I do not necessarily believe this is 
 complete. And I want to definitely produce some more benchmarks. However, I 
 wanted to get early feedback because this change could be deceptively tricky.
 I am interested in knowing if this is:
 1. Something that is of interest to the maintainers/community.
 2. Along the right track
 3. If there are any gotchas that make my current approach naive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (KAFKA-2090) Remove duplicate check to metadataFetchInProgress

2015-04-03 Thread Tim Brooks (JIRA)

Tim Brooks created KAFKA-2090:
-

 Summary: Remove duplicate check to metadataFetchInProgress
 Key: KAFKA-2090
 URL: https://issues.apache.org/jira/browse/KAFKA-2090
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
Priority: Minor


In the NetworkClient class on a call to poll() there are three checks 
(timeToNextMetadataUpdate, timeToNextReconnectAttempt, waitForMetadataFetch) 
made to see if a metadata update should be made. If all of these == 0 then an 
update is performed.

However, in the if block a second check is made to metadataFetchInProgress. 
This only make sense in a multithreaded environment. However, as the variable 
is not volatile I suspect this is not the case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Review Request 32823: Patch for KAFKA-2090

2015-04-03 Thread Tim Brooks


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32823/
---

Review request for kafka.


Bugs: KAFKA-2090
https://issues.apache.org/jira/browse/KAFKA-2090


Repository: kafka


Description
---

Remove a duplicate check to metadataFetchInProgress


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
f4295025c28e2842244dc775052b7a3d30fb9d11 

Diff: https://reviews.apache.org/r/32823/diff/


Testing
---


Thanks,

Tim Brooks

[jira] [Updated] (KAFKA-2090) Remove duplicate check to metadataFetchInProgress

2015-04-03 Thread Tim Brooks (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2090:
--
Attachment: KAFKA-2090.patch

 Remove duplicate check to metadataFetchInProgress
 -

 Key: KAFKA-2090
 URL: https://issues.apache.org/jira/browse/KAFKA-2090
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2090.patch


 In the NetworkClient class on a call to poll() there are three checks 
 (timeToNextMetadataUpdate, timeToNextReconnectAttempt, waitForMetadataFetch) 
 made to see if a metadata update should be made. If all of these == 0 then an 
 update is performed.
 However, in the if block a second check is made to metadataFetchInProgress. 
 This only make sense in a multithreaded environment. However, as the variable 
 is not volatile I suspect this is not the case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2090) Remove duplicate check to metadataFetchInProgress

2015-04-03 Thread Tim Brooks (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394750#comment-14394750
 ] 

Tim Brooks commented on KAFKA-2090:
---

Created reviewboard https://reviews.apache.org/r/32823/diff/
 against branch origin/trunk

 Remove duplicate check to metadataFetchInProgress
 -

 Key: KAFKA-2090
 URL: https://issues.apache.org/jira/browse/KAFKA-2090
 Project: Kafka
  Issue Type: Improvement
Reporter: Tim Brooks
Priority: Minor
 Attachments: KAFKA-2090.patch


 In the NetworkClient class on a call to poll() there are three checks 
 (timeToNextMetadataUpdate, timeToNextReconnectAttempt, waitForMetadataFetch) 
 made to see if a metadata update should be made. If all of these == 0 then an 
 update is performed.
 However, in the if block a second check is made to metadataFetchInProgress. 
 This only make sense in a multithreaded environment. However, as the variable 
 is not volatile I suspect this is not the case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (KAFKA-2050) Avoid calling .size() on java.util.ConcurrentLinkedQueue

Tim Brooks created KAFKA-2050:
-

 Summary: Avoid calling .size() on java.util.ConcurrentLinkedQueue
 Key: KAFKA-2050
 URL: https://issues.apache.org/jira/browse/KAFKA-2050
 Project: Kafka
  Issue Type: Bug
  Components: network
Reporter: Tim Brooks
Assignee: Jun Rao


Generally, it seems to be preferred to avoid calling .size() on a Java 
ConcurrentLinkedQueue. This is an O(N) operation as it must iterate through all 
the nodes.

Calling this every time through the loop makes this issue worse under high 
load. It seems like the same functionality can be attained by just polling and 
checking for null.

This is more imperative and less functional, but it seems alright since this is 
in lower-level networking code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2050) Avoid calling .size() on java.util.ConcurrentLinkedQueue


 [ 
https://issues.apache.org/jira/browse/KAFKA-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2050:
--
Attachment: dont_call_queue_size.patch

 Avoid calling .size() on java.util.ConcurrentLinkedQueue
 

 Key: KAFKA-2050
 URL: https://issues.apache.org/jira/browse/KAFKA-2050
 Project: Kafka
  Issue Type: Bug
  Components: network
Reporter: Tim Brooks
Assignee: Jun Rao
 Attachments: dont_call_queue_size.patch


 Generally, it seems to be preferred to avoid calling .size() on a Java 
 ConcurrentLinkedQueue. This is an O(N) operation as it must iterate through 
 all the nodes.
 Calling this every time through the loop makes this issue worse under high 
 load. It seems like the same functionality can be attained by just polling 
 and checking for null.
 This is more imperative and less functional, but it seems alright since this 
 is in lower-level networking code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2050) Avoid calling .size() on java.util.ConcurrentLinkedQueue


[ 
https://issues.apache.org/jira/browse/KAFKA-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381251#comment-14381251
 ] 

Tim Brooks commented on KAFKA-2050:
---

Okay. I submitted it. I had some trouble. But I think I was able to work it out.

 Avoid calling .size() on java.util.ConcurrentLinkedQueue
 

 Key: KAFKA-2050
 URL: https://issues.apache.org/jira/browse/KAFKA-2050
 Project: Kafka
  Issue Type: Bug
  Components: network
Reporter: Tim Brooks
Assignee: Jun Rao
 Attachments: KAFKA-2050.patch, dont_call_queue_size.patch


 Generally, it seems to be preferred to avoid calling .size() on a Java 
 ConcurrentLinkedQueue. This is an O(N) operation as it must iterate through 
 all the nodes.
 Calling this every time through the loop makes this issue worse under high 
 load. It seems like the same functionality can be attained by just polling 
 and checking for null.
 This is more imperative and less functional, but it seems alright since this 
 is in lower-level networking code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (KAFKA-2050) Avoid calling .size() on java.util.ConcurrentLinkedQueue


 [ 
https://issues.apache.org/jira/browse/KAFKA-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Brooks updated KAFKA-2050:
--
Attachment: KAFKA-2050.patch

 Avoid calling .size() on java.util.ConcurrentLinkedQueue
 

 Key: KAFKA-2050
 URL: https://issues.apache.org/jira/browse/KAFKA-2050
 Project: Kafka
  Issue Type: Bug
  Components: network
Reporter: Tim Brooks
Assignee: Jun Rao
 Attachments: KAFKA-2050.patch, dont_call_queue_size.patch


 Generally, it seems to be preferred to avoid calling .size() on a Java 
 ConcurrentLinkedQueue. This is an O(N) operation as it must iterate through 
 all the nodes.
 Calling this every time through the loop makes this issue worse under high 
 load. It seems like the same functionality can be attained by just polling 
 and checking for null.
 This is more imperative and less functional, but it seems alright since this 
 is in lower-level networking code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (KAFKA-2050) Avoid calling .size() on java.util.ConcurrentLinkedQueue