[jira] [Updated] (KAFKA-1841) OffsetCommitRequest API - timestamp field is not versioned
[ https://issues.apache.org/jira/browse/KAFKA-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Stein updated KAFKA-1841:
-----------------------------
Fix Version/s: 0.8.2

OffsetCommitRequest API - timestamp field is not versioned
----------------------------------------------------------
Key: KAFKA-1841
URL: https://issues.apache.org/jira/browse/KAFKA-1841
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8.2
Environment: wire-protocol
Reporter: Dana Powers
Priority: Blocker
Fix For: 0.8.2

The timestamp field was added to the OffsetCommitRequest wire protocol API for 0.8.2 by KAFKA-1012. The 0.8.1.1 server does not support the timestamp field, so I think the API version of OffsetCommitRequest should be incremented and checked by the 0.8.2 Kafka server before attempting to read a timestamp from the network buffer in OffsetCommitRequest.readFrom (core/src/main/scala/kafka/api/OffsetCommitRequest.scala).

It looks like a subsequent patch (KAFKA-1462) added another API change to support a new constructor with params generationId and consumerId, calling that version 1, and a pending patch (KAFKA-1634) adds retentionMs as another field, while possibly removing timestamp altogether, calling this version 2. So the fix here is not straightforward enough for me to submit a patch. This could possibly be merged into KAFKA-1634, but I am opening it as a separate issue because I believe the lack of versioning in the current trunk should block the 0.8.2 release.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (KAFKA-1012) Implement an Offset Manager and hook offset requests to it
[ https://issues.apache.org/jira/browse/KAFKA-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Stein updated KAFKA-1012:
-----------------------------
Fix Version/s: 0.8.2

Implement an Offset Manager and hook offset requests to it
----------------------------------------------------------
Key: KAFKA-1012
URL: https://issues.apache.org/jira/browse/KAFKA-1012
Project: Kafka
Issue Type: Sub-task
Components: consumer
Reporter: Tejas Patil
Assignee: Tejas Patil
Priority: Minor
Fix For: 0.8.2
Attachments: KAFKA-1012-v2.patch, KAFKA-1012.patch

After KAFKA-657, we have a protocol for consumers to commit and fetch offsets from brokers. Currently, consumers are not using this API and are talking directly to ZooKeeper. This JIRA will involve the following:

1. Add a special topic in Kafka for storing offsets.
2. Add an OffsetManager interface which would handle storing, accessing, loading and maintaining consumer offsets.
3. Implement offset managers for both of these two choices: the existing ZK-based storage or inbuilt storage for offsets.
4. Leader brokers would now maintain an additional hash table of offsets for the group-topic-partitions that they lead.
5. Consumers should now use the OffsetCommit and OffsetFetch APIs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (KAFKA-1841) OffsetCommitRequest API - timestamp field is not versioned
[ https://issues.apache.org/jira/browse/KAFKA-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265798#comment-14265798 ]

Joe Stein commented on KAFKA-1841:
----------------------------------
In addition to the issue you bring up, the functionality as a whole has changed. When you call OffsetFetchRequest, version = 0 needs to preserve the old functionality (https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaApis.scala#L678-L700) and version = 1 the new one (https://github.com/apache/kafka/blob/0.8.2/core/src/main/scala/kafka/server/KafkaApis.scala#L153-L223).

Also, even though the wire protocol for OffsetFetchRequest is the same, after the 0.8.2 upgrade a client that was issuing 0.8.1.1 OffsetFetchRequests (https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaApis.scala#L705-L728) will stop going to ZooKeeper and start going to Kafka storage (https://github.com/apache/kafka/blob/0.8.2/core/src/main/scala/kafka/server/KafkaApis.scala#L504-L519), so more errors will happen and things break there too.

OffsetCommitRequest API - timestamp field is not versioned
----------------------------------------------------------
Key: KAFKA-1841
URL: https://issues.apache.org/jira/browse/KAFKA-1841
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8.2
Environment: wire-protocol
Reporter: Dana Powers
Priority: Blocker
Fix For: 0.8.2

The timestamp field was added to the OffsetCommitRequest wire protocol API for 0.8.2 by KAFKA-1012. The 0.8.1.1 server does not support the timestamp field, so I think the API version of OffsetCommitRequest should be incremented and checked by the 0.8.2 Kafka server before attempting to read a timestamp from the network buffer in OffsetCommitRequest.readFrom (core/src/main/scala/kafka/api/OffsetCommitRequest.scala).

It looks like a subsequent patch (KAFKA-1462) added another API change to support a new constructor with params generationId and consumerId, calling that version 1, and a pending patch (KAFKA-1634) adds retentionMs as another field, while possibly removing timestamp altogether, calling this version 2. So the fix here is not straightforward enough for me to submit a patch. This could possibly be merged into KAFKA-1634, but I am opening it as a separate issue because I believe the lack of versioning in the current trunk should block the 0.8.2 release.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
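The version check the reporter asks for could look roughly like the following. This is a minimal, hypothetical Python sketch of the idea (the real code is Scala, in OffsetCommitRequest.readFrom); the simplified wire layout assumed here (int32 partition, int64 offset, and an int64 timestamp only from version 1 on) is illustrative, not the actual Kafka protocol encoding:

```python
import struct

def read_offset_commit_partition(buf, pos, api_version):
    """Read one (partition, committed_offset, timestamp) entry from a buffer.

    Hypothetical simplified layout: big-endian int32 partition, int64 offset,
    and -- only for api_version >= 1 -- an int64 timestamp. A version-0
    request from an 0.8.1.1 client simply has no timestamp bytes, so the
    server must not try to read them.
    """
    partition, committed = struct.unpack_from(">iq", buf, pos)
    pos += 12
    if api_version >= 1:
        (timestamp,) = struct.unpack_from(">q", buf, pos)
        pos += 8
    else:
        timestamp = -1  # sentinel meaning "no timestamp supplied"
    return (partition, committed, timestamp), pos
```

The point is that the branch is driven by the api_version field already present in every request header, so old and new clients can coexist against one broker.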
[jira] [Commented] (KAFKA-1499) Broker-side compression configuration
[ https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264537#comment-14264537 ]

Manikumar Reddy commented on KAFKA-1499:
----------------------------------------
[~jjkoshy] Uploaded a new patch with some modifications. Please let me know if any changes are required.

Broker-side compression configuration
-------------------------------------
Key: KAFKA-1499
URL: https://issues.apache.org/jira/browse/KAFKA-1499
Project: Kafka
Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Manikumar Reddy
Labels: newbie++
Fix For: 0.8.3
Attachments: KAFKA-1499.patch, KAFKA-1499.patch, KAFKA-1499_2014-08-15_14:20:27.patch, KAFKA-1499_2014-08-21_21:44:27.patch, KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch, KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch, KAFKA-1499_2014-09-25_11:05:57.patch, KAFKA-1499_2014-10-27_13:13:55.patch, KAFKA-1499_2014-12-16_22:39:10.patch, KAFKA-1499_2014-12-26_21:37:51.patch
Original Estimate: 72h
Remaining Estimate: 72h

A given topic can have messages in mixed compression codecs, i.e., it can also have a mix of uncompressed/compressed messages. It would be useful to support a broker-side configuration to recompress messages to a specific compression codec, i.e., all messages (for all topics) on the broker would be compressed to this codec. We could have per-topic overrides as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
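The feature being discussed amounts to the broker normalizing every incoming batch to one configured codec. A toy Python sketch of that idea follows; it is not the attached patch, and the function names, the codec table, and the byte framing are all made up for illustration (real Kafka message sets have their own binary format):

```python
import gzip

# Hypothetical codec table; a real broker would support codecs like
# gzip and snappy, selected by a broker-level (or per-topic) config.
CODECS = {
    "none": lambda data: data,
    "gzip": gzip.compress,
}

def recompress_batch(messages, target_codec):
    """Re-encode a batch of already-decompressed payloads to one codec.

    Incoming messages may be uncompressed or in mixed codecs; the broker
    would first unwrap them, then write them back out in `target_codec`.
    """
    payload = b"\x00".join(messages)  # toy framing, not the Kafka wire format
    return CODECS[target_codec](payload)
```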
Re: Optimistic locking
Regarding the use case – some events may only be valid given a specific state of the world, e.g. a NewComment event and a PostClosedForComments event would be valid only in that order. If two producer processes (e.g. two HTTP application processes) try to write an event each, you may get integrity issues depending on the order. In this case, knowing that you *must* build upon the latest event for a key will give application authors a way to ensure integrity.

I'm not quite sure I understand the challenge in the implementation – how is key-based message expiration implemented, if not with some mapping of message keys to the last written offset? How else would you know whether the message you're examining is the last one or not?

On Mon Jan 05 2015 at 1:41:39 PM Colin Clark co...@clark.ws wrote:

Hi,

Couple of comments on this. What you're proposing is difficult to do at scale and would require some type of Paxos-style algorithm for the update-only-if-different – it would be easier in that case to just go ahead and do the update. Also, it seems like a conflation of concerns – in an event sourcing model, we save the immutable event and represent current state in another, separate data structure. Perhaps Cassandra would work well here – that data model might provide what you're looking for out of the box. Just as I don't recommend people use data stores as queuing mechanisms, I also recommend not using a queuing mechanism as a primary datastore – mixed semantics.

--
*Colin*
+1 612 859-6129

On Mon, Jan 5, 2015 at 4:47 AM, Daniel Schierbeck daniel.schierb...@gmail.com wrote:

I'm trying to design a system that uses Kafka as its primary data store by persisting immutable events into a topic and keeping a secondary index in another data store. The secondary index would store the entities. Each event would pertain to some entity, e.g. a user, and those entities are stored in an easily queryable way.

Kafka seems well suited for this, but there's one thing I'm having problems with. I cannot guarantee that only one process writes events about an entity, which makes the design vulnerable to integrity issues. For example, say that a user can have multiple email addresses assigned, and the EmailAddressRemoved event is published when the user removes one. There's an integrity constraint, though: every user MUST have at least one email address. As far as I can see, there's no way to stop two separate processes from looking up a user entity, seeing that there are two email addresses assigned, and each publishing an event. The end result would violate the constraint. If I'm wrong in saying that this isn't possible, I'd love some feedback!

My current thinking is that Kafka could relatively easily support this kind of application with a small additional API. Kafka already has the abstract notion of entities through its key-based retention policy. If the produce API were modified to allow an integer OffsetConstraint, the following algorithm could determine whether the request should proceed:

1. For every key seen, keep track of the offset of the latest message referencing the key.
2. When an OffsetConstraint is specified in the produce API call, compare that value with the latest offset for the message key.
2.1. If they're identical, allow the operation to continue.
2.2. If they're not identical, fail with some OptimisticLockingFailure.

Would such a feature be completely out of scope for Kafka?

Best regards,
Daniel Schierbeck
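The two-step algorithm in the quoted proposal can be sketched in a few lines. This is a toy, in-memory Python model of the idea, not Kafka code; the names Log, produce, and OptimisticLockError are illustrative:

```python
class OptimisticLockError(Exception):
    """Raised when the caller's expected offset is stale (step 2.2)."""

class Log:
    def __init__(self):
        self.messages = []       # the append-only log
        self.latest_offset = {}  # step 1: key -> offset of last message for key

    def produce(self, key, value, expected_offset=None):
        # Step 2: if a constraint is given, compare it with the latest
        # offset recorded for this key; None means "no message yet".
        if expected_offset is not None:
            if self.latest_offset.get(key) != expected_offset:
                raise OptimisticLockError(key)   # step 2.2
        offset = len(self.messages)              # step 2.1: proceed
        self.messages.append((key, value))
        self.latest_offset[key] = offset
        return offset
```

A writer that loses the race gets an OptimisticLockError, re-reads the entity from its secondary store, re-validates, and retries, which is exactly the compare-and-set loop described later in the thread.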
Re: Optimistic locking
Hi,

Couple of comments on this. What you're proposing is difficult to do at scale and would require some type of Paxos-style algorithm for the update-only-if-different – it would be easier in that case to just go ahead and do the update. Also, it seems like a conflation of concerns – in an event sourcing model, we save the immutable event and represent current state in another, separate data structure. Perhaps Cassandra would work well here – that data model might provide what you're looking for out of the box. Just as I don't recommend people use data stores as queuing mechanisms, I also recommend not using a queuing mechanism as a primary datastore – mixed semantics.

--
*Colin*
+1 612 859-6129

On Mon, Jan 5, 2015 at 4:47 AM, Daniel Schierbeck daniel.schierb...@gmail.com wrote:

I'm trying to design a system that uses Kafka as its primary data store by persisting immutable events into a topic and keeping a secondary index in another data store. The secondary index would store the entities. Each event would pertain to some entity, e.g. a user, and those entities are stored in an easily queryable way.

Kafka seems well suited for this, but there's one thing I'm having problems with. I cannot guarantee that only one process writes events about an entity, which makes the design vulnerable to integrity issues. For example, say that a user can have multiple email addresses assigned, and the EmailAddressRemoved event is published when the user removes one. There's an integrity constraint, though: every user MUST have at least one email address. As far as I can see, there's no way to stop two separate processes from looking up a user entity, seeing that there are two email addresses assigned, and each publishing an event. The end result would violate the constraint. If I'm wrong in saying that this isn't possible, I'd love some feedback!

My current thinking is that Kafka could relatively easily support this kind of application with a small additional API. Kafka already has the abstract notion of entities through its key-based retention policy. If the produce API were modified to allow an integer OffsetConstraint, the following algorithm could determine whether the request should proceed:

1. For every key seen, keep track of the offset of the latest message referencing the key.
2. When an OffsetConstraint is specified in the produce API call, compare that value with the latest offset for the message key.
2.1. If they're identical, allow the operation to continue.
2.2. If they're not identical, fail with some OptimisticLockingFailure.

Would such a feature be completely out of scope for Kafka?

Best regards,
Daniel Schierbeck
Re: Optimistic locking
Reading your reply again, I'd like to address this:

"Also, it seems like a conflation of concerns - in an event sourcing model, we save the immutable event, and represent current state in another, separate data structure."

That's the idea – events are stored in Kafka and processed sequentially by a single process per partition. Those processes update a secondary store (e.g. MySQL) with the current state. However, when determining whether an event is valid, the current state must be taken into account. E.g. you can only withdraw money from an open account, so the event sequence AccountClosed -> MoneyWithdrawn is invalid.

The only way I can think of to ensure this is the case is to have optimistic locking. Each account would have a unique key, and in order to write an event to Kafka the secondary store *must* be up-to-date with the previous events for that key. If not, it should wait a bit and try again, re-validating the input against the new state of the world.

Is that a completely hopeless idea?

On Mon Jan 05 2015 at 2:18:49 PM Daniel Schierbeck daniel.schierb...@gmail.com wrote:

Regarding the use case – some events may only be valid given a specific state of the world, e.g. a NewComment event and a PostClosedForComments event would be valid only in that order. If two producer processes (e.g. two HTTP application processes) try to write an event each, you may get integrity issues depending on the order. In this case, knowing that you *must* build upon the latest event for a key will give application authors a way to ensure integrity.

I'm not quite sure I understand the challenge in the implementation – how is key-based message expiration implemented, if not with some mapping of message keys to the last written offset? How else would you know whether the message you're examining is the last one or not?

On Mon Jan 05 2015 at 1:41:39 PM Colin Clark co...@clark.ws wrote:

Hi,

Couple of comments on this. What you're proposing is difficult to do at scale and would require some type of Paxos-style algorithm for the update-only-if-different – it would be easier in that case to just go ahead and do the update. Also, it seems like a conflation of concerns – in an event sourcing model, we save the immutable event and represent current state in another, separate data structure. Perhaps Cassandra would work well here – that data model might provide what you're looking for out of the box. Just as I don't recommend people use data stores as queuing mechanisms, I also recommend not using a queuing mechanism as a primary datastore – mixed semantics.

--
*Colin*
+1 612 859-6129

On Mon, Jan 5, 2015 at 4:47 AM, Daniel Schierbeck daniel.schierb...@gmail.com wrote:

I'm trying to design a system that uses Kafka as its primary data store by persisting immutable events into a topic and keeping a secondary index in another data store. The secondary index would store the entities. Each event would pertain to some entity, e.g. a user, and those entities are stored in an easily queryable way.

Kafka seems well suited for this, but there's one thing I'm having problems with. I cannot guarantee that only one process writes events about an entity, which makes the design vulnerable to integrity issues. For example, say that a user can have multiple email addresses assigned, and the EmailAddressRemoved event is published when the user removes one. There's an integrity constraint, though: every user MUST have at least one email address. As far as I can see, there's no way to stop two separate processes from looking up a user entity, seeing that there are two email addresses assigned, and each publishing an event. The end result would violate the constraint. If I'm wrong in saying that this isn't possible, I'd love some feedback!

My current thinking is that Kafka could relatively easily support this kind of application with a small additional API. Kafka already has the abstract notion of entities through its key-based retention policy. If the produce API were modified to allow an integer OffsetConstraint, the following algorithm could determine whether the request should proceed:

1. For every key seen, keep track of the offset of the latest message referencing the key.
2. When an OffsetConstraint is specified in the produce API call, compare that value with the latest offset for the message key.
2.1. If they're identical, allow the operation to continue.
2.2. If they're not identical, fail with some OptimisticLockingFailure.

Would such a feature be completely out of scope for Kafka?

Best regards,
Daniel Schierbeck
Re: Optimistic locking
Reading the source code, it seems that kafka.log.LogCleaner builds a kafka.log.OffsetMap only when compacting logs, not ahead of time, so the information is not available at write time :-/

If I'm not mistaken, this also means that the cleaner needs _two_ passes to clean a log segment: one to build up the key-to-last-offset map and one to actually remove expired messages. Having the map available up front could make that a single-pass algorithm. Especially if the segments are too large to keep in memory, this should be considerably faster, at the cost of having to maintain the mapping for each write. I'd be willing to do the coding if you think it's a good idea.

On Mon Jan 05 2015 at 2:24:00 PM Daniel Schierbeck daniel.schierb...@gmail.com wrote:

Reading your reply again, I'd like to address this:

"Also, it seems like a conflation of concerns - in an event sourcing model, we save the immutable event, and represent current state in another, separate data structure."

That's the idea – events are stored in Kafka and processed sequentially by a single process per partition. Those processes update a secondary store (e.g. MySQL) with the current state. However, when determining whether an event is valid, the current state must be taken into account. E.g. you can only withdraw money from an open account, so the event sequence AccountClosed -> MoneyWithdrawn is invalid.

The only way I can think of to ensure this is the case is to have optimistic locking. Each account would have a unique key, and in order to write an event to Kafka the secondary store *must* be up-to-date with the previous events for that key. If not, it should wait a bit and try again, re-validating the input against the new state of the world.

Is that a completely hopeless idea?

On Mon Jan 05 2015 at 2:18:49 PM Daniel Schierbeck daniel.schierb...@gmail.com wrote:

Regarding the use case – some events may only be valid given a specific state of the world, e.g. a NewComment event and a PostClosedForComments event would be valid only in that order. If two producer processes (e.g. two HTTP application processes) try to write an event each, you may get integrity issues depending on the order. In this case, knowing that you *must* build upon the latest event for a key will give application authors a way to ensure integrity.

I'm not quite sure I understand the challenge in the implementation – how is key-based message expiration implemented, if not with some mapping of message keys to the last written offset? How else would you know whether the message you're examining is the last one or not?

On Mon Jan 05 2015 at 1:41:39 PM Colin Clark co...@clark.ws wrote:

Hi,

Couple of comments on this. What you're proposing is difficult to do at scale and would require some type of Paxos-style algorithm for the update-only-if-different – it would be easier in that case to just go ahead and do the update. Also, it seems like a conflation of concerns – in an event sourcing model, we save the immutable event and represent current state in another, separate data structure. Perhaps Cassandra would work well here – that data model might provide what you're looking for out of the box. Just as I don't recommend people use data stores as queuing mechanisms, I also recommend not using a queuing mechanism as a primary datastore – mixed semantics.

--
*Colin*
+1 612 859-6129

On Mon, Jan 5, 2015 at 4:47 AM, Daniel Schierbeck daniel.schierb...@gmail.com wrote:

I'm trying to design a system that uses Kafka as its primary data store by persisting immutable events into a topic and keeping a secondary index in another data store. The secondary index would store the entities. Each event would pertain to some entity, e.g. a user, and those entities are stored in an easily queryable way.

Kafka seems well suited for this, but there's one thing I'm having problems with. I cannot guarantee that only one process writes events about an entity, which makes the design vulnerable to integrity issues. For example, say that a user can have multiple email addresses assigned, and the EmailAddressRemoved event is published when the user removes one. There's an integrity constraint, though: every user MUST have at least one email address. As far as I can see, there's no way to stop two separate processes from looking up a user entity, seeing that there are two email addresses assigned, and each publishing an event. The end result would violate the constraint. If I'm wrong in saying that this isn't possible, I'd love some feedback!

My current thinking is that Kafka could relatively easily support this kind of application with a small additional API. Kafka already has the abstract notion of entities through its key-based retention policy. If the produce API was modified in order to allow an integer OffsetConstraint, the following algorithm could determine whether the
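The one-pass vs. two-pass cleaning distinction discussed above can be sketched as follows. This is a toy Python model of log compaction, not the actual LogCleaner code; the function names and the (offset, key, value) tuple representation are illustrative:

```python
def build_offset_map(segment):
    """The first pass the cleaner currently needs:
    map each key to the offset of its last message."""
    last = {}
    for offset, key, _value in segment:
        last[key] = offset
    return last

def compact(segment, last_offset_by_key):
    """The second (or, with an up-front map, only) pass:
    a message survives only if it is the latest for its key."""
    return [
        (offset, key, value)
        for offset, key, value in segment
        if last_offset_by_key.get(key) == offset
    ]
```

If the key-to-last-offset map were maintained at write time, as the email suggests, build_offset_map's pass over the segment would disappear and compact alone would suffice, at the cost of updating the map on every produce.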
[jira] [Updated] (KAFKA-1512) Limit the maximum number of connections per ip address
[ https://issues.apache.org/jira/browse/KAFKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Holoman updated KAFKA-1512:
--------------------------------
Attachment: KAFKA-1512-082.patch

Patch against 0.8.2

Limit the maximum number of connections per ip address
------------------------------------------------------
Key: KAFKA-1512
URL: https://issues.apache.org/jira/browse/KAFKA-1512
Project: Kafka
Issue Type: New Feature
Affects Versions: 0.8.2
Reporter: Jay Kreps
Assignee: Jeff Holoman
Fix For: 0.8.2
Attachments: KAFKA-1512-082.patch, KAFKA-1512.patch, KAFKA-1512.patch, KAFKA-1512_2014-07-03_15:17:55.patch, KAFKA-1512_2014-07-14_13:28:15.patch, KAFKA-1512_2014-12-23_21:47:23.patch

To protect against client connection leaks add a new configuration max.connections.per.ip that causes the SocketServer to enforce a limit on the maximum number of connections from each InetAddress instance. For backwards compatibility this will default to 2 billion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
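The quota described in the ticket can be modeled in a few lines. This is an illustrative Python sketch, not the SocketServer patch itself; the class and method names are hypothetical:

```python
from collections import defaultdict

class TooManyConnectionsError(Exception):
    def __init__(self, ip):
        super().__init__(f"too many connections from {ip}")
        self.ip = ip

class ConnectionQuotas:
    # For backwards compatibility the ticket proposes an effectively
    # unlimited default of 2 billion connections per address.
    def __init__(self, max_connections_per_ip=2_000_000_000):
        self.max_per_ip = max_connections_per_ip
        self.counts = defaultdict(int)

    def inc(self, ip):
        """Called when a new connection from `ip` is accepted."""
        if self.counts[ip] + 1 > self.max_per_ip:
            raise TooManyConnectionsError(ip)
        self.counts[ip] += 1

    def dec(self, ip):
        """Called when a connection from `ip` is closed."""
        if self.counts[ip] > 0:
            self.counts[ip] -= 1
```

The accept loop increments on accept and decrements on close; rejecting at accept time is what makes a leaking client unable to exhaust broker file descriptors.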
[jira] [Commented] (KAFKA-1785) Consumer offset checker should show the offset manager and offsets partition
[ https://issues.apache.org/jira/browse/KAFKA-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265205#comment-14265205 ]

Neha Narkhede commented on KAFKA-1785:
--------------------------------------
Ping. [~jjkoshy], [~mgharat]

Consumer offset checker should show the offset manager and offsets partition
----------------------------------------------------------------------------
Key: KAFKA-1785
URL: https://issues.apache.org/jira/browse/KAFKA-1785
Project: Kafka
Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Mayuresh Gharat
Priority: Blocker
Fix For: 0.8.2

This is trivial, extremely useful to have and can be done as part of the offset client patch.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (KAFKA-1013) Modify existing tools as per the changes in KAFKA-1000
[ https://issues.apache.org/jira/browse/KAFKA-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265225#comment-14265225 ]

Mayuresh Gharat commented on KAFKA-1013:
----------------------------------------
Yes. I suppose the code will require a final review.

Modify existing tools as per the changes in KAFKA-1000
------------------------------------------------------
Key: KAFKA-1013
URL: https://issues.apache.org/jira/browse/KAFKA-1013
Project: Kafka
Issue Type: Sub-task
Components: consumer
Reporter: Tejas Patil
Assignee: Mayuresh Gharat
Priority: Minor
Attachments: KAFKA-1013.patch, KAFKA-1013.patch, KAFKA-1013_2014-09-23_10:45:59.patch, KAFKA-1013_2014-09-23_10:48:07.patch, KAFKA-1013_2014-09-26_18:52:09.patch, KAFKA-1013_2014-10-01_21:05:00.patch, KAFKA-1013_2014-10-02_22:42:58.patch, KAFKA-1013_2014-12-21_14:42:49.patch, KAFKA-1013_2014-12-27_16:15:54.patch, KAFKA-1013_2014-12-27_16:20:57.patch

Modify existing tools as per the changes in KAFKA-1000. AFAIK, the tools below would be affected:
- ConsumerOffsetChecker
- ExportZkOffsets
- ImportZkOffsets
- UpdateOffsetsInZK

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Re: Latency Tracking Across All Kafka Component
Hi,

That sounds a bit like needing full, cross-app, cross-network transaction/call tracing, and not something specific or limited to Kafka, doesn't it?

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Mon, Jan 5, 2015 at 2:43 PM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote:

Hi Kafka Team/Users,

We are using the LinkedIn Kafka data pipeline end-to-end:

Producer(s) -> Local DC Brokers -> MM -> Central Brokers -> Camus Job -> HDFS

This is working out very well for us, but we need to have visibility of latency at each layer (Local DC Brokers -> MM -> Central Brokers -> Camus Job -> HDFS). Our events are time-based (the time the event was produced). Is there any such feature, or the audit trail mentioned at https://github.com/linkedin/camus/? I would like to know the in-between latency and the time an event spent in each hop, so we can tell where the problem is and what to optimize. Is any of this covered in 0.9.0 or another upcoming Kafka release? How might we achieve this latency tracking across all components?

Thanks,
Bhavesh
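One common way to get the per-hop visibility asked for above is to stamp each event at produce time and have every stage append its own (stage, timestamp) pair, so per-hop latency is just the difference between consecutive stamps. The sketch below is illustrative Python, not a Kafka or Camus feature; all field and function names are made up:

```python
def record_hop(event, stage, now):
    """Append a (stage, timestamp) pair as the event passes through a stage."""
    event.setdefault("hops", []).append((stage, now))
    return event

def hop_latencies(event):
    """Seconds spent between each pair of consecutive stages."""
    stamps = [("produced", event["produced_at"])] + event.get("hops", [])
    return {
        f"{a}->{b}": tb - ta
        for (a, ta), (b, tb) in zip(stamps, stamps[1:])
    }
```

This assumes reasonably synchronized clocks across hosts; with skewed clocks the per-hop numbers are only approximate, which is one reason dedicated distributed-tracing systems exist.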
Re: Review Request 29210: Patch for KAFKA-1819
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29210/#review66733
---

core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala
https://reviews.apache.org/r/29210/#comment110339

Suggest moving this logic to verifyTopicDeletion. It is one of the important validity checks: it verifies whether any deleted topics remain in the cleaner checkpoint.

- Neha Narkhede

On Dec. 31, 2014, 12:01 a.m., Gwen Shapira wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29210/
---

(Updated Dec. 31, 2014, 12:01 a.m.)

Review request for kafka.

Bugs: KAFKA-1819
https://issues.apache.org/jira/browse/KAFKA-1819

Repository: kafka

Description
---
added locking
improved tests per Joel and Neha's suggestions
added cleaner test to DeleteTopicTest

Diffs
---
core/src/main/scala/kafka/log/LogCleaner.scala f8fcb843c80eec3cf3c931df6bb472c019305253
core/src/main/scala/kafka/log/LogCleanerManager.scala bcfef77ed53f94017c06a884e4db14531774a0a2
core/src/main/scala/kafka/log/LogManager.scala 4ebaae00ca4b80bf15c7930bae2011d98bbec053
core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala 29cc01bcef9cacd8dec1f5d662644fc6fe4994bc
core/src/test/scala/unit/kafka/log/LogCleanerIntegrationTest.scala 5bfa764638e92f217d0ff7108ec8f53193c22978

Diff: https://reviews.apache.org/r/29210/diff/

Testing
---

Thanks,
Gwen Shapira
[jira] [Updated] (KAFKA-1784) Implement a ConsumerOffsetClient library
[ https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-1784:
---------------------------------
Resolution: Duplicate
Status: Resolved (was: Patch Available)

Closing as duplicate of KAFKA-1013, as advised by [~mgharat].

Implement a ConsumerOffsetClient library
----------------------------------------
Key: KAFKA-1784
URL: https://issues.apache.org/jira/browse/KAFKA-1784
Project: Kafka
Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Mayuresh Gharat
Priority: Blocker
Fix For: 0.8.2
Attachments: KAFKA-1784.patch

I think it would be useful to provide an offset client library. It would make the documentation a lot simpler. Right now it is non-trivial to commit/fetch offsets to/from kafka.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (KAFKA-1819) Cleaner gets confused about deleted and re-created topics
[ https://issues.apache.org/jira/browse/KAFKA-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265187#comment-14265187 ]

Neha Narkhede commented on KAFKA-1819:
--------------------------------------
[~gwenshap] Thanks for incorporating the review suggestions. Left another suggestion to refactor the test, so that all other delete topics tests, current and future, can benefit from the cleaner checkpoint validation check.

Cleaner gets confused about deleted and re-created topics
---------------------------------------------------------
Key: KAFKA-1819
URL: https://issues.apache.org/jira/browse/KAFKA-1819
Project: Kafka
Issue Type: Bug
Reporter: Gian Merlino
Assignee: Gwen Shapira
Priority: Blocker
Fix For: 0.8.2
Attachments: KAFKA-1819.patch, KAFKA-1819_2014-12-26_13:58:44.patch, KAFKA-1819_2014-12-30_16:01:19.patch

I get an error like this after deleting a compacted topic and re-creating it. I think it's because the brokers don't remove cleaning checkpoints from the cleaner-offset-checkpoint file. This is from a build based off commit bd212b7.

java.lang.IllegalArgumentException: requirement failed: Last clean offset is 587607 but segment base offset is 0 for log foo-6.
        at scala.Predef$.require(Predef.scala:233)
        at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:502)
        at kafka.log.Cleaner.clean(LogCleaner.scala:300)
        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:214)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:192)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (KAFKA-1731) add config/jmx changes in 0.8.2 doc
[ https://issues.apache.org/jira/browse/KAFKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265203#comment-14265203 ]

Neha Narkhede commented on KAFKA-1731:
--------------------------------------
+1 on this patch. If there are no more reviews, I will check this in this week.

add config/jmx changes in 0.8.2 doc
-----------------------------------
Key: KAFKA-1731
URL: https://issues.apache.org/jira/browse/KAFKA-1731
Project: Kafka
Issue Type: Sub-task
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
Fix For: 0.8.2
Attachments: config-jmx_082.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (KAFKA-1013) Modify existing tools as per the changes in KAFKA-1000
[ https://issues.apache.org/jira/browse/KAFKA-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265214#comment-14265214 ]

Neha Narkhede commented on KAFKA-1013:
--------------------------------------
[~mgharat], [~jjkoshy], Are you guys hoping to get this in 0.8.2 final?

Modify existing tools as per the changes in KAFKA-1000
------------------------------------------------------
Key: KAFKA-1013
URL: https://issues.apache.org/jira/browse/KAFKA-1013
Project: Kafka
Issue Type: Sub-task
Components: consumer
Reporter: Tejas Patil
Assignee: Mayuresh Gharat
Priority: Minor
Attachments: KAFKA-1013.patch, KAFKA-1013.patch, KAFKA-1013_2014-09-23_10:45:59.patch, KAFKA-1013_2014-09-23_10:48:07.patch, KAFKA-1013_2014-09-26_18:52:09.patch, KAFKA-1013_2014-10-01_21:05:00.patch, KAFKA-1013_2014-10-02_22:42:58.patch, KAFKA-1013_2014-12-21_14:42:49.patch, KAFKA-1013_2014-12-27_16:15:54.patch, KAFKA-1013_2014-12-27_16:20:57.patch

Modify existing tools as per the changes in KAFKA-1000. AFAIK, the tools below would be affected:
- ConsumerOffsetChecker
- ExportZkOffsets
- ImportZkOffsets
- UpdateOffsetsInZK

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Re: Review Request 28769: Patch for KAFKA-1809
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28769/ --- (Updated Jan. 5, 2015, 10:24 p.m.) Review request for kafka. Bugs: KAFKA-1809 https://issues.apache.org/jira/browse/KAFKA-1809 Repository: kafka Description (updated) --- changed topicmetadata to include brokerendpoints and fixed few unit tests fixing systest and support for binding to default address fixed system tests fix default address binding and ipv6 support fix some issues regarding endpoint parsing. Also, larger segments for systest make the validation much faster added link to security wiki in doc fixing unit test after rename of ProtocolType to SecurityProtocol Following Joe's advice, added security protocol enum on client side, and modified protocol to use ID instead of string. validate producer config against enum add a second protocol for testing and modify SocketServerTests to check on multi-ports Reverted the metadata request changes and removed the explicit security protocol argument. 
Instead the socketserver will determine the protocol based on the port and add this to the request bump version for UpdateMetadataRequest and added support for rolling upgrades with new config following tests - fixed LeaderAndISR protocol and ZK registration for backward compatibility Diffs (updated) - clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 525b95e98010cd2053eacd8c321d079bcac2f910 clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java f61efb35db7e0de590556e6a94a7b5cb850cdae9 clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java a893d88c2f4e21509b6c70d6817b4b2cdd0fd657 clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 7517b879866fc5dad5f8d8ad30636da8bbe7784a clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/requests/MetadataRequest.java b22ca1dce65f665d84c2a980fd82f816e93d9960 clients/src/main/java/org/apache/kafka/common/utils/Utils.java 527dd0f9c47fce7310b7a37a9b95bf87f1b9c292 clients/src/test/java/org/apache/kafka/clients/NetworkClientTest.java 1a55242e9399fa4669630b55110d530f954e1279 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java df37fc6d8f0db0b8192a948426af603be3444da4 clients/src/test/java/org/apache/kafka/common/utils/ClientUtilsTest.java 6e37ea553f73d9c584641c48c56dbf6e62ba5f88 clients/src/test/java/org/apache/kafka/common/utils/UtilsTest.java a39fab532f73148316a56c0f8e9197f38ea66f79 config/server.properties b0e4496a8ca736b6abe965a430e8ce87b0e8287f core/src/main/scala/kafka/admin/AdminUtils.scala 28b12c7b89a56c113b665fbde1b95f873f8624a3 core/src/main/scala/kafka/admin/TopicCommand.scala 285c0333ff43543d3e46444c1cd9374bb883bb59 core/src/main/scala/kafka/api/ConsumerMetadataRequest.scala 6d00ed090d76cd7925621a9c6db8fb00fb9d48f4 core/src/main/scala/kafka/api/ConsumerMetadataResponse.scala 84f60178f6ebae735c8aa3e79ed93fe21ac4aea7 
core/src/main/scala/kafka/api/LeaderAndIsrRequest.scala 4ff7e8f8cc695551dd5d2fe65c74f6b6c571e340 core/src/main/scala/kafka/api/TopicMetadata.scala 0190076df0adf906ecd332284f222ff974b315fc core/src/main/scala/kafka/api/TopicMetadataRequest.scala 7dca09ce637a40e125de05703dc42e8b611971ac core/src/main/scala/kafka/api/TopicMetadataResponse.scala 92ac4e687be22e4800199c0666bfac5e0059e5bb core/src/main/scala/kafka/api/UpdateMetadataRequest.scala 530982e36b17934b8cc5fb668075a5342e142c59 core/src/main/scala/kafka/client/ClientUtils.scala ebba87f0566684c796c26cb76c64b4640a5ccfde core/src/main/scala/kafka/cluster/Broker.scala 0060add008bb3bc4b0092f2173c469fce0120be6 core/src/main/scala/kafka/cluster/BrokerEndPoint.scala PRE-CREATION core/src/main/scala/kafka/cluster/EndPoint.scala PRE-CREATION core/src/main/scala/kafka/cluster/SecurityProtocol.scala PRE-CREATION core/src/main/scala/kafka/common/BrokerEndPointNotAvailableException.scala PRE-CREATION core/src/main/scala/kafka/consumer/ConsumerConfig.scala 9ebbee6c16dc83767297c729d2d74ebbd063a993 core/src/main/scala/kafka/consumer/ConsumerFetcherManager.scala b9e2bea7b442a19bcebd1b350d39541a8c9dd068 core/src/main/scala/kafka/consumer/ConsumerFetcherThread.scala ee6139c901082358382c5ef892281386bf6fc91b core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 191a8677444e53b043e9ad6e94c5a9191c32599e core/src/main/scala/kafka/controller/ControllerChannelManager.scala eb492f00449744bc8d63f55b393e2a1659d38454 core/src/main/scala/kafka/controller/KafkaController.scala 66df6d2fbdbdd556da6bea0df84f93e0472c8fbf core/src/main/scala/kafka/javaapi/ConsumerMetadataResponse.scala 1b28861cdf7dfb30fc696b54f8f8f05f730f4ece
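The description above mentions fixing endpoint parsing and IPv6 support. A hedged sketch of what parsing one listener string of the form `PLAINTEXT://host:9092` involves is below; the class and method names are invented for illustration (the real implementation in this patch is `core/src/main/scala/kafka/cluster/EndPoint.scala`, in Scala).

```java
import java.util.Locale;

// Hedged sketch: split "PROTOCOL://host:port" into its three parts.
// Invented names; not the actual EndPoint API from the patch.
public class EndPointSketch {
    public final String securityProtocol;
    public final String host;
    public final int port;

    public EndPointSketch(String securityProtocol, String host, int port) {
        this.securityProtocol = securityProtocol;
        this.host = host;
        this.port = port;
    }

    public static EndPointSketch parse(String listener) {
        int schemeEnd = listener.indexOf("://");
        if (schemeEnd < 0)
            throw new IllegalArgumentException("Invalid endpoint: " + listener);
        String protocol = listener.substring(0, schemeEnd).toUpperCase(Locale.ROOT);
        String hostPort = listener.substring(schemeEnd + 3);
        // lastIndexOf tolerates the colons inside an IPv6 literal like [::1]:9093
        int colon = hostPort.lastIndexOf(':');
        if (colon < 0)
            throw new IllegalArgumentException("Missing port in endpoint: " + listener);
        return new EndPointSketch(protocol,
                                  hostPort.substring(0, colon),
                                  Integer.parseInt(hostPort.substring(colon + 1)));
    }
}
```

Per the description, the broker then maps each bound port back to its security protocol in the SocketServer, rather than the client naming the protocol explicitly in the request.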
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265232#comment-14265232 ] Gwen Shapira commented on KAFKA-1809: - Updated reviewboard https://reviews.apache.org/r/28769/diff/ against branch trunk Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwen Shapira updated KAFKA-1809: Attachment: KAFKA-1809_2015-01-05_14:23:57.patch Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29590: 1. Removed defaults for serializer/deserializer. 2. Converted type cast exception to serialization exception in the producer. 3. Added string ser/deser. 4. Moved the isKey fl
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29590/#review66773 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java https://reviews.apache.org/r/29590/#comment110394 This does suck. An alternative would be to default the serializer to the string which would not show in the docs...? clients/src/main/java/org/apache/kafka/common/serialization/StringDeserializer.java https://reviews.apache.org/r/29590/#comment110391 The Javadoc for this needs to give the configurations that it accepts. clients/src/main/java/org/apache/kafka/common/serialization/StringDeserializer.java https://reviews.apache.org/r/29590/#comment110390 It would be nice if this accepted deserializer.encoding as a way to define both the key and value. clients/src/main/java/org/apache/kafka/common/serialization/StringSerializer.java https://reviews.apache.org/r/29590/#comment110392 Ditto on javadoc and params. - Jay Kreps On Jan. 5, 2015, 7:47 p.m., Jun Rao wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29590/ --- (Updated Jan. 5, 2015, 7:47 p.m.) Review request for kafka. 
Bugs: kafka-1797 https://issues.apache.org/jira/browse/kafka-1797 Repository: kafka Description --- addressing Jay's comments Diffs - clients/src/main/java/org/apache/kafka/clients/consumer/ByteArrayDeserializer.java 514cbd2c27a8d1ce13489d315f7880dfade7ffde clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 1d64f08762b0c33fcaebde0f41039b327060215a clients/src/main/java/org/apache/kafka/clients/consumer/Deserializer.java c774a199db71fbc00776cd1256af57b2d9e55a66 clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java fe9066388f4b7910512d85ef088a1b96749735ac clients/src/main/java/org/apache/kafka/clients/producer/ByteArraySerializer.java 9005b74a328c997663232fe3a0999b25d2267efe clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java d859fc588a276eb36bcfd621ae6d7978ad0decdd clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 9cdc13d6cbb372b350acf90a21538b8ba495d2e8 clients/src/main/java/org/apache/kafka/clients/producer/Serializer.java de87f9c1caeadd176195be75d0db43fc0a518380 clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 3d4ab7228926f50309c07f0672f33416ce4fa37f clients/src/main/java/org/apache/kafka/common/errors/DeserializationException.java a5433398fb9788e260a4250da32e4be607f3f207 clients/src/main/java/org/apache/kafka/common/serialization/StringDeserializer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/serialization/StringSerializer.java PRE-CREATION clients/src/test/java/org/apache/kafka/common/serialization/SerializationTest.java PRE-CREATION core/src/main/scala/kafka/producer/KafkaLog4jAppender.scala e194942492324092f811b86f9c1f28f79b366cfd core/src/main/scala/kafka/tools/ConsoleProducer.scala 397d80da08c925757649b7d104d8360f56c604c3 core/src/main/scala/kafka/tools/MirrorMaker.scala 2126f6e55c5ec6a418165d340cc9a4f445af5045 core/src/main/scala/kafka/tools/ProducerPerformance.scala f2dc4ed2f04f0e9656e10b02db5ed1d39c4a4d39 
core/src/main/scala/kafka/tools/ReplayLogProducer.scala f541987b2876a438c43ea9088ae8fed708ba82a3 core/src/main/scala/kafka/tools/TestEndToEndLatency.scala 2ebc7bf643ea91bd93ba37b8e64e8a5a9bb37ece core/src/main/scala/kafka/tools/TestLogCleaning.scala b81010ec0fa9835bfe48ce6aad0c491cdc67e7ef core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala 1505fd4464dc9ac71cce52d9b64406a21e5e45d2 core/src/test/scala/integration/kafka/api/ProducerSendTest.scala 6196060edf9f1650720ec916f88933953a1daa2c core/src/test/scala/unit/kafka/utils/TestUtils.scala 94d0028d8c4907e747aa8a74a13d719b974c97bf Diff: https://reviews.apache.org/r/29590/diff/ Testing --- Thanks, Jun Rao
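Jay's suggestion that the deserializer accept `deserializer.encoding` as a shared default for both key and value can be sketched as the following configuration lookup. The property names follow the review comment and are illustrative, not necessarily the final committed API.

```java
import java.io.UnsupportedEncodingException;
import java.util.Map;

// Sketch of the lookup the review asks for: prefer the key/value-specific
// property, fall back to a shared "deserializer.encoding", default to UTF-8.
public class StringDeserializerSketch {
    private String encoding = "UTF8";

    public void configure(Map<String, ?> configs, boolean isKey) {
        Object value = configs.get(isKey ? "key.deserializer.encoding"
                                         : "value.deserializer.encoding");
        if (value == null)
            value = configs.get("deserializer.encoding"); // shared fallback
        if (value instanceof String)
            encoding = (String) value;
    }

    public String deserialize(byte[] data) {
        if (data == null)
            return null;
        try {
            return new String(data, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException("Unsupported encoding " + encoding, e);
        }
    }
}
```

This also addresses the Javadoc comment: whatever property names are chosen need to be documented on the class, since they never appear in the ConsumerConfig docs.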
[jira] [Commented] (KAFKA-1785) Consumer offset checker should show the offset manager and offsets partition
[ https://issues.apache.org/jira/browse/KAFKA-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265465#comment-14265465 ] Mayuresh Gharat commented on KAFKA-1785:
--
I talked to Joel about this last time. Joel was of the opinion that it's trivial and suggested it would be better if we get it done. In any case, it's not a blocker for the release as per our discussion. I will try to get it done, but in the worst case it can be moved.

Consumer offset checker should show the offset manager and offsets partition
-
Key: KAFKA-1785
URL: https://issues.apache.org/jira/browse/KAFKA-1785
Project: Kafka
Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Mayuresh Gharat
Priority: Blocker
Fix For: 0.8.2

This is trivial, extremely useful to have, and can be done as part of the offset client patch.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29590: 1. Removed defaults for serializer/deserializer. 2. Converted type cast exception to serialization exception in the producer. 3. Added string ser/deser. 4. Moved the isKey fl
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29590/#review66776 --- Ship it! I'm +1 with those minor comments addressed as you see best. - Jay Kreps On Jan. 5, 2015, 7:47 p.m., Jun Rao wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29590/ --- (Updated Jan. 5, 2015, 7:47 p.m.) Review request for kafka. Bugs: kafka-1797 https://issues.apache.org/jira/browse/kafka-1797 Repository: kafka Description --- addressing Jay's comments Diffs - clients/src/main/java/org/apache/kafka/clients/consumer/ByteArrayDeserializer.java 514cbd2c27a8d1ce13489d315f7880dfade7ffde clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 1d64f08762b0c33fcaebde0f41039b327060215a clients/src/main/java/org/apache/kafka/clients/consumer/Deserializer.java c774a199db71fbc00776cd1256af57b2d9e55a66 clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java fe9066388f4b7910512d85ef088a1b96749735ac clients/src/main/java/org/apache/kafka/clients/producer/ByteArraySerializer.java 9005b74a328c997663232fe3a0999b25d2267efe clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java d859fc588a276eb36bcfd621ae6d7978ad0decdd clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 9cdc13d6cbb372b350acf90a21538b8ba495d2e8 clients/src/main/java/org/apache/kafka/clients/producer/Serializer.java de87f9c1caeadd176195be75d0db43fc0a518380 clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 3d4ab7228926f50309c07f0672f33416ce4fa37f clients/src/main/java/org/apache/kafka/common/errors/DeserializationException.java a5433398fb9788e260a4250da32e4be607f3f207 clients/src/main/java/org/apache/kafka/common/serialization/StringDeserializer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/serialization/StringSerializer.java PRE-CREATION clients/src/test/java/org/apache/kafka/common/serialization/SerializationTest.java 
PRE-CREATION core/src/main/scala/kafka/producer/KafkaLog4jAppender.scala e194942492324092f811b86f9c1f28f79b366cfd core/src/main/scala/kafka/tools/ConsoleProducer.scala 397d80da08c925757649b7d104d8360f56c604c3 core/src/main/scala/kafka/tools/MirrorMaker.scala 2126f6e55c5ec6a418165d340cc9a4f445af5045 core/src/main/scala/kafka/tools/ProducerPerformance.scala f2dc4ed2f04f0e9656e10b02db5ed1d39c4a4d39 core/src/main/scala/kafka/tools/ReplayLogProducer.scala f541987b2876a438c43ea9088ae8fed708ba82a3 core/src/main/scala/kafka/tools/TestEndToEndLatency.scala 2ebc7bf643ea91bd93ba37b8e64e8a5a9bb37ece core/src/main/scala/kafka/tools/TestLogCleaning.scala b81010ec0fa9835bfe48ce6aad0c491cdc67e7ef core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala 1505fd4464dc9ac71cce52d9b64406a21e5e45d2 core/src/test/scala/integration/kafka/api/ProducerSendTest.scala 6196060edf9f1650720ec916f88933953a1daa2c core/src/test/scala/unit/kafka/utils/TestUtils.scala 94d0028d8c4907e747aa8a74a13d719b974c97bf Diff: https://reviews.apache.org/r/29590/diff/ Testing --- Thanks, Jun Rao
[jira] [Commented] (KAFKA-1819) Cleaner gets confused about deleted and re-created topics
[ https://issues.apache.org/jira/browse/KAFKA-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265503#comment-14265503 ] Gwen Shapira commented on KAFKA-1819:
--
Moving the check will not always verify deletion from the checkpoint file. The checkpoint file only exists if the brokers are configured with compact (rather than delete) and cleanup actually happened before the topic deletion; this requires very small segments and writing to the topic before it is deleted. That's why testDeleteTopicWithCleaner is a fairly involved test with a very specific broker configuration. So by moving the check we would be checking the contents of an empty file most of the time. Since the new test covers the Cleaner code path, I'm not sure what the benefits of adding the check to all tests are. Do you want to activate the Cleaner in every test? Or run the validation regardless of whether the cleaner was active?

Cleaner gets confused about deleted and re-created topics
-
Key: KAFKA-1819
URL: https://issues.apache.org/jira/browse/KAFKA-1819
Project: Kafka
Issue Type: Bug
Reporter: Gian Merlino
Assignee: Gwen Shapira
Priority: Blocker
Fix For: 0.8.2
Attachments: KAFKA-1819.patch, KAFKA-1819_2014-12-26_13:58:44.patch, KAFKA-1819_2014-12-30_16:01:19.patch

I get an error like this after deleting a compacted topic and re-creating it. I think it's because the brokers don't remove cleaning checkpoints from the cleaner-offset-checkpoint file. This is from a build based off commit bd212b7.

java.lang.IllegalArgumentException: requirement failed: Last clean offset is 587607 but segment base offset is 0 for log foo-6.
    at scala.Predef$.require(Predef.scala:233)
    at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:502)
    at kafka.log.Cleaner.clean(LogCleaner.scala:300)
    at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:214)
    at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:192)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
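The broker setup Gwen describes (cleaner enabled, very small segments so a cleaning pass actually runs before the topic is deleted) looks roughly like the fragment below. The property names are real 0.8.2 broker configs, but the values are illustrative test settings, not recommendations.

```properties
# Broker side: enable the log cleaner and keep segments tiny so segments
# roll (and a cleaning pass can run) after only a few messages.
log.cleaner.enable=true
log.segment.bytes=100
log.cleaner.backoff.ms=100
```

The topic itself must also be created with the compact cleanup policy (a per-topic override, e.g. `--config cleanup.policy=compact` at creation time); only then does cleaner-offset-checkpoint gain an entry for the topic before deletion, which is the state the test needs to validate.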
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265717#comment-14265717 ] Gwen Shapira commented on KAFKA-1809: - Upgrade tests pass, system tests pass, and I've done my best to shrink the patch to its minimal possible size. Reviews will be really appreciated :) Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, KAFKA-1809_2015-01-05_21:40:14.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwen Shapira updated KAFKA-1809: Attachment: KAFKA-1809_2015-01-05_20:25:15.patch Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28769: Patch for KAFKA-1809
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28769/ --- (Updated Jan. 6, 2015, 4:25 a.m.) Review request for kafka. Bugs: KAFKA-1809 https://issues.apache.org/jira/browse/KAFKA-1809 Repository: kafka Description (updated) --- changed topicmetadata to include brokerendpoints and fixed few unit tests fixing systest and support for binding to default address fixed system tests fix default address binding and ipv6 support fix some issues regarding endpoint parsing. Also, larger segments for systest make the validation much faster added link to security wiki in doc fixing unit test after rename of ProtocolType to SecurityProtocol Following Joe's advice, added security protocol enum on client side, and modified protocol to use ID instead of string. validate producer config against enum add a second protocol for testing and modify SocketServerTests to check on multi-ports Reverted the metadata request changes and removed the explicit security protocol argument. Instead the socketserver will determine the protocol based on the port and add this to the request bump version for UpdateMetadataRequest and added support for rolling upgrades with new config following tests - fixed LeaderAndISR protocol and ZK registration for backward compatibility cleaned up some changes that were actually not necessary. 
hopefully making this patch slightly easier to review Diffs (updated) - clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 525b95e98010cd2053eacd8c321d079bcac2f910 clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java a893d88c2f4e21509b6c70d6817b4b2cdd0fd657 clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 7517b879866fc5dad5f8d8ad30636da8bbe7784a clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/requests/MetadataRequest.java b22ca1dce65f665d84c2a980fd82f816e93d9960 clients/src/main/java/org/apache/kafka/common/utils/Utils.java 527dd0f9c47fce7310b7a37a9b95bf87f1b9c292 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java df37fc6d8f0db0b8192a948426af603be3444da4 clients/src/test/java/org/apache/kafka/common/utils/ClientUtilsTest.java 6e37ea553f73d9c584641c48c56dbf6e62ba5f88 clients/src/test/java/org/apache/kafka/common/utils/UtilsTest.java a39fab532f73148316a56c0f8e9197f38ea66f79 config/server.properties b0e4496a8ca736b6abe965a430e8ce87b0e8287f core/src/main/scala/kafka/admin/AdminUtils.scala 28b12c7b89a56c113b665fbde1b95f873f8624a3 core/src/main/scala/kafka/admin/TopicCommand.scala 285c0333ff43543d3e46444c1cd9374bb883bb59 core/src/main/scala/kafka/api/ConsumerMetadataRequest.scala 6d00ed090d76cd7925621a9c6db8fb00fb9d48f4 core/src/main/scala/kafka/api/ConsumerMetadataResponse.scala 84f60178f6ebae735c8aa3e79ed93fe21ac4aea7 core/src/main/scala/kafka/api/LeaderAndIsrRequest.scala 4ff7e8f8cc695551dd5d2fe65c74f6b6c571e340 core/src/main/scala/kafka/api/TopicMetadata.scala 0190076df0adf906ecd332284f222ff974b315fc core/src/main/scala/kafka/api/TopicMetadataRequest.scala 7dca09ce637a40e125de05703dc42e8b611971ac core/src/main/scala/kafka/api/TopicMetadataResponse.scala 92ac4e687be22e4800199c0666bfac5e0059e5bb core/src/main/scala/kafka/api/UpdateMetadataRequest.scala 530982e36b17934b8cc5fb668075a5342e142c59 
core/src/main/scala/kafka/client/ClientUtils.scala ebba87f0566684c796c26cb76c64b4640a5ccfde core/src/main/scala/kafka/cluster/Broker.scala 0060add008bb3bc4b0092f2173c469fce0120be6 core/src/main/scala/kafka/cluster/BrokerEndPoint.scala PRE-CREATION core/src/main/scala/kafka/cluster/EndPoint.scala PRE-CREATION core/src/main/scala/kafka/cluster/SecurityProtocol.scala PRE-CREATION core/src/main/scala/kafka/common/BrokerEndPointNotAvailableException.scala PRE-CREATION core/src/main/scala/kafka/consumer/ConsumerConfig.scala 9ebbee6c16dc83767297c729d2d74ebbd063a993 core/src/main/scala/kafka/consumer/ConsumerFetcherManager.scala b9e2bea7b442a19bcebd1b350d39541a8c9dd068 core/src/main/scala/kafka/consumer/ConsumerFetcherThread.scala ee6139c901082358382c5ef892281386bf6fc91b core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 191a8677444e53b043e9ad6e94c5a9191c32599e core/src/main/scala/kafka/controller/ControllerChannelManager.scala eb492f00449744bc8d63f55b393e2a1659d38454 core/src/main/scala/kafka/controller/KafkaController.scala 66df6d2fbdbdd556da6bea0df84f93e0472c8fbf core/src/main/scala/kafka/javaapi/ConsumerMetadataResponse.scala 1b28861cdf7dfb30fc696b54f8f8f05f730f4ece core/src/main/scala/kafka/javaapi/TopicMetadata.scala f384e04678df10a5b46a439f475c63371bf8e32b
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265686#comment-14265686 ] Gwen Shapira commented on KAFKA-1809: - Updated reviewboard https://reviews.apache.org/r/28769/diff/ against branch trunk Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28769: Patch for KAFKA-1809
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28769/ --- (Updated Jan. 6, 2015, 5:40 a.m.) Review request for kafka. Bugs: KAFKA-1809 https://issues.apache.org/jira/browse/KAFKA-1809 Repository: kafka Description (updated) --- changed topicmetadata to include brokerendpoints and fixed few unit tests fixing systest and support for binding to default address fixed system tests fix default address binding and ipv6 support fix some issues regarding endpoint parsing. Also, larger segments for systest make the validation much faster added link to security wiki in doc fixing unit test after rename of ProtocolType to SecurityProtocol Following Joe's advice, added security protocol enum on client side, and modified protocol to use ID instead of string. validate producer config against enum add a second protocol for testing and modify SocketServerTests to check on multi-ports Reverted the metadata request changes and removed the explicit security protocol argument. Instead the socketserver will determine the protocol based on the port and add this to the request bump version for UpdateMetadataRequest and added support for rolling upgrades with new config following tests - fixed LeaderAndISR protocol and ZK registration for backward compatibility cleaned up some changes that were actually not necessary. 
hopefully making this patch slightly easier to review undoing some changes that don't belong here bring back config lost in cleanup Diffs (updated) - clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/utils/Utils.java 527dd0f9c47fce7310b7a37a9b95bf87f1b9c292 clients/src/test/java/org/apache/kafka/common/utils/ClientUtilsTest.java 6e37ea553f73d9c584641c48c56dbf6e62ba5f88 clients/src/test/java/org/apache/kafka/common/utils/UtilsTest.java a39fab532f73148316a56c0f8e9197f38ea66f79 config/server.properties b0e4496a8ca736b6abe965a430e8ce87b0e8287f core/src/main/scala/kafka/admin/AdminUtils.scala 28b12c7b89a56c113b665fbde1b95f873f8624a3 core/src/main/scala/kafka/admin/TopicCommand.scala 285c0333ff43543d3e46444c1cd9374bb883bb59 core/src/main/scala/kafka/api/ConsumerMetadataResponse.scala 84f60178f6ebae735c8aa3e79ed93fe21ac4aea7 core/src/main/scala/kafka/api/LeaderAndIsrRequest.scala 4ff7e8f8cc695551dd5d2fe65c74f6b6c571e340 core/src/main/scala/kafka/api/TopicMetadata.scala 0190076df0adf906ecd332284f222ff974b315fc core/src/main/scala/kafka/api/TopicMetadataRequest.scala 7dca09ce637a40e125de05703dc42e8b611971ac core/src/main/scala/kafka/api/TopicMetadataResponse.scala 92ac4e687be22e4800199c0666bfac5e0059e5bb core/src/main/scala/kafka/api/UpdateMetadataRequest.scala 530982e36b17934b8cc5fb668075a5342e142c59 core/src/main/scala/kafka/client/ClientUtils.scala ebba87f0566684c796c26cb76c64b4640a5ccfde core/src/main/scala/kafka/cluster/Broker.scala 0060add008bb3bc4b0092f2173c469fce0120be6 core/src/main/scala/kafka/cluster/BrokerEndPoint.scala PRE-CREATION core/src/main/scala/kafka/cluster/EndPoint.scala PRE-CREATION core/src/main/scala/kafka/cluster/SecurityProtocol.scala PRE-CREATION core/src/main/scala/kafka/common/BrokerEndPointNotAvailableException.scala PRE-CREATION core/src/main/scala/kafka/consumer/ConsumerConfig.scala 9ebbee6c16dc83767297c729d2d74ebbd063a993 
core/src/main/scala/kafka/consumer/ConsumerFetcherManager.scala b9e2bea7b442a19bcebd1b350d39541a8c9dd068 core/src/main/scala/kafka/consumer/ConsumerFetcherThread.scala ee6139c901082358382c5ef892281386bf6fc91b core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 191a8677444e53b043e9ad6e94c5a9191c32599e core/src/main/scala/kafka/controller/ControllerChannelManager.scala eb492f00449744bc8d63f55b393e2a1659d38454 core/src/main/scala/kafka/controller/KafkaController.scala 66df6d2fbdbdd556da6bea0df84f93e0472c8fbf core/src/main/scala/kafka/javaapi/ConsumerMetadataResponse.scala 1b28861cdf7dfb30fc696b54f8f8f05f730f4ece core/src/main/scala/kafka/javaapi/TopicMetadata.scala f384e04678df10a5b46a439f475c63371bf8e32b core/src/main/scala/kafka/javaapi/TopicMetadataRequest.scala b0b7be14d494ae8c87f4443b52db69d273c20316 core/src/main/scala/kafka/network/BlockingChannel.scala 6e2a38eee8e568f9032f95c75fa5899e9715b433 core/src/main/scala/kafka/network/RequestChannel.scala 7b1db3dbbb2c0676f166890f566c14aa248467ab core/src/main/scala/kafka/network/SocketServer.scala e451592fe358158548117f47a80e807007dd8b98 core/src/main/scala/kafka/producer/ProducerPool.scala 43df70bb461dd3e385e6b20396adef3c4016a3fc core/src/main/scala/kafka/server/AbstractFetcherManager.scala 20c00cb8cc2351950edbc8cb1752905a0c26e79f
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265715#comment-14265715 ] Gwen Shapira commented on KAFKA-1809: - Updated reviewboard https://reviews.apache.org/r/28769/diff/ against branch trunk Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, KAFKA-1809_2015-01-05_21:40:14.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwen Shapira updated KAFKA-1809: Attachment: KAFKA-1809_2015-01-05_21:40:14.patch Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, KAFKA-1809_2015-01-05_21:40:14.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1642: - Attachment: KAFKA-1642_2015-01-05_18:56:55.patch [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch, KAFKA-1642_2015-01-05_18:56:55.patch I see my CPU spike to 100% when the network connection is lost for a while. It seems the network I/O threads are very busy logging the following error message. Is this expected behavior?
2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: No entry found for node -2
at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110)
at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394)
at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
at java.lang.Thread.run(Thread.java:744)
Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265602#comment-14265602 ] Ewen Cheslack-Postava commented on KAFKA-1642: -- Updated reviewboard https://reviews.apache.org/r/28582/diff/ against branch origin/trunk [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch, KAFKA-1642_2015-01-05_18:56:55.patch I see my CPU spike to 100% when the network connection is lost for a while. It seems the network I/O threads are very busy logging the following error message. Is this expected behavior?
2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: No entry found for node -2
at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110)
at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394)
at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
at java.lang.Thread.run(Thread.java:744)
Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28582: Patch for KAFKA-1642
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28582/ --- (Updated Jan. 6, 2015, 2:56 a.m.) Review request for kafka. Bugs: KAFKA-1642 https://issues.apache.org/jira/browse/KAFKA-1642 Repository: kafka Description (updated) --- KAFKA-1642 Mark nodes as connecting before making the connect request so exceptions are handled properly. KAFKA-1642 Update the last time no nodes were available when a metadata update has to initiate a connection. KAFKA-1642 Don't busy-wait trying to send metadata to a node that is connected but has too many in-flight requests. Improve handling of fast-fail connection attempts in NetworkClient and add some comments explaining how each branch of maybeUpdateMetadata works. Diffs (updated) - clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 525b95e98010cd2053eacd8c321d079bcac2f910 Diff: https://reviews.apache.org/r/28582/diff/ Testing --- Thanks, Ewen Cheslack-Postava
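The fixes described in the review above (register a node as connecting before issuing the connect request, and remember the last time no node was available so retries back off instead of busy-looping) can be sketched as follows. This is an illustrative model under assumed names (ConnectionBackoffSketch, shouldRetry), not the actual NetworkClient code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the two fixes described in the review:
// 1) record a node as CONNECTING *before* attempting the connection, so the
//    failure path can safely mark it DISCONNECTED without hitting a missing
//    entry ("No entry found for node -2");
// 2) remember the last time no node was available and back off before
//    retrying, instead of spinning at 100% CPU.
public class ConnectionBackoffSketch {
    enum State { CONNECTING, CONNECTED, DISCONNECTED }

    private final Map<Integer, State> states = new HashMap<>();
    private final long retryBackoffMs;
    private long lastNoNodeAvailableMs = 0;

    public ConnectionBackoffSketch(long retryBackoffMs) {
        this.retryBackoffMs = retryBackoffMs;
    }

    /** Attempt a connection; the state entry exists before connect() can fail. */
    public void initiateConnect(int node, boolean connectSucceeds) {
        states.put(node, State.CONNECTING); // registered first, so the failure
        if (connectSucceeds)                // path below finds an entry
            states.put(node, State.CONNECTED);
        else
            disconnected(node);
    }

    /** Safe even when called from an exception handler after initiateConnect. */
    public void disconnected(int node) {
        if (!states.containsKey(node))
            throw new IllegalStateException("No entry found for node " + node);
        states.put(node, State.DISCONNECTED);
    }

    /** Called when a metadata update finds no usable node. */
    public void noNodeAvailable(long nowMs) {
        lastNoNodeAvailableMs = nowMs;
    }

    /** Only retry once the backoff window has elapsed, avoiding a busy loop. */
    public boolean shouldRetry(long nowMs) {
        return nowMs - lastNoNodeAvailableMs >= retryBackoffMs;
    }

    public State stateOf(int node) { return states.get(node); }
}
```

A failed connect now ends in a clean DISCONNECTED state rather than an uncaught IllegalStateException in the I/O thread.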
Re: Follow-up On Important Issues for 0.8.2
Hi Kafka Dev Team,

We have a similar change for issue #2; I would like to know if KAFKA-1788 https://issues.apache.org/jira/browse/KAFKA-1788 will be part of the release. Secondly, for issue #1, I would like to know the expected behavior of the error-count metrics and the callback (in async mode) when there are no issues with ZK or the brokers but only with the network layer (a firewall or DNS issue). If need be, I can file a jira ticket after understanding the behavior, and I can test the expected behavior as well.

Thanks, Bhavesh

On Fri, Jan 2, 2015 at 12:37 PM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote:

Hi Kafka Dev Team,

I am following up with you regarding the new (Java) producer's behavior in the event of network or firewall problems. I want to make the Java producer resilient to any network or firewall issue, so that it does not become a single point of failure in an application:

1) Jira Issue https://issues.apache.org/jira/browse/KAFKA-1788 https://issues.apache.org/jira/browse/KAFKA-1788?focusedCommentId=14259235page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14259235 What should the behavior of the producer be when it cannot reach the leader broker, but metadata (via another broker) reports that broker as leader for the partition? Should the record-error-rate be counted, and should the callback be called with an error or not? The *record-error-rate* metric remained zero despite the following firewall rule. In my opinion, it should have called org.apache.kafka.clients.producer.Callback, but I did not see that happening whether one or two brokers were unreachable.

2) Will jira ticket https://issues.apache.org/jira/browse/KAFKA-1788 be merged to 0.8.2? This would give the ability to close the producer in the event of lost connectivity to a broker, if the io thread misbehaves (does not end).

Thanks for your help!

Thanks, Bhavesh
[jira] [Commented] (KAFKA-1661) Move MockConsumer and MockProducer from src/main to src/test
[ https://issues.apache.org/jira/browse/KAFKA-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265649#comment-14265649 ] jaikiran pai commented on KAFKA-1661: - I think it's a matter of how Kafka wants to classify certain code/classes as useable by applications in production. You are right that it's ultimately up to the developers to make the right decision to not use such classes in production. I don't have a strong preference in moving them to some other place. Maybe if something needs to be done then perhaps we could just change the package names to something like org.apache.kafka.clients.producer.internal (or even org.apache.kafka.clients.producer.unsupported) and similar for the consumer package? Move MockConsumer and MockProducer from src/main to src/test Key: KAFKA-1661 URL: https://issues.apache.org/jira/browse/KAFKA-1661 Project: Kafka Issue Type: Task Components: clients, consumer, producer Affects Versions: 0.8.1.1 Environment: N/A Reporter: Andras Hatvani Priority: Trivial Labels: newbie, test Fix For: 0.8.3 The MockConsumer and MockProducer are currently in src/main although they belong in src/test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1661) Move MockConsumer and MockProducer from src/main to src/test
[ https://issues.apache.org/jira/browse/KAFKA-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265649#comment-14265649 ] jaikiran pai edited comment on KAFKA-1661 at 1/6/15 3:45 AM: - I think it's a matter of how Kafka wants to classify certain code/classes as not to be used in production. You are right that it's ultimately up to the developers to make the right decision to not use such classes in production. I don't have a strong preference in moving them to some other place. Maybe if something needs to be done then perhaps we could just change the package names to something like org.apache.kafka.clients.producer.internal (or even org.apache.kafka.clients.producer.unsupported) and similar for the consumer package? was (Author: jaikiran): I think it's a matter of how Kafka wants to classify certain code/classes as useable by applications in production. You are right that it's ultimately up to the developers to make the right decision to not use such classes in production. I don't have a strong preference in moving them to some other place. Maybe if something needs to be done then perhaps we could just change the package names to something like org.apache.kafka.clients.producer.internal (or even org.apache.kafka.clients.producer.unsupported) and similar for the consumer package? Move MockConsumer and MockProducer from src/main to src/test Key: KAFKA-1661 URL: https://issues.apache.org/jira/browse/KAFKA-1661 Project: Kafka Issue Type: Task Components: clients, consumer, producer Affects Versions: 0.8.1.1 Environment: N/A Reporter: Andras Hatvani Priority: Trivial Labels: newbie, test Fix For: 0.8.3 The MockConsumer and MockProducer are currently in src/main although they belong in src/test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29379: Patch for KAFKA-1788
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29379/#review66709 --- I think the basic approach in this patch looks sound and should work fine independent of any fixes to how metadata/leader info is retrieved (as discussed in the JIRA). However, it still needs some cleanup and fixes. clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java https://reviews.apache.org/r/29379/#comment110290 deque is not threadsafe, you'll need a synchronized block. Since both branches of this if now require that and call deque.peekFirst, you might just want to pull that code out into the surrounding block. clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java https://reviews.apache.org/r/29379/#comment110292 Looks like this is just a leftover setting from development? This should be using this.batchExpirationMs clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java https://reviews.apache.org/r/29379/#comment110294 Normally this sequence of batch.done() and deallocate() is called from Sender.completeBatch(), which also calls Sender.sensors.recordErrors() when there was an error, as there was in this case. Any way to rework this so the error can be properly recorded in metrics? clients/src/test/java/org/apache/kafka/clients/producer/RecordAccumulatorTest.java https://reviews.apache.org/r/29379/#comment110295 This would probably be clearer if it was just batchExpirationMs. - Ewen Cheslack-Postava On Dec. 23, 2014, 8:44 p.m., Parth Brahmbhatt wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29379/ --- (Updated Dec. 23, 2014, 8:44 p.m.) Review request for kafka. 
Bugs: KAFKA-1788 https://issues.apache.org/jira/browse/KAFKA-1788 Repository: kafka Description --- KAFKA-1788: Adding configuration for batch.expiration.ms to ensure batches for partitions that has no leader do not stay in accumulator forever. Diffs - clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java f61efb35db7e0de590556e6a94a7b5cb850cdae9 clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java a893d88c2f4e21509b6c70d6817b4b2cdd0fd657 clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java c15485d1af304ef53691d478f113f332fe67af77 clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordBatch.java dd0af8aee98abed5d4a0dc50989e37888bb353fe clients/src/test/java/org/apache/kafka/clients/producer/RecordAccumulatorTest.java 2c9932401d573549c40f16fda8c4e3e11309cb85 clients/src/test/java/org/apache/kafka/clients/producer/SenderTest.java ef2ca65cabe97b909f17b62027a1bb06827e88fe Diff: https://reviews.apache.org/r/29379/diff/ Testing --- Unit test added. Thanks, Parth Brahmbhatt
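The review comments on this patch ask for two things: guard the non-thread-safe deque with a synchronized block that covers both the peek and the removal, and expire batches using the configured batch expiration time. A rough sketch of that shape, with illustrative names (BatchExpirySketch, batchExpirationMs) rather than the actual RecordAccumulator code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of expiring stale batches from a per-partition deque. ArrayDeque is
// not thread-safe, so all access happens inside a synchronized block, as the
// review asks; batches older than batchExpirationMs are drained and returned
// so the caller can fail them.
public class BatchExpirySketch {
    static class Batch {
        final long createdMs;
        Batch(long createdMs) { this.createdMs = createdMs; }
    }

    private final Deque<Batch> deque = new ArrayDeque<>();
    private final long batchExpirationMs;

    public BatchExpirySketch(long batchExpirationMs) {
        this.batchExpirationMs = batchExpirationMs;
    }

    public void append(Batch b) {
        synchronized (deque) { deque.addLast(b); }
    }

    /** Remove and return every batch whose age exceeds batchExpirationMs. */
    public List<Batch> expiredBatches(long nowMs) {
        List<Batch> expired = new ArrayList<>();
        synchronized (deque) { // one lock covers both peekFirst and pollFirst
            Batch first = deque.peekFirst();
            while (first != null && nowMs - first.createdMs > batchExpirationMs) {
                expired.add(deque.pollFirst());
                first = deque.peekFirst();
            }
        }
        // Per the review, the real producer would also run batch.done(...)
        // with an error here and record the error in the sender metrics.
        return expired;
    }

    public int size() {
        synchronized (deque) { return deque.size(); }
    }
}
```

Keeping peek and removal under a single lock avoids the race the reviewer points out where another thread mutates the deque between the two calls.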
[jira] [Updated] (KAFKA-1841) OffsetCommitRequest API - timestamp field is not versioned
[ https://issues.apache.org/jira/browse/KAFKA-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dana Powers updated KAFKA-1841: --- Description: Timestamp field was added to the OffsetCommitRequest wire protocol api for 0.8.2 by KAFKA-1012 . The 0.8.1.1 server does not support the timestamp field, so I think the api version of OffsetCommitRequest should be incremented and checked by the 0.8.2 kafka server before attempting to read a timestamp from the network buffer in OffsetCommitRequest.readFrom (core/src/main/scala/kafka/api/OffsetCommitRequest.scala) It looks like a subsequent patch (KAFKA-1462) added another api change to support a new constructor w/ params generationId and consumerId, calling that version 1, and a pending patch (KAFKA-1634) adds retentionMs as another field, while possibly removing timestamp altogether, calling this version 2. So the fix here is not straightforward enough for me to submit a patch. This could possibly be merged into KAFKA-1634, but opening as a separate Issue because I believe the lack of versioning in the current trunk should block 0.8.2 release. was: Timestamp field was added to the OffsetCommitRequest wire protocol api for 0.8.2 by KAFKA-1012 . The 0.8.1.1 server does not support the timestamp field, so I think the api version of OffsetCommitRequest should be incremented and checked by the 0.8.2 kafka server before attempting to read a timestamp from the network buffer in OffsetCommitRequest.readFrom (core/src/main/scala/kafka/api/OffsetCommitRequest.scala) It looks like a subsequent patch (kafka-1462) added another api change to support a new constructor w/ params generationId and consumerId, calling that version 1, and a pending patch (kafka-1634) adds retentionMs as another field, while possibly removing timestamp altogether, calling this version 2. So the fix here is not straightforward enough for me to submit a patch. 
This could possibly be merged into KAFKA-1634, but opening as a separate Issue because I believe the lack of versioning in the current trunk should block 0.8.2 release. OffsetCommitRequest API - timestamp field is not versioned -- Key: KAFKA-1841 URL: https://issues.apache.org/jira/browse/KAFKA-1841 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.2 Environment: wire-protocol Reporter: Dana Powers Priority: Blocker Timestamp field was added to the OffsetCommitRequest wire protocol api for 0.8.2 by KAFKA-1012 . The 0.8.1.1 server does not support the timestamp field, so I think the api version of OffsetCommitRequest should be incremented and checked by the 0.8.2 kafka server before attempting to read a timestamp from the network buffer in OffsetCommitRequest.readFrom (core/src/main/scala/kafka/api/OffsetCommitRequest.scala) It looks like a subsequent patch (KAFKA-1462) added another api change to support a new constructor w/ params generationId and consumerId, calling that version 1, and a pending patch (KAFKA-1634) adds retentionMs as another field, while possibly removing timestamp altogether, calling this version 2. So the fix here is not straightforward enough for me to submit a patch. This could possibly be merged into KAFKA-1634, but opening as a separate Issue because I believe the lack of versioning in the current trunk should block 0.8.2 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
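The fix described in KAFKA-1841 — increment the api version and have the server check it before reading the newer field — follows the standard pattern for evolving a wire format. A minimal sketch of version-gated deserialization, with a simplified, illustrative field layout (not the actual OffsetCommitRequest format):

```java
import java.nio.ByteBuffer;

// Sketch of version-gated deserialization: the reader checks the api version
// before consuming optional fields, so an old (v0) request that carries no
// timestamp is never misread. Layout here is simplified and illustrative.
public class VersionedRequestSketch {
    final short version;
    final long offset;
    final long timestamp; // only present on the wire for version >= 1

    VersionedRequestSketch(short version, long offset, long timestamp) {
        this.version = version;
        this.offset = offset;
        this.timestamp = timestamp;
    }

    static VersionedRequestSketch readFrom(ByteBuffer buf) {
        short version = buf.getShort();
        long offset = buf.getLong();
        // Guard the newer field on the version instead of blindly reading it,
        // which would either corrupt parsing or throw on a short v0 buffer.
        long timestamp = version >= 1 ? buf.getLong() : -1L;
        return new VersionedRequestSketch(version, offset, timestamp);
    }

    static ByteBuffer writeV0(long offset) {
        ByteBuffer b = ByteBuffer.allocate(10); // short + long
        b.putShort((short) 0).putLong(offset);
        b.flip();
        return b;
    }

    static ByteBuffer writeV1(long offset, long timestamp) {
        ByteBuffer b = ByteBuffer.allocate(18); // short + long + long
        b.putShort((short) 1).putLong(offset).putLong(timestamp);
        b.flip();
        return b;
    }
}
```

The version check is exactly what the issue says was missing: without it, a v0 request from a 0.8.1.1-era client would be parsed as if it contained the timestamp.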
[jira] [Commented] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265042#comment-14265042 ] Dana Powers commented on KAFKA-1634: possibly related to this JIRA: KAFKA-1841 . The timestamp field itself was not in the released api version 0, and if it is to be included in 0.8.2 (this JIRA suggests it is, but to be removed in 0.8.3?) then I think it will need to be versioned. Improve semantics of timestamp in OffsetCommitRequests and update documentation --- Key: KAFKA-1634 URL: https://issues.apache.org/jira/browse/KAFKA-1634 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Priority: Blocker Fix For: 0.8.3 Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, KAFKA-1634_2014-11-21_14:00:34.patch, KAFKA-1634_2014-12-01_11:44:35.patch, KAFKA-1634_2014-12-01_18:03:12.patch From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1661) Move MockConsumer and MockProducer from src/main to src/test
[ https://issues.apache.org/jira/browse/KAFKA-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264860#comment-14264860 ] Jay Kreps commented on KAFKA-1661: -- That would work, but is examples really better? I think what this stuff is, really, is a kind of kafka-test-support.jar, but it seems silly to make a new jar just for two mock classes. Basically I don't see the problem with having these in the main clients jar as they are part of what we ship that we expect users will make use of. I don't think there is any danger that people will end up shipping production code that makes use of the mocks, right? Move MockConsumer and MockProducer from src/main to src/test Key: KAFKA-1661 URL: https://issues.apache.org/jira/browse/KAFKA-1661 Project: Kafka Issue Type: Task Components: clients, consumer, producer Affects Versions: 0.8.1.1 Environment: N/A Reporter: Andras Hatvani Priority: Trivial Labels: newbie, test Fix For: 0.8.3 The MockConsumer and MockProducer are currently in src/main although they belong in src/test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1723) make the metrics name in new producer more standard
[ https://issues.apache.org/jira/browse/KAFKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264870#comment-14264870 ] Jay Kreps commented on KAFKA-1723: -- Makes sense. I'd like to keep the metric data model as independent as possible from JMX just because JMX is so weird as a metric system and doesn't map that well to anything. So I guess I would prefer to leave out domainName unless it has a sane non-JMX interpretation. I think having the package be particular to JMX seems reasonable because it is pretty specific to JMX. make the metrics name in new producer more standard --- Key: KAFKA-1723 URL: https://issues.apache.org/jira/browse/KAFKA-1723 Project: Kafka Issue Type: Sub-task Components: clients Affects Versions: 0.8.2 Reporter: Jun Rao Assignee: Manikumar Reddy Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1723.patch The jmx name in the new producer looks like the following: kafka.producer.myclientid:type=mytopic However, this can be ambiguous since we allow . in client id and topic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29468: Patch for KAFKA-1805
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29468/#review66688 --- clients/src/main/java/org/apache/kafka/clients/producer/ProducerRecord.java https://reviews.apache.org/r/29468/#comment110269 Most of the values in ProducerRecord can be null. That condition needs to be handled here as well as in hashCode() - Ewen Cheslack-Postava On Dec. 30, 2014, 12:37 a.m., Parth Brahmbhatt wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29468/ --- (Updated Dec. 30, 2014, 12:37 a.m.) Review request for kafka. Bugs: KAFKA-1805, KAFKA-1905 and KAFKA-42 https://issues.apache.org/jira/browse/KAFKA-1805 https://issues.apache.org/jira/browse/KAFKA-1905 https://issues.apache.org/jira/browse/KAFKA-42 Repository: kafka Description --- KAFKA-1805: adding equals and hashcode methods to ProducerRecord. Diffs - clients/src/main/java/org/apache/kafka/clients/producer/ProducerRecord.java 065d4e6c6a4966ac216e98696782e2714044df29 Diff: https://reviews.apache.org/r/29468/diff/ Testing --- Thanks, Parth Brahmbhatt
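The review comment above — most fields of ProducerRecord can be null, so the new equals() and hashCode() must handle null — is typically addressed with Objects.equals and Objects.hash. A sketch on a simplified record type (illustrative, not the actual ProducerRecord source):

```java
import java.util.Objects;

// Simplified record showing null-safe equals/hashCode: Objects.equals and
// Objects.hash tolerate null fields, so records with a null topic, key, or
// value compare correctly instead of throwing NullPointerException.
public class RecordSketch {
    private final String topic;   // any of these may be null, per the review
    private final Integer partition;
    private final String key;
    private final String value;

    public RecordSketch(String topic, Integer partition, String key, String value) {
        this.topic = topic;
        this.partition = partition;
        this.key = key;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RecordSketch)) return false;
        RecordSketch r = (RecordSketch) o;
        return Objects.equals(topic, r.topic)
            && Objects.equals(partition, r.partition)
            && Objects.equals(key, r.key)
            && Objects.equals(value, r.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(topic, partition, key, value); // null-safe
    }
}
```

Hand-rolled field-by-field null checks work too, but the Objects helpers keep both methods consistent, which matters for the equals/hashCode contract.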
[jira] [Comment Edited] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264127#comment-14264127 ] Gwen Shapira edited comment on KAFKA-1809 at 1/5/15 7:21 PM: - Here's the testing plan:
* Start a Kafka 0.8.2-beta cluster (3 nodes).
* Sanity-test it.
* Replace one node with 0.8.3-SNAPSHOT with this patch.
* Produce with 0.8.2-beta new-producer, consume with 0.8.2-beta high-level consumer
* Bounce 0.8.2 node
* Bounce 0.8.3 node
* Stop the cluster and make sure you start 0.8.3 node first (so it will be a controller)
* Produce with 0.8.3 new-producer, consume with 0.8.3 high-level consumer
Makes sense? Anything I'm missing? I'd like to bake these tests into system_test, but I've no clue where to begin. Pointers will be super useful.
was (Author: gwenshap): Here's the testing plan:
* Start a Kafka 0.8.2-beta cluster (3 nodes).
* Sanity-test it.
* Replace one node with 0.8.3-SNAPSHOT with this patch.
* Produce with 0.8.2-beta new-producer, consume with 0.8.2-beta high-level consumer
* Bounce 0.8.2 node
* Bounce 0.8.3 node
* Produce with 0.8.3 new-producer, consume with 0.8.3 high-level consumer
Makes sense? Anything I'm missing? I'd like to bake these tests into system_test, but I've no clue where to begin. Pointers will be super useful.
Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch The goal is to eventually support different security mechanisms on different ports.
Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264127#comment-14264127 ] Gwen Shapira edited comment on KAFKA-1809 at 1/5/15 7:23 PM: - Here's the testing plan:
* Start a Kafka 0.8.2-beta cluster (3 nodes).
* Sanity-test it.
* Replace one node with 0.8.3-SNAPSHOT with this patch.
* Produce with 0.8.2-beta new-producer, consume with 0.8.2-beta high-level consumer (use producers that specify both old and new brokers in their config)
* Bounce 0.8.2 node
* Bounce 0.8.3 node
* Stop the cluster and make sure you start 0.8.3 node first (so it will be a controller)
* Produce with 0.8.3 new-producer, consume with 0.8.3 high-level consumer (use producers that specify both old and new brokers in their config)
Makes sense? Anything I'm missing? I'd like to bake these tests into system_test, but I've no clue where to begin. Pointers will be super useful.
was (Author: gwenshap): Here's the testing plan:
* Start a Kafka 0.8.2-beta cluster (3 nodes).
* Sanity-test it.
* Replace one node with 0.8.3-SNAPSHOT with this patch.
* Produce with 0.8.2-beta new-producer, consume with 0.8.2-beta high-level consumer
* Bounce 0.8.2 node
* Bounce 0.8.3 node
* Stop the cluster and make sure you start 0.8.3 node first (so it will be a controller)
* Produce with 0.8.3 new-producer, consume with 0.8.3 high-level consumer
Makes sense? Anything I'm missing? I'd like to bake these tests into system_test, but I've no clue where to begin. Pointers will be super useful.
Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29072: Patch for kafka-1797
On Jan. 2, 2015, 5:34 p.m., Jay Kreps wrote:
I'm a little late to this party but I think there are a couple of issues we should address before the release:
1. Currently the producer defaults to org.apache.kafka.clients.consumer.ByteArrayDeserializer. It should not take a default. Defaulting to the byte array serializer was a huge source of confusion with the old producer, because people would try to put in objects of other types (after changing the parametric types, thinking that mattered) and then get a very confusing error message. By having no default and requiring the user to set this themselves, they will not be confused.
2. The serializer and deserializer interfaces are in with the producer and consumer respectively, but the SerializationException and DeserializationException are under common. The serializers will need to be shared more broadly than just the consumer and producer--we may end up serializing other things. These should all go under org.apache.kafka.common.serialization. I also recommend we have a single SerializationException (no point in distinguishing serialization vs deserialization).
3. We should include a StringSerializer/StringDeserializer.
4. We should implement the ByteArray(De)Serializer and String(De)Serializer as Object, Object so that we can do the type check in the serializer and provide good error messages if the wrong type is provided (e.g. You are using the ByteArraySerializer, which expects only byte[], but have provided an object of type X; you have probably configured the wrong serializer and need to check your configuration.)
1. Makes sense.
2. Makes sense.
3. Yes.
4. This is possible, but doesn't work well when the serializer is passed into the constructor directly.
- Jun --- This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/29072/ --- (Updated Dec. 17, 2014, 5:47 p.m.) Review request for kafka. Bugs: kafka-1797 https://issues.apache.org/jira/browse/kafka-1797 Repository: kafka Description --- fix imports address Neha's comments Diffs - clients/src/main/java/org/apache/kafka/clients/consumer/ByteArrayDeserializer.java PRE-CREATION clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 227f5646ee708af1b861c15237eda2140cfd4900 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 46efc0c8483acacf42b2984ac3f3b9e0a4566187 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRebalanceCallback.java f026ae41ce8203928e411f049002851952af5d65 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecord.java 436d8a479166eda29f2672b50fc99f288bbe3fa9 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java 2ecfc8aaea90a7353bd0dabc4c0ebcc6fd9535ec clients/src/main/java/org/apache/kafka/clients/consumer/Deserializer.java PRE-CREATION clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java fe93afa24fc20b03830f1d190a276041d15bd3b9 clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c3aad3b4d6b677f759583f309061193f2f109250 clients/src/main/java/org/apache/kafka/clients/producer/ByteArraySerializer.java PRE-CREATION clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 32f444ebbd27892275af7a0947b86a6b8317a374 clients/src/main/java/org/apache/kafka/clients/producer/Producer.java 36e8398416036cab84faad1f07159e5adefd8086 clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 9095caf0db1e41a4acb4216fb197626fbd85b806 clients/src/main/java/org/apache/kafka/clients/producer/ProducerRecord.java c3181b368b6cf15e7134b04e8ff5655a9321ee0b clients/src/main/java/org/apache/kafka/clients/producer/Serializer.java PRE-CREATION clients/src/main/java/org/apache/kafka/clients/producer/internals/Partitioner.java 
40e8234f8771098b097bf757a86d5ac98604df5e clients/src/main/java/org/apache/kafka/common/errors/DeserializationException.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/errors/SerializationException.java PRE-CREATION core/src/main/scala/kafka/producer/BaseProducer.scala b0207930dd0543f2c51f0b35002e13bf104340ff core/src/main/scala/kafka/producer/KafkaLog4jAppender.scala 4b5b823b85477394cd50eb2a66877a3b8b35b57f core/src/main/scala/kafka/tools/MirrorMaker.scala f399105087588946987bbc84e3759935d9498b6a
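Point 4 in Jay's review — have the byte-array serializer accept Object and do its own runtime type check so misconfiguration yields a clear error — can be sketched like this. The class name, the nested SerializationException, and the message wording are all illustrative, not the shipped Kafka API.

```java
// Sketch of a serializer that performs its own runtime type check and throws
// a descriptive error, instead of letting a ClassCastException surface with a
// confusing message when the wrong serializer is configured.
public class TypeCheckingSerializerSketch {
    static class SerializationException extends RuntimeException {
        SerializationException(String msg) { super(msg); }
    }

    /** Accepts Object so the check (and the error message) lives here. */
    public static byte[] serialize(String topic, Object data) {
        if (data == null)
            return null;
        if (!(data instanceof byte[]))
            throw new SerializationException(
                "ByteArraySerializer expects byte[] but received an object of type "
                + data.getClass().getName()
                + "; you have probably configured the wrong serializer for topic "
                + topic);
        return (byte[]) data;
    }
}
```

As Jun notes, this helps when the serializer class comes from config; it cannot catch a mismatch at compile time when a typed serializer instance is passed into the constructor directly.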
Latency Tracking Across All Kafka Components
Hi Kafka Team/Users, We are using the LinkedIn Kafka data pipeline end-to-end: Producer(s) - Local DC Brokers - MM - Central Brokers - Camus Job - HDFS. This is working out very well for us, but we need visibility into the latency at each layer (Local DC Brokers - MM - Central Brokers - Camus Job - HDFS). Our events are time-based (stamped with the time the event was produced). Is there any feature for this, or any audit trail such as the one mentioned at https://github.com/linkedin/camus/ ? I would like to know the in-between latency and the time an event spends at each hop, so we can tell where the problem is and what to optimize. Is any of this covered in 0.9.0 or any other upcoming Kafka release? How might we achieve this latency tracking across all components? Thanks, Bhavesh
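One way to get per-hop latency without any special Kafka feature is to embed the produce timestamp in each event and have every stage record its own arrival time; the difference between consecutive stages is the time spent at that hop. The sketch below illustrates the arithmetic only; the class name, stage names, and numbers are hypothetical, not part of Kafka or Camus.

```java
// Sketch (assumption: events carry their produce timestamp, and each stage
// logs an arrival timestamp). Per-hop latency is the difference between
// consecutive timestamps along the pipeline.
import java.util.LinkedHashMap;
import java.util.Map;

public class HopLatency {
    // arrivalMillis: wall-clock arrival time observed at each stage, in
    // pipeline order (insertion order is preserved by LinkedHashMap).
    public static Map<String, Long> perHopLatency(long produceMillis,
                                                  Map<String, Long> arrivalMillis) {
        Map<String, Long> latency = new LinkedHashMap<>();
        long prev = produceMillis;
        for (Map.Entry<String, Long> e : arrivalMillis.entrySet()) {
            latency.put(e.getKey(), e.getValue() - prev); // time spent reaching this hop
            prev = e.getValue();
        }
        return latency;
    }

    public static void main(String[] args) {
        long produced = 1_000L; // hypothetical produce timestamp
        Map<String, Long> arrivals = new LinkedHashMap<>();
        arrivals.put("local-broker", 1_050L);
        arrivals.put("mirror-maker", 1_400L);
        arrivals.put("central-broker", 1_450L);
        arrivals.put("camus-hdfs", 61_450L);
        System.out.println(perHopLatency(produced, arrivals));
        // {local-broker=50, mirror-maker=350, central-broker=50, camus-hdfs=60000}
    }
}
```

Note that this assumes reasonably synchronized clocks across hosts; with NTP drift the per-hop numbers are approximate.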
Review Request 29590: 1. Removed defaults for serializer/deserializer. 2. Converted type cast exception to serialization exception in the producer. 3. Added string ser/deser. 4. Moved the isKey flag t
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29590/
---

Review request for kafka.

Bugs: kafka-1797
    https://issues.apache.org/jira/browse/kafka-1797

Repository: kafka

Description
---
addressing Jay's comments

Diffs
---
  clients/src/main/java/org/apache/kafka/clients/consumer/ByteArrayDeserializer.java 514cbd2c27a8d1ce13489d315f7880dfade7ffde
  clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 1d64f08762b0c33fcaebde0f41039b327060215a
  clients/src/main/java/org/apache/kafka/clients/consumer/Deserializer.java c774a199db71fbc00776cd1256af57b2d9e55a66
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java fe9066388f4b7910512d85ef088a1b96749735ac
  clients/src/main/java/org/apache/kafka/clients/producer/ByteArraySerializer.java 9005b74a328c997663232fe3a0999b25d2267efe
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java d859fc588a276eb36bcfd621ae6d7978ad0decdd
  clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 9cdc13d6cbb372b350acf90a21538b8ba495d2e8
  clients/src/main/java/org/apache/kafka/clients/producer/Serializer.java de87f9c1caeadd176195be75d0db43fc0a518380
  clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 3d4ab7228926f50309c07f0672f33416ce4fa37f
  clients/src/main/java/org/apache/kafka/common/errors/DeserializationException.java a5433398fb9788e260a4250da32e4be607f3f207
  clients/src/main/java/org/apache/kafka/common/serialization/StringDeserializer.java PRE-CREATION
  clients/src/main/java/org/apache/kafka/common/serialization/StringSerializer.java PRE-CREATION
  clients/src/test/java/org/apache/kafka/common/serialization/SerializationTest.java PRE-CREATION
  core/src/main/scala/kafka/producer/KafkaLog4jAppender.scala e194942492324092f811b86f9c1f28f79b366cfd
  core/src/main/scala/kafka/tools/ConsoleProducer.scala 397d80da08c925757649b7d104d8360f56c604c3
  core/src/main/scala/kafka/tools/MirrorMaker.scala 2126f6e55c5ec6a418165d340cc9a4f445af5045
  core/src/main/scala/kafka/tools/ProducerPerformance.scala f2dc4ed2f04f0e9656e10b02db5ed1d39c4a4d39
  core/src/main/scala/kafka/tools/ReplayLogProducer.scala f541987b2876a438c43ea9088ae8fed708ba82a3
  core/src/main/scala/kafka/tools/TestEndToEndLatency.scala 2ebc7bf643ea91bd93ba37b8e64e8a5a9bb37ece
  core/src/main/scala/kafka/tools/TestLogCleaning.scala b81010ec0fa9835bfe48ce6aad0c491cdc67e7ef
  core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala 1505fd4464dc9ac71cce52d9b64406a21e5e45d2
  core/src/test/scala/integration/kafka/api/ProducerSendTest.scala 6196060edf9f1650720ec916f88933953a1daa2c
  core/src/test/scala/unit/kafka/utils/TestUtils.scala 94d0028d8c4907e747aa8a74a13d719b974c97bf

Diff: https://reviews.apache.org/r/29590/diff/

Testing
---

Thanks,
Jun Rao
[jira] [Updated] (KAFKA-1797) add the serializer/deserializer api to the new java client
[ https://issues.apache.org/jira/browse/KAFKA-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Rao updated KAFKA-1797: --- Attachment: kafka-1797.patch add the serializer/deserializer api to the new java client -- Key: KAFKA-1797 URL: https://issues.apache.org/jira/browse/KAFKA-1797 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8.2 Reporter: Jun Rao Assignee: Jun Rao Fix For: 0.8.2 Attachments: kafka-1797.patch, kafka-1797.patch, kafka-1797.patch, kafka-1797_2014-12-09_18:48:33.patch, kafka-1797_2014-12-15_15:36:24.patch, kafka-1797_2014-12-17_09:47:45.patch Currently, the new java clients take a byte array for both the key and the value. While this api is simple, it pushes the serialization/deserialization logic into the application. This makes it hard to reason about what type of data flows through Kafka and also makes it hard to share an implementation of the serializer/deserializer. For example, to support Avro, the serialization logic could be quite involved since it might need to register the Avro schema in some remote registry and maintain a schema cache locally, etc. Without a serialization api, it's impossible to share such an implementation so that people can easily reuse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
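The plug-in point described above can be sketched as a pair of small generic interfaces plus a shareable string implementation. This is an illustrative sketch only; the exact interfaces committed under KAFKA-1797 may differ (for example, the real ones also take configuration and an isKey flag).

```java
// Illustrative sketch of a serializer/deserializer API for the new Java
// client (KAFKA-1797). Signatures are assumptions, not the committed code.
import java.nio.charset.StandardCharsets;

interface Serializer<T> {
    byte[] serialize(String topic, T data);
}

interface Deserializer<T> {
    T deserialize(String topic, byte[] data);
}

// A shareable implementation: once this exists, applications no longer each
// reimplement String <-> byte[] conversion around the byte-array client.
class StringSerializer implements Serializer<String> {
    public byte[] serialize(String topic, String data) {
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }
}

class StringDeserializer implements Deserializer<String> {
    public String deserialize(String topic, byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }
}

public class SerdeSketch {
    public static void main(String[] args) {
        byte[] bytes = new StringSerializer().serialize("t", "hello");
        System.out.println(new StringDeserializer().deserialize("t", bytes));
        // prints "hello"
    }
}
```

An Avro serializer would implement the same interface but carry the schema-registry client and local schema cache inside it, which is exactly the kind of implementation the issue argues should be shareable.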
[jira] [Commented] (KAFKA-1797) add the serializer/deserializer api to the new java client
[ https://issues.apache.org/jira/browse/KAFKA-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264990#comment-14264990 ] Jun Rao commented on KAFKA-1797: Created reviewboard https://reviews.apache.org/r/29590/diff/ against branch origin/0.8.2 add the serializer/deserializer api to the new java client -- Key: KAFKA-1797 URL: https://issues.apache.org/jira/browse/KAFKA-1797 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8.2 Reporter: Jun Rao Assignee: Jun Rao Fix For: 0.8.2 Attachments: kafka-1797.patch, kafka-1797.patch, kafka-1797.patch, kafka-1797_2014-12-09_18:48:33.patch, kafka-1797_2014-12-15_15:36:24.patch, kafka-1797_2014-12-17_09:47:45.patch Currently, the new java clients take a byte array for both the key and the value. While this api is simple, it pushes the serialization/deserialization logic into the application. This makes it hard to reason about what type of data flows through Kafka and also makes it hard to share an implementation of the serializer/deserializer. For example, to support Avro, the serialization logic could be quite involved since it might need to register the Avro schema in some remote registry and maintain a schema cache locally, etc. Without a serialization api, it's impossible to share such an implementation so that people can easily reuse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1512) Limit the maximum number of connections per ip address
[ https://issues.apache.org/jira/browse/KAFKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264667#comment-14264667 ] Joe Stein commented on KAFKA-1512: -- I was just about to commit this and realized it was introduced in 0.8.2 but is not fully complete, so we should have this as a patch for both 0.8.2 and trunk. [~jholoman], can you upload a 0.8.2 patch as well so we can double-commit this to 0.8.2 and trunk, please? If there are no objections to having this in 0.8.2, it seems reasonable: since it was introduced in this release, we shouldn't ship something incomplete when we have a fix available now. Limit the maximum number of connections per ip address -- Key: KAFKA-1512 URL: https://issues.apache.org/jira/browse/KAFKA-1512 Project: Kafka Issue Type: New Feature Affects Versions: 0.8.2 Reporter: Jay Kreps Assignee: Jeff Holoman Fix For: 0.8.2 Attachments: KAFKA-1512.patch, KAFKA-1512.patch, KAFKA-1512_2014-07-03_15:17:55.patch, KAFKA-1512_2014-07-14_13:28:15.patch, KAFKA-1512_2014-12-23_21:47:23.patch To protect against client connection leaks add a new configuration max.connections.per.ip that causes the SocketServer to enforce a limit on the maximum number of connections from each InetAddress instance. For backwards compatibility this will default to 2 billion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1512) Limit the maximum number of connections per ip address
[ https://issues.apache.org/jira/browse/KAFKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1512: - Affects Version/s: 0.8.2 Limit the maximum number of connections per ip address -- Key: KAFKA-1512 URL: https://issues.apache.org/jira/browse/KAFKA-1512 Project: Kafka Issue Type: New Feature Affects Versions: 0.8.2 Reporter: Jay Kreps Assignee: Jeff Holoman Fix For: 0.8.2 Attachments: KAFKA-1512.patch, KAFKA-1512.patch, KAFKA-1512_2014-07-03_15:17:55.patch, KAFKA-1512_2014-07-14_13:28:15.patch, KAFKA-1512_2014-12-23_21:47:23.patch To protect against client connection leaks add a new configuration max.connections.per.ip that causes the SocketServer to enforce a limit on the maximum number of connections from each InetAddress instance. For backwards compatibility this will default to 2 billion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
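The enforcement described in KAFKA-1512 amounts to per-InetAddress connection accounting in the accept path. The sketch below shows the idea only; the class and method names are hypothetical and are not the actual SocketServer code from the patch.

```java
// Sketch of per-IP connection accounting like the max.connections.per.ip
// limit described above. Names are assumptions, not the committed code.
import java.net.InetAddress;
import java.util.HashMap;
import java.util.Map;

public class ConnectionQuotas {
    private final int maxPerIp; // would come from max.connections.per.ip
    private final Map<InetAddress, Integer> counts = new HashMap<>();

    public ConnectionQuotas(int maxPerIp) {
        this.maxPerIp = maxPerIp;
    }

    // Called when a new connection is accepted; false means "reject/close".
    public synchronized boolean tryAccept(InetAddress addr) {
        int current = counts.getOrDefault(addr, 0);
        if (current >= maxPerIp) {
            return false; // this IP is at its limit
        }
        counts.put(addr, current + 1);
        return true;
    }

    // Called when a connection closes, freeing a slot for that IP.
    public synchronized void release(InetAddress addr) {
        counts.merge(addr, -1, Integer::sum);
        if (counts.getOrDefault(addr, 0) <= 0) {
            counts.remove(addr); // don't let the map grow unbounded
        }
    }
}
```

With the backward-compatible default of roughly 2 billion (Integer.MAX_VALUE), the limit is effectively never hit unless an operator lowers it.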