[jira] [Commented] (KAFKA-2456) Disable SSLv3 for ssl.enabledprotocols config on client & broker side

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966882#comment-14966882
 ] 

ASF GitHub Bot commented on KAFKA-2456:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/342

KAFKA-2456 KAFKA-2472; SSL clean-ups



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-2472-fix-kafka-ssl-config-warnings

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/342.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #342


commit 6955ae23aa951ef3a2bd3a9dc88411b3f5769ffb
Author: Ismael Juma 
Date:   2015-10-20T07:57:40Z

Remove unused strings to silence compiler warning

commit 5b2a1bcc6e430eb7f0f2bf480afcc16cec6c4a81
Author: Ismael Juma 
Date:   2015-10-21T09:46:10Z

Remove `channelConfigs` and use `values` instead

There's not much value in using the former and it's error-prone.

Also include a couple of minor improvements to `KafkaConfig`.

commit 48bce07dc39023ea5b9f8dfae99b852a622430e3
Author: Ismael Juma 
Date:   2015-10-21T09:57:54Z

Add missing `define` for `SSLEndpointIdentificationAlgorithmProp` in broker

commit 1e4f1c37db26bad474efc4e85c980d1b265887fb
Author: Ismael Juma 
Date:   2015-10-21T12:45:45Z

KAFKA-2472; Fix SSL config warnings

commit a923a999338c6c6d09c40852d5dced71c8192ff2
Author: Ismael Juma 
Date:   2015-10-21T12:50:29Z

KAFKA-2456; Disable SSLv3 for ssl.enabledprotocols config on client & 
broker side




> Disable SSLv3 for ssl.enabledprotocols config on client & broker side
> -
>
> Key: KAFKA-2456
> URL: https://issues.apache.org/jira/browse/KAFKA-2456
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Sriharsha Chintalapani
>Assignee: Ismael Juma
> Fix For: 0.9.0.0
>
>
> This is a follow-up on KAFKA-1690. Currently users have the option to pass in 
> SSLv3; we should not be allowing this as it's deprecated.
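
As an illustration (not part of the ticket), a minimal sketch of a client pinned to TLS protocol versions once SSLv3 is disallowed; it assumes the ssl.enabled.protocols key is the setting the title refers to, and all values below are placeholders:

{code}
import java.util.Properties;

public class TlsOnlyClientConfig {
    // Sketch: restrict an SSL client to TLS protocol versions only (no SSLv3).
    // Assumes the ssl.enabled.protocols key; all values below are placeholders.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9093");
        props.put("security.protocol", "SSL");
        props.put("ssl.enabled.protocols", "TLSv1.2,TLSv1.1,TLSv1");
        props.put("ssl.truststore.location", "/path/to/truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        return props;
    }
}
{code}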





[jira] [Updated] (KAFKA-2456) Disable SSLv3 for ssl.enabledprotocols config on client & broker side

2015-10-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2456:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: In Progress)

> Disable SSLv3 for ssl.enabledprotocols config on client & broker side
> -
>
> Key: KAFKA-2456
> URL: https://issues.apache.org/jira/browse/KAFKA-2456
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Sriharsha Chintalapani
>Assignee: Ismael Juma
> Fix For: 0.9.0.0
>
>
> This is a follow-up on KAFKA-1690. Currently users have the option to pass in 
> SSLv3; we should not be allowing this as it's deprecated.





[VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Flavio Junqueira
Thanks everyone for the feedback so far. At this point, I'd like to start a 
vote for KIP-38.

Summary: Add support for ZooKeeper authentication
KIP page: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
 


Thanks,
-Flavio

[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-21 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966892#comment-14966892
 ] 

Sriharsha Chintalapani commented on KAFKA-2644:
---

[~rsivaram] we already have SaslTestHarness.scala, which starts MiniKDC. If you 
want to run a full KDC, you can look at the Vagrant setup.

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]





[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-21 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967298#comment-14967298
 ] 

Rajini Sivaram commented on KAFKA-2644:
---

[~ijuma]  [~harsha_ch] Thank you both for your input. I will look at the test 
harness and vagrant files and see which is easier to integrate into the 
ducktape tests.
[~geoffra]  [~ewencp] Can you let me know if you think one of these is better 
suited to the ducktape test environment? We need to run either MiniKDC or full 
KDC on one of the VMs. Running MiniKDC involves packaging the jars which are 
currently test dependencies of core project and running a Java class with these 
jars in the classpath. Running KDC requires the installation steps from the 
vagrant files above.

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]





[GitHub] kafka pull request: KAFKA-2456 KAFKA-2472; SSL clean-ups

2015-10-21 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/342

KAFKA-2456 KAFKA-2472; SSL clean-ups



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-2472-fix-kafka-ssl-config-warnings

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/342.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #342


commit 6955ae23aa951ef3a2bd3a9dc88411b3f5769ffb
Author: Ismael Juma 
Date:   2015-10-20T07:57:40Z

Remove unused strings to silence compiler warning

commit 5b2a1bcc6e430eb7f0f2bf480afcc16cec6c4a81
Author: Ismael Juma 
Date:   2015-10-21T09:46:10Z

Remove `channelConfigs` and use `values` instead

There's not much value in using the former and it's error-prone.

Also include a couple of minor improvements to `KafkaConfig`.

commit 48bce07dc39023ea5b9f8dfae99b852a622430e3
Author: Ismael Juma 
Date:   2015-10-21T09:57:54Z

Add missing `define` for `SSLEndpointIdentificationAlgorithmProp` in broker

commit 1e4f1c37db26bad474efc4e85c980d1b265887fb
Author: Ismael Juma 
Date:   2015-10-21T12:45:45Z

KAFKA-2472; Fix SSL config warnings

commit a923a999338c6c6d09c40852d5dced71c8192ff2
Author: Ismael Juma 
Date:   2015-10-21T12:50:29Z

KAFKA-2456; Disable SSLv3 for ssl.enabledprotocols config on client & 
broker side






[jira] [Created] (KAFKA-2680) Zookeeper SASL check prevents any SASL code being run with IBM JDK

2015-10-21 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-2680:
-

 Summary: Zookeeper SASL check prevents any SASL code being run 
with IBM JDK
 Key: KAFKA-2680
 URL: https://issues.apache.org/jira/browse/KAFKA-2680
 Project: Kafka
  Issue Type: Bug
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
Priority: Blocker
 Fix For: 0.9.0.0


Vendor-specific code in JaasUtils prevents Kafka from running with the IBM JDK if 
a JAAS configuration file is provided.

{quote}
java.security.NoSuchAlgorithmException: JavaLoginConfig Configuration not available
at sun.security.jca.GetInstance.getInstance(GetInstance.java:210)
at javax.security.auth.login.Configuration.getInstance(Configuration.java:341)
at org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:100)
{quote}
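
For context, a hypothetical sketch of a JDK-neutral probe (illustrative only, not the actual JaasUtils code): instead of requesting the Sun-specific "JavaLoginConfig" provider via Configuration.getInstance, the installed login Configuration can be read directly, which works on any JDK.

{code}
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

final class ZkSecurityCheck {
    // Vendor-neutral: Configuration.getConfiguration() avoids the provider
    // lookup that throws NoSuchAlgorithmException on the IBM JDK.
    static boolean isZkSecurityEnabled(String loginContextName) {
        AppConfigurationEntry[] entries =
            Configuration.getConfiguration().getAppConfigurationEntry(loginContextName);
        return entries != null && entries.length > 0;
    }
}
{code}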






Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Jun Rao
It seems that in the secure -> unsecure plan, step 3 needs to be done
before step 2.

Thanks,

Jun

On Wed, Oct 21, 2015 at 3:59 PM, Flavio Junqueira  wrote:

> Ok, thanks for the feedback, Todd. I have updated the KIP with some of the
> points discussed here. There is more to add based on these last comments,
> though.
>
> -Flavio
>
> > On 21 Oct 2015, at 23:43, Todd Palino  wrote:
> >
> > On Wed, Oct 21, 2015 at 3:38 PM, Flavio Junqueira  > wrote:
> >
> >>
> >>> On 21 Oct 2015, at 21:54, Todd Palino  wrote:
> >>>
> >>> Thanks for the clarification on that, Jun. Obviously, we haven't been
> >> doing
> >>> much with ZK authentication around here yet. There is still a small
> >> concern
> >>> there, mostly in that you should not share credentials any more than is
> >>> necessary, which would argue for being able to use a different ACL than
> >> the
> >>> default. I don't really like the idea of having to use the exact same
> >>> credentials for executing the admin tools as we do for running the
> >> brokers.
> >>> Given that we don't need to share the credentials with all consumers, I
> >>> think we can work around it.
> >>>
> >>
> >> Let me add that a feature to separate the sub-trees of users sharing an
> >> ensemble is chroot.
> >>
> >> On different credentials for admin tools, this sounds doable by setting
> >> the ACLs of znodes. For example, there could be an admin id and a broker
> >> id, both with the ability of changing znodes, but different credentials.
> >> Would something like that work for you?
> >>
> >
> > It would be a nice option to have, as the credentials can be protected
> > differently. I would consider this a nice to have, and not an "absolutely
> > must have" feature at this point.
> >
> >
> >> This does bring up another good question, however. What will be the
> >> process
> >>> for having to rotate the credentials? That is, if the credentials are
> >>> compromised and need to be changed, how can that be accomplished with
> the
> >>> cluster online. I'm guessing some combination of using skipAcl on the
> >>> Zookeeper ensemble and config changes to the brokers will be required,
> >> but
> >>> this is an important enough operation that we should make sure it's
> >>> reasonable to perform and that it is documented.
> >>
> >> Right now there is no kafka support in the plan for this. But this is
> >> doable directly through the zk api. Would it be sufficient to write down
> >> how to perform such an operation via the zk api or do we need a tool to
> do
> >> it?
> >>
> >
> > I think as long as there is a documented procedure for how to do it, that
> > will be good enough. It's mostly about making sure that we can, and that
> we
> > don't put something in place that would require downtime to a cluster in
> > order to change credentials. We can always develop a tool later if it is
> a
> > requested item.
> >
> > Thanks!
> >
> > -Todd
> >
> >
> >
> >>
> >> -Flavio
> >>
> >>>
> >>>
> >>> On Wed, Oct 21, 2015 at 1:23 PM, Jun Rao  wrote:
> >>>
>  Parth,
> 
>  For 2), in your approach, the broker/controller will then always have
> >> the
>  overhead of resetting the ACL on startup after zookeeper.set.acl is
> set
> >> to
>  true. The benefit of using a separate migration tool is that you paid
> >> the
>  cost only once during upgrade. It is an extra step during the upgrade.
>  However, given the other things that you need to do to upgrade to
> 0.9.0
>  (e.g. two rounds of rolling upgrades on all brokers, etc), I am not
> >> sure if
>  it's worth optimizing away this step. We probably just need to
> >> document
>  this clearly.
> 
>  Todd,
> 
>  Just to be clear about the shared ZK usage. Once you set
> >> CREATOR_ALL_ACL +
>  READ_ACL_UNSAFE on a path, only ZK clients with the same user as the
>  creator can modify the path. Other ZK clients authenticated with a
>  different user can read, but not modify the path. Are you concerned
> >> about
>  the reads or the writes to ZK?
> 
>  Thanks,
> 
>  Jun
> 
> 
> 
>  On Wed, Oct 21, 2015 at 10:46 AM, Flavio Junqueira 
> >> wrote:
> 
> >
> >> On 21 Oct 2015, at 18:07, Parth Brahmbhatt <
>  pbrahmbh...@hortonworks.com>
> > wrote:
> >>
> >> I have 2 suggestions:
> >>
> >> 1) We need to document how does one move from secure to non secure
> >> environment:
> >> 1) change the config on all brokers to zookeeper.set.acl = false
> > and do a
> >> rolling upgrade.
> >> 2) Run the migration script with the jaas config file so it is
>  sasl
> >> authenticated with zookeeper and change the acls on all subtrees
> back
>  to
> >> World modifiable.
> >> 3) remove the jaas config / or only the zookeeper section from
>  the
> > jaas,
> 

[GitHub] kafka pull request: MINOR: Update to Gradle 2.8

2015-10-21 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/343

MINOR: Update to Gradle 2.8

There have been a number of improvements between the version we are 
currently using (2.4) and the latest version (2.8):

https://gradle.org/docs/2.5/release-notes
https://gradle.org/docs/2.6/release-notes
https://gradle.org/docs/2.7/release-notes
http://gradle.org/docs/current/release-notes

I'm particularly interested in the performance improvements.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka gradle-2.8

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/343.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #343


commit afbc0fb23cb21ae79cd6392af0ac835429353b1b
Author: Ismael Juma 
Date:   2015-10-21T16:54:40Z

Update to Gradle 2.8






[jira] [Commented] (KAFKA-2459) Connection backoff/blackout period should start when a connection is disconnected, not when the connection attempt was initiated

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967459#comment-14967459
 ] 

ASF GitHub Bot commented on KAFKA-2459:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/290


> Connection backoff/blackout period should start when a connection is 
> disconnected, not when the connection attempt was initiated
> 
>
> Key: KAFKA-2459
> URL: https://issues.apache.org/jira/browse/KAFKA-2459
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, producer 
>Affects Versions: 0.8.2.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Eno Thereska
> Fix For: 0.9.0.0
>
>
> Currently the connection code for new clients marks the time when a 
> connection was initiated (NodeConnectionState.lastConnectMs) and then uses 
> this to compute blackout periods for nodes, during which connections will not 
> be attempted and the node is not considered a candidate for leastLoadedNode.
> However, in cases where the connection attempt takes longer than the 
> blackout/backoff period (default 10ms), this results in incorrect behavior. 
> If a broker is not available and, for example, the broker does not explicitly 
> reject the connection, instead waiting for a connection timeout (e.g. due to 
> firewall settings), then the backoff period will have already elapsed and the 
> node will immediately be considered ready for a new connection attempt and a 
> node to be selected by leastLoadedNode for metadata updates. I think it 
> should be easy to reproduce and verify this problem manually by using tc to 
> introduce enough latency to make connection failures take > 10ms.
> The correct behavior would use the disconnection event to mark the end of the 
> last connection attempt and then wait for the backoff period to elapse after 
> that.
> See 
> http://mail-archives.apache.org/mod_mbox/kafka-users/201508.mbox/%3CCAJY8EofpeU4%2BAJ%3Dw91HDUx2RabjkWoU00Z%3DcQ2wHcQSrbPT4HA%40mail.gmail.com%3E
>  for the original description of the problem.
> This is related to KAFKA-1843 because leastLoadedNode currently will 
> consistently choose the same node if this blackout period is not handled 
> correctly, but is a much smaller issue.
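
A rough sketch of the corrected behavior described above (illustrative names, not the actual client code; the 10ms figure is the default backoff mentioned in the description):

{code}
// Sketch: start the blackout window at the *end* of a connection attempt
// (the disconnect event), not when the attempt was initiated.
final class NodeConnectionState {
    static final long RECONNECT_BACKOFF_MS = 10L; // default noted above
    long lastAttemptEndMs = 0L;

    void onDisconnected(long nowMs) {
        lastAttemptEndMs = nowMs; // previously stamped at connection start
    }

    boolean isBlackedOut(long nowMs) {
        return nowMs - lastAttemptEndMs < RECONNECT_BACKOFF_MS;
    }
}
{code}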





Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Jay Kreps
+1

-Jay

On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira  wrote:

> Thanks everyone for the feedback so far. At this point, I'd like to start
> a vote for KIP-38.
>
> Summary: Add support for ZooKeeper authentication
> KIP page:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> >
>
> Thanks,
> -Flavio


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Grant Henke
+1

Is it worth mentioning the follow up steps that were discussed in the KIP
call in this KIP document? Some of them were:

   - Adding SSL support for Zookeeper
   - Removing the "world readable" assumption

Thank you,
Grant

On Wed, Oct 21, 2015 at 10:23 AM, Onur Karaman <
okara...@linkedin.com.invalid> wrote:

> +1
>
> On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira  wrote:
>
> > Thanks everyone for the feedback so far. At this point, I'd like to start
> > a vote for KIP-38.
> >
> > Summary: Add support for ZooKeeper authentication
> > KIP page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> > >
> >
> > Thanks,
> > -Flavio
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Flavio Junqueira
Todd,

There is a discuss thread for this KIP and we talked about it during the KIP 
call yesterday. I'm more than happy to add detail to the KIP if you bring it up 
in the discuss thread. I'd rather not stop the vote thread, though.

Thanks,
-Flavio

> On 21 Oct 2015, at 17:41, Todd Palino  wrote:
> 
> While this is a great idea, is it really ready for vote? I don't see any
> detail in the wiki about what trees will be secured, and whether or not
> that is configurable. I also don't see anything about how the use of admin
> tools is going to be addressed.
> 
> -Todd
> 
> On Wed, Oct 21, 2015 at 8:48 AM, Grant Henke  wrote:
> 
>> +1
>> 
>> Is it worth mentioning the follow up steps that were discussed in the KIP
>> call in this KIP document? Some of them were:
>> 
>>   - Adding SSL support for Zookeeper
>>   - Removing the "world readable" assumption
>> 
>> Thank you,
>> Grant
>> 
>> On Wed, Oct 21, 2015 at 10:23 AM, Onur Karaman <
>> okara...@linkedin.com.invalid> wrote:
>> 
>>> +1
>>> 
>>> On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira 
>> wrote:
>>> 
 Thanks everyone for the feedback so far. At this point, I'd like to
>> start
 a vote for KIP-38.
 
 Summary: Add support for ZooKeeper authentication
 KIP page:
 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
 <
 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> 
 
 Thanks,
 -Flavio
>>> 
>> 
>> 
>> 
>> --
>> Grant Henke
>> Software Engineer | Cloudera
>> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>> 



Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Todd Palino
There seems to be a bit of detail lacking in the KIP. Specifically, I'd
like to understand:

1) What znodes are the brokers going to secure? Is this configurable? How?
2) What ACL is the broker going to apply? Is this configurable?
3) How will the admin tools (such as preferred replica election and
partition reassignment) interact with this?

-Todd


On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma  wrote:

> On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira  wrote:
>
> > Bringing the points Grant brought to this thread:
> >
> > > Is it worth mentioning the follow up steps that were discussed in the
> KIP
> > > call in this KIP document? Some of them were:
> > >
> > >   - Adding SSL support for Zookeeper
> > >   - Removing the "world readable" assumption
> > >
> >
> > Grant, how would you do it? I see three options:
> >
> > 1- Add to the existing KIP, but then the functionality we should be
> > checking in soon won't include it, so the KIP will remain incomplete
> >
>
> A "Future work" section would make sense to me, but I don't know how this
> is normally handled.
>
> Ismael
>


[jira] [Updated] (KAFKA-1687) SASL unit tests

2015-10-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-1687:
---
Summary: SASL unit tests  (was: SASL tests)

> SASL unit tests
> ---
>
> Key: KAFKA-1687
> URL: https://issues.apache.org/jira/browse/KAFKA-1687
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: Gwen Shapira
>
> We need tests for our SASL/Kerberos setup. This is not that easy to do with 
> Kerberos because of the dependency on the KDC. However possibly we can test 
> with another SASL mechanism that doesn't have that dependency?





[jira] [Updated] (KAFKA-1687) SASL unit tests

2015-10-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-1687:
---
Assignee: sriharsha chintalapani  (was: Gwen Shapira)

> SASL unit tests
> ---
>
> Key: KAFKA-1687
> URL: https://issues.apache.org/jira/browse/KAFKA-1687
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: sriharsha chintalapani
>
> We need tests for our SASL/Kerberos setup. This is not that easy to do with 
> Kerberos because of the dependency on the KDC. However possibly we can test 
> with another SASL mechanism that doesn't have that dependency?





Re: Java high level consumer providing duplicate messages when auto commit is off

2015-10-21 Thread James Cheng
Do you have multiple consumers in a consumer group?

I think that when a new consumer joins the consumer group, the existing 
consumers will stop consuming during the group rebalance, and then when they 
start consuming again, they will consume from the last committed offset.

You should get more verification on this, though. I might be remembering wrong.

-James

> On Oct 21, 2015, at 8:40 AM, Cliff Rhyne  wrote:
>
> Hi,
>
> My team and I are looking into a problem where the Java high level consumer
> provides duplicate messages if we turn auto commit off (using version
> 0.8.2.1 of the server and Java client).  The expected sequence of events
> are:
>
> 1. Start high-level consumer and initialize a KafkaStream to get a
> ConsumerIterator
> 2. Consume n items (could be 10,000, could be 1,000,000) from the iterator
> 3. Commit the new offsets
>
> What we are seeing is that during step 2, some number of the n messages are
> getting returned by the iterator in duplicate (in some cases, we've seen
> n*5 messages consumed).  The problem appears to go away if we turn on auto
> commit (and committing offsets to kafka helped too), but auto commit causes
> conflicts with our offset rollback logic.  The issue seems to happen more
> when we are in our test environment on a lower-cost cloud provider.
>
> Diving into the Java and Scala classes including the ConsumerIterator, it's
> not obvious what event causes a duplicate offset to be requested or
> returned (there's even a loop that is supposed to exclude duplicate
> messages in this class).  I tried turning on trace logging but my log4j
> config isn't getting the Kafka client logs to write out.
>
> Does anyone have suggestions of where to look or how to enable logging?
>
> Thanks,
> Cliff






[jira] [Commented] (KAFKA-2658) Implement SASL/PLAIN

2015-10-21 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967340#comment-14967340
 ] 

Rajini Sivaram commented on KAFKA-2658:
---

[~ijuma] [~harsha_ch] [~junrao] Do you have time to review this PR? I 
refactored some unit test code to reuse it for SASL tests, so the changeset 
looks bigger than the actual code changes. The main change is the config option 
for the SASL mechanism and the addition of SASL/PLAIN support. It would be of 
great help to us if this could be included in 0.9.0.0. Thank you...

> Implement SASL/PLAIN
> 
>
> Key: KAFKA-2658
> URL: https://issues.apache.org/jira/browse/KAFKA-2658
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> KAFKA-1686 supports SASL/Kerberos using GSSAPI. We should enable more SASL 
> mechanisms. SASL/PLAIN would enable a simpler use of SASL, which along with 
> SSL provides a secure Kafka that uses username/password for client 
> authentication.
> SASL/PLAIN protocol and its uses are described in 
> [https://tools.ietf.org/html/rfc4616]. It is supported in Java.
> This should be implemented after KAFKA-1686. This task should also hopefully 
> enable simpler unit testing of the SASL code.





Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Flavio Junqueira
Bringing the points Grant brought to this thread:

> Is it worth mentioning the follow up steps that were discussed in the KIP
> call in this KIP document? Some of them were:
> 
>   - Adding SSL support for Zookeeper
>   - Removing the "world readable" assumption
> 

Grant, how would you do it? I see three options:

1- Add to the existing KIP, but then the functionality we should be checking in 
soon won't include it, so the KIP will remain incomplete
2- Create jiras and write a new KIP later
3- Create a new KIP now for the missing (but desirable) features

-Flavio

> On 19 Oct 2015, at 18:20, Flavio Junqueira  wrote:
> 
> I've created the following KIP and I'd appreciate any comment on the proposal:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
>  
> 
> 
> This is in progress and there is code for most of it already. Please check 
> the corresponding PR if you're interested.
> 
> Thanks,
> -Flavio



Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Todd Palino
I've added a reply on the discuss thread already. However, the point is
that if there were changes as a result of the KIP call (which I often can't
make on Tuesdays), it should be updated on the wiki page so everyone is
aware of what is being voted on.

-Todd


On Wed, Oct 21, 2015 at 9:47 AM, Flavio Junqueira  wrote:

> Todd,
>
> There is a discuss thread for this KIP and we talked about it during the
> KIP call yesterday. I'm more than happy to add detail to the KIP if you
> bring it up in the discuss thread. I'd rather not stop the vote thread,
> though.
>
> Thanks,
> -Flavio
>
> > On 21 Oct 2015, at 17:41, Todd Palino  wrote:
> >
> > While this is a great idea, is it really ready for vote? I don't see any
> > detail in the wiki about what trees will be secured, and whether or not
> > that is configurable. I also don't see anything about how the use of
> admin
> > tools is going to be addressed.
> >
> > -Todd
> >
> > On Wed, Oct 21, 2015 at 8:48 AM, Grant Henke 
> wrote:
> >
> >> +1
> >>
> >> Is it worth mentioning the follow up steps that were discussed in the
> KIP
> >> call in this KIP document? Some of them were:
> >>
> >>   - Adding SSL support for Zookeeper
> >>   - Removing the "world readable" assumption
> >>
> >> Thank you,
> >> Grant
> >>
> >> On Wed, Oct 21, 2015 at 10:23 AM, Onur Karaman <
> >> okara...@linkedin.com.invalid> wrote:
> >>
> >>> +1
> >>>
> >>> On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira 
> >> wrote:
> >>>
>  Thanks everyone for the feedback so far. At this point, I'd like to
> >> start
>  a vote for KIP-38.
> 
>  Summary: Add support for ZooKeeper authentication
>  KIP page:
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
>  <
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> >
> 
>  Thanks,
>  -Flavio
> >>>
> >>
> >>
> >>
> >> --
> >> Grant Henke
> >> Software Engineer | Cloudera
> >> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >>
>
>


[jira] [Resolved] (KAFKA-1687) SASL unit tests

2015-10-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-1687.

Resolution: Fixed

This was done as part of KAFKA-1686.

> SASL unit tests
> ---
>
> Key: KAFKA-1687
> URL: https://issues.apache.org/jira/browse/KAFKA-1687
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: sriharsha chintalapani
>
> We need tests for our SASL/Kerberos setup. This is not that easy to do with 
> Kerberos because of the dependency on the KDC. However possibly we can test 
> with another SASL mechanism that doesn't have that dependency?





Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Jay Kreps
Yeah let's definitely get a complete description of the user-facing impact,
especially the changes to the command-line tools. As much as anything the
purpose of these KIPs is to fully capture what the user's life will be like
in the next release.

Also, if I understand correctly we aren't securing the consumer subtree so
it will be possible to run the scala consumer (or other non-secure
consumer) against a secured ZK cluster?

-Jay

On Wed, Oct 21, 2015 at 9:56 AM, Flavio Junqueira  wrote:

>
> > On 21 Oct 2015, at 17:47, Todd Palino  wrote:
> >
> > There seems to be a bit of detail lacking in the KIP. Specifically, I'd
> > like to understand:
> >
> > 1) What znodes are the brokers going to secure? Is this configurable?
> How?
>
> Currently it is securing all paths here except the consumers one:
>
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
> <
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
> >
>
> This isn't configurable at the moment.
>
> > 2) What ACL is the broker going to apply? Is this configurable?
>
> The default is CREATOR_ALL_ACL + READ_ACL_UNSAFE, which means that an
> authenticated client can manipulate secured znodes and everyone can read
> znodes. The API of ZkUtils accommodates other ACLs, but the current code is
> using the default.
>
> > 3) How will the admin tools (such as preferred replica election and
> > partition reassignment) interact with this?
> >
>
> Currently, you need to set a system property passing the login config file
> to be able to authenticate the client and perform writes to ZK.
>
> -Flavio
>
> > -Todd
> >
> >
> > On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma  wrote:
> >
> >> On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira 
> wrote:
> >>
> >>> Bringing the points Grant brought to this thread:
> >>>
>  Is it worth mentioning the follow up steps that were discussed in the
> >> KIP
>  call in this KIP document? Some of them were:
> 
>   - Adding SSL support for Zookeeper
>   - Removing the "world readable" assumption
> 
> >>>
> >>> Grant, how would you do it? I see three options:
> >>>
> >>> 1- Add to the existing KIP, but then the functionality we should be
> >>> checking in soon won't include it, so the KIP will remain incomplete
> >>>
> >>
> >> A "Future work" section would make sense to me, but I don't know how
> this
> >> is normally handled.
> >>
> >> Ismael
> >>
>
>


Java high level consumer providing duplicate messages when auto commit is off

2015-10-21 Thread Cliff Rhyne
Hi,

My team and I are looking into a problem where the Java high level consumer
provides duplicate messages if we turn auto commit off (using version
0.8.2.1 of the server and Java client).  The expected sequence of events
are:

1. Start high-level consumer and initialize a KafkaStream to get a
ConsumerIterator
2. Consume n items (could be 10,000, could be 1,000,000) from the iterator
3. Commit the new offsets
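
For reference, a minimal sketch of this sequence against the 0.8.2 high-level consumer API; the topic, group, ZooKeeper address, and the batch size n are placeholders:

{code}
import java.util.Collections;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class HighLevelConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "example-group");
        props.put("auto.commit.enable", "false");          // auto commit off
        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        KafkaStream<byte[], byte[]> stream = connector     // step 1
            .createMessageStreams(Collections.singletonMap("example-topic", 1))
            .get("example-topic").get(0);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        for (int i = 0; i < 10000 && it.hasNext(); i++)    // step 2: consume n items
            it.next();
        connector.commitOffsets();                         // step 3
    }
}
{code}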

What we are seeing is that during step 2, some number of the n messages are
getting returned by the iterator in duplicate (in some cases, we've seen
n*5 messages consumed).  The problem appears to go away if we turn on auto
commit (and committing offsets to kafka helped too), but auto commit causes
conflicts with our offset rollback logic.  The issue seems to happen more
when we are in our test environment on a lower-cost cloud provider.

Diving into the Java and Scala classes including the ConsumerIterator, it's
not obvious what event causes a duplicate offset to be requested or
returned (there's even a loop that is supposed to exclude duplicate
messages in this class).  I tried turning on trace logging but my log4j
config isn't getting the Kafka client logs to write out.

Does anyone have suggestions of where to look or how to enable logging?

Thanks,
Cliff


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Todd Palino
While this is a great idea, is it really ready for vote? I don't see any
detail in the wiki about what trees will be secured, and whether or not
that is configurable. I also don't see anything about how the use of admin
tools is going to be addressed.

-Todd

On Wed, Oct 21, 2015 at 8:48 AM, Grant Henke  wrote:

> +1
>
> Is it worth mentioning the follow up steps that were discussed in the KIP
> call in this KIP document? Some of them were:
>
>- Adding SSL support for Zookeeper
>- Removing the "world readable" assumption
>
> Thank you,
> Grant
>
> On Wed, Oct 21, 2015 at 10:23 AM, Onur Karaman <
> okara...@linkedin.com.invalid> wrote:
>
> > +1
> >
> > On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira 
> wrote:
> >
> > > Thanks everyone for the feedback so far. At this point, I'd like to
> start
> > > a vote for KIP-38.
> > >
> > > Summary: Add support for ZooKeeper authentication
> > > KIP page:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> > > >
> > >
> > > Thanks,
> > > -Flavio
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


[GitHub] kafka pull request: KAFKA-2459: connection backoff, timeouts and r...

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/290




[jira] [Updated] (KAFKA-2459) Connection backoff/blackout period should start when a connection is disconnected, not when the connection attempt was initiated

2015-10-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2459:
-
   Resolution: Fixed
Fix Version/s: 0.9.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 290
[https://github.com/apache/kafka/pull/290]

> Connection backoff/blackout period should start when a connection is 
> disconnected, not when the connection attempt was initiated
> 
>
> Key: KAFKA-2459
> URL: https://issues.apache.org/jira/browse/KAFKA-2459
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, producer 
>Affects Versions: 0.8.2.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Eno Thereska
> Fix For: 0.9.0.0
>
>
> Currently the connection code for new clients marks the time when a 
> connection was initiated (NodeConnectionState.lastConnectMs) and then uses 
> this to compute blackout periods for nodes, during which connections will not 
> be attempted and the node is not considered a candidate for leastLoadedNode.
> However, in cases where the connection attempt takes longer than the 
> blackout/backoff period (default 10ms), this results in incorrect behavior. 
> If a broker is not available and, for example, the broker does not explicitly 
> reject the connection, instead waiting for a connection timeout (e.g. due to 
> firewall settings), then the backoff period will have already elapsed and the 
> node will immediately be considered ready for a new connection attempt and a 
> node to be selected by leastLoadedNode for metadata updates. I think it 
> should be easy to reproduce and verify this problem manually by using tc to 
> introduce enough latency to make connection failures take > 10ms.
> The correct behavior would use the disconnection event to mark the end of the 
> last connection attempt and then wait for the backoff period to elapse after 
> that.
> See 
> http://mail-archives.apache.org/mod_mbox/kafka-users/201508.mbox/%3CCAJY8EofpeU4%2BAJ%3Dw91HDUx2RabjkWoU00Z%3DcQ2wHcQSrbPT4HA%40mail.gmail.com%3E
>  for the original description of the problem.
> This is related to KAFKA-1843 because leastLoadedNode currently will 
> consistently choose the same node if this blackout period is not handled 
> correctly, but is a much smaller issue.





Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Ismael Juma
On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira  wrote:

> Bringing the points Grant brought to this thread:
>
> > Is it worth mentioning the follow up steps that were discussed in the KIP
> > call in this KIP document? Some of them were:
> >
> >   - Adding SSL support for Zookeeper
> >   - Removing the "world readable" assumption
> >
>
> Grant, how would you do it? I see three options:
>
> 1- Add to the existing KIP, but then the functionality we should be
> checking in soon won't include it, so the KIP will remain incomplete
>

A "Future work" section would make sense to me, but I don't know how this
is normally handled.

Ismael


[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-21 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967415#comment-14967415
 ] 

Ewen Cheslack-Postava commented on KAFKA-2644:
--

[~rsivaram] To be honest, I do not know nearly enough about those tools to 
offer much guidance. From ducktape's perspective, there probably isn't much 
difference since all ducktape cares about is that you can write a Service class 
to wrap it.

Is MiniKDC enough of a replacement for a full setup to write all the system 
tests we'd want to? If it is, we'd probably just need to make sure there's an 
easy way to run it standalone since you want to run it as a separate service, 
not in a test harness class.

If we already have MiniKDC setup for our code, I'd say the things to consider 
are:
1. Can we get away with only MiniKDC? Are there any tests where it won't be 
sufficient?
2. Is MiniKDC realistic enough? Any limitations in functionality or anything 
like that which would limit our ability to write tests?

If we end up needing KDC, then you'd just want to integrate the installation 
into the Vagrant "provisioner" scripts so the necessary jars are available on 
the VMs in a fixed location that the system tests can use. Depending on the 
sort of manual testing people might want to do with a Vagrant test-bed, it 
might be handy to have that setup regardless of which one the system tests use!

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]





Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Flavio Junqueira

> On 21 Oct 2015, at 17:47, Todd Palino  wrote:
> 
> There seems to be a bit of detail lacking in the KIP. Specifically, I'd
> like to understand:
> 
> 1) What znodes are the brokers going to secure? Is this configurable? How?

Currently it is securing all paths here except the consumers one:

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
 


This isn't configurable at the moment.

> 2) What ACL is the broker going to apply? Is this configurable?

The default is CREATOR_ALL_ACL + READ_ACL_UNSAFE, which means that an 
authenticated client can manipulate secured znodes and everyone can read 
znodes. The API of ZkUtils accommodates other ACLs, but the current code is 
using the default.
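
To make the default concrete, a sketch (using the plain ZooKeeper client, not ZkUtils itself) of creating a znode with CREATOR_ALL_ACL + READ_ACL_UNSAFE; the connect string and path are placeholders:

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;

public class SecureZnodeExample {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        List<ACL> acls = new ArrayList<>(Ids.CREATOR_ALL_ACL); // creator: all perms
        acls.addAll(Ids.READ_ACL_UNSAFE);                      // world: read-only
        zk.create("/secured/example", new byte[0], acls, CreateMode.PERSISTENT);
        zk.close();
    }
}
{code}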

> 3) How will the admin tools (such as preferred replica election and
> partition reassignment) interact with this?
> 

Currently, you need to set a system property passing the login config file to 
be able to authenticate the client and perform writes to ZK.
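
For example (the file path here is a placeholder), either programmatically before creating the ZK client or via the equivalent -D flag on the command line:

{code}
// Equivalent to passing -Djava.security.auth.login.config=... to the JVM.
System.setProperty("java.security.auth.login.config", "/etc/kafka/zk_client_jaas.conf");
{code}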

-Flavio

> -Todd
> 
> 
> On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma  wrote:
> 
>> On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira  wrote:
>> 
>>> Bringing the points Grant brought to this thread:
>>> 
 Is it worth mentioning the follow up steps that were discussed in the
>> KIP
 call in this KIP document? Some of them were:
 
  - Adding SSL support for Zookeeper
  - Removing the "world readable" assumption
 
>>> 
>>> Grant, how would you do it? I see three options:
>>> 
>>> 1- Add to the existing KIP, but then the functionality we should be
>>> checking in soon won't include it, so the KIP will remain incomplete
>>> 
>> 
>> A "Future work" section would make sense to me, but I don't know how this
>> is normally handled.
>> 
>> Ismael
>> 



Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Parth Brahmbhatt
I have 2 suggestions:

1) We need to document how one moves from a secure to a non-secure
environment:
1) change the config on all brokers to zookeeper.set.acl = false and do a
rolling upgrade.
2) Run the migration script with the jaas config file so it is sasl
authenticated with zookeeper and change the acls on all subtrees back to
World modifiable.
3) remove the jaas config / or only the zookeeper section from the jaas,
and restart all brokers.

2) I am not sure if we should force users trying to move from an unsecure to a
secure environment to execute the migration script. In the second step,
once zookeeper.set.acl is set to true, we can secure all the subtrees
by calling ensureCorrectAcls as part of broker initialization (just after
makeSurePersistentPathExists). Not sure why we want to add one more
manual/admin step when it can be automated. This also has the added
advantage that the migration script will not have to take a flag as input to
figure out if it should set the acls to secure or unsecure, given it will
always be used to move from secure to unsecure.

Given we are assuming all the information in zookeeper is world readable,
I don't see SSL support as a must-have or a blocker for this KIP.

Thanks
Parth



On 10/21/15, 9:56 AM, "Flavio Junqueira"  wrote:

>
>> On 21 Oct 2015, at 17:47, Todd Palino  wrote:
>> 
>> There seems to be a bit of detail lacking in the KIP. Specifically, I'd
>> like to understand:
>> 
>> 1) What znodes are the brokers going to secure? Is this configurable?
>>How?
>
>Currently it is securing all paths here except the consumers one:
>
>https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
>
>This isn't configurable at the moment.
>
>> 2) What ACL is the broker going to apply? Is this configurable?
>
>The default is CREATOR_ALL_ACL + READ_ACL_UNSAFE, which means that an
>authenticated client can manipulate secured znodes and everyone can read
>znodes. The API of ZkUtils accommodates other ACLs, but the current code
>is using the default.
>
>> 3) How will the admin tools (such as preferred replica election and
>> partition reassignment) interact with this?
>> 
>
>Currently, you need to set a system property passing the login config
>file to be able to authenticate the client and perform writes to ZK.
>
>-Flavio
>
>> -Todd
>> 
>> 
>> On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma  wrote:
>> 
>>> On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira 
>>>wrote:
>>> 
 Bringing the points Grant brought to this thread:
 
> Is it worth mentioning the follow up steps that were discussed in the
>>> KIP
> call in this KIP document? Some of them were:
> 
>  - Adding SSL support for Zookeeper
>  - Removing the "world readable" assumption
> 
 
 Grant, how would you do it? I see three options:
 
 1- Add to the existing KIP, but then the functionality we should be
 checking in soon won't include it, so the KIP will remain incomplete
 
>>> 
>>> A "Future work" section would make sense to me, but I don't know how
>>>this
>>> is normally handled.
>>> 
>>> Ismael
>>> 
>



Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Onur Karaman
+1

On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira  wrote:

> Thanks everyone for the feedback so far. At this point, I'd like to start
> a vote for KIP-38.
>
> Summary: Add support for ZooKeeper authentication
> KIP page:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> >
>
> Thanks,
> -Flavio


[jira] [Commented] (KAFKA-2626) Null offsets in copycat causes exception in OffsetStorageWriter

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967544#comment-14967544
 ] 

ASF GitHub Bot commented on KAFKA-2626:
---

GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/345

KAFKA-2626: Handle null keys and value validation properly in 
OffsetStorageWriter.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka kafka-2626-offset-storage-writer-null-values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #345


commit b89f1f9bc214169b232e592ce1126d25c4e6e9da
Author: Ewen Cheslack-Postava 
Date:   2015-10-21T17:47:14Z

KAFKA-2626: Handle null keys and value validation properly in 
OffsetStorageWriter.




> Null offsets in copycat causes exception in OffsetStorageWriter
> ---
>
> Key: KAFKA-2626
> URL: https://issues.apache.org/jira/browse/KAFKA-2626
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> {quote}
> [2015-10-07 16:20:39,052] ERROR CRITICAL: Failed to serialize offset data, 
> making it impossible to commit offsets under namespace wikipedia-irc-source. 
> This likely won't recover unless the unserializable partition or offset 
> information is overwritten. 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:152)
> [2015-10-07 16:20:39,053] ERROR Cause of serialization failure: 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:155)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] ERROR Failed to flush 
> org.apache.kafka.copycat.runtime.WorkerSourceTask$2@12782f6 offsets to 
> storage:  (org.apache.kafka.copycat.runtime.WorkerSourceTask:227)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] INFO Starting graceful shutdown of thread 
> WorkerSourceTask-wikipedia-irc-source-0 
> (org.apache.kafka.copycat.util.ShutdownableThread:119)
> [2015-10-07 16:20:39,056] INFO Herder stopped 
> (org.apache.kafka.copycat.runtime.standalone.StandaloneHerder:64)
> [2015-10-07 16:20:39,056] INFO Worker stopping 
> (org.apache.kafka.copycat.runtime.Worker:104)
> [2015-10-07 16:20:39,056] INFO Stopped FileOffsetBackingStore 
> (org.apache.kafka.copycat.storage.FileOffsetBackingStore:61)
> [2015-10-07 16:20:39,056] INFO Worker stopped 
> (org.apache.kafka.copycat.runtime.Worker:133)
> [2015-10-07 16:20:39,057] INFO Copycat stopped 
> (org.apache.kafka.copycat.runtime.Copycat:69)
> {quote}
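
A hypothetical sketch of the null-tolerant validation implied by the fix (names are illustrative; the actual change is in the PR above):

{code}
// Sketch: treat a null offsets map as "nothing to validate" rather than
// dereferencing it, which is what produced the NullPointerException above.
static void validateFormat(java.util.Map<String, Object> offsetData) {
    if (offsetData == null)
        return; // null offsets are a legal, empty state
    for (Object value : offsetData.values()) {
        if (value != null && !(value instanceof Long) && !(value instanceof String))
            throw new IllegalArgumentException(
                "Unsupported offset value type: " + value.getClass().getName());
    }
}
{code}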





[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-21 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967569#comment-14967569
 ] 

Sriharsha Chintalapani commented on KAFKA-2644:
---

[~rsivaram] as long as you can create valid keytabs, it should be fine, and with 
MiniKDC we already do that.

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]





[jira] [Commented] (KAFKA-1695) Authenticate connection to Zookeeper

2015-10-21 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967487#comment-14967487
 ] 

Ismael Juma commented on KAFKA-1695:


This ticket has been broken down into KAFKA-2639, KAFKA-2640 and KAFKA-2641.

> Authenticate connection to Zookeeper
> 
>
> Key: KAFKA-1695
> URL: https://issues.apache.org/jira/browse/KAFKA-1695
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: Parth Brahmbhatt
> Fix For: 0.9.0.0
>
>
> We need to make it possible to secure the Zookeeper cluster Kafka is using. 
> This would make use of the normal authentication ZooKeeper provides. 
> ZooKeeper supports a variety of authentication mechanisms so we will need to 
> figure out what has to be passed in to the zookeeper client.
> The intention is that when the current round of client work is done it should 
> be possible to run without clients needing access to Zookeeper so all we need 
> here is to make it so that only the Kafka cluster is able to read and write 
> to the Kafka znodes  (we shouldn't need to set any kind of acl on a per-znode 
> basis).





[jira] [Commented] (KAFKA-2667) Copycat KafkaBasedLogTest.testSendAndReadToEnd transient failure

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967584#comment-14967584
 ] 

ASF GitHub Bot commented on KAFKA-2667:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/333


> Copycat KafkaBasedLogTest.testSendAndReadToEnd transient failure
> 
>
> Key: KAFKA-2667
> URL: https://issues.apache.org/jira/browse/KAFKA-2667
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Jason Gustafson
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> Seen in recent builds:
> {code}
> org.apache.kafka.copycat.util.KafkaBasedLogTest > testSendAndReadToEnd FAILED
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.kafka.copycat.util.KafkaBasedLogTest.testSendAndReadToEnd(KafkaBasedLogTest.java:335)
> {code}





[jira] [Created] (KAFKA-2681) SASL authentication in official docs

2015-10-21 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-2681:
--

 Summary: SASL authentication in official docs
 Key: KAFKA-2681
 URL: https://issues.apache.org/jira/browse/KAFKA-2681
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ismael Juma
 Fix For: 0.9.0.0


We need to add a section in the official documentation regarding SASL 
authentication:

http://kafka.apache.org/documentation.html





[jira] [Assigned] (KAFKA-2682) Authorization section in official docs

2015-10-21 Thread Parth Brahmbhatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Brahmbhatt reassigned KAFKA-2682:
---

Assignee: Parth Brahmbhatt

> Authorization section in official docs
> --
>
> Key: KAFKA-2682
> URL: https://issues.apache.org/jira/browse/KAFKA-2682
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Parth Brahmbhatt
> Fix For: 0.9.0.0
>
>
> We need to add a section in the official documentation regarding 
> authorization:
> http://kafka.apache.org/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2626) Null offsets in copycat causes exception in OffsetStorageWriter

2015-10-21 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2626:
-
Reviewer: Guozhang Wang
  Status: Patch Available  (was: Open)

> Null offsets in copycat causes exception in OffsetStorageWriter
> ---
>
> Key: KAFKA-2626
> URL: https://issues.apache.org/jira/browse/KAFKA-2626
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> {quote}
> [2015-10-07 16:20:39,052] ERROR CRITICAL: Failed to serialize offset data, 
> making it impossible to commit offsets under namespace wikipedia-irc-source. 
> This likely won't recover unless the unserializable partition or offset 
> information is overwritten. 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:152)
> [2015-10-07 16:20:39,053] ERROR Cause of serialization failure: 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:155)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] ERROR Failed to flush 
> org.apache.kafka.copycat.runtime.WorkerSourceTask$2@12782f6 offsets to 
> storage:  (org.apache.kafka.copycat.runtime.WorkerSourceTask:227)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] INFO Starting graceful shutdown of thread 
> WorkerSourceTask-wikipedia-irc-source-0 
> (org.apache.kafka.copycat.util.ShutdownableThread:119)
> [2015-10-07 16:20:39,056] INFO Herder stopped 
> (org.apache.kafka.copycat.runtime.standalone.StandaloneHerder:64)
> [2015-10-07 16:20:39,056] INFO Worker stopping 
> (org.apache.kafka.copycat.runtime.Worker:104)
> [2015-10-07 16:20:39,056] INFO Stopped FileOffsetBackingStore 
> (org.apache.kafka.copycat.storage.FileOffsetBackingStore:61)
> [2015-10-07 16:20:39,056] INFO Worker stopped 
> (org.apache.kafka.copycat.runtime.Worker:133)
> [2015-10-07 16:20:39,057] INFO Copycat stopped 
> (org.apache.kafka.copycat.runtime.Copycat:69)
> {quote}
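
The NPE above comes from validateFormat dereferencing a null offset map. As an
illustration only, here is a hedged sketch of the kind of null guard involved,
with a simplified signature and exception type (both assumptions, not the
actual patch):

{code}
import java.util.Map;

public class OffsetValidationSketch {
    // Simplified stand-in for OffsetUtils.validateFormat: a source task may
    // legitimately have recorded no offsets yet, so a null map is not an error.
    public static void validateFormat(Map<String, Object> offsetData) {
        if (offsetData == null)
            return; // nothing to validate
        for (Map.Entry<String, Object> entry : offsetData.entrySet()) {
            Object value = entry.getValue();
            // accept only simple types that an offset serializer can handle
            if (value != null && !(value instanceof Number
                    || value instanceof String || value instanceof Boolean))
                throw new IllegalArgumentException(
                        "Unsupported offset value type: " + value.getClass());
        }
    }
}
{code}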



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-21 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967560#comment-14967560
 ] 

Rajini Sivaram commented on KAFKA-2644:
---

[~ewencp] [~geoffra] Thank you both for your input. I will create a service 
using MiniKDC and submit for review. I think the functionality of MiniKDC 
should be sufficient for our tests. [~ijuma] [~harsha_ch] Please let me know if 
that is not the case.

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Todd Palino
Comments inline.

In addition, the documentation on the migration path is good, but do we
really need a separate utility? Would it be better to have checking and
setting the ACLs be a function of the controller, possibly as a separate
thread that runs either only at controller startup or periodically? Information
should be available about when the check runs and when it completes (I would
like to see this in a metric - we already have problems determining whether or
not the log compaction thread is running). This would provide continuous
coverage of the ACLs.

Also, what is the downgrade plan?


On Wed, Oct 21, 2015 at 9:56 AM, Flavio Junqueira  wrote:

>
> > On 21 Oct 2015, at 17:47, Todd Palino  wrote:
> >
> > There seems to be a bit of detail lacking in the KIP. Specifically, I'd
> > like to understand:
> >
> > 1) What znodes are the brokers going to secure? Is this configurable?
> How?
>
> Currently it is securing all paths here except the consumers one:
>
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
> <
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
> >
>
> This isn't configurable at the moment.
>

That's fine. As long as the consumers tree is exempted, and the admin tools
continue to work properly, I don't see any problems with this not being
configurable. All of those paths are specific to the brokers.



> > 2) What ACL is the broker going to apply? Is this configurable?
>
> The default is CREATOR_ALL_ACL + READ_ACL_UNSAFE, which means that an
> authenticated client can manipulate secured znodes and everyone can read
> znodes. The API of ZkUtils accommodates other ACLs, but the current code is
> using the default.
>

I think we should consider making this configurable. A specific use case I
can see: in an environment where multiple applications share the ZooKeeper
ensemble your Kafka cluster is in, you will want to protect those
applications and Kafka separately. We may not want to do this as part of
the initial KIP work, as I think it's important to get this in as soon as
possible in some form (we have problems that this will address), but what
do you think about making this improvement shortly thereafter?
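
For concreteness, the default Flavio describes maps to the standard ZooKeeper
ACL constants. A minimal sketch against the plain ZooKeeper client, assuming a
placeholder connection string and path, and a SASL-authenticated session
(required for the creator-only part to bind to a user):

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;

public class SecureZnodeSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string; no watcher needed for this sketch.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, null);

        // CREATOR_ALL_ACL: full rights for the authenticated creator only.
        // READ_ACL_UNSAFE: world-readable, i.e. "everyone can read znodes".
        List<ACL> acls = new ArrayList<>(ZooDefs.Ids.CREATOR_ALL_ACL);
        acls.addAll(ZooDefs.Ids.READ_ACL_UNSAFE);

        // -1 accepts any ACL version; /brokers is just an example path.
        zk.setACL("/brokers", acls, -1);
        zk.close();
    }
}
{code}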


> > 3) How will the admin tools (such as preferred replica election and
> > partition reassignment) interact with this?
> >
>
> Currently, you need to set a system property passing the login config file
> to be able to authenticate the client and perform writes to ZK.
>

OK. I'll assume that your changes will include modifying the tools and
script helpers to make this easy to do :)

All of this should be clearly documented in the KIP wiki as well.
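
For reference, this is what that system property looks like when set
programmatically; the file path, keytab, and principal are made-up
placeholders, and "Client" is the login context name the ZooKeeper client
looks up by default:

{code}
// Equivalent to passing -Djava.security.auth.login.config=... on the command
// line; must be set before the ZooKeeper client connects.
System.setProperty("java.security.auth.login.config",
        "/etc/kafka/zk_client_jaas.conf");

// /etc/kafka/zk_client_jaas.conf would contain something like:
//
// Client {
//     com.sun.security.auth.module.Krb5LoginModule required
//     useKeyTab=true
//     keyTab="/etc/security/keytabs/kafka.keytab"
//     principal="kafka/broker1.example.com@EXAMPLE.COM";
// };
{code}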


> -Flavio
>
> > -Todd
> >
> >
> > On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma  wrote:
> >
> >> On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira 
> wrote:
> >>
> >>> Bringing the points Grant brought to this thread:
> >>>
>  Is it worth mentioning the follow up steps that were discussed in the
> >> KIP
>  call in this KIP document? Some of them were:
> 
>   - Adding SSL support for Zookeeper
>   - Removing the "world readable" assumption
> 
> >>>
> >>> Grant, how would you do it? I see three options:
> >>>
> >>> 1- Add to the existing KIP, but then the functionality we should be
> >>> checking in soon won't include it, so the KIP will remain incomplete
> >>>
> >>
> >> A "Future work" section would make sense to me, but I don't know how
> this
> >> is normally handled.
> >>
> >> Ismael
> >>
>
>


[jira] [Commented] (KAFKA-2681) SASL authentication in official docs

2015-10-21 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967493#comment-14967493
 ] 

Ismael Juma commented on KAFKA-2681:


cc [~harsha_ch]

> SASL authentication in official docs
> 
>
> Key: KAFKA-2681
> URL: https://issues.apache.org/jira/browse/KAFKA-2681
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
> Fix For: 0.9.0.0
>
>
> We need to add a section in the official documentation regarding SASL 
> authentication:
> http://kafka.apache.org/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2682) Authorization section in official docs

2015-10-21 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967491#comment-14967491
 ] 

Ismael Juma commented on KAFKA-2682:


cc [~parth.brahmbhatt]

> Authorization section in official docs
> --
>
> Key: KAFKA-2682
> URL: https://issues.apache.org/jira/browse/KAFKA-2682
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
> Fix For: 0.9.0.0
>
>
> We need to add a section in the official documentation regarding 
> authorization:
> http://kafka.apache.org/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Question about compressed messages...

2015-10-21 Thread Robert Thille

I’m working on a Twisted Python Kafka client and I was wondering what the ‘key’ 
on a gzip’d block of messages “means”.  

That is, if the client has a batch of messages to send, with a mix of keys, 
would it be a bug to batch them together and gzip them into a single message? Or is 
the key on the outer “message” ignored and/or expected to be set to null?

Thanks,

Robert

—
Robert P. Thille | Senior Software Engineer, Blue Planet
rthi...@ciena.com | 1383 N. McDowell Blvd. Suite 300 | Petaluma, CA 94954
Direct +1.707.735.2300 | Mobile +1.707.861.0042 






[GitHub] kafka pull request: KAFKA-2626: Handle null keys and value validat...

2015-10-21 Thread ewencp
Github user ewencp closed the pull request at:

https://github.com/apache/kafka/pull/344




[GitHub] kafka pull request: KAFKA-2626: Handle null keys and value validat...

2015-10-21 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/345

KAFKA-2626: Handle null keys and value validation properly in 
OffsetStorageWriter.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka 
kafka-2626-offset-storage-writer-null-values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #345


commit b89f1f9bc214169b232e592ce1126d25c4e6e9da
Author: Ewen Cheslack-Postava 
Date:   2015-10-21T17:47:14Z

KAFKA-2626: Handle null keys and value validation properly in 
OffsetStorageWriter.






[jira] [Commented] (KAFKA-2626) Null offsets in copycat causes exception in OffsetStorageWriter

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967543#comment-14967543
 ] 

ASF GitHub Bot commented on KAFKA-2626:
---

Github user ewencp closed the pull request at:

https://github.com/apache/kafka/pull/344


> Null offsets in copycat causes exception in OffsetStorageWriter
> ---
>
> Key: KAFKA-2626
> URL: https://issues.apache.org/jira/browse/KAFKA-2626
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> {quote}
> [2015-10-07 16:20:39,052] ERROR CRITICAL: Failed to serialize offset data, 
> making it impossible to commit offsets under namespace wikipedia-irc-source. 
> This likely won't recover unless the unserializable partition or offset 
> information is overwritten. 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:152)
> [2015-10-07 16:20:39,053] ERROR Cause of serialization failure: 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:155)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] ERROR Failed to flush 
> org.apache.kafka.copycat.runtime.WorkerSourceTask$2@12782f6 offsets to 
> storage:  (org.apache.kafka.copycat.runtime.WorkerSourceTask:227)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] INFO Starting graceful shutdown of thread 
> WorkerSourceTask-wikipedia-irc-source-0 
> (org.apache.kafka.copycat.util.ShutdownableThread:119)
> [2015-10-07 16:20:39,056] INFO Herder stopped 
> (org.apache.kafka.copycat.runtime.standalone.StandaloneHerder:64)
> [2015-10-07 16:20:39,056] INFO Worker stopping 
> (org.apache.kafka.copycat.runtime.Worker:104)
> [2015-10-07 16:20:39,056] INFO Stopped FileOffsetBackingStore 
> (org.apache.kafka.copycat.storage.FileOffsetBackingStore:61)
> [2015-10-07 16:20:39,056] INFO Worker stopped 
> (org.apache.kafka.copycat.runtime.Worker:133)
> [2015-10-07 16:20:39,057] INFO Copycat stopped 
> (org.apache.kafka.copycat.runtime.Copycat:69)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #45

2015-10-21 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-2459: Mark last committed timestamp to fix connection backoff

--
[...truncated 4611 lines...]

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testSourceTasks 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testSourceTasksStdin PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testTaskClass 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testMultipleSourcesInvalid PASSED

org.apache.kafka.copycat.file.FileStreamSourceTaskTest > testNormalLifecycle 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceTaskTest > testMissingTopic PASSED
:copycat:json:checkstyleMain
:copycat:json:compileTestJavawarning: [options] bootstrap class path not set in 
conjunction with -source 1.7
Note: 
/x1/jenkins/jenkins-slave/workspace/kafka-trunk-jdk8/copycat/json/src/test/java/org/apache/kafka/copycat/json/JsonConverterTest.java
 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:copycat:json:processTestResources UP-TO-DATE
:copycat:json:testClasses
:copycat:json:checkstyleTest
:copycat:json:test

org.apache.kafka.copycat.json.JsonConverterTest > longToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToJsonConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
nullSchemaAndMapNonStringKeysToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndMapToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCopycatSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToCopycatConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaPrimitiveToCopycat 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatNonStringKeys 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToCopycat PASSED
:copycat:runtime:checkstyleMain
:copycat:runtime:compileTestJavawarning: [options] bootstrap class path not set 
in conjunction with -source 1.7
Note: 

[jira] [Commented] (KAFKA-2674) ConsumerRebalanceListener.onPartitionsRevoked() is not called on consumer close

2015-10-21 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967638#comment-14967638
 ] 

Guozhang Wang commented on KAFKA-2674:
--

This is a good point, especially if the user DOES NOT want some of the logic 
included in the callback to run upon shutting down. One edge case, though: 
closing the consumer also closes the coordinator, which will likely try to 
finish all in-flight requests by calling "maybeAutoCommitOffsetsSync", hence 
whatever users do before calling close() may not act on the final state of the 
consumer. Do we have any ideas about resolving this? [~becket_qin] [~hachikuji]

> ConsumerRebalanceListener.onPartitionsRevoked() is not called on consumer 
> close
> ---
>
> Key: KAFKA-2674
> URL: https://issues.apache.org/jira/browse/KAFKA-2674
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
>Reporter: Michal Turek
>Assignee: Neha Narkhede
>
> Hi, I'm investigating and testing behavior of new consumer from the planned 
> release 0.9 and found an inconsistency in calling of rebalance callbacks.
> I noticed that ConsumerRebalanceListener.onPartitionsRevoked() is NOT called 
> during consumer close and application shutdown. Its JavaDoc contract says:
> - "This method will be called before a rebalance operation starts and after 
> the consumer stops fetching data."
> - "It is recommended that offsets should be committed in this callback to 
> either Kafka or a custom offset store to prevent duplicate data."
> I believe calling consumer.close() is the start of a rebalance operation, and even 
> the local consumer that is actually closing should be notified so it is able to 
> process any rebalance logic, including committing offsets (e.g. if auto-commit is 
> disabled).
> There are commented logs of current and expected behaviors.
> {noformat}
> // Application start
> 2015-10-20 15:14:02.208 INFO  o.a.k.common.utils.AppInfoParser
> [TestConsumer-worker-0]: Kafka version : 0.9.0.0-SNAPSHOT 
> (AppInfoParser.java:82)
> 2015-10-20 15:14:02.208 INFO  o.a.k.common.utils.AppInfoParser
> [TestConsumer-worker-0]: Kafka commitId : 241b9ab58dcbde0c 
> (AppInfoParser.java:83)
> // Consumer started (the first one in group), rebalance callbacks are called 
> including empty onPartitionsRevoked()
> 2015-10-20 15:14:02.333 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, revoked: [] 
> (TestConsumer.java:95)
> 2015-10-20 15:14:02.343 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, assigned: [testB-1, testA-0, 
> testB-0, testB-3, testA-2, testB-2, testA-1, testA-4, testB-4, testA-3] 
> (TestConsumer.java:100)
> // Another consumer joined the group, rebalancing
> 2015-10-20 15:14:17.345 INFO  o.a.k.c.c.internals.Coordinator 
> [TestConsumer-worker-0]: Attempt to heart beat failed since the group is 
> rebalancing, try to re-join group. (Coordinator.java:714)
> 2015-10-20 15:14:17.346 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, revoked: [testB-1, testA-0, 
> testB-0, testB-3, testA-2, testB-2, testA-1, testA-4, testB-4, testA-3] 
> (TestConsumer.java:95)
> 2015-10-20 15:14:17.349 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, assigned: [testB-3, testA-4, 
> testB-4, testA-3] (TestConsumer.java:100)
> // Consumer started closing, there SHOULD be onPartitionsRevoked() callback 
> to commit offsets like during standard rebalance, but it is missing
> 2015-10-20 15:14:39.280 INFO  c.a.e.kafka.newapi.TestConsumer [main]: 
> Closing instance (TestConsumer.java:42)
> 2015-10-20 15:14:40.264 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Worker thread stopped (TestConsumer.java:89)
> {noformat}
> The workaround is to call onPartitionsRevoked() explicitly and manually just 
> before calling consumer.close(), but it seems dirty and error-prone to me. It 
> can simply be forgotten by someone without such experience.
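
A minimal sketch of that manual workaround against the new consumer API (the
listener and consumer are whatever the application already holds; this mirrors
the reporter's suggestion, it is not a fix):

{code}
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GracefulClose {
    // Fire the revocation callback for everything still assigned, then close.
    public static <K, V> void closeWithRevoke(KafkaConsumer<K, V> consumer,
                                              ConsumerRebalanceListener listener) {
        listener.onPartitionsRevoked(consumer.assignment());
        consumer.close();
    }
}
{code}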



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Java high level consumer providing duplicate messages when auto commit is off

2015-10-21 Thread Cliff Rhyne
Hi James,

There are two scenarios we run:

1. Multiple partitions with one consumer per partition.  This rarely has
starting/stopping of consumers, so the pool is very static.  There is a
configured consumer timeout, which is causing the ConsumerTimeoutException
to get thrown prior to the test starting.  We handle this exception and
then resume consuming.
2. Single partition with one consumer.  This consumer is started by a
triggered condition (number of messages pending to be processed in the
kafka topic or a schedule).  The consumer is stopped after processing is
completed.

In both cases, based on my understanding there shouldn't be a rebalance as
either a) all consumers are running or b) there's only one consumer /
partition.  Also, the same consumer group is used by all consumers in
scenario 1 and 2.  Is there a good way to investigate whether rebalances
are occurring?

Thanks,
Cliff
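
For reference, the consume-then-commit pattern under discussion looks roughly
like this with the 0.8.2 high-level consumer; the property values, topic name,
and batch size are placeholders, not our actual configuration:

{code}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ConsumeThenCommit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "test-group");              // placeholder
        props.put("auto.commit.enable", "false");         // manual commit, as discussed
        props.put("consumer.timeout.ms", "5000");         // makes hasNext() time out

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("testA", 1));
        ConsumerIterator<byte[], byte[]> it = streams.get("testA").get(0).iterator();

        int consumed = 0;
        try {
            while (consumed < 10000 && it.hasNext()) { // step 2: consume n items
                byte[] message = it.next().message();
                consumed++;
                // ... process the message ...
            }
        } catch (ConsumerTimeoutException e) {
            // no message within consumer.timeout.ms; fall through to the commit
        }
        connector.commitOffsets(); // step 3: commit the new offsets
        connector.shutdown();
    }
}
{code}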

On Wed, Oct 21, 2015 at 11:37 AM, James Cheng  wrote:

> Do you have multiple consumers in a consumer group?
>
> I think that when a new consumer joins the consumer group, that the
> existing consumers will stop consuming during the group rebalance, and then
> when they start consuming again, that they will consume from the last
> committed offset.
>
> You should get more verification on this, tho. I might be remembering
> wrong.
>
> -James
>
> > On Oct 21, 2015, at 8:40 AM, Cliff Rhyne  wrote:
> >
> > Hi,
> >
> > My team and I are looking into a problem where the Java high level
> consumer
> > provides duplicate messages if we turn auto commit off (using version
> > 0.8.2.1 of the server and Java client).  The expected sequence of events
> > are:
> >
> > 1. Start high-level consumer and initialize a KafkaStream to get a
> > ConsumerIterator
> > 2. Consume n items (could be 10,000, could be 1,000,000) from the
> iterator
> > 3. Commit the new offsets
> >
> > What we are seeing is that during step 2, some number of the n messages
> are
> > getting returned by the iterator in duplicate (in some cases, we've seen
> > n*5 messages consumed).  The problem appears to go away if we turn on
> auto
> > commit (and committing offsets to kafka helped too), but auto commit
> causes
> > conflicts with our offset rollback logic.  The issue seems to happen more
> > when we are in our test environment on a lower-cost cloud provider.
> >
> > Diving into the Java and Scala classes including the ConsumerIterator,
> it's
> > not obvious what event causes a duplicate offset to be requested or
> > returned (there's even a loop that is supposed to exclude duplicate
> > messages in this class).  I tried turning on trace logging but my log4j
> > config isn't getting the Kafka client logs to write out.
> >
> > Does anyone have suggestions of where to look or how to enable logging?
> >
> > Thanks,
> > Cliff
>
>
> 
>
> This email and any attachments may contain confidential and privileged
> material for the sole use of the intended recipient. Any review, copying,
> or distribution of this email (or any attachments) by others is prohibited.
> If you are not the intended recipient, please contact the sender
> immediately and permanently delete this email and any attachments. No
> employee or agent of TiVo Inc. is authorized to conclude any binding
> agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo
> Inc. may only be made by a signed written agreement.
>


[jira] [Created] (KAFKA-2682) Authorization section in official docs

2015-10-21 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-2682:
--

 Summary: Authorization section in official docs
 Key: KAFKA-2682
 URL: https://issues.apache.org/jira/browse/KAFKA-2682
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ismael Juma
 Fix For: 0.9.0.0


We need to add a section in the official documentation regarding authorization:
http://kafka.apache.org/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2681) SASL authentication in official docs

2015-10-21 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani reassigned KAFKA-2681:
-

Assignee: Sriharsha Chintalapani

> SASL authentication in official docs
> 
>
> Key: KAFKA-2681
> URL: https://issues.apache.org/jira/browse/KAFKA-2681
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Sriharsha Chintalapani
> Fix For: 0.9.0.0
>
>
> We need to add a section in the official documentation regarding SASL 
> authentication:
> http://kafka.apache.org/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-21 Thread Geoff Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967527#comment-14967527
 ] 

Geoff Anderson commented on KAFKA-2644:
---

[~rsivaram] Keep in mind that service classes essentially serve as wrappers 
around various command-line tools.

I'm not very familiar with this, but it looks like MiniKDC can be used to start a 
lightweight KDC server at the command line, so perhaps your KDC service can 
simply wrap this? (assuming MiniKDC is realistic enough)
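
For what it's worth, MiniKDC can also be driven programmatically, which may be
simpler for a service wrapper than shelling out; the work dir, keytab path, and
principal below are placeholders:

{code}
import java.io.File;
import java.util.Properties;
import org.apache.hadoop.minikdc.MiniKdc;

public class MiniKdcSketch {
    public static void main(String[] args) throws Exception {
        Properties conf = MiniKdc.createConf();                    // default KDC settings
        MiniKdc kdc = new MiniKdc(conf, new File("/tmp/minikdc")); // placeholder work dir
        kdc.start();

        // Write a keytab for the principal the broker/client under test will use.
        File keytab = new File("/tmp/minikdc/kafka.keytab");
        kdc.createPrincipal(keytab, "kafka/localhost");

        // ... point SASL tests at kdc.getHost() / kdc.getPort() ...
        kdc.stop();
    }
}
{code}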



> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2365) Copycat checklist

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967537#comment-14967537
 ] 

ASF GitHub Bot commented on KAFKA-2365:
---

GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/344

KAFKA-2365: Handle null keys and value validation properly in 
OffsetStorageWriter.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka 
kafka-2365-offset-storage-writer-null-values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/344.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #344


commit f7241b508190d64d7b5ac48560b01bb14d89ffa9
Author: Ewen Cheslack-Postava 
Date:   2015-10-21T17:47:14Z

KAFKA-2365: Handle null keys and value validation properly in 
OffsetStorageWriter.




> Copycat checklist
> -
>
> Key: KAFKA-2365
> URL: https://issues.apache.org/jira/browse/KAFKA-2365
> Project: Kafka
>  Issue Type: New Feature
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>  Labels: feature
> Fix For: 0.9.0.0
>
>
> This covers the development plan for 
> [KIP-26|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=58851767].
>  There are a number of features that can be developed in sequence to make 
> incremental progress, and often in parallel:
> * Initial patch - connector API and core implementation
> * Runtime data API
> * Standalone CLI
> * REST API
> * Distributed copycat - CLI
> * Distributed copycat - coordinator
> * Distributed copycat - config storage
> * Distributed copycat - offset storage
> * Log/file connector (sample source/sink connector)
> * Elasticsearch sink connector (sample sink connector for full log -> Kafka 
> -> Elasticsearch sample pipeline)
> * Copycat metrics
> * System tests (including connector tests)
> * Mirrormaker connector
> * Copycat documentation
> This is an initial list, but it might need refinement to allow for more 
> incremental progress and may be missing features we find we want before the 
> initial release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2365: Handle null keys and value validat...

2015-10-21 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/344

KAFKA-2365: Handle null keys and value validation properly in 
OffsetStorageWriter.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka 
kafka-2365-offset-storage-writer-null-values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/344.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #344


commit f7241b508190d64d7b5ac48560b01bb14d89ffa9
Author: Ewen Cheslack-Postava 
Date:   2015-10-21T17:47:14Z

KAFKA-2365: Handle null keys and value validation properly in 
OffsetStorageWriter.






[GitHub] kafka pull request: MINOR: Update to Gradle 2.8

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/343




Build failed in Jenkins: kafka-trunk-jdk8 #46

2015-10-21 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-2667: Fix transient error in KafkaBasedLogTest.

[wangguoz] MINOR: Update to Gradle 2.8

[wangguoz] KAFKA-2464: client-side assignment for new consumer

--
[...truncated 6770 lines...]

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testSourceTasksStdin PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testTaskClass 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testMultipleSourcesInvalid PASSED

org.apache.kafka.copycat.file.FileStreamSinkConnectorTest > testSinkTasks PASSED

org.apache.kafka.copycat.file.FileStreamSinkConnectorTest > testTaskClass PASSED

org.apache.kafka.copycat.file.FileStreamSinkTaskTest > testPutFlush PASSED
:copycat:json:checkstyleMain
:copycat:json:compileTestJavawarning: [options] bootstrap class path not set in 
conjunction with -source 1.7
Note: 

 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:copycat:json:processTestResources UP-TO-DATE
:copycat:json:testClasses
:copycat:json:checkstyleTest
:copycat:json:test

org.apache.kafka.copycat.json.JsonConverterTest > longToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToJsonConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
nullSchemaAndMapNonStringKeysToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndMapToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCopycatSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToCopycatConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaPrimitiveToCopycat 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatNonStringKeys 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToCopycat PASSED
:copycat:runtime:checkstyleMain
:copycat:runtime:compileTestJavawarning: 

Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Jun Rao
Parth,

For 2), in your approach, the broker/controller will then always have the
overhead of resetting the ACLs on startup after zookeeper.set.acl is set to
true. The benefit of using a separate migration tool is that you pay the
cost only once, during the upgrade. It is an extra step during the upgrade.
However, given the other things that you need to do to upgrade to 0.9.0
(e.g. two rounds of rolling upgrades on all brokers), I am not sure it's
worth optimizing away this step. We probably just need to document
this clearly.

Todd,

Just to be clear about the shared ZK usage. Once you set CREATOR_ALL_ACL +
READ_ACL_UNSAFE on a path, only ZK clients with the same user as the
creator can modify the path. Other ZK clients authenticated with a
different user can read, but not modify the path. Are you concerned about
the reads or the writes to ZK?

Thanks,

Jun



On Wed, Oct 21, 2015 at 10:46 AM, Flavio Junqueira  wrote:

>
> > On 21 Oct 2015, at 18:07, Parth Brahmbhatt 
> wrote:
> >
> > I have 2 suggestions:
> >
> > 1) We need to document how one moves from a secure to a non-secure
> > environment:
> >   1) change the config on all brokers to zookeeper.set.acl = false and
> > do a rolling upgrade.
> >   2) Run the migration script with the JAAS config file so it is SASL
> > authenticated with zookeeper, and change the ACLs on all subtrees back
> > to world-modifiable.
> >   3) remove the JAAS config (or only the zookeeper section from the
> > JAAS file), and restart all brokers.
> >
>
> Thanks for bringing it up, it makes sense to have a downgrade path and
> document it.
>
>
> > 2) I am not sure if we should force users trying to move from unsecure to
> > secure environment to execute the migration script. In the second step
> > once the zookeeper.set.acl is set to true, we can secure all the subtrees
> > by calling ensureCorrectAcls as part of broker initialization (just after
> > makesurePersistentPathExists). Not sure why we want to add one more
> > manual/admin step when it can be automated. This also has the added
> > advantage that migration script will not have to take a flag as input to
> > figure out if it should set the acls to secure or unsecure given it will
> > always be used to move from secure to unsecure.
> >
>
> The advantage of the third step is to make a single traversal to change
> any remaining znodes with the open ACL. As you suggest, each broker would
> do it, so the overhead is much higher. I do agree that eliminating a step
> is an advantage, though.
>
> > Given we are assuming all the information in zookeeper is world readable,
> > I don't see SSL support as a must-have or a blocker for this KIP.
>
> OK, but keep in mind that SSL is only available in the 3.5 branch of ZK.
>
> -Flavio
>
> >
> > Thanks
> > Parth
> >
> >
> >
> > On 10/21/15, 9:56 AM, "Flavio Junqueira"  wrote:
> >
> >>
> >>> On 21 Oct 2015, at 17:47, Todd Palino  wrote:
> >>>
> >>> There seems to be a bit of detail lacking in the KIP. Specifically, I'd
> >>> like to understand:
> >>>
> >>> 1) What znodes are the brokers going to secure? Is this configurable?
> >>> How?
> >>
> >> Currently it is securing all paths here except the consumers one:
> >>
> >>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils
> >> /ZkUtils.scala#L56
> >> <
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/util
> >> s/ZkUtils.scala#L56>
> >>
> >> This isn't configurable at the moment.
> >>
> >>> 2) What ACL is the broker going to apply? Is this configurable?
> >>
> >> The default is CREATOR_ALL_ACL + READ_ACL_UNSAFE, which means that an
> >> authenticated client can manipulate secured znodes and everyone can read
> >> znodes. The API of ZkUtils accommodates other ACLs, but the current code
> >> is using the default.
> >>
> >>> 3) How will the admin tools (such as preferred replica election and
> >>> partition reassignment) interact with this?
> >>>
> >>
> >> Currently, you need to set a system property passing the login config
> >> file to be able to authenticate the client and perform writes to ZK.
> >>
> >> -Flavio
> >>
> >>> -Todd
> >>>
> >>>
> >>> On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma 
> wrote:
> >>>
>  On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira 
>  wrote:
> 
> > Bringing the points Grant brought to this thread:
> >
> >> Is it worth mentioning the follow up steps that were discussed in
> the
>  KIP
> >> call in this KIP document? Some of them were:
> >>
> >> - Adding SSL support for Zookeeper
> >> - Removing the "world readable" assumption
> >>
> >
> > Grant, how would you do it? I see three options:
> >
> > 1- Add to the existing KIP, but then the functionality we should be
> > checking in soon won't include it, so the KIP will remain incomplete
> >
> 
>  A 

[GitHub] kafka pull request: KAFKA-2464: client-side assignment for new con...

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/165




Re: Java high level consumer providing duplicate messages when auto commit is off

2015-10-21 Thread Kris K
Hi Cliff,

One other case I observed in my environment is - when there were gc pauses
on one of our high level consumer in the group.

Thanks,
Kris

On Wed, Oct 21, 2015 at 10:12 AM, Cliff Rhyne  wrote:

> Hi James,
>
> There are two scenarios we run:
>
> 1. Multiple partitions with one consumer per partition.  This rarely has
> starting/stopping of consumers, so the pool is very static.  There is a
> configured consumer timeout, which is causing the ConsumerTimeoutException
> to get thrown prior to the test starting.  We handle this exception and
> then resume consuming.
> 2. Single partition with one consumer.  This consumer is started by a
> triggered condition (number of messages pending to be processed in the
> kafka topic or a schedule).  The consumer is stopped after processing is
> completed.
>
> In both cases, based on my understanding there shouldn't be a rebalance as
> either a) all consumers are running or b) there's only one consumer /
> partition.  Also, the same consumer group is used by all consumers in
> scenario 1 and 2.  Is there a good way to investigate whether rebalances
> are occurring?
>
> Thanks,
> Cliff
>
> On Wed, Oct 21, 2015 at 11:37 AM, James Cheng  wrote:
>
> > Do you have multiple consumers in a consumer group?
> >
> > I think that when a new consumer joins the consumer group, that the
> > existing consumers will stop consuming during the group rebalance, and
> then
> > when they start consuming again, that they will consume from the last
> > committed offset.
> >
> > You should get more verification on this, tho. I might be remembering
> > wrong.
> >
> > -James
> >
> > > On Oct 21, 2015, at 8:40 AM, Cliff Rhyne  wrote:
> > >
> > > Hi,
> > >
> > > My team and I are looking into a problem where the Java high level
> > consumer
> > > provides duplicate messages if we turn auto commit off (using version
> > > 0.8.2.1 of the server and Java client).  The expected sequence of
> events
> > > are:
> > >
> > > 1. Start high-level consumer and initialize a KafkaStream to get a
> > > ConsumerIterator
> > > 2. Consume n items (could be 10,000, could be 1,000,000) from the
> > iterator
> > > 3. Commit the new offsets
> > >
> > > What we are seeing is that during step 2, some number of the n messages
> > are
> > > getting returned by the iterator in duplicate (in some cases, we've
> seen
> > > n*5 messages consumed).  The problem appears to go away if we turn on
> > auto
> > > commit (and committing offsets to kafka helped too), but auto commit
> > causes
> > > conflicts with our offset rollback logic.  The issue seems to happen
> more
> > > when we are in our test environment on a lower-cost cloud provider.
> > >
> > > Diving into the Java and Scala classes including the ConsumerIterator,
> > it's
> > > not obvious what event causes a duplicate offset to be requested or
> > > returned (there's even a loop that is supposed to exclude duplicate
> > > messages in this class).  I tried turning on trace logging but my log4j
> > > config isn't getting the Kafka client logs to write out.
> > >
> > > Does anyone have suggestions of where to look or how to enable logging?
> > >
> > > Thanks,
> > > Cliff
> >
> >
> > 
> >
> > This email and any attachments may contain confidential and privileged
> > material for the sole use of the intended recipient. Any review, copying,
> > or distribution of this email (or any attachments) by others is
> prohibited.
> > If you are not the intended recipient, please contact the sender
> > immediately and permanently delete this email and any attachments. No
> > employee or agent of TiVo Inc. is authorized to conclude any binding
> > agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo
> > Inc. may only be made by a signed written agreement.
> >
>


[jira] [Commented] (KAFKA-2464) Client-side assignment and group generalization

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967682#comment-14967682
 ] 

ASF GitHub Bot commented on KAFKA-2464:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/165


> Client-side assignment and group generalization
> ---
>
> Key: KAFKA-2464
> URL: https://issues.apache.org/jira/browse/KAFKA-2464
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Add support for client-side assignment and generalization of join group 
> protocol as documented here: 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2464) Client-side assignment and group generalization

2015-10-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-2464.
--
Resolution: Fixed

Issue resolved by pull request 165
[https://github.com/apache/kafka/pull/165]

> Client-side assignment and group generalization
> ---
>
> Key: KAFKA-2464
> URL: https://issues.apache.org/jira/browse/KAFKA-2464
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Add support for client-side assignment and generalization of join group 
> protocol as documented here: 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1843) Metadata fetch/refresh in new producer should handle all node connection states gracefully

2015-10-21 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967773#comment-14967773
 ] 

Eno Thereska commented on KAFKA-1843:
-

https://github.com/apache/kafka/pull/290 addressed the problem [~omkreddy] 
reported above and is now integrated. I propose splitting the rest of this JIRA 
into smaller JIRAs next.

> Metadata fetch/refresh in new producer should handle all node connection 
> states gracefully
> --
>
> Key: KAFKA-1843
> URL: https://issues.apache.org/jira/browse/KAFKA-1843
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Eno Thereska
>Priority: Blocker
>  Labels: patch
> Fix For: 0.8.2.1
>
>
> KAFKA-1642 resolved some issues with the handling of broker connection states 
> to avoid high CPU usage, but made the minimal fix rather than the ideal one. 
> The code for handling the metadata fetch is difficult to get right because it 
> has to handle a lot of possible connectivity states and failure modes across 
> all the known nodes. It also needs to correctly integrate with the 
> surrounding event loop, providing correct poll() timeouts to both avoid busy 
> looping and make sure it wakes up and tries new nodes in the face of both 
> connection and request failures.
> A patch here should address a few issues:
> 1. Make sure connection timeouts, as implemented in KAFKA-1842, are cleanly 
> integrated. This mostly means that when a connecting node is selected to 
> fetch metadata from, that the code notices that and sets the next timeout 
> based on the connection timeout rather than some other backoff.
> 2. Rethink the logic and naming of NetworkClient.leastLoadedNode. That method 
> actually takes into account a) the current connectivity of each node, b) 
> whether the node had a recent connection failure, c) the "load" in terms of 
> in flight requests. It also needs to ensure that different clients don't use 
> the same ordering across multiple calls (which is already addressed in the 
> current code by nodeIndexOffset) and that we always eventually try all nodes 
> in the face of connection failures (which isn't currently handled by 
> leastLoadedNode and probably cannot be without tracking additional state). 
> This method also has to work for new consumer use cases even though it is 
> currently only used by the new producer's metadata fetch. Finally it has to 
> properly handle when other code calls initiateConnect() since the normal path 
> for sending messages also initiates connections.
> We can already say that there is an order of preference given a single call 
> (as follows), but making this work across multiple calls when some initial 
> choices fail to connect or return metadata *and* connection states may be 
> changing is much more difficult.
>  * Connected, zero in flight requests - the request can be sent immediately
>  * Connecting node - it will hopefully be connected very soon and by 
> definition has no in flight requests
>  * Disconnected - same reasoning as for a connecting node
>  * Connected, > 0 in flight requests - we consider any # of in flight 
> requests as a big enough backlog to delay the request a lot.
> We could use an approach that better accounts for # of in flight requests 
> rather than just turning it into a boolean variable, but that probably 
> introduces much more complexity than it is worth.
> 3. The most difficult case to handle so far has been when leastLoadedNode 
> returns a disconnected node to maybeUpdateMetadata as its best option. 
> Properly handling the two resulting cases (initiateConnect fails immediately 
> vs. taking some time to possibly establish the connection) is tricky.
> 4. Consider optimizing for the failure cases. The most common cases are when 
> you already have an active connection and can immediately get the metadata or 
> you need to establish a connection, but the connection and metadata 
> request/response happen very quickly. These common cases are infrequent 
> enough (default every 5 min) that establishing an extra connection isn't a 
> big deal as long as it's eventually cleaned up. The edge cases, like network 
> partitions where some subset of nodes become unreachable for a long period, 
> are harder to reason about but we should be sure we will always be able to 
> gracefully recover from them.
> KAFKA-1642 enumerated the possible outcomes of a single call to 
> maybeUpdateMetadata. A good fix for this would consider all of those outcomes 
> for repeated calls to 
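
As an aside, here is a toy rendering of the single-call preference order
enumerated above; NodeState and the in-flight count are stand-ins, not the
real NetworkClient types:

{code}
public class NodePreference {
    enum NodeState { CONNECTED, CONNECTING, DISCONNECTED }

    // Lower rank = preferred, mirroring the order listed above for one call.
    // The hard part the ticket describes (fairness across repeated calls while
    // connection states change) is exactly what a static ranking cannot capture.
    static int rank(NodeState state, int inFlightRequests) {
        if (state == NodeState.CONNECTED && inFlightRequests == 0) return 0;
        if (state == NodeState.CONNECTING) return 1;
        if (state == NodeState.DISCONNECTED) return 2;
        return 3; // connected with in-flight requests: treat any backlog as a long delay
    }
}
{code}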



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2454: Deadlock between log segment delet...

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/153


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-37 - Add namespaces in Kafka

2015-10-21 Thread Ashish Singh
In the last KIP hangout, the following questions were raised.

   1. *Whether or not to support a move command? If yes, how do we support
   it?*
   I think a *move* command will be essential once we start supporting
   directories. However, the implementation might be a bit convoluted. A few
   things required for it: the ability to mark a topic unavailable during
   the move, and updating brokers’ metadata cache to reflect the move.

   2. *How will acls/configs inheritance work?*
   Say we have /dc/ns/topic.
   dc has dc_acl and dc_config, and similarly for ns and topic.
   To perform an action on /dc/ns/topic, the user must have the required
   perms on dc, ns and topic for that operation. For example, User1 will
   need DESCRIBE permissions on dc, ns and topic to be able to describe
   /dc/ns/topic.
   For configs, the configs for /dc/ns/topic will be topic_config +
   ns_config + dc_config, in that order: if a config is specified for the
   topic then that is used, else its parent (ns) is checked for that
   config, and so on up the tree (see the sketch after this list).

   3. *Will supporting an n-deep hierarchy be a concern?*
   It can be a performance concern, but that sounds more like misuse of the
   functionality or poor organization of topics. We could have a depth
   limit, but I am not sure one is required.

   4. *Will we continue to support multi-directory on disk, as proposed in
   KAFKA-188?*
   Yes, we should be able to support that. Namespaces will be created
   within those directories. The heuristics for choosing the least loaded
   disk/dir will remain the same.

   5. *Will it be required to move existing topics from the default
   directory/namespace to a particular directory/namespace to enable
   mirror-maker to replicate topics in that directory/namespace?*
   I do not think it will be required, as one can simply add /*/* to
   mirror-maker’s blacklist, which will only capture topics that exist in
   the default namespace. @Joel, does this answer your question?

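A minimal sketch of the config lookup described in point 2, assuming configs are stored per path segment (the types here are hypothetical, not a Kafka API):

import java.util.Map;

public class NamespaceConfigSketch {
    // Resolve a config key for /dc/ns/topic by checking topic, then ns, then dc.
    static String resolve(Map<String, Map<String, String>> configsByPath,
                          String path, String key) {
        String current = path; // e.g. "/dc/ns/topic"
        while (!current.isEmpty()) {
            Map<String, String> configs = configsByPath.get(current);
            if (configs != null && configs.containsKey(key))
                return configs.get(key); // the nearest override wins
            int slash = current.lastIndexOf('/');
            current = slash <= 0 ? "" : current.substring(0, slash); // walk up to the parent
        }
        return null; // no override anywhere on the path: fall back to the broker default
    }
}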

On Fri, Oct 16, 2015 at 6:33 PM, Ashish Singh  wrote:

> On Thu, Oct 15, 2015 at 1:30 PM, Jiangjie Qin 
> wrote:
>
>> Hey Jay,
>>
>> If we allow consumer to subscribe to /*/my-event, does that mean we allow
>> consumer to consume cross namespaces?
>
> That is the idea. If a user has permissions then yes, he should be able to
> consume from as many namespaces as he wants.
>
>
>> In that case it seems not
>> "hierarchical" but more like a name field filtering. i.e. user can choose
>> to consume from topic where datacenter={x,y},
>> topic_name={my-topic1,mytopic2}. Am I understanding right?
>>
> I think it is still hierarchical, however with possible filtering (as you
> said).
>
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Wed, Oct 14, 2015 at 12:49 PM, Jay Kreps  wrote:
>>
>> > Hey Jason,
>> >
>> > I actually think this is one of the advantages. The problem we have
>> today
>> > is that you can't really do bidirectional replication between clusters
>> > because it would actually be a feedback loop.
>> >
>> > So the intended use would be that you would have a structure where the
>> > top-level directory was DIFFERENT but the topic names were the same, so
>> if
>> > you maintain
>> >   /chicago-datacenter/actual-topics
>> >   /oregon-datacenter/actual topics
>> >   etc.
>> > Then you replicate
>> >   /chicago-datacenter/* => /oregon-datacenter
>> > and
>> >   /oregon-datacenter/* => /chicago-datacenter
>> >
>> > People who want the aggregate feed subscribe to /*/my-event.
>> >
>> > The nice thing about this is it gives a unified namespace across all
>> > locations.
>> >
>> > Basically exactly what we do now but you no longer need to add new
>> clusters
>> > to get the namespacing.
>> >
>> > -Jay
>> >
>> >
>> > On Wed, Oct 14, 2015 at 11:24 AM, Jason Gustafson 
>> > wrote:
>> >
>> > > Hey Ashish, thanks for the write-up. I think having a namespace
>> > capability
>> > > is a useful feature for Kafka, in particular with the addition of the
>> > > authorization layer. I probably prefer Jay's hierarchical approach if
>> > we're
>> > > going to embed the namespace in the topic name since it seems more
>> > general.
>> > > That said, one advantage of having a namespace independent of the
>> topic
>> > > name is that it simplifies replication between namespaces a bit since
>> you
>> > > don't have to parse and rewrite topic names. Assuming that
>> hierarchical
>> > > topics will happen eventually anyway, I imagine a common pattern
>> would be
>> > > to preserve the same directory structure in multiple namespaces, so
>> > having
>> > > an easy mechanism for applications to switch between them would be
>> nice.
>> > > The namespace is kind of analogous to a chroot in this case. Of course
>> > you
>> > > can achieve the same thing by having a configurable topic prefix, just
>> > you
>> > > have to do all the topic rewriting, which I'm guessing will be a
>> little

[jira] [Commented] (KAFKA-2454) Dead lock between delete log segment and shutting down.

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967809#comment-14967809
 ] 

ASF GitHub Bot commented on KAFKA-2454:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/153


> Dead lock between delete log segment and shutting down.
> ---
>
> Key: KAFKA-2454
> URL: https://issues.apache.org/jira/browse/KAFKA-2454
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.0
>
>
> When the broker shuts down, it shuts down the scheduler, which grabs the
> scheduler lock and then waits for all threads in the scheduler to shut down.
> The deadlock happens when a scheduled task tries to delete an old log
> segment: it schedules a log-delete task which also needs to acquire the
> scheduler lock. In this case the shutdown thread holds the scheduler lock and
> waits for the log deletion thread to finish, but the log deletion thread
> blocks waiting for the scheduler lock. (A minimal sketch of this lock cycle
> follows the stack trace below.)
> Related stack trace:
> {noformat}
> "Thread-1" #21 prio=5 os_prio=0 tid=0x7fe7601a7000 nid=0x1a4e waiting on 
> condition [0x7fe7cf698000]
>java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x000640d53540> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1465)
> at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:94)
> - locked <0x000640b6d480> (a kafka.utils.KafkaScheduler)
> at 
> kafka.server.KafkaServer$$anonfun$shutdown$4.apply$mcV$sp(KafkaServer.scala:352)
> at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:79)
> at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
> at kafka.utils.CoreUtils$.swallowWarn(CoreUtils.scala:51)
> at kafka.utils.Logging$class.swallow(Logging.scala:94)
> at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:51)
> at kafka.server.KafkaServer.shutdown(KafkaServer.scala:352)
> at 
> kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:42)
> at com.linkedin.kafka.KafkaServer.notifyShutdown(KafkaServer.java:99)
> at 
> com.linkedin.util.factory.lifecycle.LifeCycleMgr.notifyShutdownListener(LifeCycleMgr.java:123)
> at 
> com.linkedin.util.factory.lifecycle.LifeCycleMgr.notifyListeners(LifeCycleMgr.java:102)
> at 
> com.linkedin.util.factory.lifecycle.LifeCycleMgr.notifyStop(LifeCycleMgr.java:82)
> - locked <0x000640b77bb0> (a java.util.ArrayDeque)
> at com.linkedin.util.factory.Generator.stop(Generator.java:177)
> - locked <0x000640b77bc8> (a java.lang.Object)
> at 
> com.linkedin.offspring.servlet.OffspringServletRuntime.destroy(OffspringServletRuntime.java:82)
> at 
> com.linkedin.offspring.servlet.OffspringServletContextListener.contextDestroyed(OffspringServletContextListener.java:51)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:813)
> at 
> org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:160)
> at 
> org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:516)
> at com.linkedin.emweb.WebappContext.doStop(WebappContext.java:35)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
> - locked <0x0006400018b8> (a java.lang.Object)
> at 
> com.linkedin.emweb.ContextBasedHandlerImpl.doStop(ContextBasedHandlerImpl.java:112)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
> - locked <0x000640001900> (a java.lang.Object)
> at 
> com.linkedin.emweb.WebappDeployerImpl.stop(WebappDeployerImpl.java:349)
> at 
> com.linkedin.emweb.WebappDeployerImpl.doStop(WebappDeployerImpl.java:414)
> - locked <0x0006400019c0> (a 
> com.linkedin.emweb.MapBasedHandlerImpl)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
> - locked <0x0006404ee8e8> (a java.lang.Object)
> at 
> org.eclipse.jetty.util.component.AggregateLifeCycle.doStop(AggregateLifeCycle.java:107)
> at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:69)
> at 
> 
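
The lock cycle in the description reduces to a small pattern: shutdown holds the scheduler's monitor while awaiting task termination, and a running task needs that same monitor to schedule follow-up work. A minimal illustration of that pattern (not the actual KafkaScheduler code):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchedulerDeadlockSketch {
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
    private final Object lock = new Object(); // stands in for the scheduler's monitor

    // Runs *on* an executor thread as a scheduled task, like log retention does.
    void deleteOldSegment() {
        synchronized (lock) { // needs the monitor to schedule the asynchronous delete...
            executor.schedule(() -> { /* delete segment files */ }, 0, TimeUnit.MILLISECONDS);
        }
    }

    void shutdown() throws InterruptedException {
        synchronized (lock) { // ...but shutdown holds it while waiting for tasks to drain
            executor.shutdown();
            // Never returns if deleteOldSegment is blocked on the monitor above.
            executor.awaitTermination(1, TimeUnit.DAYS);
        }
    }
}

The essential requirement for any fix is that the shutdown path not hold the lock the tasks need while awaiting their termination; see the pull request above for the actual change.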

[jira] [Updated] (KAFKA-2454) Dead lock between delete log segment and shutting down.

2015-10-21 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-2454:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 153
[https://github.com/apache/kafka/pull/153]

> Dead lock between delete log segment and shutting down.
> ---
>
> Key: KAFKA-2454
> URL: https://issues.apache.org/jira/browse/KAFKA-2454
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.0
>
>
> When the broker shuts down, it shuts down the scheduler, which grabs the
> scheduler lock and then waits for all threads in the scheduler to shut down.
> The deadlock happens when a scheduled task tries to delete an old log
> segment: it schedules a log-delete task which also needs to acquire the
> scheduler lock. In this case the shutdown thread holds the scheduler lock and
> waits for the log deletion thread to finish, but the log deletion thread
> blocks waiting for the scheduler lock.
> Related stack trace:
> {noformat}
> "Thread-1" #21 prio=5 os_prio=0 tid=0x7fe7601a7000 nid=0x1a4e waiting on 
> condition [0x7fe7cf698000]
>java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x000640d53540> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1465)
> at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:94)
> - locked <0x000640b6d480> (a kafka.utils.KafkaScheduler)
> at 
> kafka.server.KafkaServer$$anonfun$shutdown$4.apply$mcV$sp(KafkaServer.scala:352)
> at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:79)
> at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
> at kafka.utils.CoreUtils$.swallowWarn(CoreUtils.scala:51)
> at kafka.utils.Logging$class.swallow(Logging.scala:94)
> at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:51)
> at kafka.server.KafkaServer.shutdown(KafkaServer.scala:352)
> at 
> kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:42)
> at com.linkedin.kafka.KafkaServer.notifyShutdown(KafkaServer.java:99)
> at 
> com.linkedin.util.factory.lifecycle.LifeCycleMgr.notifyShutdownListener(LifeCycleMgr.java:123)
> at 
> com.linkedin.util.factory.lifecycle.LifeCycleMgr.notifyListeners(LifeCycleMgr.java:102)
> at 
> com.linkedin.util.factory.lifecycle.LifeCycleMgr.notifyStop(LifeCycleMgr.java:82)
> - locked <0x000640b77bb0> (a java.util.ArrayDeque)
> at com.linkedin.util.factory.Generator.stop(Generator.java:177)
> - locked <0x000640b77bc8> (a java.lang.Object)
> at 
> com.linkedin.offspring.servlet.OffspringServletRuntime.destroy(OffspringServletRuntime.java:82)
> at 
> com.linkedin.offspring.servlet.OffspringServletContextListener.contextDestroyed(OffspringServletContextListener.java:51)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:813)
> at 
> org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:160)
> at 
> org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:516)
> at com.linkedin.emweb.WebappContext.doStop(WebappContext.java:35)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
> - locked <0x0006400018b8> (a java.lang.Object)
> at 
> com.linkedin.emweb.ContextBasedHandlerImpl.doStop(ContextBasedHandlerImpl.java:112)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
> - locked <0x000640001900> (a java.lang.Object)
> at 
> com.linkedin.emweb.WebappDeployerImpl.stop(WebappDeployerImpl.java:349)
> at 
> com.linkedin.emweb.WebappDeployerImpl.doStop(WebappDeployerImpl.java:414)
> - locked <0x0006400019c0> (a 
> com.linkedin.emweb.MapBasedHandlerImpl)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
> - locked <0x0006404ee8e8> (a java.lang.Object)
> at 
> org.eclipse.jetty.util.component.AggregateLifeCycle.doStop(AggregateLifeCycle.java:107)
> at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:69)
> at 
> 

Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Todd Palino
Thanks for the clarification on that, Jun. Obviously, we haven't been doing
much with ZK authentication around here yet. There is still a small concern
there, mostly in that you should not share credentials any more than is
necessary, which would argue for being able to use a different ACL than the
default. I don't really like the idea of having to use the exact same
credentials for executing the admin tools as we do for running the brokers.
Given that we don't need to share the credentials with all consumers, I
think we can work around it.

This does bring up another good question, however. What will be the process
for having to rotate the credentials? That is, if the credentials are
compromised and need to be changed, how can that be accomplished with the
cluster online? I'm guessing some combination of using skipAcl on the
Zookeeper ensemble and config changes to the brokers will be required, but
this is an important enough operation that we should make sure it's
reasonable to perform and that it is documented.

-Todd


On Wed, Oct 21, 2015 at 1:23 PM, Jun Rao  wrote:

> Parth,
>
> For 2), in your approach, the broker/controller will then always have the
> overhead of resetting the ACL on startup after zookeeper.set.acl is set to
> true. The benefit of using a separate migration tool is that you pay the
> cost only once, during the upgrade. It is an extra step during the upgrade.
> However, given the other things that you need to do to upgrade to 0.9.0
> (e.g. two rounds of rolling upgrades on all brokers, etc.), I am not sure
> it's worth optimizing away this step. We probably just need to document
> this clearly.
>
> Todd,
>
> Just to be clear about the shared ZK usage. Once you set CREATOR_ALL_ACL +
> READ_ACL_UNSAFE on a path, only ZK clients with the same user as the
> creator can modify the path. Other ZK clients authenticated with a
> different user can read, but not modify the path. Are you concerned about
> the reads or the writes to ZK?
>
> Thanks,
>
> Jun
>
>
>
> On Wed, Oct 21, 2015 at 10:46 AM, Flavio Junqueira  wrote:
>
> >
> > > On 21 Oct 2015, at 18:07, Parth Brahmbhatt <
> pbrahmbh...@hortonworks.com>
> > wrote:
> > >
> > > I have 2 suggestions:
> > >
> > > 1) We need to document how one moves from a secure to a non-secure
> > > environment:
> > >   1) change the config on all brokers to zookeeper.set.acl = false
> > and do a
> > > rolling upgrade.
> > >   2) Run the migration script with the jaas config file so it is
> sasl
> > > authenticated with zookeeper and change the acls on all subtrees back
> to
> > > World modifiable.
> > >   3) remove the jaas config / or only the zookeeper section from
> the
> > jaas,
> > > and restart all brokers.
> > >
> >
> > Thanks for bringing it up, it makes sense to have a downgrade path and
> > document it.
> >
> >
> > > 2) I am not sure if we should force users trying to move from unsecure
> to
> > > secure environment to execute the migration script. In the second step
> > > once the zookeeper.set.acl is set to true, we can secure all the
> subtrees
> > > by calling ensureCorrectAcls as part of broker initialization (just
> after
> > > makesurePersistentPathExists). Not sure why we want to add one more
> > > manual/admin step when it can be automated. This also has the added
> > > advantage that migration script will not have to take a flag as input
> to
> > > figure out if it should set the acls to secure or unsecure given it
> will
> > > always be used to move from secure to unsecure.
> > >
> >
> > The advantage of the third step is to make a single traversal to change
> > any remaining znodes with the open ACL. As you suggest, each broker would
> > do it, so the overhead is much higher. I do agree that eliminating a step
> > is an advantage, though.
> >
> > > Given we are assuming all the information in zookeeper is world
> readable
> > ,
> > > I don't see SSL support as a must-have or a blocker for this KIP.
> >
> > OK, but keep in mind that SSL is only available in the 3.5 branch of ZK.
> >
> > -Flavio
> >
> > >
> > > Thanks
> > > Parth
> > >
> > >
> > >
> > > On 10/21/15, 9:56 AM, "Flavio Junqueira"  wrote:
> > >
> > >>
> > >>> On 21 Oct 2015, at 17:47, Todd Palino  wrote:
> > >>>
> > >>> There seems to be a bit of detail lacking in the KIP. Specifically,
> I'd
> > >>> like to understand:
> > >>>
> > >>> 1) What znodes are the brokers going to secure? Is this configurable?
> > >>> How?
> > >>
> > >> Currently it is securing all paths here except the consumers one:
> > >>
> > >>
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils
> > >> /ZkUtils.scala#L56
> > >> <
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/util
> > >> s/ZkUtils.scala#L56>
> > >>
> > >> This isn't configurable at the moment.
> > >>
> > >>> 2) What ACL is the broker going to apply? Is this configurable?
> > >>
> > >> The default is 
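
Per Jun's note earlier in this thread, the default being discussed is CREATOR_ALL_ACL plus READ_ACL_UNSAFE: the creating session gets full permissions and everyone else can read. Applying that combination through the plain ZooKeeper Java API looks roughly like this (an illustrative sketch, not Kafka code):

import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;

public class SecureZnodeSketch {
    static void createSecured(ZooKeeper zk, String path, byte[] data) throws Exception {
        // Creator gets ALL permissions; everyone else is read-only.
        List<ACL> acls = new ArrayList<>(ZooDefs.Ids.CREATOR_ALL_ACL);
        acls.addAll(ZooDefs.Ids.READ_ACL_UNSAFE);
        zk.create(path, data, acls, CreateMode.PERSISTENT);
    }
}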

[GitHub] kafka pull request: KAFKA-2678; partition level lag metrics can be...

2015-10-21 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/346

KAFKA-2678; partition level lag metrics can be negative



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-2678

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/346.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #346


commit 966d3aed1a796ddafd202f9ab467a06d7b804d2b
Author: Dong Lin 
Date:   2015-10-21T21:12:21Z

KAFKA-2678; partition level lag metrics can be negative




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2678) partition level lag metrics can be negative

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967925#comment-14967925
 ] 

ASF GitHub Bot commented on KAFKA-2678:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/346

KAFKA-2678; partition level lag metrics can be negative



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-2678

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/346.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #346


commit 966d3aed1a796ddafd202f9ab467a06d7b804d2b
Author: Dong Lin 
Date:   2015-10-21T21:12:21Z

KAFKA-2678; partition level lag metrics can be negative




> partition level lag metrics can be negative
> ---
>
> Key: KAFKA-2678
> URL: https://issues.apache.org/jira/browse/KAFKA-2678
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.1
>Reporter: Jun Rao
>Assignee: Dong Lin
>
> Currently, the per-partition lag metric can be negative, since the last
> committed offset can be smaller than the follower's offset. This is
> confusing to end users. We should probably lower-bound it at 0.
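
The proposed fix is essentially a clamp. A minimal sketch (names are illustrative, not the actual metric code):

public class LagSketch {
    // highWatermark is the last committed offset; followerLogEndOffset is the
    // follower's log end offset.
    static long partitionLag(long highWatermark, long followerLogEndOffset) {
        return Math.max(0L, highWatermark - followerLogEndOffset);
    }
}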



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-369) remove ZK dependency on producer

2015-10-21 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967853#comment-14967853
 ] 

Jun Rao commented on KAFKA-369:
---

We are adding a new Java consumer (org.apache.kafka.clients.consumer) in trunk,
and it will take the same broker list as the producer for bootstrapping.

> remove ZK dependency on producer
> 
>
> Key: KAFKA-369
> URL: https://issues.apache.org/jira/browse/KAFKA-369
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 0.8.0
>Reporter: Jun Rao
>Assignee: Yang Ye
> Fix For: 0.8.0
>
> Attachments: kafka_369_v1.diff, kafka_369_v2.diff, kafka_369_v3.diff, 
> kafka_369_v4.diff, kafka_369_v5.diff, kafka_369_v6.diff, kafka_369_v7.diff, 
> kafka_369_v8.diff, kafka_369_v9.diff
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>
> Currently, the only place that ZK is actually used is in BrokerPartitionInfo. 
> We use ZK to get a list of brokers for making TopicMetadataRequest requests. 
> Instead, we can provide a list of brokers in the producer config directly. 
> That way, the producer client is no longer dependent on ZK.
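
For illustration, bootstrapping a client from a broker list instead of ZK looks like this with the new Java clients (broker addresses are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

public class BootstrapSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Brokers are supplied directly; no zookeeper.connect on the client side.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}

The new consumer mentioned above takes the same bootstrap.servers list.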



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-37 - Add namespaces in Kafka

2015-10-21 Thread Jay Kreps
Gwen, it's a good question what the producer semantics would be: would we
only allow you to produce to a partition or a first-level directory, or would
we hash over whatever subtree you supply? Actually not sure which makes more
sense...

Ashish, here are some thoughts:
1. I think we can do this online. There is a question of what happens to
readers and writers, but presumably it would be the same thing as if that topic
weren't there. There would be no guarantee this would happen atomically over
different brokers or clients, though.
2. ACLs should work like unix perms, right? I think configs would override
hierarchically, so we would have a full set of configs for each partition
(computed by walking up the tree from the root and taking the first
override). I think this is what you're describing, right?
3. Totally agree no reason to have an arbitrary limit.
4. I actually don't think the physical layout on disk should be at all
connected to the logical directory hierarchy we present. That is, whether
you use RAID or not shouldn't impact the location of a topic in your
directory structure. Not sure if this is what you are saying or not. This
does raise the question of how to do the disk layout. The simplest thing
would be to keep the flat data directories but make the names of the
partitions on disk just be logical inode numbers and then have a separate
mapping of these inodes to logical names stored in ZK with a cache. I think
this would make things like rename fast and atomic. The downside of this is
that the 'ls' command will no longer tell you much about the data on a
broker. (A toy sketch of this mapping follows below.)

-Jay
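
A toy version of the inode-style mapping described in point 4, with renames touching only the logical mapping (all names are hypothetical; in the actual proposal the mapping would live in ZK with a cache, which would also make rename atomic):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class LogicalNameMapSketch {
    private final AtomicLong nextInode = new AtomicLong();
    // Logical topic path -> stable on-disk id.
    private final ConcurrentMap<String, Long> pathToInode = new ConcurrentHashMap<>();

    long create(String logicalPath) {
        return pathToInode.computeIfAbsent(logicalPath, p -> nextInode.getAndIncrement());
    }

    // Rename is metadata-only: the on-disk directory ("inode-42") never moves.
    // This in-memory version is only illustrative and not atomic on failure.
    boolean rename(String from, String to) {
        Long inode = pathToInode.remove(from);
        return inode != null && pathToInode.putIfAbsent(to, inode) == null;
    }

    String diskDirectory(String logicalPath) {
        Long inode = pathToInode.get(logicalPath);
        return inode == null ? null : "inode-" + inode;
    }
}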

On Wed, Oct 21, 2015 at 12:43 PM, Ashish Singh  wrote:

> In last KIP hangout following questions were raised.
>
>1.
>
>*Whether or not to support move command? If yes, how do we support it.*
>I think *move* command will be essential, once we start supporting
>directories. However, implementation might be a bit convoluted. A few
>things required for it will be, ability to mark a topic unavailable
> during
>the move, update brokers’ metadata cache to reflect the move.
>2.
>
>*How will acls/ configs inheritance work?*
>Say we have /dc/ns/topic.
>dc has dc_acl and dc_config. Similarly for ns and topic.
>For being able to perform an action on /dc/ns/topic, the user must have
>required perms on dc, ns and topic for that operation. For example,
> User1
>will need DESCRIBE permissions on dc, ns and topic to be able to
> describe
>/dc/ns/topic.
>For configs, configs for /dc/ns/topic will be topic_config + ns_config +
>dc_config, in that order. So, if a config is specified for topic then
> that
>will be used, else it’s parent (ns) will be checked for that config, and
>this goes on.
>3.
>
>*Will supporting n-deep hierarchy be a concern?*
>This can be a performance concern, however it sounds more of a misusage
>of the functionality or bad organization of topics. We can have a depth
>limit, but I am not sure if it is required.
>4.
>
>*Will we continue to support multi-directory on disk, that was proposed
>in KAFKA-188?*
>Yes, we should be able to support that. It is within those directories,
>namespaces will be created. The heuristics for choosing least loaded
>disc/dir will remain same.
>5.
>
>*Will it be required to move existing topics from default directory/
>namespace to a particular directory/ namespace to enable mirror-maker
>replicate topics in that directory/namespace?*
>I do not think it will be required, as one can simply add /*/* to
>mirror-maker’s blacklist and this will only capture topics that exist in
>default namespace. @Joel, does this answer your question?
>
> ​
>
> On Fri, Oct 16, 2015 at 6:33 PM, Ashish Singh  wrote:
>
> > On Thu, Oct 15, 2015 at 1:30 PM, Jiangjie Qin  >
> > wrote:
> >
> >> Hey Jay,
> >>
> >> If we allow consumer to subscribe to /*/my-event, does that mean we
> allow
> >> consumer to consume cross namespaces?
> >
> > That is the idea. If a user has permissions then yes, he should be able
> to
> > consume from as many namespaces as he wants.
> >
> >
> >> In that case it seems not
> >> "hierarchical" but more like a name field filtering. i.e. user can
> choose
> >> to consume from topic where datacenter={x,y},
> >> topic_name={my-topic1,mytopic2}. Am I understanding right?
> >>
> > I think it is still hierarchical, however with possible filtering (as you
> > said).
> >
> >>
> >> Thanks,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On Wed, Oct 14, 2015 at 12:49 PM, Jay Kreps  wrote:
> >>
> >> > Hey Jason,
> >> >
> >> > I actually think this is one of the advantages. The problem we have
> >> today
> >> > is that you can't really do bidirectional replication between clusters
> >> > because it would actually be a feedback loop.
> >> >
> >> > So the intended use would be that you would have a 

[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-10-21 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967968#comment-14967968
 ] 

Jun Rao commented on KAFKA-1690:


[~sriharsha], I am wondering why we need a ssl.protocol and a separate 
ssl.enabled.protocols for SSL. Wouldn't having just one be enough?

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch, 
> KAFKA-1690_2015-08-17_09:28:52.patch, KAFKA-1690_2015-08-17_12:20:53.patch, 
> KAFKA-1690_2015-08-18_11:24:46.patch, KAFKA-1690_2015-08-18_17:24:48.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Clean-up MemoryRecords variables and AP...

2015-10-21 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/348

MINOR: Clean-up MemoryRecords variables and APIs



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka MemoryRecordsCapacity

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/348.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #348


commit 0b53a1a89a31ed43f897b1fae1ca7dd153df8f35
Author: Guozhang Wang 
Date:   2015-10-21T21:41:17Z

clean up MemoryRecords




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Java high level consumer providing duplicate messages when auto commit is off

2015-10-21 Thread Cliff Rhyne
Hi Kris,

Thanks for the tip.  I'm going to investigate this further.  I checked, and
we have fairly short zk timeouts and run with a smaller memory allocation
on the two environments where we encounter this issue.  I'll let you all know
what I find.

I saw this ticket https://issues.apache.org/jira/browse/KAFKA-2049 that
seems to be related to the problem (but would only inform that an issue
occurred).  Are there any other open issues that could be worked on to
improve Kafka's handling of this situation?

Thanks,
Cliff

On Wed, Oct 21, 2015 at 2:53 PM, Kris K  wrote:

> Hi Cliff,
>
> One other case I observed in my environment is - when there were gc pauses
> on one of our high level consumer in the group.
>
> Thanks,
> Kris
>
> On Wed, Oct 21, 2015 at 10:12 AM, Cliff Rhyne  wrote:
>
> > Hi James,
> >
> > There are two scenarios we run:
> >
> > 1. Multiple partitions with one consumer per partition.  This rarely has
> > starting/stopping of consumers, so the pool is very static.  There is a
> > configured consumer timeout, which is causing the
> ConsumerTimeoutException
> > to get thrown prior to the test starting.  We handle this exception and
> > then resume consuming.
> > 2. Single partition with one consumer.  This consumer is started by a
> > triggered condition (number of messages pending to be processed in the
> > kafka topic or a schedule).  The consumer is stopped after processing is
> > completed.
> >
> > In both cases, based on my understanding there shouldn't be a rebalance
> as
> > either a) all consumers are running or b) there's only one consumer /
> > partition.  Also, the same consumer group is used by all consumers in
> > scenario 1 and 2.  Is there a good way to investigate whether rebalances
> > are occurring?
> >
> > Thanks,
> > Cliff
> >
> > On Wed, Oct 21, 2015 at 11:37 AM, James Cheng  wrote:
> >
> > > Do you have multiple consumers in a consumer group?
> > >
> > > I think that when a new consumer joins the consumer group, that the
> > > existing consumers will stop consuming during the group rebalance, and
> > then
> > > when they start consuming again, that they will consume from the last
> > > committed offset.
> > >
> > > You should get more verification on this, tho. I might be remembering
> > > wrong.
> > >
> > > -James
> > >
> > > > On Oct 21, 2015, at 8:40 AM, Cliff Rhyne  wrote:
> > > >
> > > > Hi,
> > > >
> > > > My team and I are looking into a problem where the Java high level
> > > consumer
> > > > provides duplicate messages if we turn auto commit off (using version
> > > > 0.8.2.1 of the server and Java client).  The expected sequence of
> > events
> > > > are:
> > > >
> > > > 1. Start high-level consumer and initialize a KafkaStream to get a
> > > > ConsumerIterator
> > > > 2. Consume n items (could be 10,000, could be 1,000,000) from the
> > > iterator
> > > > 3. Commit the new offsets
> > > >
> > > > What we are seeing is that during step 2, some number of the n
> messages
> > > are
> > > > getting returned by the iterator in duplicate (in some cases, we've
> > seen
> > > > n*5 messages consumed).  The problem appears to go away if we turn on
> > > auto
> > > > commit (and committing offsets to kafka helped too), but auto commit
> > > causes
> > > > conflicts with our offset rollback logic.  The issue seems to happen
> > more
> > > > when we are in our test environment on a lower-cost cloud provider.
> > > >
> > > > Diving into the Java and Scala classes including the
> ConsumerIterator,
> > > it's
> > > > not obvious what event causes a duplicate offset to be requested or
> > > > returned (there's even a loop that is supposed to exclude duplicate
> > > > messages in this class).  I tried turning on trace logging but my
> log4j
> > > > config isn't getting the Kafka client logs to write out.
> > > >
> > > > Does anyone have suggestions of where to look or how to enable
> logging?
> > > >
> > > > Thanks,
> > > > Cliff
> > >
> > >
> > > 
> > >
> > > This email and any attachments may contain confidential and privileged
> > > material for the sole use of the intended recipient. Any review,
> copying,
> > > or distribution of this email (or any attachments) by others is
> > prohibited.
> > > If you are not the intended recipient, please contact the sender
> > > immediately and permanently delete this email and any attachments. No
> > > employee or agent of TiVo Inc. is authorized to conclude any binding
> > > agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo
> > > Inc. may only be made by a signed written agreement.
> > >
> >
>
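
For reference, the consume-then-commit sequence described in this thread looks roughly like the following with the 0.8.2 high-level consumer. This is a sketch only; the topic, group, and connection strings are placeholders:

import java.util.Collections;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "example-group");           // placeholder
        props.put("auto.commit.enable", "false");         // commit manually, as in the thread

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        KafkaStream<byte[], byte[]> stream =
            connector.createMessageStreams(Collections.singletonMap("example-topic", 1))
                     .get("example-topic").get(0);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();

        // Step 2: consume n items from the iterator.
        for (int i = 0; i < 10000 && it.hasNext(); i++) {
            it.next(); // process the message
        }
        connector.commitOffsets(); // step 3: commit the new offsets
        connector.shutdown();
    }
}

If consumer.timeout.ms is set, as in scenario 1 above, hasNext() instead throws ConsumerTimeoutException when no message arrives in time, and the application must catch it and resume, as Cliff describes.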


[GitHub] kafka pull request: MINOR: fix checkstyle failures

2015-10-21 Thread SinghAsDev
GitHub user SinghAsDev opened a pull request:

https://github.com/apache/kafka/pull/347

MINOR: fix checkstyle failures

@guozhangwang could you take a look at this? These failures are a bit
annoying, as they never lead to a successful build.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SinghAsDev/kafka fixCheckstyle

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/347.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #347


commit 4dacf8b7c014a186bca2737b5f93e190d58f43ff
Author: Ashish Singh 
Date:   2015-10-21T21:18:51Z

MINOR: fix checkstyle failures




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2671) Enable starting Kafka server with a Properties object

2015-10-21 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967999#comment-14967999
 ] 

Ashish K Singh commented on KAFKA-2671:
---

[~gwenshap], [~guozhang] I have updated the patch, could you guys take a look?
Thanks!

> Enable starting Kafka server with a Properties object
> -
>
> Key: KAFKA-2671
> URL: https://issues.apache.org/jira/browse/KAFKA-2671
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Kafka, as of now, can only be started with a properties file and override
> params. It makes life easier for management applications to be able to start
> Kafka with a Properties object programmatically.
> The changes required to enable this are minimal, just a bit of
> refactoring of kafka.Kafka. The changes must keep current behavior intact.
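
With such a change, a management application could start a broker along these lines. This is a sketch under the assumption that KafkaConfig.fromProps and KafkaServerStartable remain the entry points; the config values are placeholders:

import java.util.Properties;
import kafka.server.KafkaConfig;
import kafka.server.KafkaServerStartable;

public class EmbeddedBrokerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("broker.id", "0");
        props.put("zookeeper.connect", "localhost:2181");
        props.put("log.dirs", "/tmp/kafka-logs");
        KafkaServerStartable broker = new KafkaServerStartable(KafkaConfig.fromProps(props));
        broker.startup();
        // ... later, on shutdown:
        broker.shutdown();
        broker.awaitShutdown();
    }
}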



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2618; Disable SSL renegotiation for 0.9....

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/339


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-2618) Disable SSL renegotiation for 0.9.0.0

2015-10-21 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-2618:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 339
[https://github.com/apache/kafka/pull/339]

> Disable SSL renegotiation for 0.9.0.0
> -
>
> Key: KAFKA-2618
> URL: https://issues.apache.org/jira/browse/KAFKA-2618
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.9.0.0
>
>
> As discussed in KAFKA-2609, we don't have enough tests for SSL renegotiation 
> to be confident that it works well. In addition, neither the clients nor the
> server makes use of renegotiation at this point.
> For 0.9.0.0, we should disable renegotiation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2618) Disable SSL renegotiation for 0.9.0.0

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968006#comment-14968006
 ] 

ASF GitHub Bot commented on KAFKA-2618:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/339


> Disable SSL renegotiation for 0.9.0.0
> -
>
> Key: KAFKA-2618
> URL: https://issues.apache.org/jira/browse/KAFKA-2618
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.9.0.0
>
>
> As discussed in KAFKA-2609, we don't have enough tests for SSL renegotiation 
> to be confident that it works well. In addition, neither the clients nor the
> server makes use of renegotiation at this point.
> For 0.9.0.0, we should disable renegotiation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Ismael Juma
+1 (non-binding)

On Wed, Oct 21, 2015 at 4:17 PM, Flavio Junqueira  wrote:

> Thanks everyone for the feedback so far. At this point, I'd like to start
> a vote for KIP-38.
>
> Summary: Add support for ZooKeeper authentication
> KIP page:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> >
>
> Thanks,
> -Flavio


[GitHub] kafka pull request: KAFKA-2209 - Change quotas dynamically using D...

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/298


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2209) Change client quotas dynamically using DynamicConfigManager

2015-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968139#comment-14968139
 ] 

ASF GitHub Bot commented on KAFKA-2209:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/298


> Change client quotas dynamically using DynamicConfigManager
> ---
>
> Key: KAFKA-2209
> URL: https://issues.apache.org/jira/browse/KAFKA-2209
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2209) Change client quotas dynamically using DynamicConfigManager

2015-10-21 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-2209.

   Resolution: Fixed
Fix Version/s: 0.9.0.0

Issue resolved by pull request 298
[https://github.com/apache/kafka/pull/298]

> Change client quotas dynamically using DynamicConfigManager
> ---
>
> Key: KAFKA-2209
> URL: https://issues.apache.org/jira/browse/KAFKA-2209
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Dong Lin
+1

On Wed, Oct 21, 2015 at 5:01 PM, Jay Kreps  wrote:

> +1
>
> -Jay
>
> On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira  wrote:
>
> > Thanks everyone for the feedback so far. At this point, I'd like to start
> > a vote for KIP-38.
> >
> > Summary: Add support for ZooKeeper authentication
> > KIP page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> > >
> >
> > Thanks,
> > -Flavio
>


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Joel Koshy
+1 binding

On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira  wrote:

> Thanks everyone for the feedback so far. At this point, I'd like to start
> a vote for KIP-38.
>
> Summary: Add support for ZooKeeper authentication
> KIP page:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> >
>
> Thanks,
> -Flavio


Build failed in Jenkins: kafka-trunk-jdk8 #47

2015-10-21 Thread Apache Jenkins Server
See 

Changes:

[jjkoshy] KAFKA-2454; Deadlock between log segment deletion and server shutdown.

[junrao] KAFKA-2618; Disable SSL renegotiation for 0.9.0.0

--
[...truncated 4996 lines...]

kafka.coordinator.MemberMetadataTest > testVoteRaisesOnNoSupportedProtocols 
PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.log.LogManagerTest > testCleanupSegmentsToMaintainSize PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithRelativeDirectory 
PASSED

kafka.log.LogManagerTest > testGetNonExistentLog PASSED

kafka.log.LogManagerTest > testTwoLogManagersUsingSameDirFails PASSED

kafka.log.LogManagerTest > testLeastLoadedAssignment PASSED

kafka.log.LogManagerTest > testCleanupExpiredSegments PASSED

kafka.log.LogManagerTest > testCheckpointRecoveryPoints PASSED

kafka.log.LogManagerTest > testTimeBasedFlush PASSED

kafka.log.LogManagerTest > testCreateLog PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithTrailingSlash PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.FileMessageSetTest > testTruncate PASSED

kafka.log.FileMessageSetTest > testIterationOverPartialAndTruncation PASSED

kafka.log.FileMessageSetTest > testRead PASSED

kafka.log.FileMessageSetTest > testFileSize PASSED

kafka.log.FileMessageSetTest > testIteratorWithLimits PASSED

kafka.log.FileMessageSetTest > testPreallocateTrue PASSED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testPreallocateFalse PASSED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown PASSED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingTopic PASSED

kafka.log.LogTest > testIndexRebuild PASSED

kafka.log.LogTest > testLogRolls PASSED

kafka.log.LogTest > testMessageSizeCheck PASSED

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > 

Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Flavio Junqueira
Ok, thanks for the feedback, Todd. I have updated the KIP with some of the 
points discussed here. There is more to add based on these last comments, 
though.

-Flavio

> On 21 Oct 2015, at 23:43, Todd Palino  wrote:
> 
> On Wed, Oct 21, 2015 at 3:38 PM, Flavio Junqueira  > wrote:
> 
>> 
>>> On 21 Oct 2015, at 21:54, Todd Palino  wrote:
>>> 
>>> Thanks for the clarification on that, Jun. Obviously, we haven't been
>> doing
>>> much with ZK authentication around here yet. There is still a small
>> concern
>>> there, mostly in that you should not share credentials any more than is
>>> necessary, which would argue for being able to use a different ACL than
>> the
>>> default. I don't really like the idea of having to use the exact same
>>> credentials for executing the admin tools as we do for running the
>> brokers.
>>> Given that we don't need to share the credentials with all consumers, I
>>> think we can work around it.
>>> 
>> 
>> Let me add that a feature to separate the sub-trees of users sharing an
>> ensemble is chroot.
>> 
>> On different credentials for admin tools, this sounds doable by setting
>> the ACLs of znodes. For example, there could be an admin id and a broker
>> id, both with the ability of changing znodes, but different credentials.
>> Would something like that work for you?
>> 
> 
> It would be a nice option to have, as the credentials can be protected
> differently. I would consider this a nice to have, and not an "absolutely
> must have" feature at this point.
> 
> 
>> This does bring up another good question, however. What will be the
>> process
>>> for having to rotate the credentials? That is, if the credentials are
>>> compromised and need to be changed, how can that be accomplished with the
>>> cluster online. I'm guessing some combination of using skipAcl on the
>>> Zookeeper ensemble and config changes to the brokers will be required,
>> but
>>> this is an important enough operation that we should make sure it's
>>> reasonable to perform and that it is documented.
>> 
>> Right now there is no kafka support in the plan for this. But this is
>> doable directly through the zk api. Would it be sufficient to write down
>> how to perform such an operation via the zk api or do we need a tool to do
>> it?
>> 
> 
> I think as long as there is a documented procedure for how to do it, that
> will be good enough. It's mostly about making sure that we can, and that we
> don't put something in place that would require downtime to a cluster in
> order to change credentials. We can always develop a tool later if it is a
> requested item.
> 
> Thanks!
> 
> -Todd
> 
> 
> 
>> 
>> -Flavio
>> 
>>> 
>>> 
>>> On Wed, Oct 21, 2015 at 1:23 PM, Jun Rao  wrote:
>>> 
 Parth,
 
 For 2), in your approach, the broker/controller will then always have
>> the
 overhead of resetting the ACL on startup after zookeeper.set.acl is set
>> to
 true. The benefit of using a separate migration tool is that you paid
>> the
 cost only once during upgrade. It is an extra step during the upgrade.
 However, given the other things that you need to do to upgrade to 0.9.0
 (e.g. two rounds of rolling upgrades on all brokers, etc), I am not
>> sure if
 it's worth to optimize away of this step. We probably just need to
>> document
 this clearly.
 
 Todd,
 
 Just to be clear about the shared ZK usage. Once you set
>> CREATOR_ALL_ACL +
 READ_ACL_UNSAFE on a path, only ZK clients with the same user as the
 creator can modify the path. Other ZK clients authenticated with a
 different user can read, but not modify the path. Are you concerned
>> about
 the reads or the writes to ZK?
 
 Thanks,
 
 Jun
 
 
 
 On Wed, Oct 21, 2015 at 10:46 AM, Flavio Junqueira 
>> wrote:
 
> 
>> On 21 Oct 2015, at 18:07, Parth Brahmbhatt <
 pbrahmbh...@hortonworks.com>
> wrote:
>> 
>> I have 2 suggestions:
>> 
>> 1) We need to document how does one move from secure to non secure
>> environment:
>> 1) change the config on all brokers to zookeeper.set.acl = false
> and do a
>> rolling upgrade.
>> 2) Run the migration script with the jaas config file so it is
 sasl
>> authenticated with zookeeper and change the acls on all subtrees back
 to
>> World modifiable.
>> 3) remove the jaas config / or only the zookeeper section from
 the
> jaas,
>> and restart all brokers.
>> 
> 
> Thanks for bringing it up, it makes sense to have a downgrade path and
> document it.
> 
> 
>> 2) I am not sure if we should force users trying to move from unsecure
 to
>> secure environment to execute the migration script. In the second step
>> once the zookeeper.set.acl is set to true, we can secure all the
 subtrees
>> 

Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Flavio Junqueira

> On 21 Oct 2015, at 21:54, Todd Palino  wrote:
> 
> Thanks for the clarification on that, Jun. Obviously, we haven't been doing
> much with ZK authentication around here yet. There is still a small concern
> there, mostly in that you should not share credentials any more than is
> necessary, which would argue for being able to use a different ACL than the
> default. I don't really like the idea of having to use the exact same
> credentials for executing the admin tools as we do for running the brokers.
> Given that we don't need to share the credentials with all consumers, I
> think we can work around it.
> 

Let me add that a feature to separate the sub-trees of users sharing an 
ensemble is chroot.

On different credentials for admin tools, this sounds doable by setting the 
ACLs of znodes. For example, there could be an admin id and a broker id, both 
with the ability of changing znodes, but different credentials. Would something 
like that work for you?

> This does bring up another good question, however. What will be the process
> for having to rotate the credentials? That is, if the credentials are
> compromised and need to be changed, how can that be accomplished with the
> cluster online. I'm guessing some combination of using skipAcl on the
> Zookeeper ensemble and config changes to the brokers will be required, but
> this is an important enough operation that we should make sure it's
> reasonable to perform and that it is documented.

Right now there is no kafka support in the plan for this. But this is doable 
directly through the zk api. Would it be sufficient to write down how to 
perform such an operation via the zk api or do we need a tool to do it?

-Flavio
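
As context for the "directly through the zk api" option: rotating which principal the ACLs point at amounts to re-stamping ACLs across the tree. An illustrative sketch with the plain ZooKeeper Java API (principal names are placeholders; the session running this must itself have permission, or the ensemble's skipAcl escape hatch must be used):

import java.util.Arrays;
import java.util.List;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class RotateAclSketch {
    // Recursively point ACLs at a new SASL principal, keeping world-readable reads.
    static void rotate(ZooKeeper zk, String path, String newPrincipal) throws Exception {
        List<ACL> acls = Arrays.asList(
            new ACL(ZooDefs.Perms.ALL, new Id("sasl", newPrincipal)),
            new ACL(ZooDefs.Perms.READ, ZooDefs.Ids.ANYONE_ID_UNSAFE));
        zk.setACL(path, acls, -1); // -1 matches any ACL version
        for (String child : zk.getChildren(path, false))
            rotate(zk, (path.equals("/") ? "" : path) + "/" + child, newPrincipal);
    }
}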

> 
> 
> On Wed, Oct 21, 2015 at 1:23 PM, Jun Rao  wrote:
> 
>> Parth,
>> 
>> For 2), in your approach, the broker/controller will then always have the
>> overhead of resetting the ACL on startup after zookeeper.set.acl is set to
>> true. The benefit of using a separate migration tool is that you paid the
>> cost only once during upgrade. It is an extra step during the upgrade.
>> However, given the other things that you need to do to upgrade to 0.9.0
>> (e.g. two rounds of rolling upgrades on all brokers, etc), I am not sure if
>> it's worth to optimize away of this step. We probably just need to document
>> this clearly.
>> 
>> Todd,
>> 
>> Just to be clear about the shared ZK usage. Once you set CREATOR_ALL_ACL +
>> READ_ACL_UNSAFE on a path, only ZK clients with the same user as the
>> creator can modify the path. Other ZK clients authenticated with a
>> different user can read, but not modify the path. Are you concerned about
>> the reads or the writes to ZK?
>> 
>> Thanks,
>> 
>> Jun
>> 
>> 
>> 
>> On Wed, Oct 21, 2015 at 10:46 AM, Flavio Junqueira  wrote:
>> 
>>> 
>>>> On 21 Oct 2015, at 18:07, Parth Brahmbhatt <pbrahmbh...@hortonworks.com>
>>>> wrote:
>>>> 
>>>> I have 2 suggestions:
>>>> 
>>>> 1) We need to document how one moves from a secure to a non-secure
>>>> environment:
>>>>   1) change the config on all brokers to zookeeper.set.acl = false and
>>>> do a rolling upgrade.
>>>>   2) Run the migration script with the jaas config file so it is sasl
>>>> authenticated with zookeeper and change the acls on all subtrees back
>>>> to World modifiable.
>>>>   3) remove the jaas config / or only the zookeeper section from the
>>>> jaas, and restart all brokers.
>>>> 
>>> 
>>> Thanks for bringing it up, it makes sense to have a downgrade path and
>>> document it.
>>> 
>>> 
>>>> 2) I am not sure if we should force users trying to move from unsecure
>>>> to secure environment to execute the migration script. In the second step
>>>> once the zookeeper.set.acl is set to true, we can secure all the subtrees
>>>> by calling ensureCorrectAcls as part of broker initialization (just after
>>>> makesurePersistentPathExists). Not sure why we want to add one more
>>>> manual/admin step when it can be automated. This also has the added
>>>> advantage that migration script will not have to take a flag as input to
>>>> figure out if it should set the acls to secure or unsecure given it will
>>>> always be used to move from secure to unsecure.
>>>> 
>>> 
>>> The advantage of the third step is to make a single traversal to change
>>> any remaining znodes with the open ACL. As you suggest, each broker would
>>> do it, so the overhead is much higher. I do agree that eliminating a step
>>> is an advantage, though.
>>> 
>>>> Given we are assuming all the information in zookeeper is world readable,
>>>> I don't see SSL support as a must have or a blocker for this KIP.
>>> 
>>> OK, but keep in mind that SSL is only available in the 3.5 branch of ZK.
>>> 
>>> -Flavio
>>> 
>>>> 
>>>> Thanks
>>>> Parth
>>>> 
>>>> 
>>>> 
>>>> On 10/21/15, 9:56 AM, "Flavio Junqueira"  wrote:
>>>> 
>>>>> 
>>>>>> On 21 Oct 2015, at 17:47, Todd Palino 

Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Todd Palino
On Wed, Oct 21, 2015 at 3:38 PM, Flavio Junqueira  wrote:

>
> > On 21 Oct 2015, at 21:54, Todd Palino  wrote:
> >
> > Thanks for the clarification on that, Jun. Obviously, we haven't been doing
> > much with ZK authentication around here yet. There is still a small concern
> > there, mostly in that you should not share credentials any more than is
> > necessary, which would argue for being able to use a different ACL than the
> > default. I don't really like the idea of having to use the exact same
> > credentials for executing the admin tools as we do for running the brokers.
> > Given that we don't need to share the credentials with all consumers, I
> > think we can work around it.
> >
>
> Let me add that chroot is a feature for separating the sub-trees of users
> sharing an ensemble.
>
> On different credentials for admin tools, this sounds doable by setting
> the ACLs of znodes. For example, there could be an admin id and a broker
> id, both with the ability to change znodes, but with different credentials.
> Would something like that work for you?
>

It would be a nice option to have, as the credentials can be protected
differently. I would consider this a nice to have, and not an "absolutely
must have" feature at this point.


> > This does bring up another good question, however. What will be the
> > process
> > for having to rotate the credentials? That is, if the credentials are
> > compromised and need to be changed, how can that be accomplished with the
> > cluster online. I'm guessing some combination of using skipAcl on the
> > Zookeeper ensemble and config changes to the brokers will be required, but
> > this is an important enough operation that we should make sure it's
> > reasonable to perform and that it is documented.
>
> Right now there is no kafka support in the plan for this. But this is
> doable directly through the zk api. Would it be sufficient to write down
> how to perform such an operation via the zk api or do we need a tool to do
> it?
>

I think as long as there is a documented procedure for how to do it, that
will be good enough. It's mostly about making sure that we can, and that we
don't put something in place that would require downtime to a cluster in
order to change credentials. We can always develop a tool later if it is a
requested item.

Thanks!

-Todd



>
> -Flavio
>
> >
> >
> > On Wed, Oct 21, 2015 at 1:23 PM, Jun Rao  wrote:
> >
> >> Parth,
> >>
> >> For 2), in your approach, the broker/controller will then always have the
> >> overhead of resetting the ACL on startup after zookeeper.set.acl is set to
> >> true. The benefit of using a separate migration tool is that you paid the
> >> cost only once during upgrade. It is an extra step during the upgrade.
> >> However, given the other things that you need to do to upgrade to 0.9.0
> >> (e.g. two rounds of rolling upgrades on all brokers, etc), I am not sure if
> >> it's worth to optimize away of this step. We probably just need to document
> >> this clearly.
> >>
> >> Todd,
> >>
> >> Just to be clear about the shared ZK usage. Once you set CREATOR_ALL_ACL +
> >> READ_ACL_UNSAFE on a path, only ZK clients with the same user as the
> >> creator can modify the path. Other ZK clients authenticated with a
> >> different user can read, but not modify the path. Are you concerned about
> >> the reads or the writes to ZK?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >>
> >> On Wed, Oct 21, 2015 at 10:46 AM, Flavio Junqueira  wrote:
> >>
> >>>
> >>>> On 21 Oct 2015, at 18:07, Parth Brahmbhatt <pbrahmbh...@hortonworks.com>
> >>>> wrote:
> >>>> 
> >>>> I have 2 suggestions:
> >>>> 
> >>>> 1) We need to document how one moves from a secure to a non-secure
> >>>> environment:
> >>>>   1) change the config on all brokers to zookeeper.set.acl = false and
> >>>> do a rolling upgrade.
> >>>>   2) Run the migration script with the jaas config file so it is sasl
> >>>> authenticated with zookeeper and change the acls on all subtrees back
> >>>> to World modifiable.
> >>>>   3) remove the jaas config / or only the zookeeper section from the
> >>>> jaas, and restart all brokers.
> >>>> 
> >>>
> >>> Thanks for bringing it up, it makes sense to have a downgrade path and
> >>> document it.
> >>>
> >>>
> >>>> 2) I am not sure if we should force users trying to move from unsecure
> >>>> to secure environment to execute the migration script. In the second step
> >>>> once the zookeeper.set.acl is set to true, we can secure all the subtrees
> >>>> by calling ensureCorrectAcls as part of broker initialization (just after
> >>>> makesurePersistentPathExists). Not sure why we want to add one more
> >>>> manual/admin step when it can be automated. This also has the added
> >>>> advantage that migration script will not have to take a flag as input to
> >>>> figure out if it should set the acls to secure 

Re: [DISCUSS] KIP-37 - Add namespaces in Kafka

2015-10-21 Thread Ashish Singh
On Wed, Oct 21, 2015 at 2:22 PM, Jay Kreps  wrote:

> Gwen, It's a good question of what the producer semantics are--would we
> only allow you to produce to a partition or first level directory or would
> we hash over whatever subtree you supply? Actually not sure which makes
> more sense...
>
> Ashish, here are some thoughts:
> 1. I think we can do this online. There is a question of what happens to
> readers and writers but presumably it would be the same thing as if that topic
> weren't there. There would be no guarantee this would happen atomic over
> different brokers or clients, though.
> 2. ACLs should work like unix perms, right?


Are you suggesting we should move allowed operations to the R, W, X model of
Unix? Currently, we support these operations.

> I think configs would override
> hierarchically, so we would have a full set of configs for each partition
> computed by walking up the tree from the root and taking the first
> override). I think this is what you're describing, right?
>

Yes.
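
Concretely, the first-override-wins lookup could be sketched like this (the 
path format and the contents of the overrides map are made up for 
illustration):

import java.util.HashMap;
import java.util.Map;

public class HierarchicalConfigSketch {
    // Hypothetical per-path overrides, keyed by namespace path.
    private final Map<String, Map<String, String>> overrides =
        new HashMap<String, Map<String, String>>();

    // Resolve a config for a path like "/dc/ns/topic" by walking up the
    // tree and taking the first (deepest) override found.
    public String resolve(String path, String key, String defaultValue) {
        String current = path;
        while (!current.isEmpty()) {
            Map<String, String> level = overrides.get(current);
            if (level != null && level.containsKey(key))
                return level.get(key);
            int slash = current.lastIndexOf('/');
            if (slash < 0)
                break;
            current = current.substring(0, slash); // step up to the parent
        }
        return defaultValue; // no override anywhere on the path
    }
}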

> 3. Totally agree no reason to have an arbitrary limit.
> 4. I actually don't think the physical layout on disk should be at all
> connected to the logical directory hierarchy we present.


I think it will be useful to have that connection as that will enable users
to encrypt different namespaces with different keys. Thus, one more step
towards a completely multi-tenant system.


> That is, whether
> you use RAID or not shouldn't impact the location of a topic in your
> directory structure.


Even if we make the physical layout on disk representative of the directory
hierarchy, I think this will not be a concern. Correct me if I am missing
something.

> Not sure if this is what you are saying or not. This
> does raise the question of how to do the disk layout. The simplest thing
> would be to keep the flat data directories but make the names of the
> partitions on disk just be logical inode numbers and then have a separate
> mapping of these inodes to logical names stored in ZK with a cache. I think
> this would make things like rename fast and atomic. The downside of this is
> that the 'ls' command will no longer tell you much about the data on a
> broker.
>

Enabling renaming of topics is definitely something that will be nice to
have; however, with the flat structure we won't be able to encrypt
different directories/namespaces with different keys. A directory hierarchy
on disk can still be achieved with logical names, though each directory
will need a logical name.
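
To make the inode idea concrete, a broker-side cache of such a mapping could 
look roughly like this (purely illustrative; in the proposal the authoritative 
mapping would live in ZK):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class LogicalNameMappingSketch {
    // Stand-in for the ZK-backed mapping of on-disk inode numbers to
    // logical paths; the on-disk directory is just the inode, e.g. "42".
    private final ConcurrentMap<Long, String> inodeToLogical =
        new ConcurrentHashMap<Long, String>();
    private final AtomicLong nextInode = new AtomicLong();

    // Creating a partition allocates an inode, so the directory name on
    // disk is independent of the logical path.
    public long create(String logicalPath) {
        long inode = nextInode.getAndIncrement();
        inodeToLogical.put(inode, logicalPath);
        return inode;
    }

    // Rename rewrites only the mapping, so no data moves on disk and the
    // operation can be made atomic in ZK.
    public void rename(long inode, String newLogicalPath) {
        inodeToLogical.put(inode, newLogicalPath);
    }
}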


> -Jay
>
> On Wed, Oct 21, 2015 at 12:43 PM, Ashish Singh 
> wrote:
>
> > In last KIP hangout following questions were raised.
> >
> >1.
> >
> >*Whether or not to support move command? If yes, how do we support it?*
> >I think *move* command will be essential, once we start supporting
> >directories. However, implementation might be a bit convoluted. A few
> >things required for it will be, ability to mark a topic unavailable during
> >the move, update brokers’ metadata cache to reflect the move.
> >2.
> >
> >*How will acls/ configs inheritance work?*
> >Say we have /dc/ns/topic.
> >dc has dc_acl and dc_config. Similarly for ns and topic.
> >For being able to perform an action on /dc/ns/topic, the user must have
> >required perms on dc, ns and topic for that operation. For example, User1
> >will need DESCRIBE permissions on dc, ns and topic to be able to describe
> >/dc/ns/topic.
> >For configs, configs for /dc/ns/topic will be topic_config + ns_config +
> >dc_config, in that order. So, if a config is specified for topic then that
> >will be used, else its parent (ns) will be checked for that config, and
> >this goes on.
> >3.
> >
> >*Will supporting n-deep hierarchy be a concern?*
> >This can be a performance concern, however it sounds more of a misusage
> >of the functionality or bad organization of topics. We can have a depth
> >limit, but I am not sure if it is required.
> >4.
> >
> >*Will we continue to support multi-directory on disk, that was proposed
> >in KAFKA-188?*
> >Yes, we should be able to support that. It is within those directories,
> >namespaces will be created. The heuristics for choosing least loaded
> >disc/dir will remain same.
> >5.
> >
> >*Will it be required to move existing topics from default directory/
> >namespace to a particular directory/ namespace to enable mirror-maker
> >replicate topics in that directory/namespace?*
> >I do not think it will be required, as one can simply add /*/* to
> >mirror-maker’s blacklist and this will only capture topics that exist in
> >default namespace. @Joel, does this answer your question?
> >
> >
> >
> > On Fri, Oct 16, 2015 at 6:33 PM, Ashish Singh 

Jenkins build is back to normal : kafka-trunk-jdk7 #710

2015-10-21 Thread Apache Jenkins Server
See 



[jira] [Commented] (KAFKA-2674) ConsumerRebalanceListener.onPartitionsRevoked() is not called on consumer close

2015-10-21 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968350#comment-14968350
 ] 

Jiangjie Qin commented on KAFKA-2674:
-

[~guozhang] That is a valid concern; I hadn't thought it through before. The 
question is what state could change and whether the changes matter. From the 
consumer's point of view, there are only two pieces of state that matter: 
1) partition assignment and 2) consumer offsets.

In most cases, users would not care about the partition assignment. In cases 
where they do care, the assignment after close would be empty. So it seems 
users do not really lose anything here.

If users are using auto commit, that means they do not really care about the 
offset commit timing as long as the correct offsets are committed. When the 
consumer closes, the correct offsets to commit are the consumed offsets, and 
the consumed offsets will not change unless the user calls poll(). If users 
don't want the committed offsets to change between just before and just after 
the consumer closes, they can either call commitOffsetSync() themselves before 
closing the consumer or disable auto commit, depending on the use case.
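
For instance, a caller who wants a deterministic committed position at 
shutdown could do something like this (a minimal sketch against the new 
consumer API, using commitSync(); the topic and configs are made up):

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ShutdownCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test-group");
        props.put("enable.auto.commit", "false"); // commit manually instead
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer =
            new KafkaConsumer<String, String>(props);
        consumer.subscribe(Collections.singletonList("testA"));
        try {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            // ... process records ...
            // Commit the consumed offsets explicitly so the committed
            // position is fixed before the consumer goes away.
            consumer.commitSync();
        } finally {
            consumer.close();
        }
    }
}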

On the other hand, if users are not using auto commit, we will not commit the 
offsets, so the state won't change.

So I think we are OK here. Do you have some specific use case that breaks?

> ConsumerRebalanceListener.onPartitionsRevoked() is not called on consumer 
> close
> ---
>
> Key: KAFKA-2674
> URL: https://issues.apache.org/jira/browse/KAFKA-2674
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
>Reporter: Michal Turek
>Assignee: Neha Narkhede
>
> Hi, I'm investigating and testing behavior of new consumer from the planned 
> release 0.9 and found an inconsistency in calling of rebalance callbacks.
> I noticed that ConsumerRebalanceListener.onPartitionsRevoked() is NOT called 
> during consumer close and application shutdown. Its JavaDoc contract says:
> - "This method will be called before a rebalance operation starts and after 
> the consumer stops fetching data."
> - "It is recommended that offsets should be committed in this callback to 
> either Kafka or a custom offset store to prevent duplicate data."
> I believe calling consumer.close() is the start of a rebalance operation and even 
> the local consumer that is actually closing should be notified to be able to 
> process any rebalance logic including offsets commit (e.g. if auto-commit is 
> disabled).
> There are commented logs of current and expected behaviors.
> {noformat}
> // Application start
> 2015-10-20 15:14:02.208 INFO  o.a.k.common.utils.AppInfoParser
> [TestConsumer-worker-0]: Kafka version : 0.9.0.0-SNAPSHOT 
> (AppInfoParser.java:82)
> 2015-10-20 15:14:02.208 INFO  o.a.k.common.utils.AppInfoParser
> [TestConsumer-worker-0]: Kafka commitId : 241b9ab58dcbde0c 
> (AppInfoParser.java:83)
> // Consumer started (the first one in group), rebalance callbacks are called 
> including empty onPartitionsRevoked()
> 2015-10-20 15:14:02.333 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, revoked: [] 
> (TestConsumer.java:95)
> 2015-10-20 15:14:02.343 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, assigned: [testB-1, testA-0, 
> testB-0, testB-3, testA-2, testB-2, testA-1, testA-4, testB-4, testA-3] 
> (TestConsumer.java:100)
> // Another consumer joined the group, rebalancing
> 2015-10-20 15:14:17.345 INFO  o.a.k.c.c.internals.Coordinator 
> [TestConsumer-worker-0]: Attempt to heart beat failed since the group is 
> rebalancing, try to re-join group. (Coordinator.java:714)
> 2015-10-20 15:14:17.346 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, revoked: [testB-1, testA-0, 
> testB-0, testB-3, testA-2, testB-2, testA-1, testA-4, testB-4, testA-3] 
> (TestConsumer.java:95)
> 2015-10-20 15:14:17.349 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Rebalance callback, assigned: [testB-3, testA-4, 
> testB-4, testA-3] (TestConsumer.java:100)
> // Consumer started closing, there SHOULD be onPartitionsRevoked() callback 
> to commit offsets like during standard rebalance, but it is missing
> 2015-10-20 15:14:39.280 INFO  c.a.e.kafka.newapi.TestConsumer [main]: 
> Closing instance (TestConsumer.java:42)
> 2015-10-20 15:14:40.264 INFO  c.a.e.kafka.newapi.TestConsumer 
> [TestConsumer-worker-0]: Worker thread stopped (TestConsumer.java:89)
> {noformat}
> A workaround is to call onPartitionsRevoked() explicitly and manually just 
> before calling consumer.close(), but it seems dirty and error-prone to me. It 
> can simply be forgotten by someone without such experience.



--
This message was 

Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Jiangjie Qin
+1 (non-binding)

On Wed, Oct 21, 2015 at 3:40 PM, Joel Koshy  wrote:

> +1 binding
>
> On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira  wrote:
>
> > Thanks everyone for the feedback so far. At this point, I'd like to start
> > a vote for KIP-38.
> >
> > Summary: Add support for ZooKeeper authentication
> > KIP page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> >
> > Thanks,
> > -Flavio
>


[GitHub] kafka pull request: MINOR: fix checkstyle failures

2015-10-21 Thread SinghAsDev
Github user SinghAsDev closed the pull request at:

https://github.com/apache/kafka/pull/347


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #49

2015-10-21 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-2456 KAFKA-2472; SSL clean-ups

--
[...truncated 6820 lines...]

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testSourceTasksStdin PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testTaskClass 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testMultipleSourcesInvalid PASSED

org.apache.kafka.copycat.file.FileStreamSinkConnectorTest > testSinkTasks PASSED

org.apache.kafka.copycat.file.FileStreamSinkConnectorTest > testTaskClass PASSED

org.apache.kafka.copycat.file.FileStreamSinkTaskTest > testPutFlush PASSED
:copycat:json:checkstyleMain
:copycat:json:compileTestJava
warning: [options] bootstrap class path not set in conjunction with -source 1.7
Note: 

 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:copycat:json:processTestResources UP-TO-DATE
:copycat:json:testClasses
:copycat:json:checkstyleTest
:copycat:json:test

org.apache.kafka.copycat.json.JsonConverterTest > longToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToJsonConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
nullSchemaAndMapNonStringKeysToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndMapToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCopycatSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToCopycatConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaPrimitiveToCopycat 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatNonStringKeys 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToCopycat PASSED
:copycat:runtime:checkstyleMain
:copycat:runtime:compileTestJava
warning: [options] bootstrap class path not set in conjunction with -source 1.7
Note: Some input files use unchecked or unsafe operations.

Re: [DISCUSS] KIP-38: ZooKeeper Authentication

2015-10-21 Thread Flavio Junqueira

> On 21 Oct 2015, at 18:07, Parth Brahmbhatt  
> wrote:
> 
> I have 2 suggestions:
> 
> 1) We need to document how one moves from a secure to a non-secure
> environment: 
>   1) change the config on all brokers to zookeeper.set.acl = false and do a
> rolling upgrade.
>   2) Run the migration script with the jaas config file so it is sasl
> authenticated with zookeeper and change the acls on all subtrees back to
> World modifiable.
>   3) remove the jaas config / or only the zookeeper section from the jaas,
> and restart all brokers.
> 

Thanks for bringing it up, it makes sense to have a downgrade path and document 
it.


> 2) I am not sure if we should force users trying to move from unsecure to
> secure environment to execute the migration script. In the second step
> once the zookeeper.set.acl is set to true, we can secure all the subtrees
> by calling ensureCorrectAcls as part of broker initialization (just after
> makesurePersistentPathExists). Not sure why we want to add one more
> manual/admin step when it can be automated. This also has the added
> advantage that migration script will not have to take a flag as input to
> figure out if it should set the acls to secure or unsecure given it will
> always be used to move from secure to unsecure.
> 

The advantage of the third step is to make a single traversal to change any 
remaining znodes with the open ACL. As you suggest, each broker would do it, so 
the overhead is much higher. I do agree that eliminating a step is an 
advantage, though.
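
For what it's worth, that single traversal could be sketched roughly as below, 
using the plain ZooKeeper client. The method name echoes the ensureCorrectAcls 
you mention; error handling and the actual ACL list are omitted:

import java.util.List;

import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;

public class AclTraversalSketch {
    // Depth-first walk that stamps the desired ACLs onto a whole subtree.
    static void ensureCorrectAcls(ZooKeeper zk, String path, List<ACL> acls)
            throws Exception {
        zk.setACL(path, acls, -1); // -1 skips the ACL version check
        for (String child : zk.getChildren(path, false)) {
            String childPath =
                path.equals("/") ? "/" + child : path + "/" + child;
            ensureCorrectAcls(zk, childPath, acls);
        }
    }
}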

> Given we are assuming all the information in zookeeper is world readable,
> I don't see SSL support as a must have or a blocker for this KIP.

OK, but keep in mind that SSL is only available in the 3.5 branch of ZK.

-Flavio

> 
> Thanks
> Parth
> 
> 
> 
> On 10/21/15, 9:56 AM, "Flavio Junqueira"  wrote:
> 
>> 
>>> On 21 Oct 2015, at 17:47, Todd Palino  wrote:
>>> 
>>> There seems to be a bit of detail lacking in the KIP. Specifically, I'd
>>> like to understand:
>>> 
>>> 1) What znodes are the brokers going to secure? Is this configurable?
>>> How?
>> 
>> Currently it is securing all paths here except the consumers one:
>> 
>> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L56
>> 
>> This isn't configurable at the moment.
>> 
>>> 2) What ACL is the broker going to apply? Is this configurable?
>> 
>> The default is CREATOR_ALL_ACL + READ_ACL_UNSAFE, which means that an
>> authenticated client can manipulate secured znodes and everyone can read
>> znodes. The API of ZkUtils accommodates other ACLs, but the current code
>> is using the default.
>> 
>>> 3) How will the admin tools (such as preferred replica election and
>>> partition reassignment) interact with this?
>>> 
>> 
>> Currently, you need to set a system property passing the login config
>> file to be able to authenticate the client and perform writes to ZK.
>> 
>> -Flavio
>> 
>>> -Todd
>>> 
>>> 
>>> On Wed, Oct 21, 2015 at 9:16 AM, Ismael Juma  wrote:
>>> 
 On Wed, Oct 21, 2015 at 5:04 PM, Flavio Junqueira 
 wrote:
 
> Bringing the points Grant brought to this thread:
> 
>> Is it worth mentioning the follow up steps that were discussed in the
 KIP
>> call in this KIP document? Some of them were:
>> 
>> - Adding SSL support for Zookeeper
>> - Removing the "world readable" assumption
>> 
> 
> Grant, how would you do it? I see three options:
> 
> 1- Add to the existing KIP, but then the functionality we should be
> checking in soon won't include it, so the KIP will remain incomplete
> 
 
 A "Future work" section would make sense to me, but I don't know how
 this
 is normally handled.
 
 Ismael
 
>> 
> 



[GitHub] kafka pull request: KAFKA-2667: Fix transient error in KafkaBasedL...

2015-10-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/333


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Todd Palino
+1 (non-binding)

On Wed, Oct 21, 2015 at 6:53 PM, Jiangjie Qin 
wrote:

> +1 (non-binding)
>
> On Wed, Oct 21, 2015 at 3:40 PM, Joel Koshy  wrote:
>
> > +1 binding
> >
> > On Wed, Oct 21, 2015 at 8:17 AM, Flavio Junqueira 
> wrote:
> >
> > > Thanks everyone for the feedback so far. At this point, I'd like to
> start
> > > a vote for KIP-38.
> > >
> > > Summary: Add support for ZooKeeper authentication
> > > KIP page:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> > >
> > > Thanks,
> > > -Flavio
> >
>


Build failed in Jenkins: kafka-trunk-jdk8 #48

2015-10-21 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-2209; Change quotas dynamically using DynamicConfigManager

--
[...truncated 4557 lines...]

org.apache.kafka.copycat.file.FileStreamSourceTaskTest > testNormalLifecycle 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceTaskTest > testMissingTopic PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testSourceTasks 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testSourceTasksStdin PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testTaskClass 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testMultipleSourcesInvalid PASSED
:copycat:json:checkstyleMain
:copycat:json:compileTestJava
warning: [options] bootstrap class path not set in conjunction with -source 1.7
Note: 

 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:copycat:json:processTestResources UP-TO-DATE
:copycat:json:testClasses
:copycat:json:checkstyleTest
:copycat:json:test

org.apache.kafka.copycat.json.JsonConverterTest > longToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToJsonConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
nullSchemaAndMapNonStringKeysToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndMapToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCopycatSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToCopycatConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaPrimitiveToCopycat 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatNonStringKeys 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToCopycat PASSED
:copycat:runtime:checkstyleMain
:copycat:runtime:compileTestJava
warning: [options] bootstrap class path not set in conjunction with -source 1.7
Note: Some 

Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Aditya Auradkar
+1 (non-binding).

On Wed, Oct 21, 2015 at 3:57 PM, Ismael Juma  wrote:

> +1 (non-binding)
>
> On Wed, Oct 21, 2015 at 4:17 PM, Flavio Junqueira  wrote:
>
> > Thanks everyone for the feedback so far. At this point, I'd like to start
> > a vote for KIP-38.
> >
> > Summary: Add support for ZooKeeper authentication
> > KIP page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> >
> > Thanks,
> > -Flavio
>

