[GitHub] kafka pull request #1977: HOTFIX: Cannot Stop Kafka with Shell Script (Solut...

2016-10-05 Thread Mabin-J
GitHub user Mabin-J opened a pull request:

https://github.com/apache/kafka/pull/1977

HOTFIX: Cannot Stop Kafka with Shell Script (Solution 2)

If Kafka's home path is long, Kafka cannot be stopped with 'kafka-server-stop.sh'.

That command shows this message:
```
No kafka server to stop
```

This bug is caused by the command line being too long, as shown below.
```
/home/bbdev/Amasser/etc/alternatives/jre/bin/java -Xms1G -Xmx5G -server 
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC -Djava.awt.headless=true 
-Xloggc:/home/bbdev/Amasser/var/log/kafka/kafkaServer-gc.log -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
-Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false 
-Dkafka.logs.dir=/home/bbdev/Amasser/var/log/kafka 
-Dlog4j.configuration=file:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../config/log4j.properties
 -cp 
:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/argparse4j-0.5.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-api-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-file-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../li
 
bs/connect-json-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-runtime-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/guava-18.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-api-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-locator-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-utils-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-annotations-2.6.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-core-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-databind-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jav
 
assist-3.18.2-GA.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.annotation-api-1.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.inject-1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.inject-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-client-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-common-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-container-servlet-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-guava-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-media-jaxb-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-
 
server-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jopt-simple-4.9.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1-sources.jar:/home/bbdev/Amasser/etc
 
/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1-test-sources.jar:/home/bbdev/Amasser/etc/alternatives/kafka
```

But that is not the full command line. The full command line is:
```
/home/bbdev/Amasser/etc/alternatives/jre/bin/java -Xms1G -Xmx5G -server 
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC -Djava.awt.headless=true 
-Xloggc:/home/bbdev/Amasser/var/log/kafka/kafkaServer-gc.log -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
-Dcom.sun.management.jmxremote 

[GitHub] kafka pull request #1976: HOTFIX: Cannot Stop Kafka with Shell Script (Solut...

2016-10-05 Thread Mabin-J
GitHub user Mabin-J opened a pull request:

https://github.com/apache/kafka/pull/1976

HOTFIX: Cannot Stop Kafka with Shell Script (Solution 1)

If Kafka's home path is long, Kafka cannot be stopped with 'kafka-server-stop.sh'.

That command shows this message:
```
No kafka server to stop
```

This bug is caused by the command line being too long, as shown below.
```
/home/bbdev/Amasser/etc/alternatives/jre/bin/java -Xms1G -Xmx5G -server 
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC -Djava.awt.headless=true 
-Xloggc:/home/bbdev/Amasser/var/log/kafka/kafkaServer-gc.log -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
-Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false 
-Dkafka.logs.dir=/home/bbdev/Amasser/var/log/kafka 
-Dlog4j.configuration=file:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../config/log4j.properties
 -cp 
:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/argparse4j-0.5.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-api-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-file-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../li
 
bs/connect-json-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-runtime-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/guava-18.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-api-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-locator-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-utils-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-annotations-2.6.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-core-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-databind-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jav
 
assist-3.18.2-GA.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.annotation-api-1.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.inject-1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.inject-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-client-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-common-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-container-servlet-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-guava-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-media-jaxb-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-
 
server-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jopt-simple-4.9.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1-sources.jar:/home/bbdev/Amasser/etc
 
/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1-test-sources.jar:/home/bbdev/Amasser/etc/alternatives/kafka
```

But that is not the full command line. The full command line is:
```
/home/bbdev/Amasser/etc/alternatives/jre/bin/java -Xms1G -Xmx5G -server 
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC -Djava.awt.headless=true 
-Xloggc:/home/bbdev/Amasser/var/log/kafka/kafkaServer-gc.log -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
-Dcom.sun.management.jmxremote 

[GitHub] kafka-site pull request #21: Theme enhancement

2016-10-05 Thread derrickdoo
Github user derrickdoo closed the pull request at:

https://github.com/apache/kafka-site/pull/21


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1975: HOTFIX: Cannot Stop with 'kafka-server-stop.sh'

2016-10-05 Thread Mabin-J
Github user Mabin-J closed the pull request at:

https://github.com/apache/kafka/pull/1975


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-3374) Failure in security rolling upgrade phase 2 system test

2016-10-05 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3374.

   Resolution: Fixed
     Assignee: Flavio Junqueira  (was: Ben Stopford)
Fix Version/s: 0.10.1.0  (was: 0.10.1.1, 0.10.0.2)

We think the underlying cause for these failures is the same as KAFKA-3985, 
which has now been fixed. Closing this.

> Failure in security rolling upgrade phase 2 system test
> ---
>
> Key: KAFKA-3374
> URL: https://issues.apache.org/jira/browse/KAFKA-3374
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Reporter: Ismael Juma
>Assignee: Flavio Junqueira
> Fix For: 0.10.1.0
>
>
> [~geoffra] reported the following a few days ago.
> Seeing fairly consistent failures in
> "Module: kafkatest.tests.security_rolling_upgrade_test
> Class:  TestSecurityRollingUpgrade
> Method: test_rolling_upgrade_phase_two
> Arguments:
> {
>   "broker_protocol": "SASL_PLAINTEXT",
>   "client_protocol": "SASL_SSL"
> }
> Last successful run (git hash): 2a58ba9
> First failure: f7887bd
> (note failures are not 100% consistent, so there's non-zero chance the commit 
> that introduced the failure is prior to 2a58ba9)
> See for example:
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-03-08--001.1457454171--apache--trunk--f6e35de/report.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1975: HOTFIX: Cannot Stop with 'kafka-server-stop.sh'

2016-10-05 Thread Mabin-J
GitHub user Mabin-J opened a pull request:

https://github.com/apache/kafka/pull/1975

HOTFIX: Cannot Stop with 'kafka-server-stop.sh'

If Kafka's home path is long, Kafka cannot be stopped with 'kafka-server-stop.sh'.

That command shows this message:
```
No kafka server to stop
```

This bug is caused by the command line being too long, as shown below.
```
/home/bbdev/Amasser/etc/alternatives/jre/bin/java -Xms1G -Xmx5G -server 
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC -Djava.awt.headless=true 
-Xloggc:/home/bbdev/Amasser/var/log/kafka/kafkaServer-gc.log -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
-Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false 
-Dkafka.logs.dir=/home/bbdev/Amasser/var/log/kafka 
-Dlog4j.configuration=file:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../config/log4j.properties
 -cp 
:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/argparse4j-0.5.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-api-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-file-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../li
 
bs/connect-json-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/connect-runtime-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/guava-18.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-api-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-locator-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/hk2-utils-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-annotations-2.6.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-core-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-databind-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jav
 
assist-3.18.2-GA.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.annotation-api-1.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.inject-1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.inject-2.4.0-b34.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-client-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-common-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-container-servlet-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-guava-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-media-jaxb-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jersey-
 
server-2.22.2.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/jopt-simple-4.9.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1.jar:/home/bbdev/Amasser/etc/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1-sources.jar:/home/bbdev/Amasser/etc
 
/alternatives/kafka/bin/../libs/kafka_2.11-0.10.0.1-test-sources.jar:/home/bbdev/Amasser/etc/alternatives/kafka
```

But that is not the full command line. The full command line is:
```
/home/bbdev/Amasser/etc/alternatives/jre/bin/java -Xms1G -Xmx5G -server 
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC -Djava.awt.headless=true 
-Xloggc:/home/bbdev/Amasser/var/log/kafka/kafkaServer-gc.log -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
-Dcom.sun.management.jmxremote 

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread Mayuresh Gharat
@Kostya

Regarding "To get around this we have an awful *cough* solution whereby we
have to send our message wrapper with the headers and null content, and
then we have an application that has to consume from all the compacted
topics and when it sees this message it produces back in a null payload
record to make the broker compact it out."

 ---> This has a race condition, right?

Suppose the producer sends a message with headers and null content to Kafka at
time T0.

Then the producer, at time T0 + 1, sends another message with headers and
actual content to Kafka.

What we expect is that the application that consumes and then re-produces the
same message with a null payload does so at time T0 + 0.5, so that the
message at T0 + 1 is not deleted.

But there is no guarantee here.

If the null payload goes into Kafka at time T0 + 2, then you essentially
lose the second message produced by the producer at time T0 + 1.


Thanks,

Mayuresh

On Wed, Oct 5, 2016 at 6:13 PM, Joel Koshy  wrote:

> @Nacho
>
> > > - Brokers can't see the headers (part of the "V" black box)>
> >
>
>
> > (Also, it would be nice if we had a way to access the headers from the
> > > brokers, something that is not trivial at this time with the current
> > broker
> > > architecture).
> >
> >
>
> I think this can be addressed with broker interceptors which we touched on
> in KIP-42
>  42%3A+Add+Producer+and+Consumer+Interceptors>
> .
>
> @Gwen
>
> You are right that the wrapper thingy “works”, but there are some drawbacks
> that Nacho and Radai have covered in detail that I can add a few more
> comments to.
>
> At LinkedIn, we *get by* without the proposed Kafka record headers by
> dumping such metadata in one or two places:
>
>- Most of our applications use Avro, so for the most part we can use an
>explicit header field in the Avro schema. Topic owners are supposed to
>include this header in their schemas.
>- A prefix to the payload that primarily contains the schema’s ID so we
>can deserialize the Avro. (We could use this for other use-cases as
> well -
>i.e., move some of the above into this prefix blob.)
>
> Dumping headers in the Avro schema pollutes the application’s data model
> with data/service-infra-related fields that are unrelated to the underlying
> topic; and forces the application to deserialize the entire blob whether or
> not the headers are actually used. Conversely from an infrastructure
> perspective, we would really like to not touch any application data. Our
> infiltration of the application’s schema is a major reason why many at
> LinkedIn sometimes assume that we (Kafka folks) are the shepherds for all
> things Avro :)
>
> Another drawback is that all this only works if everyone in the
> organization is a good citizen and includes the header; and uses our
> wrapper libraries - which is a good practice IMO - but may not always be
> easy for open source projects that wish to directly use the Apache
> producer/client. If instead we allow these headers to be inserted via
> suitable interceptors outside the application payloads it would remove such
> issues of separation in the data model and choice of clients.
>
> Radai has enumerated a number of use-cases
>  Case+for+Kafka+Headers>
> and
> I’m sure the broader community will have a lot more to add. The feature as
> such would enable an ecosystem of plugins from different vendors that users
> can mix and match in their data pipelines without requiring any specific
> payload formats or client libraries.
>
> Thanks,
>
> Joel
>
>
>
> > >
> > >
> > > On Wed, Oct 5, 2016 at 2:20 PM, Gwen Shapira 
> wrote:
> > >
> > > > Since LinkedIn has some kind of wrapper thingy that adds the headers,
> > > > where they could have added them to Apache Kafka - I'm very curious
> to
> > > > hear what drove that decision and the pros/cons of managing the
> > > > headers outside Kafka itself.
> > > >
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Build failed in Jenkins: kafka-0.10.1-jdk7 #48

2016-10-05 Thread Apache Jenkins Server
See 

Changes:

[ismael] KAFKA-3985; Transient system test failure

--
[...truncated 5372 lines...]

kafka.api.UserQuotaTest > testProducerConsumerOverrideUnthrottled STARTED

kafka.api.UserQuotaTest > testProducerConsumerOverrideUnthrottled PASSED

kafka.api.UserQuotaTest > testThrottledProducerConsumer STARTED

kafka.api.UserQuotaTest > testThrottledProducerConsumer PASSED

kafka.api.UserQuotaTest > testQuotaOverrideDelete STARTED

kafka.api.UserQuotaTest > testQuotaOverrideDelete PASSED

kafka.api.PlaintextProducerSendTest > testSerializerConstructors STARTED

kafka.api.PlaintextProducerSendTest > testSerializerConstructors PASSED

kafka.api.PlaintextProducerSendTest > 
testSendCompressedMessageWithLogAppendTime STARTED

kafka.api.PlaintextProducerSendTest > 
testSendCompressedMessageWithLogAppendTime PASSED

kafka.api.PlaintextProducerSendTest > testAutoCreateTopic STARTED

kafka.api.PlaintextProducerSendTest > testAutoCreateTopic PASSED

kafka.api.PlaintextProducerSendTest > testSendWithInvalidCreateTime STARTED

kafka.api.PlaintextProducerSendTest > testSendWithInvalidCreateTime PASSED

kafka.api.PlaintextProducerSendTest > testWrongSerializer STARTED

kafka.api.PlaintextProducerSendTest > testWrongSerializer PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithLogAppendTime STARTED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithLogAppendTime PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithCreateTime STARTED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithCreateTime PASSED

kafka.api.PlaintextProducerSendTest > testClose STARTED

kafka.api.PlaintextProducerSendTest > testClose PASSED

kafka.api.PlaintextProducerSendTest > testFlush STARTED

kafka.api.PlaintextProducerSendTest > testFlush PASSED

kafka.api.PlaintextProducerSendTest > testSendToPartition STARTED

kafka.api.PlaintextProducerSendTest > testSendToPartition PASSED

kafka.api.PlaintextProducerSendTest > testSendOffset STARTED

kafka.api.PlaintextProducerSendTest > testSendOffset PASSED

kafka.api.PlaintextProducerSendTest > testSendCompressedMessageWithCreateTime 
STARTED

kafka.api.PlaintextProducerSendTest > testSendCompressedMessageWithCreateTime 
PASSED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromCallerThread 
STARTED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromCallerThread 
PASSED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromSenderThread 
STARTED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromSenderThread 
PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaAssign STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaAssign PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaAssign STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaAssign PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoGroupAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl 
PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testProduceConsumeViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testProduceConsumeViaSubscribe PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoProduceWithoutDescribeAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoProduceWithoutDescribeAcl PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionWithTopicAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > testPatternSubscriptionWithNoTopicAccess 
STARTED

kafka.api.AuthorizerIntegrationTest > testPatternSubscriptionWithNoTopicAccess 
PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededToReadFromNonExistentTopic STARTED


Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread Joel Koshy
@Nacho

> > - Brokers can't see the headers (part of the "V" black box)>
>


> (Also, it would be nice if we had a way to access the headers from the
> > brokers, something that is not trivial at this time with the current
> broker
> > architecture).
>
>

I think this can be addressed with broker interceptors which we touched on
in KIP-42
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-42%3A+Add+Producer+and+Consumer+Interceptors>.

@Gwen

You are right that the wrapper thingy “works”, but there are some drawbacks
that Nacho and Radai have covered in detail that I can add a few more
comments to.

At LinkedIn, we *get by* without the proposed Kafka record headers by
dumping such metadata in one or two places:

   - Most of our applications use Avro, so for the most part we can use an
   explicit header field in the Avro schema. Topic owners are supposed to
   include this header in their schemas.
   - A prefix to the payload that primarily contains the schema’s ID so we
   can deserialize the Avro. (We could use this for other use-cases as well -
   i.e., move some of the above into this prefix blob; a rough sketch of such a prefix follows below.)
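
For illustration, a minimal sketch of what such a schema-ID payload prefix could look like, assuming a one-byte marker followed by a 4-byte schema ID in front of the Avro bytes (the exact layout and names here are assumptions, not LinkedIn's actual format):

```
import java.nio.ByteBuffer;

/**
 * Sketch of a payload prefix carrying a schema ID in front of the Avro bytes.
 * Assumed layout: [magic byte][4-byte schema id][avro-encoded bytes].
 */
public final class SchemaIdPrefix {
    private static final byte MAGIC = 0x0;

    // Prepend the marker and schema ID to the serialized Avro payload.
    public static byte[] wrap(int schemaId, byte[] avroBytes) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + avroBytes.length);
        buf.put(MAGIC);
        buf.putInt(schemaId);
        buf.put(avroBytes);
        return buf.array();
    }

    // Read the schema ID back out; the bytes after it are the Avro payload.
    public static int schemaIdOf(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        if (buf.get() != MAGIC) {
            throw new IllegalArgumentException("Unknown payload format");
        }
        return buf.getInt();
    }
}
```

A consumer would read the ID, look up the schema, and deserialize the remaining bytes, which is exactly the kind of infra concern that ends up tangled with application payloads.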

Dumping headers in the Avro schema pollutes the application’s data model
with data/service-infra-related fields that are unrelated to the underlying
topic; and forces the application to deserialize the entire blob whether or
not the headers are actually used. Conversely from an infrastructure
perspective, we would really like to not touch any application data. Our
infiltration of the application’s schema is a major reason why many at
LinkedIn sometimes assume that we (Kafka folks) are the shepherds for all
things Avro :)

Another drawback is that all this only works if everyone in the
organization is a good citizen and includes the header; and uses our
wrapper libraries - which is a good practice IMO - but may not always be
easy for open source projects that wish to directly use the Apache
producer/client. If instead we allow these headers to be inserted via
suitable interceptors outside the application payloads it would remove such
issues of separation in the data model and choice of clients.

Radai has enumerated a number of use-cases
<https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers>
and
I’m sure the broader community will have a lot more to add. The feature as
such would enable an ecosystem of plugins from different vendors that users
can mix and match in their data pipelines without requiring any specific
payload formats or client libraries.

Thanks,

Joel



> >
> >
> > On Wed, Oct 5, 2016 at 2:20 PM, Gwen Shapira  wrote:
> >
> > > Since LinkedIn has some kind of wrapper thingy that adds the headers,
> > > where they could have added them to Apache Kafka - I'm very curious to
> > > hear what drove that decision and the pros/cons of managing the
> > > headers outside Kafka itself.
> > >
>


Build failed in Jenkins: kafka-trunk-jdk8 #949

2016-10-05 Thread Apache Jenkins Server
See 

Changes:

[ismael] KAFKA-3985; Transient system test failure

--
[...truncated 7700 lines...]
kafka.api.ProducerFailureHandlingTest > 
testResponseTooLargeForReplicationWithAckAll STARTED

kafka.api.ProducerFailureHandlingTest > 
testResponseTooLargeForReplicationWithAckAll PASSED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic STARTED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic PASSED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition STARTED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition PASSED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed STARTED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed PASSED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero STARTED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero PASSED

kafka.api.ProducerFailureHandlingTest > 
testPartitionTooLargeForReplicationWithAckAll STARTED

kafka.api.ProducerFailureHandlingTest > 
testPartitionTooLargeForReplicationWithAckAll PASSED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown STARTED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown PASSED

kafka.api.RackAwareAutoTopicCreationTest > testAutoCreateTopic STARTED

kafka.api.RackAwareAutoTopicCreationTest > testAutoCreateTopic PASSED

kafka.api.SaslPlaintextConsumerTest > testCoordinatorFailover STARTED

kafka.api.SaslPlaintextConsumerTest > testCoordinatorFailover PASSED

kafka.api.SaslPlaintextConsumerTest > testSimpleConsumption STARTED

kafka.api.SaslPlaintextConsumerTest > testSimpleConsumption PASSED

kafka.api.SslConsumerTest > testCoordinatorFailover STARTED

kafka.api.SslConsumerTest > testCoordinatorFailover PASSED

kafka.api.SslConsumerTest > testSimpleConsumption STARTED

kafka.api.SslConsumerTest > testSimpleConsumption PASSED

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures STARTED

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures PASSED

kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures STARTED

kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionWithTopicAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > testPatternSubscriptionWithNoTopicAccess 
STARTED

kafka.api.AuthorizerIntegrationTest > testPatternSubscriptionWithNoTopicAccess 
PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededToReadFromNonExistentTopic STARTED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededToReadFromNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionWithTopicDescribeOnlyAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionWithTopicDescribeOnlyAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > testDeleteWithWildCardAuth STARTED

kafka.api.AuthorizerIntegrationTest > testDeleteWithWildCardAuth PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicRead STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicRead PASSED

kafka.api.AuthorizerIntegrationTest > testListOffsetsWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testListOffsetsWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic STARTED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionMatchingInternalTopicWithDescribeOnlyPermission STARTED

kafka.api.AuthorizerIntegrationTest > 
testPatternSubscriptionMatchingInternalTopicWithDescribeOnlyPermission PASSED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithOffsetLookupAndNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithOffsetLookupAndNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithNoTopicAccess STARTED


Jenkins build is back to normal : kafka-trunk-jdk7 #1604

2016-10-05 Thread Apache Jenkins Server
See 



Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread K Burstev
Thank you Radai,
 
Ints are good; I was just asking more out of curiosity, to understand the reasoning 
in the KIP.
 
In your reply to Gwen you gave some insight into LinkedIn's internal solution. 
We do the same, and have a custom Avro wrapper also, but suffer from the same 
issues that are detailed in the KIP, especially around compacted topics.
 
How does LinkedIn transport headers on a compacted topic, where you need to send 
a null payload to indicate a delete (both so the broker can compact the key away 
and so consumers can remove it from their state), but still need the headers on 
that record because they carry infra metadata used for features like single-message 
tracing and routing?
 
Or do you currently suffer the same issue as I do, and simply cannot transport 
headers on a delete record in a compacted topic?
 
To get around this we have an awful *cough* solution whereby we send our message 
wrapper with the headers and null content, and then we have an application that 
consumes from all the compacted topics and, when it sees such a message, produces 
back a true null-payload record so that the broker compacts it out.
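
For illustration only, a minimal sketch of that kind of "republisher", assuming the 0.10 Java clients and a hypothetical isWrapperWithNullContent() check on our wrapper format (topic and config values are placeholders): it consumes the compacted topic and, whenever it sees a wrapper whose inner content is null, produces a real null-value record for the same key so the broker can compact it out.

```
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TombstoneRepublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "tombstone-republisher");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            consumer.subscribe(Collections.singletonList("some-compacted-topic"));
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(1000L);
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // Hypothetical check: wrapper carries headers but a null inner payload.
                    if (isWrapperWithNullContent(record.value())) {
                        // Produce a real tombstone (null value) so log compaction removes the key.
                        producer.send(new ProducerRecord<>(record.topic(), record.key(), null));
                    }
                }
            }
        }
    }

    // Placeholder for inspecting the in-house wrapper format (not shown here).
    private static boolean isWrapperWithNullContent(byte[] value) {
        return false; // real logic depends on the wrapper's encoding
    }
}
```

This extra moving part, and the race it introduces, is exactly what native headers would remove.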
 
Our solution is so flaky that it has caused a few production issues. The fact that 
we, and it seems everyone else, have to build these kinds of workarounds is terrible. 
I cannot wait to burn our solution once we can have true first-class-citizen headers 
provided by Kafka.
 
Kostya

05.10.2016, 22:34, "radai" :
> Linkedin currently just forces everyone using kafka to:
> 1. use avro as payload
> 2. add company-wide headers into their avro schema
>
> this leads to problems with teams that would like to produce arbitrary
> blobs while still benefitting from company-wide infra.
>
> it also means every change to these fields is a company-wide version bump
> across literally hundreds of source repositories.
>
> moving to "standard" headers would allow much more rapid development
> iteration and roll-out of features and capabilities (not to mention enable
> arbitrary binary payloads)
>
> why not strings - space efficiency. some of our payloads are very small
> (<50 bytes) which would make the headers dominate the bandwidth.
>
> header ordering - its mostly a micro-optimization. coupled with the
> namespacing suggestion it would make it faster to check is any "system
> headers" (low ids) exist. this could later be combined with lazy parsing of
> the headers blob to minimize the overhead of server side plugins.
>
> On Wed, Oct 5, 2016 at 2:20 PM, Gwen Shapira  wrote:
>
>>  Since LinkedIn has some kind of wrapper thingy that adds the headers,
>>  where they could have added them to Apache Kafka - I'm very curious to
>>  hear what drove that decision and the pros/cons of managing the
>>  headers outside Kafka itself.
>>
>>  On Wed, Oct 5, 2016 at 2:16 PM, K Burstev  wrote:
>>  > +1
>>  >
>>  > This is a much needed feature, one I also have been waiting for, and that
>>  > Kafka has been too long without.
>>  >
>>  > Especially since compaction the custom wrapper solution does not work
>>  where
>>  > you want a null payload but need the record to have headers.
>>  >
>>  > (It actually made me sign up to the mailing list so I could reply, as up
>>  > until now I just browse the archives and forums)
>>  >
>>  >
>>  > In general the KIP looks great. The solution address's all my core needs.
>>  > Really hope this makes it to the next release after the current one.
>>  >
>>  >
>>  > Questions:
>>  >
>>  > 1) Why not String,String headers?
>>  >
>>  > I assume reading the KIP it is for message size but surely compression
>>  would
>>  > greatly reduce this overhead with Strings.
>>  >
>>  > Many systems in the eco-sphere that kafka sits in, like JMS and Flume use
>>  > String,String headers as such it would be easier to integrate with these
>>  > frameworks/systems, as they can simply map across the headers.
>>  >
>>  >
>>  > 2) Key Allocation
>>  >
>>  > If you do keep with int keys why not make the key allocation the proposed
>>  > why is it an example. The example makes sense after all, and seems very
>>  > sensible and clean.
>>  >
>>  > 3) Header Ordering
>>  >
>>  > I would avoid this as per your proposed between the two options and keep
>>  > them un-ordered.
>>  > There are many clients not maintained by the core community and also
>>  > internally in many companies, that would need to implement it. Whilst
>>  > trivial it complicates matters, its easier to just expect an unordered
>>  set
>>  > as will be converted to a map client side anyhow.
>>  >
>>  > Kostya
>>  >
>>  >
>>  >
>>  > On 04/10/2016 23:35, radai wrote:
>>  >>
>>  >> another potential benefit of headers is it would reduce the number of
>>  API
>>  >> changes required to support future features (as they could be
>>  implemented
>>  >> as plugins).
>>  >> that would greatly accelerate the rate with which kafka can be extended.
>>  >>
>>  >> On Mon, Oct 3, 2016 at 12:46 PM, 

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread radai
@Kostya - we do not support compacted topics in combination with in-house
headers. internal systems that use compacted topics (and need to be able to
send null values) use the open source producer and cannot use some of the
company infra.

yet another good use case for "official" header support :-)

On Wed, Oct 5, 2016 at 3:44 PM, Nacho Solis 
wrote:

> There are pros and cons to having the headers be completely layered above
> the broker (and hence wire level messages), let's look at the base ones.
>
> A few pros (of headers being a higher layer thing):
> - No broker changes
> - No protocol changes
> - Messages with headers can work with brokers that don't support them.
> - Multiple header styles can be used (different applications and middleware
> can use different header styles, or no headers at all)
>
> However, there are a number of cons
> - Brokers can't see the headers (part of the "V" black box)
> - No unified header system
>
>
> Now let's look at some higher level issues.
>
> Kafka messages are identified by their topic and partition. There is no way
> to identify what the data is, for that matter what type. There are normally
> 2 schools of thought in this area. In one, we identify the content type
> using some field in the enclosing layer (let's call this the Type-field
> approach). This is the approach the IP takes (and Ethernet).  There is a
> field telling you what the type inside is. The other method is the
> self-contained approach. We can call this the Self-Describing approach.
>
> Kafka does not offer the Type-field approach. There is no way to describe
> the structure of what's inside the Key and Value. This means that the Key
> and Value are only able to be interpreted if you know a priori what they
> have, or if they have some way to self-identify what they are. Some
> protocols achieve this by the use of magic numbers.
>
> If we don't have a protocol level header system, you will need to rely on
> some external method to identify what type of data is being sent and
> whether it has headers or not.  Right now, the different deployments rely
> on assumptions and some setups of a schema registry or some other
> hack. Many Kafka users rely on some mapping of avro, topic name,
> registration system, etc. This in and of itself causes issues with
> extensibility, versioning and backward/forward compatibility.
>
> A good example is auditing and tracing, where you force people to modify
> their avro schemas with any change to the audit information. The same would
> be true for provenance (if, for example, we wanted a universal way to sign
> the contents of a message).
>
> The need to add structured data to messages is universal. Organizations and
> developers will find ways around it.  Somebody might implement some
> validation in their Kafka client wrappers to ensure that their developers
> are using approved schemas, others will do encapsulation, others would just
> delegate this to the application itself, why not let them worry about what
> cluster they're going to?
>
> The benefit of having a base system that is universal is great because we
> can achieve large benefits for all:
> - structured data can be carried with the message (this data could come
> from middleware, the stack or even brokers)
> - a community of plugins and addons can develop (some open source some
> commercial?)
>
> Right now what we have seen is a number of organizations having developed
> their own in house systems.
>
> To be clear, Gwen is right that it's possible for LinkedIn to add this to
> our kafka client wrappers; we just think that it would be something
> valuable to everybody.
>
>
> (Also, it would be nice if we had a way to access the headers from the
> brokers, something that is not trivial at this time with the current broker
> architecture).
>
>
> Nacho
>
>
>
>
> On Wed, Oct 5, 2016 at 2:20 PM, Gwen Shapira  wrote:
>
> > Since LinkedIn has some kind of wrapper thingy that adds the headers,
> > where they could have added them to Apache Kafka - I'm very curious to
> > hear what drove that decision and the pros/cons of managing the
> > headers outside Kafka itself.
> >
> > On Wed, Oct 5, 2016 at 2:16 PM, K Burstev  wrote:
> > > +1
> > >
> > > This is a much needed feature, one I also have been waiting for, and
> that
> > > Kafka has been too long without.
> > >
> > > Especially since compaction the custom wrapper solution does not work
> > where
> > > you want a null payload but need the record to have headers.
> > >
> > > (It actually made me sign up to the mailing list so I could reply, as
> up
> > > until now I just browse the archives and forums)
> > >
> > >
> > > In general the KIP looks great. The solution address's all my core
> needs.
> > > Really hope this makes it to the next release after the current one.
> > >
> > >
> > > Questions:
> > >
> > > 1) Why not String,String headers?
> > >
> > > I assume reading the KIP it is for 

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread K Burstev

+1

This is a much needed feature, one I also have been waiting for, and 
that Kafka has been too long without.


Especially since compaction, the custom wrapper solution does not work 
where you want a null payload but need the record to have headers.


(It actually made me sign up to the mailing list so I could reply, as up 
until now I just browse the archives and forums)



In general the KIP looks great. The solution addresses all my core 
needs. I really hope this makes it into the next release after the current one.



Questions:

1) Why not String,String headers?

I assume, reading the KIP, that it is for message size, but surely compression 
would greatly reduce this overhead with Strings (see the sketch after question 3 below).


Many systems in the ecosystem that Kafka sits in, like JMS and Flume, 
use String,String headers, so it would be easier to integrate with 
these frameworks/systems, as they could simply map across the headers.



2) Key Allocation

If you do keep int keys, why not make the key allocation part of the 
proposal rather than just an example? The example makes sense after all, and 
seems very sensible and clean.


3) Header Ordering

Of your two proposed options I would avoid ordering and keep the headers 
un-ordered.
There are many clients not maintained by the core community, and also 
internal ones in many companies, that would need to implement it. Whilst 
trivial, it complicates matters; it's easier to just expect an unordered 
set, as it will be converted to a map client side anyhow.
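
For reference, a minimal sketch (my own illustration, not the KIP's API) of the two header shapes compared in questions 1 and 3: a compact int-keyed set with opaque byte[] values versus JMS/Flume-style String,String headers; either can be materialized as an unordered map on the client side.

```
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class HeaderShapes {
    public static void main(String[] args) {
        // Int-keyed style discussed in this thread: compact keys with opaque byte[] values.
        // Key ranges could be namespaced, e.g. low ids reserved for "system" headers.
        Map<Integer, byte[]> intKeyedHeaders = new HashMap<>();
        intKeyedHeaders.put(1, "trace-abc123".getBytes(StandardCharsets.UTF_8)); // e.g. tracing id
        intKeyedHeaders.put(50001, new byte[] {0x01});                           // e.g. app-specific flag

        // JMS/Flume style: String keys and values, larger on the wire but self-describing.
        Map<String, String> stringHeaders = new HashMap<>();
        stringHeaders.put("trace.id", "abc123");
        stringHeaders.put("routing.flag", "1");

        // Either wire form can simply be exposed as an unordered map to applications.
        System.out.println(intKeyedHeaders.size() + " int-keyed, " + stringHeaders.size() + " string headers");
    }
}
```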


Kostya


On 04/10/2016 23:35, radai wrote:

another potential benefit of headers is it would reduce the number of API
changes required to support future features (as they could be implemented
as plugins).
that would greatly accelerate the rate with which kafka can be extended.

On Mon, Oct 3, 2016 at 12:46 PM, Michael Pearce 
wrote:


Opposite way around v4 instead of v3 ;)

From: Michael Pearce
Sent: Monday, October 3, 2016 8:45 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers

Thanks guys for spotting this, i have updated KIP-82 to state v3 instead
of v4.

Mike.

From: Becket Qin 
Sent: Monday, October 3, 2016 6:18 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers

Yes, KIP-74 has already been checked in. The new FetchRequest/Response
version should be V4. :)

On Mon, Oct 3, 2016 at 10:14 AM, Sean McCauliff <
smccaul...@linkedin.com.invalid> wrote:


Change to public interfaces:

"Add ProduceRequest/ProduceResponse V3 which uses the new message format.
Add FetchRequest/FetchResponse V3 which uses the new message format."

When I look at org.apache.kafka.common.requests.FetchResponse on
master I see that there is already a version 3.  Seems like this is
from a recent commit about implementing KIP-74.  Do we need to
coordinate these changes with KIP-74?


"The serialisation of the [int, bye[]] header set will on the wire
using a strict format"  bye[] -> byte[]

Sean
--
Sean McCauliff
Staff Software Engineer
Kafka

smccaul...@linkedin.com
linkedin.com/in/sean-mccauliff-b563192


On Fri, Sep 30, 2016 at 3:43 PM, radai wrote:

I think headers are a great idea.

Right now, people who are trying to implement any sort of org-wide
functionality like monitoring, tracing, profiling etc pretty much have to
define their own wrapper layers, which probably leads to everyone
implementing their own variants of the same underlying functionality.

I think a common base for headers would allow implementing a lot of this
functionality only once, in a way that different header-based capabilities
could be shared and composed, and open the door to a wide range of possible
Kafka middleware that's simply impossible to write against the current API.

Here's a list of things that could be implemented as "plugins" on top of a
header mechanism (full list here -
https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers).
A lot of these already exist within LinkedIn and could for example be open
sourced if Kafka had headers. I'm fairly certain the same is true in other
organizations using Kafka.



On Thu, Sep 22, 2016 at 12:31 PM, Michael Pearce <michael.pea...@ig.com> wrote:


Hi All,


I would like to discuss the following KIP proposal:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers



I have some initial drafts of roughly the changes that would be needed.
This is nowhere near finalized and I look forward to the discussion,
especially as there are some bits I'm personally in two minds about.

https://github.com/michaelandrepearce/kafka/tree/kafka-headers-properties



Here is a link to an alternative option mentioned in the KIP, but one I
would personally discard (disadvantages mentioned in the KIP):

https://github.com/michaelandrepearce/kafka/tree/kafka-headers-full?


Thanks

Mike





The information 

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread Nacho Solis
There are pros and cons to having the headers be completely layered above
the broker (and hence wire level messages), let's look at the base ones.

A few pros (of headers being a higher layer thing):
- No broker changes
- No protocol changes
- Messages with headers can work with brokers that don't support them.
- Multiple header styles can be used (different applications and middleware
can use different header styles, or no headers at all)

However, there are a number of cons
- Brokers can't see the headers (part of the "V" black box)
- No unified header system


Now let's look at some higher level issues.

Kafka messages are identified by their topic and partition. There is no way
to identify what the data is, for that matter what type. There are normally
2 schools of thought in this area. In one, we identify the content type
using some field in the enclosing layer (let's call this the Type-field
approach). This is the approach IP takes (and Ethernet). There is a
field telling you what the type inside is. The other method is the
self-contained approach. We can call this the Self-Describing approach.

Kafka does not offer the Type-field approach. There is no way to describe
the structure of what's inside the Key and Value. This means that the Key
and Value are only able to be interpreted if you know a priori what they
have, or if they have some way to self-identify what they are. Some
protocols achieve this by the use of magic numbers.

If we don't have a protocol level header system, you will need to rely on
some external method to identify what type of data is being sent and
whether it has headers or not.  Right now, the different deployments rely
on assumptions and some setups of a schema registry or some other
hack. Many Kafka users rely on some mapping of avro, topic name,
registration system, etc. This in and of itself causes issues with
extensibility, versioning and backward/forward compatibility.

A good example is auditing and tracing, where you force people to modify
their avro schemas with any change to the audit information. The same would
be true for provenance (if, for example, we wanted a universal way to sign
the contents of a message).

The need to add structured data to messages is universal. Organizations and
developers will find ways around it. Somebody might implement some
validation in their Kafka client wrappers to ensure that their developers
are using approved schemas, others will do encapsulation, and others will
just delegate this to the application itself (why not let it worry about
what cluster it's going to?).

The benefit of having a base system that is universal is great because we
can achieve large benefits for all:
- structured data can be carried with the message (this data could come
from middleware, the stack or even brokers)
- a community of plugins and addons can develop (some open source some
commercial?)

Right now what we have seen is a number of organizations having developed
their own in house systems.

To be clear, Gwen is right that it's possible for LinkedIn to add this to
our kafka client wrappers; we just think that it would be something
valuable to everybody.


(Also, it would be nice if we had a way to access the headers from the
brokers, something that is not trivial at this time with the current broker
architecture).


Nacho




On Wed, Oct 5, 2016 at 2:20 PM, Gwen Shapira  wrote:

> Since LinkedIn has some kind of wrapper thingy that adds the headers,
> where they could have added them to Apache Kafka - I'm very curious to
> hear what drove that decision and the pros/cons of managing the
> headers outside Kafka itself.
>
> On Wed, Oct 5, 2016 at 2:16 PM, K Burstev  wrote:
> > +1
> >
> > This is a much needed feature, one I also have been waiting for, and that
> > Kafka has been too long without.
> >
> > Especially since compaction the custom wrapper solution does not work
> where
> > you want a null payload but need the record to have headers.
> >
> > (It actually made me sign up to the mailing list so I could reply, as up
> > until now I just browse the archives and forums)
> >
> >
> > In general the KIP looks great. The solution address's all my core needs.
> > Really hope this makes it to the next release after the current one.
> >
> >
> > Questions:
> >
> > 1) Why not String,String headers?
> >
> > I assume reading the KIP it is for message size but surely compression
> would
> > greatly reduce this overhead with Strings.
> >
> > Many systems in the eco-sphere that kafka sits in, like JMS and Flume use
> > String,String headers as such it would be easier to integrate with these
> > frameworks/systems, as they can simply map across the headers.
> >
> >
> > 2) Key Allocation
> >
> > If you do keep with int keys why not make the key allocation the proposed
> > why is it an example. The example makes sense after all, and seems very
> > sensible and clean.
> >
> > 3) Header Ordering
> >
> > I would avoid 

Re: Snazzy new look to our website

2016-10-05 Thread Avi Flax

> On Oct 4, 2016, at 19:38, Guozhang Wang  wrote:
> 
> The new look is great, thanks Derrick and Gwen!

+1 it’s excellent!

> 
> I'm wondering if we should still consider breaking "document.html" into
> multiple pages and indexed as sub-topics on the left bar?

+1 absolutely, this should _definitely_ be multiple pages. It’s pretty 
challenging to work with this one giant page.



Software Architect @ Park Assist
We’re hiring! http://tech.parkassist.com/jobs/



[GitHub] kafka-site pull request #21: Theme enhancement

2016-10-05 Thread derrickdoo
GitHub user derrickdoo opened a pull request:

https://github.com/apache/kafka-site/pull/21

Theme enhancement

- Added Apache branding guideline stuff
- 302 redirects for all the odd direct links to partials that are currently 
floating around
- Improved mobile scroll interaction and layout
- Sub-nav functionality on documentation

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/derrickdoo/kafka-site themeEnhancement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/21.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21


commit bc1a27d7e1564e1fe8d2ec154bc00c590aadd1ea
Author: Derrick Or 
Date:   2016-10-05T04:50:41Z

improve mobile interaction with nav

improve interaction with mobile nav

commit 00fc868b4336283c072e7089de6c5ac399f896f5
Author: Derrick Or 
Date:   2016-10-05T18:10:57Z

rewrite rules added to prevent users from loading up html files meant to be 
partials

commit 5699af74eca1dc28593c1f2652e337bbdc825582
Author: Derrick Or 
Date:   2016-10-05T20:23:13Z

add notification over past versions of docs and rename partials

commit 9535d7b551cac20d7368d2b1555a2f3e6a8242e9
Author: Derrick Or 
Date:   2016-10-05T21:46:23Z

make subnav system and add required apache branding links

commit 87c23a616fcd69f259b61a59d31cc9c439c72943
Author: Derrick Or 
Date:   2016-10-05T22:25:53Z

added subnav for docs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15550072#comment-15550072
 ] 

ASF GitHub Bot commented on KAFKA-3985:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1973


> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Flavio Junqueira
> Fix For: 0.10.1.0
>
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1973: KAFKA-3985: Transient system test failure ZooKeepe...

2016-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1973


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3985.

   Resolution: Fixed
Fix Version/s: 0.10.1.0

Issue resolved by pull request 1973
[https://github.com/apache/kafka/pull/1973]

> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Flavio Junqueira
> Fix For: 0.10.1.0
>
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread radai
LinkedIn currently just forces everyone using Kafka to:
1. use Avro as the payload
2. add company-wide headers into their Avro schema

This leads to problems for teams that would like to produce arbitrary
blobs while still benefitting from company-wide infra.

It also means every change to these fields is a company-wide version bump
across literally hundreds of source repositories.

Moving to "standard" headers would allow much more rapid development
iteration and roll-out of features and capabilities (not to mention enable
arbitrary binary payloads).

Why not strings - space efficiency. Some of our payloads are very small
(<50 bytes), which would make the headers dominate the bandwidth.

Header ordering - it's mostly a micro-optimization. Coupled with the
namespacing suggestion, it would make it faster to check whether any "system
headers" (low ids) exist. This could later be combined with lazy parsing of
the headers blob to minimize the overhead of server-side plugins.
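
To make the int-vs-String point concrete, here is a minimal sketch of an
[int key -> byte[] value] header set. The fixed-width layout, the key ranges
and the class name are illustrative assumptions, not the KIP-82 wire format:

```
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: an [int key -> byte[] value] header set with a naive
// fixed-width layout. Low ids are treated as "system" headers per the
// namespacing idea above; none of this is the actual KIP-82 format.
public class HeaderSetSketch {
    private final Map<Integer, byte[]> headers = new TreeMap<>(); // sorted, so low ids come first

    public void put(int key, byte[] value) {
        headers.put(key, value);
    }

    public boolean hasSystemHeaders(int maxSystemId) {
        // TreeMap keeps keys sorted, so the lowest key tells us immediately
        // whether any "system" (low id) header is present.
        return !headers.isEmpty() && headers.keySet().iterator().next() <= maxSystemId;
    }

    /** Layout: [count][key(4) len(4) value(len)]* -- a 4-byte int key versus a
     *  String key that would cost its UTF-8 length plus a length prefix. */
    public byte[] serialize() {
        int size = 4;
        for (byte[] v : headers.values())
            size += 8 + v.length;
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(headers.size());
        for (Map.Entry<Integer, byte[]> e : headers.entrySet()) {
            buf.putInt(e.getKey());
            buf.putInt(e.getValue().length);
            buf.put(e.getValue());
        }
        return buf.array();
    }
}
```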



On Wed, Oct 5, 2016 at 2:20 PM, Gwen Shapira  wrote:

> Since LinkedIn has some kind of wrapper thingy that adds the headers,
> where they could have added them to Apache Kafka - I'm very curious to
> hear what drove that decision and the pros/cons of managing the
> headers outside Kafka itself.
>
> On Wed, Oct 5, 2016 at 2:16 PM, K Burstev  wrote:
> > +1
> >
> > This is a much-needed feature, one I have also been waiting for, and one
> > that Kafka has been too long without.
> >
> > Especially since compaction, the custom wrapper solution does not work
> > where you want a null payload but need the record to have headers.
> >
> > (It actually made me sign up to the mailing list so I could reply; until
> > now I have just browsed the archives and forums.)
> >
> >
> > In general the KIP looks great. The solution addresses all my core needs.
> > I really hope this makes it into the next release after the current one.
> >
> >
> > Questions:
> >
> > 1) Why not String,String headers?
> >
> > Reading the KIP, I assume it is for message size, but surely compression
> > would greatly reduce that overhead for Strings.
> >
> > Many systems in the ecosystem that Kafka sits in, such as JMS and Flume,
> > use String,String headers, so it would be easier to integrate with those
> > frameworks/systems, since they could simply map the headers across.
> >
> >
> > 2) Key Allocation
> >
> > If you do keep int keys, why not make the proposed key allocation the
> > actual scheme rather than just an example? The example makes sense, and
> > seems very sensible and clean.
> >
> > 3) Header Ordering
> >
> > Of the two options you propose, I would avoid ordering and keep the headers
> > un-ordered.
> > There are many clients not maintained by the core community, and many more
> > internal to companies, that would need to implement it. Whilst trivial, it
> > complicates matters; it's easier to just expect an unordered set, since it
> > will be converted to a map client-side anyhow.
> >
> > Kostya
> >
> >
> >
> > On 04/10/2016 23:35, radai wrote:
> >>
> >> Another potential benefit of headers is that it would reduce the number
> >> of API changes required to support future features (as they could be
> >> implemented as plugins).
> >> That would greatly accelerate the rate at which Kafka can be extended.
> >>
> >> On Mon, Oct 3, 2016 at 12:46 PM, Michael Pearce 
> >> wrote:
> >>
> >>> Opposite way around v4 instead of v3 ;)
> >>> 
> >>> From: Michael Pearce
> >>> Sent: Monday, October 3, 2016 8:45 PM
> >>> To: dev@kafka.apache.org
> >>> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> >>>
> >>> Thanks guys for spotting this, i have updated KIP-82 to state v3
> instead
> >>> of v4.
> >>>
> >>> Mike.
> >>> 
> >>> From: Becket Qin 
> >>> Sent: Monday, October 3, 2016 6:18 PM
> >>> To: dev@kafka.apache.org
> >>> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> >>>
> >>> Yes, KIP-74 has already been checked in. The new FetchRequest/Response
> >>> version should be V4. :)
> >>>
> >>> On Mon, Oct 3, 2016 at 10:14 AM, Sean McCauliff <
> >>> smccaul...@linkedin.com.invalid> wrote:
> >>>
>  Change to public interfaces:
> 
>  "Add ProduceRequest/ProduceResponse V3 which uses the new message
>  format.
>  Add FetchRequest/FetchResponse V3 which uses the new message format."
> 
>  When I look at org.apache.kafka.common.requests.FetchResponse on
>  master I see that there is already a version 3.  Seems like this is
>  from a recent commit about implementing KIP-74.  Do we need to
>  coordinate these changes with KIP-74?
> 
> 
>  "The serialisation of the [int, bye[]] header set will on the wire
>  using a strict format"  bye[] -> byte[]
> 
>  Sean
>  --
>  Sean McCauliff
>  Staff Software Engineer
>  Kafka
> 
>  smccaul...@linkedin.com
>  

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-05 Thread Gwen Shapira
Since LinkedIn has some kind of wrapper thingy that adds the headers,
where they could have added them to Apache Kafka - I'm very curious to
hear what drove that decision and the pros/cons of managing the
headers outside Kafka itself.

On Wed, Oct 5, 2016 at 2:16 PM, K Burstev  wrote:
> +1
>
> This is a much-needed feature, one I have also been waiting for, and one that
> Kafka has been too long without.
>
> Especially since compaction, the custom wrapper solution does not work where
> you want a null payload but need the record to have headers.
>
> (It actually made me sign up to the mailing list so I could reply; until now I
> have just browsed the archives and forums.)
>
>
> In general the KIP looks great. The solution addresses all my core needs. I
> really hope this makes it into the next release after the current one.
>
>
> Questions:
>
> 1) Why not String,String headers?
>
> Reading the KIP, I assume it is for message size, but surely compression would
> greatly reduce that overhead for Strings.
>
> Many systems in the ecosystem that Kafka sits in, such as JMS and Flume, use
> String,String headers, so it would be easier to integrate with those
> frameworks/systems, since they could simply map the headers across.
>
>
> 2) Key Allocation
>
> If you do keep int keys, why not make the proposed key allocation the actual
> scheme rather than just an example? The example makes sense, and seems very
> sensible and clean.
>
> 3) Header Ordering
>
> Of the two options you propose, I would avoid ordering and keep the headers
> un-ordered.
> There are many clients not maintained by the core community, and many more
> internal to companies, that would need to implement it. Whilst trivial, it
> complicates matters; it's easier to just expect an unordered set, since it will
> be converted to a map client-side anyhow.
>
> Kostya
>
>
>
> On 04/10/2016 23:35, radai wrote:
>>
>> Another potential benefit of headers is that it would reduce the number of API
>> changes required to support future features (as they could be implemented
>> as plugins).
>> That would greatly accelerate the rate at which Kafka can be extended.
>>
>> On Mon, Oct 3, 2016 at 12:46 PM, Michael Pearce 
>> wrote:
>>
>>> Opposite way around v4 instead of v3 ;)
>>> 
>>> From: Michael Pearce
>>> Sent: Monday, October 3, 2016 8:45 PM
>>> To: dev@kafka.apache.org
>>> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>>>
>>> Thanks guys for spotting this, i have updated KIP-82 to state v3 instead
>>> of v4.
>>>
>>> Mike.
>>> 
>>> From: Becket Qin 
>>> Sent: Monday, October 3, 2016 6:18 PM
>>> To: dev@kafka.apache.org
>>> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>>>
>>> Yes, KIP-74 has already been checked in. The new FetchRequest/Response
>>> version should be V4. :)
>>>
>>> On Mon, Oct 3, 2016 at 10:14 AM, Sean McCauliff <
>>> smccaul...@linkedin.com.invalid> wrote:
>>>
 Change to public interfaces:

 "Add ProduceRequest/ProduceResponse V3 which uses the new message
 format.
 Add FetchRequest/FetchResponse V3 which uses the new message format."

 When I look at org.apache.kafka.common.requests.FetchResponse on
 master I see that there is already a version 3.  Seems like this is
 from a recent commit about implementing KIP-74.  Do we need to
 coordinate these changes with KIP-74?


 "The serialisation of the [int, bye[]] header set will on the wire
 using a strict format"  bye[] -> byte[]

 Sean
 --
 Sean McCauliff
 Staff Software Engineer
 Kafka

 smccaul...@linkedin.com
 linkedin.com/in/sean-mccauliff-b563192


 On Fri, Sep 30, 2016 at 3:43 PM, radai 
>>>
>>> wrote:
>
> I think headers are a great idea.
>
> Right now, people who are trying to implement any sort of org-wide
> functionality like monitoring, tracing, profiling etc pretty much have
>>>
>>> to
>
> define their own wrapper layers, which probably leads to everyone
> implementing their own variants of the same underlying functionality.
>
> I think a common base for headers would allow implementing a lot of
>>>
>>> this
>
> functionality only one in a way that different header-based
>>>
>>> capabilities
>
> could be shared and composed and open the door the a wide range of

 possible
>
> Kafka middleware that's simply impossible to write against the current

 API.
>
> Here's a list of things that could be implemented as "plugins" on top
>>>
>>> of

 a
>
> header mechanism (full list here -
> https://cwiki.apache.org/confluence/display/KAFKA/A+

 Case+for+Kafka+Headers).
>
> A lot of these already exist within LinkedIn and could for example be

 open
>
> sourced if Kafka had headers. I'm fairly 

Re: [VOTE] KIP-47 - Add timestamp-based log deletion policy

2016-10-05 Thread Bill Warshaw
Bumping for visibility.  KIP is here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
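
For readers new to the proposal, the policy boils down to the predicate sketched
below: segments whose newest timestamp is older than a user-supplied minimum
become deletable. This is only an illustration; the class shape and the idea of
a minimum-timestamp setting are assumptions for the sketch, not the exact KIP-47
wording.

```
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of timestamp-based log deletion: not Kafka's LogManager,
// just the core check KIP-47 proposes to add alongside size/time retention.
public class TimestampRetentionSketch {
    static final class LogSegment {
        final long baseOffset;
        final long largestTimestampMs; // e.g. from the KIP-33 time index
        LogSegment(long baseOffset, long largestTimestampMs) {
            this.baseOffset = baseOffset;
            this.largestTimestampMs = largestTimestampMs;
        }
    }

    /** Segments deletable under a hypothetical minimum-timestamp retention setting. */
    static List<LogSegment> deletableSegments(List<LogSegment> segments, long minTimestampMs) {
        List<LogSegment> deletable = new ArrayList<>();
        for (LogSegment segment : segments) {
            if (segment.largestTimestampMs < minTimestampMs)
                deletable.add(segment); // everything older than the watermark goes
        }
        return deletable;
    }
}
```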

On Wed, Aug 24, 2016 at 2:32 PM Bill Warshaw  wrote:

> Hello Guozhang,
>
> KIP-71 seems unrelated to this KIP.  KIP-47 is just adding a new deletion
> policy (minimum timestamp), while KIP-71 is allowing deletion and
> compaction to coexist.
>
> They both will touch LogManager, but the change for KIP-47 is very
> isolated.
>
> On Wed, Aug 24, 2016 at 2:21 PM Guozhang Wang  wrote:
>
> Hi Bill,
>
> I would like to reason if there is any correlation between this KIP and
> KIP-71
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-71%3A+Enable+log+compaction+and+deletion+to+co-exist
>
> I feel they are orthogonal but would like to double check with you.
>
>
> Guozhang
>
>
> On Wed, Aug 24, 2016 at 11:05 AM, Bill Warshaw 
> wrote:
>
> > I'd like to re-awaken this voting thread now that KIP-33 has merged.
> This
> > KIP is now completely unblocked.  I have a working branch off of trunk
> with
> > my proposed fix, including testing.
> >
> > On Mon, May 9, 2016 at 8:30 PM Guozhang Wang  wrote:
> >
> > > Jay, Bill:
> > >
> > > I'm thinking of one general use case of using timestamp rather than
> > offset
> > > for log deletion, which is that for expiration handling in data
> > > replication, when the source data store decides to expire some data
> > records
> > > based on their timestamps, today we need to configure the corresponding
> > > Kafka changelog topic for compaction, and actively send a tombstone for
> > > each expired record. Since expiration usually happens with a bunch of
> > > records, this could generate large tombstone traffic. For example I
> think
> > > LI's data replication for Espresso is seeing similar issues and they
> are
> > > just not sending tombstone at all.
> > >
> > > With timestamp based log deletion policy, this can be easily handled by
> > > simply setting the current expiration timestamp; but ideally one would
> > > prefer to configure this topic to be both log compaction enabled as
> well
> > as
> > > log deletion enabled. From that point of view, I feel that current KIP
> > > still has value to be accepted.
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, May 2, 2016 at 2:37 PM, Bill Warshaw 
> > wrote:
> > >
> > > > Yes, I'd agree that offset is a more precise configuration than
> > > timestamp.
> > > > If there was a way to set a partition-level configuration, I would
> > rather
> > > > use log.retention.min.offset than timestamp.  If you have an approach
> > in
> > > > mind I'd be open to investigating it.
> > > >
> > > > On Mon, May 2, 2016 at 5:33 PM, Jay Kreps  wrote:
> > > >
> > > > > Gotcha, good point. But barring that limitation, you agree that
> that
> > > > makes
> > > > > more sense?
> > > > >
> > > > > -Jay
> > > > >
> > > > > On Mon, May 2, 2016 at 2:29 PM, Bill Warshaw 
> > > > wrote:
> > > > >
> > > > > > The problem with offset as a config option is that offsets are
> > > > > > partition-specific, so we'd need a per-partition config.  This
> > would
> > > > work
> > > > > > for our particular use case, where we have single-partition
> topics,
> > > but
> > > > > for
> > > > > > multiple-partition topics it would delete from all partitions
> based
> > > on
> > > > a
> > > > > > global topic-level offset.
> > > > > >
> > > > > > On Mon, May 2, 2016 at 4:32 PM, Jay Kreps 
> > wrote:
> > > > > >
> > > > > > > I think you are saying you considered a kind of trim() api that
> > > would
> > > > > > > synchronously chop off the tail of the log starting from a
> given
> > > > > offset.
> > > > > > > That would be one option, but what I was saying was slightly
> > > > different:
> > > > > > in
> > > > > > > the proposal you have where there is a config that controls
> > > retention
> > > > > > that
> > > > > > > the user would update, wouldn't it make more sense for this
> > config
> > > to
> > > > > be
> > > > > > > based on offset rather than timestamp?
> > > > > > >
> > > > > > > -Jay
> > > > > > >
> > > > > > > On Mon, May 2, 2016 at 12:53 PM, Bill Warshaw <
> > wdwars...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > 1.  Initially I looked at using the actual offset, by adding
> a
> > > call
> > > > > to
> > > > > > > > AdminUtils to just delete anything in a given topic/partition
> > to
> > > a
> > > > > > given
> > > > > > > > offset.  I ran into a lot of trouble here trying to work out
> > how
> > > > the
> > > > > > > system
> > > > > > > > would recognize that every broker had successfully deleted
> that
> > > > range
> > > > > > > from
> > > > > > > > the partition before returning to the client.  If we were ok
> > > > treating
> > > > > > > this
> > > > > > > > as a completely asynchronous operation I would be open to
> > > > revisiting
> 

[GitHub] kafka pull request #1974: AN-75711 Spike for Kafka server side changes to pr...

2016-10-05 Thread silpamittapalli
GitHub user silpamittapalli opened a pull request:

https://github.com/apache/kafka/pull/1974

AN-75711 Spike for Kafka server side changes to prevent a broker from being 
leader



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/silpamittapalli/kafka 
sm-kafka-broker-ineligible-leader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1974


commit a942fb68abef7b81d5102927d7a1e90f015e9d39
Author: Silpa Mittapalli 
Date:   2016-10-05T20:37:52Z

AN-75711 Spike for Kafka server side changes to prevent a broker from being 
leader




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] Fault injection tests for Kafka

2016-10-05 Thread Gwen Shapira
YES!

One of my goals for the fault-injection in our system tests is that
whoever fixes the issue will also add tests to make sure it stays
fixed.

On Wed, Oct 5, 2016 at 11:33 AM, Tom Crayford  wrote:
> I did some stuff like this recently with simple calls to `tc` (samples that
> I used were in the README for https://github.com/tylertreat/comcast). The
> only notable bug I found so far is that if you cut all the Kafka nodes
> entirely off from ZooKeeper for, say, 60 seconds, then reconnect them, the
> nodes don't crash and they report as healthy in JMX, but calls to fetch
> metadata from them time out entirely. That can be fixed with a rolling
> restart, but it doesn't sound ideal (especially in the face of cloud
> networks, where short-lived total network outages can and do happen).
> Should I file a Jira detailing that bug?
>
> On Wed, Oct 5, 2016 at 7:26 PM, Gwen Shapira  wrote:
>
>> Yeah, totally agree on discussing what we want to test first and
>> implement anything later :)
>>
>> It's just that whenever I have this discussion Jepsen comes up, so I was
>> curious what was driving the interest and whether the specific
>> framework is important to the community.
>>
>> On Tue, Oct 4, 2016 at 5:46 PM, Joel Koshy  wrote:
>> > Hi Gwen,
>> >
>> > I've also seen suggestions of using Jepsen for fault injection, but
>> >> I'm not familiar with this framework.
>> >>
>> >> What do you guys think? Write our own failure injection? or write
>> >> Kafka tests in Jepsen?
>> >>
>> >
>> > This would definitely add a lot of value and save a lot on release
>> > validation overheads. I have heard of Jepsen (via the blog), but haven't
>> > used it. At LinkedIn a couple of infra teams have been using Simoorg
>> >  which being python-based would
>> > perhaps be easier to use for system test writers than Clojure (under
>> > Jepsen). The Ambry  project at
>> LinkedIn
>> > uses it extensively (and I think has added several more failure scenarios
>> > which don't seem to be reflected in the github repo). Anyway, I think we
>> > should at least enumerate what we want to test and evaluate the
>> > alternatives before reinventing.
>> >
>> > Thanks,
>> >
>> > Joel
>>
>>
>>
>> --
>> Gwen Shapira
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter | blog
>>



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Build failed in Jenkins: kafka-trunk-jdk8 #948

2016-10-05 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-4176: Only call printStream.flush for System.out

--
[...truncated 12037 lines...]
org.apache.kafka.common.config.ConfigDefTest > testValidateMissingConfigKey 
STARTED

org.apache.kafka.common.config.ConfigDefTest > testValidateMissingConfigKey 
PASSED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse STARTED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > testClientMode STARTED

org.apache.kafka.common.security.ssl.SslFactoryTest > testClientMode PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration STARTED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMissingUsernameSaslPlain STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMissingUsernameSaslPlain PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testUnauthenticatedApiVersionsRequestOverPlaintext STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testUnauthenticatedApiVersionsRequestOverPlaintext PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMechanismPluggability STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMechanismPluggability PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testApiVersionsRequestWithUnsupportedVersion STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testApiVersionsRequestWithUnsupportedVersion PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMissingPasswordSaslPlain STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMissingPasswordSaslPlain PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidLoginModule STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidLoginModule PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidMechanism STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidMechanism PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testDisabledMechanism STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testDisabledMechanism PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testPacketSizeTooBig STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testPacketSizeTooBig PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidUsernameSaslPlain STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidUsernameSaslPlain PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMultipleServerMechanisms STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMultipleServerMechanisms PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testValidSaslPlainOverPlaintext STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testValidSaslPlainOverPlaintext PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testValidSaslPlainOverSsl STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testValidSaslPlainOverSsl PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidApiVersionsRequestSequence STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidApiVersionsRequestSequence PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testDisallowedKafkaRequestsBeforeAuthentication STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testDisallowedKafkaRequestsBeforeAuthentication PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testUnauthenticatedApiVersionsRequestOverSsl STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testUnauthenticatedApiVersionsRequestOverSsl PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidPasswordSaslPlain STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidPasswordSaslPlain PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidSaslPacket STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testInvalidSaslPacket PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 

Build failed in Jenkins: kafka-0.10.1-jdk7 #47

2016-10-05 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-4176: Only call printStream.flush for System.out

--
[...truncated 6993 lines...]
kafka.log.FileMessageSetTest > testIteratorIsConsistent STARTED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testTruncateIfSizeIsDifferentToTargetSize STARTED

kafka.log.FileMessageSetTest > testTruncateIfSizeIsDifferentToTargetSize PASSED

kafka.log.FileMessageSetTest > testFormatConversionWithPartialMessage STARTED

kafka.log.FileMessageSetTest > testFormatConversionWithPartialMessage PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition STARTED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead STARTED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testWriteTo STARTED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testPreallocateFalse STARTED

kafka.log.FileMessageSetTest > testPreallocateFalse PASSED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown STARTED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown PASSED

kafka.log.FileMessageSetTest > testMessageFormatConversion STARTED

kafka.log.FileMessageSetTest > testMessageFormatConversion PASSED

kafka.log.FileMessageSetTest > testSearch STARTED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testSizeInBytes STARTED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.LogConfigTest > shouldValidateThrottledReplicasConfig STARTED

kafka.log.LogConfigTest > shouldValidateThrottledReplicasConfig PASSED

kafka.log.LogConfigTest > testFromPropsEmpty STARTED

kafka.log.LogConfigTest > testFromPropsEmpty PASSED

kafka.log.LogConfigTest > testKafkaConfigToProps STARTED

kafka.log.LogConfigTest > testKafkaConfigToProps PASSED

kafka.log.LogConfigTest > testFromPropsInvalid STARTED

kafka.log.LogConfigTest > testFromPropsInvalid PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

Re: [DISCUSS] Fault injection tests for Kafka

2016-10-05 Thread Tom Crayford
I did some stuff like this recently with simple calls to `tc` (samples that
I used were in the README for https://github.com/tylertreat/comcast). The
only notable bug I found so far is that if you cut all the Kafka nodes
entirely off from ZooKeeper for, say, 60 seconds, then reconnect them, the
nodes don't crash and they report as healthy in JMX, but calls to fetch
metadata from them time out entirely. That can be fixed with a rolling
restart, but it doesn't sound ideal (especially in the face of cloud
networks, where short-lived total network outages can and do happen).
Should I file a Jira detailing that bug?
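
For anyone who wants to reproduce that scenario, a crude injector along these
lines can black-hole an interface for a while and then restore it. It assumes a
Linux broker host, root access and tc/netem installed; the class name and
interface are placeholders, and it cuts off all traffic rather than just the
ZooKeeper connection:

```
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Sketch of a tc/netem-based fault injector: drop all traffic on an interface
// for a while, then restore it. Needs root on a Linux host; names are placeholders.
public class NetworkIsolationSketch {
    private static void run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (p.waitFor() != 0)
            throw new IllegalStateException("command failed: " + String.join(" ", cmd));
    }

    /** Black-hole all packets on `iface` for `seconds`, then remove the rule. */
    public static void isolate(String iface, int seconds) throws IOException, InterruptedException {
        run("tc", "qdisc", "add", "dev", iface, "root", "netem", "loss", "100%");
        try {
            TimeUnit.SECONDS.sleep(seconds);
        } finally {
            run("tc", "qdisc", "del", "dev", iface, "root", "netem");
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        isolate("eth0", 60); // e.g. cut a broker off for 60s, as in the scenario above
    }
}
```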

On Wed, Oct 5, 2016 at 7:26 PM, Gwen Shapira  wrote:

> Yeah, totally agree on discussing what we want to test first and
> implement anything later :)
>
> It's just that whenever I have this discussion Jepsen comes up, so I was
> curious what was driving the interest and whether the specific
> framework is important to the community.
>
> On Tue, Oct 4, 2016 at 5:46 PM, Joel Koshy  wrote:
> > Hi Gwen,
> >
> > I've also seen suggestions of using Jepsen for fault injection, but
> >> I'm not familiar with this framework.
> >>
> >> What do you guys think? Write our own failure injection? or write
> >> Kafka tests in Jepsen?
> >>
> >
> > This would definitely add a lot of value and save a lot on release
> > validation overheads. I have heard of Jepsen (via the blog), but haven't
> > used it. At LinkedIn a couple of infra teams have been using Simoorg
> >  which being python-based would
> > perhaps be easier to use for system test writers than Clojure (under
> > Jepsen). The Ambry  project at
> LinkedIn
> > uses it extensively (and I think has added several more failure scenarios
> > which don't seem to be reflected in the github repo). Anyway, I think we
> > should at least enumerate what we want to test and evaluate the
> > alternatives before reinventing.
> >
> > Thanks,
> >
> > Joel
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>


Re: [DISCUSS] Fault injection tests for Kafka

2016-10-05 Thread Gwen Shapira
Yeah, totally agree on discussing what we want to test first and
implement anything later :)

It's just that whenever I have this discussion Jepsen comes up, so I was
curious what was driving the interest and whether the specific
framework is important to the community.

On Tue, Oct 4, 2016 at 5:46 PM, Joel Koshy  wrote:
> Hi Gwen,
>
> I've also seen suggestions of using Jepsen for fault injection, but
>> I'm not familiar with this framework.
>>
>> What do you guys think? Write our own failure injection? or write
>> Kafka tests in Jepsen?
>>
>
> This would definitely add a lot of value and save a lot on release
> validation overheads. I have heard of Jepsen (via the blog), but haven't
> used it. At LinkedIn a couple of infra teams have been using Simoorg
>  which, being Python-based, would
> perhaps be easier for system test writers to use than Clojure (under
> Jepsen). The Ambry  project at LinkedIn
> uses it extensively (and I think has added several more failure scenarios
> which don't seem to be reflected in the github repo). Anyway, I think we
> should at least enumerate what we want to test and evaluate the
> alternatives before reinventing.
>
> Thanks,
>
> Joel



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


[jira] [Commented] (KAFKA-4256) Use IP for ZK broker register

2016-10-05 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549507#comment-15549507
 ] 

Jason Gustafson commented on KAFKA-4256:


[~yarekt] We have a broker configuration {{advertised.listeners}} which should 
let you override the host used in Zookeeper. Does this not work?
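
For example, a broker can advertise a fixed, routable address instead of its
canonical hostname with a line like the following in server.properties (the
address below is just a placeholder):

```
advertised.listeners=PLAINTEXT://192.0.2.10:9092
```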

> Use IP for ZK broker register
> -
>
> Key: KAFKA-4256
> URL: https://issues.apache.org/jira/browse/KAFKA-4256
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Yarek Tyshchenko
>Priority: Minor
>
> Kafka seems to default to using the FQDN when registering itself with ZooKeeper, 
> via the call "java.net.InetAddress.getCanonicalHostName()". This means that 
> in an environment where the host's hostname doesn't resolve, the node 
> registered in ZooKeeper will be unreachable.
> Currently there's no way to tell Kafka to just use the IP address; I 
> understand that it would be difficult to know which interface it should use 
> to get the IP from.
> One environment like this is Docker (prior to version 1.11, where networks 
> are available). The only solution right now is to hard-code the IP address in 
> the configuration file.
> It would be nice if there were a configuration option to just use the IP 
> address of a specified interface.
> For reference, I'm including my workaround for research:
> https://github.com/YarekTyshchenko/kafka-docker/blob/0d79fa4f1d5089de5ff2b6793f57103d9573fe3b/ip.sh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk7 #1603

2016-10-05 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-4176: Only call printStream.flush for System.out

--
[...truncated 3821 lines...]

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupLeaderAfterFollower STARTED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupLeaderAfterFollower PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownMember 
STARTED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownMember 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidLeaveGroup STARTED

kafka.coordinator.GroupCoordinatorResponseTest > testValidLeaveGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupInactiveGroup 
STARTED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupInactiveGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupNotCoordinator 
STARTED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupNotCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatUnknownConsumerExistingGroup STARTED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidHeartbeat STARTED

kafka.coordinator.GroupCoordinatorResponseTest > testValidHeartbeat PASSED

kafka.coordinator.GroupMetadataTest > testDeadToAwaitingSyncIllegalTransition 
STARTED

kafka.coordinator.GroupMetadataTest > testDeadToAwaitingSyncIllegalTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testOffsetCommitFailure STARTED

kafka.coordinator.GroupMetadataTest > testOffsetCommitFailure PASSED

kafka.coordinator.GroupMetadataTest > 
testPreparingRebalanceToStableIllegalTransition STARTED

kafka.coordinator.GroupMetadataTest > 
testPreparingRebalanceToStableIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testStableToDeadTransition STARTED

kafka.coordinator.GroupMetadataTest > testStableToDeadTransition PASSED

kafka.coordinator.GroupMetadataTest > testInitNextGenerationEmptyGroup STARTED

kafka.coordinator.GroupMetadataTest > testInitNextGenerationEmptyGroup PASSED

kafka.coordinator.GroupMetadataTest > testCannotRebalanceWhenDead STARTED

kafka.coordinator.GroupMetadataTest > testCannotRebalanceWhenDead PASSED

kafka.coordinator.GroupMetadataTest > testInitNextGeneration STARTED

kafka.coordinator.GroupMetadataTest > testInitNextGeneration PASSED

kafka.coordinator.GroupMetadataTest > testPreparingRebalanceToEmptyTransition 
STARTED

kafka.coordinator.GroupMetadataTest > testPreparingRebalanceToEmptyTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testSelectProtocol STARTED

kafka.coordinator.GroupMetadataTest > testSelectProtocol PASSED

kafka.coordinator.GroupMetadataTest > testCannotRebalanceWhenPreparingRebalance 
STARTED

kafka.coordinator.GroupMetadataTest > testCannotRebalanceWhenPreparingRebalance 
PASSED

kafka.coordinator.GroupMetadataTest > 
testDeadToPreparingRebalanceIllegalTransition STARTED

kafka.coordinator.GroupMetadataTest > 
testDeadToPreparingRebalanceIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testCanRebalanceWhenAwaitingSync STARTED

kafka.coordinator.GroupMetadataTest > testCanRebalanceWhenAwaitingSync PASSED

kafka.coordinator.GroupMetadataTest > 
testAwaitingSyncToPreparingRebalanceTransition STARTED

kafka.coordinator.GroupMetadataTest > 
testAwaitingSyncToPreparingRebalanceTransition PASSED

kafka.coordinator.GroupMetadataTest > testStableToAwaitingSyncIllegalTransition 
STARTED

kafka.coordinator.GroupMetadataTest > testStableToAwaitingSyncIllegalTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testEmptyToDeadTransition STARTED

kafka.coordinator.GroupMetadataTest > testEmptyToDeadTransition PASSED

kafka.coordinator.GroupMetadataTest > testSelectProtocolRaisesIfNoMembers 
STARTED

kafka.coordinator.GroupMetadataTest > testSelectProtocolRaisesIfNoMembers PASSED

kafka.coordinator.GroupMetadataTest > testStableToPreparingRebalanceTransition 
STARTED

kafka.coordinator.GroupMetadataTest > testStableToPreparingRebalanceTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testPreparingRebalanceToDeadTransition 
STARTED

kafka.coordinator.GroupMetadataTest > testPreparingRebalanceToDeadTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testStableToStableIllegalTransition 
STARTED

kafka.coordinator.GroupMetadataTest > testStableToStableIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testAwaitingSyncToStableTransition STARTED

kafka.coordinator.GroupMetadataTest > testAwaitingSyncToStableTransition PASSED

kafka.coordinator.GroupMetadataTest > testOffsetCommitFailureWithAnotherPending 
STARTED

kafka.coordinator.GroupMetadataTest > testOffsetCommitFailureWithAnotherPending 
PASSED

kafka.coordinator.GroupMetadataTest > testDeadToStableIllegalTransition STARTED


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Geoff Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549319#comment-15549319
 ] 

Geoff Anderson commented on KAFKA-3985:
---

Yes, using a canonical location in /tmp is absolutely problematic if multiple 
ducktape processes are running.

[~fpj] getting this fix in for greener builds will be great.
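
As a general illustration of why a shared canonical path breaks under concurrent
runs (this is not the actual ducktape fix, and the paths below are made up): two
processes writing to the same fixed /tmp directory clobber each other, while
per-run unique directories cannot collide.

```
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// General illustration only: a fixed canonical path is shared by every
// concurrent run, while a per-run unique directory is not.
public class TempDirSketch {
    public static void main(String[] args) throws IOException {
        Path fixed = Paths.get("/tmp", "zk_security_migration");             // same for all runs -> collisions
        Path unique = Files.createTempDirectory("zk_security_migration_");    // unique per run
        System.out.println("collision-prone: " + fixed);
        System.out.println("safe: " + unique);
    }
}
```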

> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Flavio Junqueira
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4254) Questionable handling of unknown partitions in KafkaProducer

2016-10-05 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549298#comment-15549298
 ] 

Ismael Juma commented on KAFKA-4254:


Yeah, one refresh makes sense to me.

> Questionable handling of unknown partitions in KafkaProducer
> 
>
> Key: KAFKA-4254
> URL: https://issues.apache.org/jira/browse/KAFKA-4254
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.10.1.1
>
>
> Currently the producer will raise an {{IllegalArgumentException}} if the user 
> attempts to write to a partition which has just been created. This is caused 
> by the fact that the producer does not attempt to refetch topic metadata in 
> this case, which means that its check for partition validity is based on 
> stale metadata.
> If the topic for the partition did not already exist, it works fine because 
> the producer will block until it has metadata for the topic, so this case is 
> primarily hit when the number of partitions is dynamically increased. 
> A couple options to fix this that come to mind:
> 1. We could treat unknown partitions just as we do unknown topics. If the 
> partition doesn't exist, we refetch metadata and try again (timing out when 
> max.block.ms is reached).
> 2. We can at least throw a more specific exception so that users can handle 
> the error. Raising {{IllegalArgumentException}} is not helpful in practice 
> because it can also be caused by other errors.
> My inclination is to do the first one, since it seems incorrect for the producer 
> to tell the user that the partition is invalid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549221#comment-15549221
 ] 

Flavio Junqueira commented on KAFKA-3985:
-

I've created a pull request for this, focused on this issue. KAFKA-4140 is a 
broader set of changes, and it could incorporate the changes I'm proposing in 
the PR. In fact, it seems to change it in a similar way. 

It is still WIP, though, and for the sake of seeing more green in our builds it 
might be better to check this one in while we wait for KAFKA-4140; but I'm happy 
with whatever folks prefer.

> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Flavio Junqueira
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-4254) Questionable handling of unknown partitions in KafkaProducer

2016-10-05 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-4254:
---
Comment: was deleted

(was: [~ijuma] I'm sympathetic to your concern, but it seems a little difficult 
to justify the current behavior of raising an exception based on stale 
metadata. As a user, I expect that I can produce to an existing partition even 
if it has just been created, so requiring me to catch exceptions until the 
producer's metadata refresh interval expires seems a bit clunky. Of course, the 
producer doesn't necessarily need to block the full {{max.block.ms}} until the 
partition is created, maybe one metadata refresh prior to raising would be 
sufficient?)

> Questionable handling of unknown partitions in KafkaProducer
> 
>
> Key: KAFKA-4254
> URL: https://issues.apache.org/jira/browse/KAFKA-4254
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.10.1.1
>
>
> Currently the producer will raise an {{IllegalArgumentException}} if the user 
> attempts to write to a partition which has just been created. This is caused 
> by the fact that the producer does not attempt to refetch topic metadata in 
> this case, which means that its check for partition validity is based on 
> stale metadata.
> If the topic for the partition did not already exist, it works fine because 
> the producer will block until it has metadata for the topic, so this case is 
> primarily hit when the number of partitions is dynamically increased. 
> A couple options to fix this that come to mind:
> 1. We could treat unknown partitions just as we do unknown topics. If the 
> partition doesn't exist, we refetch metadata and try again (timing out when 
> max.block.ms is reached).
> 2. We can at least throw a more specific exception so that users can handle 
> the error. Raising {{IllegalArgumentException}} is not helpful in practice 
> because it can also be caused by other errors.
> My inclination is to do the first one, since it seems incorrect for the producer 
> to tell the user that the partition is invalid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4254) Questionable handling of unknown partitions in KafkaProducer

2016-10-05 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549204#comment-15549204
 ] 

Jason Gustafson commented on KAFKA-4254:


[~ijuma] I'm sympathetic to your concern, but it seems a little difficult to 
justify the current behavior of raising an exception based on stale metadata. 
As a user, I expect that I can produce to an existing partition even if it has 
just been created, so requiring me to catch exceptions until the producer's 
metadata refresh interval expires seems a bit clunky. Of course, the producer 
doesn't necessarily need to block the full {{max.block.ms}} until the partition 
is created, maybe one metadata refresh prior to raising would be sufficient?
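
As an illustration of how clunky this currently is for users, here is a minimal 
client-side retry sketch (this is not Kafka code; the broker address, topic, 
partition number, and retry bounds are all made up for the example):

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class NewPartitionSendWorkaround {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Assume partition 5 of "my-topic" was just added by an operator.
            ProducerRecord<String, String> record =
                new ProducerRecord<>("my-topic", 5, "key", "value");
            // Until the producer refreshes its metadata, the send can fail with
            // IllegalArgumentException as described in this ticket, so the
            // application has to retry with a pause.
            for (int attempt = 0; ; attempt++) {
                try {
                    RecordMetadata md = producer.send(record).get();
                    System.out.println("wrote to " + md.topic() + "-" + md.partition());
                    break;
                } catch (IllegalArgumentException staleMetadata) {
                    if (attempt >= 10) throw staleMetadata; // give up eventually
                    Thread.sleep(1000);                     // wait for a metadata refresh
                }
            }
        }
    }
}
{code}

Once the metadata is refreshed the send succeeds; either of the options quoted 
below would move this waiting (or at least the error classification) inside the 
producer.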

> Questionable handling of unknown partitions in KafkaProducer
> 
>
> Key: KAFKA-4254
> URL: https://issues.apache.org/jira/browse/KAFKA-4254
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.10.1.1
>
>
> Currently the producer will raise an {{IllegalArgumentException}} if the user 
> attempts to write to a partition which has just been created. This is caused 
> by the fact that the producer does not attempt to refetch topic metadata in 
> this case, which means that its check for partition validity is based on 
> stale metadata.
> If the topic for the partition did not already exist, it works fine because 
> the producer will block until it has metadata for the topic, so this case is 
> primarily hit when the number of partitions is dynamically increased. 
> A couple options to fix this that come to mind:
> 1. We could treat unknown partitions just as we do unknown topics. If the 
> partition doesn't exist, we refetch metadata and try again (timing out when 
> max.block.ms is reached).
> 2. We can at least throw a more specific exception so that users can handle 
> the error. Raising {{IllegalArgumentException}} is not helpful in practice 
> because it can also be caused by other errors.
> My inclination is to do the first one, since it seems incorrect for the 
> producer to tell the user that the partition is invalid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4254) Questionable handling of unknown partitions in KafkaProducer

2016-10-05 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549203#comment-15549203
 ] 

Jason Gustafson commented on KAFKA-4254:


[~ijuma] I'm sympathetic to your concern, but it seems a little difficult to 
justify the current behavior of raising an exception based on stale metadata. 
As a user, I expect that I can produce to an existing partition even if it has 
just been created, so requiring me to catch exceptions until the producer's 
metadata refresh interval expires seems a bit clunky. Of course, the producer 
doesn't necessarily need to block the full {{max.block.ms}} until the partition 
is created, maybe one metadata refresh prior to raising would be sufficient?

> Questionable handling of unknown partitions in KafkaProducer
> 
>
> Key: KAFKA-4254
> URL: https://issues.apache.org/jira/browse/KAFKA-4254
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.10.1.1
>
>
> Currently the producer will raise an {{IllegalArgumentException}} if the user 
> attempts to write to a partition which has just been created. This is caused 
> by the fact that the producer does not attempt to refetch topic metadata in 
> this case, which means that its check for partition validity is based on 
> stale metadata.
> If the topic for the partition did not already exist, it works fine because 
> the producer will block until it has metadata for the topic, so this case is 
> primarily hit when the number of partitions is dynamically increased. 
> A couple options to fix this that come to mind:
> 1. We could treat unknown partitions just as we do unknown topics. If the 
> partition doesn't exist, we refetch metadata and try again (timing out when 
> max.block.ms is reached).
> 2. We can at least throw a more specific exception so that users can handle 
> the error. Raising {{IllegalArgumentException}} is not helpful in practice 
> because it can also be caused by other errors.
> My inclination is to do the first one, since it seems incorrect for the 
> producer to tell the user that the partition is invalid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned KAFKA-3985:
---

Assignee: Flavio Junqueira

> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Flavio Junqueira
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549175#comment-15549175
 ] 

ASF GitHub Bot commented on KAFKA-3985:
---

GitHub user fpj opened a pull request:

https://github.com/apache/kafka/pull/1973

KAFKA-3985: Transient system test failure 
ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fpj/kafka KAFKA-3985

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1973


commit 28ea6c105fbdd81f0cc5b8687bea4a356223915e
Author: fpj 
Date:   2016-10-05T09:25:46Z

Adding temporary directory for CA files.

commit e1f8275ac4fcffaaa35310501dfa1821b7c270bf
Author: fpj 
Date:   2016-10-05T09:48:40Z

Adding self ref to rmtree to avoid compilation error.

commit a673e863ec44ddda357f4e8b34331a2a2e284022
Author: fpj 
Date:   2016-10-05T09:51:46Z

Fixed another compilation error.

commit 2c2b2258b616c576d86075a61201deb61a7f90ca
Author: fpj 
Date:   2016-10-05T09:55:20Z

Back with rmtree reference.

commit 210084f50352448cdb52d938301ffa21720967e2
Author: fpj 
Date:   2016-10-05T11:29:51Z

Switching to use atexit




> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1973: KAFKA-3985: Transient system test failure ZooKeepe...

2016-10-05 Thread fpj
GitHub user fpj opened a pull request:

https://github.com/apache/kafka/pull/1973

KAFKA-3985: Transient system test failure 
ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fpj/kafka KAFKA-3985

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1973


commit 28ea6c105fbdd81f0cc5b8687bea4a356223915e
Author: fpj 
Date:   2016-10-05T09:25:46Z

Adding temporary directory for CA files.

commit e1f8275ac4fcffaaa35310501dfa1821b7c270bf
Author: fpj 
Date:   2016-10-05T09:48:40Z

Adding self ref to rmtree to avoid compilation error.

commit a673e863ec44ddda357f4e8b34331a2a2e284022
Author: fpj 
Date:   2016-10-05T09:51:46Z

Fixed another compilation error.

commit 2c2b2258b616c576d86075a61201deb61a7f90ca
Author: fpj 
Date:   2016-10-05T09:55:20Z

Back with rmtree reference.

commit 210084f50352448cdb52d938301ffa21720967e2
Author: fpj 
Date:   2016-10-05T11:29:51Z

Switching to use atexit




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (KAFKA-4176) Node stopped receiving heartbeat responses once another node started within the same group

2016-10-05 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-4176:


Assignee: Guozhang Wang

> Node stopped receiving heartbeat responses once another node started within 
> the same group
> --
>
> Key: KAFKA-4176
> URL: https://issues.apache.org/jira/browse/KAFKA-4176
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.1
> Environment: Centos 7: 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 
> 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> Java: java version "1.8.0_101"
> Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
>Reporter: Marek Svitok
>Assignee: Guozhang Wang
> Fix For: 0.10.1.1, 0.10.2.0
>
>
> I have 3 nodes working in the same group. I started them one after the other. 
> As I can see from the log the node once started receives heartbeat responses
> for the group it is part of. However once I start another node the former one 
> stops receiving these responses and the new one keeps receiving them. 
> Moreover, it stops consuming any messages from previously assigned partitions:
> Node0
> 03:14:36.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.223 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.429 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30170 after 0ms
> 03:14:39.462 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30171 after 0ms
> 03:14:42.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:42.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:45.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:45.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:48.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - Attempt 
> to heart beat failed for group test_streams_id since it is rebalancing.
> 03:14:48.224 [StreamThread-2] INFO  o.a.k.c.c.i.ConsumerCoordinator - 
> Revoking previously assigned partitions [StreamTopic-2] for group 
> test_streams_id
> 03:14:48.224 [StreamThread-2] INFO  o.a.k.s.p.internals.StreamThread - 
> Removing a task 0_2
> Node1
> 03:22:18.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:18.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:21.709 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:21.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.717 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.872 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30172 after 0ms
> 03:22:24.992 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30173 after 0ms
> 03:22:27.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:27.717 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:30.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:30.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> Configuration used:
> 03:14:24.520 [main] INFO  o.a.k.c.producer.ProducerConfig - ProducerConfig 
> values: 
>  

[jira] [Resolved] (KAFKA-4176) Node stopped receiving heartbeat responses once another node started within the same group

2016-10-05 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-4176.
--
   Resolution: Fixed
Fix Version/s: 0.10.1.1
   0.10.2.0

Issue resolved by pull request 1965
[https://github.com/apache/kafka/pull/1965]

> Node stopped receiving heartbeat responses once another node started within 
> the same group
> --
>
> Key: KAFKA-4176
> URL: https://issues.apache.org/jira/browse/KAFKA-4176
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.1
> Environment: Centos 7: 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 
> 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> Java: java version "1.8.0_101"
> Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
>Reporter: Marek Svitok
> Fix For: 0.10.2.0, 0.10.1.1
>
>
> I have 3 nodes working in the same group. I started them one after the other. 
> As I can see from the log the node once started receives heartbeat responses
> for the group it is part of. However once I start another node the former one 
> stops receiving these responses and the new one keeps receiving them. 
> Moreover, it stops consuming any messages from previously assigned partitions:
> Node0
> 03:14:36.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.223 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.429 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30170 after 0ms
> 03:14:39.462 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30171 after 0ms
> 03:14:42.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:42.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:45.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:45.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:48.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - Attempt 
> to heart beat failed for group test_streams_id since it is rebalancing.
> 03:14:48.224 [StreamThread-2] INFO  o.a.k.c.c.i.ConsumerCoordinator - 
> Revoking previously assigned partitions [StreamTopic-2] for group 
> test_streams_id
> 03:14:48.224 [StreamThread-2] INFO  o.a.k.s.p.internals.StreamThread - 
> Removing a task 0_2
> Node1
> 03:22:18.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:18.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:21.709 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:21.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.717 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.872 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30172 after 0ms
> 03:22:24.992 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30173 after 0ms
> 03:22:27.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:27.717 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:30.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:30.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> Configuration used:

[jira] [Commented] (KAFKA-4176) Node stopped receiving heartbeat responses once another node started within the same group

2016-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549073#comment-15549073
 ] 

ASF GitHub Bot commented on KAFKA-4176:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1965


> Node stopped receiving heartbeat responses once another node started within 
> the same group
> --
>
> Key: KAFKA-4176
> URL: https://issues.apache.org/jira/browse/KAFKA-4176
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.1
> Environment: Centos 7: 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 
> 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> Java: java version "1.8.0_101"
> Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
>Reporter: Marek Svitok
> Fix For: 0.10.1.1, 0.10.2.0
>
>
> I have 3 nodes working in the same group. I started them one after the other. 
> As I can see from the log the node once started receives heartbeat responses
> for the group it is part of. However once I start another node the former one 
> stops receiving these responses and the new one keeps receiving them. 
> Moreover, it stops consuming any messages from previously assigned partitions:
> Node0
> 03:14:36.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.223 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:39.429 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30170 after 0ms
> 03:14:39.462 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30171 after 0ms
> 03:14:42.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:42.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:45.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:45.224 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:14:48.224 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - Attempt 
> to heart beat failed for group test_streams_id since it is rebalancing.
> 03:14:48.224 [StreamThread-2] INFO  o.a.k.c.c.i.ConsumerCoordinator - 
> Revoking previously assigned partitions [StreamTopic-2] for group 
> test_streams_id
> 03:14:48.224 [StreamThread-2] INFO  o.a.k.s.p.internals.StreamThread - 
> Removing a task 0_2
> Node1
> 03:22:18.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:18.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:21.709 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:21.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.717 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:24.872 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30172 after 0ms
> 03:22:24.992 [main-SendThread(mujsignal-03:2182)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x256bc1ce8c30173 after 0ms
> 03:22:27.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:27.717 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:30.710 [StreamThread-2] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> 03:22:30.716 [StreamThread-1] DEBUG o.a.k.c.c.i.AbstractCoordinator - 
> Received successful heartbeat response for group test_streams_id
> Configuration used:
> 03:14:24.520 [main] 

[GitHub] kafka pull request #1965: KAFKA-4176: Only call printStream.flush for System...

2016-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1965


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Support for Kafka

2016-10-05 Thread Syed Hussaini
Dear Kafka team.
I am in the implementation stage of a Kafka cluster and would like to find out 
whether Apache Kafka is supported on Ubuntu 16.04 LTS (Xenial).

It would be great if you could let us know.



Syed Hussaini
Infrastructure Engineer

1 Neathouse Place
5th Floor
London, England, SW1V 1LH


syed.hussa...@theexchangelab.com

T 0203 701 3177










[jira] [Commented] (KAFKA-3224) Add timestamp-based log deletion policy

2016-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548776#comment-15548776
 ] 

ASF GitHub Bot commented on KAFKA-3224:
---

GitHub user bill-warshaw opened a pull request:

https://github.com/apache/kafka/pull/1972

KAFKA-3224: New log deletion policy based on timestamp

* adds a new topic-level broker configuration, `log.retention.min.timestamp`
  * if unset, this setting is ignored
  * setting this value to a Unix timestamp will allow the log cleaner to 
delete any segments for a given topic whose last timestamp is earlier than the 
set timestamp

--

### 
[KIP-47](https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy)
### [JIRA](https://issues.apache.org/jira/browse/KAFKA-3224)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bill-warshaw/kafka KAFKA-3224

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1972


commit 008dc813d82e5938201f069712ba1bd44277e755
Author: Bill Warshaw 
Date:   2016-02-02T16:45:47Z

KAFKA-3224: New log deletion policy based on timestamp

* setting log.retention.min.timestamp will set a timestamp for a log,
  and any message before that timestamp is eligible for deletion




> Add timestamp-based log deletion policy
> ---
>
> Key: KAFKA-3224
> URL: https://issues.apache.org/jira/browse/KAFKA-3224
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Bill Warshaw
>  Labels: kafka
>
> One of Kafka's officially-described use cases is a distributed commit log 
> (http://kafka.apache.org/documentation.html#uses_commitlog). In this case, 
> for a distributed service that needed a commit log, there would be a topic 
> with a single partition to guarantee log order. This service would use the 
> commit log to re-sync failed nodes. Kafka is generally an excellent fit for 
> such a system, but it does not expose an adequate mechanism for log cleanup 
> in such a case. With a distributed commit log, data can only be deleted when 
> the client application determines that it is no longer needed; this creates 
> completely arbitrary ranges of time and size for messages, which the existing 
> cleanup mechanisms can't handle smoothly.
> A new deletion policy based on the absolute timestamp of a message would work 
> perfectly for this case.  The client application will periodically update the 
> minimum timestamp of messages to retain, and Kafka will delete all messages 
> earlier than that timestamp using the existing log cleaner thread mechanism.
> This is based off of the work being done in KIP-32 - Add timestamps to Kafka 
> message.
> h3. Initial Approach
> https://github.com/apache/kafka/compare/trunk...bill-warshaw:KAFKA-3224
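
As a rough, hypothetical illustration of the policy described above (this is 
not code from the pull request, and all names are made up), the broker-side 
eligibility check reduces to comparing a segment's newest record timestamp 
against the client-maintained minimum:

{code}
/**
 * Hypothetical sketch of the proposed policy: the client application keeps
 * advancing a "minimum timestamp to retain", and any log segment whose newest
 * record is older than that value becomes eligible for deletion.
 */
public final class TimestampRetentionSketch {

    /** True if every record in the segment is older than the retained minimum. */
    static boolean segmentIsDeletable(long segmentLargestTimestampMs,
                                      long retentionMinTimestampMs) {
        return segmentLargestTimestampMs < retentionMinTimestampMs;
    }

    public static void main(String[] args) {
        long minToRetain = System.currentTimeMillis() - 24 * 60 * 60 * 1000L; // keep ~24h
        System.out.println(segmentIsDeletable(minToRetain - 1, minToRetain)); // true
        System.out.println(segmentIsDeletable(minToRetain + 1, minToRetain)); // false
    }
}
{code}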



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1972: KAFKA-3224: New log deletion policy based on times...

2016-10-05 Thread bill-warshaw
GitHub user bill-warshaw opened a pull request:

https://github.com/apache/kafka/pull/1972

KAFKA-3224: New log deletion policy based on timestamp

* adds a new topic-level broker configuration, `log.retention.min.timestamp`
  * if unset, this setting is ignored
  * setting this value to a Unix timestamp will allow the log cleaner to 
delete any segments for a given topic whose last timestamp is earlier than the 
set timestamp

--

### 
[KIP-47](https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy)
### [JIRA](https://issues.apache.org/jira/browse/KAFKA-3224)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bill-warshaw/kafka KAFKA-3224

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1972


commit 008dc813d82e5938201f069712ba1bd44277e755
Author: Bill Warshaw 
Date:   2016-02-02T16:45:47Z

KAFKA-3224: New log deletion policy based on timestamp

* setting log.retention.min.timestamp will set a timestamp for a log,
  and any message before that timestamp is eligible for deletion




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4257) Inconsistencies in 0.10.1 upgrade docs

2016-10-05 Thread Jeff Klukas (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548729#comment-15548729
 ] 

Jeff Klukas commented on KAFKA-4257:


The pull request definitely helps. Thanks for jumping in to make those changes. 
I left a comment on the PR.

> Inconsistencies in 0.10.1 upgrade docs 
> ---
>
> Key: KAFKA-4257
> URL: https://issues.apache.org/jira/browse/KAFKA-4257
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.1.0
>Reporter: Jeff Klukas
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> There are several inconsistencies in the 0.10.1.0 upgrade docs that make it 
> difficult to determine what client versions are compatible with what broker 
> versions.
> The initial heading in the upgrade docs is "Upgrading from 0.10.0.X to 
> 0.10.1.0", but it includes clauses about versions prior to 0.10. Is the 
> intention for these instructions to be valid for upgrading from brokers as 
> far back as 0.8? Should this section simply be called "Upgrading to 0.10.1.0"?
> I cannot tell from the docs whether I can upgrade to 0.10.1.0 clients on top 
> of 0.10.0.X brokers. In particular, step 5 of the upgrade instructions 
> mentions "Once all consumers have been upgraded to 0.10.0". Should that read 
> 0.10.1, or is the intention here truly that clients on 9.X or below need to 
> be at version 0.10.0.0 at a minimum?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4257) Inconsistencies in 0.10.1 upgrade docs

2016-10-05 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548714#comment-15548714
 ] 

Ismael Juma commented on KAFKA-4257:


[~jeff.klu...@gmail.com], does the following help?

https://github.com/apache/kafka/pull/1971

> Inconsistencies in 0.10.1 upgrade docs 
> ---
>
> Key: KAFKA-4257
> URL: https://issues.apache.org/jira/browse/KAFKA-4257
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.1.0
>Reporter: Jeff Klukas
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> There are several inconsistencies in the 0.10.1.0 upgrade docs that make it 
> difficult to determine what client versions are compatible with what broker 
> versions.
> The initial heading in the upgrade docs is "Upgrading from 0.10.0.X to 
> 0.10.1.0", but it includes clauses about versions prior to 0.10. Is the 
> intention for these instructions to be valid for upgrading from brokers as 
> far back as 0.8? Should this section simply be called "Upgrading to 0.10.1.0"?
> I cannot tell from the docs whether I can upgrade to 0.10.1.0 clients on top 
> of 0.10.0.X brokers. In particular, step 5 of the upgrade instructions 
> mentions "Once all consumers have been upgraded to 0.10.0". Should that read 
> 0.10.1, or is the intention here truly that clients on 9.X or below need to 
> be at version 0.10.0.0 at a minimum?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1971: MINOR: Clarify 0.10.1.0 upgrade docs

2016-10-05 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1971

MINOR: Clarify 0.10.1.0 upgrade docs

This is a minor change to fix the most glaring issues. We have another JIRA 
to revamp the upgrade docs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-4257-upgrade-docs-inconsitencies

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1971


commit 6d3b527641b0bc394b6958f4f5049116e2a2db49
Author: Ismael Juma 
Date:   2016-10-05T13:28:53Z

Minor clarifications in 0.10.1.0 upgrade docs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4246) Discretionary partition assignment on the consumer side not functional

2016-10-05 Thread Alexandru Ionita (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548700#comment-15548700
 ] 

Alexandru Ionita commented on KAFKA-4246:
-

It seems the {{unsubscribe()}} call does the trick. 
In this case, could the error message be improved? It was really misleading. 

Thanks.
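
For anyone else hitting the same symptom, a minimal sketch of the pattern 
implied above, assuming a consumer instance that may previously have called 
subscribe(); the bootstrap address, group id, topic and partition are 
illustrative:

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ManualAssignmentSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("group.id", "XX");                      // group id from the report
        props.put("enable.auto.commit", "false");         // commit explicitly below
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Per the comment above: if this instance ever called subscribe(),
            // drop that group membership before switching to manual assignment.
            consumer.unsubscribe();

            consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                consumer.commitSync(); // commits the positions of the assigned partitions
            }
        }
    }
}
{code}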

> Discretionary partition assignment on the consumer side not functional
> --
>
> Key: KAFKA-4246
> URL: https://issues.apache.org/jira/browse/KAFKA-4246
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: Alexandru Ionita
>
> Trying to manually assign partition/topics to a consumer will not work 
> correctly. The consumer will be able to fetch records from the given 
> partitions, but the first commit will fail with the following message:
> {code}
> 2016-10-03 13:44:50.673 DEBUG 11757 --- [pool-9-thread-1] 
> o.a.k.c.c.internals.ConsumerCoordinator  : Offset commit for group XX 
> failed: The coordinator is not aware of this member.
> 2016-10-03 13:44:50.673  WARN 11757 --- [pool-9-thread-1] 
> o.a.k.c.c.internals.ConsumerCoordinator  : Auto offset commit failed for 
> group XX: Commit cannot be completed since the group has already 
> rebalanced and assigned the partitions to another member. This means that the 
> time between subsequent calls to poll() was longer than the configured 
> session.timeout.ms, which typically implies that the poll loop is spending 
> too much time message processing. You can address this either by increasing 
> the session timeout or by reducing the maximum size of batches returned in 
> poll() with max.poll.records.
> {code}.
> All this while the consumer will continue to poll records from the kafka 
> cluster, but every commit will fail with the same message.
> I tried setting the {{session.timeout.ms}} to values like 5, but I was 
> getting the same outcome => no successful commits.
> If I only switch from {{consumer.assign( subscribedPartitions )}} to 
> {{consumer.subscribe( topics )}}, everything works as expected. No other 
> client configurations should be changed to make it work.
> Am I missing something here?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4257) Inconsistencies in 0.10.1 upgrade docs

2016-10-05 Thread Jeff Klukas (JIRA)
Jeff Klukas created KAFKA-4257:
--

 Summary: Inconsistencies in 0.10.1 upgrade docs 
 Key: KAFKA-4257
 URL: https://issues.apache.org/jira/browse/KAFKA-4257
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.10.1.0
Reporter: Jeff Klukas
Priority: Minor
 Fix For: 0.10.1.0


There are several inconsistencies in the 0.10.1.0 upgrade docs that make it 
difficult to determine what client versions are compatible with what broker 
versions.

The initial heading in the upgrade docs is "Upgrading from 0.10.0.X to 
0.10.1.0", but it includes clauses about versions prior to 0.10. Is the 
intention for these instructions to be valid for upgrading from brokers as far 
back as 0.8? Should this section simply be called "Upgrading to 0.10.1.0"?

I cannot tell from the docs whether I can upgrade to 0.10.1.0 clients on top of 
0.10.0.X brokers. In particular, step 5 of the upgrade instructions mentions 
"Once all consumers have been upgraded to 0.10.0". Should that read 0.10.1, or 
is the intention here truly that clients on 9.X or below need to be at version 
0.10.0.0 at a minimum?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4256) Use IP for ZK broker register

2016-10-05 Thread Yarek Tyshchenko (JIRA)
Yarek Tyshchenko created KAFKA-4256:
---

 Summary: Use IP for ZK broker register
 Key: KAFKA-4256
 URL: https://issues.apache.org/jira/browse/KAFKA-4256
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Yarek Tyshchenko
Priority: Minor


Kafka seems to default to using the FQDN when registering itself with ZooKeeper, 
via the call "java.net.InetAddress.getCanonicalHostName()". This means that in 
an environment where the host's hostname doesn't resolve, the node registered 
in ZooKeeper will be unreachable.
Currently there's no way to tell Kafka to just use the IP address; I understand 
that it would be difficult to know which interface it should use to get the IP 
from.

One environment like this is Docker (prior to version 1.11, where networks are 
available). The only solution right now is to hard-code the IP address in the 
configuration file.

It would be nice if there was a configuration option to just use the IP address 
of a specified interface.

For reference I'm including my workaround for research:
https://github.com/YarekTyshchenko/kafka-docker/blob/0d79fa4f1d5089de5ff2b6793f57103d9573fe3b/ip.sh
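
To make the difference concrete, here is a small standalone sketch (plain JDK 
calls only, not Kafka code) contrasting the canonical hostname that, per the 
description above, is registered by default with the interface addresses an 
IP-based option could publish instead:

{code}
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.Enumeration;

public class HostnameVsIp {
    public static void main(String[] args) throws Exception {
        InetAddress local = InetAddress.getLocalHost();
        // What gets registered by default (may not resolve from other hosts):
        System.out.println("canonical hostname: " + local.getCanonicalHostName());
        // What an IP-based registration could publish instead:
        System.out.println("host address:       " + local.getHostAddress());

        // Or enumerate interfaces to pick a specific one (e.g. inside a container):
        Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
        while (ifaces.hasMoreElements()) {
            NetworkInterface nic = ifaces.nextElement();
            Enumeration<InetAddress> addrs = nic.getInetAddresses();
            while (addrs.hasMoreElements()) {
                System.out.println(nic.getName() + " -> " + addrs.nextElement().getHostAddress());
            }
        }
    }
}
{code}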



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4253) Fix Kafka Stream thread shutting down process ordering

2016-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548284#comment-15548284
 ] 

ASF GitHub Bot commented on KAFKA-4253:
---

GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/1970

KAFKA-4253: Fix Kafka Stream thread shutting down process ordering

Changed the ordering in `StreamThread.shutdown` (a rough sketch of this 
sequence follows the two lists below):
1. commitAll (we need to commit so that any cached data is flushed through 
the topology)
2. close all tasks
3. producer.flush() - so any records produced during close are flushed and 
we have offsets for them
4. close all state managers
5. close producers/consumers
6. remove the tasks

Also in `onPartitionsRevoked`
1. commitAll
2. close all tasks
3. producer.flush
4. close all state managers


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-4253

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1970


commit d52e8de5fefac51c9eb59a9c965bc17dcd3aa3bc
Author: Damian Guy 
Date:   2016-10-05T10:03:21Z

change shutdown order




> Fix Kafka Stream thread shutting down process ordering
> --
>
> Key: KAFKA-4253
> URL: https://issues.apache.org/jira/browse/KAFKA-4253
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Guozhang Wang
>Assignee: Damian Guy
>
> Currently we close the stream thread in the following way:
> 0. Commit all tasks.
> 1. Close producer.
> 2. Close consumer.
> 3. Close restore consumer.
> 4. For each task, close its topology processors one-by-one following the 
> topology order by calling {{processor.close()}}.
> 5. For each task, close its state manager as well as flushing and closing all 
> its associated state stores.
> We choose to close the producer / consumer clients before shutting down the 
> tasks because we need to make sure all sent records have been acked, so that we 
> have the right log-end-offset when closing the store and checkpointing the 
> offset of the changelog. However, there is also an issue with this ordering: if 
> users choose to write more records in their {{processor.close()}} calls, this 
> will cause a runtime exception since the producer has already been closed, and 
> no changelog records can be written.
> Thinking about this issue, a more appropriate ordering will be:
> 1. For each task, close their topology processors following the topology 
> order by calling {{processor.close()}}.
> 2. For each task, commit its state by calling {{task.commit()}}. At this time 
> all sent records should be acked since {{producer.flush()}} is called.
> 3. For each task, close their {{ProcessorStateManager}}.
> 4. Close all embedded clients, i.e. producer / consumer / restore consumer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1970: KAFKA-4253: Fix Kafka Stream thread shutting down ...

2016-10-05 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/1970

KAFKA-4253: Fix Kafka Stream thread shutting down process ordering

Changed the ordering in `StreamThread.shutdown`
1. commitAll (we need to commit so that any cached data is flushed through 
the topology)
2. close all tasks
3. producer.flush() - so any records produced during close are flushed and 
we have offsets for them
4. close all state managers
5. close producers/consumers
6. remove the tasks

Also in `onPartitionsRevoked`
1. commitAll
2. close all tasks
3. producer.flush
4. close all state managers


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-4253

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1970


commit d52e8de5fefac51c9eb59a9c965bc17dcd3aa3bc
Author: Damian Guy 
Date:   2016-10-05T10:03:21Z

change shutdown order




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-789) Producer-side persistence for delivery guarantee

2016-10-05 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548191#comment-15548191
 ] 

Ismael Juma commented on KAFKA-789:
---

We are tracking this issue via https://issues.apache.org/jira/browse/KAFKA-1955 
which has a WIP patch.

> Producer-side persistence for delivery guarantee
> 
>
> Key: KAFKA-789
> URL: https://issues.apache.org/jira/browse/KAFKA-789
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Reporter: Matan Safriel
>Assignee: Jun Rao
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> A suggestion for a higher delivery guarantee when entering messages into 
> Kafka through its producer. It aims to address the case where the entire set 
> of broker replicas for a topic and partition is unavailable. Currently, in 
> that case, data is lost: when a message set exhausts the send retry counter, 
> it is simply dropped. It would be nice to be able to provide a higher 
> guarantee that a message passed to the producer will eventually be received 
> by the broker. 
> In an environment with some disk space to spare for this on the producer 
> side, persisting to disk would seem to enable keeping messages for later 
> retry (until defined space limits are exhausted), thus somewhat elevating the 
> level of guarantee. 
> One way to facilitate this would be capitalizing on 
> https://issues.apache.org/jira/browse/KAFKA-496, as the feedback it will add 
> will enable knowing what needs to be retried again later. Changes to the 
> producer or a wrapper around it (that may require access to the partitioning 
> functions) would be able to persist failed message sets and manage delivery 
> with a nice level of guarantee. As it would affect performance and use disks, 
> it should probably be a non-default option.
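
Purely to illustrate the idea (this is not an existing Kafka feature; the 
wrapper, file layout and paths below are hypothetical), such a producer-side 
spool could hook the send callback and append records that exhausted their 
retries to a local file for later replay:

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

/** Hypothetical wrapper: records whose send ultimately fails are appended to a spool file. */
public class SpoolingSender {
    private final Producer<String, String> producer;
    private final Path spool;

    public SpoolingSender(Producer<String, String> producer, Path spoolFile) {
        this.producer = producer;
        this.spool = spoolFile;
    }

    public void send(final ProducerRecord<String, String> record) {
        producer.send(record, new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null) {
                    spoolToDisk(record); // all retries exhausted: keep it for later replay
                }
            }
        });
    }

    private synchronized void spoolToDisk(ProducerRecord<String, String> record) {
        String line = record.topic() + "\t" + record.key() + "\t" + record.value() + "\n";
        try {
            Files.write(spool, line.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            e.printStackTrace(); // if even the spool write fails, the record is lost, as today
        }
    }
}
{code}

A separate replay task would read the spool and re-send once brokers are 
reachable again; the feedback mentioned above (KAFKA-496) would tell it exactly 
what needs to be retried, and the spool would need the space limits the 
description talks about.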



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-156) Messages should not be dropped when brokers are unavailable

2016-10-05 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548193#comment-15548193
 ] 

Ismael Juma commented on KAFKA-156:
---

To clarify, the issue where we are tracking this is now 
https://issues.apache.org/jira/browse/KAFKA-1955, since it includes a WIP patch.

> Messages should not be dropped when brokers are unavailable
> ---
>
> Key: KAFKA-156
> URL: https://issues.apache.org/jira/browse/KAFKA-156
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Sharad Agarwal
>Assignee: Dru Panchal
>
> When none of the brokers is available, the producer should spool the messages 
> to disk and keep retrying until the brokers come back.
> This will also enable broker upgrades/maintenance without message loss.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-156) Messages should not be dropped when brokers are unavailable

2016-10-05 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-156:
--
Fix Version/s: (was: 0.10.1.0)

> Messages should not be dropped when brokers are unavailable
> ---
>
> Key: KAFKA-156
> URL: https://issues.apache.org/jira/browse/KAFKA-156
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Sharad Agarwal
>Assignee: Dru Panchal
>
> When none of the brokers is available, the producer should spool the messages 
> to disk and keep retrying until the brokers come back.
> This will also enable broker upgrades/maintenance without message loss.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548160#comment-15548160
 ] 

Rajini Sivaram commented on KAFKA-3985:
---

[~fpj] At the moment they use a fixed filename since the file never gets 
deleted. But I think [~geoffra] is fixing this under KAFKA-4140.

> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548125#comment-15548125
 ] 

Flavio Junqueira edited comment on KAFKA-3985 at 10/5/16 9:02 AM:
--

[~rsivaram]

bq. I wouldn't have expected to see a CA with start time 09:55:15 at all. Is it 
possible that there was another system test running on that same host (the 
tests use a fixed file name for CA and truststore, so the tests would fail if 
there are multiple instances of system tests or any command that loads the 
security config class that is run on that host).

That's a very good point, I'm thinking that the problem is that the CA files 
are in {{/tmp}}: 

{noformat}
self.ca_crt_path = "/tmp/test.ca.crt"
self.ca_jks_path = "/tmp/test.ca.jks"
{noformat}

and perhaps we should use {{mkdtemp}} like in {{generate_and_copy_keystore}}. 
Does it make sense?



was (Author: fpj):
[~rsivaram]

.bq I wouldn't have expected to see a CA with start time 09:55:15 at all. Is it 
possible that there was another system test running on that same host (the 
tests use a fixed file name for CA and truststore, so the tests would fail if 
there are multiple instances of system tests or any command that loads the 
security config class that is run on that host).

That's a very good point. I'm thinking that the problem is that the CA files 
are in {{/tmp}}: 

{noformat}
self.ca_crt_path = "/tmp/test.ca.crt"
self.ca_jks_path = "/tmp/test.ca.jks"
{noformat}

and perhaps we should use {{mkdtemp}}, as in {{generate_and_copy_keystore}}. 
Does that make sense?


> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548125#comment-15548125
 ] 

Flavio Junqueira commented on KAFKA-3985:
-

[~rsivaram]

bq. I wouldn't have expected to see a CA with start time 09:55:15 at all. Is it 
possible that there was another system test running on that same host (the 
tests use a fixed file name for CA and truststore, so the tests would fail if 
there are multiple instances of system tests or any command that loads the 
security config class that is run on that host).

That's a very good point. I'm thinking that the problem is that the CA files 
are in {{/tmp}}: 

{noformat}
self.ca_crt_path = "/tmp/test.ca.crt"
self.ca_jks_path = "/tmp/test.ca.jks"
{noformat}

and perhaps we should use {{mkdtemp}}, as in {{generate_and_copy_keystore}}. 
Does that make sense?
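
For illustration, here is a minimal sketch of the {{mkdtemp}} idea (the 
function and prefix names below are hypothetical stand-ins, not the actual 
security config code in kafkatest):

{code}
# Illustrative sketch only (hypothetical names): create a per-run temporary
# directory for the CA files instead of fixed paths under /tmp, so that
# concurrent system-test runs on one host cannot clobber each other's CA.
import os
import tempfile

def make_ca_paths():
    ca_dir = tempfile.mkdtemp(prefix="kafka-system-test-ca-")
    ca_crt_path = os.path.join(ca_dir, "test.ca.crt")
    ca_jks_path = os.path.join(ca_dir, "test.ca.jks")
    return ca_crt_path, ca_jks_path

ca_crt, ca_jks = make_ca_paths()
print(ca_crt)   # e.g. /tmp/kafka-system-test-ca-XXXXXX/test.ca.crt
print(ca_jks)   # e.g. /tmp/kafka-system-test-ca-XXXXXX/test.ca.jks
{code}

Each run would then get its own directory, while the file names themselves 
could stay the same.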


> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3985) Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol

2016-10-05 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548091#comment-15548091
 ] 

Rajini Sivaram commented on KAFKA-3985:
---

[~fpj] I would not have expected each test to create its own CA. Do the logs 
above correspond to 
https://jenkins.confluent.io/job/system-test-kafka-branch-builder-2/124? The 
timings seem to match. From the console logs there, the CA with start time 
09:35:36 would correspond to the CA explicitly generated by 
{{kafkatest.tests.core.security_test.SecurityTest.test_client_ssl_endpoint_validation_failure}}.
 I wouldn't have expected to see a CA with start time 09:55:15 at all. Is it 
possible that another system test was running on that same host? (The tests 
use a fixed file name for the CA and truststore, so they would fail if 
multiple instances of the system tests, or any command that loads the 
security config class, run on that host.)

> Transient system test failure 
> ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
> -
>
> Key: KAFKA-3985
> URL: https://issues.apache.org/jira/browse/KAFKA-3985
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jason Gustafson
>
> Found this in the nightly build on the 0.10.0 branch. Full details here: 
> http://testing.confluent.io/confluent-kafka-0-10-0-system-test-results/?prefix=2016-07-22--001.1469199875--apache--0.10.0--71a598a/.
>   
> {code}
> test_id:
> 2016-07-22--001.kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time:   5 minutes 14.067 seconds
> 292 acked message did not make it to the Consumer. They are: 11264, 
> 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 11275, 
> 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 more. 
> Total Acked: 11343, Total Consumed: 11054. We validated that the first 272 of 
> these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/venv/local/lib/python2.7/site-packages/ducktape/mark/_mark.py",
>  line 331, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
>  line 115, in test_zk_security_upgrade
> self.run_produce_consume_validate(self.run_zk_migration)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in run_produce_consume_validate
> raise e
> AssertionError: 292 acked message did not make it to the Consumer. They are: 
> 11264, 11265, 11266, 11267, 11268, 11269, 11270, 11271, 11272, 11273, 11274, 
> 11275, 11276, 11277, 11278, 11279, 11280, 11281, 11282, 11283, ...plus 252 
> more. Total Acked: 11343, Total Consumed: 11054. We validated that the first 
> 272 of these missing messages correctly made it into Kafka's data files. This 
> suggests they were lost on their way to the consumer.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-4253) Fix Kafka Stream thread shutting down process ordering

2016-10-05 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy reassigned KAFKA-4253:
-

Assignee: Damian Guy

> Fix Kafka Stream thread shutting down process ordering
> --
>
> Key: KAFKA-4253
> URL: https://issues.apache.org/jira/browse/KAFKA-4253
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Guozhang Wang
>Assignee: Damian Guy
>
> Currently we close the stream thread in the following way:
> 0. Commit all tasks.
> 1. Close the producer.
> 2. Close the consumer.
> 3. Close the restore consumer.
> 4. For each task, close its topology processors one by one in topology 
> order by calling {{processor.close()}}.
> 5. For each task, close its state manager, flushing and closing all of its 
> associated state stores.
> We close the producer / consumer clients before shutting down the tasks 
> because we need to make sure all sent records have been acked, so that we 
> have the right log-end-offset when closing the stores and checkpointing the 
> changelog offsets. However, this ordering has a problem: if users write more 
> records in their {{processor.close()}} calls, a RuntimeException is thrown 
> because the producer has already been closed, and no changelog records can 
> be written.
> Given this issue, a more appropriate ordering would be:
> 1. For each task, close its topology processors in topology order by 
> calling {{processor.close()}}.
> 2. For each task, commit its state by calling {{task.commit()}}. At this 
> point all sent records should be acked, since {{producer.flush()}} is called.
> 3. For each task, close its {{ProcessorStateManager}}.
> 4. Close all embedded clients, i.e. the producer, consumer, and restore 
> consumer.
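
For illustration only, a minimal pseudocode-style sketch of the proposed 
ordering (all names are hypothetical; the actual {{StreamThread}} code is 
Java, this is just to show the sequence):

{code}
# Hypothetical sketch of the proposed shutdown ordering, not actual
# Kafka Streams code: processors are closed while the producer is still
# open, so any records written from close() can still be sent and acked.
def shutdown(tasks, producer, consumer, restore_consumer):
    # 1. Close topology processors in topology order.
    for task in tasks:
        for processor in task.processors_in_topology_order():
            processor.close()

    # 2. Commit each task; the flush inside commit() ensures all records
    #    sent so far (including those written during close()) are acked.
    for task in tasks:
        task.commit()

    # 3. Close each task's state manager, flushing and closing its stores.
    for task in tasks:
        task.state_manager.close()

    # 4. Finally, close the embedded clients.
    producer.close()
    consumer.close()
    restore_consumer.close()
{code}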



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (KAFKA-4253) Fix Kafka Stream thread shutting down process ordering

2016-10-05 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-4253 started by Damian Guy.
-
> Fix Kafka Stream thread shutting down process ordering
> --
>
> Key: KAFKA-4253
> URL: https://issues.apache.org/jira/browse/KAFKA-4253
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Guozhang Wang
>Assignee: Damian Guy
>
> Currently we close the stream thread in the following way:
> 0. Commit all tasks.
> 1. Close the producer.
> 2. Close the consumer.
> 3. Close the restore consumer.
> 4. For each task, close its topology processors one by one in topology 
> order by calling {{processor.close()}}.
> 5. For each task, close its state manager, flushing and closing all of its 
> associated state stores.
> We close the producer / consumer clients before shutting down the tasks 
> because we need to make sure all sent records have been acked, so that we 
> have the right log-end-offset when closing the stores and checkpointing the 
> changelog offsets. However, this ordering has a problem: if users write more 
> records in their {{processor.close()}} calls, a RuntimeException is thrown 
> because the producer has already been closed, and no changelog records can 
> be written.
> Given this issue, a more appropriate ordering would be:
> 1. For each task, close its topology processors in topology order by 
> calling {{processor.close()}}.
> 2. For each task, commit its state by calling {{task.commit()}}. At this 
> point all sent records should be acked, since {{producer.flush()}} is called.
> 3. For each task, close its {{ProcessorStateManager}}.
> 4. Close all embedded clients, i.e. the producer, consumer, and restore 
> consumer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4255) Scheme not compatible with

2016-10-05 Thread Alex (JIRA)
Alex created KAFKA-4255:
---

 Summary: Scheme not compatible with 
 Key: KAFKA-4255
 URL: https://issues.apache.org/jira/browse/KAFKA-4255
 Project: Kafka
  Issue Type: Bug
Reporter: Alex


The scheme names for the SASL_PLAINTEXT and SASL_SSL protocols are invalid: 
they do not conform to RFC 3986 (Section 3.1, Scheme, p. 17, 
https://tools.ietf.org/html/rfc3986#page-17) and are therefore incompatible 
with java.net.URI.

The scheme grammar defined by the RFC is:
scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

That is, underscores are not allowed.

Possible solution:
SASL_SSL -> SASL-SSL
SASL_PLAINTEXT -> SASL-PLAINTEXT
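
For illustration, a quick standalone check of these scheme names against the 
RFC 3986 grammar quoted above (not Kafka code):

{code}
# Check scheme names against RFC 3986:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
import re

SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*$')

for scheme in ["SASL_SSL", "SASL_PLAINTEXT", "SASL-SSL", "SASL-PLAINTEXT"]:
    print("%-15s %s" % (scheme, "valid" if SCHEME_RE.match(scheme) else "invalid"))

# SASL_SSL and SASL_PLAINTEXT are rejected because '_' is not in the
# allowed character set; the hyphenated forms conform to the grammar.
{code}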




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)