[jira] [Resolved] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException

2018-06-05 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6762.

Resolution: Duplicate

Duplicate of KAFKA-6854.

> log-cleaner thread terminates due to java.lang.IllegalStateException
> ---
>
> Key: KAFKA-6762
> URL: https://issues.apache.org/jira/browse/KAFKA-6762
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0
> Environment: os: GNU/Linux 
> arch: x86_64 
> Kernel: 4.9.77 
> jvm: OpenJDK 1.8.0
>Reporter: Ricardo Bartolome
>Priority: Major
> Attachments: __consumer_offsets-9_.tar.xz
>
>
> We are experiencing some problems with the Kafka log-cleaner thread on Kafka 
> 1.0.0. We plan to update this cluster to 1.1.0 next week in order 
> to fix KAFKA-6683, but until then we can only confirm that it happens in 
> 1.0.0.
> The log-cleaner thread crashes after a while with the following error:
> {code:java}
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Beginning cleaning of log 
> __consumer_offsets-31. (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Building offset map for 
> __consumer_offsets-31... (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,218] INFO Cleaner 0: Building offset map for log 
> __consumer_offsets-31 for 16 segments in offset range [1612869, 14282934). 
> (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Offset map for log 
> __consumer_offsets-31 complete. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Cleaning log __consumer_offsets-31 
> (cleaning prior to Tue Mar 27 09:25:09 GMT 2018, discarding tombstones prior 
> to Sat Feb 24 11:04:21 GMT 2018
> )... (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,567] INFO Cleaner 0: Cleaning segment 0 in log 
> __consumer_offsets-31 (largest timestamp Fri Feb 23 11:40:54 GMT 2018) into 
> 0, discarding deletes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,570] INFO Cleaner 0: Growing cleaner I/O buffers from 
> 262144 bytes to 524288 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,576] INFO Cleaner 0: Growing cleaner I/O buffers from 
> 524288 bytes to 112 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,593] ERROR [kafka-log-cleaner-thread-0]: Error due to 
> (kafka.log.LogCleaner)
> java.lang.IllegalStateException: This log contains a message larger than 
> maximum allowable size of 112.
> at kafka.log.Cleaner.growBuffers(LogCleaner.scala:622)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:574)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:459)
> at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:396)
> at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:395)
> at scala.collection.immutable.List.foreach(List.scala:389)
> at kafka.log.Cleaner.doClean(LogCleaner.scala:395)
> at kafka.log.Cleaner.clean(LogCleaner.scala:372)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:263)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:243)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-03-28 11:14:58,601] INFO [kafka-log-cleaner-thread-0]: Stopped 
> (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO The cleaning for partition 
> __broker-11-health-check-0 is aborted and paused (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO Compaction for partition 
> __broker-11-health-check-0 is resumed (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,774] INFO The cleaning for partition 
> __broker-11-health-check-0 is aborted (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO Shutting down the log cleaner. 
> (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutting down 
> (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutdown 
> completed (kafka.log.LogCleaner)
> {code}
> What we know so far is:
>  * We are unable to reproduce it yet in a consistent manner.
>  * It only happens in the PRO cluster and not in the PRE cluster for the same 
> customer (whose message payloads are very similar).
>  * Checking our Kafka logs, it only happened on the internal topics 
> *__consumer_offsets-**
>  * When we restart the broker process, the log-cleaner starts working again, 
> but it can take between 3 minutes and several hours to die again.
>  * We work around it by temporarily increasing the message.max.bytes and 
> replica.fetch.max.bytes values to 10485760 (10MB) from the default 112 (~1MB).
>  ** Before message.max.bytes = 10MB, we tried to match message.max.size with 
> the value of replica.fetch.max.size (1048576), but the log-cleaner died with 
> the same error but a different limit.
>  ** This 
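The growBuffers loop named in the stack trace above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual kafka.log.Cleaner code; the cap of 1000012 bytes (the stock message.max.bytes default, ~1 MB) stands in for whatever limit the broker computed:

```java
public class GrowBuffersSketch {
    // The cleaner doubles its I/O buffer until it can hold the current message,
    // but gives up once the configured maximum is reached -- producing the
    // IllegalStateException seen in the report.
    static int grow(int currentSize, int neededSize, int maxIoBufferSize) {
        if (neededSize > maxIoBufferSize)
            throw new IllegalStateException(
                "This log contains a message larger than maximum allowable size of "
                + maxIoBufferSize + ".");
        int size = currentSize;
        while (size < neededSize)
            size = Math.min(size * 2, maxIoBufferSize);
        return size;
    }

    public static void main(String[] args) {
        System.out.println(grow(262144, 300000, 1000012)); // doubles once: 524288
        try {
            grow(524288, 2_000_000, 1000012); // oversized message: cleaner thread dies
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Raising message.max.bytes, as the reporters did, raises the cap rather than fixing the underlying accounting bug tracked in KAFKA-6854.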

[jira] [Resolved] (KAFKA-6965) log4j:WARN log messages printed when running kafka-console-producer OOB

2018-06-01 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6965.

Resolution: Not A Bug

> log4j:WARN log messages printed when running kafka-console-producer OOB
> ---
>
> Key: KAFKA-6965
> URL: https://issues.apache.org/jira/browse/KAFKA-6965
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Yeva Byzek
>Priority: Major
>  Labels: newbie
>
> This error message is presented when running `bin/kafka-console-producer` out 
> of the box.
> {noformat}
> log4j:WARN No appenders could be found for logger 
> (kafka.utils.Log4jControllerRegistration$).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> {noformat}
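The warning simply means log4j found no configuration on the tool's classpath. A minimal log4j.properties along these lines silences it (the appender choice is illustrative; the distribution ships config/tools-log4j.properties, which the console tools typically pick up via KAFKA_LOG4J_OPTS):

```properties
# Minimal configuration: send WARN and above to stderr.
log4j.rootLogger=WARN, stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=[%d] %p %m (%c)%n
```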



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6760) responses not logged properly in controller

2018-06-01 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6760.

   Resolution: Fixed
Fix Version/s: 1.1.1
   2.0.0

> responses not logged properly in controller
> ---
>
> Key: KAFKA-6760
> URL: https://issues.apache.org/jira/browse/KAFKA-6760
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Assignee: Mickael Maison
>Priority: Major
>  Labels: newbie
> Fix For: 2.0.0, 1.1.1
>
>
> Saw the following logging in controller.log. We need to log the 
> StopReplicaResponse properly in KafkaController.
> [2018-04-05 14:38:41,878] DEBUG [Controller id=0] Delete topic callback 
> invoked for org.apache.kafka.common.requests.StopReplicaResponse@263d40c 
> (kafka.controller.KafkaController)
> It seems that the same issue exists for LeaderAndIsrResponse as well.





[jira] [Resolved] (KAFKA-6973) setting invalid timestamp causes Kafka broker restart to fail

2018-06-01 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6973.

   Resolution: Fixed
Fix Version/s: 2.0.0

> setting invalid timestamp causes Kafka broker restart to fail
> -
>
> Key: KAFKA-6973
> URL: https://issues.apache.org/jira/browse/KAFKA-6973
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Affects Versions: 1.1.0
>Reporter: Paul Brebner
>Assignee: huxihx
>Priority: Critical
> Fix For: 2.0.0
>
>
> Setting timestamp to invalid value causes Kafka broker to fail upon startup. 
> E.g.
> ./kafka-topics.sh --create --zookeeper localhost --topic duck3 --partitions 1 
> --replication-factor 1 --config message.timestamp.type=boom
>  
> Also note that the docs say the parameter name is 
> log.message.timestamp.type, but this is silently ignored.
> This works with no error for the invalid timestamp value. But next time you 
> restart Kafka:
>  
> [2018-05-29 13:09:05,806] FATAL [KafkaServer id=0] Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> java.util.NoSuchElementException: Invalid timestamp type boom
> at org.apache.kafka.common.record.TimestampType.forName(TimestampType.java:39)
> at kafka.log.LogConfig.(LogConfig.scala:94)
> at kafka.log.LogConfig$.fromProps(LogConfig.scala:279)
> at kafka.log.LogManager$$anonfun$17.apply(LogManager.scala:786)
> at kafka.log.LogManager$$anonfun$17.apply(LogManager.scala:785)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
> at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
> at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at kafka.log.LogManager$.apply(LogManager.scala:785)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:222)
> at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
> at kafka.Kafka$.main(Kafka.scala:92)
> at kafka.Kafka.main(Kafka.scala)
> [2018-05-29 13:09:05,811] INFO [KafkaServer id=0] shutting down 
>  
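The failure mode above (accepted at topic creation, fatal at broker restart) suggests validating the value up front. A minimal sketch of the lookup that throws, modeled on org.apache.kafka.common.record.TimestampType.forName (the validation placement is illustrative):

```java
import java.util.NoSuchElementException;

public class TimestampTypeSketch {
    enum TimestampType { NO_TIMESTAMP_TYPE, CREATE_TIME, LOG_APPEND_TIME }

    // An unknown name throws instead of being silently accepted; the fix for
    // this issue is to perform this check when the config is submitted.
    static TimestampType forName(String name) {
        switch (name) {
            case "NoTimestampType": return TimestampType.NO_TIMESTAMP_TYPE;
            case "CreateTime":      return TimestampType.CREATE_TIME;
            case "LogAppendTime":   return TimestampType.LOG_APPEND_TIME;
            default: throw new NoSuchElementException("Invalid timestamp type " + name);
        }
    }

    public static void main(String[] args) {
        System.out.println(forName("CreateTime"));  // valid value passes
        try {
            forName("boom");                        // what the restart hit
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());     // Invalid timestamp type boom
        }
    }
}
```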





[jira] [Created] (KAFKA-6956) Use Java AdminClient in BrokerApiVersionsCommand

2018-05-26 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6956:
--

 Summary: Use Java AdminClient in BrokerApiVersionsCommand
 Key: KAFKA-6956
 URL: https://issues.apache.org/jira/browse/KAFKA-6956
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


The Scala AdminClient was introduced as a stop gap until we had an officially 
supported API. The Java AdminClient is the supported API so we should migrate 
all usages to it and remove the Scala AdminClient. This JIRA is for using the 
Java AdminClient in BrokerApiVersionsCommand. We would need to verify that the 
necessary APIs are available via the Java AdminClient.





[jira] [Created] (KAFKA-6955) Use Java AdminClient in DeleteRecordsCommand

2018-05-26 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6955:
--

 Summary: Use Java AdminClient in DeleteRecordsCommand
 Key: KAFKA-6955
 URL: https://issues.apache.org/jira/browse/KAFKA-6955
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


The Scala AdminClient was introduced as a stop gap until we had an officially 
supported API. The Java AdminClient is the supported API so we should migrate 
all usages to it and remove the Scala AdminClient. This JIRA is for using the 
Java AdminClient in DeleteRecordsCommand.





[jira] [Resolved] (KAFKA-6930) Update KafkaZkClient debug log

2018-05-25 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6930.

Resolution: Fixed

> Update KafkaZkClient debug log
> --
>
> Key: KAFKA-6930
> URL: https://issues.apache.org/jira/browse/KAFKA-6930
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, zkclient
>Affects Versions: 1.1.0
>Reporter: darion yaphet
>Priority: Trivial
> Attachments: [KAFKA-6930]_Update_KafkaZkClient_debug_log.patch, 
> snapshot.png
>
>
> Currently, KafkaZkClient prints data: Array[Byte] in its debug log; we 
> should print the data as a String.





[jira] [Reopened] (KAFKA-3665) Default ssl.endpoint.identification.algorithm should be https

2018-05-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-3665:


> Default ssl.endpoint.identification.algorithm should be https
> -
>
> Key: KAFKA-3665
> URL: https://issues.apache.org/jira/browse/KAFKA-3665
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 2.0.0
>
>
> The default `ssl.endpoint.identification.algorithm` is `null` which is not a 
> secure default (man in the middle attacks are possible).
> We should probably use `https` instead. A more conservative alternative would 
> be to update the documentation instead of changing the default.
> A paper on the topic (thanks to Ryan Pridgeon for the reference): 
> http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf
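For reference, opting in to the behaviour being proposed as the default is a single setting in the client or broker properties (shown as a config fragment):

```properties
# Verify the server hostname against its certificate (man-in-the-middle protection).
ssl.endpoint.identification.algorithm=https
```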





[jira] [Resolved] (KAFKA-3665) Default ssl.endpoint.identification.algorithm should be https

2018-05-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3665.

Resolution: Fixed

> Default ssl.endpoint.identification.algorithm should be https
> -
>
> Key: KAFKA-3665
> URL: https://issues.apache.org/jira/browse/KAFKA-3665
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 2.0.0
>
>
> The default `ssl.endpoint.identification.algorithm` is `null` which is not a 
> secure default (man in the middle attacks are possible).
> We should probably use `https` instead. A more conservative alternative would 
> be to update the documentation instead of changing the default.
> A paper on the topic (thanks to Ryan Pridgeon for the reference): 
> http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf





[jira] [Resolved] (KAFKA-6873) Broker is not returning data including requested offset

2018-05-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6873.

Resolution: Not A Problem

> Broker is not returning data including requested offset
> ---
>
> Key: KAFKA-6873
> URL: https://issues.apache.org/jira/browse/KAFKA-6873
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 1.1.0
> Environment: ubuntu
>Reporter: Adam Dratwinski
>Priority: Minor
>
> After upgrading Kafka to 1.1.0 from 0.9.x I experience issues related to the 
> broker returning incomplete responses. This happens for all my log-compacted 
> topics. I am using a Golang client (Sarama).
> I debugged the issue and found that for some requests brokers return a 
> FetchResponse in which all messages have offsets lower than requested. For 
> example, I request offset 1078831 and get a FetchResponse with only one 
> message at offset 1078830, which produces a missing-blocks error. If I 
> request the next offset (1078832), I get a block with many messages 
> starting at a much higher offset (e.g. 1083813). There is a big gap in offsets 
> between these records, probably because I am using log-compacted topics, but 
> all expected messages are there.
> Sarama client treats this as consumer error:
> {quote}kafka: response did not contain all the expected topic/partition blocks
> {quote}
> For the built-in Java client this issue does not happen. It appears to be 
> less strict about the data order: when the requested offset is missing from 
> the returned block, it simply requests the next offset.
> I reported this issue at the Shopify/sarama GitHub project (see 
> [https://github.com/Shopify/sarama/issues/1087]), where I got a response that 
> this seems to be a Kafka bug, since according to the documentation the broker 
> should never return only messages with lower offsets in this situation.
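The lenient behaviour attributed to the Java client — resuming at the next available offset when the requested one was compacted away — can be illustrated with a toy lookup (offsets taken from the report; this is not client code):

```java
import java.util.NavigableSet;
import java.util.TreeSet;

public class CompactedFetchSketch {
    // After compaction some offsets no longer exist; a lenient consumer
    // resumes at the first remaining offset >= the requested one.
    static Long firstAvailable(NavigableSet<Long> remainingOffsets, long requested) {
        return remainingOffsets.ceiling(requested);
    }

    public static void main(String[] args) {
        NavigableSet<Long> offsets = new TreeSet<>();
        offsets.add(1078830L);
        offsets.add(1083813L); // big gap: everything in between was compacted away
        System.out.println(firstAvailable(offsets, 1078831L)); // -> 1083813
    }
}
```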





[jira] [Created] (KAFKA-6923) Deprecate ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer

2018-05-20 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6923:
--

 Summary: Deprecate ExtendedSerializer/Serializer and 
ExtendedDeserializer/Deserializer
 Key: KAFKA-6923
 URL: https://issues.apache.org/jira/browse/KAFKA-6923
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Ismael Juma
 Fix For: 2.1.0


The Javadoc of ExtendedDeserializer states:

{code}
 * Prefer {@link Deserializer} if access to the headers is not required. Once 
Kafka drops support for Java 7, the
 * {@code deserialize()} method introduced by this interface will be added to 
Deserializer with a default implementation
 * so that backwards compatibility is maintained. This interface may be 
deprecated once that happens.
{code}

Since we have dropped Java 7 support, we should figure out how to do this. 
There are compatibility implications, so a KIP is needed.
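Since the passage turns on Java 8 default methods, here is the general shape of that approach. The Headers type is a stand-in and this is not the actual KIP design; it only shows why existing implementations would keep compiling:

```java
public class DefaultMethodSketch {
    // Stand-in for org.apache.kafka.common.header.Headers, for illustration only.
    interface Headers { }

    interface Deserializer<T> {
        T deserialize(String topic, byte[] data);

        // With Java 7 gone, the header-aware variant can live here as a
        // default method, so pre-existing implementations are unaffected
        // and ExtendedDeserializer becomes unnecessary.
        default T deserialize(String topic, Headers headers, byte[] data) {
            return deserialize(topic, data);
        }
    }

    public static void main(String[] args) {
        // An old-style implementation still works and picks up the new method.
        Deserializer<String> d = (topic, data) -> new String(data);
        System.out.println(d.deserialize("t", null, "hello".getBytes()));
    }
}
```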





[jira] [Created] (KAFKA-6921) Remove old Scala producer and all related code, tests, and tools

2018-05-19 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6921:
--

 Summary: Remove old Scala producer and all related code, tests, 
and tools
 Key: KAFKA-6921
 URL: https://issues.apache.org/jira/browse/KAFKA-6921
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma
Assignee: Ismael Juma
 Fix For: 2.0.0








[jira] [Resolved] (KAFKA-5907) Support aggregatedJavadoc in Java 9

2018-05-14 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5907.

Resolution: Fixed

[~omkreddy] It does seem to work now, so will mark it as resolved.

> Support aggregatedJavadoc in Java 9
> ---
>
> Key: KAFKA-5907
> URL: https://issues.apache.org/jira/browse/KAFKA-5907
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Priority: Major
> Fix For: 2.0.0
>
>
> The Java 9 Javadoc tool has some improvements including a search bar. 
> However, it currently fails with a number of errors like:
> {code}
> > Task :aggregatedJavadoc
> /Users/ijuma/src/kafka/streams/src/main/java/org/apache/kafka/streams/Topology.java:29:
>  error: package org.apache.kafka.streams.processor.internals does not exist
> import org.apache.kafka.streams.processor.internals.InternalTopologyBuilder;
>^
> /Users/ijuma/src/kafka/streams/src/main/java/org/apache/kafka/streams/Topology.java:30:
>  error: package org.apache.kafka.streams.processor.internals does not exist
> import org.apache.kafka.streams.processor.internals.ProcessorNode;
>^
> /Users/ijuma/src/kafka/streams/src/main/java/org/apache/kafka/streams/Topology.java:31:
>  error: package org.apache.kafka.streams.processor.internals does not exist
> import org.apache.kafka.streams.processor.internals.ProcessorTopology;
>^
> /Users/ijuma/src/kafka/streams/src/main/java/org/apache/kafka/streams/Topology.java:32:
>  error: package org.apache.kafka.streams.processor.internals does not exist
> import org.apache.kafka.streams.processor.internals.SinkNode;
> {code}





[jira] [Resolved] (KAFKA-6828) Index files are no longer sparse in Java 9/10 due to OpenJDK regression

2018-05-10 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6828.

Resolution: Fixed

Great! Let's close this then.

> Index files are no longer sparse in Java 9/10 due to OpenJDK regression
> ---
>
> Key: KAFKA-6828
> URL: https://issues.apache.org/jira/browse/KAFKA-6828
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.0.0
> Environment: CentOS 7 on EXT4 FS
>Reporter: Enrico Olivelli
>Priority: Critical
>
> This is a very strange case. I have a Kafka broker (part of a cluster of 3 
> brokers) which cannot start after upgrading Java from Oracle JDK 8 to Oracle 
> JDK 9.0.4 (the same happens with JDK 10.0.0).
> There are a lot of .index and .timeindex files taking 10MB; they are for 
> empty partitions.
> Running with Java 9 the server seems to rebuild these files and each file 
> takes "really" 10MB. The sum of all the files (calculated using du -sh) is 
> 22GB and the broker crashes during startup; the disk becomes full and no more 
> logs are written. (I can send an extract of the logs, but they only talk 
> about 'rebuilding index', the same as on Java 8.)
> Reverting the same broker to Java 8 and removing the index files, the broker 
> rebuilds them; each file again takes 10MB, but the full sum of sizes 
> (calculated using du -sh) is 38 MB!
> I am running this broker on CentOS 7 on an EXT4 FS.
> I have upgraded the broker to the latest and greatest Kafka 1.0.0 (from 
> 0.10.2) without any success.
> After checking on the JDK nio-dev list, it appears to be a regression in the 
> behaviour of RandomAccessFile.
> Just for reference, see this discussion on the nio-dev list on OpenJDK:
>  [http://mail.openjdk.java.net/pipermail/nio-dev/2018-April/005008.html]
> see
>  [https://bugs.openjdk.java.net/browse/JDK-8168628]
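The preallocation pattern at issue can be sketched with stdlib calls. Setting a file's length does not normally write any data: on ext4 the file's logical size is 10 MB while disk blocks are only allocated as entries are written, which is what the du output on Java 8 reflects; the JDK regression made real zeros get written instead:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class SparseIndexSketch {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("index-", ".idx");
        try (RandomAccessFile f = new RandomAccessFile(p.toFile(), "rw")) {
            // Preallocate 10 MB, as the broker does for index files.
            // Logical size becomes 10 MB immediately; on a filesystem with
            // sparse-file support, actual block usage stays near zero.
            f.setLength(10 * 1024 * 1024);
        }
        System.out.println(Files.size(p)); // logical size: 10485760
        Files.delete(p);
    }
}
```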





[jira] [Resolved] (KAFKA-6879) Controller deadlock following session expiration

2018-05-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6879.

Resolution: Fixed

> Controller deadlock following session expiration
> 
>
> Key: KAFKA-6879
> URL: https://issues.apache.org/jira/browse/KAFKA-6879
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 1.1.0
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 2.0.0, 1.1.1
>
>
> We have observed an apparent deadlock situation which occurs following a 
> session expiration. The suspected deadlock occurs between the zookeeper 
> "initializationLock" and the latch inside the Expire event which we use to 
> ensure all events have been handled.
> In the logs, we see the "Session expired" message following acquisition of 
> the initialization lock: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/zookeeper/ZooKeeperClient.scala#L358
> But we never see any logs indicating that the new session is being 
> initialized. In fact, the controller logs are basically empty from that point 
> on. The problem we suspect is that completion of the 
> {{beforeInitializingSession}} callback requires that all events have finished 
> processing in order to count down the latch: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/KafkaController.scala#L1525.
> But an event which was dequeued just prior to the acquisition of the write 
> lock may be unable to complete because it is awaiting acquisition of the 
> initialization lock: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/zookeeper/ZooKeeperClient.scala#L137.
> The impact is that the broker continues in a zombie state. It continues 
> fetching and is periodically added to ISRs, but it never receives any further 
> requests from the controller since it is not registered.
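The cycle described above — write lock held, in-flight event blocked on the read lock, latch that can never be counted down — can be reproduced in miniature with stdlib primitives (a simplified model, not the controller code; timeouts are used so the sketch terminates instead of hanging):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DeadlockSketch {
    // Returns true if the in-flight event completed, false if the cycle
    // left us stuck, as in the report.
    static boolean simulate() throws InterruptedException {
        ReentrantReadWriteLock initializationLock = new ReentrantReadWriteLock();
        CountDownLatch eventProcessed = new CountDownLatch(1);

        initializationLock.writeLock().lock(); // expiration handler wins the write lock

        Thread inFlightEvent = new Thread(() -> {
            // A previously dequeued event: it must take the read lock to finish.
            initializationLock.readLock().lock();
            try {
                eventProcessed.countDown();
            } finally {
                initializationLock.readLock().unlock();
            }
        });
        inFlightEvent.start();

        // The Expire event now waits for all events to finish -- but the
        // in-flight event is itself waiting on the lock we hold.
        boolean done = eventProcessed.await(300, TimeUnit.MILLISECONDS);
        initializationLock.writeLock().unlock();
        inFlightEvent.join();
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(simulate() ? "completed" : "deadlocked"); // prints "deadlocked"
    }
}
```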





[jira] [Resolved] (KAFKA-6390) Update ZooKeeper to 3.4.11, Gradle and other minor updates

2018-05-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6390.

Resolution: Fixed

> Update ZooKeeper to 3.4.11, Gradle and other minor updates
> --
>
> Key: KAFKA-6390
> URL: https://issues.apache.org/jira/browse/KAFKA-6390
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 2.0.0
>
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-2614 is a helpful fix.





[jira] [Reopened] (KAFKA-4041) kafka unable to reconnect to zookeeper behind an ELB

2018-05-07 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-4041:


> kafka unable to reconnect to zookeeper behind an ELB
> 
>
> Key: KAFKA-4041
> URL: https://issues.apache.org/jira/browse/KAFKA-4041
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1, 0.10.0.1
> Environment: RHEL EC2 instances
>Reporter: prabhakar
>Priority: Blocker
>
> Kafka brokers are unable to connect to ZooKeeper, which is behind an ELB.
> Kafka uses zkClient, which caches the IP address of ZooKeeper; even when the 
> ZooKeeper IP changes, it keeps using the old IP.
> server.properties has a DNS name. Ideally Kafka should re-resolve the IP 
> using DNS in case of any connection failures.
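The re-resolution behaviour being asked for can be sketched with a fresh lookup per reconnect attempt instead of a cached address (illustrative only; real behaviour is also subject to JVM DNS caching via networkaddress.cache.ttl):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReresolveSketch {
    // Re-resolve the configured hostname on every reconnect attempt,
    // instead of reusing the InetAddress from the first lookup.
    static InetAddress resolveForReconnect(String host) throws UnknownHostException {
        return InetAddress.getByName(host); // fresh lookup; picks up DNS changes
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolveForReconnect("localhost").isLoopbackAddress()); // true
    }
}
```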





[jira] [Resolved] (KAFKA-6853) ResponseMetadata calculates latency incorrectly (and therefore ZooKeeperRequestLatencyMs is incorrect)

2018-05-03 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6853.

Resolution: Fixed

> ResponseMetadata calculates latency incorrectly (and therefore 
> ZooKeeperRequestLatencyMs is incorrect)
> --
>
> Key: KAFKA-6853
> URL: https://issues.apache.org/jira/browse/KAFKA-6853
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Fuud
>Priority: Minor
> Fix For: 2.0.0, 1.1.1
>
>
> responseTimeMs always negative.
> Currently:
> {code}
> case class ResponseMetadata(sendTimeMs: Long, receivedTimeMs: Long) {
>   def responseTimeMs: Long = sendTimeMs - receivedTimeMs
> }
> {code}
> Should be:
> {code}
> case class ResponseMetadata(sendTimeMs: Long, receivedTimeMs: Long) {
>   def responseTimeMs: Long = receivedTimeMs - sendTimeMs
> }
> {code}
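The sign error is easy to sanity-check: with the corrected ordering, a request sent at t=100 ms and answered at t=150 ms yields a non-negative latency.

```java
public class LatencySketch {
    // Latency is the receive time minus the send time.
    static long responseTimeMs(long sendTimeMs, long receivedTimeMs) {
        return receivedTimeMs - sendTimeMs;
    }

    public static void main(String[] args) {
        System.out.println(responseTimeMs(100, 150)); // 50, not -50
    }
}
```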





[jira] [Resolved] (KAFKA-6855) Kafka fails to start due to faulty Java version detection

2018-05-03 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6855.

   Resolution: Fixed
Fix Version/s: 1.1.1
   2.0.0

Thanks for the report. This has already been fixed:

[https://github.com/apache/kafka/commit/e9f86c3085fa8b65e77072389e0dd147b744f117]

Since we had no JIRA for it, I will use this one.

> Kafka fails to start due to faulty Java version detection
> -
>
> Key: KAFKA-6855
> URL: https://issues.apache.org/jira/browse/KAFKA-6855
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
> Environment: Ubuntu 18.04
> Java 10
>Reporter: Anders Tornblad
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 2.0.0, 1.1.1
>
>
> After downloading fresh installations of ZooKeeper and Kafka, and then 
> starting both as recommended on 
> [http://kafka.apache.org/documentation/#quickstart], the following error 
> message is shown:
> {{Unrecognized VM option 'PrintGCDateStamps'}}
> I found the error in the kafka-run-class.sh file, where the Java version is 
> determined and stored in the JAVA_MAJOR_VERSION variable. My Java runtime 
> reports its version as openjdk version "10.0.1" 2018-04-17, which makes the 
> JAVA_MAJOR_VERSION value "10 2018-04-17" instead of just "10", so the 
> subsequent if statement fails.
> I found the following line to fix the problem:
> {{JAVA_MAJOR_VERSION=$($JAVA -version 2>&1 | sed -E -n 's/.* version 
> "([^.-]*).*/\1/p')}}
>  
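The sed-based extraction can be checked against both version-string formats (the sample strings below are illustrative):

```shell
# Extract the major version from `java -version` output; the /p flag
# prints only when the substitution matched.
parse_major() {
  echo "$1" | sed -E -n 's/.* version "([^.-]*).*/\1/p'
}

parse_major 'openjdk version "10.0.1" 2018-04-17'   # -> 10
parse_major 'java version "1.8.0_172"'              # -> 1
```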





[jira] [Resolved] (KAFKA-6796) Surprising UNKNOWN_TOPIC error for produce/fetch requests to non-replicas

2018-04-24 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6796.

   Resolution: Fixed
Fix Version/s: 1.2.0

> Surprising UNKNOWN_TOPIC error for produce/fetch requests to non-replicas
> -
>
> Key: KAFKA-6796
> URL: https://issues.apache.org/jira/browse/KAFKA-6796
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.0.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 1.2.0
>
>
> Currently if the client sends a produce request or a fetch request to a 
> broker which isn't a replica, we return UNKNOWN_TOPIC_OR_PARTITION. This is a 
> bit surprising to see when the topic actually exists. It would be better to 
> return NOT_LEADER to avoid confusion. Clients typically handle both errors by 
> refreshing metadata and retrying, so changing this should not cause any 
> change in behavior on the client. This case can be hit following a partition 
> reassignment after the leader is moved and the local replica is deleted.





[jira] [Reopened] (KAFKA-2334) Prevent HW from going back during leader failover

2018-04-18 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-2334:


> Prevent HW from going back during leader failover 
> --
>
> Key: KAFKA-2334
> URL: https://issues.apache.org/jira/browse/KAFKA-2334
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.8.2.1
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: reliability
>
> Consider the following scenario:
> 0. Kafka uses a replication factor of 2, with broker B1 as the leader and B2 
> as the follower.
> 1. A producer keeps sending to Kafka with ack=-1.
> 2. A consumer repeatedly issues ListOffset requests to Kafka.
> And the following sequence:
> 0. B1 current log-end-offset (LEO) 0, HW-offset 0; and same with B2.
> 1. B1 receives a ProduceRequest of 100 messages, appends them to its local 
> log (LEO becomes 100) and holds the request in purgatory.
> 2. B1 receives a FetchRequest starting at offset 0 from follower B2, and 
> returns the 100 messages.
> 3. B2 appends the received messages to its local log (LEO becomes 100).
> 4. B1 receives another FetchRequest starting at offset 100 from B2, learns 
> that B2's LEO has caught up to 100, hence updates its own HW, satisfies the 
> ProduceRequest in purgatory, and sends the FetchResponse with HW 100 back to 
> B2 ASYNCHRONOUSLY.
> 5. B1 successfully sends the ProduceResponse to the producer, and then fails, 
> so the FetchResponse never reaches B2, whose HW remains 0.
> From the consumer's point of view, it could first see the latest offset of 
> 100 (from B1), then see the latest offset of 0 (from B2), and then watch the 
> latest offset gradually catch up to 100.
> This is because we use the HW to guard ListOffset and 
> fetches from ordinary consumers.
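The invariant at play can be stated compactly: the leader's HW is the minimum LEO across the replicas it tracks, while each follower only learns a new HW from a later FetchResponse; if that response is lost, the follower keeps its stale HW. A toy sketch with values from the scenario (illustrative, not replication code):

```java
public class HighWatermarkSketch {
    // The leader advances its HW to the minimum LEO across in-sync replicas.
    static long leaderHw(long... replicaLeos) {
        long hw = Long.MAX_VALUE;
        for (long leo : replicaLeos) hw = Math.min(hw, leo);
        return hw;
    }

    public static void main(String[] args) {
        long b1Leo = 100, b2Leo = 100;
        System.out.println(leaderHw(b1Leo, b2Leo)); // B1 advances its HW to 100...
        long b2Hw = 0;                              // ...but the FetchResponse carrying it is lost,
        System.out.println(b2Hw);                   // so B2 becomes leader still at HW 0
    }
}
```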





[jira] [Created] (KAFKA-6772) Broker should load credentials from ZK before requests are allowed

2018-04-10 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6772:
--

 Summary: Broker should load credentials from ZK before requests 
are allowed
 Key: KAFKA-6772
 URL: https://issues.apache.org/jira/browse/KAFKA-6772
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 1.1.0, 1.0.0
Reporter: Ismael Juma


It is currently possible for clients to get an AuthenticationException during 
start-up if the brokers have not yet loaded credentials from ZK. This 
definitely affects SCRAM, but it may also affect delegation tokens.
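One way to enforce the ordering described is to gate request handling on a latch that the startup sequence counts down only after credentials are loaded. This is a hypothetical shape, not the actual broker code:

```java
import java.util.concurrent.CountDownLatch;

public class StartupGateSketch {
    private final CountDownLatch credentialsLoaded = new CountDownLatch(1);

    void loadCredentialsFromZk() {
        // ... fetch SCRAM credentials / delegation tokens from ZooKeeper ...
        credentialsLoaded.countDown(); // only now may authentication proceed
    }

    String handleAuthRequest() throws InterruptedException {
        credentialsLoaded.await(); // block until credentials are available
        return "authenticated";
    }

    public static void main(String[] args) throws Exception {
        StartupGateSketch broker = new StartupGateSketch();
        broker.loadCredentialsFromZk();
        System.out.println(broker.handleAuthRequest()); // prints "authenticated"
    }
}
```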





[jira] [Created] (KAFKA-6763) Consider using direct byte buffers in SslTransportLayer

2018-04-07 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6763:
--

 Summary: Consider using direct byte buffers in SslTransportLayer
 Key: KAFKA-6763
 URL: https://issues.apache.org/jira/browse/KAFKA-6763
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


We use heap byte buffers in SslTransportLayer. For netReadBuffer and 
netWriteBuffer, it means that the NIO layer has to copy to/from a native buffer 
before it can write/read to the socket. It would be good to test if switching 
to direct byte buffers improves performance. We can't be sure as the benefit of 
avoiding the copy could be offset by the specifics of the operations we perform 
on netReadBuffer, netWriteBuffer and appReadBuffer.

We should benchmark produce and consume performance and try a few combinations 
of direct/heap byte buffers for netReadBuffer, netWriteBuffer and appReadBuffer 
(the latter should probably remain as a heap byte buffer, but no harm in 
testing it too).





[jira] [Reopened] (KAFKA-6390) Update ZooKeeper to 3.4.11, Gradle and other minor updates

2018-03-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-6390:


> Update ZooKeeper to 3.4.11, Gradle and other minor updates
> --
>
> Key: KAFKA-6390
> URL: https://issues.apache.org/jira/browse/KAFKA-6390
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 1.2.0
>
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-2614 is a helpful fix.





[jira] [Resolved] (KAFKA-6683) ReplicaFetcher crashes with "Attempted to complete a transaction which was not started"

2018-03-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6683.

   Resolution: Fixed
Fix Version/s: 1.1.0

> ReplicaFetcher crashes with "Attempted to complete a transaction which was 
> not started" 
> 
>
> Key: KAFKA-6683
> URL: https://issues.apache.org/jira/browse/KAFKA-6683
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 1.0.0
> Environment: os: GNU/Linux 
> arch: x86_64
> Kernel: 4.9.77
> jvm: OpenJDK 1.8.0
>Reporter: Chema Sanchez
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 1.1.0
>
> Attachments: server.properties
>
>
> We have been experiencing this issue lately when restarting or replacing 
> brokers of our Kafka clusters during maintenance operations.
> Having restarted or replaced a broker, after some minutes of performing 
> normally it may suddenly throw the following exception and stop replicating 
> some partitions:
> {code:none}
> 2018-03-15 17:23:01,482] ERROR [ReplicaFetcher replicaId=12, leaderId=10, 
> fetcherId=0] Error due to (kafka.server.ReplicaFetcherThread)
> java.lang.IllegalArgumentException: Attempted to complete a transaction which 
> was not started
>     at 
> kafka.log.ProducerStateManager.completeTxn(ProducerStateManager.scala:720)
>     at kafka.log.Log.$anonfun$loadProducersFromLog$4(Log.scala:540)
>     at 
> kafka.log.Log.$anonfun$loadProducersFromLog$4$adapted(Log.scala:540)
>     at scala.collection.immutable.List.foreach(List.scala:389)
>     at 
> scala.collection.generic.TraversableForwarder.foreach(TraversableForwarder.scala:35)
>     at 
> scala.collection.generic.TraversableForwarder.foreach$(TraversableForwarder.scala:35)
>     at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:44)
>     at kafka.log.Log.loadProducersFromLog(Log.scala:540)
>     at kafka.log.Log.$anonfun$loadProducerState$5(Log.scala:521)
>     at kafka.log.Log.$anonfun$loadProducerState$5$adapted(Log.scala:514)
>     at scala.collection.Iterator.foreach(Iterator.scala:929)
>     at scala.collection.Iterator.foreach$(Iterator.scala:929)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
>     at scala.collection.IterableLike.foreach(IterableLike.scala:71)
>     at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>     at kafka.log.Log.loadProducerState(Log.scala:514)
>     at kafka.log.Log.$anonfun$truncateTo$2(Log.scala:1487)
>     at 
> scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
>     at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
>     at kafka.log.Log.truncateTo(Log.scala:1467)
>     at kafka.log.LogManager.$anonfun$truncateTo$2(LogManager.scala:454)
>     at 
> kafka.log.LogManager.$anonfun$truncateTo$2$adapted(LogManager.scala:445)
>     at 
> scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:789)
>     at scala.collection.immutable.Map$Map1.foreach(Map.scala:120)
>     at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:788)
>     at kafka.log.LogManager.truncateTo(LogManager.scala:445)
>     at 
> kafka.server.ReplicaFetcherThread.$anonfun$maybeTruncate$1(ReplicaFetcherThread.scala:281)
>     at scala.collection.Iterator.foreach(Iterator.scala:929)
>     at scala.collection.Iterator.foreach$(Iterator.scala:929)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
>     at scala.collection.IterableLike.foreach(IterableLike.scala:71)
>     at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>     at 
> kafka.server.ReplicaFetcherThread.maybeTruncate(ReplicaFetcherThread.scala:265)
>     at 
> kafka.server.AbstractFetcherThread.$anonfun$maybeTruncate$2(AbstractFetcherThread.scala:135)
>     at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:217)
>     at 
> kafka.server.AbstractFetcherThread.maybeTruncate(AbstractFetcherThread.scala:132)
>     at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:102)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-03-15 17:23:01,497] INFO [ReplicaFetcher replicaId=12, leaderId=10, 
> fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread)
> {code}
> As during system updates all brokers in a cluster are restarted, it happened 
> some times the issue to man

[jira] [Resolved] (KAFKA-4974) System test failure in 0.8.2.2 upgrade tests

2018-03-06 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4974.

Resolution: Fixed

> System test failure in 0.8.2.2 upgrade tests
> 
>
> Key: KAFKA-4974
> URL: https://issues.apache.org/jira/browse/KAFKA-4974
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Reporter: Magnus Edenhill
>Priority: Major
>
> The 0.10.2 system test failed in one of the upgrade tests from 0.8.2.2:
> http://testing.confluent.io/confluent-kafka-0-10-2-system-test-results/?prefix=2017-03-21--001.1490092219--apache--0.10.2--4a019bd/TestUpgrade/test_upgrade/from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy.new_consumer=False/
> {noformat}
> [INFO  - 2017-03-21 07:35:48,802 - runner_client - log - lineno:221]: 
> RunnerClient: 
> kafkatest.tests.core.upgrade_test.TestUpgrade.test_upgrade.from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy.new_consumer=False:
>  FAIL: Kafka server didn't finish startup
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py",
>  line 321, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py",
>  line 125, in test_upgrade
> self.run_produce_consume_validate(core_test_action=lambda: 
> self.perform_upgrade(from_kafka_version,
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 114, in run_produce_consume_validate
> core_test_action(*args)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py",
>  line 126, in 
> to_message_format_version))
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py",
>  line 52, in perform_upgrade
> self.kafka.start_node(node)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/services/kafka/kafka.py",
>  line 222, in start_node
> monitor.wait_until("Kafka Server.*started", timeout_sec=30, 
> backoff_sec=.25, err_msg="Kafka server didn't finish startup")
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py",
>  line 642, in wait_until
> return wait_until(lambda: self.acct.ssh("tail -c +%d %s | grep '%s'" % 
> (self.offset+1, self.log, pattern), allow_fail=True) == 0, **kwargs)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py",
>  line 36, in wait_until
> {noformat}
> Logs:
> {noformat}
> ==> ./KafkaService-0-140646398705744/worker9/server-start-stdout-stderr.log 
> <==
> [2017-03-21 07:35:18,250] DEBUG Leaving process event 
> (org.I0Itec.zkclient.ZkClient)
> [2017-03-21 07:35:18,250] INFO EventThread shut down 
> (org.apache.zookeeper.ClientCnxn)
> [2017-03-21 07:35:18,250] INFO [Kafka Server 2], shut down completed 
> (kafka.server.KafkaServer)
> Error: Exception thrown by the agent : java.rmi.server.ExportException: Port 
> already in use: 9192; nested exception is: 
>   java.net.BindException: Address already in use
> {noformat}
> That's from starting the upgraded broker, which seems to indicate that 
> 0.8.2.2 was not properly shut down or has its RMI port in the close-wait 
> state.
> Since there probably isn't much to do about 0.8.2.2, the test should be 
> hardened to either select a random port, or wait for the lingering port to 
> become available (netstat can be used for that).
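Both hardening options can be sketched in the test suite's own Python. `find_free_port` and `wait_for_port_free` are hypothetical helpers for illustration, not functions from the actual kafkatest code:

```python
import socket
import time

def find_free_port():
    """Ask the OS for a currently unused TCP port (the random-port option)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))   # port 0 = kernel picks a free port
        return s.getsockname()[1]

def wait_for_port_free(port, timeout_s=30.0, poll_s=0.25):
    """Poll until `port` can be bound (the wait-for-lingering-port option)."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind(("127.0.0.1", port))
                return True
            except OSError:
                time.sleep(poll_s)
    return False

port = find_free_port()
print(wait_for_port_free(port))  # a just-released port should bind quickly
```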
> This earlier failure from the same 0.8.2.2 invocation might be of interest:
> {noformat}
> [2017-03-21 07:35:18,233] DEBUG Writing clean shutdown marker at 
> /mnt/kafka-data-logs (kafka.log.LogManager)
> [2017-03-21 07:35:18,235] INFO Shutdown complete. (kafka.log.LogManager)
> [2017-03-21 07:35:18,238] DEBUG Shutting down task scheduler. 
> (kafka.utils.KafkaScheduler)
> [2017-03-21 07:35:18,243] WARN sleep interrupted (kafka.utils.Utils$)
> 

[jira] [Created] (KAFKA-6616) kafka-merge-pr.py should use GitHub's REST API to merge

2018-03-06 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6616:
--

 Summary: kafka-merge-pr.py should use GitHub's REST API to merge
 Key: KAFKA-6616
 URL: https://issues.apache.org/jira/browse/KAFKA-6616
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


The merge script currently squashes the commits in the pull request locally and 
then merges it to the target branch. It can also cherry-pick it to other 
branches. The downside is that GitHub doesn't know that the pull request has 
been merged. As a workaround, the script includes a keyword in the commit 
message to close the pull request. Since the merged commit is different from 
the pull request branch, GitHub assumes that the PR was not merged.

[~hachikuji] suggested that an API may be available that mimics what the GitHub 
merge button does. And he is correct. Given our recent transition to GitBox, 
committers have write access to GitHub, so it's feasible to update the merge 
script to do this. Rough steps:
 # Replace local squashing and merging with the GitHub REST API for merging 
([https://developer.github.com/v3/pulls/#merge-a-pull-request-merge-button])
 # After the merge, pull changes from target branch and offer the option to 
cherry-pick to other branches (i.e. the code may have to be updated a little 
for the rest of the workflow to work).
 # Update wiki documentation and code to state that GITHUB_OAUTH_KEY must be 
set (it's currently optional since we don't rely on any operations that require 
authentication).
 # Update wiki documentation to remove the main downside for using the merge 
script and perhaps to recommend it.

Documentation: 
https://cwiki.apache.org/confluence/display/KAFKA/Merging+Github+Pull+Requests
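Step 1 maps onto GitHub's documented merge endpoint (`PUT /repos/{owner}/{repo}/pulls/{number}/merge`). A rough sketch of the request the updated script could build; the PR number, commit title, and token value below are placeholders, and the actual script would need error handling around the response:

```python
import json
import urllib.request

def build_merge_request(owner, repo, pr_number, oauth_token, commit_title):
    """Build the PUT request GitHub's merge-button API expects.

    Mirrors the documented endpoint; "squash" matches the script's
    current local-squash behaviour.
    """
    url = "https://api.github.com/repos/%s/%s/pulls/%d/merge" % (
        owner, repo, pr_number)
    body = json.dumps({"commit_title": commit_title,
                       "merge_method": "squash"}).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Authorization", "token " + oauth_token)
    req.add_header("Content-Type", "application/json")
    return req

# Placeholder PR number and token, for illustration only:
req = build_merge_request("apache", "kafka", 1234, "GITHUB_OAUTH_KEY",
                          "KAFKA-6616: use the REST API to merge")
print(req.get_full_url())
```

Sending the request (e.g. via `urllib.request.urlopen(req)`) requires a real OAuth token, which is why step 3 makes GITHUB_OAUTH_KEY mandatory.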





[jira] [Resolved] (KAFKA-6430) Improve Kafka GZip compression performance

2018-02-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6430.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Improve Kafka GZip compression performance
> --
>
> Key: KAFKA-6430
> URL: https://issues.apache.org/jira/browse/KAFKA-6430
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, compression, core
>Reporter: Ying Zheng
>Assignee: Ying Zheng
>Priority: Minor
> Fix For: 1.1.0
>
>
> To compress messages, Kafka uses DataOutputStream on top of GZIPOutputStream:
>   new DataOutputStream(new GZIPOutputStream(buffer, bufferSize));
> To decompress messages, Kafka uses DataInputStream on top of GZIPInputStream:
>new DataInputStream(new GZIPInputStream(buffer));
> This is very straightforward, but inefficient. For each message, in addition 
> to the key and value data, Kafka has to write roughly 30 metadata bytes (the 
> exact number varies slightly across Kafka versions), including the magic 
> byte, checksum, timestamp, offset, key length, value length, etc. For each 
> of these bytes, Java's DataOutputStream has to call write(byte) once. Here 
> is the awkward writeInt() method in DataOutputStream, which writes 4 bytes 
> separately in big-endian order. 
> {code}
> public final void writeInt(int v) throws IOException {
> out.write((v >>> 24) & 0xFF);
> out.write((v >>> 16) & 0xFF);
> out.write((v >>>  8) & 0xFF);
> out.write((v >>>  0) & 0xFF);
> incCount(4);
> }
> {code}
> Unfortunately, GZIPOutputStream does not implement the write(byte) method. 
> Instead, it only provides a write(byte[], offset, len) method, which calls 
> the corresponding JNI zlib function. The write(byte) calls from 
> DataOutputStream are translated into write(byte[], offset, len) calls in a 
> very inefficient way: (Oracle JDK 1.8 code)
> {code}
> class DeflaterOutputStream {
> public void write(int b) throws IOException {
> byte[] buf = new byte[1];
> buf[0] = (byte)(b & 0xff);
> write(buf, 0, 1);
> }
> public void write(byte[] b, int off, int len) throws IOException {
> if (def.finished()) {
> throw new IOException("write beyond end of stream");
> }
> if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
> throw new IndexOutOfBoundsException();
> } else if (len == 0) {
> return;
> }
> if (!def.finished()) {
> def.setInput(b, off, len);
> while (!def.needsInput()) {
> deflate();
> }
> }
> }
> }
> class GZIPOutputStream extends DeflaterOutputStream {
> public synchronized void write(byte[] buf, int off, int len)
> throws IOException
> {
> super.write(buf, off, len);
> crc.update(buf, off, len);
> }
> }
> class Deflater {
> private native int deflateBytes(long addr, byte[] b, int off, int len, int 
> flush);
> }
> class CRC32 {
> public void update(byte[] b, int off, int len) {
> if (b == null) {
> throw new NullPointerException();
> }
> if (off < 0 || len < 0 || off > b.length - len) {
> throw new ArrayIndexOutOfBoundsException();
> }
> crc = updateBytes(crc, b, off, len);
> }
> private native static int updateBytes(int crc, byte[] b, int off, int 
> len);
> }
> {code}
> For each metadata byte, the code above has to allocate a one-byte array, 
> acquire several locks, and call two native JNI methods (Deflater.deflateBytes 
> and CRC32.updateBytes). Each Kafka message carries about 30 such metadata 
> bytes.
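The per-byte overhead can be reproduced outside the JVM. A hedged Python analogue (Python's gzip wraps the same zlib): one path pushes each byte into the compressor individually, the other coalesces bytes in a small intermediate buffer first, which is the same idea as inserting a BufferedOutputStream. Both produce a valid, identical-content stream:

```python
import gzip
import io

def compress_unbuffered(payload):
    """One write call per byte -- analogous to the pre-fix Kafka pattern."""
    out = io.BytesIO()
    with gzip.GzipFile(fileobj=out, mode="wb") as gz:
        for b in payload:
            gz.write(bytes([b]))   # each call reaches the zlib layer
    return out.getvalue()

def compress_buffered(payload, buf_size=1 << 14):
    """Same bytes, coalesced in a small buffer first (the fix's idea)."""
    out = io.BytesIO()
    with gzip.GzipFile(fileobj=out, mode="wb") as gz:
        buf = bytearray()
        for b in payload:
            buf.append(b)          # cheap in-memory append
            if len(buf) >= buf_size:
                gz.write(bytes(buf))   # flush to zlib in large chunks
                buf.clear()
        if buf:
            gz.write(bytes(buf))
    return out.getvalue()

payload = b"metadata+key+value" * 100
assert gzip.decompress(compress_unbuffered(payload)) == payload
assert gzip.decompress(compress_buffered(payload)) == payload
```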
> The call stack of Deflater.deflateBytes():
> DeflaterOutputStream.public void write(int b) -> 
> GZIPOutputStream.write(byte[] buf, int off, int len) -> 
> DeflaterOutputStream.write(byte[] b, int off, int len) -> 
> DeflaterOutputStream.deflate() -> Deflater.deflate(byte[] b, int off, int 
> len) -> Deflater.deflate(byte[] b, int off, int len, int flush) -> 
> Deflater.deflateBytes(long addr, byte[] b, int off, int len, int flush)
> The call stack of CRC32.updateBytes():
> DeflaterOutputStream.public void write(int b) -> 
> GZIPOutputStream.write(byte[] buf, int off, int len) -> CRC32.update(byte[] 
> b, int off, int len) -> CRC32.updateBytes(int crc, byte[] b, int off, int len)
> At Uber, we found that adding a small buffer between DataOutputStream and 
> GZIPOutputStream can speed up Kafka GZip compression by about 60% on average.
> {code}
>  -return new DataOutputStream(new 
> GZIPOutputStream(buffer, bufferSize));
> +return new DataOutputStream(new BufferedOutputStream(new 
> GZIPOutputStream(buffer, bufferSize), 1 << 14

[jira] [Resolved] (KAFKA-6464) Base64URL encoding under JRE 1.7 is broken due to incorrect padding assumption

2018-01-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6464.

Resolution: Fixed
  Reviewer: Rajini Sivaram  (was: Ismael Juma)

> Base64URL encoding under JRE 1.7 is broken due to incorrect padding assumption
> --
>
> Key: KAFKA-6464
> URL: https://issues.apache.org/jira/browse/KAFKA-6464
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 1.0.0
>Reporter: Ron Dagostino
>Priority: Minor
> Fix For: 1.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The org.apache.kafka.common.utils.Base64 class defers Base64 
> encoding/decoding to the java.util.Base64 class beginning with JRE 1.8 but 
> leverages javax.xml.bind.DatatypeConverter under JRE 1.7.  The implementation 
> of the encodeToString(byte[]) method returned under JRE 1.7 by 
> Base64.urlEncoderNoPadding() blindly removes the last two trailing characters 
> of the Base64 encoding under the assumption that they will always be the 
> string "==" but that is incorrect; padding can be "=", "==", or non-existent.
> For example, this statement:
>  
> {code:java}
> Base64.urlEncoderNoPadding().encodeToString(
> "{\"alg\":\"none\"}".getBytes(StandardCharsets.UTF_8));{code}
>  
> Yields this, which is incorrect (because the padding on the Base64-encoded 
> value is "=" instead of the assumed "==", so an extra character is 
> incorrectly trimmed):
> {{eyJhbGciOiJub25lIn}}
> The correct value is:
> {{eyJhbGciOiJub25lIn0}}
> There is also no Base64.urlDecoder() method, which aside from providing 
> useful functionality would also make it easy to write a unit test (there 
> currently is none).
>  
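The correct no-padding behaviour is to strip however many trailing `=` characters are present (0, 1, or 2), never a fixed two. A sketch using Python's base64 module as a stand-in for the JRE 1.7 code path, reproducing the example from the report:

```python
import base64

def url_encode_no_padding(data):
    """URL-safe Base64 without padding: strip 0, 1 or 2 trailing '='.

    Stand-in for the fix; the actual fix lives in
    org.apache.kafka.common.utils.Base64 on the JRE 1.7 path.
    """
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

encoded = url_encode_no_padding(b'{"alg":"none"}')
print(encoded)  # eyJhbGciOiJub25lIn0 -- keeps the final '0' the bug trimmed
```

This doubles as the missing unit test: the 14-byte input produces one `=` of padding, exactly the case the blind two-character trim gets wrong.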





[jira] [Resolved] (KAFKA-6307) mBeanName should be removed before returning from JmxReporter#removeAttribute()

2018-01-01 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6307.

   Resolution: Fixed
Fix Version/s: 1.1.0

> mBeanName should be removed before returning from 
> JmxReporter#removeAttribute()
> ---
>
> Key: KAFKA-6307
> URL: https://issues.apache.org/jira/browse/KAFKA-6307
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: siva santhalingam
> Fix For: 1.1.0
>
>
> JmxReporter$KafkaMbean showed up near the top in the first histo output from 
> KAFKA-6199.
> In JmxReporter#removeAttribute() :
> {code}
> KafkaMbean mbean = this.mbeans.get(mBeanName);
> if (mbean != null)
> mbean.removeAttribute(metricName.name());
> return mbean;
> {code}
> mbeans.remove(mBeanName) should be called before returning.
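The suggested fix can be sketched with a plain dict standing in for the mbeans map (names mirror the snippet above; this is not the actual JmxReporter source). The key point is that the map entry itself must be dropped, not just the attribute, or the KafkaMbean instances accumulate:

```python
# Dict stands in for JmxReporter's mbeans map.  Without the pop(), the
# KafkaMbean stays referenced by the map forever -- the leak seen near
# the top of the KAFKA-6199 histo output.

class KafkaMbean:
    def __init__(self):
        self.attributes = {}

    def remove_attribute(self, name):
        self.attributes.pop(name, None)

mbeans = {}

def remove_attribute(mbean_name, metric_name):
    mbean = mbeans.get(mbean_name)
    if mbean is not None:
        mbean.remove_attribute(metric_name)
        mbeans.pop(mbean_name)   # the missing step: drop the map entry too
    return mbean

mbeans["kafka.server:type=Fetch"] = KafkaMbean()
mbeans["kafka.server:type=Fetch"].attributes["queue-size"] = 0
removed = remove_attribute("kafka.server:type=Fetch", "queue-size")
print(len(mbeans))  # 0 -- no leaked KafkaMbean left in the map
```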





[jira] [Resolved] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2017-12-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2729.

   Resolution: Fixed
 Assignee: Onur Karaman
Fix Version/s: 1.1.0

We believe this is fixed in trunk (to become 1.1.0). If someone sees this again 
with that version, please reopen.

> Cached zkVersion not equal to that in zookeeper, broker not recovering.
> ---
>
> Key: KAFKA-2729
> URL: https://issues.apache.org/jira/browse/KAFKA-2729
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1, 0.9.0.0, 0.10.0.0, 0.10.1.0, 0.11.0.0
>Reporter: Danil Serdyuchenko
>Assignee: Onur Karaman
> Fix For: 1.1.0
>
>
> After a small network wobble where zookeeper nodes couldn't reach each other, 
> we started seeing a large number of undereplicated partitions. The zookeeper 
> cluster recovered, however we continued to see a large number of 
> undereplicated partitions. Two brokers in the kafka cluster were showing this 
> in the logs:
> {code}
> [2015-10-27 11:36:00,888] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Shrinking ISR for 
> partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 
> (kafka.cluster.Partition)
> [2015-10-27 11:36:00,891] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Cached zkVersion [66] 
> not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> This appeared for all of the topics on the affected brokers. Both brokers 
> only recovered after a restart. Our own investigation yielded nothing; I was 
> hoping you could shed some light on this issue. It is possibly related to 
> https://issues.apache.org/jira/browse/KAFKA-1382, however we're using 
> 0.8.2.1.





[jira] [Resolved] (KAFKA-6320) move ZK metrics in KafkaHealthCheck to ZookeeperClient

2017-12-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6320.

Resolution: Fixed

> move ZK metrics in KafkaHealthCheck to ZookeeperClient
> --
>
> Key: KAFKA-6320
> URL: https://issues.apache.org/jira/browse/KAFKA-6320
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 1.0.0
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 1.1.0
>
>
> In KAFKA-5473, we will be de-commissioning the usage of KafkaHealthCheck. So, 
> we need to move the ZK metrics SessionState and ZooKeeper${eventType}PerSec 
> in that class to somewhere else (e.g. ZookeeperClient).





[jira] [Resolved] (KAFKA-3496) Add reconnect attemps policies for client

2017-12-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3496.

Resolution: Won't Fix

We decided to add configs for enabling exponential backoff instead.

> Add reconnect attemps policies for client
> -
>
> Key: KAFKA-3496
> URL: https://issues.apache.org/jira/browse/KAFKA-3496
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.0
>Reporter: Florian Hussonnois
>
> Currently, client reconnection attempts are only controlled by the property 
> reconnect.backoff.ms.
> It would be nice to introduce a reconnect attempt policy. At first, two 
> policies may be defined:
> - ConstantReconnectAttemptPolicy
> - ExponentialReconnectAttemptPolicy
> The policy could then be configured as follows:
> Properties config = new Properties(); 
> config.put(ConsumerConfig.RECONNECT_ATTEMPTS_POLICY_CLASS_CONFIG, 
> "org.apache.kafka.clients.ExponentialReconnectAttemptPolicy");
> config.put(ConsumerConfig.RECONNECT_EXPONENTIAL_MAX_DELAY_MS_CONFIG, 5000);
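The two proposed policies could look roughly like this. Class names follow the reporter's proposal, sketched in Python rather than as the Java client classes they would actually be; the base delay of 50 ms is an assumed default:

```python
class ConstantReconnectAttemptPolicy:
    """Fixed delay between attempts -- today's reconnect.backoff.ms."""
    def __init__(self, backoff_ms):
        self.backoff_ms = backoff_ms

    def delay_ms(self, attempt):
        return self.backoff_ms

class ExponentialReconnectAttemptPolicy:
    """Delay doubles per attempt, capped at the configured maximum
    (RECONNECT_EXPONENTIAL_MAX_DELAY_MS_CONFIG in the proposal)."""
    def __init__(self, base_ms=50, max_delay_ms=5000):
        self.base_ms = base_ms
        self.max_delay_ms = max_delay_ms

    def delay_ms(self, attempt):
        return min(self.base_ms * (2 ** attempt), self.max_delay_ms)

policy = ExponentialReconnectAttemptPolicy(base_ms=50, max_delay_ms=5000)
print([policy.delay_ms(a) for a in range(8)])
# [50, 100, 200, 400, 800, 1600, 3200, 5000]
```

As the resolution note says, Kafka ultimately added configs for exponential backoff rather than pluggable policy classes.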





[jira] [Resolved] (KAFKA-5895) Gradle 3.0+ is needed on the build

2017-12-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5895.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Gradle 3.0+ is needed on the build
> --
>
> Key: KAFKA-5895
> URL: https://issues.apache.org/jira/browse/KAFKA-5895
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.11.0.2
>Reporter: Matthias Weßendorf
>Priority: Trivial
> Fix For: 1.1.0
>
>
> The README says:
> Kafka requires Gradle 2.0 or higher.
> but running with "2.13", I am getting an ERROR message saying that 3.0+ is 
> needed:
> {code}
> > Failed to apply plugin [class 
> > 'com.github.jengelman.gradle.plugins.shadow.ShadowBasePlugin']
>> This version of Shadow supports Gradle 3.0+ only. Please upgrade.
> {code}
> Full log here:
> {code}
> ➜  kafka git:(utils_improvment) ✗ gradle 
> To honour the JVM settings for this build a new JVM will be forked. Please 
> consider using the daemon: 
> https://docs.gradle.org/2.13/userguide/gradle_daemon.html.
> Download 
> https://jcenter.bintray.com/org/ajoberstar/grgit/1.9.3/grgit-1.9.3.pom
> Download 
> https://jcenter.bintray.com/com/github/ben-manes/gradle-versions-plugin/0.15.0/gradle-versions-plugin-0.15.0.pom
> Download 
> https://repo1.maven.org/maven2/org/scoverage/gradle-scoverage/2.1.0/gradle-scoverage-2.1.0.pom
> Download 
> https://jcenter.bintray.com/com/github/jengelman/gradle/plugins/shadow/2.0.1/shadow-2.0.1.pom
> Download 
> https://repo1.maven.org/maven2/org/eclipse/jgit/org.eclipse.jgit/4.5.2.201704071617-r/org.eclipse.jgit-4.5.2.201704071617-r.pom
> Download 
> https://repo1.maven.org/maven2/org/eclipse/jgit/org.eclipse.jgit-parent/4.5.2.201704071617-r/org.eclipse.jgit-parent-4.5.2.201704071617-r.pom
> Download 
> https://repo1.maven.org/maven2/org/eclipse/jgit/org.eclipse.jgit.ui/4.5.2.201704071617-r/org.eclipse.jgit.ui-4.5.2.201704071617-r.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.jsch/0.0.9/jsch.agentproxy.jsch-0.0.9.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy/0.0.9/jsch.agentproxy-0.0.9.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.pageant/0.0.9/jsch.agentproxy.pageant-0.0.9.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.sshagent/0.0.9/jsch.agentproxy.sshagent-0.0.9.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.usocket-jna/0.0.9/jsch.agentproxy.usocket-jna-0.0.9.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.usocket-nc/0.0.9/jsch.agentproxy.usocket-nc-0.0.9.pom
> Download https://repo1.maven.org/maven2/com/jcraft/jsch/0.1.54/jsch-0.1.54.pom
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.core/0.0.9/jsch.agentproxy.core-0.0.9.pom
> Download 
> https://jcenter.bintray.com/org/ajoberstar/grgit/1.9.3/grgit-1.9.3.jar
> Download 
> https://jcenter.bintray.com/com/github/ben-manes/gradle-versions-plugin/0.15.0/gradle-versions-plugin-0.15.0.jar
> Download 
> https://repo1.maven.org/maven2/org/scoverage/gradle-scoverage/2.1.0/gradle-scoverage-2.1.0.jar
> Download 
> https://jcenter.bintray.com/com/github/jengelman/gradle/plugins/shadow/2.0.1/shadow-2.0.1.jar
> Download 
> https://repo1.maven.org/maven2/org/eclipse/jgit/org.eclipse.jgit/4.5.2.201704071617-r/org.eclipse.jgit-4.5.2.201704071617-r.jar
> Download 
> https://repo1.maven.org/maven2/org/eclipse/jgit/org.eclipse.jgit.ui/4.5.2.201704071617-r/org.eclipse.jgit.ui-4.5.2.201704071617-r.jar
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.jsch/0.0.9/jsch.agentproxy.jsch-0.0.9.jar
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.pageant/0.0.9/jsch.agentproxy.pageant-0.0.9.jar
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.sshagent/0.0.9/jsch.agentproxy.sshagent-0.0.9.jar
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.usocket-jna/0.0.9/jsch.agentproxy.usocket-jna-0.0.9.jar
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.usocket-nc/0.0.9/jsch.agentproxy.usocket-nc-0.0.9.jar
> Download 
> https://repo1.maven.org/maven2/com/jcraft/jsch.agentproxy.core/0.0.9/jsch.agentproxy.core-0.0.9.jar
> Building project 'core' with Scala version 2.11.11
> FAILURE: Build failed with an exception.
> * Where:
> Build file '/home/Matthias/Work/Apache/kafka/build.gradle' line: 978
> * What went wrong:
> A problem occurred evaluating root project 'kafka'.
> > Failed to apply plugin [class 
> > 'com.github.jengelman.gradle.plugins.shadow.ShadowBasePlugin']
>> This version of Shadow supports Gradle 3.0+ only. Please upgrade.
> * Try:
> Run with --stacktrace option to get the stack trace. Run with --info or 
> --debug optio

[jira] [Resolved] (KAFKA-6331) Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs

2017-12-20 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6331.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Transient failure in 
> kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
> --
>
> Key: KAFKA-6331
> URL: https://issues.apache.org/jira/browse/KAFKA-6331
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Guozhang Wang
>Assignee: Dong Lin
> Fix For: 1.1.0
>
>
> Saw this error once on Jenkins: 
> https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3025/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/
> {code}
> Stacktrace
> java.lang.AssertionError: timed out waiting for message produce
>   at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
>   at 
> kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:357)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at java.base/java.lang.Thread.run(Thread.java:844)
> Standard Output
> [2017-12-07 19:22:56,297] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:22:59,447] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:22:59,453] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:01,335] ERROR Error while creating ephemeral at 
> /controller, node already exists and owner '99134641238966279' does not match 
> current session '99134641238966277' 
> (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
> [2017-12-07 19:23:04,695] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:04,760] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:06,764] ERROR Error while creating ephemeral at 
> /controller, node already exists and owner '99134641586700293' does not match 
> current session '99134641586700295' 
> (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
> [2017-12-07 19:23:09,379] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:09,387] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:11,533] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:11,539] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN serve
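The assertion in the stack trace above originates from `kafka.utils.TestUtils.waitUntilTrue`, a poll-until-condition helper. A minimal Python analogue of that pattern, for illustration only (not the Kafka test utility itself):

```python
import time


def wait_until_true(condition, msg: str, timeout_s: float = 15.0,
                    pause_s: float = 0.1) -> None:
    """Poll `condition` until it returns True or the deadline passes,
    then fail with `msg` -- the same shape as TestUtils.waitUntilTrue."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(pause_s)
    raise AssertionError(msg)
```

A flaky test fails here exactly when the condition (e.g. "message was produced") never becomes true within the timeout.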

[jira] [Created] (KAFKA-6390) Update ZooKeeper to 3.4.11 and other minor updates

2017-12-20 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6390:
--

 Summary: Update ZooKeeper to 3.4.11 and other minor updates
 Key: KAFKA-6390
 URL: https://issues.apache.org/jira/browse/KAFKA-6390
 Project: Kafka
  Issue Type: Bug
Reporter: Ismael Juma
Assignee: Ismael Juma






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5631) Use Jackson for serialising to JSON

2017-12-12 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5631.

Resolution: Fixed

> Use Jackson for serialising to JSON
> ---
>
> Key: KAFKA-5631
> URL: https://issues.apache.org/jira/browse/KAFKA-5631
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Umesh Chaudhary
>  Labels: newbie
> Fix For: 1.1.0
>
>
> We currently serialise to JSON via a manually written method `Json.encode`. 
> The implementation is naive: it does a lot of unnecessary String 
> concatenation and it doesn't handle escaping well.
> KAFKA-1595 switches to Jackson for parsing, so it would make sense to do this 
> after that one is merged.





[jira] [Resolved] (KAFKA-6319) kafka-acls regression for comma characters (and maybe other characters as well)

2017-12-12 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6319.

Resolution: Fixed

> kafka-acls regression for comma characters (and maybe other characters as 
> well)
> ---
>
> Key: KAFKA-6319
> URL: https://issues.apache.org/jira/browse/KAFKA-6319
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Affects Versions: 1.0.0
> Environment: Debian 8. Java 8. SSL clients.
>Reporter: Jordan Mcmillan
>Assignee: Rajini Sivaram
>  Labels: regression
> Fix For: 1.1.0
>
>
> As of version 1.0.0, kafka-acls.sh no longer recognizes my ACLs stored in 
> zookeeper. I am using SSL and the default principal builder class. My 
> principal name contains a comma. Ex:
> "CN=myhost.mycompany.com,OU=MyOU,O=MyCompany, Inc.,ST=MyST,C=US"
> The default principal builder uses the getName() function in 
> javax.security.auth.x500:
> https://docs.oracle.com/javase/8/docs/api/javax/security/auth/x500/X500Principal.html#getName
> The documentation states "The only characters in attribute values that are 
> escaped are those which section 2.4 of RFC 2253 states must be escaped". This 
> makes sense as my kafka-authorizer log shows the comma correctly escaped with 
> a backslash:
> INFO Principal = User:CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\, 
> Inc.,ST=MyST,C=US is Denied Operation = Describe from host = 1.2.3.4 on 
> resource = Topic:mytopic (kafka.authorizer.logger)
> Here's what I get when I try to create the ACL in kafka 1.0:
> {code:java}
> > # kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 --add 
> > --allow-principal User:"CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\, 
> > Inc.,ST=MyST,C=US" --operation "Describe" --allow-host "*" --topic="mytopic"
> Adding ACLs for resource `Topic:mytopic`:
>  User:CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\, Inc.,ST=MyST,C=US 
> has Allow permission for operations: Describe from hosts: *
> Current ACLs for resource `Topic:mytopic`:
>  "User:CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\, Inc.,ST=MyST,C=US has 
> Allow permission for operations: Describe from hosts: *">
> {code}
> Examining Zookeeper, I can see the data. Though I notice that the json string 
> for ACLs is not actually valid since the backslash is not escaped with a 
> double backslash. This was true for 0.11.0.1 as well, but was never actually 
> a problem.
> {code:java}
> > #  zk-shell localhost:2181
> Welcome to zk-shell (1.1.1)
> (CLOSED) />
> (CONNECTED) /> get /kafka-acl/Topic/mytopic
> {"version":1,"acls":[{"principal":"User:CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\,
>  
> Inc.,ST=MyST,C=US","permissionType":"Allow","operation":"Describe","host":"*"}]}
> (CONNECTED) /> json_get /kafka-acl/Topic/mytopic acls
> Path /kafka-acl/Topic/mytopic has bad JSON.
> {code}
> Now Kafka does not recognize any ACLs that have an escaped comma in the 
> principal name and all the clients are denied access. I tried searching for 
> anything relevant that changed between 0.11.0.1 and 1.0.0 and I noticed 
> KAFKA-1595:
> https://github.com/apache/kafka/commit/8b14e11743360a711b2bb670cf503acc0e604602#diff-db89a14f2c85068b1f0076d52e590d05
> Could the new json library be failing because the acl is not actually a valid 
> json string? 
> I can store a valid json string with an escaped backslash (ex: containing 
> "O=MyCompany\\, Inc."), and the comparison between the principal builder 
> string and what is read from zookeeper succeeds. However, successively 
> applying ACLs seems to strip the backslashes and generally corrupts things:
> {code:java}
> > #  kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 
> > --add --allow-principal 
> > User:"CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\\\, Inc.,ST=MyST,C=US" 
> > --operation Describe --group="*" --allow-host "*" --topic="mytopic"
> Adding ACLs for resource `Topic:mytopic`:
>  User:CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\\, Inc.,ST=MyST,C=US 
> has Allow permission for operations: Describe from hosts: *
> Adding ACLs for resource `Group:*`:
>  User:CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\\, Inc.,ST=MyST,C=US 
> has Allow permission for operations: Describe from hosts: *
> Current ACLs for resource `Topic:mytopic`:
>  User:CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\, Inc.,ST=MyST,C=US 
> has Allow permission for operations: Describe from hosts: *
> Current ACLs for resource `Group:*`:
>  User:CN=CN=myhost.mycompany.com,OU=MyOU,O=MyCompany\, Inc.,ST=MyST,C=US 
> has Allow permission for operations: Describe from hosts: *
> {code}
> It looks as though the backslash used for escaping RFC 2253 strings is not 
> being handled correctly. That's as far as I've dug.
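The invalid-JSON hypothesis above is easy to check independently: a lone backslash before a comma is not a legal escape inside a JSON string, so the value stored in zookeeper cannot parse, while a doubled backslash round-trips cleanly. A small illustration using Python's stdlib parser (the ACL value is abbreviated, not the reporter's full DN):

```python
import json

# Raw JSON as stored in zookeeper per the report: the backslash before the
# comma is a lone escape, which the JSON grammar does not allow.
# (In a Python string literal, '\\' is one literal backslash.)
invalid = '{"principal": "O=MyCompany\\, Inc."}'    # raw text: ...MyCompany\, Inc...
valid = '{"principal": "O=MyCompany\\\\, Inc."}'    # raw text: ...MyCompany\\, Inc...


def parses(s: str) -> bool:
    """Return True iff `s` is valid JSON."""
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False


print(parses(invalid))                 # False: the lone \, escape is rejected
print(json.loads(valid)["principal"])  # decodes to a single literal backslash
```

So a strict JSON parser would indeed reject the stored ACL, consistent with the `json_get` failure shown above.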




[jira] [Resolved] (KAFKA-6194) Server crash while deleting segments

2017-12-12 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6194.

   Resolution: Fixed
 Assignee: Ismael Juma
Fix Version/s: 1.1.0

I believe this was fixed by KAFKA-6324, please reopen if it still happens in 
trunk.

> Server crash while deleting segments
> 
>
> Key: KAFKA-6194
> URL: https://issues.apache.org/jira/browse/KAFKA-6194
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0
> Environment: kafka version: 1.0
> client versions: 0.8.2.1-0.10.2.1
> platform: aws (eu-west-1a)
> nodes: 36 x r4.xlarge
> disk storage: 2.5 tb per node (~73% usage per node)
> topics: 250
> number of partitions: 48k (approx)
> os: ubuntu 14.04
> jvm: Java(TM) SE Runtime Environment (build 1.8.0_131-b11), Java HotSpot(TM) 
> 64-Bit Server VM (build 25.131-b11, mixed mode)
>Reporter: Ben Corlett
>Assignee: Ismael Juma
>  Labels: regression
> Fix For: 1.1.0
>
> Attachments: server.log.2017-11-14-03.gz
>
>
> We upgraded our R+D cluster to 1.0 in the hope that it would fix the deadlock 
> from 0.11.0.*. Sadly our cluster has the memory-leak issues with 1.0, most 
> likely from https://issues.apache.org/jira/browse/KAFKA-6185. We are running 
> one server on a patched version of 1.0 with the pull request from that issue.
> However, today we have had two different servers fall over for non-heap-related 
> reasons. The exceptions in the kafka log are:
> {code}
> [2017-11-09 15:32:04,037] ERROR Error while deleting segments for 
> xx-49 in dir /mnt/secure/kafka/datalog 
> (kafka.server.LogDirFailureChannel)
> java.io.IOException: Delete of log .log.deleted failed.
> at kafka.log.LogSegment.delete(LogSegment.scala:496)
> at kafka.log.Log.$anonfun$asyncDeleteSegment$3(Log.scala:1596)
> at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> at kafka.log.Log.deleteSeg$1(Log.scala:1596)
> at kafka.log.Log.$anonfun$asyncDeleteSegment$4(Log.scala:1599)
> at 
> kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:110)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> [2017-11-09 15:32:04,040] INFO [ReplicaManager broker=122] Stopping serving 
> replicas in dir /mnt/secure/kafka/datalog (kafka.server.ReplicaManager)
> [2017-11-09 15:32:04,041] ERROR Uncaught exception in scheduled task 
> 'delete-file' (kafka.utils.KafkaScheduler)
> org.apache.kafka.common.errors.KafkaStorageException: Error while deleting 
> segments for xx-49 in dir /mnt/secure/kafka/datalog
> Caused by: java.io.IOException: Delete of log 
> .log.deleted failed.
> at kafka.log.LogSegment.delete(LogSegment.scala:496)
> at kafka.log.Log.$anonfun$asyncDeleteSegment$3(Log.scala:1596)
> at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> at kafka.log.Log.deleteSeg$1(Log.scala:1596)
> at kafka.log.Log.$anonfun$asyncDeleteSegment$4(Log.scala:1599)
> at 
> kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:110)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> .
> [2017-11-09 15:32:05,341] ERROR Error while processing data for partition 
> xxx-83 (k

[jira] [Resolved] (KAFKA-6075) Kafka cannot recover after an unclean shutdown on Windows

2017-12-12 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6075.

   Resolution: Fixed
 Assignee: Ismael Juma
Fix Version/s: 1.1.0

I believe this was fixed by KAFKA-6324, please reopen if it still happens in 
trunk.

> Kafka cannot recover after an unclean shutdown on Windows
> -
>
> Key: KAFKA-6075
> URL: https://issues.apache.org/jira/browse/KAFKA-6075
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.1
>Reporter: Vahid Hashemian
>Assignee: Ismael Juma
> Fix For: 1.1.0
>
> Attachments: 6075.v4
>
>
> An unclean shutdown of broker on Windows cannot be recovered by Kafka. Steps 
> to reproduce from a fresh build:
> # Start zookeeper
> # Start a broker
> # Create a topic {{test}}
> # Do an unclean shutdown of broker (find the process id by {{wmic process 
> where "caption = 'java.exe' and commandline like '%server.properties%'" get 
> processid}}), then kill the process by {{taskkill /pid  /f}}
> # Start the broker again
> This leads to the following errors:
> {code}
> [2017-10-17 17:13:24,819] ERROR Error while loading log dir C:\tmp\kafka-logs 
> (kafka.log.LogManager)
> java.nio.file.FileSystemException: 
> C:\tmp\kafka-logs\test-0\.timeindex: The process cannot 
> access the file because it is being used by another process.
> at 
> sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
> at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
> at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
> at 
> sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
> at 
> sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
> at java.nio.file.Files.deleteIfExists(Files.java:1165)
> at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333)
> at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295)
> at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
> at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
> at kafka.log.Log.loadSegmentFiles(Log.scala:295)
> at kafka.log.Log.loadSegments(Log.scala:404)
> at kafka.log.Log.(Log.scala:201)
> at kafka.log.Log$.apply(Log.scala:1729)
> at 
> kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221)
> at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [2017-10-17 17:13:24,819] ERROR Error while deleting the clean shutdown file 
> in dir C:\tmp\kafka-logs (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: 
> C:\tmp\kafka-logs\test-0\.timeindex: The process cannot 
> access the file because it is being used by another process.
> at 
> sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
> at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
> at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
> at 
> sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
> at 
> sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
> at java.nio.file.Files.deleteIfExists(Files.java:1165)
> at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333)
> at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295)
> at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
> at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
> at kafka.log.

[jira] [Resolved] (KAFKA-6322) Error deleting log for topic, all log dirs failed.

2017-12-12 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6322.

Resolution: Fixed
  Assignee: Ismael Juma

I believe this was fixed by KAFKA-6324, please reopen if it still happens in 
trunk.

> Error deleting log for topic, all log dirs failed.
> --
>
> Key: KAFKA-6322
> URL: https://issues.apache.org/jira/browse/KAFKA-6322
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: RancherOS with NFS mounted for log directory
>Reporter: dongyan li
>Assignee: Ismael Juma
>
> Hello,
> I encountered an error when I try to delete a topic with Kafka version 1.0.0; 
> the error is not present on version 0.10.2.1, which is the version I upgraded 
> from.
> I suspect that some other thread is still using the file while Kafka is 
> trying to delete it.
> {code:java}
> 12/6/2017 3:37:32 PM[2017-12-06 21:37:32,846] ERROR Exception while deleting 
> Log(/opt/kafka/logs/topicname-0.112ff11872e4411ca7470ba7e3026ab0-delete) in 
> dir /opt/kafka/logs. (kafka.log.LogManager)
> 12/6/2017 3:37:32 PMorg.apache.kafka.common.errors.KafkaStorageException: 
> Error while deleting log for topicname-0 in dir /opt/kafka/logs
> 12/6/2017 3:37:32 PMCaused by: java.nio.file.FileSystemException: 
> /opt/kafka/logs/topicname-0.112ff11872e4411ca7470ba7e3026ab0-delete/.nfs00e609f200ce:
>  Device or resource busy
> 12/6/2017 3:37:32 PM  at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
> 12/6/2017 3:37:32 PM  at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> 12/6/2017 3:37:32 PM  at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> 12/6/2017 3:37:32 PM  at 
> sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244)
> 12/6/2017 3:37:32 PM  at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
> 12/6/2017 3:37:32 PM  at java.nio.file.Files.delete(Files.java:1126)
> 12/6/2017 3:37:32 PM  at 
> org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:630)
> 12/6/2017 3:37:32 PM  at 
> org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:619)
> 12/6/2017 3:37:32 PM  at java.nio.file.Files.walkFileTree(Files.java:2670)
> 12/6/2017 3:37:32 PM  at java.nio.file.Files.walkFileTree(Files.java:2742)
> 12/6/2017 3:37:32 PM  at 
> org.apache.kafka.common.utils.Utils.delete(Utils.java:619)
> 12/6/2017 3:37:32 PM  at kafka.log.Log.$anonfun$delete$2(Log.scala:1432)
> 12/6/2017 3:37:32 PM  at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
> 12/6/2017 3:37:32 PM  at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> 12/6/2017 3:37:32 PM  at kafka.log.Log.delete(Log.scala:1427)
> 12/6/2017 3:37:32 PM  at kafka.log.LogManager.deleteLogs(LogManager.scala:626)
> 12/6/2017 3:37:32 PM  at 
> kafka.log.LogManager.$anonfun$startup$7(LogManager.scala:362)
> 12/6/2017 3:37:32 PM  at 
> kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:110)
> 12/6/2017 3:37:32 PM  at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> 12/6/2017 3:37:32 PM  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 12/6/2017 3:37:32 PM  at 
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 12/6/2017 3:37:32 PM  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 12/6/2017 3:37:32 PM  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 12/6/2017 3:37:32 PM  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 12/6/2017 3:37:32 PM  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 12/6/2017 3:37:32 PM  at java.lang.Thread.run(Thread.java:748)
> {code}
> Then
> {code:java}
> 12/6/2017 3:37:32 PM[2017-12-06 21:37:32,882] INFO Stopping serving logs in 
> dir /opt/kafka/logs (kafka.log.LogManager)
> 12/6/2017 3:37:32 PM[2017-12-06 21:37:32,883] FATAL Shutdown broker because 
> all log dirs in /opt/kafka/logs have failed (kafka.log.LogManager)
> {code}
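The `Device or resource busy` failure above is characteristic of NFS silly-rename: deleting a file that is still open leaves a `.nfsXXXX` placeholder that stays busy until the last handle closes. Purely as an illustration of a retry-on-EBUSY workaround (the fix actually taken in Kafka was to harden segment deletion in KAFKA-6324, not this):

```python
import errno
import shutil
import time
from pathlib import Path


def delete_dir_with_retry(path: Path, attempts: int = 5,
                          delay_s: float = 0.2) -> None:
    """Retry a recursive delete that fails with EBUSY. On NFS the busy
    .nfsXXXX file disappears once the last open handle is closed, so a
    later retry can succeed. Illustrative sketch only."""
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except OSError as e:
            if e.errno != errno.EBUSY or attempt == attempts - 1:
                raise
            time.sleep(delay_s)
```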





[jira] [Created] (KAFKA-6332) Kafka system tests should use nc instead of log grep to detect start-up

2017-12-08 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6332:
--

 Summary: Kafka system tests should use nc instead of log grep to 
detect start-up
 Key: KAFKA-6332
 URL: https://issues.apache.org/jira/browse/KAFKA-6332
 Project: Kafka
  Issue Type: Bug
Reporter: Ismael Juma


[~ewencp] suggested using an `nc -z` check instead of grepping the logs, for a 
more reliable test. This came up when the system tests were broken by a log 
improvement change.

Reference: https://github.com/apache/kafka/pull/3834
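The proposed `nc -z` readiness check amounts to "the broker is up once its port accepts a TCP connection", independent of log wording. A minimal sketch of that check (hypothetical helper, not part of the Kafka system-test framework):

```python
import socket


def broker_is_listening(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Equivalent of `nc -z host port`: report start-up by whether a TCP
    connect succeeds, instead of grepping the broker log for a message
    whose wording may change between releases."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```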





[jira] [Created] (KAFKA-6330) KafkaZkClient request queue time metric

2017-12-08 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6330:
--

 Summary: KafkaZkClient request queue time metric
 Key: KAFKA-6330
 URL: https://issues.apache.org/jira/browse/KAFKA-6330
 Project: Kafka
  Issue Type: Bug
  Components: zkclient
Reporter: Ismael Juma


KafkaZkClient has a latency metric, which measures the time it takes to send a 
request and receive the corresponding response.

If ZooKeeperClient's `maxInFlightRequests` (10 by default) is reached, a 
request may be held for some time before sending starts. This time is not 
currently measured, and it may be useful to know if requests are spending 
longer than usual in the `queue` (conceptually, as the current implementation 
doesn't use a queue).

This would require a KIP.
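One way to obtain the missing measurement is to stamp each request when it is queued and again when sending starts, and report the difference. The sketch below is purely illustrative; the class and method names are assumptions, not the real KafkaZkClient/ZooKeeperClient API:

```python
import time
from typing import Optional


class TimedRequest:
    """Hypothetical request wrapper: record the enqueue time so queue wait
    can be reported separately from send/receive latency."""

    def __init__(self, payload: bytes) -> None:
        self.payload = payload
        self.enqueued_at = time.monotonic()
        self.sent_at: Optional[float] = None

    def mark_sent(self) -> None:
        # Called when the request actually starts being written to the wire.
        self.sent_at = time.monotonic()

    def queue_time_ms(self) -> float:
        if self.sent_at is None:
            raise ValueError("request has not been sent yet")
        return (self.sent_at - self.enqueued_at) * 1000.0
```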





[jira] [Created] (KAFKA-6324) Change LogSegment.delete to deleteIfExists and harden log recovery

2017-12-07 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6324:
--

 Summary: Change LogSegment.delete to deleteIfExists and harden log 
recovery
 Key: KAFKA-6324
 URL: https://issues.apache.org/jira/browse/KAFKA-6324
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma
Assignee: Ismael Juma
 Fix For: 1.1.0


Fixes KAFKA-6194 and makes the code more robust.
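The `deleteIfExists` semantics referred to above make deletion idempotent: a file that is already gone counts as success rather than an error, which matters when log recovery re-runs a deletion that partially completed before a crash. A minimal sketch of the semantics (not the actual LogSegment code):

```python
from pathlib import Path


def delete_if_exists(path: Path) -> bool:
    """Treat a missing file as already deleted instead of raising, so a
    crash-recovery pass can safely repeat deletions. Returns True iff this
    call removed the file."""
    try:
        path.unlink()
        return True
    except FileNotFoundError:
        return False
```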





[jira] [Resolved] (KAFKA-6313) Kafka Core should have explicit SLF4J API dependency

2017-12-07 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6313.

Resolution: Fixed

> Kafka Core should have explicit SLF4J API dependency
> 
>
> Key: KAFKA-6313
> URL: https://issues.apache.org/jira/browse/KAFKA-6313
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Randall Hauch
>Assignee: Randall Hauch
> Fix For: 1.1.0
>
>
> In an application that depends on the Kafka server artifacts with:
> {code:xml}
>   
>   org.apache.kafka
>   kafka_2.11
>   1.1.0-SNAPSHOT
>   
> {code}
> and then running the server programmatically, the following error occurs:
> {noformat}
> [2017-11-23 01:01:45,029] INFO Shutting down producer 
> (kafka.producer.Producer:63)
> [2017-11-23 01:01:45,051] INFO Closing all sync producers 
> (kafka.producer.ProducerPool:63)
> [2017-11-23 01:01:45,052] INFO Producer shutdown completed in 23 ms 
> (kafka.producer.Producer:63)
> [2017-11-23 01:01:45,052] INFO [KafkaServer id=1] shutting down 
> (kafka.server.KafkaServer:63)
> [2017-11-23 01:01:45,057] ERROR [KafkaServer id=1] Fatal error during 
> KafkaServer shutdown. (kafka.server.KafkaServer:161)
> java.lang.NoClassDefFoundError: org/slf4j/event/Level
>   at kafka.utils.CoreUtils$.swallow$default$3(CoreUtils.scala:83)
>   at kafka.server.KafkaServer.shutdown(KafkaServer.scala:520)
>...
> Caused by: java.lang.ClassNotFoundException: org.slf4j.event.Level
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:312)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 25 more
> {noformat}
> It appears that KAFKA-1044 and [this 
> PR|https://github.com/apache/kafka/pull/3477] removed the use of Log4J from 
> Core but [added use 
> of|https://github.com/confluentinc/kafka/commit/ed8b0315a6c3705b2a163ce3ab4723234779264f#diff-52505b9374ea885e44bcb73cbc4714d6R34]
>  the {{org.slf4j.event.Level}} in {{CoreUtils.scala}}. The 
> {{org.slf4j.event.Level}} class is in the {{org.slf4j:slf4j-api}} artifact, 
> which is currently not included in the dependencies of 
> {{org.apache.kafka:kafka_2.11:1.1.0-SNAPSHOT}}. Because this is needed by the 
> server, the SLF4J API library probably needs to be added to the dependencies.
> [~viktorsomogyi] and [~ijuma], was this intentional, or is it intended that 
> the SLF4J API be marked as {{provided}}? BTW, I marked this as CRITICAL just 
> because this probably needs to be sorted out before the release.
> *Update*: As the comments below explain, the actual problem is a bit 
> different to what is in the JIRA description.





[jira] [Resolved] (KAFKA-6065) Add ZooKeeperRequestLatencyMs to KafkaZkClient

2017-12-06 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6065.

Resolution: Fixed

> Add ZooKeeperRequestLatencyMs to KafkaZkClient
> --
>
> Key: KAFKA-6065
> URL: https://issues.apache.org/jira/browse/KAFKA-6065
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Ismael Juma
> Fix For: 1.1.0
>
>
> Among other things, KIP-188 added latency metrics to ZkUtils. We should add 
> the same metrics to KafkaZkClient.





[jira] [Created] (KAFKA-6317) Maven artifact for kafka should not depend on log4j

2017-12-06 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6317:
--

 Summary: Maven artifact for kafka should not depend on log4j
 Key: KAFKA-6317
 URL: https://issues.apache.org/jira/browse/KAFKA-6317
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma
Assignee: Ismael Juma


It should only depend on slf4j-api. The release tarball should still depend on 
log4j and slf4j-log4j12.





[jira] [Resolved] (KAFKA-6174) Add methods in Options classes to keep binary compatibility with 0.11

2017-12-06 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6174.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Add methods in Options classes to keep binary compatibility with 0.11
> -
>
> Key: KAFKA-6174
> URL: https://issues.apache.org/jira/browse/KAFKA-6174
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
> Fix For: 1.1.0, 1.0.1
>
>
> From 0.11 to 1.0, we moved `DescribeClusterOptions timeoutMs(Integer 
> timeoutMs)` from DescribeClusterOptions to AbstractOptions. A user reports 
> that code compiled against 0.11.0 fails when it is executed with the 1.0 
> kafka-clients jar.





[jira] [Resolved] (KAFKA-2188) JBOD Support

2017-12-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2188.

Resolution: Duplicate

KIP-112 and KIP-113 replaced this one.

> JBOD Support
> 
>
> Key: KAFKA-2188
> URL: https://issues.apache.org/jira/browse/KAFKA-2188
> Project: Kafka
>  Issue Type: Bug
>Reporter: Andrii Biletskyi
>Assignee: Andrii Biletskyi
> Attachments: KAFKA-2188.patch, KAFKA-2188.patch, KAFKA-2188.patch
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support





[jira] [Created] (KAFKA-6261) Request logging throws exception if acks=0

2017-11-22 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6261:
--

 Summary: Request logging throws exception if acks=0
 Key: KAFKA-6261
 URL: https://issues.apache.org/jira/browse/KAFKA-6261
 Project: Kafka
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Ismael Juma
Assignee: Ismael Juma
 Fix For: 1.1.0, 1.0.1








[jira] [Resolved] (KAFKA-1044) change log4j to slf4j

2017-11-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-1044.

   Resolution: Fixed
Fix Version/s: 1.1.0

> change log4j to slf4j 
> --
>
> Key: KAFKA-1044
> URL: https://issues.apache.org/jira/browse/KAFKA-1044
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8.0
>Reporter: shijinkui
>Assignee: Viktor Somogyi
>Priority: Minor
>  Labels: newbie
> Fix For: 1.1.0
>
>
> Can you change log4j to slf4j? In my project I use logback, which conflicts 
> with log4j.





[jira] [Resolved] (KAFKA-4871) Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP changes

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4871.

Resolution: Duplicate

Duplicate of KAFKA-5473.

> Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP 
> changes
> ---
>
> Key: KAFKA-4871
> URL: https://issues.apache.org/jira/browse/KAFKA-4871
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Stephane Maarek
>
> I had a ZooKeeper cluster whose nodes automatically obtain hostnames so 
> that they remain constant over time. I deleted my 3 ZooKeeper machines and 
> new machines came back online with the same hostnames, updating their CNAMEs.
> Kafka then failed and couldn't reconnect to ZooKeeper, as it didn't try to 
> resolve the ZooKeeper IPs again. See the log below:
> [2017-03-09 05:49:57,302] INFO Client will use GSSAPI as SASL mechanism. 
> (org.apache.zookeeper.client.ZooKeeperSaslClient)
> [2017-03-09 05:49:57,302] INFO Opening socket connection to server 
> zookeeper-3.example.com/10.12.79.43:2181. Will attempt to SASL-authenticate 
> using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)
> [ec2-user]$ dig +short zookeeper-3.example.com
> 10.12.79.36
> As you can see, even though the machine is capable of resolving the new 
> hostname, Kafka somehow didn't respect the TTL (set to 60 seconds) and 
> didn't pick up the new IP. I feel that on a failed ZooKeeper connection, 
> Kafka should at least try to resolve the ZooKeeper IP again. That would 
> allow Kafka to keep up with ZooKeeper changes over time.
> What do you think? Is that expected behaviour or a bug?
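The failure mode described above (a resolver result cached once and never refreshed) goes away if the client re-resolves the hostname on every reconnect attempt instead of reusing the first lookup. A minimal Python sketch of that idea using the standard-library resolver; the hostname, port, and function names are illustrative, not taken from Kafka or zkclient:

```python
import socket

def resolve_current_addresses(host, port):
    """Ask the resolver for the host's current addresses instead of
    reusing a cached result, so CNAME/TTL changes are picked up."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); keep the IPs.
    return sorted({info[4][0] for info in infos})

def reconnect(host, port, connect=socket.create_connection):
    """On each reconnect attempt, re-resolve and try every current address."""
    last_error = None
    for ip in resolve_current_addresses(host, port):
        try:
            return connect((ip, port), timeout=5)
        except OSError as exc:
            last_error = exc
    raise ConnectionError(f"no address for {host} accepted a connection") from last_error
```

This mirrors the approach later taken in KAFKA-5473, which recreates the ZooKeeper client (and therefore forces a fresh resolution) when the connection fails.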





[jira] [Resolved] (KAFKA-4041) kafka unable to reconnect to zookeeper behind an ELB

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4041.

Resolution: Fixed

Marking as a duplicate of KAFKA-5473. In that JIRA, we will recreate the 
`ZooKeeper` instance if there's an issue connecting, which should fix this 
issue too.

> kafka unable to reconnect to zookeeper behind an ELB
> 
>
> Key: KAFKA-4041
> URL: https://issues.apache.org/jira/browse/KAFKA-4041
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1, 0.10.0.1
> Environment: RHEL EC2 instances
>Reporter: prabhakar
>Priority: Blocker
>
> Kafka brokers are unable to connect to ZooKeeper when it is behind an ELB.
> Kafka uses zkClient, which caches the IP address of ZooKeeper, and even 
> when the ZooKeeper IP changes it keeps using the old IP.
> The server.properties has a DNS name. Ideally Kafka should re-resolve the 
> IP via DNS in case of any failure connecting to ZooKeeper.





[jira] [Reopened] (KAFKA-4041) kafka unable to reconnect to zookeeper behind an ELB

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-4041:


> kafka unable to reconnect to zookeeper behind an ELB
> 
>
> Key: KAFKA-4041
> URL: https://issues.apache.org/jira/browse/KAFKA-4041
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1, 0.10.0.1
> Environment: RHEL EC2 instances
>Reporter: prabhakar
>Priority: Blocker
>
> Kafka brokers are unable to connect to ZooKeeper when it is behind an ELB.
> Kafka uses zkClient, which caches the IP address of ZooKeeper, and even 
> when the ZooKeeper IP changes it keeps using the old IP.
> The server.properties has a DNS name. Ideally Kafka should re-resolve the 
> IP via DNS in case of any failure connecting to ZooKeeper.





[jira] [Resolved] (KAFKA-4041) kafka unable to reconnect to zookeeper behind an ELB

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4041.

Resolution: Duplicate

> kafka unable to reconnect to zookeeper behind an ELB
> 
>
> Key: KAFKA-4041
> URL: https://issues.apache.org/jira/browse/KAFKA-4041
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1, 0.10.0.1
> Environment: RHEL EC2 instances
>Reporter: prabhakar
>Priority: Blocker
>
> Kafka brokers are unable to connect to ZooKeeper when it is behind an ELB.
> Kafka uses zkClient, which caches the IP address of ZooKeeper, and even 
> when the ZooKeeper IP changes it keeps using the old IP.
> The server.properties has a DNS name. Ideally Kafka should re-resolve the 
> IP via DNS in case of any failure connecting to ZooKeeper.





[jira] [Resolved] (KAFKA-2193) Intermittent network + DNS issues can cause brokers to permanently drop out of a cluster

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2193.

Resolution: Duplicate

Duplicate of KAFKA-5473.

> Intermittent network + DNS issues can cause brokers to permanently drop out 
> of a cluster
> 
>
> Key: KAFKA-2193
> URL: https://issues.apache.org/jira/browse/KAFKA-2193
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Tom Lee
>  Labels: broker
>
> Our Kafka cluster recently experienced some intermittent network & DNS 
> resolution issues such that this call to connect to Zookeeper failed with an 
> UnknownHostException:
> https://github.com/sgroschupf/zkclient/blob/0630c9c6e67ab49a51e80bfd939e4a0d01a69dfe/src/main/java/org/I0Itec/zkclient/ZkConnection.java#L67
> We observed this happen during a processStateChanged(KeeperState.Expired) 
> call:
> https://github.com/sgroschupf/zkclient/blob/0630c9c6e67ab49a51e80bfd939e4a0d01a69dfe/src/main/java/org/I0Itec/zkclient/ZkClient.java#L649
> the session expiry was in turn caused by what we suspect to be intermittent 
> network issues.
> The failed ZK reconnect seemed to put ZkClient into a state where it would 
> never recover and the Kafka broker into a state where it would need a restart 
> to rejoin the cluster: ZkConnection._zk == null, 0.3.x doesn't appear to 
> automatically try to make further attempts to reconnect after the failure, 
> and obviously no further state transitions seem likely to happen without a 
> connection to ZK.
> The newer zkclient 0.4.0/0.5.0 releases will helpfully fire a notification 
> when this occurs, so the brokers have an opportunity to handle this sort of 
> failure in a more graceful manner (e.g. by trying to reconnect after some 
> backoff period):
> https://github.com/sgroschupf/zkclient/blob/0.4.0/src/main/java/org/I0Itec/zkclient/ZkClient.java#L461
> Happy to provide more info here if I can.
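The graceful handling the reporter suggests (retry after some backoff period instead of giving up permanently) can be sketched as a small retry loop with exponential backoff. The function name and limits below are illustrative, not from zkclient:

```python
import time

def reconnect_with_backoff(connect, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Retry `connect` with exponential backoff instead of failing
    permanently after the first error, as zkclient 0.3.x effectively did."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # exhausted our budget; surface the last error
            # Delays grow 0.1s, 0.2s, 0.4s, ... so transient network and
            # DNS blips are ridden out without hammering the server.
            sleep(base_delay * (2 ** attempt))
```

With zkclient 0.4.0+ firing a notification on reconnect failure, a broker could run a loop like this from that callback rather than requiring a restart.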





[jira] [Resolved] (KAFKA-2864) Bad zookeeper host causes broker to shutdown uncleanly and stall producers

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2864.

Resolution: Duplicate

Duplicate of KAFKA-5473.

> Bad zookeeper host causes broker to shutdown uncleanly and stall producers
> --
>
> Key: KAFKA-2864
> URL: https://issues.apache.org/jira/browse/KAFKA-2864
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 0.8.2.1
>Reporter: Mahdi
>Priority: Critical
> Attachments: kafka.log
>
>
> We are using Kafka 0.8.2.1 and noticed that Kafka/zkclient was not able to 
> gracefully handle a non-existent ZooKeeper instance. This caused one of our 
> brokers to get stuck during a self-inflicted shutdown, and that seemed to 
> impact the partitions for which the broker was the leader even though we 
> had two other replicas.
> Here is a timeline of what happened (shortened for brevity, I'll attach log 
> snippets):
> We have a 7 node zookeeper cluster. Two of our nodes were decommissioned and 
> their dns records removed (zookeeper15 and zookeeper16). The decommissioning 
> happened about two weeks earlier. We noticed the following in the logs
> - Opening socket connection to server ip-10-0-0-1.ec2.internal/10.0.0.1:2181. 
> Will not attempt to authenticate using SASL (unknown error)
> - Client session timed out, have not heard from server in 858ms for sessionid 
> 0x1250c5c0f1f5001c, closing socket connection and attempting reconnect
> - Opening socket connection to server ip-10.0.0.2.ec2.internal/10.0.0.2:2181. 
> Will not attempt to authenticate using SASL (unknown error)
> - zookeeper state changed (Disconnected)
> - Client session timed out, have not heard from server in 2677ms for 
> sessionid 0x1250c5c0f1f5001c, closing socket connection and attempting 
> reconnect
> - Opening socket connection to server ip-10.0.0.3.ec2.internal/10.0.0.3:2181. 
> Will not attempt to authenticate using SASL (unknown error)
> - Socket connection established to ip-10.0.0.3.ec2.internal/10.0.0.3:2181, 
> initiating session
> - zookeeper state changed (Expired)
> - Initiating client connection, 
> connectString=zookeeper21.example.com:2181,zookeeper19.example.com:2181,zookeeper22.example.com:2181,zookeeper18.example.com:2181,zookeeper20.example.com:2181,zookeeper16.example.com:2181,zookeeper15.example.com:2181/foo/kafka/central
>  sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@3bbc39f8
> - Unable to reconnect to ZooKeeper service, session 0x1250c5c0f1f5001c has 
> expired, closing socket connection
> - Unable to re-establish connection. Notifying consumer of the following 
> exception:
> org.I0Itec.zkclient.exception.ZkException: Unable to connect to 
> zookeeper21.example.com:2181,zookeeper19.example.com:2181,zookeeper22.example.com:2181,zookeeper18.example.com:2181,zookeeper20.example.com:2181,zookeeper16.example.com:2181,zookeeper15.example.com:2181/foo/kafka/central
> at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:69)
> at org.I0Itec.zkclient.ZkClient.reconnect(ZkClient.java:1176)
> at org.I0Itec.zkclient.ZkClient.processStateChanged(ZkClient.java:649)
> at org.I0Itec.zkclient.ZkClient.process(ZkClient.java:560)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: java.net.UnknownHostException: zookeeper16.example.com: unknown 
> error
> at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
> at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
> at 
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
> at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
> at java.net.InetAddress.getAllByName(InetAddress.java:1192)
> at java.net.InetAddress.getAllByName(InetAddress.java:1126)
> at 
> org.apache.zookeeper.client.StaticHostProvider.(StaticHostProvider.java:61)
> at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:445)
> at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:380)
> at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:67)
> ... 5 more
> That seems to have caused the following:
>  [main-EventThread] [org.apache.zookeeper.ClientCnxn ]: EventThread shut 
> down
> Which in turn caused kafka to shut itself down
> [Thread-2] [kafka.server.KafkaServer]: [Kafka Server 13], 
> shutting down
> [Thread-2] [kafka.server.KafkaServer]: [Kafka Server 13], 
> Starting controlled shutdown
> However, the shutdown didn't go as expected, apparently due to an NPE in 
> the zkclient:
> 2015-11-12T12:03:40.101Z WARN  [Thread-2 

[jira] [Resolved] (KAFKA-6247) Fix system test dependency issues

2017-11-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6247.

   Resolution: Fixed
 Assignee: Colin P. McCabe
Fix Version/s: 1.1.0

> Fix system test dependency issues
> -
>
> Key: KAFKA-6247
> URL: https://issues.apache.org/jira/browse/KAFKA-6247
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 1.1.0
>
>
> Kibosh needs to be installed on Vagrant instances as well as in Docker 
> environments.  And we need to download old Apache Kafka releases from a 
> stable mirror that will not purge old releases.





[jira] [Resolved] (KAFKA-6223) Please delete old releases from mirroring system

2017-11-19 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6223.

Resolution: Fixed

> Please delete old releases from mirroring system
> 
>
> Key: KAFKA-6223
> URL: https://issues.apache.org/jira/browse/KAFKA-6223
> Project: Kafka
>  Issue Type: Bug
> Environment: https://dist.apache.org/repos/dist/release/kafka/
>Reporter: Sebb
>Assignee: Rajini Sivaram
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> It's unfair to expect the 3rd party mirrors to carry old releases.
> Note that older releases can still be linked from the download page, but such 
> links should use the archive server at:
> https://archive.apache.org/dist/kafka/
> A suggested process is:
> + Change the download page to use archive.a.o for old releases
> + Delete the corresponding directories from 
> {{https://dist.apache.org/repos/dist/release/kafka/}}
> e.g. {{svn delete https://dist.apache.org/repos/dist/release/kafka/0.8.0}}
> Thanks!
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Resolved] (KAFKA-6103) one broker appears to deadlock after running several hours with a fresh cluster

2017-11-17 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6103.

Resolution: Duplicate

> one broker appears to deadlock after running several hours with a fresh cluster
> --
>
> Key: KAFKA-6103
> URL: https://issues.apache.org/jira/browse/KAFKA-6103
> Project: Kafka
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 0.10.1.0
> Environment: brokers: 3
> Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-96-generic x86_64)
>  cpu: 8 core mem: 16G
>Reporter: Peyton Peng
>
> Today we recreated a fresh Kafka cluster with three brokers; at the 
> beginning everything ran well without exception. The main configuration is 
> listed below:
> num.io.threads=16
> num.network.threads=3
> offsets.commit.timeout.ms=1
> #offsets.topic.num.partitions=60
> default.replication.factor=3
> offsets.topic.replication.factor=3
> num.replica.fetchers=4
> replica.fetch.wait.max.ms=1000
> replica.lag.time.max.ms=2
> replica.socket.receive.buffer.bytes=1048576
> replica.socket.timeout.ms=6
> socket.receive.buffer.bytes=1048576
> socket.send.buffer.bytes=1048576
> log.dirs=/data/kafka-logs
> num.partitions=12
> log.retention.hours=48
> log.roll.hours=48
> zookeeper.connect=
> listeners=PLAINTEXT://:9092
> advertised.listeners=PLAINTEXT://*:9092
> broker.id=3
> After several hours we got the following exception in a loop from the 
> consumer layer:
> "Marking the coordinator ### dead".
> We checked and found one broker running with no IO (CPU and memory were 
> fine), while the other two brokers threw this exception:
> java.io.IOException: Connection to 3 was disconnected before the response was 
> read
>   at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:115)
>   at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:112)
>   at scala.Option.foreach(Option.scala:257)
>   at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:112)
>   at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:108)
>   at 
> kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:137)
>   at 
> kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
>   at 
> kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
>   at 
> kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:253)
>   at 
> kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
>   at 
> kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
>   at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
>   at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> finally we found the jvm stacks show dead lock:
> Found one Java-level deadlock:
> =
> "executor-Heartbeat":
>   waiting to lock monitor 0x7f747c038db8 (object 0x0005886a5ae8, a 
> kafka.coordinator.GroupMetadata),
>   which is held by "group-metadata-manager-0"
> "group-metadata-manager-0":
>   waiting to lock monitor 0x7f757c0085e8 (object 0x0005a71fccb0, a 
> java.util.LinkedList),
>   which is held by "kafka-request-handler-7"
> "kafka-request-handler-7":
>   waiting to lock monitor 0x7f747c038db8 (object 0x0005886a5ae8, a 
> kafka.coordinator.GroupMetadata),
>   which is held by "group-metadata-manager-0"
> Java stack information for the threads listed above:
> ===
> "executor-Heartbeat":
> at 
> kafka.coordinator.GroupCoordinator.onExpireHeartbeat(GroupCoordinator.scala:739)
> - waiting to lock <0x0005886a5ae8> (a 
> kafka.coordinator.GroupMetadata)
> at 
> kafka.coordinator.DelayedHeartbeat.onExpiration(DelayedHeartbeat.scala:33)
> at kafka.server.DelayedOperation.run(DelayedOperation.scala:107)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> "group-meta
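The cycle in the stack dump above (the heartbeat executor and the request handler each holding one of `GroupMetadata` and the request queue lock while waiting for the other) is the classic lock-ordering deadlock. It cannot occur if every thread acquires the shared locks in one agreed-upon global order. A minimal Python illustration of that discipline; the lock names mirror the report but the code is a sketch, not Kafka's fix:

```python
import threading

group_metadata_lock = threading.Lock()
request_queue_lock = threading.Lock()

# A single global order: group_metadata_lock first, then request_queue_lock.
# No thread may ever take them in the reverse order, so no cycle can form.
LOCK_ORDER = (group_metadata_lock, request_queue_lock)

def with_both_locks(action):
    """Acquire all shared locks in the global order, run action, release."""
    for lock in LOCK_ORDER:
        lock.acquire()
    try:
        return action()
    finally:
        for lock in reversed(LOCK_ORDER):
            lock.release()

def run_concurrently(n_threads=4, iterations=100):
    """Hammer the locks from several threads; with consistent ordering,
    all threads finish and every increment is observed."""
    counter = [0]
    def work():
        for _ in range(iterations):
            with_both_locks(lambda: counter.__setitem__(0, counter[0] + 1))
    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    return counter[0]
```

If one of the threads instead acquired `request_queue_lock` before `group_metadata_lock`, the three-way wait shown in the jstack output becomes possible.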

[jira] [Resolved] (KAFKA-6046) DeleteRecordsRequest to a non-leader

2017-11-17 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6046.

Resolution: Fixed

> DeleteRecordsRequest to a non-leader
> 
>
> Key: KAFKA-6046
> URL: https://issues.apache.org/jira/browse/KAFKA-6046
> Project: Kafka
>  Issue Type: Bug
>Reporter: Tom Bentley
>Assignee: Ted Yu
> Fix For: 1.1.0
>
>
> When a `DeleteRecordsRequest` is sent to a broker that's not the leader for 
> the partition the  `DeleteRecordsResponse` returns 
> `UNKNOWN_TOPIC_OR_PARTITION`. This is ambiguous (does the topic not exist on 
> any broker, or did we just sent the request to the wrong broker?), and 
> inconsistent (a `ProduceRequest` would return `NOT_LEADER_FOR_PARTITION`).





[jira] [Resolved] (KAFKA-679) Phabricator for code review

2017-11-17 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-679.
---
Resolution: Won't Do

Yes, this can be safely closed.

> Phabricator for code review
> ---
>
> Key: KAFKA-679
> URL: https://issues.apache.org/jira/browse/KAFKA-679
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Neha Narkhede
>Assignee: Sriram Subramanian
>
> Sriram proposed adding phabricator support for code reviews. 
> From http://phabricator.org/ : "Phabricator is a open source collection of 
> web applications which make it easier to write, review, and share source 
> code. It is currently available as an early release. Phabricator was 
> developed at Facebook."
> It's open source so pretty much anyone could host an instance of this 
> software.
> To begin with, there will be a public-facing instance located at 
> http://reviews.facebook.net (sponsored by Facebook and hosted by the OSUOSL 
> http://osuosl.org).
> We can use this JIRA to deal with adding (and ensuring) Apache-friendly 
> support that will allow us to do code reviews with Phabricator for Kafka.





[jira] [Resolved] (KAFKA-1408) Kafka broker cannot stop itself normally after problems with connection to ZK

2017-11-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-1408.

Resolution: Duplicate

Closing as duplicate of KAFKA-1317 based on other comments and the fact that 
there has been no activity on this issue for years.

> Kafka broker cannot stop itself normally after problems with connection to ZK
> 
>
> Key: KAFKA-1408
> URL: https://issues.apache.org/jira/browse/KAFKA-1408
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.1
>Reporter: Dmitry Bugaychenko
>Assignee: Neha Narkhede
>
> After getting into an inconsistent state due to a short network failure, 
> the broker cannot stop itself. The last message in the log is:
> {code}
> INFO   | jvm 1| 2014/04/21 08:53:07 | [2014-04-21 09:53:06,999] INFO 
> [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
> INFO   | jvm 1| 2014/04/21 08:53:07 | [2014-04-21 09:53:06,999] INFO 
> [kafka-log-cleaner-thread-0], Shutdown completed (kafka.log.LogCleaner)
> {code}
> There is also a preceding error:
> {code}
> INFO   | jvm 1| 2014/04/21 08:52:55 | [2014-04-21 09:52:55,015] WARN 
> Controller doesn't exist (kafka.utils.Utils$)
> INFO   | jvm 1| 2014/04/21 08:52:55 | kafka.common.KafkaException: 
> Controller doesn't exist
> INFO   | jvm 1| 2014/04/21 08:52:55 |   at 
> kafka.utils.ZkUtils$.getController(ZkUtils.scala:70)
> INFO   | jvm 1| 2014/04/21 08:52:55 |   at 
> kafka.server.KafkaServer.kafka$server$KafkaServer$$controlledShutdown(KafkaServer.scala:148)
> INFO   | jvm 1| 2014/04/21 08:52:55 |   at 
> kafka.server.KafkaServer$$anonfun$shutdown$1.apply$mcV$sp(KafkaServer.scala:220)
> {code}
> Here is a part of jstack (it looks like there is a deadlock between 
> delete-topics-thread  and ZkClient-EventThread):
> {code}
> IWrapper-Connection id=10 state=WAITING
> - waiting on <0x15d6aa44> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> - locked <0x15d6aa44> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>  owned by ZkClient-EventThread-37-devlnx2:2181 id=37
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
> at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> at kafka.utils.Utils$.inLock(Utils.scala:536)
> at kafka.controller.KafkaController.shutdown(KafkaController.scala:641)
> at 
> kafka.server.KafkaServer$$anonfun$shutdown$8.apply$mcV$sp(KafkaServer.scala:233)
> at kafka.utils.Utils$.swallow(Utils.scala:167)
> at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
> at kafka.utils.Utils$.swallowWarn(Utils.scala:46)
> at kafka.utils.Logging$class.swallow(Logging.scala:94)
> at kafka.utils.Utils$.swallow(Utils.scala:46)
> at kafka.server.KafkaServer.shutdown(KafkaServer.scala:233)
> at odkl.databus.server.Main.stop(Main.java:184)
> at 
> org.tanukisoftware.wrapper.WrapperManager.stopInner(WrapperManager.java:1982)
> at 
> org.tanukisoftware.wrapper.WrapperManager.handleSocket(WrapperManager.java:2391)
> at org.tanukisoftware.wrapper.WrapperManager.run(WrapperManager.java:2696)
> at java.lang.Thread.run(Thread.java:744)
> ZkClient-EventThread-37-devlnx2:2181 id=37 state=WAITING
> - waiting on <0x3d5f9878> (a java.util.concurrent.CountDownLatch$Sync)
> - locked <0x3d5f9878> (a java.util.concurrent.CountDownLatch$Sync)
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
> at kafka.utils.ShutdownableThread.shutdown(ShutdownableThread.scala:36)
> at 
> kafka.controller.TopicDeletionManager.shutdown(TopicDeletionManager.scala:93)
> at 
> kafka.controller.KafkaController$$anonfun$onControllerResignation$1.apply$mcV$sp(KafkaController

[jira] [Reopened] (KAFKA-1993) Enable topic deletion as default

2017-11-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-1993:


> Enable topic deletion as default
> 
>
> Key: KAFKA-1993
> URL: https://issues.apache.org/jira/browse/KAFKA-1993
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Gwen Shapira
>Assignee: Gwen Shapira
> Attachments: KAFKA-1993.patch
>
>
> Since topic deletion is now thoroughly tested and works as well as most 
> Kafka features, we should enable it by default.





[jira] [Resolved] (KAFKA-1993) Enable topic deletion as default

2017-11-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-1993.

Resolution: Duplicate

> Enable topic deletion as default
> 
>
> Key: KAFKA-1993
> URL: https://issues.apache.org/jira/browse/KAFKA-1993
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Gwen Shapira
>Assignee: Gwen Shapira
> Attachments: KAFKA-1993.patch
>
>
> Since topic deletion is now thoroughly tested and works as well as most 
> Kafka features, we should enable it by default.





[jira] [Resolved] (KAFKA-4675) Subsequent CreateTopic command could be lost after a DeleteTopic command

2017-11-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4675.

Resolution: Duplicate

KAFKA-6098 is the same issue and has more information.

> Subsequent CreateTopic command could be lost after a DeleteTopic command
> 
>
> Key: KAFKA-4675
> URL: https://issues.apache.org/jira/browse/KAFKA-4675
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>  Labels: admin
>
> This was discovered while investigating KAFKA-3896: if an admin client sends a 
> delete topic command and a create topic command consecutively, even if it 
> waits for the response of the previous command before issuing the second, 
> there is still a race condition where the create topic command can be "lost".
> This is because currently these commands are all asynchronous as defined in 
> KIP-4, and controller will return the response once it has written the 
> corresponding data to ZK path, which can be handled by different listener 
> threads at different paces, and if the thread handling create is faster than 
> the other, the executions could be effectively re-ordered.
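Until the controller serializes these operations, a client-side workaround is to poll until the deletion has actually taken effect before issuing the create, rather than trusting the asynchronous delete response. A hedged sketch of that pattern; `delete`, `create`, and `topic_exists` are hypothetical callables standing in for whatever admin client is in use:

```python
import time

def delete_then_create(delete, create, topic_exists, topic,
                       timeout=30.0, poll_interval=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Issue the delete, then wait until the topic is truly gone before
    creating it again, closing the re-ordering window described above."""
    delete(topic)
    deadline = clock() + timeout
    while topic_exists(topic):
        if clock() >= deadline:
            raise TimeoutError(f"topic {topic!r} still exists after {timeout}s")
        sleep(poll_interval)
    create(topic)
```

The key point is that the "deleted" signal comes from observing cluster state, not from the command's response, so the create can no longer be re-ordered ahead of the in-flight delete.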





[jira] [Resolved] (KAFKA-6210) IllegalArgumentException if 1.0.0 is used for inter.broker.protocol.version or log.message.format.version

2017-11-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6210.

Resolution: Fixed

> IllegalArgumentException if 1.0.0 is used for inter.broker.protocol.version 
> or log.message.format.version
> -
>
> Key: KAFKA-6210
> URL: https://issues.apache.org/jira/browse/KAFKA-6210
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 1.1.0, 1.0.1
>
>
> The workaround is trivial (use 1.0), but this is sure to trip many users. 
> Also, if you have automatic restarts on crashes, it may not be obvious what's 
> going on. Reported by Brett Rahn.





[jira] [Created] (KAFKA-6210) IllegalArgumentException if 1.0.0 is used for inter.broker.protocol.version or log.message.format.version

2017-11-15 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6210:
--

 Summary: IllegalArgumentException if 1.0.0 is used for 
inter.broker.protocol.version or log.message.format.version
 Key: KAFKA-6210
 URL: https://issues.apache.org/jira/browse/KAFKA-6210
 Project: Kafka
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Ismael Juma
Assignee: Ismael Juma
Priority: Critical
 Fix For: 1.1.0, 1.0.1


The workaround is trivial (use 1.0), but this is sure to trip many users. Also, 
if you have automatic restarts on crashes, it may not be obvious what's going 
on. Reported by Brett Rahn.
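Per the workaround noted above (use the two-segment form of the version), an affected 1.0.0 broker's configuration would look like this:

```properties
# server.properties on an affected 1.0.0 broker:
# "1.0.0" triggers the IllegalArgumentException at startup; "1.0" is accepted.
inter.broker.protocol.version=1.0
log.message.format.version=1.0
```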





[jira] [Resolved] (KAFKA-6185) Selector memory leak with high likelihood of OOM in case of down conversion

2017-11-09 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6185.

Resolution: Fixed

> Selector memory leak with high likelihood of OOM in case of down conversion
> ---
>
> Key: KAFKA-6185
> URL: https://issues.apache.org/jira/browse/KAFKA-6185
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Ubuntu 14.04.5 LTS
> 5 brokers: 1&2 on 1.0.0 3,4,5 on 0.11.0.1
> inter.broker.protocol.version=0.11.0.1
> log.message.format.version=0.11.0.1
> clients a mix of 0.9, 0.10, 0.11
>Reporter: Brett Rann
>Assignee: Rajini Sivaram
>Priority: Blocker
>  Labels: regression
> Fix For: 1.1.0, 1.0.1
>
> Attachments: Kafka_Internals___Datadog.png, 
> Kafka_Internals___Datadog.png
>
>
> We are testing 1.0.0 in a couple of environments.
> Both have about 5 brokers, with two 1.0.0 brokers and the rest 0.11.0.1 
> brokers.
> One is using on disk message format 0.9.0.1, the other 0.11.0.1
> we have 0.9, 0.10, and 0.11 clients connecting.
> The cluster on the 0.9.0.1 format is running fine for a week.
> But the cluster on the 0.11.0.1 format is consistently having memory issues, 
> only on the two upgraded brokers running 1.0.0.
> The first occurrence of the error comes along with this stack trace
> {noformat}
> {"timestamp":"2017-11-06 
> 14:22:32,402","level":"ERROR","logger":"kafka.server.KafkaApis","thread":"kafka-request-handler-7","message":"[KafkaApi-1]
>  Error when handling request 
> {replica_id=-1,max_wait_time=500,min_bytes=1,topics=[{topic=maxwell.users,partitions=[{partition=0,fetch_offset=227537,max_bytes=1100},{partition=4,fetch_offset=354468,max_bytes=1100},{partition=5,fetch_offset=266524,max_bytes=1100},{partition=8,fetch_offset=324562,max_bytes=1100},{partition=10,fetch_offset=292931,max_bytes=1100},{partition=12,fetch_offset=325718,max_bytes=1100},{partition=15,fetch_offset=229036,max_bytes=1100}]}]}"}
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.kafka.common.record.AbstractRecords.downConvert(AbstractRecords.java:101)
> at 
> org.apache.kafka.common.record.FileRecords.downConvert(FileRecords.java:253)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1$$anonfun$apply$4.apply(KafkaApis.scala:520)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1$$anonfun$apply$4.apply(KafkaApis.scala:518)
> at scala.Option.map(Option.scala:146)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1.apply(KafkaApis.scala:518)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1.apply(KafkaApis.scala:508)
> at scala.Option.flatMap(Option.scala:171)
> at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$convertedPartitionData$1(KafkaApis.scala:508)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$createResponse$2$1.apply(KafkaApis.scala:556)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$createResponse$2$1.apply(KafkaApis.scala:555)
> at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$createResponse$2(KafkaApis.scala:555)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$fetchResponseCallback$1$1.apply(KafkaApis.scala:569)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$fetchResponseCallback$1$1.apply(KafkaApis.scala:569)
> at 
> kafka.server.KafkaApis$$anonfun$sendResponseMaybeThrottle$1.apply$mcVI$sp(KafkaApis.scala:2034)
> at 
> kafka.server.ClientRequestQuotaManager.maybeRecordAndThrottle(ClientRequestQuotaManager.scala:52)
> at 
> kafka.server.KafkaApis.sendResponseMaybeThrottle(KafkaApis.scala:2033)
> at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$fetchResponseCallback$1(KafkaApis.scala:569)
> at 
> kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$processResponseCallback$1$1.apply$mcVI$sp(KafkaApis.scala:588)
> at 
> kafka.server.ClientQuotaManager.maybeRecordAndThrottle(ClientQuotaManager.scala:175)
> at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$processResponseCallback$1(KafkaApis.scala:587)
> at 

[jira] [Resolved] (KAFKA-6146) minimize the number of triggers enqueuing PreferredReplicaLeaderElection events

2017-11-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6146.

Resolution: Fixed

> minimize the number of triggers enqueuing PreferredReplicaLeaderElection 
> events
> ---
>
> Key: KAFKA-6146
> URL: https://issues.apache.org/jira/browse/KAFKA-6146
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Assignee: Onur Karaman
> Fix For: 1.1.0
>
>
> We currently enqueue a PreferredReplicaLeaderElection controller event in 
> PreferredReplicaElectionHandler's handleCreation, handleDeletion, and 
> handleDataChange. We can just enqueue the event upon znode creation and after 
> preferred replica leader election completes. The processing of this latter 
> enqueue will register the exist watch on PreferredReplicaElectionZNode and 
> perform any pending preferred replica leader election that may have occurred 
> between completion and registration.
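The coalescing behavior described above (many triggers, at most one queued election event at a time) can be sketched in plain Java. This is an illustrative model only, not Kafka's actual controller code; the class and event names are made up for the sketch:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch, not Kafka's controller: enqueue at most one
// PreferredReplicaLeaderElection event at a time, re-arming only after
// the previous election has been processed.
class CoalescingEventQueue {
    private final Queue<String> queue = new ArrayDeque<>();
    private boolean electionPending = false;

    // Called from the various znode handlers; duplicate triggers are dropped.
    synchronized void triggerElection() {
        if (!electionPending) {
            queue.add("PreferredReplicaLeaderElection");
            electionPending = true;
        }
    }

    // Called by the processing thread; completing an election re-arms the flag.
    synchronized String processNext() {
        String event = queue.poll();
        if (event != null) {
            electionPending = false;
        }
        return event;
    }
}
```

Repeated triggers before processing collapse into a single queued event, which mirrors the "minimize the number of triggers" goal of the ticket.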



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-1506) Cancel "kafka-reassign-partitions" Job

2017-11-01 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-1506.

Resolution: Duplicate

Marking as a duplicate of KAFKA-1676 as the latter contains more information.

> Cancel "kafka-reassign-partitions" Job
> --
>
> Key: KAFKA-1506
> URL: https://issues.apache.org/jira/browse/KAFKA-1506
> Project: Kafka
>  Issue Type: New Feature
>  Components: replication, tools
>Affects Versions: 0.8.1, 0.8.1.1
>Reporter: Paul Lung
>Priority: Major
>  Labels: newbie++
>
> I started a reassignment, and for some reason it just takes forever. However, 
> it won't let me start another reassignment job while this one is running. So 
> a tool to cancel a reassignment job is needed. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (KAFKA-3917) Some __consumer_offsets replicas grow way too big

2017-10-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-3917:


If you're running 0.10.2.0, then potentially KAFKA-5413, but I'll reopen until 
we can verify.

> Some __consumer_offsets replicas grow way too big
> -
>
> Key: KAFKA-3917
> URL: https://issues.apache.org/jira/browse/KAFKA-3917
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0
> Environment: Runs with Docker 1.10.1 in a container on 
> Linux 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64
>Reporter: Maxim Vladimirskiy
>  Labels: reliability
>
> We noticed that some replicas of partitions of the __consumer_offsets topic 
> grow way too big. Looking inside respective folders it became apparent that 
> old segments had not been cleaned up. Please see below example of disk usage 
> data for both affected and not affected partitions:
> Not affected partitions:
> Partition: 0  Leader: 2   Replicas: 2,3,4 Isr: 2,4,3
> 2: 49M
> 3: 49M
> 4: 49M
> Affected partitions:
> Partition: 10 Leader: 2   Replicas: 2,0,1 Isr: 1,2,0
> 0: 86M
> 1: 22G <<< too big!
> 2: 86M
> Partition: 38 Leader: 0   Replicas: 0,4,1 Isr: 1,0,4
> 0: 43M
> 1: 26G <<<  too big!
> 4: 26G <<<  too big!
> As you can see sometimes only one replica is affected, sometimes both 
> replicas are affected.
> When I try to restart a broker that has affected replicas it fails to start 
> with an exception that looks like this:
> [2016-06-28 23:15:20,441] ERROR There was an error in one of the threads 
> during logs loading: java.lang.IllegalArgumentException: requirement failed: 
> Corrupt index found, index file 
> (/var/kafka/__consumer_offsets-38/.index) has non-zero 
> size but the last offset is -676703869 and the base offset is 0 
> (kafka.log.LogManager)
> [2016-06-28 23:15:20,442] FATAL [Kafka Server 1], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file (/var/kafka/__consumer_offsets-38/.index) has 
> non-zero size but the last offset is -676703869 and the base offset is 0
> at scala.Predef$.require(Predef.scala:233)
> at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
> at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:184)
> at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:183)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.log.Log.loadSegments(Log.scala:183)
> at kafka.log.Log.<init>(Log.scala:67)
> at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$7$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:142)
> at kafka.utils.Utils$$anon$1.run(Utils.scala:54)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> After the content of the affected partition is deleted broker starts 
> successfully. 
>  
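The startup failure quoted above comes from an index sanity check. In simplified form (this is a hedged sketch, not Kafka's exact `OffsetIndex.sanityCheck` code), the invariant being enforced is:

```java
// Simplified sketch of the invariant behind "Corrupt index found": a
// non-empty offset index must record a last offset greater than its base
// offset. The negative last offset in the report (-676703869) violates this,
// so the broker refuses to load the log segment.
class IndexSanity {
    static void check(long fileSizeBytes, long lastOffset, long baseOffset) {
        if (fileSizeBytes > 0 && lastOffset <= baseOffset) {
            throw new IllegalArgumentException(
                "Corrupt index found: non-zero size but the last offset is "
                + lastOffset + " and the base offset is " + baseOffset);
        }
    }
}
```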



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-2616) Improve Kafka client exceptions

2017-10-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2616.

Resolution: Auto Closed

BlockingChannel is only used by the Scala clients, which are no longer 
supported. Please upgrade to the Java clients whenever possible.


> Improve Kafka client exceptions
> ---
>
> Key: KAFKA-2616
> URL: https://issues.apache.org/jira/browse/KAFKA-2616
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.8.2.1
>Reporter: Hurshal Patel
>Priority: Minor
>
> Any sort of network failure results in a {{java.nio.ClosedChannelException}} 
> which is bubbled up from {{kafka.network.BlockingChannel}}. 
> Displaying such an exception to a user with little knowledge about Kafka can 
> be more confusing than informative. A better user experience for the Kafka 
> consumer would be to throw a more appropriately named exception when a 
> {{ClosedChannelException}} is encountered.
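The improvement being asked for is a classic exception-translation pattern. A minimal plain-Java sketch follows; the exception name is hypothetical, not a real Kafka class:

```java
import java.nio.channels.ClosedChannelException;

// Hypothetical exception name for illustration; not part of Kafka's API.
class BrokerConnectionLostException extends RuntimeException {
    BrokerConnectionLostException(String broker, Throwable cause) {
        super("Lost connection to broker " + broker
                + "; check that the broker is running and reachable", cause);
    }
}

class ExceptionTranslator {
    // Wrap the low-level NIO failure in an exception whose name and message
    // tell the user what actually went wrong, keeping the original as cause.
    static RuntimeException translate(String broker, Exception e) {
        if (e instanceof ClosedChannelException) {
            return new BrokerConnectionLostException(broker, e);
        }
        return new RuntimeException(e);
    }
}
```

The original exception is preserved as the cause, so nothing is lost for users who do want the low-level detail.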



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-2636) Producer connectivity obscured connection failure logging

2017-10-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2636.

Resolution: Not A Problem

The producer should throw a TimeoutException to the callback or if `Future.get` 
is called.

> Producer connectivity obscured connection failure logging
> -
>
> Key: KAFKA-2636
> URL: https://issues.apache.org/jira/browse/KAFKA-2636
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1, 0.8.2.2
> Environment: Windows 8 running java implementation of Kafka Producer
>Reporter: Jason Kania
>
> The Kafka Producer does not generate a visible exception when a connection 
> cannot be made. Instead DEBUG settings are required to observe the problem as 
> shown below:
> [2015-10-12 21:23:20,335] DEBUG Error connecting to node 0 at 
> 482f4769eed1:9092: (org.apache.kafka.clients.NetworkClient)
> java.io.IOException: Can't resolve address: 482f4769eed1:9092
>   at org.apache.kafka.common.network.Selector.connect(Selector.java:138)
>   at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:417)
>   at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:116)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:165)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>   at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.channels.UnresolvedAddressException
>   at sun.nio.ch.Net.checkAddress(Unknown Source)
>   at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
>   at org.apache.kafka.common.network.Selector.connect(Selector.java:135)
>   ... 5 more
> [2015-10-12 21:23:20,358] DEBUG Initiating connection to node 0 at 
> 482f4769eed1:9092. (org.apache.kafka.clients.NetworkClient)
> Secondly, the errors do not identify the node by IP address, making error 
> investigation more difficult, especially when learning to use Kafka.
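As the resolution notes, the producer reports such failures through the returned Future (or the send callback) rather than as a loud log line. Below is a pure-Java sketch of how the caller sees the failure; a CompletableFuture stands in for the producer's Future<RecordMetadata>, so this is not the Kafka API itself:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

// Pure-Java sketch: the sender thread records the failure on the Future;
// the caller only sees it when inspecting the result.
class SendResultInspector {
    static String await(CompletableFuture<String> sendFuture) {
        try {
            return sendFuture.get();
        } catch (ExecutionException e) {
            // getCause() carries the real failure, e.g. a TimeoutException
            // when metadata for an unreachable broker never arrives.
            return "failed: " + e.getCause().getClass().getSimpleName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }
}
```

Applications that never call `get()` and pass no callback will indeed only see the failure at DEBUG level, which is the behavior the reporter observed.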



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-2662) Make ConsumerIterator thread-safe for multiple threads in different Kafka groups

2017-10-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2662.

Resolution: Auto Closed

The Scala consumers are no longer supported. Please upgrade to the Java 
consumer whenever possible.

> Make ConsumerIterator thread-safe for multiple threads in different Kafka 
> groups
> 
>
> Key: KAFKA-2662
> URL: https://issues.apache.org/jira/browse/KAFKA-2662
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8.2.1
>Reporter: Andrew Pennebaker
>Assignee: Neha Narkhede
>
> The API for obtaining a ConsumerIterator requires a group parameter, implying 
> that ConsumerIterators are thread-safe, as long as each thread is in a 
> different Kafka group. However, in practice, attempting to call hasNext() on 
> ConsumerIterators for a thread in one group, and for a thread in another 
> group, results in an InvalidStateException.
> In the future, can we please make ConsumerIterator thread-safe, for a common 
> use case of one consumer thread per group?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-2725) high level consumer rebalances with auto-commit disabled should throw an exception

2017-10-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2725.

Resolution: Auto Closed

The Scala consumers are no longer supported. If this still applies to the Java 
consumer, please file a new issue.

> high level consumer rebalances with auto-commit disabled should throw an 
> exception
> --
>
> Key: KAFKA-2725
> URL: https://issues.apache.org/jira/browse/KAFKA-2725
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1
> Environment: Experienced on Java running in linux
>Reporter: Cliff Rhyne
>
> Auto-commit is a very resilient mode.  Drops in zookeeper sessions due to JVM 
> garbage collection, network, rebalance or other interference are handled 
> gracefully within the kafka client.
> Systems can still drop due to unexpected GC or network behavior.  My proposal 
> is to handle this drop better when auto-commit is turned off:
> - If a rebalance or similar occurs (which causes the offset to get reverted in 
> the client), check and see if the client was assigned back to the same 
> partition or a different one.  If it's the same partition, find the place 
> last consumed (it doesn't do this today for us).  This is to make a graceful 
> recovery.
> - If the partition assignment changes (which can mean duplicate data is 
> getting processed), throw an exception back to the application code.  This 
> lets the application code handle this exception case with respect to the work 
> it's doing (which might be transactional).  Failing "silently" (yes, it's still 
> getting logged) is very dangerous in our situation.
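The second proposal, detecting a changed assignment after a rebalance and surfacing it as an exception, can be sketched in plain Java. Class and method names here are hypothetical, not Kafka API:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: remember the previous partition assignment and throw
// when a rebalance moved the consumer to different partitions, instead of
// failing silently while duplicate processing may be under way.
class RebalanceGuard {
    private Set<Integer> assigned = new HashSet<>();

    void onAssignment(Set<Integer> newAssignment) {
        boolean changed = !assigned.isEmpty() && !assigned.equals(newAssignment);
        assigned = new HashSet<>(newAssignment);
        if (changed) {
            throw new IllegalStateException(
                "Partition assignment changed after rebalance; "
                + "duplicate processing possible: " + newAssignment);
        }
    }
}
```

The application can catch the exception at the transaction boundary and decide whether to roll back or re-process, which is exactly the control the reporter is asking for.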



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-3071) Kafka Server 0.8.2 ERROR OOME with siz

2017-10-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3071.

Resolution: Auto Closed

The SimpleConsumer is no longer supported and will be removed in a future 
release. Please upgrade to the Java Consumer whenever possible.

> Kafka Server 0.8.2 ERROR OOME with siz
> --
>
> Key: KAFKA-3071
> URL: https://issues.apache.org/jira/browse/KAFKA-3071
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.2.0
> Environment: Linux * 2.6.32-431.23.3.el6.x86_64 #1 SMP Wed 
> Jul 16 06:12:23 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: vijay bhaskar
>  Labels: build
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> [2016-01-06 12:34:18.819-0700] INFO Truncating log hughes-order-status-73 to 
> offset 18. (kafka.log.Log)
> [2016-01-06 12:34:18.819-0700] INFO Truncating log troubleshoot-completed-125 
> to offset 3. (kafka.log.Log)
> [2016-01-06 12:34:19.064-0700] DEBUG Scheduling task highwatermark-checkpoint 
> with initial delay 0 ms and period 5000 ms. (kafka.utils.KafkaScheduler)
> [2016-01-06 12:34:19.106-0700] DEBUG Scheduling task [__consumer_offsets,0] 
> with initial delay 0 ms and period -1 ms. (kafka.utils.KafkaScheduler)
> [2016-01-06 12:34:19.106-0700] INFO Loading offsets from 
> [__consumer_offsets,0] (kafka.server.OffsetManager)
> [2016-01-06 12:34:19.108-0700] INFO Finished loading offsets from 
> [__consumer_offsets,0] in 2 milliseconds. (kafka.server.OffsetManager)
> [2016-01-06 12:34:27.023-0700] ERROR OOME with size 743364196 
> (kafka.network.BoundedByteBufferReceive)
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
> at 
> kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:80)
> at 
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:63)
> at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> at 
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> at kafka.network.BlockingChannel.receive(BlockingChannel.scala:108)
> at 
> kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:72)
> at 
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:113)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:113)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:113)
> at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:112)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:112)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:112)
> at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:111)
> at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:97)
> at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:89)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> [2016-01-06 12:34:32.003-0700] ERROR OOME with size 743364196 
> (kafka.network.BoundedByteBufferReceive)
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
> at 
> kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:80)
> at 
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:63)
> at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> at 
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> at kafka.network.BlockingChannel.receive(BlockingChannel.scala:108)
> at 
> kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:80)
> at 
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:113)
> at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:113)
> at 
> k

[jira] [Resolved] (KAFKA-3860) No broker partitions consumed by consumer thread

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3860.

Resolution: Auto Closed

The old consumer is no longer supported; please upgrade to the Java consumer 
whenever possible.

> No broker partitions consumed by consumer thread
> 
>
> Key: KAFKA-3860
> URL: https://issues.apache.org/jira/browse/KAFKA-3860
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.8.2.1
> Environment: centOS 
>Reporter: Kundan
>Assignee: Neha Narkhede
>
> I'm using kafka-clients-0.8.2.1 to consume messages as a Java KafkaConsumer.
> Problem: I have two consumers; one consumer is able to receive messages, but 
> the other is unable to, and a warning was visible in the log saying "No broker 
> partitions consumed by consumer thread". It was also observed that the offset 
> was -1 [selected partitions : mytopic:0: fetched offset = -1: consumed offset 
> = -1] while fetching the offset at the consumer end.
> Another log line was "entering consume", which is logged in 
> ZookeeperConsumerConnector.scala at line 220.
> Other information:
> 1) Environment: CentOS
> 2) Number of ZooKeeper nodes: 5
> 3) Properties used to connect to ZooKeeper: 
>  a) zookeeper.connect: zk1:2181,zk2:2181,zk2:2181,zk4:2181,zk5:2181
>  b) group.id: mygroupId
>  c) fetch.message.max.bytes: 5242880 (also set on the producer side)
>  d) auto.commit.enable: false
> 4) Single-threaded high-level consumer code is used to consume data.
> 5) The consumer runs in a separate VM; Kafka/ZooKeeper run in a separate VM.
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-3756) javadoc has issues with incorrect @param, @throws, @return

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3756.

Resolution: Auto Closed

Closing as 0.8.2 branch is no longer supported.

> javadoc has issues with incorrect @param, @throws, @return 
> ---
>
> Key: KAFKA-3756
> URL: https://issues.apache.org/jira/browse/KAFKA-3756
> Project: Kafka
>  Issue Type: Improvement
>  Components: packaging
>Affects Versions: 0.8.2.0
>Reporter: Rekha Joshi
>Assignee: Rekha Joshi
>Priority: Minor
>
> javadoc has issues with incorrect @param, @throws, @return 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-3917) Some __consumer_offsets replicas grow way too big

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3917.

Resolution: Duplicate

This log cleaner issue has been fixed in more recent versions. I suggest 
upgrading.

> Some __consumer_offsets replicas grow way too big
> -
>
> Key: KAFKA-3917
> URL: https://issues.apache.org/jira/browse/KAFKA-3917
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.2
> Environment: Runs with Docker 1.10.1 in a container on 
> Linux 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64
>Reporter: Maxim Vladimirskiy
>  Labels: reliability
>
> We noticed that some replicas of partitions of the __consumer_offsets topic 
> grow way too big. Looking inside respective folders it became apparent that 
> old segments had not been cleaned up. Please see below example of disk usage 
> data for both affected and not affected partitions:
> Not affected partitions:
> Partition: 0  Leader: 2   Replicas: 2,3,4 Isr: 2,4,3
> 2: 49M
> 3: 49M
> 4: 49M
> Affected partitions:
> Partition: 10 Leader: 2   Replicas: 2,0,1 Isr: 1,2,0
> 0: 86M
> 1: 22G <<< too big!
> 2: 86M
> Partition: 38 Leader: 0   Replicas: 0,4,1 Isr: 1,0,4
> 0: 43M
> 1: 26G <<<  too big!
> 4: 26G <<<  too big!
> As you can see sometimes only one replica is affected, sometimes both 
> replicas are affected.
> When I try to restart a broker that has affected replicas it fails to start 
> with an exception that looks like this:
> [2016-06-28 23:15:20,441] ERROR There was an error in one of the threads 
> during logs loading: java.lang.IllegalArgumentException: requirement failed: 
> Corrupt index found, index file 
> (/var/kafka/__consumer_offsets-38/.index) has non-zero 
> size but the last offset is -676703869 and the base offset is 0 
> (kafka.log.LogManager)
> [2016-06-28 23:15:20,442] FATAL [Kafka Server 1], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file (/var/kafka/__consumer_offsets-38/.index) has 
> non-zero size but the last offset is -676703869 and the base offset is 0
> at scala.Predef$.require(Predef.scala:233)
> at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
> at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:184)
> at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:183)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.log.Log.loadSegments(Log.scala:183)
> at kafka.log.Log.<init>(Log.scala:67)
> at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$7$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:142)
> at kafka.utils.Utils$$anon$1.run(Utils.scala:54)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> After the content of the affected partition is deleted broker starts 
> successfully. 
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-4096) Kafka Backup and Recovery

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4096.

Resolution: Auto Closed

Thanks for the JIRA. This should work. If you are still seeing this with a more 
recent version, please reopen with more information (broker logs and configs 
would be the minimum required).

> Kafka Backup and Recovery
> -
>
> Key: KAFKA-4096
> URL: https://issues.apache.org/jira/browse/KAFKA-4096
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
> Environment: RHEL 7.2, AWS EC2 compute instance
>Reporter: Karthik Reddy
>Assignee: Neha Narkhede
>Priority: Critical
>
> Hi Team,
> We are trying to move the data on a Kafka cluster from one region to another 
> region. A region here could be a separate data center or a separate cluster 
> within the same region.
> In the effort to do this, we have stopped the ZK/Kafka of the old Cluster, 
> detached the EBS volumes where kafka stores all topics related data and then 
> attached the EBS volumes to the new cluster.
> We observed that new ZK cluster came with all the data that previous ZK 
> persisted meaning all the topic metadata and consumer offset information. 
> However, on the Kafka side, we noticed that messages are not seen, all the 
> index and log files are of empty size.
> The recovery point and recovery offset checkpoint indicate the correct base 
> offset as present in the old cluster.
> Apart from the MirrorMaker strategy to move the data from all the topics, can 
> you let us know whether there is any specific process to copy the file system 
> snapshots from one region to the other?
> We did restart of Kafka/ZK but that didn't help.
> Thanks,
> Karthik



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-4832) kafka producer send Async message to the wrong IP cannot to stop producer.close()

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4832.

Resolution: Auto Closed

The Scala producers have been deprecated for a while and no further work is 
planned. Please upgrade to the Java producer whenever possible.

> kafka producer send Async message to the wrong IP cannot to stop 
> producer.close()
> -
>
> Key: KAFKA-4832
> URL: https://issues.apache.org/jira/browse/KAFKA-4832
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.2
> Environment: JDK8 Eclipse Mars Win7
>Reporter: Wang Hong
>
> 1. I tried to send messages asynchronously, to a wrong IP, in a loop of 1400 
> iterations over 10 batches.
> 2. I use javaapi.kafkaproducer wrapped in a factory.
> 3. In each of the 10 batches I call Producer.Connected() and Producer.Closed().
> 4. I know I am sending messages to a wrong IP, but I noticed the terminal was 
> blocking; it can't close normally.
> The function looks like this:
> public static void go(int s) throws Exception {
>     KafkaService kf = new KafkaServiceImpl(); // init properties
>     for (int i = 0; i < 1400; i++) {
>         String msg = "a b 0a b c d e 0a b c d e 0" + s + "--" + i;
>         System.out.println(msg);
>         kf.push(msg); // producer.send()
>     }
>     kf.closeProducerFactory(); // producer.close()
>     System.out.println(s);
>     Thread.sleep(1000);
> }
> kf.closeProducerFactory() calls producer.close(),
> but the async send keeps waiting for the Kafka server, since I gave it a 
> wrong IP.
> I think waiting that long will cause problems for the whole system; it 
> occupies resources.
> Another problem: when sending with the correct IP using Runnable and thread 
> pools, everything works (also using the loop above), but it errors out 
> saying it waited for 3 tries.
> I also configured 
> advertised.host.name=xxx.xxx.xxx.xxx
> advertised.port=9092
> Now I think it maybe cannot handle so much concurrent volume at a time.
> Our system is over 1000 tps.
> Thank you.
> Resource Code part:
> package kafka.baseconfig;
> import java.util.Properties;
> import com.travelsky.util.ConFile;
> import kafka.javaapi.producer.Producer;
> import kafka.producer.KeyedMessage;
> import kafka.producer.ProducerConfig;
> /**
>  * Kafka factory pattern
>  * 
>  * 1. Replaces direct Producer usage. // not efficient for multi-threading
>  * 2. Used in three steps: 
>  * ProducerFactory fac = new ProducerFactory();
>  * fac.openProducer(); -> initialize the object
>  * fac.push(msg); -> send messages
>  * fac.closeProducer(); -> close the object
>  * @author Wang Hong
>  *
>  */
> public class ProducerFactory {
>   protected Producer producer = null;
>   protected ConFile conf = null;
>   private Properties props = new Properties();
>   private String topic = null;
>   {
>   try {
>   conf = new ConFile("KafkaProperties.conf");
>   topic = conf.getString("topic");
>   if (conf == null) {
>   throw new Exception("problem with the Kafka configuration file");
>   }
>   } catch (Exception e) {
>   e.printStackTrace();
>   }
>   }
>   
>   /**
>* Send a message.
>* @param msg
>*/
>   public void push(String msg) {
>   if (producer == null) {
>   throw new RuntimeException("producer instance is null");
>   }
>   KeyedMessage messageForSend = new 
> KeyedMessage(topic, msg);
>   producer.send(messageForSend);
>   }
>   
>   /**
>* Open the producer.
>*/
>   public void openProducer() {
>   props.put("serializer.class", "kafka.serializer.StringEncoder");
>   props.put("metadata.broker.list", 
> conf.getString("kafkaserverurl"));
>   // asynchronous send
>   props.put("producer.type", conf.getString("synctype"));
>   // how many messages to send per batch
>   props.put("batch.num.messages", conf.getString("batchmsgnums"));
>   
>   //
>   props.put("request.required.acks", "1");
>   //
>   props.put("queue.enqueue.timeout.ms", "1");
>   //
>   props.put("request.timeout.ms", "1");
>   //
>   props.put("timeout.ms", "1");
>   //
>   props.put("reconnect.backoff.ms", "1");
>   //
>   props.put("retry.backoff.ms", "1");
>   //
>   props.put("message.send.max.retries", "1");
>   //
>   props.put("retry.backoff.ms", "1");
>   //
>   props.put("linger.ms", "1");
>   //
>   props.put("max.block.ms", "

[jira] [Resolved] (KAFKA-5963) Null Pointer Exception on server

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5963.

Resolution: Auto Closed

Thanks for the JIRA. Kafka 0.8.x is no longer supported and the issue has most 
likely been fixed since then. The Selector code has received a large number of 
fixes in the last 2+ years.

> Null Pointer Exception on server
> 
>
> Key: KAFKA-5963
> URL: https://issues.apache.org/jira/browse/KAFKA-5963
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Mudit Garg
>
> I am using Kafka 0.8.1.1. I am sending the logs to Kafka Streams. 
> Initially, everything works fine but after some time when the server is idle 
> and no logs are produced, we get *Null Pointer Exception on the server* and 
> *nothing gets pushed into Kafka*.
> Following is the stack trace :
> java.lang.NullPointerException
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:187)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:154)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:101)
>   at java.lang.Thread.run(Thread.java:745)
> I checked the code at 
> https://github.com/apache/kafka/blob/0.8.1.1/clients/src/main/java/org/apache/kafka/common/network/Selector.java#L187
> Looks like transmissions object is null. 
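As a rough illustration of the failure mode (this is not the Selector code itself; every name below is hypothetical), dereferencing a missing per-connection entry without a guard throws exactly this kind of NPE, while a lookup guard turns a stale key into a handled close:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a selector keeps per-connection state. If the idle
// connection's entry was removed (or never registered) while its key is
// still being polled, dereferencing the missing entry throws
// NullPointerException; a null guard handles the stale key instead.
public class SelectorGuardDemo {
    static final Map<String, String> transmissions = new HashMap<>();

    static String pollUnsafe(String connectionId) {
        // assumes the entry exists: NPE when it does not
        return transmissions.get(connectionId).toUpperCase();
    }

    static String pollGuarded(String connectionId) {
        String state = transmissions.get(connectionId);
        if (state == null) {
            // treat as a stale key: close/skip instead of crashing
            return "closed:" + connectionId;
        }
        return state.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(pollGuarded("conn-1")); // prints "closed:conn-1"
    }
}
```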



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5981) Update the affected versions on bug KAFKA-1194

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5981.

Resolution: Duplicate

Updated the fix versions. Only included the ones that are still supported.

> Update the affected versions on bug KAFKA-1194
> --
>
> Key: KAFKA-5981
> URL: https://issues.apache.org/jira/browse/KAFKA-5981
> Project: Kafka
>  Issue Type: Sub-task
>  Components: log
>Affects Versions: 0.8.1, 0.9.0.0, 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 
> 0.10.2.0, 0.11.0.0
> Environment: Windows
>Reporter: Amund Gjersøe
>Priority: Critical
>  Labels: windows
>
> KAFKA-1194 is a bug that has been around for close to 4 years.
> The original poster hasn't updated the affected versions, so I did, based on 
> my own testing and what has been reported in the comments.
> If someone with the right to change bug reports could update the original 
> one, that would be better than my approach. I just want to make sure that it 
> is not seen as a "v0.8 only" bug.





[jira] [Resolved] (KAFKA-6043) Kafka 8.1.1 - .ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:110) blocked

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6043.

Resolution: Auto Closed

The old consumer is deprecated and no further changes are planned. Please 
upgrade to the Java consumer whenever possible.

> Kafka 8.1.1 - 
> .ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:110) 
> blocked
> 
>
> Key: KAFKA-6043
> URL: https://issues.apache.org/jira/browse/KAFKA-6043
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.1.1
>Reporter: Ramkumar
> Attachments: 6043.v1, stdout.LRMIID-158150.zip
>
>
> We are using a 3-node kafka cluster. Around this kafka cluster we have a 
> RESTful service which provides HTTP APIs for clients. This service maintains 
> the consumer connections in a cache, which is set to expire in 60 minutes, 
> after which the consumer connection gets disconnected.
> But we see that this zookeeper connection thread is blocked and the consumer 
> object is still hanging in the JVM. Can you please let me know if there is 
> any solution identified for this?
> Below is the output from thread dump when this occurred
> pool-25100-thread-1" prio=10 tid=0x7f711804e820 nid=0x6726 waiting for 
> monitor entry [0x7f6cd999b000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:161)
> - waiting to lock <0x00076a922c40> (a java.lang.Object)
> at 
> kafka.javaapi.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:110)
> at 
> com.att.nsa.cambria.backends.kafka.KafkaConsumer$2.call(KafkaConsumer.java:207)
> at 
> com.att.nsa.cambria.backends.kafka.KafkaConsumer$2.call(KafkaConsumer.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> T--APPC-LCM-READ-E2E_T2_watcher_executor" prio=10 tid=0x7f7180198030 
> nid=0x55a2 waiting on condition [0x7f6c9f4fa000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00076b2e1508> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
> at 
> kafka.utils.ShutdownableThread.shutdown(ShutdownableThread.scala:36)
> at 
> kafka.server.AbstractFetcherThread.shutdown(AbstractFetcherThread.scala:71)
> at 
> kafka.server.AbstractFetcherManager$$anonfun$closeAllFetchers$2.apply(AbstractFetcherManager.scala:121)
> at 
> kafka.server.AbstractFetcherManager$$anonfun$closeAllFetchers$2.apply(AbstractFetcherManager.scala:120)
> at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
> at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
> at 
> kafka.server.AbstractFetcherManager.closeAllFetchers(AbstractFetcherManager.scala:120)
> - locked <0x00076a922340> (a java.lang.Object)
> at 
> kafka.consumer.ConsumerFetcherManager.stopConnections(ConsumerFetcherManager.scala:148)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$closeFetchersForQueues(ZookeeperConsumerConnector.scala
> :524)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.closeFetchers(ZookeeperConsumerConnector.scala:562)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:457)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumer

[jira] [Resolved] (KAFKA-6117) One Broker down can't rejoin the cluster

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6117.

Resolution: Auto Closed

Thanks for the JIRA. 0.8.2.x is unsupported at this point. Newer releases have 
a huge number of bug fixes, so please upgrade whenever possible.

> One Broker down can't rejoin the cluster
> 
>
> Key: KAFKA-6117
> URL: https://issues.apache.org/jira/browse/KAFKA-6117
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.2
>Reporter: GangGu
> Attachments: server.log
>
>
> One broker shut down and I restarted it, but it can't rejoin the cluster.
> My broker's port is 6667. When I restarted the broker,
> I used 'netstat -apn | grep '
> tcp    0    0 10.221.157.109:56482    10.221.157.109:6667 
> TIME_WAIT   -   
> tcp    0    0 :::10.221.157.109:6667  :::*
> LISTEN  7349/java 
> there is no client connected.
> log list:
> -rw-r--r--. 1 kafka hadoop215 Oct 25 18:03 controller.log
> -rw-r--r--. 1 kafka hadoop  0 Oct 25 18:03 kafka.err
> -rw-r--r--. 1 kafka hadoop 161010 Oct 25 18:22 kafka.out
> -rw-r--r--. 1 kafka hadoop  0 Oct 25 18:03 kafka-request.log
> -rw-r--r--. 1 kafka hadoop   2144 Oct 25 18:13 kafkaServer-gc.log
> -rw-r--r--. 1 kafka hadoop  0 Oct 25 18:03 log-cleaner.log
> -rw-r--r--. 1 kafka hadoop  97055 Oct 25 18:22 server.log
> -rw-r--r--. 1 kafka hadoop  0 Oct 25 18:03 state-change.log





[jira] [Resolved] (KAFKA-5934) Cross build for scala 2.12

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5934.

Resolution: Won't Fix

Thanks for the JIRA and PR. 0.8.x is unsupported at this point and we don't 
intend to publish any new releases or review changes for 0.8.x branches.

> Cross build for scala 2.12
> --
>
> Key: KAFKA-5934
> URL: https://issues.apache.org/jira/browse/KAFKA-5934
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.2
>Reporter: Srdan Srepfler
>Priority: Minor
>
> It would be nice, if possible, to cross-compile the last Kafka 0.8 release 
> to Scala 2.12, as there are still production services lying around which are 
> running on that version, and it's not easy to evolve clients since the old 
> libraries only go up to 2.11. 





[jira] [Resolved] (KAFKA-2156) Possibility to plug in custom MetricRegistry

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2156.

Resolution: Auto Closed

It's unlikely that we will do this for Yammer Metrics. Please file a separate 
JIRA for Kafka Metrics if changes are required for the flexibility you desire.

> Possibility to plug in custom MetricRegistry
> 
>
> Key: KAFKA-2156
> URL: https://issues.apache.org/jira/browse/KAFKA-2156
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.8.1.2
>Reporter: Andras Sereny
>Assignee: Jun Rao
>
> The trait KafkaMetricsGroup refers to Metrics.defaultRegistry() throughout. 
> It would be nice to be able to inject any MetricsRegistry instead of the 
> default one. 
> (My usecase is that I'd like to channel Kafka metrics into our application's 
> metrics system, for which I'd need custom implementations of 
> com.yammer.metrics.core.Metric.)
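In the abstract, the requested change amounts to injecting a registry instead of referencing a global default. A hypothetical Java sketch of that injection pattern (none of these names are Kafka's or Yammer's; they only illustrate the shape of the request):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical pluggable-registry sketch: instead of metrics code calling a
// static defaultRegistry(), the registry is passed in, so an application can
// supply its own implementation that channels metrics elsewhere.
interface MetricsRegistry {
    AtomicLong counter(String name);
}

class SimpleRegistry implements MetricsRegistry {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();
    public AtomicLong counter(String name) {
        return counters.computeIfAbsent(name, n -> new AtomicLong());
    }
}

public class PluggableMetricsDemo {
    private final MetricsRegistry registry;

    // The registry is injected rather than fetched from a global default.
    public PluggableMetricsDemo(MetricsRegistry registry) {
        this.registry = registry;
    }

    public void recordMessage() {
        registry.counter("messages-in").incrementAndGet();
    }

    public static void main(String[] args) {
        MetricsRegistry registry = new SimpleRegistry();
        PluggableMetricsDemo demo = new PluggableMetricsDemo(registry);
        demo.recordMessage();
        demo.recordMessage();
        System.out.println("messages-in: " + registry.counter("messages-in").get());
    }
}
```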





[jira] [Resolved] (KAFKA-2180) topics never create on brokers though it succeeds in tool and is in zookeeper

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2180.

Resolution: Auto Closed

Not enough information and very old, so auto-closing.

> topics never create on brokers though it succeeds in tool and is in zookeeper
> -
>
> Key: KAFKA-2180
> URL: https://issues.apache.org/jira/browse/KAFKA-2180
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.2
>Reporter: Joe Stein
>Priority: Critical
>
> Ran into an issue with a 0.8.2.1 cluster where create topic was succeeding 
> when running bin/kafka-topics.sh --create and was seen in zookeeper, but the 
> brokers never got updated. 
> We ended up fixing this by deleting the /controller znode so that controller 
> leader election would take place. We really should have a better way to make 
> the controller fail over ( KAFKA-1778 ) than rmr /controller in the zookeeper 
> shell.





[jira] [Resolved] (KAFKA-2222) Write "Input/output error" did not result in broker shutdown

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2222.

Resolution: Won't Fix

The code changed dramatically since this was filed (the stacktrace doesn't even 
apply any more). Having said that, the underlying issue that we don't disable 
the disk or shutdown the broker if there's an error during `sendfile` is 
probably still there. Not sure there's a reasonable fix; if produce requests 
are happening, we will disable the disk (1.0.0 and later) or shut down the 
broker (before 1.0.0).

> Write "Input/output error" did not result in broker shutdown
> 
>
> Key: KAFKA-2222
> URL: https://issues.apache.org/jira/browse/KAFKA-2222
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.2
>Reporter: Jason Rosenberg
>
> We had a disk start failing intermittently, and began seeing errors like this 
> in the broker.  Usually, IOExceptions during a file write result in the 
> broker shutting down immediately.  This is with version 0.8.2.1.
> {code}
> 2015-05-21 23:59:57,841 ERROR [kafka-network-thread-27330-2] 
> network.Processor - Closing socket for /1.2.3.4 because of error
> java.io.IOException: Input/output error
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:443)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:575)
> at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:147)
> at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:70)
> at kafka.network.MultiSend.writeTo(Transmission.scala:101)
> at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:125)
> at kafka.network.MultiSend.writeTo(Transmission.scala:101)
> at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:231)
> at kafka.network.Processor.write(SocketServer.scala:472)
> at kafka.network.Processor.run(SocketServer.scala:342)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> This resulted in intermittent producer failures failing to send messages 
> successfully, etc.





[jira] [Resolved] (KAFKA-6064) Cluster hung when the controller tried to delete a bunch of topics

2017-10-28 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6064.

Resolution: Auto Closed

0.8.2.1 is no longer supported. Many bugs related to topic deletion have been 
fixed in the releases since then. I suggest upgrading.

> Cluster hung when the controller tried to delete a bunch of topics 
> ---
>
> Key: KAFKA-6064
> URL: https://issues.apache.org/jira/browse/KAFKA-6064
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: rhel 6, 12 core, 48GB 
>Reporter: Chaitanya GSK
>  Labels: controller, kafka-0.8
>
> Hi, 
> We have been using 0.8.2.1 in our kafka cluster and we had a full cluster 
> outage when we programmatically tried to delete 220 topics: the controller 
> hung and went out of memory. This somehow led to the whole cluster going 
> down, and the clients were not able to push data at the expected rate. 
> AFAIK, the controller shouldn't impact the write rate to the other brokers, 
> but in this case it did. Below is the client error.
> [WARN] Failed to send kafka.producer.async request with correlation id 
> 1613935688 to broker 44 with data for partitions 
> [topic_2,65],[topic_2,167],[topic_3,2],[topic_4,0],[topic_5,30],[topic_2,48],[topic_2,150]
> java.io.IOException: Broken pipe
>   at sun.nio.ch.FileDispatcherImpl.writev0(Native Method) ~[?:1.8.0_60]
>   at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51) 
> ~[?:1.8.0_60]
>   at sun.nio.ch.IOUtil.write(IOUtil.java:148) ~[?:1.8.0_60]
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:504) 
> ~[?:1.8.0_60]
>   at java.nio.channels.SocketChannel.write(SocketChannel.java:502) 
> ~[?:1.8.0_60]
>   at 
> kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56) 
> ~[stormjar.jar:?]
>   at kafka.network.Send$class.writeCompletely(Transmission.scala:75) 
> ~[stormjar.jar:?]
>   at 
> kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
>  ~[stormjar.jar:?]
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:103) 
> ~[stormjar.jar:?]
>   at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73) 
> ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
>  ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
>  ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
>  ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
>  ~[stormjar.jar:?]
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
>  ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) 
> ~[stormjar.jar:?]
>   at 
> kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) 
> ~[stormjar.jar:?]
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) ~[stormjar.jar:?]
>   at kafka.producer.SyncProducer.send(SyncProducer.scala:101) 
> ~[stormjar.jar:?]
>   at 
> kafka.producer.async.YamasKafkaEventHandler.kafka$producer$async$YamasKafkaEventHandler$$send(YamasKafkaEventHandler.scala:481)
>  [stormjar.jar:?]
>   at 
> kafka.producer.async.YamasKafkaEventHandler$$anonfun$dispatchSerializedData$2.apply(YamasKafkaEventHandler.scala:144)
>  [stormjar.jar:?]
>   at 
> kafka.producer.async.YamasKafkaEventHandler$$anonfun$dispatchSerializedData$2.apply(YamasKafkaEventHandler.scala:138)
>  [stormjar.jar:?]
>   at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>  [stormjar.jar:?]
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) 
> [stormjar.jar:?]
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) 
> [stormjar.jar:?]
>   at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) 
> [stormjar.jar:?]
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) 
> [stormjar.jar:?]
>   at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) 
> [stormjar.jar:?]
>   at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>  [stormjar.jar:?]
>   at 
> kafka.producer.async.YamasKafkaEventHandler.dispatchSerializedData(YamasKafkaEventHandler.scala:138)
>  [stormjar.jar:?]
>   at 
> kafka.producer.async.Yama

[jira] [Created] (KAFKA-6136) Transient test failure: SaslPlainSslEndToEndAuthorizationTest.testTwoConsumersWithDifferentSaslCredentials

2017-10-27 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6136:
--

 Summary: Transient test failure: 
SaslPlainSslEndToEndAuthorizationTest.testTwoConsumersWithDifferentSaslCredentials
 Key: KAFKA-6136
 URL: https://issues.apache.org/jira/browse/KAFKA-6136
 Project: Kafka
  Issue Type: Bug
Reporter: Ismael Juma


Looks like a cleanup issue:

{code}
testTwoConsumersWithDifferentSaslCredentials – 
kafka.api.SaslPlainSslEndToEndAuthorizationTest
a few seconds
Error
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to 
access group: group
Stacktrace
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to 
access group: group
Standard Output
[2017-10-27 00:37:47,919] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
Adding ACLs for resource `Cluster:kafka-cluster`: 
User:admin has Allow permission for operations: ClusterAction from 
hosts: * 
Current ACLs for resource `Cluster:kafka-cluster`: 
User:admin has Allow permission for operations: ClusterAction from 
hosts: * 
[2017-10-27 00:37:48,961] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
fetcherId=0] Error for partition __consumer_offsets-0 from broker 0 
(kafka.server.ReplicaFetcherThread:107)
org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to 
access topics: [Topic authorization failed.]
[2017-10-27 00:37:48,967] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
fetcherId=0] Error for partition __consumer_offsets-0 from broker 0 
(kafka.server.ReplicaFetcherThread:107)
org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to 
access topics: [Topic authorization failed.]
Adding ACLs for resource `Topic:*`: 
User:admin has Allow permission for operations: Read from hosts: * 
Current ACLs for resource `Topic:*`: 
User:admin has Allow permission for operations: Read from hosts: * 
[2017-10-27 00:37:52,330] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-10-27 00:37:52,345] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
Adding ACLs for resource `Cluster:kafka-cluster`: 
User:admin has Allow permission for operations: ClusterAction from 
hosts: * 
Current ACLs for resource `Cluster:kafka-cluster`: 
User:admin has Allow permission for operations: ClusterAction from 
hosts: * 
[2017-10-27 00:37:53,459] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
fetcherId=0] Error for partition __consumer_offsets-0 from broker 0 
(kafka.server.ReplicaFetcherThread:107)
org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to 
access topics: [Topic authorization failed.]
[2017-10-27 00:37:53,462] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
fetcherId=0] Error for partition __consumer_offsets-0 from broker 0 
(kafka.server.ReplicaFetcherThread:107)
org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to 
access topics: [Topic authorization failed.]
Adding ACLs for resource `Topic:*`: 
User:admin has Allow permission for operations: Read from hosts: * 
Current ACLs for resource `Topic:*`: 
User:admin has Allow permission for operations: Read from hosts: * 
Adding ACLs for resource `Topic:e2etopic`: 
User:user has Allow permission for operations: Write from hosts: *
User:user has Allow permission for operations: Describe from hosts: * 
Adding ACLs for resource `Cluster:kafka-cluster`: 
User:user has Allow permission for operations: Create from hosts: * 
Current ACLs for resource `Topic:e2etopic`: 
User:user has Allow permission for operations: Write from hosts: *
User:user has Allow permission for operations: Describe from hosts: * 
Adding ACLs for resource `Topic:e2etopic`: 
User:user has Allow permission for operations: Read from hosts: *
User:user has Allow permission for operations: Describe from hosts: * 
Adding ACLs for resource `Group:group`: 
User:user has Allow permission for operations: Read from hosts: * 
Current ACLs for resource `Topic:e2etopic`: 
User:user has Allow permission for operations: Write from hosts: *
User:user has Allow permission for operations: Describe from hosts: *
User:user has Allow permission for operations: Read from hosts: * 
Current ACLs for resource `Group:group`: 
User:user has Allow permission for operations: Read from hosts: * 
[2017-10-27 00:37:55,520] WARN caught end of stream exception 
(org.apache.zookeeper.server.NIOServerCnxn:368)
EndOfStreamException: Unable to read additio

[jira] [Resolved] (KAFKA-6131) Transaction markers are sometimes discarded if txns complete concurrently

2017-10-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6131.

   Resolution: Fixed
Fix Version/s: 1.1.0
   0.11.0.2
   1.0.0

> Transaction markers are sometimes discarded if txns complete concurrently
> -
>
> Key: KAFKA-6131
> URL: https://issues.apache.org/jira/browse/KAFKA-6131
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.1, 1.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 1.0.0, 0.11.0.2, 1.1.0
>
>
> Concurrent tests being added under KAFKA-6096 for transaction coordinator 
> fail to complete some transactions when multiple transactions are completed 
> concurrently.
> The problem is with the following code snippet - there are two very similar 
> uses of a concurrent map in {{TransactionMarkerChannelManager}}, and the test 
> fails because some transaction markers are discarded. {{getOrElseUpdate}} in 
> scala maps is not atomic. The test passes consistently with one thread.
> {quote}
> val markersQueuePerBroker: concurrent.Map[Int, TxnMarkerQueue] = new 
> ConcurrentHashMap[Int, TxnMarkerQueue]().asScala
> val brokerRequestQueue = markersQueuePerBroker.getOrElseUpdate(brokerId, new 
> TxnMarkerQueue(broker))
> {quote}
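The race is easy to reproduce in miniature. A hedged Java sketch (not the actual Kafka fix; class names are illustrative) showing that `ConcurrentHashMap.computeIfAbsent` creates the per-broker queue atomically, where a non-atomic get-then-put could create two queues and silently drop markers added to the losing one:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the race: a getOrElseUpdate-style "get then put" can create two
// queues for the same broker under contention, so markers added to the losing
// queue are lost. ConcurrentHashMap.computeIfAbsent runs the factory at most
// once per key, so every thread observes the same queue instance.
public class MarkerQueueDemo {
    static final AtomicInteger created = new AtomicInteger();

    static class TxnMarkerQueue {
        final int brokerId;
        TxnMarkerQueue(int brokerId) {
            this.brokerId = brokerId;
            created.incrementAndGet();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<Integer, TxnMarkerQueue> queues = new ConcurrentHashMap<>();
        CountDownLatch start = new CountDownLatch(1);
        Runnable task = () -> {
            try { start.await(); } catch (InterruptedException e) { return; }
            // Atomic: the factory runs at most once for broker 42.
            queues.computeIfAbsent(42, TxnMarkerQueue::new);
        };
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(task);
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) t.join();
        // Exactly one queue was created despite 8 concurrent callers.
        System.out.println("queues created: " + created.get()); // prints "queues created: 1"
    }
}
```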





[jira] [Resolved] (KAFKA-6128) Shutdown script does not do a clean shutdown

2017-10-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6128.

Resolution: Not A Problem

> Shutdown script does not do a clean shutdown
> 
>
> Key: KAFKA-6128
> URL: https://issues.apache.org/jira/browse/KAFKA-6128
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.1
>Reporter: Alastair Munro
>Priority: Minor
>
> Shutdown script (sending term signal) does not do a clean shutdown.
> We are running kafka in kubernetes/openshift 0.11.0.0. The statefulset kafka 
> runs the shutdown script prior to stopping the pod kafka is running on:
> {code}
> lifecycle:
>   preStop:
> exec:
>   command:
>   - ./bin/kafka-server-stop.sh
> {code}
> This worked perfectly in 0.11.0.0 but doesn't in 0.11.0.1. Also we see the 
> same behaviour if we send a TERM signal to the kafka process (same as the 
> shutdown script).
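The preStop hook above relies on the JVM running registered shutdown hooks when it receives a TERM signal, which is what lets a server flush and close resources before exiting. A minimal Java sketch of that general pattern (illustrative names, not Kafka's actual shutdown code):

```java
// Minimal sketch of TERM-triggered clean shutdown via a JVM shutdown hook.
// The JVM runs registered hooks on SIGTERM (e.g. from a Kubernetes preStop
// script), giving the server a chance to close resources cleanly.
public class CleanShutdownDemo {
    private static volatile boolean closed = false;

    static boolean isClosed() {
        return closed;
    }

    static void closeResources() {
        // flush logs, close sockets, commit state, etc.
        closed = true;
    }

    public static Thread registerHook() {
        Thread hook = new Thread(CleanShutdownDemo::closeResources, "shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Thread hook = registerHook();
        // On SIGTERM the hook would run automatically; here we deregister and
        // invoke it directly so the demo is deterministic.
        Runtime.getRuntime().removeShutdownHook(hook);
        hook.run();
        System.out.println("closed: " + isClosed()); // prints "closed: true"
    }
}
```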





[jira] [Reopened] (KAFKA-6128) Shutdown script does not do a clean shutdown

2017-10-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-6128:


> Shutdown script does not do a clean shutdown
> 
>
> Key: KAFKA-6128
> URL: https://issues.apache.org/jira/browse/KAFKA-6128
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.1
>Reporter: Alastair Munro
>Priority: Minor
>
> Shutdown script (sending term signal) does not do a clean shutdown.
> We are running kafka in kubernetes/openshift 0.11.0.0. The statefulset kafka 
> runs the shutdown script prior to stopping the pod kafka is running on:
> {code}
> lifecycle:
>   preStop:
> exec:
>   command:
>   - ./bin/kafka-server-stop.sh
> {code}
> This worked perfectly in 0.11.0.0 but doesn't in 0.11.0.1. Also we see the 
> same behaviour if we send a TERM signal to the kafka process (same as the 
> shutdown script).





[jira] [Resolved] (KAFKA-6116) Major performance issue due to excessive logging during leader election

2017-10-24 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6116.

Resolution: Fixed

Fixed in 0.11.0, 1.0 and trunk. Commit for 0.11.0:

https://github.com/apache/kafka/commit/d798c515992bdfd57b0a958d5b430d1e1a3e296e

> Major performance issue due to excessive logging during leader election
> ---
>
> Key: KAFKA-6116
> URL: https://issues.apache.org/jira/browse/KAFKA-6116
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: Ismael Juma
>Assignee: Onur Karaman
>Priority: Blocker
> Fix For: 1.0.0, 0.11.0.2, 1.1.0
>
>
> This was particularly problematic in clusters with a large number of 
> partitions.





[jira] [Created] (KAFKA-6116) Major performance issue due to excessive logging during leader election

2017-10-24 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6116:
--

 Summary: Major performance issue due to excessive logging during 
leader election
 Key: KAFKA-6116
 URL: https://issues.apache.org/jira/browse/KAFKA-6116
 Project: Kafka
  Issue Type: Bug
  Components: controller
Reporter: Ismael Juma
Assignee: Onur Karaman
Priority: Blocker
 Fix For: 1.0.0, 0.11.0.2, 1.1.0


This was particularly problematic in clusters with a large number of partitions.





[jira] [Created] (KAFKA-6111) Tests for KafkaControllerZkUtils

2017-10-23 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6111:
--

 Summary: Tests for KafkaControllerZkUtils
 Key: KAFKA-6111
 URL: https://issues.apache.org/jira/browse/KAFKA-6111
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ismael Juma
 Fix For: 1.1.0


It has no tests at the moment and we need to fix that.





[jira] [Resolved] (KAFKA-6070) ducker-ak: add ipaddress and enum34 dependencies to docker image

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6070.

   Resolution: Fixed
Fix Version/s: 1.1.0
   1.0.0

> ducker-ak: add ipaddress and enum34 dependencies to docker image
> 
>
> Key: KAFKA-6070
> URL: https://issues.apache.org/jira/browse/KAFKA-6070
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 1.0.0, 1.1.0
>
>
> ducker-ak: add ipaddress and enum34 dependencies to docker image





[jira] [Resolved] (KAFKA-5720) In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with org.apache.kafka.common.errors.TimeoutException

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5720.

   Resolution: Fixed
Fix Version/s: (was: 1.1.0)
   1.0.0

> In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with 
> org.apache.kafka.common.errors.TimeoutException
> ---
>
> Key: KAFKA-5720
> URL: https://issues.apache.org/jira/browse/KAFKA-5720
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Minor
> Fix For: 1.0.0
>
>
> In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with 
> org.apache.kafka.common.errors.TimeoutException.
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:213)
>   at 
> kafka.api.AdminClientIntegrationTest.testCallInFlightTimeouts(AdminClientIntegrationTest.scala:399)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.errors.TimeoutException: Aborted due to 
> timeout.
> {code}
> It's unclear whether this was an environment error or test bug.




