[jira] [Commented] (KAFKA-8624) This is the server-side Kafka, version kafka_2.11-1.0.0.2. When the client sends messages to the topic, the following exception is thrown. Is the client's version inconsistent with the server's?
[ https://issues.apache.org/jira/browse/KAFKA-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877530#comment-16877530 ]

lizhitao commented on KAFKA-8624:
---------------------------------

Please provide the client-side and server-side versions respectively; that will help in analyzing the cause of the error.

> This is the server-side Kafka, version kafka_2.11-1.0.0.2. When the client sends messages to the topic, the following exception is thrown. Is the client's version inconsistent with the server's?
>
> Key: KAFKA-8624
> URL: https://issues.apache.org/jira/browse/KAFKA-8624
> Project: Kafka
> Issue Type: Bug
> Components: log
> Affects Versions: 1.0.0
> Reporter: CHARELS
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> ERROR [KafkaApi-1004] Error when handling request {replica_id=-1,max_wait_time=1,min_bytes=0,topics=[{topic=ETE,partitions=[{partition=3,fetch_offset=119224671,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
> java.lang.IllegalArgumentException: Magic v0 does not support record headers
> at org.apache.kafka.common.record.MemoryRecordsBuilder.appendWithOffset(MemoryRecordsBuilder.java:403)
> at org.apache.kafka.common.record.MemoryRecordsBuilder.append(MemoryRecordsBuilder.java:586)
> at org.apache.kafka.common.record.AbstractRecords.convertRecordBatch(AbstractRecords.java:134)
> at org.apache.kafka.common.record.AbstractRecords.downConvert(AbstractRecords.java:109)
> at org.apache.kafka.common.record.FileRecords.downConvert(FileRecords.java:253)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1$$anonfun$apply$4.apply(KafkaApis.scala:519)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1$$anonfun$apply$4.apply(KafkaApis.scala:517)
> at scala.Option.map(Option.scala:146)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1.apply(KafkaApis.scala:517)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1.apply(KafkaApis.scala:507)
> at scala.Option.flatMap(Option.scala:171)
> at kafka.server.KafkaApis.kafka$server$KafkaApis$$convertedPartitionData$1(KafkaApis.scala:507)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$createResponse$2$1.apply(KafkaApis.scala:555)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$createResponse$2$1.apply(KafkaApis.scala:554)
> at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.server.KafkaApis.kafka$server$KafkaApis$$createResponse$2(KafkaApis.scala:554)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$fetchResponseCallback$1$1.apply(KafkaApis.scala:568)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$fetchResponseCallback$1$1.apply(KafkaApis.scala:568)
> at kafka.server.KafkaApis$$anonfun$sendResponseMaybeThrottle$1.apply$mcVI$sp(KafkaApis.scala:2033)
> at kafka.server.ClientRequestQuotaManager.maybeRecordAndThrottle(ClientRequestQuotaManager.scala:54)
> at kafka.server.KafkaApis.sendResponseMaybeThrottle(KafkaApis.scala:2032)
> at kafka.server.KafkaApis.kafka$server$KafkaApis$$fetchResponseCallback$1(KafkaApis.scala:568)
> at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$processResponseCallback$1$1.apply$mcVI$sp(KafkaApis.scala:587)
> at kafka.server.ClientQuotaManager.maybeRecordAndThrottle(ClientQuotaManager.scala:176)
> at kafka.server.KafkaApis.kafka$server$KafkaApis$$processResponseCallback$1(KafkaApis.scala:586)
> at kafka.server.KafkaApis$$anonfun$handleFetchRequest$3.apply(KafkaApis.scala:603)
> at kafka.server.KafkaApis$$anonfun$handleFetchRequest$3.apply(KafkaApis.scala:603)
> at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:820)
> at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:595)
> at kafka.server.KafkaApis.handle(KafkaApis.scala:99)
> at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
> at java.lang.Thread.run(Thread.java:748)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
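[Editor's note] "Magic v0 does not support record headers" is thrown while the broker down-converts records for a fetcher that only understands the old (pre-0.11) message format: headers exist only in format v2 and cannot be represented in v0, so the usual culprit is an old consumer rather than the producer. A first diagnostic step could be sketched with the stock CLI tools; the addresses below and the ZooKeeper-based flag (which is what the 1.0-era `kafka-configs.sh` used for topic configs) are assumptions to adapt to your cluster:

```shell
# Inspect-only commands; no cluster state is changed.
# 1. See which API versions the broker advertises, to compare against clients:
bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092

# 2. Check whether topic ETE pins an older on-disk message format
#    (log.message.format.version), which forces the down-conversion path:
bin/kafka-configs.sh --zookeeper localhost:2181 \
  --entity-type topics --entity-name ETE --describe
```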
[jira] [Commented] (KAFKA-7245) Deprecate WindowStore#put(key, value)
[ https://issues.apache.org/jira/browse/KAFKA-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877520#comment-16877520 ]

Omkar Mestry commented on KAFKA-7245:
-------------------------------------

[~mjsax] As the KIP is accepted, shall I close this Jira, or is there any further task to be performed?

> Deprecate WindowStore#put(key, value)
> -
> Key: KAFKA-7245
> URL: https://issues.apache.org/jira/browse/KAFKA-7245
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Omkar Mestry
> Priority: Minor
> Labels: needs-kip, newbie
>
> We want to remove `WindowStore#put(key, value)`; for this, we first need to deprecate it via a KIP and remove it later.
> Instead of using `WindowStore#put(key, value)`, we need to migrate code to specify the timestamp explicitly using `WindowStore#put(key, value, timestamp)`. The current code base already uses the explicit call to set the timestamp in production code. The simplified `put(key, value)` is only used in tests, and thus we would need to update those tests.
> KIP-474: [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=115526545]

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
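[Editor's note] The migration the ticket asks for is mechanical: each `put(key, value)` call site gains an explicit window-start timestamp. The sketch below models it with a stand-in interface (the real `org.apache.kafka.streams.state.WindowStore` is not on the classpath here); `InMemoryWindowStore` and its key encoding are illustrative only:

```java
// Minimal sketch of the call-site migration; the nested WindowStore interface
// is a stand-in for the Kafka Streams API, NOT the real one.
import java.util.HashMap;
import java.util.Map;

class WindowStorePutMigration {

    // Mirrors the two overloads under discussion.
    interface WindowStore<K, V> {
        @Deprecated
        void put(K key, V value);                            // to be deprecated
        void put(K key, V value, long windowStartTimestamp); // preferred
    }

    static class InMemoryWindowStore implements WindowStore<String, Long> {
        final Map<String, Long> data = new HashMap<>();

        @Override
        public void put(String key, Long value) {
            // Old overload: the timestamp is implicit (here: wall clock),
            // which is exactly the ambiguity the deprecation removes.
            put(key, value, System.currentTimeMillis());
        }

        @Override
        public void put(String key, Long value, long windowStartTimestamp) {
            data.put(key + "@" + windowStartTimestamp, value);
        }
    }

    public static void main(String[] args) {
        InMemoryWindowStore store = new InMemoryWindowStore();
        // Migrated call site: pass the window start explicitly.
        store.put("clicks", 42L, 1_000L);
        System.out.println(store.data); // {clicks@1000=42}
    }
}
```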
[jira] [Created] (KAFKA-8624) This is the server-side Kafka, version kafka_2.11-1.0.0.2. When the client sends messages to the topic, the following exception is thrown. Is the client's version inconsistent with the server's?
CHARELS created KAFKA-8624:
-------------------------------

Summary: This is the server-side Kafka, version kafka_2.11-1.0.0.2. When the client sends messages to the topic, the following exception is thrown. Is the client's version inconsistent with the server's?
Key: KAFKA-8624
URL: https://issues.apache.org/jira/browse/KAFKA-8624
Project: Kafka
Issue Type: Bug
Components: log
Affects Versions: 1.0.0
Reporter: CHARELS

ERROR [KafkaApi-1004] Error when handling request {replica_id=-1,max_wait_time=1,min_bytes=0,topics=[{topic=ETE,partitions=[{partition=3,fetch_offset=119224671,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
java.lang.IllegalArgumentException: Magic v0 does not support record headers
at org.apache.kafka.common.record.MemoryRecordsBuilder.appendWithOffset(MemoryRecordsBuilder.java:403)
at org.apache.kafka.common.record.MemoryRecordsBuilder.append(MemoryRecordsBuilder.java:586)
at org.apache.kafka.common.record.AbstractRecords.convertRecordBatch(AbstractRecords.java:134)
at org.apache.kafka.common.record.AbstractRecords.downConvert(AbstractRecords.java:109)
at org.apache.kafka.common.record.FileRecords.downConvert(FileRecords.java:253)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1$$anonfun$apply$4.apply(KafkaApis.scala:519)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1$$anonfun$apply$4.apply(KafkaApis.scala:517)
at scala.Option.map(Option.scala:146)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1.apply(KafkaApis.scala:517)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$convertedPartitionData$1$1.apply(KafkaApis.scala:507)
at scala.Option.flatMap(Option.scala:171)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$convertedPartitionData$1(KafkaApis.scala:507)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$createResponse$2$1.apply(KafkaApis.scala:555)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$createResponse$2$1.apply(KafkaApis.scala:554)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$createResponse$2(KafkaApis.scala:554)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$fetchResponseCallback$1$1.apply(KafkaApis.scala:568)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$fetchResponseCallback$1$1.apply(KafkaApis.scala:568)
at kafka.server.KafkaApis$$anonfun$sendResponseMaybeThrottle$1.apply$mcVI$sp(KafkaApis.scala:2033)
at kafka.server.ClientRequestQuotaManager.maybeRecordAndThrottle(ClientRequestQuotaManager.scala:54)
at kafka.server.KafkaApis.sendResponseMaybeThrottle(KafkaApis.scala:2032)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$fetchResponseCallback$1(KafkaApis.scala:568)
at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$processResponseCallback$1$1.apply$mcVI$sp(KafkaApis.scala:587)
at kafka.server.ClientQuotaManager.maybeRecordAndThrottle(ClientQuotaManager.scala:176)
at kafka.server.KafkaApis.kafka$server$KafkaApis$$processResponseCallback$1(KafkaApis.scala:586)
at kafka.server.KafkaApis$$anonfun$handleFetchRequest$3.apply(KafkaApis.scala:603)
at kafka.server.KafkaApis$$anonfun$handleFetchRequest$3.apply(KafkaApis.scala:603)
at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:820)
at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:595)
at kafka.server.KafkaApis.handle(KafkaApis.scala:99)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
at java.lang.Thread.run(Thread.java:748)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8620) Fix potential race condition in StreamThread state change
[ https://issues.apache.org/jira/browse/KAFKA-8620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877440#comment-16877440 ]

Boyang Chen commented on KAFKA-8620:
------------------------------------

In the callback StreamThread#onPartitionsAssigned we do a state transition to PARTITIONS_ASSIGNED. If the KafkaStreams instance invokes the shutdown hook, our state transits to PENDING_SHUTDOWN, and the current state machine returns null to indicate a transition failure, which causes the code path to proceed without triggering the taskManager.createNewTasks() call. The consequence is that we do assign topic partitions to the thread, but the taskManager contains an empty active-task map, so when we later receive non-empty records and call addRecordsToTasks in StreamThread#runOnce, we throw an NPE for being unable to find the correct tasks.

The proper fix involves:
1. A debug log on the state transition error.
2. A state check after we exit from pollRequests in runOnce: if we are already in PENDING_SHUTDOWN, we shouldn't proceed.

> Fix potential race condition in StreamThread state change
> -
> Key: KAFKA-8620
> URL: https://issues.apache.org/jira/browse/KAFKA-8620
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Boyang Chen
> Assignee: Boyang Chen
> Priority: Major
>
> In the call to `StreamThread.addRecordsToTasks` we don't have synchronization when we attempt to extract active tasks. If, after one long poll in runOnce, the application state changes to PENDING_SHUTDOWN, there is a potential close on TaskManager which erases the active tasks map, thus triggering an NPE and bringing the thread state to a false shutdown.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
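[Editor's note] The proposed guard can be pictured with a toy state machine. Everything below (class and method names, the transition set) is a stand-in for StreamThread's internals, not actual Kafka Streams code; it only illustrates "check the transition result after poll, and back off if a shutdown raced in":

```java
// Toy model of the proposed fix; names and transitions are hypothetical.
class StreamThreadShutdownGuard {

    enum State { RUNNING, PARTITIONS_ASSIGNED, PENDING_SHUTDOWN }

    private State state = State.RUNNING;

    // Mirrors a setState() that refuses to leave PENDING_SHUTDOWN and
    // signals the failure by returning null, as the comment above describes.
    synchronized State setState(State newState) {
        if (state == State.PENDING_SHUTDOWN) {
            return null; // invalid transition; callers MUST check for null
        }
        State old = state;
        state = newState;
        return old;
    }

    // The fix: after returning from poll, re-check the transition result and
    // skip task creation when a shutdown was requested concurrently.
    synchronized boolean maySetPartitionsAssigned() {
        return setState(State.PARTITIONS_ASSIGNED) != null;
    }

    synchronized void requestShutdown() { state = State.PENDING_SHUTDOWN; }
}
```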
[jira] [Updated] (KAFKA-8611) Make topic optional when using through() operations in DSL
[ https://issues.apache.org/jira/browse/KAFKA-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax updated KAFKA-8611:
-----------------------------------

Description:
When using the DSL in Kafka Streams, data repartitioning happens only when a key-changing operation is followed by a stateful operation. On the other hand, in the DSL, stateful computation can also happen using the _transform()_ operation. The problem with this approach is that, even if an upstream operation was key-changing before calling _transform()_, no auto-repartitioning is triggered. If repartitioning is required, a call to _through(String)_ should be performed before _transform()_. With the current implementation, the burden of managing and creating the topic falls on the user and introduces extra complexity in managing a Kafka Streams application.
KIP-221: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+KStream+with+Connecting+Topic+Creation+and+Repartition+Hint]

was:
When using the DSL in Kafka Streams, data repartitioning happens only when a key-changing operation is followed by a stateful operation. On the other hand, in the DSL, stateful computation can also happen using the _transform()_ operation. The problem with this approach is that, even if an upstream operation was key-changing before calling _transform()_, no auto-repartitioning is triggered. If repartitioning is required, a call to _through(String)_ should be performed before _transform()_. With the current implementation, the burden of managing and creating the topic falls on the user and introduces extra complexity in managing a Kafka Streams application.
> Make topic optional when using through() operations in DSL > -- > > Key: KAFKA-8611 > URL: https://issues.apache.org/jira/browse/KAFKA-8611 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Levani Kokhreidze >Assignee: Levani Kokhreidze >Priority: Minor > Labels: kip > > When using DSL in Kafka Streams, data re-partition happens only when > key-changing operation is followed by stateful operation. On the other hand, > in DSL, stateful computation can happen using _transform()_ operation as > well. Problem with this approach is that, even if any upstream operation was > key-changing before calling _transform()_, no auto-repartition is triggered. > If repartitioning is required, a call to _through(String)_ should be > performed before _transform()_. With the current implementation, burden of > managing and creating the topic falls on user and introduces extra complexity > of managing Kafka Streams application. > KIP-221: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+KStream+with+Connecting+Topic+Creation+and+Repartition+Hint] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
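[Editor's note] For reference, the manual workaround the description alludes to looks roughly like this in the DSL. This is an illustrative fragment, not a runnable program: the topic name "user-repartition" and `MyTransformer` are made up, `kafka-streams` must be on the classpath, and the repartition topic must be created and partitioned by the user, which is exactly the burden KIP-221 wants to remove:

```java
// Illustrative fragment only; assumes a pre-created "user-repartition" topic
// with the desired partition count and a user-defined MyTransformer.
KStream<String, String> source = builder.stream("input-topic");

source
    .selectKey((k, v) -> v)        // key-changing operation: records may now
                                   // sit on the "wrong" partition for their key
    .through("user-repartition")   // manual repartition before the stateful step
    .transform(MyTransformer::new) // stateful computation; no auto-repartition here
    .to("output-topic");
```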
[jira] [Updated] (KAFKA-8611) Make topic optional when using through() operations in DSL
[ https://issues.apache.org/jira/browse/KAFKA-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8611: --- Labels: kip (was: needs-kip) > Make topic optional when using through() operations in DSL > -- > > Key: KAFKA-8611 > URL: https://issues.apache.org/jira/browse/KAFKA-8611 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Levani Kokhreidze >Assignee: Levani Kokhreidze >Priority: Minor > Labels: kip > > When using DSL in Kafka Streams, data re-partition happens only when > key-changing operation is followed by stateful operation. On the other hand, > in DSL, stateful computation can happen using _transform()_ operation as > well. Problem with this approach is that, even if any upstream operation was > key-changing before calling _transform()_, no auto-repartition is triggered. > If repartitioning is required, a call to _through(String)_ should be > performed before _transform()_. With the current implementation, burden of > managing and creating the topic falls on user and introduces extra complexity > of managing Kafka Streams application. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8522) Tombstones can survive forever
[ https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877341#comment-16877341 ]

Jun Rao commented on KAFKA-8522:
--------------------------------

[~dhruvilshah] mentioned that the original timestamp of the tombstone record could be useful for time-based retention and timestamp-to-offset translation. So, it may not be ideal to modify the timestamp in the tombstone record itself. We probably have to store the cleaning timestamp somewhere else (e.g. a record header).

> Tombstones can survive forever
> --
> Key: KAFKA-8522
> URL: https://issues.apache.org/jira/browse/KAFKA-8522
> Project: Kafka
> Issue Type: Bug
> Components: log cleaner
> Reporter: Evelyn Bayes
> Priority: Minor
>
> This is a bit of a grey zone as to whether it's a "bug", but it is certainly unintended behaviour.
> Under specific conditions tombstones effectively survive forever:
> * A small amount of throughput;
> * min.cleanable.dirty.ratio near or at 0; and
> * other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest segment. Old records get compacted away, but the new records continuously update the timestamp of the oldest segment, resetting the countdown for deleting tombstones.
> So tombstones build up in the oldest segment forever.
> While you could "fix" this by reducing the segment size, this can be undesirable, as a sudden change in throughput could cause a dangerous number of segments to be created.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
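[Editor's note] Concretely, the interacting knobs are topic-level configs. The values below are the broker defaults and are shown only to make the interaction explicit, not as a recommended fix; the pathology above arises when the dirty ratio is pushed toward 0 while throughput is low, so the tombstone grace period keeps being measured against a segment whose timestamp keeps moving forward:

```properties
# Topic-level settings involved in the behaviour described above
# (values shown are the defaults, for reference only):
cleanup.policy=compact
min.cleanable.dirty.ratio=0.5      # at/near 0 the cleaner re-cleans constantly
delete.retention.ms=86400000       # tombstone grace period after cleaning (24h)
segment.bytes=1073741824           # smaller segments "fix" it, with the caveat noted above
```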
[jira] [Commented] (KAFKA-8584) Allow "bytes" type to generate a ByteBuffer rather than byte arrays
[ https://issues.apache.org/jira/browse/KAFKA-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877315#comment-16877315 ]

Steven b commented on KAFKA-8584:
---------------------------------

[~suryateja...@gmail.com] - any tips on code paths to look into? I'm new and wanted to see what the codebase is like :)

> Allow "bytes" type to generate a ByteBuffer rather than byte arrays
>
> Key: KAFKA-8584
> URL: https://issues.apache.org/jira/browse/KAFKA-8584
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Guozhang Wang
> Assignee: SuryaTeja Duggi
> Priority: Major
> Labels: newbie
>
> Right now in the RPC definition, type {{bytes}} is translated into {{byte[]}} in generated Java code. However, for some requests like ProduceRequest#partitionData, the underlying type would be better as a ByteBuffer rather than a byte array.
> One proposal is to add an additional boolean tag {{useByteBuffer}} for the {{bytes}} type, which by default is false; when set to {{true}}, the corresponding field generates {{ByteBuffer}} instead of {{byte[]}}.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
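[Editor's note] The RPC definitions in question are the JSON message specs under `clients/src/main/resources/common/message/`. The snippet below is a hypothetical field entry showing where the proposed `useByteBuffer` tag would sit; the tag does not exist in the spec at the time of this ticket, and the field name and `about` text are illustrative:

```json
{
  "name": "Records",
  "type": "bytes",
  "versions": "0+",
  "useByteBuffer": true,
  "about": "Record data; the generated accessor would expose a ByteBuffer rather than byte[]."
}
```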
[jira] [Commented] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.
[ https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877309#comment-16877309 ] Yanyun Shi commented on KAFKA-6582: --- The Jira status is still Open. Is the issue indeed resolved in 2.1.1, or is it actually considered to be open? Thanks. > Partitions get underreplicated, with a single ISR, and doesn't recover. Other > brokers do not take over and we need to manually restart the broker. > -- > > Key: KAFKA-6582 > URL: https://issues.apache.org/jira/browse/KAFKA-6582 > Project: Kafka > Issue Type: Bug > Components: network >Affects Versions: 1.0.0 > Environment: Ubuntu 16.04 > Linux kafka04 4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 > x86_64 x86_64 x86_64 GNU/Linux > java version "9.0.1" > Java(TM) SE Runtime Environment (build 9.0.1+11) > Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode) > but also tried with the latest JVM 8 before with the same result. >Reporter: Jurriaan Pruis >Priority: Major > Attachments: Screenshot 2019-01-18 at 13.08.17.png, Screenshot > 2019-01-18 at 13.16.59.png > > > Partitions get underreplicated, with a single ISR, and doesn't recover. Other > brokers do not take over and we need to manually restart the 'single ISR' > broker (if you describe the partitions of replicated topic it is clear that > some partitions are only in sync on this broker). > This bug resembles KAFKA-4477 a lot, but since that issue is marked as > resolved this is probably something else but similar. > We have the same issue (or at least it looks pretty similar) on Kafka 1.0. > Since upgrading to Kafka 1.0 in November 2017 we've had these issues (we've > upgraded from Kafka 0.10.2.1). > This happens almost every 24-48 hours on a random broker. This is why we > currently have a cronjob which restarts every broker every 24 hours. 
> During this issue the ISR shows the following server log: > {code:java} > [2018-02-20 12:02:08,342] WARN Attempting to send response via channel for > which there is no open connection, connection id > 10.132.0.32:9092-10.14.148.20:56352-96708 (kafka.network.Processor) > [2018-02-20 12:02:08,364] WARN Attempting to send response via channel for > which there is no open connection, connection id > 10.132.0.32:9092-10.14.150.25:54412-96715 (kafka.network.Processor) > [2018-02-20 12:02:08,349] WARN Attempting to send response via channel for > which there is no open connection, connection id > 10.132.0.32:9092-10.14.149.18:35182-96705 (kafka.network.Processor) > [2018-02-20 12:02:08,379] WARN Attempting to send response via channel for > which there is no open connection, connection id > 10.132.0.32:9092-10.14.150.25:54456-96717 (kafka.network.Processor) > [2018-02-20 12:02:08,448] WARN Attempting to send response via channel for > which there is no open connection, connection id > 10.132.0.32:9092-10.14.159.20:36388-96720 (kafka.network.Processor) > [2018-02-20 12:02:08,683] WARN Attempting to send response via channel for > which there is no open connection, connection id > 10.132.0.32:9092-10.14.157.110:41922-96740 (kafka.network.Processor) > {code} > Also on the ISR broker, the controller log shows this: > {code:java} > [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-3-send-thread]: > Controller 3 connected to 10.132.0.32:9092 (id: 3 rack: null) for sending > state change requests (kafka.controller.RequestSendThread) > [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-0-send-thread]: > Controller 3 connected to 10.132.0.10:9092 (id: 0 rack: null) for sending > state change requests (kafka.controller.RequestSendThread) > [2018-02-20 12:02:14,928] INFO [Controller-3-to-broker-1-send-thread]: > Controller 3 connected to 10.132.0.12:9092 (id: 1 rack: null) for sending > state change requests (kafka.controller.RequestSendThread){code} > And the non-ISR 
brokers show these kind of errors: > > {code:java} > 2018-02-20 12:02:29,204] WARN [ReplicaFetcher replicaId=1, leaderId=3, > fetcherId=0] Error in fetch to broker 3, request (type=FetchRequest, > replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, > fetchData={..}, isolationLevel=READ_UNCOMMITTED) > (kafka.server.ReplicaFetcherThread) > java.io.IOException: Connection to 3 was disconnected before the response was > read > at > org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:95) > at > kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96) > at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:205) > at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala
[jira] [Commented] (KAFKA-7504) Broker performance degradation caused by call of sendfile reading disk in network thread
[ https://issues.apache.org/jira/browse/KAFKA-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877175#comment-16877175 ] Allen Wang commented on KAFKA-7504: --- [~junrao] I tried the configuration you suggested but did not see much difference. (1) I made sure that there are multiple segments for each partition and enough data (> 1TB) on each broker. (2) All client requests went to the same listener and num.network.threads is changed to 1. Once the lagging consumer started, 99% Produce response send time increased from 5ms to 35ms. 99% FetchConsumer response send time increased from 10ms to 600ms. > Broker performance degradation caused by call of sendfile reading disk in > network thread > > > Key: KAFKA-7504 > URL: https://issues.apache.org/jira/browse/KAFKA-7504 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 0.10.2.1 >Reporter: Yuto Kawamura >Assignee: Yuto Kawamura >Priority: Major > Labels: latency, performance > Attachments: Network_Request_Idle_After_Patch.png, > Network_Request_Idle_Per_Before_Patch.png, Response_Times_After_Patch.png, > Response_Times_Before_Patch.png, image-2018-10-14-14-18-38-149.png, > image-2018-10-14-14-18-57-429.png, image-2018-10-14-14-19-17-395.png, > image-2018-10-14-14-19-27-059.png, image-2018-10-14-14-19-41-397.png, > image-2018-10-14-14-19-51-823.png, image-2018-10-14-14-20-09-822.png, > image-2018-10-14-14-20-19-217.png, image-2018-10-14-14-20-33-500.png, > image-2018-10-14-14-20-46-566.png, image-2018-10-14-14-20-57-233.png > > > h2. Environment > OS: CentOS6 > Kernel version: 2.6.32-XX > Kafka version: 0.10.2.1, 0.11.1.2 (but reproduces with latest build from > trunk (2.2.0-SNAPSHOT) > h2. Phenomenon > Response time of Produce request (99th ~ 99.9th %ile) degrading to 50x ~ 100x > more than usual. > Normally 99th %ile is lower than 20ms, but when this issue occurs it marks > 50ms to 200ms. 
> At the same time we could see two more things in metrics:
> 1. Disk reads coinciding on the volume assigned to log.dirs.
> 2. A rise in network thread utilization (by `kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent`).
> As we didn't see an increase of requests in metrics, we suspected blocking in the event loop run by the network thread as the cause of the rising network thread utilization.
> Reading through the Kafka broker source code, we understand that the only disk IO performed in the network thread is reading log data through calls to sendfile(2) (via FileChannel#transferTo).
> To prove that the calls to sendfile(2) block the network thread for some moments, I ran the following SystemTap script to inspect the duration of sendfile syscalls.
> {code:java}
> # SystemTap script to measure syscall duration
> global s
> global records
> probe syscall.$1 {
>   s[tid()] = gettimeofday_us()
> }
> probe syscall.$1.return {
>   elapsed = gettimeofday_us() - s[tid()]
>   delete s[tid()]
>   records <<< elapsed
> }
> probe end {
>   print(@hist_log(records))
> }
> {code}
> {code:java}
> $ stap -v syscall-duration.stp sendfile
> # value (us)
> value | count
>     0 |          0
>     1 |         71
>     2 |@@@    6171
>    16 |@@@   29472
>    32 |@@@    3418
>  2048 |          0
>   ...
>  8192 |          3
> {code}
> As you can see, there were some cases taking more than a few milliseconds, which implies that sendfile blocks the network thread for that long, applying the same latency to all other request/response processing.
> h2. Hypothesis
> Gathering the above observations, I made the following hypothesis.
> Let's say network-thread-1 is multiplexing 3 connections:
> - producer-A
> - follower-B (broker replica fetch)
> - consumer-C
> The broker receives requests from each of those clients: [Produce, FetchFollower, FetchConsumer].
> They are processed well by request handler threads, and now the response queue of the network thread contains 3 responses in the following order: [FetchConsumer, Produce, FetchFollower].
> network-thread-1 takes the 3 responses and processes them sequentially
> ([https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/network/SocketServer.scala#L632]).
> Ideally, processing these 3 responses completes in microseconds, as it just copies ready responses into the client sockets' buffers in a non-blocking manner.
> However, Kafka uses sendfile(2) for transferring log data to client sockets. The
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877136#comment-16877136 ]

Di Campo edited comment on KAFKA-5998 at 7/2/19 4:52 PM:
---------------------------------------------------------

Just in case it helps: I hit this again today on 2.1.1 (I commented here some months ago). A 5-broker cluster, 3 Kafka Streams instances (`num.stream.threads=2` each). AMZN Linux, Docker on ECS.

I've seen that, before the task dies, it prints the following WARNs from one task. Please note that of the 64 partitions, only a few of them fail, starting at 13:17, and the same batch of the same partitions starts failing again at 13:42. Why are the same partitions failing? Does it match your findings?

{{[2019-07-02 13:17:01,101] WARN task [2_31] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:17:01,118] WARN task [2_47] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:17:01,156] WARN task [2_27] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:12,360] WARN task [2_63] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:12,579] WARN task [2_35] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:13,001] WARN task [2_23] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:23:18,421] WARN task [2_39] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:23:18,613] WARN task [2_55] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:42:46,366] WARN task [2_31] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:42:46,473] WARN task [2_47] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:42:46,639] WARN task [2_27] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:19,888] WARN task [2_63] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:20,042] WARN task [2_35] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:20,380] WARN task [2_55] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:20,384] WARN task [2_23] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:48:07,011] WARN task [2_39] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}

The application died some minutes later, at 13:59:13. In case there is a relation, it was killed due to OOM.
[jira] [Commented] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877136#comment-16877136 ] Di Campo commented on KAFKA-5998: -

Just in case it helps: I hit this again today on 2.1.1 (I commented here some months ago). Setup: a 5-broker cluster, 3 Kafka Streams instances (`num.stream.threads=2` each), Amazon Linux, Docker on ECS. Before the task dies, it prints the WARNs below from one task. Note that of the 64 partitions, only a few fail, starting at 13:17, and the same batch of the same partitions starts failing again at 13:42. Why are the same partitions failing? Does it match with your findings?

{{[2019-07-02 13:17:01,101] WARN task [2_31] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:17:01,118] WARN task [2_47] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:17:01,156] WARN task [2_27] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:12,360] WARN task [2_63] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:12,579] WARN task [2_35] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:13,001] WARN task [2_23] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:23:18,421] WARN task [2_39] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:23:18,613] WARN task [2_55] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:42:46,366] WARN task [2_31] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:42:46,473] WARN task [2_47] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:42:46,639] WARN task [2_27] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:19,888] WARN task [2_63] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:20,042] WARN task [2_35] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:20,380] WARN task [2_55] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:46:20,384] WARN task [2_23] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:48:07,011] WARN task [2_39] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}

The application died some minutes later, at 13:59:13. In case there is a relation, it was killed due to OOM.

> /.checkpoint.tmp Not found exception
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
> Reporter: Yogesh BG
> Assignee: Bill Bejeck
> Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, props.txt, streams.txt
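Judging by the stack traces elsewhere in this thread (`.checkpoint.tmp (No such file or directory)` thrown from `OffsetCheckpoint.write`), the checkpoint appears to be written to a temporary file and then renamed into place. A minimal sketch of that write-temp-then-rename pattern, with hypothetical names (this is not Kafka's actual `OffsetCheckpoint` code); it shows why a concurrently deleted task directory makes the `.tmp` write fail:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;

public class CheckpointWriteSketch {
    // Write offsets to <taskDir>/.checkpoint "atomically": write a .tmp file
    // first, then rename it over the final name. If taskDir was removed in
    // the meantime (e.g. by a rebalance cleanup), opening the .tmp file fails
    // with NoSuchFileException -- analogous to the warnings in the logs above.
    public static void writeCheckpoint(Path taskDir, Map<String, Long> offsets) throws java.io.IOException {
        Path tmp = taskDir.resolve(".checkpoint.tmp");
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        Files.write(tmp, sb.toString().getBytes());   // fails if taskDir is gone
        // Single rename, so readers never observe a half-written .checkpoint.
        Files.move(tmp, taskDir.resolve(".checkpoint"), StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws java.io.IOException {
        Path dir = Files.createTempDirectory("task-2_31");
        writeCheckpoint(dir, Map.of("changelog-0", 42L));
        System.out.println(Files.readAllLines(dir.resolve(".checkpoint")));
    }
}
```

The rename is what makes a crash mid-write safe: either the old checkpoint or the complete new one is visible, never a torn file.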
[jira] [Commented] (KAFKA-8560) The Kafka protocol generator should support common structures
[ https://issues.apache.org/jira/browse/KAFKA-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877132#comment-16877132 ] ASF GitHub Bot commented on KAFKA-8560: --- gwenshap commented on pull request #6966: KAFKA-8560. The Kafka protocol generator should support common structures URL: https://github.com/apache/kafka/pull/6966

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> The Kafka protocol generator should support common structures
>
> Key: KAFKA-8560
> URL: https://issues.apache.org/jira/browse/KAFKA-8560
> Project: Kafka
> Issue Type: Improvement
> Reporter: Colin P. McCabe
> Assignee: Colin P. McCabe
> Priority: Major
> Fix For: 2.4.0
>
> The Kafka protocol generator should support common structures. This would make things simpler in cases where we need to refer to a single structure from multiple places in a message.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8560) The Kafka protocol generator should support common structures
[ https://issues.apache.org/jira/browse/KAFKA-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwen Shapira resolved KAFKA-8560.
Resolution: Fixed
Fix Version/s: 2.4.0

> The Kafka protocol generator should support common structures
>
> Key: KAFKA-8560
> URL: https://issues.apache.org/jira/browse/KAFKA-8560
> Project: Kafka
> Issue Type: Improvement
> Reporter: Colin P. McCabe
> Assignee: Colin P. McCabe
> Priority: Major
> Fix For: 2.4.0
>
> The Kafka protocol generator should support common structures. This would make things simpler in cases where we need to refer to a single structure from multiple places in a message.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
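For context, the protocol generator consumes JSON message definitions, and this change lets several fields reference one shared struct instead of repeating it inline. A hypothetical sketch of what such a definition could look like (the message name, fields, and apiKey here are illustrative, not a real Kafka message; consult the actual schema files under clients/src/main/resources/common/message for the authoritative format):

```json
{
  "apiKey": 99,
  "type": "response",
  "name": "ExampleResponse",
  "validVersions": "0",
  "fields": [
    { "name": "Alpha", "type": "ExampleStruct", "versions": "0+" },
    { "name": "Beta", "type": "ExampleStruct", "versions": "0+" }
  ],
  "commonStructs": [
    { "name": "ExampleStruct", "versions": "0+",
      "fields": [
        { "name": "ErrorCode", "type": "int16", "versions": "0+" }
      ] }
  ]
}
```

Without a common-structure mechanism, `Alpha` and `Beta` would each need their own inline copy of the struct's fields, and the generator would emit two identical classes.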
[jira] [Comment Edited] (KAFKA-3410) Unclean leader election and "Halting because log truncation is not allowed"
[ https://issues.apache.org/jira/browse/KAFKA-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877122#comment-16877122 ] sandeep gupta edited comment on KAFKA-3410 at 7/2/19 4:37 PM: --

I also encountered the same issue. Is there a solution that does not lose any data?

In our Kafka-based network there are 4 Kafka brokers and 3 ZooKeepers running as Docker containers; we have 3 channels and one orderer system channel, testchainid. Everything was working fine until Sunday, when we started seeing errors in the ordering service during invocations. We then restarted the ZooKeeper services and restarted kafka0, kafka1, kafka2 and kafka3 in that order, with a 10-second gap after each restart. We have used this procedure (roughly every 3 weeks) whenever we hit this issue. This time, however, kafka2 and kafka1 shut down after the restart, and the logs showed this error on broker2 (and the same on broker1):

*FATAL [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Exiting because log truncation is not allowed for partition testchainid-0, current leader's latest offset 96672 is less than replica's latest offset 96674 (kafka.server.ReplicaFetcherThread)*

Broker0 is the leader for two channels, ort and testchainid; the rest of the channels are on the other brokers. When we stopped the kafka0 broker and then restarted kafka1, kafka2 and kafka3, kafka1 and kafka2 did not shut down. So the problem is: as soon as I restart the kafka0 broker, kafka1 and kafka2 shut down immediately.

Now, with kafka0 stopped and kafka1, kafka2 and kafka3 running, I can invoke on the other channels, but I see this error in the orderer logs for the ort channel:

*[orderer/consensus/kafka] processMessagesToBlocks -> ERRO 3a0123d [channel: ort] Error during consumption: kafka: error while consuming ort/0: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes.*

I am using docker-compose to start the ZooKeeper and Kafka brokers. Below is the docker-compose definition for one ZooKeeper and one Kafka broker:

zookeeper0:
  container_name: zookeeper0
  image: hyperledger/fabric-zookeeper:latest
  dns_search: .
  ports:
    - 2181:2181
    - 2888:2888
    - 3888:3888
  environment:
    - ZOO_MY_ID=1
    - ZOO_SERVERS=server.1=zookeeper0:2888:3888 server.2=zookeeper1:2888:3888 server.3=zookeeper2:2888:3888
  networks:
    - fabric-ca
  volumes:
    - ./hosts/zookeeper0hosts/hosts:/etc/hosts

kafka0:
  container_name: kafka0
  image: hyperledger/fabric-kafka:latest
  dns_search: .
  environment:
    - KAFKA_MESSAGE_MAX_BYTES=103809024 # 99 * 1024 * 1024 B
    - KAFKA_REPLICA_FETCH_MAX_BYTES=103809024 # 99 * 1024 * 1024 B
    - KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE=false
    - KAFKA_BROKER_ID=0
    - KAFKA_HOST_NAME=kafka0
    - KAFKA_LISTENERS=EXTERNAL://0.0.0.0:9092,REPLICATION://0.0.0.0:9093
    - KAFKA_ADVERTISED_LISTENERS=EXTERNAL://10.64.67.212:9092,REPLICATION://kafka0:9093
    - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=EXTERNAL:PLAINTEXT,REPLICATION:PLAINTEXT
    - KAFKA_INTER_BROKER_LISTENER_NAME=REPLICATION
    - KAFKA_MIN_INSYNC_REPLICAS=2
    - KAFKA_DEFAULT_REPLICATION_FACTOR=3
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper0:2181,zookeeper1:2181,zookeeper2:2181
  ports:
    - 9092:9092
    - 9093:9093
  networks:
    - fabric-ca
  volumes:
    - ./hosts/kafka0hosts/hosts:/etc/hosts

Broker logs after restart:
Broker0 - [https://hastebin.com/zavocatace.sql]
Broker1 - [https://hastebin.com/latojedemu.sql]
Broker2 - [https://hastebin.com/poxudijepi.sql]
Broker3 - [https://hastebin.com/doliqohufa.sql]
[jira] [Commented] (KAFKA-3410) Unclean leader election and "Halting because log truncation is not allowed"
[ https://issues.apache.org/jira/browse/KAFKA-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877122#comment-16877122 ] sandeep gupta commented on KAFKA-3410: --

I also encountered the same issue. Is there a solution that does not lose any data?

In our Kafka-based network there are 4 Kafka brokers and 3 ZooKeepers running as Docker containers; we have 3 channels and one orderer system channel, testchainid. Everything was working fine until Sunday, when we started seeing errors in the ordering service during invocations. We then restarted the ZooKeeper services and restarted kafka0, kafka1, kafka2 and kafka3 in that order, with a 10-second gap after each restart. We have used this procedure (roughly every 3 weeks) whenever we hit this issue. This time, however, kafka2 and kafka1 shut down after the restart, and the logs showed this error on broker2 (and the same on broker1):

*FATAL [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Exiting because log truncation is not allowed for partition testchainid-0, current leader's latest offset 96672 is less than replica's latest offset 96674 (kafka.server.ReplicaFetcherThread)*

Broker0 is the leader for two channels, ort and testchainid; the rest of the channels are on the other brokers. When we stopped the kafka0 broker and then restarted kafka1, kafka2 and kafka3, kafka1 and kafka2 did not shut down. So the problem is: as soon as I restart the kafka0 broker, kafka1 and kafka2 shut down immediately.

Now, with kafka0 stopped and kafka1, kafka2 and kafka3 running, I can invoke on the other channels, but I see this error in the orderer logs for the ort channel:

*[orderer/consensus/kafka] processMessagesToBlocks -> ERRO 3a0123d [channel: ort] Error during consumption: kafka: error while consuming ort/0: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes.*

> Unclean leader election and "Halting because log truncation is not allowed"
>
> Key: KAFKA-3410
> URL: https://issues.apache.org/jira/browse/KAFKA-3410
> Project: Kafka
> Issue Type: Bug
> Components: replication
> Reporter: James Cheng
> Priority: Major
> Labels: reliability
>
> I ran into a scenario where one of my brokers would continually shut down, with the error message:
> [2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not allowed for topic test, Current leader 1's latest offset 0 is less than replica 2's latest offset 151 (kafka.server.ReplicaFetcherThread)
> I managed to reproduce it with the following scenario:
> 1. Start broker1, with unclean.leader.election.enable=false
> 2. Start broker2, with unclean.leader.election.enable=false
> 3. Create a topic, single partition, with replication-factor 2.
> 4. Write data to the topic.
> 5. At this point, both brokers are in the ISR. Broker1 is the partition leader.
> 6. Ctrl-Z on broker2. (Simulates a GC pause or a slow network.) Broker2 gets dropped out of the ISR. Broker1 is still the leader. I can still write data to the partition.
> 7. Shut down broker1. Hard or controlled, doesn't matter.
> 8. rm -rf the log directory of broker1. (This simulates a disk replacement or full hardware replacement.)
> 9. Resume broker2. It attempts to connect to broker1, but doesn't succeed because broker1 is down. At this point, the partition is offline. Can't write to it.
> 10. Resume broker1. Broker1 resumes leadership of the topic. Broker2 attempts to join the ISR, and immediately halts with the error message:
> [2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not allowed for topic test, Current leader 1's latest offset 0 is less than replica 2's latest offset 151 (kafka.server.ReplicaFetcherThread)
> I am able to recover by setting unclean.leader.election.enable=true on my brokers.
> I'm trying to understand a couple things:
> * In step 10, why is broker1 allowed to resume leadership even though it has no data?
> * In step 10, why is it necessary to stop the entire broker due to one partition that is in this state? Wouldn't it be possible for the broker to continue to serve traffic for all the other topics, and just mark this one as unavailable?
> * Would it make sense to allow an operator to manually specify which broker they want to become the new master? This would give me more control over how much data loss I am willing to handle. In this case, I would w
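The FATAL messages quoted in this thread suggest the replica fetcher applies a guard of roughly this shape: with unclean leader election disabled, a follower whose log end offset is ahead of the leader's refuses to truncate and halts instead. A simplified sketch of that decision, using the offsets from the report above (names and structure are illustrative, not Kafka's actual ReplicaFetcherThread code):

```java
public class TruncationGuardSketch {
    enum Action { TRUNCATE, HALT }

    // With unclean leader election disabled, a follower that is ahead of the
    // leader cannot reconcile without discarding possibly-committed records,
    // so it halts (the FATAL in the reports) rather than truncate its log.
    static Action onLeaderOffset(long leaderLatestOffset,
                                 long replicaLatestOffset,
                                 boolean uncleanElectionEnabled) {
        if (leaderLatestOffset >= replicaLatestOffset || uncleanElectionEnabled) {
            return Action.TRUNCATE; // safe, or explicitly allowed, to follow the leader
        }
        return Action.HALT;         // refuse to silently lose data
    }

    public static void main(String[] args) {
        // Offsets from the report: leader at 96672, replica at 96674.
        System.out.println(onLeaderOffset(96672, 96674, false)); // HALT
        System.out.println(onLeaderOffset(96672, 96674, true));  // TRUNCATE
    }
}
```

This is why flipping unclean.leader.election.enable=true "recovers" the cluster: the follower is then permitted to truncate to the leader's offset, at the cost of losing the records between 96672 and 96674.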
[jira] [Commented] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877105#comment-16877105 ] ASF GitHub Bot commented on KAFKA-5998: --- vvcephei commented on pull request #7027: KAFKA-5998: fix checkpointableOffsets handling URL: https://github.com/apache/kafka/pull/7027

* Each task should write only the checkpoint offsets for the changelog partitions that it owns.
* Check, both upon loading checkpointableOffsets and immediately upon writing them, that the task actually owns the partitions in question. This will prevent checkpoint-file corruption in the future, and also help people clear out their corrupted checkpoint files so that they can be rebuilt correctly.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> /.checkpoint.tmp Not found exception
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
> Reporter: Yogesh BG
> Assignee: Bill Bejeck
> Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, props.txt, streams.txt
>
> I have one Kafka broker and one Kafka Streams instance running... I have been running them for two days under a load of around 2500 msgs per second. On the third day I am getting the exception below for some of the partitions. I have 16 partitions; only 0_0 and 0_1 give this error:
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.<init>(FileOutputStream.java:221) ~[na:1.7.0_111]
> at java.io.FileOutputStream.<init>(FileOutputStream.java:171) ~[na:1.7.0_111]
> at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at org.apache.kafka.streams.processor.internals.Strea
[jira] [Commented] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877103#comment-16877103 ] John Roesler commented on KAFKA-5998: - I've been talking with [~pkleindl] in the above thread, and I think he figured it out. I'm preparing a bugfix PR. It seems that the problem is actually in the way ProcessorStateManager writes the checkpoint file. It dumps out all the restored offsets from the ChangelogReader, but the ChangelogReader is scoped to the thread, not the current task. This causes each task to erroneously write checkpoint offsets for _all_ the stores in _all_ the tasks that happen to be assigned to the same thread. This would explain the warning we see, as well as another strange aspect Patrik reported: the checkpoint file for a particular task actually contains offsets for stores in partitions not owned by that task. The bug should not affect correctness, since upon loading the checkpoint file, stores only pay attention to the offsets for the stores they own, and those are written correctly. I'm proposing a fix and a new assertion:
* The fix: each task should write only the checkpoint offsets for the changelog partitions that it owns.
* The assertion: check, both upon loading checkpointableOffsets and immediately upon writing them, that the task actually owns the partitions in question. This will prevent checkpoint-file corruption in the future, and also help people clear out their corrupted checkpoint files so that they can be rebuilt correctly.
> /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Assignee: Bill Bejeck >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one Kafka broker and one Kafka Streams instance running... They have been running for two days under a load of around 2500 msgs per second. On the third day I started getting the below exception for some of the partitions; I have 16 partitions and only 0_0 and 0_1 give this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.<init>(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.<init>(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.process
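The fix described in the comment above, where each task writes only the checkpoint offsets for the changelog partitions it owns, comes down to filtering a thread-scoped map by a task-scoped set. A minimal sketch under illustrative assumptions (plain String partition names rather than Kafka's TopicPartition; the real change lives in ProcessorStateManager):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hedged sketch of the fix described above: filter the thread-wide restored
// offsets down to the changelog partitions this task actually owns before
// writing the per-task checkpoint file. The names (String partitions,
// restoredOffsets, ownedPartitions) are illustrative, not the real
// Kafka Streams internals.
public class CheckpointFilterSketch {
    public static Map<String, Long> checkpointableOffsets(
            Map<String, Long> restoredOffsets,  // thread-scoped: all tasks on this thread
            Set<String> ownedPartitions) {      // task-scoped: this task's changelogs
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Long> e : restoredOffsets.entrySet()) {
            if (ownedPartitions.contains(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
            // The proposed assertion would flag the else-branch instead of
            // silently writing a foreign task's offset into this checkpoint.
        }
        return result;
    }
}
```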
[jira] [Assigned] (KAFKA-8568) MirrorMaker 2.0 resource leak
[ https://issues.apache.org/jira/browse/KAFKA-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryanne Dolan reassigned KAFKA-8568: --- Assignee: Ryanne Dolan > MirrorMaker 2.0 resource leak > - > > Key: KAFKA-8568 > URL: https://issues.apache.org/jira/browse/KAFKA-8568 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect, mirrormaker >Affects Versions: 2.2.2 >Reporter: Péter Gergő Barna >Assignee: Ryanne Dolan >Priority: Major > > This issue was produced by the KIP-382 branch (I am not sure which version is affected by that branch). > While MirrorMaker 2.0 is running, the following command returns a number that keeps growing larger and larger. > > {noformat} > lsof -p | grep ESTABLISHED | wc -l{noformat} > > Meanwhile, NullPointerExceptions pop up in the error log from MirrorSourceTask.cleanup, because either the consumer or the producer is null when the cleanup method tries to close them. > > {noformat} > Exception in thread "Thread-790" java.lang.NullPointerException > at > org.apache.kafka.connect.mirror.MirrorSourceTask.cleanup(MirrorSourceTask.java:110) > at java.lang.Thread.run(Thread.java:748) > Exception in thread "Thread-792" java.lang.NullPointerException > at > org.apache.kafka.connect.mirror.MirrorSourceTask.cleanup(MirrorSourceTask.java:110) > at java.lang.Thread.run(Thread.java:748) > Exception in thread "Thread-791" java.lang.NullPointerException > at > org.apache.kafka.connect.mirror.MirrorSourceTask.cleanup(MirrorSourceTask.java:116) > at java.lang.Thread.run(Thread.java:748) > Exception in thread "Thread-793" java.lang.NullPointerException > at > org.apache.kafka.connect.mirror.MirrorSourceTask.cleanup(MirrorSourceTask.java:110) > at java.lang.Thread.run(Thread.java:748){noformat} > When the number of established connections (returned by lsof) reaches a certain limit, new exceptions start to pop up in the logs: Too many open files > {noformat} > [2019-06-19 12:56:43,949] ERROR > 
WorkerSourceTask{id=MirrorHeartbeatConnector-0} failed to send record to > heartbeats: {} (org.apache.kafka.connect.runtime.WorkerSourceTask) > org.apache.kafka.common.errors.SaslAuthenticationException: An error: > (java.security.PrivilegedActionException: javax.security.sasl.SaslException: > GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Too many open > files)]) occurred when evaluating SASL token received from the Kafka Broker. > Kafka Client will go to A > UTHENTICATION_FAILED state. > Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Too many open > files)] > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > at > org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.lambda$createSaslToken$1(SaslClientAuthenticator.java:461) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslToken(SaslClientAuthenticator.java:461) > at > org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.sendSaslClientToken(SaslClientAuthenticator.java:370) > at > org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.sendInitialToken(SaslClientAuthenticator.java:290) > at > org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:230) > at > org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:173) > at > org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:536) > at org.apache.kafka.common.network.Selector.poll(Selector.java:472) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:535) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:311) > at > 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:235) > at java.lang.Thread.run(Thread.java:748) > Caused by: GSSException: No valid credentials provided (Mechanism level: Too > many open files) > at > sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:775) > at > sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248) > at > sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) > ... 14 more > Caused by: java.net.SocketException: Too many open files > at java.net.Socket.createImpl(Socket.java:460) >
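The NullPointerExceptions above come from cleanup() closing clients that were never initialized. The usual defensive pattern is a null-guarded, exception-swallowing close; a hedged sketch, independent of MirrorSourceTask's actual fields and close order (names here are illustrative):

```java
// Hedged sketch of a null-safe cleanup, assuming the task holds possibly
// uninitialized AutoCloseable clients (e.g. a consumer and a producer).
// The real MirrorSourceTask fields and close order may differ.
public class CleanupSketch {
    // Closes each non-null client, swallowing individual failures so that
    // one bad close does not leak the remaining clients. Returns how many
    // clients were actually closed.
    public static int closeQuietly(AutoCloseable... clients) {
        int closed = 0;
        for (AutoCloseable c : clients) {
            if (c == null) {
                continue; // never initialized: nothing to release
            }
            try {
                c.close();
                closed++;
            } catch (Exception e) {
                // Log and keep going; swallowing here avoids the
                // "Exception in thread ..." noise seen in the report.
            }
        }
        return closed;
    }
}
```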
[jira] [Commented] (KAFKA-8622) Snappy Compression Not Working
[ https://issues.apache.org/jira/browse/KAFKA-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876898#comment-16876898 ] Kunal Verma commented on KAFKA-8622: [~kaushik srinivas] It worked. Thanks for the quick response. I have some follow-up questions: 1. Even previously I was running the Kafka broker as the *root user*, precisely so that this permission issue would not come up; yet creating a dedicated tmp folder sorted it out. How does this even work? 2. How much space does snappy require in the temp folder? 3. In the case of the snappy error, I had replaced snappy with gzip, and gzip reported no issue. > Snappy Compression Not Working > -- > > Key: KAFKA-8622 > URL: https://issues.apache.org/jira/browse/KAFKA-8622 > Project: Kafka > Issue Type: Bug > Components: compression >Affects Versions: 2.3.0, 2.2.1 >Reporter: Kunal Verma >Assignee: kaushik srinivas >Priority: Major > > I am trying to produce a message on the broker with compression enabled as > snappy. > Environment : > Brokers[Kafka-cluster] are hosted on Centos 7 > I have download the latest version (2.3.0 & 2.2.1) tar, extract it and moved > to /opt/kafka- > I have executed the broker with standard configuration. > In my producer service(written in java), I have enabled snappy compression. 
> props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy"); > > so while sending record on broker, I am getting following errors: > org.apache.kafka.common.errors.UnknownServerException: The server experienced > an unexpected error when processing the request > > While investing further at broker end I got following error in log > > logs/kafkaServer.out:java.lang.UnsatisfiedLinkError: > /tmp/snappy-1.1.7-ecd381af-ffdd-4a5c-a3d8-b802d0fa4e85-libsnappyjava.so: > /tmp/snappy-1.1.7-ecd381af-ffdd-4a5c-a3d8-b802d0fa4e85-libsnappyjava.so: > failed to map segment from shared object: Operation not permitted > -- > > [2019-07-02 15:29:43,399] ERROR [ReplicaManager broker=1] Error processing > append operation on partition test-bulk-1 (kafka.server.ReplicaManager) > java.lang.NoClassDefFoundError: Could not initialize class > org.xerial.snappy.Snappy > at > org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:435) > at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:466) > at java.io.DataInputStream.readByte(DataInputStream.java:265) > at org.apache.kafka.common.utils.ByteUtils.readVarint(ByteUtils.java:168) > at > org.apache.kafka.common.record.DefaultRecord.readFrom(DefaultRecord.java:293) > at > org.apache.kafka.common.record.DefaultRecordBatch$1.readNext(DefaultRecordBatch.java:264) > at > org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:569) > at > org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:538) > at > org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:327) > at > scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:55) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at scala.collection.AbstractIterable.foreach(Iterable.scala:56) > at > 
kafka.log.LogValidator$.$anonfun$validateMessagesAndAssignOffsetsCompressed$1(LogValidator.scala:269) > at > kafka.log.LogValidator$.$anonfun$validateMessagesAndAssignOffsetsCompressed$1$adapted(LogValidator.scala:261) > at scala.collection.Iterator.foreach(Iterator.scala:941) > at scala.collection.Iterator.foreach$(Iterator.scala:941) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at scala.collection.AbstractIterable.foreach(Iterable.scala:56) > at > kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:261) > at > kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:73) > at kafka.log.Log.liftedTree1$1(Log.scala:881) > at kafka.log.Log.$anonfun$append$2(Log.scala:868) > at kafka.log.Log.maybeHandleIOException(Log.scala:2065) > at kafka.log.Log.append(Log.scala:850) > at kafka.log.Log.appendAsLeader(Log.scala:819) > at > kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:772) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253) > at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:259) > at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:759) > at > kafka.server.ReplicaManager.$anonfun$appendToLocalLog$2(ReplicaManager.scala:763) > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) > at scala.collection.mutable.HashMap.$an
[jira] [Commented] (KAFKA-8622) Snappy Compression Not Working
[ https://issues.apache.org/jira/browse/KAFKA-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876874#comment-16876874 ] kaushik srinivas commented on KAFKA-8622: - This can be due to a permission issue with the snappy native library. Try creating a folder with write permissions and adding the flag below in the kafka-run-class.sh script to point to the newly created directory. -Dorg.xerial.snappy.tempdir=/home/cloud-user/GENERIC_FRAMEWORK/kaushik/confluent-5.1.2/tmp It should work fine. Let me know the test results. > Snappy Compression Not Working > -- > > Key: KAFKA-8622 > URL: https://issues.apache.org/jira/browse/KAFKA-8622 > Project: Kafka > Issue Type: Bug > Components: compression >Affects Versions: 2.3.0, 2.2.1 >Reporter: Kunal Verma >Assignee: kaushik srinivas >Priority: Major > > I am trying to produce a message on the broker with compression enabled as > snappy. > Environment : > Brokers[Kafka-cluster] are hosted on Centos 7 > I have download the latest version (2.3.0 & 2.2.1) tar, extract it and moved > to /opt/kafka- > I have executed the broker with standard configuration. > In my producer service(written in java), I have enabled snappy compression. 
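For reference, the workaround relies on snappy-java's org.xerial.snappy.tempdir system property, which controls where the native libsnappyjava.so is extracted before being loaded; "failed to map segment from shared object: Operation not permitted" typically indicates the default /tmp is mounted noexec, which even root cannot bypass. A minimal sketch of setting the property programmatically, assuming /opt/kafka/tmp is a writable directory on an exec-allowed filesystem (the path is an assumption, not a recommendation):

```java
// Hedged sketch: point snappy-java's native-library extraction at a
// directory that is writable and not mounted noexec. The property must be
// set before org.xerial.snappy.Snappy is first loaded; /opt/kafka/tmp is
// an assumed example path.
public class SnappyTempDirSketch {
    public static String configure(String dir) {
        System.setProperty("org.xerial.snappy.tempdir", dir);
        return System.getProperty("org.xerial.snappy.tempdir");
    }
}
```

Equivalently, the same property can be passed as -Dorg.xerial.snappy.tempdir=... via kafka-run-class.sh, as suggested in the comment.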
[jira] [Assigned] (KAFKA-8622) Snappy Compression Not Working
[ https://issues.apache.org/jira/browse/KAFKA-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kaushik srinivas reassigned KAFKA-8622: --- Assignee: kaushik srinivas > Snappy Compression Not Working > -- > > Key: KAFKA-8622 > URL: https://issues.apache.org/jira/browse/KAFKA-8622 > Project: Kafka > Issue Type: Bug > Components: compression >Affects Versions: 2.3.0, 2.2.1 >Reporter: Kunal Verma >Assignee: kaushik srinivas >Priority: Major
[jira] [Created] (KAFKA-8623) KafkaProducer possible deadlock when sending to different topics
Alexander Bagiev created KAFKA-8623: --- Summary: KafkaProducer possible deadlock when sending to different topics Key: KAFKA-8623 URL: https://issues.apache.org/jira/browse/KAFKA-8623 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 2.2.1 Reporter: Alexander Bagiev Project with bug reproduction: [https://github.com/abagiev/kafka-producer-bug] It was found that sending two messages to two different topics in a row results in the KafkaProducer hanging for 60s and the following exception: {noformat} org.springframework.kafka.core.KafkaProducerException: Failed to send; nested exception is org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms. at org.springframework.kafka.core.KafkaTemplate.lambda$buildCallback$0(KafkaTemplate.java:405) ~[spring-kafka-2.2.7.RELEASE.jar:2.2.7.RELEASE] at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:877) ~[kafka-clients-2.0.1.jar:na] at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:803) ~[kafka-clients-2.0.1.jar:na] at org.springframework.kafka.core.DefaultKafkaProducerFactory$CloseSafeProducer.send(DefaultKafkaProducerFactory.java:444) ~[spring-kafka-2.2.7.RELEASE.jar:2.2.7.RELEASE] at org.springframework.kafka.core.KafkaTemplate.doSend(KafkaTemplate.java:381) ~[spring-kafka-2.2.7.RELEASE.jar:2.2.7.RELEASE] at org.springframework.kafka.core.KafkaTemplate.send(KafkaTemplate.java:193) ~[spring-kafka-2.2.7.RELEASE.jar:2.2.7.RELEASE] ... {noformat} It looks like KafkaProducer requests metadata twice for each topic and hangs just before the second request due to some deadlock. Once the 60s pass, a TimeoutException is thrown and the metadata is requested/received immediately (but only after the exception has already been thrown). The issue in the example project is reproduced every time, and the use case is trivial. This is a critical bug for us. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
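The 60s hang matches the producer's max.block.ms default (60000 ms), which bounds how long send() blocks waiting for topic metadata before throwing the TimeoutException shown above. A hedged sketch of a producer configuration that fails fast while reproducing the issue (the 5000 ms value is purely illustrative, not a recommendation):

```java
import java.util.Properties;

// Hedged sketch: the 60 s hang corresponds to max.block.ms (default
// 60000 ms), the bound on how long KafkaProducer.send() blocks waiting
// for metadata. Lowering it makes the TimeoutException surface quickly
// while debugging a reproduction like the linked project.
public class ProducerTimeoutSketch {
    public static Properties props(String bootstrapServers) {
        Properties p = new Properties();
        p.put("bootstrap.servers", bootstrapServers);
        p.put("max.block.ms", "5000"); // illustrative fail-fast value; default is 60000
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return p;
    }
}
```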
[jira] [Commented] (KAFKA-5876) IQ should throw different exceptions for different errors
[ https://issues.apache.org/jira/browse/KAFKA-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876836#comment-16876836 ] ASF GitHub Bot commented on KAFKA-5876: --- vitojeng commented on pull request #7026: KAFKA-5876: [WIP]IQ should throw different exceptions for different errors URL: https://github.com/apache/kafka/pull/7026 This PR is used for KIP-216 discussion: [KIP-216: IQ should throw different exceptions for different errors](https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > IQ should throw different exceptions for different errors > - > > Key: KAFKA-5876 > URL: https://issues.apache.org/jira/browse/KAFKA-5876 > Project: Kafka > Issue Type: Task > Components: streams >Reporter: Matthias J. Sax >Assignee: Vito Jeng >Priority: Major > Labels: needs-kip, newbie++ > > Currently, IQ only throws {{InvalidStateStoreException}} for all errors that occur. However, we have different types of errors and should throw different exceptions for those types. > For example, if a store was migrated, it must be rediscovered, while if a store cannot be queried yet because it is still being re-created after a rebalance, the user just needs to wait until store recreation is finished. > There might be other examples, too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
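For context on why distinct exception types matter: with a single InvalidStateStoreException, callers cannot distinguish "store migrated, rediscover it" from "store still restoring, just wait", so they fall back to a blind retry loop. A self-contained sketch of that status quo, using IllegalStateException as a stand-in so the example runs without Kafka on the classpath:

```java
import java.util.concurrent.Callable;

// Hedged sketch of the status quo the ticket criticizes: with only one
// exception type (IllegalStateException stands in for Kafka's
// InvalidStateStoreException here), the caller cannot tell "store
// migrated" from "store still restoring" and must blindly retry.
// Distinct exception types, as KIP-216 proposes, would let the two
// cases be handled differently.
public class IqRetrySketch {
    public static <T> T queryWithRetry(Callable<T> query, int maxAttempts)
            throws Exception {
        Exception last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return query.call();
            } catch (IllegalStateException e) {
                last = e; // could be migration OR rebalance: no way to tell
            }
        }
        throw last; // assumes maxAttempts >= 1
    }
}
```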
[jira] [Updated] (KAFKA-8622) Snappy Compression Not Working
[ https://issues.apache.org/jira/browse/KAFKA-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal updated KAFKA-8622: - Description: I am trying to produce a message on the broker with compression enabled as snappy. Environment : Brokers[Kafka-cluster] are hosted on Centos 7 I have download the latest version (2.3.0 & 2.2.1) tar, extract it and moved to /opt/kafka- I have executed the broker with standard configuration. In my producer service(written in java), I have enabled snappy compression. props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy"); so while sending record on broker, I am getting following errors: org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request While investing further at broker end I got following error in log logs/kafkaServer.out:java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.7-ecd381af-ffdd-4a5c-a3d8-b802d0fa4e85-libsnappyjava.so: /tmp/snappy-1.1.7-ecd381af-ffdd-4a5c-a3d8-b802d0fa4e85-libsnappyjava.so: failed to map segment from shared object: Operation not permitted -- [2019-07-02 15:29:43,399] ERROR [ReplicaManager broker=1] Error processing append operation on partition test-bulk-1 (kafka.server.ReplicaManager) java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:435) at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:466) at java.io.DataInputStream.readByte(DataInputStream.java:265) at org.apache.kafka.common.utils.ByteUtils.readVarint(ByteUtils.java:168) at org.apache.kafka.common.record.DefaultRecord.readFrom(DefaultRecord.java:293) at org.apache.kafka.common.record.DefaultRecordBatch$1.readNext(DefaultRecordBatch.java:264) at org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:569) at 
org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:538) at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:327) at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:55) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at kafka.log.LogValidator$.$anonfun$validateMessagesAndAssignOffsetsCompressed$1(LogValidator.scala:269) at kafka.log.LogValidator$.$anonfun$validateMessagesAndAssignOffsetsCompressed$1$adapted(LogValidator.scala:261) at scala.collection.Iterator.foreach(Iterator.scala:941) at scala.collection.Iterator.foreach$(Iterator.scala:941) at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:261) at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:73) at kafka.log.Log.liftedTree1$1(Log.scala:881) at kafka.log.Log.$anonfun$append$2(Log.scala:868) at kafka.log.Log.maybeHandleIOException(Log.scala:2065) at kafka.log.Log.append(Log.scala:850) at kafka.log.Log.appendAsLeader(Log.scala:819) at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:772) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253) at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:259) at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:759) at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$2(ReplicaManager.scala:763) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149) at 
scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237) at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44) at scala.collection.mutable.HashMap.foreach(HashMap.scala:149) at scala.collection.TraversableLike.map(TraversableLike.scala:237) at scala.collection.TraversableLike.map$(TraversableLike.scala:230) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:751) at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:492) at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:544) at kafka.server.KafkaApis.handle(KafkaApis.scala:113) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69) at java.lang.Thread.run(Thread.java:748) --- I have checked the snappy jar is on the classpath. Please refer Client environment:java.class.path=/opt/
[jira] [Created] (KAFKA-8622) Snappy Compression Not Working
Kunal created KAFKA-8622: Summary: Snappy Compression Not Working Key: KAFKA-8622 URL: https://issues.apache.org/jira/browse/KAFKA-8622 Project: Kafka Issue Type: Bug Components: compression Affects Versions: 2.2.1, 2.3.0 Reporter: Kunal I am trying to produce a message on the broker with compression enabled as snappy. Environment : Brokers[Kafka-cluster] are hosted on CentOS 7 I have downloaded the latest version (2.3.0 & 2.2.1) tar, extracted it and moved it to /opt/kafka- I have executed the broker with the standard configuration. In my producer service (written in Java), I have enabled snappy compression. props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy"); So while sending a record to the broker, I am getting the following errors: org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request While investigating further at the broker end I got the following error in the log: logs/kafkaServer.out:java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.7-ecd381af-ffdd-4a5c-a3d8-b802d0fa4e85-libsnappyjava.so: /tmp/snappy-1.1.7-ecd381af-ffdd-4a5c-a3d8-b802d0fa4e85-libsnappyjava.so: failed to map segment from shared object: Operation not permitted -- [2019-07-02 15:29:43,399] ERROR [ReplicaManager broker=1] Error processing append operation on partition test-bulk-1 (kafka.server.ReplicaManager) java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:435) at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:466) at java.io.DataInputStream.readByte(DataInputStream.java:265) at org.apache.kafka.common.utils.ByteUtils.readVarint(ByteUtils.java:168) at org.apache.kafka.common.record.DefaultRecord.readFrom(DefaultRecord.java:293) at org.apache.kafka.common.record.DefaultRecordBatch$1.readNext(DefaultRecordBatch.java:264) at org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:569) at 
org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:538) at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:327) at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:55) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at kafka.log.LogValidator$.$anonfun$validateMessagesAndAssignOffsetsCompressed$1(LogValidator.scala:269) at kafka.log.LogValidator$.$anonfun$validateMessagesAndAssignOffsetsCompressed$1$adapted(LogValidator.scala:261) at scala.collection.Iterator.foreach(Iterator.scala:941) at scala.collection.Iterator.foreach$(Iterator.scala:941) at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:261) at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:73) at kafka.log.Log.liftedTree1$1(Log.scala:881) at kafka.log.Log.$anonfun$append$2(Log.scala:868) at kafka.log.Log.maybeHandleIOException(Log.scala:2065) at kafka.log.Log.append(Log.scala:850) at kafka.log.Log.appendAsLeader(Log.scala:819) at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:772) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253) at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:259) at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:759) at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$2(ReplicaManager.scala:763) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149) at 
scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237) at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44) at scala.collection.mutable.HashMap.foreach(HashMap.scala:149) at scala.collection.TraversableLike.map(TraversableLike.scala:237) at scala.collection.TraversableLike.map$(TraversableLike.scala:230) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:751) at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:492) at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:544) at kafka.server.KafkaApis.handle(KafkaApis.scala:113) at kafka.server.Kafka
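The "failed to map segment from shared object: Operation not permitted" error usually means the JVM cannot map the extracted snappy native library with execute permission, most commonly because /tmp is mounted with the noexec option. A minimal sketch of a possible workaround follows; the directory path and the extra java.io.tmpdir override are assumptions, while org.xerial.snappy.tempdir is the snappy-java system property that controls where the bundled .so is extracted:

```shell
# Sketch of a workaround, assuming /tmp on the broker host is mounted noexec
# (which forbids mapping executable pages from files under /tmp).
# Check the mount options first:
mount | grep -w '/tmp' || true

# Relocate snappy-java's native-library extraction directory to a
# filesystem that allows exec (the path below is hypothetical):
SNAPPY_TMPDIR="${SNAPPY_TMPDIR:-$HOME/kafka-snappy-tmp}"
mkdir -p "$SNAPPY_TMPDIR"
export KAFKA_OPTS="-Dorg.xerial.snappy.tempdir=$SNAPPY_TMPDIR -Djava.io.tmpdir=$SNAPPY_TMPDIR"
echo "$KAFKA_OPTS"

# Then restart the broker, e.g.:
# bin/kafka-server-start.sh config/server.properties
```

Alternatively, remounting /tmp without noexec would also avoid the error, but relocating the extraction directory is less invasive on a shared host.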
[jira] [Commented] (KAFKA-8555) Flaky test ExampleConnectIntegrationTest#testSourceConnector
[ https://issues.apache.org/jira/browse/KAFKA-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876795#comment-16876795 ] Bruno Cadonna commented on KAFKA-8555:
--
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/23145/
{code:java}
org.apache.kafka.connect.errors.DataException: Insufficient records committed by connector simple-conn in 15000 millis. Records expected=2000, actual=1517
	at org.apache.kafka.connect.integration.ConnectorHandle.awaitCommits(ConnectorHandle.java:188)
	at org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector(ExampleConnectIntegrationTest.java:181)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	...
{code}
log attached: log-job23145.txt

> Flaky test ExampleConnectIntegrationTest#testSourceConnector
>
> Key: KAFKA-8555
> URL: https://issues.apache.org/jira/browse/KAFKA-8555
> Project: Kafka
> Issue Type: Bug
> Reporter: Boyang Chen
> Priority: Major
> Attachments: log-job23145.txt
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22798/console]
> *02:03:21* org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector failed, log available in /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/connect/runtime/build/reports/testOutput/org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector.test.stdout
> *02:03:21* org.apache.kafka.connect.integration.ExampleConnectIntegrationTest > testSourceConnector FAILED
> *02:03:21* org.apache.kafka.connect.errors.DataException: Insufficient records committed by connector simple-conn in 15000 millis. Records expected=2000, actual=1013
> *02:03:21* at org.apache.kafka.connect.integration.ConnectorHandle.awaitCommits(ConnectorHandle.java:188)
> *02:03:21* at org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector(ExampleConnectIntegrationTest.java:181)
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
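The failure mode above is a fixed 15000 ms deadline racing against variable commit throughput (expected=2000 records, but only 1013 or 1517 arrived in time). As an illustration of why that is flaky, here is a generic sketch of the await-with-timeout pattern; all names and the simulated counter are hypothetical, not the actual ConnectorHandle.awaitCommits implementation:

```shell
# Poll a commit counter until it reaches the expected value or a deadline
# passes. If throughput dips, the deadline wins and the test fails even
# though nothing is functionally wrong.
expected=2000
deadline=$(( $(date +%s) + 15 ))   # 15000 millis, as in the test
count=0
while [ "$count" -lt "$expected" ]; do
  if [ "$(date +%s)" -ge "$deadline" ]; then
    echo "Insufficient records committed: expected=$expected, actual=$count" >&2
    exit 1
  fi
  count=$(( count + 500 ))   # stand-in for reading the real commit counter
done
echo "committed=$count"
```

Raising the timeout or scaling it to the record count are the usual remedies for this kind of flakiness.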
[jira] [Updated] (KAFKA-8555) Flaky test ExampleConnectIntegrationTest#testSourceConnector
[ https://issues.apache.org/jira/browse/KAFKA-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno Cadonna updated KAFKA-8555:
-
Attachment: log-job23145.txt

> Flaky test ExampleConnectIntegrationTest#testSourceConnector
>
> Key: KAFKA-8555
> URL: https://issues.apache.org/jira/browse/KAFKA-8555
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-8621) KIP-486: Support for pluggable KeyStore and TrustStore
[ https://issues.apache.org/jira/browse/KAFKA-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Zhou reassigned KAFKA-8621: -- Assignee: Thomas Zhou > KIP-486: Support for pluggable KeyStore and TrustStore > -- > > Key: KAFKA-8621 > URL: https://issues.apache.org/jira/browse/KAFKA-8621 > Project: Kafka > Issue Type: New Feature >Reporter: Maulin Vasavada >Assignee: Thomas Zhou >Priority: Minor > > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-486%3A+Support+for+pluggable+KeyStore+and+TrustStore] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8621) KIP-486: Support for pluggable KeyStore and TrustStore
MAULIN VASAVADA created KAFKA-8621: -- Summary: KIP-486: Support for pluggable KeyStore and TrustStore Key: KAFKA-8621 URL: https://issues.apache.org/jira/browse/KAFKA-8621 Project: Kafka Issue Type: New Feature Reporter: MAULIN VASAVADA [https://cwiki.apache.org/confluence/display/KAFKA/KIP-486%3A+Support+for+pluggable+KeyStore+and+TrustStore] -- This message was sent by Atlassian JIRA (v7.6.3#76005)