[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2017-01-24 Thread Vincent Rischmann (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836543#comment-15836543
 ] 

Vincent Rischmann commented on KAFKA-3894:
--

No, not all logs, just the .log one. If you notice, Kafka computes the 
number of messages from the number in the filename, which is why it reports 
that there are 13042136566 messages in your log, which is almost surely not 
true. At least it wasn't for me.

Basically, the file name is just wrong. Come to think of it, you could maybe 
just rename the file to some arbitrary number where you know the difference 
between the _next_ segment number and _this_ segment's number is something that 
would fit in your dedupe buffer? For example, here your second segment has the 
number _13042136566_; you could rename the .log to 
_13042136566 - 1000000_, and then your offset map only needs to fit 1M offsets, 
which it can do based on your log.

I'm just thinking out loud here; I didn't do this myself, but I think it could 
work, and it would maybe be less risky than just deleting all the data.
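
To make that concrete, here is a back-of-the-envelope sketch of the rename idea 
(the offsets are the ones from the error being discussed; the 20-digit 
zero-padded file name is Kafka's segment naming convention, but the rest is 
purely illustrative, untested tooling):

{code}
// Sketch of the rename trick; the constants come from the error message,
// everything else is illustrative.
object RenameTrick extends App {
  val nextSegmentBase = 13042136566L // base offset of the _next_ segment, from its file name
  val mapCapacity     = 5033164L     // offsets the dedupe map can hold, from the error

  // Pick a new base offset close enough to the next segment that the
  // difference fits in the offset map, e.g. one million offsets back:
  val newBase = nextSegmentBase - 1000000L
  require(nextSegmentBase - newBase <= mapCapacity) // the cleaner's failed check now passes

  // Kafka names segment files with the base offset zero-padded to 20 digits:
  println(f"rename the segment to $newBase%020d.log")
}
{code}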

> LogCleaner thread crashes if not even one segment can fit in the offset map
> ---
>
> Key: KAFKA-3894
> URL: https://issues.apache.org/jira/browse/KAFKA-3894
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.2, 0.9.0.1, 0.10.0.0
> Environment: Oracle JDK 8
> Ubuntu Precise
>Reporter: Tim Carey-Smith
>Assignee: Tom Crayford
>  Labels: compaction
> Fix For: 0.10.1.0
>
>
> The log-cleaner thread can crash if the number of keys in a topic grows to be 
> too large to fit into the dedupe buffer. 
> The result of this is a log line: 
> {quote}
> broker=0 pri=ERROR t=kafka-log-cleaner-thread-0 at=LogCleaner 
> \[kafka-log-cleaner-thread-0\], Error due to  
> java.lang.IllegalArgumentException: requirement failed: 9750860 messages in 
> segment MY_FAVORITE_TOPIC-2/47580165.log but offset map can fit 
> only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease 
> log.cleaner.threads
> {quote}
> As a result, the broker is left in a potentially dangerous situation where 
> cleaning of compacted topics is not running. 
> It is unclear if the broader strategy for the {{LogCleaner}} is the reason 
> for this upper bound, or if this is a value which must be tuned for each 
> specific use-case. 
> Of more immediate concern is the fact that the thread crash is not visible 
> via JMX or exposed as some form of service degradation. 
> Some short-term remediations we have made are:
> * increasing the size of the dedupe buffer
> * monitoring the log-cleaner threads inside the JVM
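
As a minimal sketch of the last remediation above, an in-JVM liveness check 
could look like this (the thread-name prefix is the one visible in the error 
message; the helper itself is hypothetical monitoring code, not a Kafka API):

{code}
// Hypothetical liveness check: scan live threads for the cleaner's name prefix
// and alert when none is found.
import scala.jdk.CollectionConverters._

object LogCleanerCheck {
  def cleanerAlive(): Boolean =
    Thread.getAllStackTraces.keySet.asScala
      .exists(_.getName.startsWith("kafka-log-cleaner-thread"))
}
{code}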



[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2017-01-24 Thread William Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836524#comment-15836524
 ] 

William Yu commented on KAFKA-3894:
---

[~m...@vrischmann.me] Thanks for the quick reply. Do you mean you deleted all 
logs from the partition, or were you just targeting the specific files which 
were throwing the error?

I was thinking of shutting down the brokers & consumers, removing the 
unstable partition, and restarting with `auto.offset.reset=latest` set on my 
consumers.



[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2017-01-24 Thread Vincent Rischmann (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836495#comment-15836495
 ] 

Vincent Rischmann commented on KAFKA-3894:
--

I had the exact same bug, though I didn't realize it at first. What I did was 
simply delete all the 00... files manually with the broker stopped, and then 
restart the broker. 

I was betting that it would be fine because it's the consumer offsets topic, and 
chances are the data in those files was useless anyway, since my consumers commit 
constantly and essentially only fetch offsets while starting up. It's a little 
risky, but it worked. (And at that time the patch wasn't available.)

Still have no idea what generated those files though.



[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2017-01-24 Thread William Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836473#comment-15836473
 ] 

William Yu commented on KAFKA-3894:
---

I know this bug is resolved, but we just encountered it in our 0.9.0.1 
cluster. 

{code}
[2017-01-24 17:17:30,035] ERROR [kafka-log-cleaner-thread-0], Error due to  
(kafka.log.LogCleaner)
java.lang.IllegalArgumentException: requirement failed: 13042136566 messages in 
segment __consumer_offsets-32/.log but offset map can fit 
only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease 
log.cleaner.threads
{code}

We've tried the following to fix it:
- Increasing log.cleaner.dedupe.buffer.size, but the number of messages is greater 
than MAX_INT, so the required buffer would be beyond the amount of memory we can 
allocate (see the arithmetic sketch below).
- Wiping the partition in question from a single broker and letting the data 
replicate back. This did not work, as all the replicas for this partition have 
the same issue and cannot compact the topic.

Does anyone else know of another solution to recover this partition? Do I need 
to just wipe away the whole partition completely?
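
For what it's worth, the capacity in that error is consistent with the offset 
map's slot layout; a rough calculation (assuming Kafka's SkimpyOffsetMap, which 
stores a 16-byte MD5 hash plus an 8-byte offset per key, and the default buffer 
settings, which the 5033164 figure matches):

{code}
// Back-of-the-envelope arithmetic behind the numbers in the error above.
object DedupeBufferMath extends App {
  val bytesPerSlot = 16 + 8                // MD5 hash + offset, per SkimpyOffsetMap
  val loadFactor   = 0.9                   // default log.cleaner.io.buffer.load.factor
  val bufferBytes  = 128L * 1024 * 1024    // default log.cleaner.dedupe.buffer.size

  val capacity = (bufferBytes * loadFactor / bytesPerSlot).toLong
  println(capacity)                        // 5033164, matching the error message

  // What the reported segment would need: hundreds of gigabytes, and the
  // entry count alone exceeds Int.MaxValue, so no buffer setting can fit it.
  val neededBytes = (13042136566L * bytesPerSlot / loadFactor).toLong
  println(neededBytes)                     // ~348 GB
}
{code}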




[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431303#comment-15431303
 ] 

ASF GitHub Bot commented on KAFKA-3894:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1725




[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2016-08-11 Thread Tom Crayford (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417675#comment-15417675
 ] 

Tom Crayford commented on KAFKA-3894:
-

Hi Jun,

We're probably going to start on b. for now. I think a. is incredibly valuable, 
but it doesn't address this mode of log cleaner crash. I think there are still 
some cases where we will fail to clean up data, but having those exist seems 
far preferable to crashing the thread entirely.

We'll get started on b.; hopefully we'll have a patch up within a few business 
days.



[jira] [Commented] (KAFKA-3894) LogCleaner thread crashes if not even one segment can fit in the offset map

2016-08-08 Thread Elias Dorneles (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412430#comment-15412430
 ] 

Elias Dorneles commented on KAFKA-3894:
---

I've bumped into this same issue (log cleaner threads dying because 
messages wouldn't fit in the offset map).

For some of the topics the messages would almost fit, so I was able to get away 
with just increasing the dedupe buffer load factor 
(https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaConfig.scala#L252),
 which defaults to 90% of the 2GB max buffer size.

For other topics that had more messages and wouldn't fit in the 2GB at all, 
the workaround was to:

1) decrease the segment size config for that topic [1]
2) reassign the topic partitions, so as to end up with new segments whose sizes 
obey the config change
3) rolling-restart the nodes, to restart the log cleaner threads

I'd love to know if there is another way of doing this; step 3 is particularly 
frustrating.

Good luck!

[1]: This can be done for just a particular topic with `kafka-topics.sh 
--zookeeper $ZK --topic $TOPIC --alter --config segment.bytes`, but if needed 
you can also set `log.segment.bytes` for topics across the whole cluster.
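
As a rough illustration of why bumping the load factor only rescues the 
near-miss topics (same 24-byte-per-key assumption about SkimpyOffsetMap as 
above; treat the numbers as approximate):

{code}
// Headroom gained by raising the load factor at the ~2GB buffer ceiling.
object LoadFactorHeadroom extends App {
  val maxBuffer = Int.MaxValue.toLong      // the dedupe buffer is capped near 2GB
  def capacity(loadFactor: Double): Long =
    (maxBuffer * loadFactor / 24).toLong   // 24 bytes per key, as in SkimpyOffsetMap

  println(capacity(0.9))   // ~80.5M offsets at the default
  println(capacity(0.95))  // ~85.0M offsets: a few percent more, enough only for near misses
}
{code}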




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)