[jira] [Commented] (KAFKA-8724) log cleaner thread dies when attempting to clean a __consumer_offsets partition after upgrade from 2.0->2.3

2019-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923578#comment-16923578
 ] 

ASF GitHub Bot commented on KAFKA-8724:
---

hachikuji commented on pull request #7264: KAFKA-8724; Improve range checking 
when computing cleanable partitions
URL: https://github.com/apache/kafka/pull/7264
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> log cleaner thread dies when attempting to clean a __consumer_offsets 
> partition after upgrade from 2.0->2.3
> ---
>
> Key: KAFKA-8724
> URL: https://issues.apache.org/jira/browse/KAFKA-8724
> Project: Kafka
>  Issue Type: Bug
>  Components: log cleaner
>Affects Versions: 2.3.0
> Environment: Linux 3.10.0-862.2.3.el7.x86_64 #1 SMP Wed May 9 
> 18:05:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Keith So
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 2.3.1
>
> Attachments: KAFKA-308-stack-trace.txt
>
>
> We are attempting an upgrade from Kafka 2.0 to 2.3 on a single cluster setup.
> We have a mixture of Java/C++ and Python clients (the Python clients use the
> kafka-python library).
> After the upgrade, the log cleaner occasionally dies with the attached stack
> trace. Using timestamp correlation, we pinned it down to the cleaning of a
> __consumer_offsets partition. The config logged at initialization shows:
> inter.broker.protocol.version = 2.3-IV1
> log.message.format.version = 2.3-IV1
> We initially thought this had to do with an unclean upgrade from 2.0 to 2.3,
> but after resetting the consumer offsets topic (via
> [https://medium.com/@nblaye/reset-consumer-offsets-topic-in-kafka-with-zookeeper-5910213284a2])
> the problem still recurs on initially empty consumer offset partitions.
> At the moment we are working around this by toggling the log.cleaner.threads
> option via dynamic broker configuration to restore the log cleaner threads.
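
For reference, a rough sketch of the workaround described above: toggling
log.cleaner.threads through dynamic broker configuration, here via Kafka's Java
AdminClient from Scala (requires brokers and clients on 2.3+ for
incrementalAlterConfigs). The bootstrap address localhost:9091 is taken from
the reporter's kafka-topics example later in this thread; the object name,
helper, and thread counts are illustrative assumptions, not values from this
issue.

    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.admin.{AdminClient, AlterConfigOp, ConfigEntry}
    import org.apache.kafka.common.config.ConfigResource

    object RestartLogCleaner {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9091")
        val admin = AdminClient.create(props)
        try {
          // Empty resource name = cluster-wide default broker config (the dynamic-config path).
          val broker = new ConfigResource(ConfigResource.Type.BROKER, "")

          def setCleanerThreads(n: Int): Unit = {
            val op = new AlterConfigOp(new ConfigEntry("log.cleaner.threads", n.toString),
                                       AlterConfigOp.OpType.SET)
            val ops: java.util.Collection[AlterConfigOp] = Collections.singletonList(op)
            admin.incrementalAlterConfigs(Collections.singletonMap(broker, ops)).all().get()
          }

          // Toggle the thread count; per the report above, changing it dynamically
          // restores the dead cleaner threads until the next failure.
          setCleanerThreads(2)
          setCleanerThreads(1)
        } finally admin.close()
      }
    }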



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8724) log cleaner thread dies when attempting to clean a __consumer_offsets partition after upgrade from 2.0->2.3

2019-08-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917940#comment-16917940
 ] 

ASF GitHub Bot commented on KAFKA-8724:
---

hachikuji commented on pull request #7264: KAFKA-8724; Improve range checking 
when computing cleanable partitions
URL: https://github.com/apache/kafka/pull/7264
 
 
   This patch contains a few improvements to the offset range handling when 
computing the cleanable range of offsets.
   
   1. It adds bounds checking to ensure the dirty offset cannot be larger than 
the log end offset. If it is, we reset to the log start offset.
   2. It adds a method to get the non-active segments in the log while holding 
the lock. This ensures that a truncation cannot lead to an invalid segment 
range.
   3. It improves exception messages in the case that an inconsistent segment 
range is provided so that we have more information to find the root cause.
   
   The patch also fixes a few problems in `LogCleanerManagerTest` due to 
unintended reuse of the underlying log directory.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
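
Point 1 above (the bounds check on the dirty offset) can be pictured with a
minimal Scala sketch; the function and parameter names are illustrative, not
the actual code in the patch.

    // Sketch only: clamp the checkpointed dirty offset into [logStartOffset, logEndOffset].
    // A checkpoint beyond the log end offset (e.g. left over from a log that was since
    // wiped and re-created empty) resets cleaning to the log start offset.
    def resolveDirtyOffset(checkpointedOffset: Long,
                           logStartOffset: Long,
                           logEndOffset: Long): Long =
      if (checkpointedOffset < logStartOffset || checkpointedOffset > logEndOffset)
        logStartOffset
      else
        checkpointedOffset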
   
 



[jira] [Commented] (KAFKA-8724) log cleaner thread dies when attempting to clean a __consumer_offsets partition after upgrade from 2.0->2.3

2019-08-27 Thread Jason Gustafson (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917085#comment-16917085
 ] 

Jason Gustafson commented on KAFKA-8724:


There is a race condition here, but I'm not sure I can explain how it could be
regularly hit. When the log cleaner attempts to collect the non-active
segments, it calls {{log.logSegments(firstDirtyOffset,
log.activeSegment.baseOffset)}}. Since it is not holding the log lock, a log
roll might invalidate the expectation that the active segment base offset is
larger than the dirty offset. I'm guessing that the wiped log directory is also
a factor here, breaking normal offset assumptions. For example, the
checkpointed dirty offset may have gotten well ahead of an empty log. It
wouldn't surprise me to find some unprotected cases when that happens. I will
submit a patch to fix the known race condition and improve the error logging a
little so we'll have more to go on if we miss a case.
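
A rough, self-contained Scala sketch of that race and of the locking fix
described in the pull request (simplified, hypothetical Log/Segment classes and
method names; not Kafka's actual implementation): collecting the non-active
segments in one step while holding the log lock removes the window in which a
concurrent roll can change the upper bound.

    import java.util.concurrent.locks.ReentrantLock

    final class Segment(val baseOffset: Long)

    final class Log {
      private val lock = new ReentrantLock()
      @volatile private var segments: Vector[Segment] = Vector(new Segment(0L))

      def activeSegment: Segment = segments.last

      // Racy usage: the cleaner evaluates activeSegment.baseOffset first and then
      // filters, so a roll in between can leave the upper bound out of sync with
      // the segment list it is applied to.
      def logSegments(from: Long, to: Long): Iterable[Segment] =
        segments.filter(s => s.baseOffset >= from && s.baseOffset < to)

      // The fix: compute the non-active range in one step while holding the log lock.
      def nonActiveLogSegmentsFrom(from: Long): Iterable[Segment] = {
        lock.lock()
        try logSegments(from, activeSegment.baseOffset)
        finally lock.unlock()
      }

      def roll(newBaseOffset: Long): Unit = {
        lock.lock()
        try segments = segments :+ new Segment(newBaseOffset)
        finally lock.unlock()
      }
    }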



[jira] [Commented] (KAFKA-8724) log cleaner thread dies when attempting to clean a __consumer_offsets partition after upgrade from 2.0->2.3

2019-07-31 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896827#comment-16896827
 ] 

Ismael Juma commented on KAFKA-8724:


cc [~hachikuji] [~mumrah]



[jira] [Commented] (KAFKA-8724) log cleaner thread dies when attempting to clean a __consumer_offsets partition after upgrade from 2.0->2.3

2019-07-29 Thread Keith So (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895679#comment-16895679
 ] 

Keith So commented on KAFKA-8724:
-

This is the config we have for __consumer_offsets; if there is a quick
workaround via config, we'd much appreciate it.

{{$ kafka-topics --bootstrap-server localhost:9091 --describe --topic 
__consumer_offsets}}
{{Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 
Configs:compression.type=producer,cleanup.policy=compact,segment.bytes=104857600,retention.ms=1000,message.timestamp.type=LogAppendTime,delete.retention.ms=1000,segment.ms=6}}
