[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2022-09-29 Thread zhangzhisheng (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611018#comment-17611018
 ] 

zhangzhisheng commented on KAFKA-12946:
---

Upgrade to 2.4.2.

> __consumer_offsets topic with very big partitions
> -
>
> Key: KAFKA-12946
> URL: https://issues.apache.org/jira/browse/KAFKA-12946
> Project: Kafka
>  Issue Type: Bug
>  Components: log cleaner
>Affects Versions: 2.0.0
>Reporter: Emi
>Priority: Critical
>
> I am using Kafka 2.0.0 with Java 8u191.
>  There is a partition of the __consumer_offsets topic that is 600 GB with 
> 6000 segments older than 4 months. Other partitions of that topic are small: 
> 20-30 MB.
> There are 60 consumer groups, 90 topics and 100 partitions per topic.
> There are no errors in the logs. From the log cleaner's log, I can see 
> that this partition is never touched by the log cleaner thread for 
> compaction; it only adds new segments.
>  How is this possible?
> There was another partition with the same problem, but after some months it 
> was compacted. Now there is only one partition with this problem, and 
> it is bigger and keeps growing.
> I have used the kafka-dump-log tool to check these old segments and I can see 
> many duplicates, so I assume the partition is not being compacted.
> My settings:
>  {{offsets.commit.required.acks = -1}}
>  {{offsets.commit.timeout.ms = 5000}}
>  {{offsets.load.buffer.size = 5242880}}
>  {{offsets.retention.check.interval.ms = 60}}
>  {{offsets.retention.minutes = 10080}}
>  {{offsets.topic.compression.codec = 0}}
>  {{offsets.topic.num.partitions = 50}}
>  {{offsets.topic.replication.factor = 3}}
>  {{offsets.topic.segment.bytes = 104857600}}
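For reference, the duplicate check the description mentions can be sketched with kafka-dump-log as follows. The log directory and partition number are placeholders; `--offsets-decoder` tells the tool to parse the records as offset-commit data.

```shell
# Sketch: decode an old segment of __consumer_offsets and inspect it for
# duplicate keys. The path and partition number are placeholders.
kafka-dump-log.sh --offsets-decoder \
  --files /var/kafka-logs/__consumer_offsets-31/00000000000000000000.log \
  | head -n 50
```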



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-15 Thread Ron Dagostino (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363805#comment-17363805
 ] 

Ron Dagostino commented on KAFKA-12946:
---

The only one I am familiar with and would recommend is the upgrade.



[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-15 Thread Emi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363622#comment-17363622
 ] 

Emi commented on KAFKA-12946:
-

Ok :)
I found other possible solutions, for example:
- increase the *log.cleaner.threads*
- upgrade kafka
- set the cleanup.policy=compact,delete for the topic *__consumer_offsets* for 
a while
- delete the *cleaner-offset-checkpoint* file to force the compaction
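As a sketch, the cleanup.policy override could be applied and later removed with kafka-configs.sh, roughly as below. The bootstrap address is a placeholder, and on 2.0 the tool still expects `--zookeeper` for topic configs; note also that adding `delete` really does remove old offset data past retention, so it is destructive.

```shell
# Sketch: temporarily add 'delete' to the cleanup policy of __consumer_offsets.
# Square brackets are required for comma-separated values in --add-config.
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name __consumer_offsets \
  --alter --add-config 'cleanup.policy=[compact,delete]'

# Later, remove the override so the topic falls back to compact-only:
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name __consumer_offsets \
  --alter --delete-config cleanup.policy
```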

Could these solutions work, in your opinion?
I am in a production environment, so I have to be sure that these solutions are 
safe. 

Thanks again ;)



[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-15 Thread Ron Dagostino (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363584#comment-17363584
 ] 

Ron Dagostino commented on KAFKA-12946:
---

Yeah, there are bugs.  The KIP I referred to mentions one.  There have also 
been several changes to make the log cleaner thread more robust to failure over 
time — even since the 2.0 version you are on.  Upgrading might not help 
immediately, but you will want to leverage the KIP-664 tools at some point, so 
best to keep current.  You should definitely read that KIP.



[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-15 Thread Emi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363522#comment-17363522
 ] 

Emi commented on KAFKA-12946:
-

[~rndgstn] Interesting, that could be a solution I will consider. But I am more 
interested in why this happens: why is there this very big partition in the 
__consumer_offsets topic? Is it really a bug in Kafka? 



[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-15 Thread Ron Dagostino (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363496#comment-17363496
 ] 

Ron Dagostino commented on KAFKA-12946:
---

I mean if you take a look at the size on disk, is the size of the log 
significantly smaller?  Broker 0 might be the leader for partition 0 with 600 
GB of size.  Maybe broker 1 is a follower with about the same 600 GB size, but 
perhaps broker 2 is a follower with just 100 MB.  It is unexplained why this 
would occur, but it is possible, and if so then you can make 2 the leader, move 
1 to 3, move 3 back to 1, move 0 to 3, move 3 back to 0, and then make 0 the 
leader again -- now you have the same leadership and followers as before but 
100 MB on all 3 replicas.
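The replica shuffle described above would be driven by kafka-reassign-partitions.sh. A hypothetical sketch of one step, assuming brokers 0, 1, 2 currently hold partition 0 and broker 3 is the spare (broker IDs, partition number, and the bootstrap address are placeholders; on 2.0 the tool takes `--zookeeper` instead of `--bootstrap-server`):

```shell
# Sketch: one reassignment step of the shuffle. The first broker listed
# becomes the preferred leader; here broker 2 (the small replica) leads
# while broker 3 temporarily replaces broker 1.
cat > reassign.json <<'EOF'
{
  "version": 1,
  "partitions": [
    {"topic": "__consumer_offsets", "partition": 0, "replicas": [2, 3, 0]}
  ]
}
EOF

kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
  --reassignment-json-file reassign.json --execute

# Repeat with further JSON files until the original replica set [0, 1, 2]
# is restored, then trigger a preferred-leader election (on 2.0:
# kafka-preferred-replica-election.sh) to return leadership to broker 0.
```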



[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-14 Thread Emi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363203#comment-17363203
 ] 

Emi commented on KAFKA-12946:
-

[~rndgstn] What do you mean by "has a significantly smaller size than the 
leader"? Thanks



[jira] [Commented] (KAFKA-12946) __consumer_offsets topic with very big partitions

2021-06-14 Thread Ron Dagostino (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363155#comment-17363155
 ] 

Ron Dagostino commented on KAFKA-12946:
---

If the partition isn't being cleaned then you can try setting 
min.cleanable.dirty.ratio=0 for the __consumer_offsets topic; this might allow 
it to get cleaned.  You can delete that config after a while to let the value 
default back.
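Applying and later removing that override could look roughly like the sketch below. The bootstrap address is a placeholder, and on a 2.0 broker the tool still expects `--zookeeper` for topic configs.

```shell
# Sketch: make the cleaner treat __consumer_offsets as always cleanable.
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name __consumer_offsets \
  --alter --add-config min.cleanable.dirty.ratio=0

# After the partition has been cleaned, delete the override so the
# broker default applies again:
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name __consumer_offsets \
  --alter --delete-config min.cleanable.dirty.ratio
```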

Another possibility might exist if one of the follower replicas has a 
significantly smaller size than the leader; in such cases you can move 
leadership to the smaller replica and then reassign the follower replicas to 
new brokers so that they will copy the (much smaller-sized) data; then you can 
migrate the followers back to where they were originally and move the leader 
back to the original leader.  This solution will only work if you have more 
brokers than the replication factor.

Finally, take a look at 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-664%3A+Provide+tooling+to+detect+and+abort+hanging+transactions.
  You may not have any other options right now if it is a hanging transaction, 
but help is coming.
