[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2017-02-13 Thread Andrew Olson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864128#comment-15864128
 ] 

Andrew Olson commented on KAFKA-1379:
-

[~hachikuji] Jason, could you confirm if this bug has been fixed?

> Partition reassignment resets clock for time-based retention
> ------------------------------------------------------------
>
>                 Key: KAFKA-1379
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1379
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>            Reporter: Joel Koshy
>
> Since retention is driven off file modification times, reassigning a
> partition causes data that has already been on a leader to be retained
> for another full retention cycle. E.g., if retention is seven days and you
> reassign partitions on the sixth day, then those partitions will remain on
> the replicas for another seven days.
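
The seven-day example in the description can be worked through with a small model. This is only a sketch, assuming that a segment becomes eligible for deletion once the time since its last modification exceeds the retention period, and that a reassigned replica's files get the copy time as their modification time. `Segment`, `is_expired`, and `reassign` are illustrative names, not Kafka APIs.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class Segment:
    """A log segment reduced to the one field mtime-based retention looks at."""
    last_modified_ms: int


def is_expired(segment: Segment, now_ms: int, retention_ms: int) -> bool:
    # Assumed rule: expired once the file has gone unmodified for longer
    # than the retention period.
    return now_ms - segment.last_modified_ms > retention_ms


def reassign(segment: Segment, now_ms: int) -> Segment:
    # Assumption: the new replica writes a fresh file, so its mtime is the
    # time of the copy, not the time the data was originally produced.
    return Segment(last_modified_ms=now_ms)


retention_ms = 7 * DAY_MS
on_leader = Segment(last_modified_ms=0)            # last written on day 0
on_new_replica = reassign(on_leader, 6 * DAY_MS)   # reassigned on day 6

# On day 8 the original copy is past retention, the reassigned copy is not.
leader_expired_day_8 = is_expired(on_leader, 8 * DAY_MS, retention_ms)
replica_expired_day_8 = is_expired(on_new_replica, 8 * DAY_MS, retention_ms)
```

Under these assumptions the reassigned copy only becomes eligible for deletion after day 13, which is the extra retention cycle the description refers to.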



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2017-01-20 Thread Andrew Olson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832155#comment-15832155
 ] 

Andrew Olson commented on KAFKA-1379:
-

[~jjkoshy] / [~becket_qin] should this Jira now be closed as a duplicate of 
KAFKA-3163?

https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-Enforcetimebasedlogretention






[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2016-06-01 Thread Luca Toscano (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310068#comment-15310068
 ] 

Luca Toscano commented on KAFKA-1379:
-

Hi Moritz,

thanks a lot for pointing us to this Jira on users@. At the moment we use a 
similar trick (temporarily modifying the per-topic retention.ms) when disk 
partitions fill up:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Kafka/Administration#Temporarily_Modify_Per_Topic_Retention_Settings

I also opened a Phabricator task to track this problem: 
https://phabricator.wikimedia.org/T136690

retention.bytes is definitely worth trying, but is there anything else that 
can mitigate this issue?






[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2016-05-26 Thread Moritz Siuts (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301936#comment-15301936
 ] 

Moritz Siuts commented on KAFKA-1379:
-

From the users mailing list:

{quote}
We’ve recently upgraded to 0.9.  In 0.8, when we restarted a broker, data
log file mtimes were not changed.  In 0.9, any data log file that was on
disk before the restart has its mtime modified to the time of the broker
restart.
{quote}

A workaround is to set {{retention.bytes}} at the topic level, like this:

{noformat}
./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config retention.bytes=500 --topic my_topic
{noformat}

This setting controls the maximum size in bytes of a partition of the 
specified topic, so you can find a good value by checking the size of a 
partition with {{du -b}}.
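
The sizing step could be scripted roughly as follows. This is a sketch under assumptions: `partition_size_bytes` and `alter_retention_bytes_command` are hypothetical helper names, the partition directory path depends on the broker's log directory setting, and the result only approximates {{du -b}} (which also counts the directory entry itself). The command line uses only the flags shown in the workaround above.

```python
import os
import shlex


def partition_size_bytes(partition_dir: str) -> int:
    """Sum the apparent sizes of the regular files in one partition directory."""
    total = 0
    for entry in os.scandir(partition_dir):
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
    return total


def alter_retention_bytes_command(zookeeper: str, topic: str,
                                  retention_bytes: int) -> str:
    """Build the kafka-topics.sh invocation from the workaround as a string."""
    args = [
        "./bin/kafka-topics.sh",
        "--zookeeper", zookeeper,
        "--alter",
        "--topic", topic,
        "--config", f"retention.bytes={retention_bytes}",
    ]
    # Quote each argument so the string can be pasted into a shell safely.
    return " ".join(shlex.quote(a) for a in args)
```

For example, `alter_retention_bytes_command("X.X.X.X:2181/kafka", "my_topic", partition_size_bytes(path))` would produce the command for a partition directory at a hypothetical `path`. Since the limit applies per partition, measuring a partition that is currently at its normal steady-state size is probably the safest starting point.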






[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2015-09-22 Thread Xavier Léauté (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903212#comment-14903212
 ] 

Xavier Léauté commented on KAFKA-1379:
--

This is a huge issue for us as well, since it requires we keep double the disk 
capacity on hand, in case one of our brokers or disks fails, which happens 
relatively often at our scale.

Alternatively, we have to go in and remove expired segments by hand, by 
comparing replicated segments with the partition leader, before disks run out 
of space.
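
One way the manual comparison might be automated is sketched below. It assumes that segment files are named after their base offset (for example `00000000000000001000.log`) and that a replica segment is safe to treat as expired only when every offset in it lies below the leader's earliest remaining segment. Because replicas can roll segments at different offsets than the leader, the check uses the base offset of the next replica segment rather than matching file names. `base_offset` and `fully_expired_on_replica` are hypothetical names, and the function only reports candidates; it deletes nothing.

```python
from typing import List


def base_offset(segment_file: str) -> int:
    # Segment files are named after the first offset they contain.
    return int(segment_file.split(".")[0])


def fully_expired_on_replica(replica_files: List[str],
                             leader_files: List[str]) -> List[str]:
    """Return replica segments whose offsets all precede the leader's log start."""
    leader_start = min(base_offset(f) for f in leader_files)
    replica = sorted(replica_files, key=base_offset)
    expired = []
    # A segment covers offsets from its own base up to the next segment's
    # base, so the last (active) segment is never reported.
    for current, following in zip(replica, replica[1:]):
        if base_offset(following) <= leader_start:
            expired.append(current)
    return expired
```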







[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2015-02-23 Thread Moritz Siuts (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1438#comment-1438
 ] 

Moritz Siuts commented on KAFKA-1379:
-

This also happens when a broker dies and loses its data. 

When the broker comes back without any data, it uses more and more disk 
space, up to double the normal usage, until retention kicks in and the usage 
drops back to normal.

IMHO this is pretty bad for disaster scenarios, so I would like to see a higher 
priority on this.
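
The doubling can be illustrated with a toy model. It is a sketch under simplifying assumptions: ingest is constant, the empty broker re-replicates a full retention period of data instantly on day 0, and a segment is deleted once the time since its modification exceeds the retention period. `disk_usage_after_recovery` is an illustrative name, not a Kafka API.

```python
def disk_usage_after_recovery(day: int, retention_days: int,
                              ingest_per_day: float = 1.0) -> float:
    """Disk used `day` days after an empty broker re-replicates its partitions."""
    # Data copied on day 0: one full retention period's worth, all stamped
    # with mtime = day 0, so none of it expires until after `retention_days`.
    recopied = retention_days * ingest_per_day if day <= retention_days else 0.0
    # Data written after recovery carries its true mtime, so at most one
    # retention period of it is kept at any time.
    fresh = min(day, retention_days) * ingest_per_day
    return recopied + fresh


# With seven-day retention: normal usage on day 0, peak on day 7, then a drop.
usage_by_day = [disk_usage_after_recovery(d, 7) for d in (0, 7, 8)]
```

In this model usage climbs from one retention period of data to two over the first seven days and falls back the day after, so a recovering broker would need roughly double its steady-state disk capacity.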







[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2015-02-23 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334253#comment-14334253
 ] 

Joel Koshy commented on KAFKA-1379:
---

We have been thinking through various alternatives and this is included in a 
proposal here: 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Enriched+Message+Metadata



