[jira] [Commented] (KAFKA-3802) log mtimes reset on broker restart

2016-06-08 Thread Moritz Siuts (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320391#comment-15320391
 ] 

Moritz Siuts commented on KAFKA-3802:
-

This 
https://github.com/emetriq/kafka/commit/94bca24bcef5d479f1e4350c65faad2bf7a09246
 seems to fix the issue. Before I open a PR I need to add good unit tests for 
the change, but maybe you already have feedback.

> log mtimes reset on broker restart
> --
>
> Key: KAFKA-3802
> URL: https://issues.apache.org/jira/browse/KAFKA-3802
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Andrew Otto
>
> Folks over in 
> http://mail-archives.apache.org/mod_mbox/kafka-users/201605.mbox/%3CCAO8=cz0ragjad1acx4geqcwj+rkd1gmdavkjwytwthkszfg...@mail.gmail.com%3E
>  are commenting about this issue.
> In 0.9, any data log file that was on
> disk before the broker has its mtime modified to the time of the broker
> restart.
> This causes problems with log retention, as all the files then look like
> they contain recent data to kafka.  We use the default log retention of 7
> days, but if all the files are touched at the same time, this can cause us
> to retain up to 2 weeks of log data, which can fill up our disks.
> This happens *most* of the time, but seemingly not all.  We have seen broker 
> restarts where mtimes were not changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3802) log mtimes reset on broker restart

2016-06-08 Thread Moritz Siuts (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320162#comment-15320162
 ] 

Moritz Siuts commented on KAFKA-3802:
-

I have been able to reproduce the problem with a small test program and to track 
it down to a specific change. The problem occurs when Kafka is shut down, that is, 
when the log files are closed.

The problem seems to have been introduced with KAFKA-1646, which added {{trim()}} to 
the {{close()}} method of {{FileMessageSet}}.
The trim method calls {{channel.truncate()}}, which on some systems modifies the 
mtime (I can reproduce it on Ubuntu 12.04 with Java 7, but not on Mac OS X with 
Java 8). If I delete the truncate call in my PoC below, the problem 
does not occur.

I think one could fix this by checking in {{truncateTo()}} that targetSize 
differs from {{channel.size()}} before calling truncate on the channel, but I 
have not yet found the time to test this.
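
The check suggested above could look roughly like the following. This is only a minimal sketch and not Kafka's actual code: the class {{GuardedTruncate}} and the helper {{truncateIfNeeded()}} are hypothetical names made up for illustration. It skips the call whenever targetSize is not smaller than the current size, which covers the equal-size case from {{close()}}.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class GuardedTruncate {

    // Hypothetical helper (not part of Kafka): only call truncate() when the
    // channel is really larger than targetSize, so that a same-size request
    // never reaches the file system. Returns true if truncate() was invoked.
    static boolean truncateIfNeeded(FileChannel channel, long targetSize) throws IOException {
        if (targetSize < 0) {
            throw new IllegalArgumentException("targetSize must not be negative: " + targetSize);
        }
        if (targetSize >= channel.size()) {
            return false; // nothing to cut off, leave the file untouched
        }
        channel.truncate(targetSize);
        return true;
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("kafka-truncate", ".txt");
        try (FileChannel channel = new RandomAccessFile(path.toFile(), "rw").getChannel()) {
            channel.write(ByteBuffer.wrap("Kafka".getBytes(StandardCharsets.UTF_8)));
            // Same size: the guard skips the truncate call.
            System.out.println("same size truncated: " + truncateIfNeeded(channel, channel.size()));
            // Smaller size: the file is really shortened.
            System.out.println("smaller size truncated: " + truncateIfNeeded(channel, 2));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Whether this is enough to keep the mtime stable on the affected systems would still have to be verified there.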

Because the code was not changed in Kafka 0.10.0 it should have the same 
problems.

Code for reproducing (watch the mtime of {{/tmp/kafka.txt}} while it is 
sleeping for 2 minutes):

{noformat}
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.TimeUnit;

public class Main {

    public static void main(String[] args) throws Exception {
        File file = new File("/tmp/kafka.txt");

        // Open the file read/write, similar to how a log segment is opened.
        FileChannel channel = new RandomAccessFile(file, "rw").getChannel();

        // This write sets the mtime that should survive the close below.
        channel.write(ByteBuffer.wrap("Kafka".getBytes("UTF-8")));

        System.out.println("Going to sleep.");
        Thread.sleep(TimeUnit.MINUTES.toMillis(2));

        System.out.println("Going to close the channel.");
        channel.force(true);
        // Truncating to the current size changes no data, yet on affected
        // systems it still updates the mtime. The problem is here.
        channel.truncate(channel.size());

        channel.close();
    }
}

{noformat}
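
For checking many machines it may be easier to let the program compare the mtimes itself instead of watching the file by hand. The following is a sketch along the lines of the PoC above; the class {{MtimeCheck}} and the method {{mtimeShiftMillis()}} are names invented for this example, and the 2 second sleep is an arbitrary choice meant to exceed coarse mtime resolutions. A result of 0 means the mtime was left alone; a value close to the sleep time means the same-size truncate touched it.

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MtimeCheck {

    // Writes to the file, waits, truncates the channel to its current size and
    // closes it. Returns how many milliseconds the mtime moved in between.
    static long mtimeShiftMillis(Path path, long sleepMillis) throws Exception {
        try (FileChannel channel = new RandomAccessFile(path.toFile(), "rw").getChannel()) {
            channel.write(ByteBuffer.wrap("Kafka".getBytes(StandardCharsets.UTF_8)));
            channel.force(true);
            long before = Files.getLastModifiedTime(path).toMillis();

            Thread.sleep(sleepMillis);

            // Same-size truncate, as in the PoC: no data changes.
            channel.truncate(channel.size());
            // Close explicitly so the mtime is read after the close, like a
            // shutdown would leave it. The implicit second close is a no-op.
            channel.close();

            long after = Files.getLastModifiedTime(path).toMillis();
            return after - before;
        }
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("kafka-mtime", ".txt");
        try {
            long shift = mtimeShiftMillis(path, 2000);
            System.out.println("mtime moved by " + shift + " ms");
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```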






[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2016-05-26 Thread Moritz Siuts (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301936#comment-15301936
 ] 

Moritz Siuts commented on KAFKA-1379:
-

From the users mailing list:

{quote}
We’ve recently upgraded to 0.9.  In 0.8, when we restarted a broker, data
log file mtimes were not changed.  In 0.9, any data log file that was on
disk before the broker has its mtime modified to the time of the broker
restart.
{quote}

A workaround is to set {{retention.bytes}} at the topic level, like this:

{noformat}
./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config retention.bytes=500 --topic my_topic
{noformat}

This setting controls the maximum size in bytes of a partition of the specified 
topic. You can find a suitable value by checking the size of a partition with 
{{du -b}} and using that number.
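
If you prefer to script the size check, a rough Java equivalent of the {{du -b}} step could look like this. It is only a sketch: the class {{PartitionSize}}, the method {{directoryBytes()}} and the default path {{/var/kafka-logs/my_topic-0}} are assumptions for illustration, so pass your real partition directory as the first argument. It sums the sizes of the files directly inside the directory, so the result can differ slightly from {{du -b}}, which also counts the directory entry itself.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PartitionSize {

    // Sums the sizes in bytes of the regular files directly inside dir.
    // Subdirectories are ignored.
    static long directoryBytes(Path dir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    total += Files.size(entry);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default location; replace with your own log directory.
        Path partitionDir = Paths.get(args.length > 0 ? args[0] : "/var/kafka-logs/my_topic-0");
        System.out.println("retention.bytes=" + directoryBytes(partitionDir));
    }
}
```

The printed number is only a starting point; some headroom on top of it is probably sensible, since partitions are rarely all the same size.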

> Partition reassignment resets clock for time-based retention
> 
>
> Key: KAFKA-1379
> URL: https://issues.apache.org/jira/browse/KAFKA-1379
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>
> Since retention is driven off mod-times reassigned partitions will result in
> data that has been on a leader to be retained for another full retention
> cycle. E.g., if retention is seven days and you reassign partitions on the
> sixth day then those partitions will remain on the replicas for another
> seven days.





[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention

2015-02-23 Thread Moritz Siuts (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1438#comment-1438
 ] 

Moritz Siuts commented on KAFKA-1379:
-

This also happens when a broker dies and loses its data. 

When the broker comes back without any data, its disk usage keeps growing until 
it reaches about double the normal level; only when retention kicks in does the 
usage drop back to normal.

IMHO this is pretty bad for disaster scenarios, so I would like to see a higher 
priority on this.




