[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

Jakob Homan (JIRA) Wed, 04 Feb 2015 12:33:55 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231258#comment-14231258
 ]


Jakob Homan edited comment on KAFKA-1646 at 2/4/15 8:32 PM:
------------------------------------------------------------

Hi, Jun.
I am on the same team as Xueqiang & Honghai.
As Jay mentioned earlier, the log must also be recovered when the broker is 
stopped gracefully. Since we recover only the activeSegment (a single 
LogSegment), this should not take much time.
I wrote some code to measure the performance of LogSegment.recover():
{code}
import java.io.File;
// LogSegment and SystemTime come from the Kafka build under test; the original omits their imports.

public class TestLogSegment {
    public static void main(String[] args) throws Exception {
        // args[0]: directory prefix of the partition copies; args[1]: number of copies.
        String dirTemplate = args.length > 0 ? args[0] : "D:\\C\\scp2\\tachyon\\logs\\testbroker\\mvlogs-";
        int logFileNums = args.length > 1 ? Integer.parseInt(args[1]) : 10;

        long startOffset = 0L;
        int indexIntervalBytes = 4096;
        int maxIndexSize = 10485760;      // 10 MiB
        long initFileSize = 536870912L;   // 512 MiB
        int maxMessageSize = 1000000;

        long totalTime = 0L;
        for (int i = 0; i < logFileNums; i++) {
            LogSegment segment = new LogSegment(new File(dirTemplate + i), startOffset,
                    indexIntervalBytes, maxIndexSize, initFileSize, new SystemTime());
            // Time only the recover() call, not the segment construction.
            long start = System.currentTimeMillis();
            segment.recover(maxMessageSize);
            long end = System.currentTimeMillis();
            totalTime += (end - start);
        }

        // Average per segment in milliseconds (integer division truncates).
        System.out.println("Recover cost time: " + totalTime / logFileNums + " ms");
    }
}
{code}
I used some scripts to create 1000 copies of a real partition of one topic 
(mvlogs-0, mvlogs-1, ..., mvlogs-999), and used this code to measure the 
average time LogSegment.recover() takes across the 1000 partition directories. 
The average recovery time was about a dozen milliseconds (11 ms on my server).
So I think the cost of the recovery operation is acceptable.
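
The scripts used to create the numbered copies (mvlogs-0, mvlogs-1, ...) are not shown. One possible way to produce them is sketched here in Java. The class and method names (PartitionCopies, copyTree, makeCopies) are assumptions for illustration only, not part of Kafka or of the original test setup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

/** Hypothetical helper: clones one partition directory into prefix0 .. prefix(n-1). */
final class PartitionCopies {

    /** Copies every file and subdirectory under source into target. */
    static void copyTree(Path source, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(source)) {
            // Files.walk visits a directory before its entries, so each
            // destination directory exists before files are copied into it.
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    /** Creates targetPrefix + "0" through targetPrefix + (copies - 1) as copies of source. */
    static void makeCopies(Path source, String targetPrefix, int copies) throws IOException {
        for (int i = 0; i < copies; i++) {
            copyTree(source, Paths.get(targetPrefix + i));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: PartitionCopies <sourceDir> <targetPrefix> <copies>");
            return;
        }
        makeCopies(Paths.get(args[0]), args[1], Integer.parseInt(args[2]));
    }
}
```

Running it with a source partition directory, a prefix ending in "mvlogs-", and 1000 as the count would give the directory layout that the test program's dirTemplate argument expects.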


was (Author: qixia): the same comment, without the {code} markers around the test program.

> Improve consumer read performance for Windows
> ---------------------------------------------
>
>                 Key: KAFKA-1646
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1646
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 0.8.1.1
>         Environment: Windows
>            Reporter: xueqiang wang
>            Assignee: xueqiang wang
>              Labels: newbie, patch
>         Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch
>
>
> This patch is for the Windows platform only. On Windows, if more than one 
> replica is writing to disk, the segment log files will not be contiguous on 
> disk, and consumer read performance drops greatly. This fix allocates more 
> disk space when rolling a new segment, which improves consumer read 
> performance on the NTFS file system.
> This patch doesn't affect file allocation on other filesystems, because it only 
> adds statements like 'if(Os.iswindow)' or adds methods used on Windows.
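
The preallocation idea in the description can be illustrated with a minimal sketch. This is not the patch's actual code: SegmentFiles, isWindows, openSegment and trimToSize are hypothetical names, and initFileSize is borrowed from the test program in the comment. The sketch assumes that extending a new file with RandomAccessFile.setLength is how space would be reserved; whether the filesystem then allocates that space contiguously is up to the OS.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Locale;

/** Hypothetical illustration of preallocating a segment file on Windows only. */
final class SegmentFiles {

    /** Rough OS check based on the os.name system property. */
    static boolean isWindows() {
        return System.getProperty("os.name").toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /**
     * Opens the file for read/write. If preallocate is true and the file is
     * still empty, extends it to initFileSize bytes up front.
     */
    static RandomAccessFile openSegment(File file, long initFileSize, boolean preallocate)
            throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            if (preallocate && raf.length() == 0) {
                // The contents of the extended region are not defined by the Java API.
                raf.setLength(initFileSize);
            }
        } catch (IOException e) {
            raf.close();
            throw e;
        }
        return raf;
    }

    /** Cuts the file back to the number of bytes that hold real data. */
    static void trimToSize(RandomAccessFile raf, long validBytes) throws IOException {
        raf.setLength(validBytes);
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".log");
        f.deleteOnExit();
        // Preallocate only on Windows, mirroring the description's OS guard.
        try (RandomAccessFile raf = openSegment(f, 1024L * 1024L, isWindows())) {
            raf.write(new byte[100]);
            trimToSize(raf, 100L);
            System.out.println("final length: " + raf.length());
        }
    }
}
```

A preallocated file is longer than the data written to it, so the unused tail has to be cut off or skipped at some point. That is why recovery of the active segment comes up in the comment above.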



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
