[ 
https://issues.apache.org/jira/browse/KAFKA-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-372:
--------------------------

    Attachment: kafka-372_v1.patch

There were several issues that caused the problem.

1. Log.nextAppendOffset() calls flush on every invocation. Since this method is 
called for every produce request, we force a disk flush per produce request, 
regardless of the flush interval configured on the broker. This makes producers 
very slow.

2. The default value of MaxFetchWaitMs in the consumer config is 3 seconds, 
which is too long.

3. The script runs the console consumer in the background and waits only 20 
seconds, which is too short. Instead, we should run the console consumer in the 
foreground and wait until it finishes (it exits on its own via the consumer 
timeout).
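The change proposed in item 3 can be sketched as below. The `consume` function is a hypothetical stand-in for the real console-consumer invocation in run-test.sh; only the foreground-vs-background structure is the point here.

```shell
#!/bin/sh
# Sketch of the item-3 fix (consume is an illustrative stand-in for the
# actual bin/kafka-console-consumer.sh invocation in run-test.sh).

consume() {
    # Stand-in for a console consumer configured with a consumer timeout,
    # so the process exits on its own once no more messages arrive.
    sleep 1
    echo "all messages consumed"
}

# Fragile (current script): background the consumer and hope 20 s is enough.
#   consume > console_consumer.log &
#   sleep 20

# Robust (proposed): run in the foreground; the script blocks exactly until
# the consumer's own timeout fires and the process exits.
consume > console_consumer.log
grep -c "consumed" console_consumer.log    # prints 1
```

With the foreground form, the script's wall-clock time adapts to how long consumption actually takes, instead of racing a fixed 20-second sleep.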

Attaching patch v1, which fixes items 1 and 2. The test now passes. However, we 
should also address item 3 in the script.
                
> Consumer doesn't receive all data if there are multiple segment files
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-372
>                 URL: https://issues.apache.org/jira/browse/KAFKA-372
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>         Attachments: kafka-372_v1.patch, multi_seg_files_data_loss_debug.patch
>
>
> This issue happens inconsistently but can be reproduced by following the 
> steps below (repeat step 4 a few times to reproduce it):
> 1. Check out 0.8 branch (currently reproducible with rev. 1352634)
> 2. Apply kafka-306-v4.patch
> 3. Note that log.file.size is set to 10000000 in 
> system_test/broker_failure/config/server_*.properties (small enough to 
> trigger multiple segment files)
> 4. Under the directory <kafka home>/system_test/broker_failure, execute 
> command:
> $ bin/run-test.sh 20 0
> 5. After the test is completed, the result will probably look like the 
> following:
> ========================================================
> no. of messages published            : 14000
> producer unique msg rec'd            : 14000
> source consumer msg rec'd            : 7271
> source consumer unique msg rec'd     : 7271
> mirror consumer msg rec'd            : 6960
> mirror consumer unique msg rec'd     : 6960
> total source/mirror duplicate msg    : 0
> source/mirror uniq msg count diff    : 311
> ========================================================
> 6. Checking the Kafka log files shows that the sum of the sizes of the 
> source cluster's segment files equals that of the target cluster's:
> [/tmp] $  find kafka* -name *.kafka -ls
> 18620155 9860 -rw-r--r--   1 jfung    eng      10096535 Jun 21 11:09 
> kafka-source3-logs/test01-0/00000000000000000000.kafka
> 18620161 9772 -rw-r--r--   1 jfung    eng      10004418 Jun 21 11:11 
> kafka-source3-logs/test01-0/00000000000020105286.kafka
> 18620160 9776 -rw-r--r--   1 jfung    eng      10008751 Jun 21 11:10 
> kafka-source3-logs/test01-0/00000000000010096535.kafka
> 18620162 4708 -rw-r--r--   1 jfung    eng       4819067 Jun 21 11:11 
> kafka-source3-logs/test01-0/00000000000030109704.kafka
> 19406431 9920 -rw-r--r--   1 jfung    eng      10157685 Jun 21 11:10 
> kafka-target2-logs/test01-0/00000000000010335039.kafka
> 19406429 10096 -rw-r--r--   1 jfung    eng      10335039 Jun 21 11:09 
> kafka-target2-logs/test01-0/00000000000000000000.kafka
> 19406432 10300 -rw-r--r--   1 jfung    eng      10544850 Jun 21 11:11 
> kafka-target2-logs/test01-0/00000000000020492724.kafka
> 19406433 3800 -rw-r--r--   1 jfung    eng       3891197 Jun 21 11:12 
> kafka-target2-logs/test01-0/00000000000031037574.kafka
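The equality claimed in step 6 can be checked directly from the byte counts in the listing above:

```shell
#!/bin/sh
# Sum the segment sizes reported by find for each cluster (byte counts
# copied verbatim from the listing above) and compare the totals.
src=$((10096535 + 10008751 + 10004418 + 4819067))   # kafka-source3-logs
tgt=$((10335039 + 10157685 + 10544850 + 3891197))   # kafka-target2-logs
echo "source total: $src"   # prints 34928771
echo "target total: $tgt"   # prints 34928771
```

Note also that each segment's base offset (its file name) equals the running sum of the preceding segments' sizes on that side, e.g. 10096535 + 10008751 = 20105286, so no bytes are missing within either cluster.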
> 7. If log.file.size in the target cluster is configured to a value large 
> enough that there is only one data file, the result looks like this:
> ========================================================
> no. of messages published            : 14000
> producer unique msg rec'd            : 14000
> source consumer msg rec'd            : 7302
> source consumer unique msg rec'd     : 7302
> mirror consumer msg rec'd            : 13750
> mirror consumer unique msg rec'd     : 13750
> total source/mirror duplicate msg    : 0
> source/mirror uniq msg count diff    : -6448
> ========================================================
> 8. The log files look like this:
> [/tmp] $ find kafka* -name *.kafka -ls
> 18620160 9840 -rw-r--r--   1 jfung    eng      10075058 Jun 21 11:24 
> kafka-source2-logs/test01-0/00000000000010083679.kafka
> 18620155 9848 -rw-r--r--   1 jfung    eng      10083679 Jun 21 11:23 
> kafka-source2-logs/test01-0/00000000000000000000.kafka
> 18620162 4484 -rw-r--r--   1 jfung    eng       4589474 Jun 21 11:26 
> kafka-source2-logs/test01-0/00000000000030269045.kafka
> 18620161 9876 -rw-r--r--   1 jfung    eng      10110308 Jun 21 11:25 
> kafka-source2-logs/test01-0/00000000000020158737.kafka
> 19406429 34048 -rw-r--r--   1 jfung    eng      34858519 Jun 21 11:26 
> kafka-target3-logs/test01-0/00000000000000000000.kafka

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
