There are usually three reasons things get slow:
1. Locking--no over-utilized resource
2. I/O--disk over-utilized (either throughput or iops)
3. CPU
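As a rough illustration of the three cases above, here is a minimal sketch that classifies one dstat-style CPU sample. The function name `classify_bottleneck`, the 80% busy fraction, and the 20% I/O-wait threshold are hypothetical choices made for illustration; they do not come from Kafka or dstat. It assumes the CPU percentages are averaged over all cores (as in dstat's total-cpu-usage columns) and uses I/O wait as a crude stand-in for an over-utilized disk.

```python
def classify_bottleneck(usr, sys, wai, cores=1, threads=1,
                        busy_frac=0.8, io_wait_pct=20.0):
    """Guess which of the three cases a CPU sample (in percent) points to.

    All thresholds are illustrative assumptions, not tuned values.
    """
    # Percentages are averaged over all cores, so `threads` fully busy
    # threads can show at most 100 * threads / cores percent in total.
    cpu_ceiling = 100.0 * min(threads, cores) / cores
    if usr + sys >= busy_frac * cpu_ceiling:
        return "cpu"
    if wai >= io_wait_pct:
        return "io"
    # Nothing looks saturated, yet things are slow: suspect locking.
    return "locking"


if __name__ == "__main__":
    # A sample with 9% usr, 2% sys, 2% wai on a hypothetical 8-core box
    # where a single thread does all the work.
    print(classify_bottleneck(9, 2, 2, cores=8, threads=1))
```

The `cores` and `threads` arguments are there because a low total CPU figure can still mean one thread is pinned to one core; with the defaults the function treats the box as a single core.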
In addition, since I think this is a single-producer case, the bottleneck may be either the producer or the broker. A couple of things we can do to trace this down:
- Try adding a few more producers (or even just producer threads) to both tests and see if that improves throughput. The perf tool should make this fairly easy. If so, that would tend to indicate a bottleneck in the producer.
- Try checking IOPS and throughput in both versions. Theoretically, with the same config they should produce the same IOPS pattern. If not, that may be the root cause. To get this out of your dstat, add the -d and -r options.
- Run hprof or another CPU profiler on both clients or both servers and compare the traces. This should show any difference in CPU usage between the two.

Since producer CPU is lower in your trace in 0.7.1, producer CPU can't be the bottleneck. Disk throughput is only half in 0.7.1, but it isn't clear if that is cause or effect. What is the flush policy (log.flush.interval and log.default.flush.interval.ms)? What are the IOPS between the two?

I also notice that broker CPU in 0.7.1 is 2x what it is in 0.7.0. That is a possibility. How many CPU cores does the box have? Even though it is only 8-9%, if there is only one producer thread then there is only one broker thread working to process that, so that might be the bottleneck.

-Jay

On Tue, Jul 10, 2012 at 7:45 AM, Jun Rao <jun...@gmail.com> wrote:
> Min,
>
> Interesting. Could you file a jira and attach your producer code and dstat
> output there?
>
> Thanks,
>
> Jun
>
> On Mon, Jul 9, 2012 at 11:53 PM, Min <mini...@gmail.com> wrote:
>
> > Hello,
> >
> > I tested again and I could reproduce the situation.
> >
> > I ran 0.7.0 with the following shell command
> > ./kafka-0.7.0-incubating-src/bin/kafka-server-start.sh config/server.properties
> > then 0.7.1
> > ./kafka-0.7.1-incubating/bin/kafka-server-start.sh config/server.properties
> >
> > Configuration values in server.properties are all default except the
> > ZK address.
> >
> > My producer and broker were running on different machines in the same rack.
> > They are not VMs but physical machines. Also, I don't think any other
> > process interfered with my test.
> >
> >
> > I'm attaching the dstat result and my producer code.
> > The producer is simple: it reads the files in a given directory line by
> > line and sends every 1000 lines to the broker.
> >
> > Thanks
> > Min
> >
> > ./kafka-0.7.0-incubating-src/bin/kafka-server-start.sh config/server.properties
> > (Producer)
> > ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> > usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
> >  23   2  72   2   0   1|  88M  896k|  98k   50M|   0     0 |2855  2627
> >  20   1  77   1   0   1|  81M 1144k|  79k   48M|   0     0 |2858  2423
> >  21   1  77   0   0   1|  85M     0|  77k   47M|   0     0 |2655  2222
> >   6   0  93   0   0   0|  24M  240k|  38k   15M|   0     0 |1664  1232
> >  18   1  80   0   1   1|  71M     0|  72k   42M|   0     0 |2657  2152
> >  21   1  69   7   0   1|  75M 1704k|  92k   48M|   0     0 |2959  2872
> >  20   2  52  25   0   1|  90M  512k|  82k   48M|   0     0 |2884  2940
> >  20   1  78   0   0   0|  85M     0|  77k   47M|   0     0 |2800  2459
> >  22   2  75   0   1   1|  88M  320k|  81k   51M|   0     0 |2840  2590
> >  24   1  74   0   1   0|  90M     0|  88k   52M|   0     0 |2819  2526
> >  23   1  71   2   0   1|  87M 1408k|  96k   51M|   0     0 |2924  2661
> >  23   2  73   1   1   1|  89M  744k|  82k   51M|   0     0 |2851  2485
> >  25   1  72   0   1   1|  94M     0|  89k   55M|   0     0 |2941  2545
> >  22   1  76   0   0   1|  88M  240k|  81k   50M|   0     0 |2804  2581
> >  23   1  75   0   0   1|  93M     0|  83k   53M|   0     0 |2868  2530
> >  23   1  69   5   0   1|  87M 1400k| 100k   53M|   0     0 |2959  2668
> >
> > (Broker 0.7.0)
> > usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
> >   4   3  86   4   0   2|2056k  197M|  51M   68k|   0     0 | 11k  1911
> >   5   3  86   4   0   2|2048k  209M|  53M   72k|   0     0 | 11k  2097
> >   5   3  85   4   0   2|2056k  207M|  52M   77k|   0     0 | 11k  2084
> >   4   4  86   4   0   2|2048k  208M|  53M   71k|   0     0 | 11k  2101
> >   5   3  86   4   0   2|2064k  201M|  51M   68k|   0     0 | 11k  2055
> >   4   4  86   4   0   2|2048k  207M|  53M   71k|   0     0 | 11k  2007
> >   5   4  86   4   0   2|2048k  209M|  53M   71k|   0     0 | 11k  2018
> >   5   3  85   4   0   2|2056k  207M|  52M   76k|   0     0 | 11k  1997
> >   4   4  85   5   0   2|2056k  197M|  50M   67k|   0     0 | 11k  1907
> >   5   4  86   4   0   2|2056k  205M|  52M   71k|   0     0 | 11k  2053
> >   5   3  86   4   0   2|2048k  205M|  52M   70k|   0     0 | 11k  2000
> >
> > ./kafka-0.7.1-incubating/bin/kafka-server-start.sh config/server.properties
> >
> > (Producer)
> > ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> > usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
> >   9   1  89   1   0   0|  41M  416k|  51k   27M|   0     0 |2205  1618
> >   9   1  88   2   0   0|  49M     0|  47k   27M|   0     0 |2212  1952
> >   8   1  89   1   0   0|  44M     0|  58k   26M|   0     0 |2201  1713
> >  10   1  87   2   0   0|  47M  256k|  57k   26M|   0     0 |2199  1992
> >   9   1  84   5   0   1|  46M 2488k|  56k   27M|   0     0 |2472  1924
> >   9   1  89   1   0   0|  42M  384k|  53k   26M|   0     0 |2187  1730
> >   9   1  89   1   0   0|  48M     0|  44k   27M|   0     0 |2189  1691
> >  10   1  88   1   0   0|  45M     0|  58k   27M|   0     0 |2205  1884
> >   9   1  89   1   0   0|  48M  224k|  58k   27M|   0     0 |2225  1715
> >   9   1  76  12   0   1|  44M 2496k|  57k   26M|   0     0 |2425  2133
> >   9   1  88   2   0   0|  45M  584k|  51k   26M|   0     0 |2194  1685
> >   9   1  89   1   0   0|  46M     0|  43k   25M|   0     0 |2166  1727
> >   9   1  90   1   0   0|  47M     0|  54k   25M|   0     0 |2183  1629
> >
> > (Broker - 0.7.1)
> > ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> > usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
> >   9   2  86   2   0   1|2048k  103M|  26M   36k|   0     0 |6376  1849
> >   8   2  86   2   0   1|2056k  104M|  26M   37k|   0     0 |6431  1945
> >   9   2  86   2   0   1|2048k  101M|  26M   42k|   0     0 |6431  1884
> >   9   2  87   2   0   1|2048k  104M|  26M   35k|   0     0 |6300  1561
> >   9   2  86   2   0   1|2048k  104M|  27M   37k|   0     0 |6542  1543
> >   9   2  86   2   0   1|2056k  104M|  26M   37k|   0     0 |6487  1638
> >   9   2  87   2   0   1|2048k  101M|  26M   36k|   0     0 |6402  1721
> >   9   2  86   2   0   1|2048k  104M|  26M   43k|   0     0 |6436  1762
> >   9   2  86   2   0   1|2048k  106M|  27M   37k|   0     0 |6527  1799
> >   9   2  86   3   0   1|2056k  101M|  26M   36k|   0     0 |6378  1521
> >   9   2  86   2   0   1|2048k  104M|  27M   37k|   0     0 |6521  1568
> >   9   2  87   2   0   1|2048k  104M|  26M   37k|   0     0 |6452  1708
> >
> >
> > 2012/7/9 Jay Kreps <jay.kr...@gmail.com>:
> > > Not that is known to me. Is this something you can reproduce? Are the
> > > settings the same on the two brokers? Can you check iostat between the
> > > two and see if the change is due to disk activity or cpu?
> > >
> > > -Jay
> > >
> > > On Sun, Jul 8, 2012 at 12:03 AM, Min <mini...@gmail.com> wrote:
> > >
> > >> Hello,
> > >>
> > >> Recently I upgraded to 0.7.1, but 0.7.1 doesn't perform as well as
> > >> 0.7.0.
> > >>
> > >> With 0.7.0 I can push 45MBps to the broker, but I can only push 27MBps
> > >> with 0.7.1.
> > >>
> > >> Is there any a fetch to affect the performance?
> > >>
> > >> Thanks
> > >>
> > >