Is anyone still concerned about this issue?

On Thu, Mar 29, 2012 at 4:14 PM, 刘明敏 <diveintotomor...@gmail.com> wrote:
I copied what I think might be important below (tell me if you need any other info):

*One Broker:*

rank   self   accum   count  trace  method
   1  77.55%  77.55%  32651  301152 sun.nio.ch.FileDispatcherImpl.writev0
   2   9.69%  87.24%   4080  300446 java.net.PlainSocketImpl.socketAccept
   3   1.50%  88.75%    633  301478 sun.misc.Unsafe.unpark
   4   0.62%  89.37%    260  301005 kafka.perf.ProducerPerformance$ProducerThread.run
   5   0.52%  89.89%    219  301123 scala.collection.LinearSeqOptimized$class.length

TRACE 301152:
    sun.nio.ch.FileDispatcherImpl.writev0(FileDispatcherImpl.java:Unknown line)
    sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
    sun.nio.ch.IOUtil.write(IOUtil.java:142)
    sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:483)
    java.nio.channels.SocketChannel.write(SocketChannel.java:493)
    kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:49)
    kafka.network.Send$class.writeCompletely(Transmission.scala:73)
    kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
    kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:95)
    kafka.producer.SyncProducer.send(SyncProducer.scala:94)
    kafka.producer.SyncProducer.multiSend(SyncProducer.scala:135)
    kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:50)
    kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:44)
    kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:113)
    kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:92)

TRACE 300446:
    java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
    java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
    java.net.ServerSocket.implAccept(ServerSocket.java:522)
    java.net.ServerSocket.accept(ServerSocket.java:490)
    sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
    java.lang.Thread.run(Thread.java:722)

TRACE 301478:
    sun.misc.Unsafe.unpark(Unsafe.java:Unknown line)
    java.util.concurrent.locks.LockSupport.unpark(LockSupport.java:152)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.unparkSuccessor(AbstractQueuedSynchronizer.java:662)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1263)
    java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:460)
    java.util.concurrent.LinkedBlockingQueue.signalNotFull(LinkedBlockingQueue.java:187)
    java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:477)
    kafka.producer.async.ProducerSendThread$$anonfun$processEvents$1.apply(ProducerSendThread.scala:69)
    kafka.producer.async.ProducerSendThread$$anonfun$processEvents$1.apply(ProducerSendThread.scala:69)
    scala.collection.immutable.Stream$$anonfun$continually$1.apply(Stream.scala:598)
    scala.collection.immutable.Stream$$anonfun$continually$1.apply(Stream.scala:598)
    scala.collection.immutable.Stream$Cons.tail(Stream.scala:555)
    scala.collection.immutable.Stream$Cons.tail(Stream.scala:549)
    scala.collection.immutable.Stream$$anonfun$takeWhile$1.apply(Stream.scala:394)
    scala.collection.immutable.Stream$$anonfun$takeWhile$1.apply(Stream.scala:394)
*Two brokers:*

rank   self   accum   count  trace  method
   1  59.73%  59.73%  13226  301374 sun.nio.ch.FileDispatcherImpl.writev0
   2   6.00%  65.72%   1328  300436 java.net.PlainSocketImpl.socketAccept
   3   5.07%  70.80%   1123  300925 sun.misc.Unsafe.unpark
   4   1.75%  72.54%    387  301001 kafka.perf.ProducerPerformance$ProducerThread.run
   5   0.93%  73.47%    205  301075 kafka.message.ByteBufferMessageSet.kafka$message$ByteBufferMessageSet$$internalIterator

*Three brokers:*

rank   self   accum   count  trace  method
   1  40.88%  40.88%   8471  300939 sun.misc.Unsafe.unpark
   2   9.63%  50.51%   1995  301329 sun.nio.ch.FileDispatcherImpl.writev0
   3   6.36%  56.87%   1317  300447 java.net.PlainSocketImpl.socketAccept
   4   2.85%  59.72%    591  301828 sun.misc.Unsafe.park
   5   2.13%  61.85%    442  301011 kafka.perf.ProducerPerformance$ProducerThread.run
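For reference, these tables come from hprof CPU sampling, as Jay suggests below. A minimal sketch of how such a profile can be captured around the perf test (not taken from this thread: it assumes the 0.7-era kafka-run-class.sh passes KAFKA_OPTS through to the JVM, and the actual perf-test arguments are left out):

    # sketch only: attach the hprof sampling agent to the perf-test JVM
    export KAFKA_OPTS="-agentlib:hprof=cpu=samples,depth=15,file=producer.hprof.txt"
    ./bin/kafka-producer-perf-test.sh ...    # same arguments as the runs above (omitted here)
    # after the JVM exits, the rank/self/accum/count/trace table and the TRACE stacks
    # appear at the end of producer.hprof.txt

Reading the tables: with one broker almost all samples are in the actual socket write (writev0), while with three brokers sun.misc.Unsafe.unpark dominates; the one-broker TRACE 301478 shows that unpark comes from LinkedBlockingQueue.signalNotFull under ProducerSendThread.processEvents, i.e. time spent signalling the async queue rather than writing data.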
On Thu, Mar 29, 2012 at 1:20 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

Can you run with the profiler enabled so we can see where the cpu is going? Add something like the following to the java command line:

-agentlib:hprof=cpu=samples,depth=15

-Jay

On Wed, Mar 28, 2012 at 8:34 PM, 刘明敏 <diveintotomor...@gmail.com> wrote:

Wow, it seems that's the case. When running the async producer, the CPU utilization looks like this (I used vmstat to check the CPU usage):

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd    free   buff   cache  si  so  bi    bo    in      cs us sy id wa st
29  0  37472  336768  73560 2555980   0   0   0  2224  9137  424623 79 20  1  0  0
21  0  37472 1260664  62384 1608728   0   0   0  4412 17336  422321 53 29 16  1  0
14  0  37472 1252836  62424 1612268   0   0   0  7720 11486 1043389 57 40  3  0  0
12  0  37472 1255632  62496 1621888   0   0   0  9920 10556 1394823 53 46  1  0  0
13  2  37472 1259244  62548 1631272   0   0   0  9920 10335 1383344 51 47  2  0  0
11  0  37472 1267484  61660 1601680   0   0   0  9932 16050 1360485 49 48  4  0  0
14  1  37472 1257632  61732 1611536   0   0   0 10460  9847 1431470 51 48  1  0  0
12  0  37472 1247524  61800 1620844   0   0   0  8804 13658 1191618 54 42  4  0  0
13  0  37472 1239464  61868 1631208   0   0   0 10984 14603 1443709 50 48  2  1  0
15  2  37472 1230156  61936 1641628   0   0   0 10980 11569 1480576 49 50  1  0  0
12  0  37472 1220236  61992 1652180   0   0   0  9912 11564 1475147 50 49  1  1  0
12  1  37472 1210812  62056 1662000   0   0   0 10964 10469 1455622 49 49  2  0  0
13  0  37472 1201636  62132 1672208   0   0   0  9904 12127 1467242 49 49  1  0  0
12  2  37472 1191964  62208 1682308   0   0   0 10980 12817 1448184 50 50  1  0  0
12  0  37472 1181796  62264 1692868   0   0   0  9884 15399 1452468 49 49  2  0  0
14  0  37472 1179980  61824 1695388   0   0   0 10988 14860 1303783 50 46  3  1  0
12  2  37472 1202232  59256 1653940   0   0   0 10984 14979 1344551 47 52  2  0  0
12  0  37472 1191568  59320 1664252   0   0   0  9884 10081 1488794 49 51  1  0  0
14  2  37472 1181276  59396 1674340   0   0   0 10984 12535 1433143 48 48  3  1  0
16  1  37472 1170788  59468 1685128   0   0   0 10960 11839 1468341 48 50  2  0  0
13  0  37472 1161612  59548 1695324   0   0   0  9968 15067 1467583 48 50  1  0  0
11  2  37472 1151940  59624 1705468   0   0   0 10984 13112 1482698 48 51  1  0  0
15  2  37472 1141524  59696 1716104   0   0   0 10984 14507 1444776 50 48  2  1  0
 4  2  37472 1152144  57940 1677388   0   0   0 10992 13915 1029274 51 39  9  2  0

while the sync producer's CPU usage is much lower, like this:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd    free   buff   cache  si  so  bi    bo    in    cs us sy id wa st
 1  2  37472 1148876  52496 1729876   0   0   0 21964  9423  3919 29  5 45 21  0
 0  2  37472 1130836  52652 1751268   0   0   0 21976  7931  4185 15  3 56 25  0
 0  2  37472 1112360  52820 1774020   0   0   0 23104  7275  4437 14  3 50 32  0
 0  2  37472 1094876  52984 1796472   0   0   0 23060  7201  3801 13  4 53 31  0
 3  2  37472 1078572  53140 1819200   0   0   0 23064  6965  4418 13  4 57 26  0
 4  0  37472 1061400  53288 1840004   0   0   0 20864  7027  3930 12  3 58 26  0
 3  0  37472 1043916  53444 1861068   0   0   0 21964  6957  4379 14  3 61 21  0
 3  2  37472 1025324  53616 1883332   0   0   0 23068  7062  4104 14  4 64 18  0
 4  2  37472 1008708  53764 1904044   0   0   0 20868  6246  3802 12  3 57 29  0

Can you help me understand why the async producer is so much more CPU intensive than the sync producer?

On Thu, Mar 29, 2012 at 12:04 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

Ah, but what is the cpu utilization? Can you run "top" or something during the test? My suspicion is that the producer is maybe just cpu bound.

-Jay
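A quick way to act on that suggestion while the perf test is running (a sketch only, not something run in this thread; matching the process on "ProducerPerformance" is an assumption about what appears on the JVM command line):

    vmstat 1                                         # us+sy near 100% with cs above 1M/s, as in the async run above
    PID=$(pgrep -f ProducerPerformance | head -n 1)  # assumed match for the perf-test JVM
    top -H -p "$PID"                                 # per-thread CPU inside that JVM
    jstack "$PID" > producer-threads.txt             # map the hot thread names from top to their stacks

For what it's worth, the async vmstat above shows roughly 1.4 million context switches per second (cs column) versus about 4,000 for the sync run, which fits a producer that is CPU bound on queue hand-off rather than on the network.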
On Wed, Mar 28, 2012 at 3:57 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:

4 CPUs on the producer:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
stepping        : 5
cpu MHz         : 2393.580
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4787.16
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(Processors 1-3 report the same values, differing only in core id 1/2/3, apicid and initial apicid 2/4/6, and bogomips 4787.17.)

===
Not sure if I am sending to the right place this time.

2012/3/28 Jay Kreps <jay.kr...@gmail.com>:

What does CPU look like on the producer?

Also, can you please use the apache mailing list (cc'd). This google group is from our pre-apache days and is long dead.

-Jay

On Tue, Mar 27, 2012 at 7:42 PM, Kafka <kafka-dev+msga...@googlegroups.com> wrote:
>> > > > > > >> > > > > > >> > > > > > ---------- Forwarded message ---------- >> > > > > > From: 刘明敏 <diveintotomor...@gmail.com> >> > > > > > To: kafka-users@incubator.apache.org >> > > > > > Cc: kafka-...@googlegroups.com >> > > > > > Date: Wed, 28 Mar 2012 10:42:11 +0800 >> > > > > > Subject: Re: [kafka-dev] performance of async producer stops >> increase >> > > > > when >> > > > > > broker is more than 2 >> > > > > > I have 1 producer on 1 machine. >> > > > > > >> > > > > > The perf of sync producer is : >> > > > > > >> > > > > > 1 borker: >> > > > > > *20.3277(M/s)*,*213151.4441(#/s)* (sync) >> > > > > > 2 borkers: >> > > > > > *39.5995*, *415230.6606* (sync) >> > > > > > 3 borkers: >> > > > > > *58.0836*, *609050.4903*(sync) >> > > > > > >> > > > > > compared with async producer : >> > > > > > when 1 broker: >> > > > > > 20.3164 M/s,213033.3823#/s (async) >> > > > > > >> > > > > > when 2 brokers: >> > > > > > 37.4754 M/s,392958.1892#/s (async) >> > > > > > >> > > > > > when 3 brokers(perf stops increase) >> > > > > > 36.6925 M/s, 384748.5668#/s (async) >> > > > > > >> > > > > > the sync producer reached 600K #/s(far beyond sync producer's >> 384k >> > > #/s) >> > > > > So >> > > > > > I doubt if it is the case of maxing out network or procuder. >> > > > > > >> > > > > > === >> > > > > > and,I post the thread from google group: >> > > > > > https://groups.google.com/forum/?fromgroups#!forum/kafka-dev >> > > > > > >> > > > > > On Wed, Mar 28, 2012 at 3:25 AM, Jay Kreps <jay.kr...@gmail.com >> > >> > > > wrote: >> > > > > > >> > > > > >> Hi, >> > > > > >> >> > > > > >> How many producers are you using and on how many machines? >> Also what >> > > > is >> > > > > >> the >> > > > > >> network speed (1gbit?) Likely you are maxing out either the >> producer >> > > > > >> machine(s) or the network itself. >> > > > > >> >> > > > > >> Also, quick question, how did you come across this address? >> This >> > > > address >> > > > > >> has been dead for a while now, and the new mailing list is on >> Apache >> > > > so >> > > > > I >> > > > > >> just want to make sure we don't have a dangling pointer >> somewhere. >> > > > Would >> > > > > >> be >> > > > > >> good to move the discussion to the Apache list so others can >> find >> > > it. >> > > > > >> >> > > > > >> -Jay >> > > > > >> >> > > > > >> On Tue, Mar 27, 2012 at 2:10 AM, mmLiu < >> diveintotomor...@gmail.com> >> > > > > >> wrote: >> > > > > >> >> > > > > >> > I tested the performance of producer with >> > > > > >> ./bin/kafka-producer-perf-test.sh >> > > > > >> > >> > > > > >> > when 1 broker: >> > > > > >> > 20.3164 M/s,213033.3823#/s (async) >> > > > > >> > >> > > > > >> > when 2 brokers: >> > > > > >> > 37.4754 M/s,392958.1892#/s (async) >> > > > > >> > >> > > > > >> > when 3 brokers(perf stops increase) >> > > > > >> > 36.6925 M/s, 384748.5668#/s (async) >> > > > > >> > >> > > > > >> > when 4 brokers(even decrease): >> > > > > >> > 33.2244 M/s, 348383.5006 #/s (async) >> > > > > >> > >> > > > > >> > >> > > > > >> > Any idea why? >> > > > > >> > >> > > > > >> > FYI:the performance of sync producer increases(almost) >> linearly as >> > > > > >> number >> > > > > >> > of brokers increases. 
--
Best Regards

----------------------
刘明敏 | mmLiu