Hi Bhavesh, I will collect the dump and send it to you.
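As an aside, a thread dump of a JVM producer is usually taken with `jstack <pid>`; it can also be captured from inside the process with the standard `Thread.getAllStackTraces()`. A minimal sketch (class name hypothetical):

```java
import java.util.Map;

public class ThreadDump {
    // Print a stack trace for every live thread in this JVM, similar in
    // spirit to `jstack <pid>` output; returns the number of threads seen.
    public static int dump() {
        Map<Thread, StackTraceElement[]> all = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> e : all.entrySet()) {
            Thread t = e.getKey();
            System.out.printf("\"%s\" state=%s%n", t.getName(), t.getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
        return all.size();
    }

    public static void main(String[] args) {
        System.out.println(dump() + " threads dumped");
    }
}
```

Threads stuck in BLOCKED or long WAITED states on the producer's send path are the kind of contention Bhavesh is asking about.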
I am using a program that I took from https://github.com/edenhill/librdkafka/tree/master/examples and modified for my tests. I have attached the files.

> On Nov 5, 2014, at 04:45, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
>
> Hi Eduardo,
>
> Can you please take a thread dump and see if there are blocking issues on
> the producer side? Do you have a single instance of the producer and
> multiple threads?
>
> Are you using the Scala producer or the new Java producer? Also, what are
> your producer properties?
>
> Thanks,
>
> Bhavesh
>
> On Tue, Nov 4, 2014 at 12:40 AM, Eduardo Alfaia <e.costaalf...@unibs.it>
> wrote:
>
>> Hi Gwen,
>> I have changed the Java code JavaKafkaWordCount to use
>> reduceByKeyAndWindow in Spark.
>>
>> ----- Original message -----
>> From: "Gwen Shapira" <gshap...@cloudera.com>
>> Sent: 03/11/2014 21:08
>> To: "users@kafka.apache.org" <users@kafka.apache.org>
>> Cc: "u...@spark.incubator.apache.org" <u...@spark.incubator.apache.org>
>> Subject: Re: Spark Kafka Performance
>>
>> Not sure about the throughput, but:
>>
>> "I mean that the words counted in spark should grow up" - the Spark
>> word-count example doesn't accumulate.
>> It gets an RDD every n seconds and counts the words in that RDD, so we
>> don't expect the count to go up.
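Gwen's point can be sketched without a cluster: a plain-Java simulation (class and method names hypothetical, not the Spark API) contrasting a per-batch count with a reduceByKeyAndWindow-style sum over the last few batches:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowedCountSketch {
    // Count words in one micro-batch, like the stock word-count example:
    // each batch is counted in isolation, so totals never accumulate.
    static Map<String, Integer> countBatch(List<String> batch) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : batch) counts.merge(w, 1, Integer::sum);
        return counts;
    }

    // Sum word counts over the last `window` batches, roughly what
    // reduceByKeyAndWindow with an additive reducer computes.
    static Map<String, Integer> countWindow(List<List<String>> batches,
                                            int window) {
        Map<String, Integer> counts = new HashMap<>();
        int from = Math.max(0, batches.size() - window);
        for (List<String> batch : batches.subList(from, batches.size()))
            for (String w : batch) counts.merge(w, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> batches = List.of(
                List.of("kafka", "spark"),
                List.of("kafka"),
                List.of("kafka", "kafka"));
        // The per-batch count sees only the latest batch.
        System.out.println(countBatch(batches.get(2)).get("kafka"));  // 2
        // A 3-batch window sees all four occurrences of "kafka".
        System.out.println(countWindow(batches, 3).get("kafka"));     // 4
    }
}
```

Note that even with reduceByKeyAndWindow the reported counts only grow until the window is full; a true running total across the whole stream would need stateful counting (updateStateByKey in Spark Streaming) instead.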
>>
>> On Mon, Nov 3, 2014 at 6:57 AM, Eduardo Costa Alfaia <
>> e.costaalf...@unibs.it> wrote:
>>
>>> Hi Guys,
>>> Could anyone explain to me how Kafka works with Spark? I am using
>>> JavaKafkaWordCount.java as a test, and the command line is:
>>>
>>> ./run-example org.apache.spark.streaming.examples.JavaKafkaWordCount
>>> spark://192.168.0.13:7077 computer49:2181 test-consumer-group unibs.it 3
>>>
>>> As a producer I am using this command:
>>>
>>> rdkafka_cachesender -t unibs.nec -p 1 -b 192.168.0.46:9092 -f output.txt
>>> -l 100 -n 10
>>>
>>> rdkafka_cachesender is a program I developed which sends output.txt's
>>> content to Kafka, where -l is the length of each send (upper bound) and
>>> -n is the number of lines to send in a row. Below is the throughput
>>> calculated by the program:
>>>
>>> File is 2235755 bytes
>>> throughput (b/s) = 699751388
>>> throughput (b/s) = 723542382
>>> throughput (b/s) = 662989745
>>> throughput (b/s) = 505028200
>>> throughput (b/s) = 471263416
>>> throughput (b/s) = 446837266
>>> throughput (b/s) = 409856716
>>> throughput (b/s) = 373994467
>>> throughput (b/s) = 366343097
>>> throughput (b/s) = 373240017
>>> throughput (b/s) = 386139016
>>> throughput (b/s) = 373802209
>>> throughput (b/s) = 369308515
>>> throughput (b/s) = 366935820
>>> throughput (b/s) = 365175388
>>> throughput (b/s) = 362175419
>>> throughput (b/s) = 358356633
>>> throughput (b/s) = 357219124
>>> throughput (b/s) = 352174125
>>> throughput (b/s) = 348313093
>>> throughput (b/s) = 355099099
>>> throughput (b/s) = 348069777
>>> throughput (b/s) = 348478302
>>> throughput (b/s) = 340404276
>>> throughput (b/s) = 339876031
>>> throughput (b/s) = 339175102
>>> throughput (b/s) = 327555252
>>> throughput (b/s) = 324272374
>>> throughput (b/s) = 322479222
>>> throughput (b/s) = 319544906
>>> throughput (b/s) = 317201853
>>> throughput (b/s) = 317351399
>>> throughput (b/s) = 315027978
>>> throughput (b/s) = 313831014
>>> throughput (b/s) = 310050384
>>> throughput (b/s) = 307654601
>>> throughput (b/s) = 305707061
>>> throughput (b/s) = 307961102
>>> throughput (b/s) = 296898200
>>> throughput (b/s) = 296409904
>>> throughput (b/s) = 294609332
>>> throughput (b/s) = 293397843
>>> throughput (b/s) = 293194876
>>> throughput (b/s) = 291724886
>>> throughput (b/s) = 290031314
>>> throughput (b/s) = 289747022
>>> throughput (b/s) = 289299632
>>>
>>> The throughput drops after a few seconds and does not hold its initial
>>> values:
>>>
>>> throughput (b/s) = 699751388
>>> throughput (b/s) = 723542382
>>> throughput (b/s) = 662989745
>>>
>>> Another question is about Spark: after I start the Spark command, after
>>> 15 seconds Spark keeps repeating the same word counts, but my program
>>> keeps sending words to Kafka, so I would expect the counts in Spark to
>>> grow. I have attached the log from Spark.
>>>
>>> My setup is:
>>>
>>> ComputerA (rdkafka_cachesender) -> ComputerB (Kafka brokers + ZooKeeper) ->
>>> ComputerC (Spark)
>>>
>>> If I haven't explained this well, please reply to me.
>>>
>>> Thanks Guys
>>> --
>>> Privacy Notice: http://www.unibs.it/node/8155
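The b/s figures above look like cumulative bytes sent divided by elapsed time. A minimal plain-Java sketch of that bookkeeping (class and method names hypothetical; the actual rdkafka_cachesender is a C program whose source is not shown in this thread):

```java
public class ThroughputMeter {
    private final long startNanos = System.nanoTime();
    private long bytesSent = 0;

    // Record a completed send of n bytes.
    public void record(long n) { bytesSent += n; }

    // Cumulative throughput in bytes/second since construction.
    public double bytesPerSecond() {
        double elapsedSec = (System.nanoTime() - startNanos) / 1e9;
        return elapsedSec > 0 ? bytesSent / elapsedSec : 0;
    }

    public static void main(String[] args) throws InterruptedException {
        ThroughputMeter meter = new ThroughputMeter();
        for (int i = 0; i < 10; i++) {
            meter.record(100);  // pretend each send moved 100 bytes
            Thread.sleep(10);
            System.out.printf("throughput (b/s) = %.0f%n",
                              meter.bytesPerSecond());
        }
    }
}
```

One caveat with this kind of measurement: an asynchronous producer such as librdkafka acknowledges a "send" when the message is enqueued in its local buffer, not when it reaches the broker, so the earliest readings can mostly reflect in-memory enqueueing. That is one plausible reason the first values sit far above the sustained rate before settling toward what the network and broker can absorb.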