Re: Yarn jobs in accepted state

2015-05-20 Thread Shekar Tippur
Wang, Thanks for pointing that out. Please let me know if you can see this - http://www.awesomescreenshot.com/showImage?img_id=246654 - Shekar On Wed,

Re: Yarn jobs in accepted state

2015-05-20 Thread Guozhang Wang
Hello Shekar, The Apache mailing list blocks most attachments; could you send a link to the screenshot here? Guozhang On Wed, May 20, 2015 at 7:14 PM, Shekar Tippur wrote: > Hello, > > After submitting Samza job to Yarn, I see a lot of jobs in accepted state. > > Please see the attached screen

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread Guozhang Wang
Hi George, Is there any reason you need to set the following configs? systems.kafka.consumer.fetch.wait.max.ms= 1 This setting will basically disable long polling of the consumer, which will then busy-fetch data from the broker, which has a large impact on network latency especially when the consu

Yarn jobs in accepted state

2015-05-20 Thread Shekar Tippur
Hello, After submitting Samza job to Yarn, I see a lot of jobs in accepted state. Please see the attached screenshot. Wondering if this is due to any missed setting. I see the jobs progressing but would it cause any harm? - Shekar

Library version conflict issues

2015-05-20 Thread Yi Pan
Hi, all, Just curious about one thing: - Samza as a platform brings in a set of dependency libraries - Applications developed in Samza may bring in other libraries that conflict with the Samza libraries (we have one use case that requires jackson 1.4.2, which conflicts with jackson 1.8.5 that Sa

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread George Li
Hi Yi, Thanks for the reply. Below is my job config and code. When we run this job inside our dev docker container, which has zookeeper, broker, and yarn installed locally, its throughput is at least 50% higher than our cluster run's. Thanks, George Configuration: job.factory.class=org.ap

Review Request 34500: SAMZA-552 Operator API change: builder and simplified operator classes

2015-05-20 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34500/ --- Review request for samza, Yan Fang, Chris Riccomini, Guozhang Wang, Milinda Path

Re: Review Request 33419: SAMZA-625: Provide tool to consume changelog and materialize a state store

2015-05-20 Thread Yan Fang
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33419/ --- (Updated May 20, 2015, 10:04 p.m.) Review request for samza. Changes ---

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread Yi Pan
Hi, George, Could you share w/ us the code and configuration of your sample test job? Thanks! -Yi On Wed, May 20, 2015 at 1:19 PM, George Li wrote: > Hi, > > We are evaluating Samza's performance, and our sample job with > TestPerformanceTask is much slower than a program reading directly from

Samza job throughput much lower than Kafka throughput

2015-05-20 Thread George Li
Hi, We are evaluating Samza's performance, and our sample job with TestPerformanceTask is much slower than a program reading directly from Kafka. Scenario: * Cluster: 1 master node for Zookeeper and yarn. 3 Kafka broker nodes 3 yarn worker nodes * Kafka: Topic has only 1 partition. Average me