RE: I need some help with the production server architecture

2016-12-01 Thread Tauzell, Dave
For low volume, ZooKeeper doesn't seem to use many resources. I would put it on the nodejs server, as that will have less IO and heavy IO could impact ZooKeeper. Or, you could put some ZK nodes on nodejs and some on DB servers to hedge your bets. As always, you'll find out a lot once you

Re: Detecting when all the retries are expired for a message

2016-12-01 Thread Ismael Juma
The callback should give you what you are asking for. Has it not worked as you expect when you tried it? Ismael On Thu, Dec 1, 2016 at 1:22 PM, Mevada, Vatsal wrote: > Hi, > > > > I am reading a file and dumping each record on Kafka. Here is my producer > code: > > > >

Re: I need some help with the production server architecture

2016-12-01 Thread Sachin Mittal
And what about my brokers? Should I hedge them as well? Say, put 2 ZK nodes on the nodejs server and 1 on the db server, and 2 brokers on the db server and 1 on the nodejs server, something like that. Thanks Sachin On Thu, Dec 1, 2016 at 8:59 PM, Tauzell, Dave wrote: > For low

Re: I need some help with the production server architecture

2016-12-01 Thread Michael Noll
+1 to what Dave said. On Thu, Dec 1, 2016 at 4:29 PM, Tauzell, Dave wrote: > For low volume zookeeper doesn't seem to use many resources. I would put > it on nodejs server as that will have less IO and heavy IO could impact > zookeeper. Or, you could put some

Detecting when all the retries are expired for a message

2016-12-01 Thread Mevada, Vatsal
Hi, I am reading a file and dumping each record on Kafka. Here is my producer code: public void produce(String topicName, String filePath, String bootstrapServers, String encoding) { try (BufferedReader bf = getBufferedReader(filePath, encoding);

KTables + aggregation - how to make lots of ppl happy

2016-12-01 Thread Jon Yeargers
Seems like there are many questions on SO and related about "how do I know when my windowed aggregation is 'done'?" The answer has always been "It's never done. You're thinking about it the wrong way." I propose a new function for KStream: finiteAggregateByKey(Initializer, Aggregator, Windows,

RE: I need some help with the production server architecture

2016-12-01 Thread Tauzell, Dave
Do you have some idea of the size and number of messages per second you'll put onto the topics at peak? -Dave -Original Message- From: Sachin Mittal [mailto:sjmit...@gmail.com] Sent: Thursday, December 1, 2016 9:44 AM To: users@kafka.apache.org Subject: Re: I need some help with the

kafka.log:type=LogFlushStats,name=LogFlushRateAndTimeMs Not Reported

2016-12-01 Thread Madhuri Sasurkar
Hi All, I have set up my Kafka cluster and enabled JMX. I was going through all the MBeans that are exposed and I am not able to find the MBean "kafka.log:type=LogFlushStats,name=LogFlushRateAndTimeMs". Am I missing anything? I have set up the log flush time as 6 Thanks,
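
One way to check which MBeans are actually registered is to query by pattern rather than by exact name. The following is a minimal sketch using only the standard javax.management API. The class name MBeanLookup and the wildcard pattern are illustrative assumptions, and the demo queries the current JVM's own platform MBean server; against a real broker you would pass a remote MBeanServerConnection instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of Kafka: lists registered MBeans matching a pattern.
final class MBeanLookup {

    // Returns the names of all MBeans on 'server' that match 'pattern'.
    // A trailing ",*" in the pattern matches any additional key properties.
    static Set<ObjectName> find(MBeanServerConnection server, String pattern) {
        try {
            return server.queryNames(new ObjectName(pattern), null);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad ObjectName pattern: " + pattern, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: this runs in a plain JVM, so the Kafka pattern matches nothing here.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        System.out.println(find(local, "kafka.log:type=LogFlushStats,*"));
        // A standard JVM MBean, shown only to demonstrate that the lookup works.
        System.out.println(find(local, "java.lang:type=Memory"));
    }
}
```

An empty result for the wildcard pattern would suggest the MBean is not registered at all, rather than registered under a slightly different name.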

Re: I need some help with the production server architecture

2016-12-01 Thread Sachin Mittal
Messages are around 100 per second. Message size is around 4 KB. This is for the source topic. From this, it is keyed to an intermediate topic and aggregated to a table. A keyed message will be around 1 KB or so. On Thu, Dec 1, 2016 at 9:44 PM, Tauzell, Dave wrote:
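
As a rough sizing aid, the figures above can be turned into byte rates. This is a minimal sketch of the arithmetic only. It assumes 1 KB = 1024 bytes and one keyed message per source message (the thread does not state either), and the class name ThroughputEstimate is illustrative.

```java
// Hypothetical helper for back-of-the-envelope sizing; not part of Kafka.
final class ThroughputEstimate {

    // Raw payload bytes written per second, ignoring replication and protocol overhead.
    static long bytesPerSecond(long messagesPerSecond, long messageSizeBytes) {
        return messagesPerSecond * messageSizeBytes;
    }

    // Raw payload bytes accumulated over one day at a constant rate.
    static long bytesPerDay(long bytesPerSecond) {
        return bytesPerSecond * 86_400L;
    }

    public static void main(String[] args) {
        long source = bytesPerSecond(100, 4 * 1024); // source topic: ~100 msg/s at ~4 KB
        long keyed = bytesPerSecond(100, 1024);      // intermediate topic: ~1 KB per keyed message
        System.out.println("source topic B/s: " + source);
        System.out.println("keyed topic B/s:  " + keyed);
        System.out.println("source topic B/day: " + bytesPerDay(source));
    }
}
```

Disk usage on the brokers would additionally depend on the replication factor and retention settings, which are not given in the thread.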

RE: I need some help with the production server architecture

2016-12-01 Thread Tauzell, Dave
I wasn't paying attention enough and didn't think about the brokers. Assuming all the VMs have the same underlying SAN for disk I would start by putting brokers on the VMs with the most free memory and zookeeper on the others. -Dave -Original Message- From: Sachin Mittal

Expected client producer/consumer CPU utilization when idle

2016-12-01 Thread Niklas Ström
Hello all Can anyone say something about what CPU utilization we can expect for a producer/consumer process that is idle, i.e. not producing or consuming any messages? Should it be like 0%? What is your experience? We have a small program with a few kafka producers and consumers and we are

Re: Release Kafka 0.9.0.2?

2016-12-01 Thread Ben Osheroff
+1. At the least an answer regarding timelines would be good here. On Wed, Nov 30, 2016 at 02:46:59PM +0100, Stevo Slavić wrote: > Hello Apache Kafka community, > > Would it be possible to release 0.9.0.2? > > It has few important fixes, like KAFKA-3594 >

Re: Kafka Streams question - KStream.leftJoin(KTable)

2016-12-01 Thread Matthias J. Sax
Hi Ivan, If I understand you correctly, the issue with the leftJoin is that your stream contains records with key==null and thus those records get dropped? What about this: streamBB = streamB.selectKey(..); streamC = streamB.leftJoin(tableA); streamBNull = streamB.filter((k,v) -> k == null);
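
The idea of handling null-key records separately from the join can be illustrated without the Streams library. The following is a plain-Java simulation, not Kafka Streams code: the class name, the "value|tableValue" output format, and the choice to forward null-key records unjoined are all assumptions made for illustration, since the suggestion above is cut off.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java model of "left join keyed records, pass null-key records through".
final class NullKeyLeftJoin {

    static List<String> enrich(List<SimpleEntry<String, String>> streamB,
                               Map<String, String> tableA) {
        List<String> out = new ArrayList<>();
        for (SimpleEntry<String, String> record : streamB) {
            if (record.getKey() == null) {
                // Null-key branch: nothing to look up, so forward the value unjoined.
                out.add(record.getValue() + "|none");
            } else {
                // Left-join branch: keep the record even when the table has no match.
                String tableValue = tableA.get(record.getKey());
                out.add(record.getValue() + "|" + (tableValue == null ? "none" : tableValue));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<SimpleEntry<String, String>> streamB = new ArrayList<>();
        streamB.add(new SimpleEntry<>("k1", "event-1"));
        streamB.add(new SimpleEntry<>(null, "event-2"));
        Map<String, String> tableA = new java.util.HashMap<>();
        tableA.put("k1", "agg-1");
        System.out.println(enrich(streamB, tableA));
    }
}
```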

Re: Release Kafka 0.9.0.2?

2016-12-01 Thread Asaf Mesika
Waiting for the retries fix; retries don't work before 0.9.0.2. I had to apply the pull request myself to a forked repo to get our Log Receiver working properly. It's a must in my opinion. On Thu, Dec 1, 2016 at 6:46 PM Ben Osheroff wrote: > +1. At the least an answer

Re: Detecting when all the retries are expired for a message

2016-12-01 Thread Asaf Mesika
There's a critical bug in that section that has only been fixed in 0.9.0.2, which has not been released yet. Without the fix it doesn't really retry. I forked the kafka repo, applied the fix, built it and placed it in our own Nexus Maven repository until 0.9.0.2 is released.

Re: kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-12-01 Thread Guozhang Wang
Thanks! On Thu, Dec 1, 2016 at 5:17 AM, Hamidreza Afzali < hamidreza.afz...@hivestreaming.com> wrote: > I have added an example for KStreamDriver to the GitHub Gist and updated > the JIRA issue. > > https://issues.apache.org/jira/browse/KAFKA-4461 > >

Re: About stopping a leader

2016-12-01 Thread Apurva Mehta
Yes, the leader should move to K2 or K3. You can check the controller log on all 3 machines to find out where the new leader is placed. It is not guaranteed to move back to K1 when you restart it 2 hours later, however. On Mon, Nov 21, 2016 at 3:38 AM, marcel bichon

Kafka Streams question - KStream.leftJoin(KTable)

2016-12-01 Thread Ivan Ilichev
Hi Guys, I am implementing a stream processor where I aggregate a stream of events by their keys into a KTable tableA and then I am “enriching” another streamB by the values of tableA. So essentially I have this: streamC = streamB .selectKey(..) .leftJoin(tableA); This works great however

RE: Detecting when all the retries are expired for a message

2016-12-01 Thread Mevada, Vatsal
@Ismael: I can handle TimeoutException in the callback. However, as per the documentation of Callback (link: https://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/producer/Callback.html), TimeoutException is a retriable exception and it says that it "may be covered by increasing
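
For context, the retry behaviour discussed in this thread is controlled by producer settings. The following is a minimal sketch of such a configuration built with java.util.Properties only. The keys are standard producer configuration names, while the class name, the helper signature, and all values are illustrative assumptions rather than recommendations.

```java
import java.util.Properties;

// Hypothetical helper that assembles retry-related producer settings.
final class ProducerRetrySettings {

    static Properties build(String bootstrapServers, int retries, long retryBackoffMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("acks", "all");
        // Number of times the producer re-sends a record after a retriable error.
        props.setProperty("retries", Integer.toString(retries));
        // Pause between retry attempts, in milliseconds.
        props.setProperty("retry.backoff.ms", Long.toString(retryBackoffMs));
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Assumed values for illustration only.
        System.out.println(build("localhost:9092", 3, 500L));
    }
}
```

These properties would be passed to the producer's constructor; the send callback then reports the final outcome for each record.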

Re: A strange controller log in Kafka 0.9.0.1

2016-12-01 Thread Json Tu
Hi, can someone else help to review the PR in JIRA: https://issues.apache.org/jira/browse/KAFKA-4447 . > On Nov 23, 2016, at 11:28 PM, Json Tu wrote: > > Hi, > We have a cluster of kafka 0.9.0.1 with 3 nodes, and we found

Re: OOM errors

2016-12-01 Thread Guozhang Wang
I see. For windowed aggregations the disk space (i.e. "/tmp/kafka-streams/appname") as well as memory consumption on RocksDB should not keep increasing forever. One thing to note is that you are using a tumbling window where a new window will be created every minute, so within 20 minutes of "event
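
To make the "new window every minute" point concrete, here is a minimal sketch of how a timestamp maps to a tumbling window and how many windows a span of event time touches. It assumes non-negative millisecond timestamps and windows aligned to the epoch; the class and method names are illustrative and this is not the Streams implementation.

```java
// Hypothetical illustration of tumbling-window arithmetic.
final class TumblingWindows {

    // Start of the tumbling window that contains 'timestampMs'.
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Number of distinct windows touched by event times in [fromMs, toMsExclusive).
    static long windowsCovered(long fromMs, long toMsExclusive, long windowSizeMs) {
        long first = windowStart(fromMs, windowSizeMs);
        long last = windowStart(toMsExclusive - 1, windowSizeMs);
        return (last - first) / windowSizeMs + 1;
    }

    public static void main(String[] args) {
        long oneMinute = 60_000L;
        // Twenty minutes of event time with one-minute tumbling windows.
        System.out.println(windowsCovered(0L, 20 * oneMinute, oneMinute));
    }
}
```

Each of those windows holds its own aggregate per key, which is why state grows with the span of event time seen until old windows are dropped.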

Re: I need some help with the production server architecture

2016-12-01 Thread Sachin Mittal
Folks, any help on this? Just to put it in simple terms: since we have limited resources available to us, which is the better option? 1. Run zookeeper on the servers running the nodejs web server or on the db server? 2. What about kafka brokers? Thanks Sachin On Tue, Nov 29, 2016 at 1:06 PM, Sachin Mittal

Kafka Logo as HighRes or Vectorgraphics

2016-12-01 Thread Jan Filipiak
Hi Everyone, we want to print some big banners of the Kafka logo to decorate our offices. Can anyone help me find a version of the kafka logo that would still look nice printed onto 2x4m flags? Highly appreciated! Best Jan

Re: Is there a way to control pipeline flow to downstream

2016-12-01 Thread Sachin Mittal
Hi, Thanks for the link. What I understand is that when cache.max.bytes.buffering value is reached it will push the aggregation to downstream. What is the default value for the same? And how can I determine my cache size for current stream so as to set an optimal value. I also suppose the push

Is there a way to control pipeline flow to downstream

2016-12-01 Thread Sachin Mittal
Hi all, Say I have a pipeline like this: topic.aggregateByKey( ...) => to downstream. Now for every message in topic it will call aggregateByKey and send it to downstream. Is there a way to tell the pipeline that only if it gets a certain message should it push the current aggregation result to

Re: Is there a way to control pipeline flow to downstream

2016-12-01 Thread Eno Thereska
Hi Sachin, If you are using the DSL, currently there is no way to do fine-grained control of the downstream sending. There is some coarse-grained control in that you can use the record cache to dedup messages with the same key before sending downstream, or you can choose to get all records by

Re: Is there a way to control pipeline flow to downstream

2016-12-01 Thread Sachin Mittal
Hi, I checked the docs http://kafka.apache.org/0100/javadoc/index.html class StreamsConfig but did not find this CACHE_MAX_BYTES_BUFFERING_CONFIG setting. Also on the first option: use the record cache to dedup messages with the same key before sending downstream I did not understand this. How

Re: Is there a way to control pipeline flow to downstream

2016-12-01 Thread Eno Thereska
Hi Sachin, This landed in 0.10.1, so the docs are at http://kafka.apache.org/0101/javadoc/index.html . This wiki has a good description of how this works:
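
The record cache mentioned in this thread is configured through the Streams properties. The following is a minimal sketch built with java.util.Properties only. The keys cache.max.bytes.buffering and commit.interval.ms are the 0.10.1 configuration names as I understand them, while the class name, the helper signature, and all values are illustrative assumptions.

```java
import java.util.Properties;

// Hypothetical helper that assembles cache-related Streams settings.
final class StreamsCacheSettings {

    static Properties build(String applicationId, String bootstrapServers,
                            long cacheMaxBytes, long commitIntervalMs) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Total bytes available for record caching across all threads.
        // Larger values mean more deduplication per key before records go downstream.
        props.setProperty("cache.max.bytes.buffering", Long.toString(cacheMaxBytes));
        // The cache is also flushed on commit, so this bounds how long updates are held.
        props.setProperty("commit.interval.ms", Long.toString(commitIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Assumed values for illustration only: a 4 MB cache flushed every 10 seconds.
        System.out.println(build("my-app", "localhost:9092", 4L * 1024 * 1024, 10_000L));
    }
}
```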

Re: kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-12-01 Thread Hamidreza Afzali
I have added an example for KStreamDriver to the GitHub Gist and updated the JIRA issue. https://issues.apache.org/jira/browse/KAFKA-4461 https://gist.github.com/hrafzali/c2f50e7b957030dab13693eec1e49c13 Hamid