Re: Right Tool

2014-09-12 Thread Patrick Barker
Yeah, I would want to know they made it there. I like to use polyglot persistence for the availability of data: I build my recommendation engine in graph, my bulk data is in mongo, and sql is kind of my default/ad-hoc store. This is working really well for me, but I want to ease up on the payload within my app

Re: Right Tool

2014-09-12 Thread Steve Morin
You would need to make sure they were all persisted down properly to each database? Why are you persisting it to three different databases (sql, mongo, graph)? -Steve On Fri, Sep 12, 2014 at 7:35 PM, Patrick Barker wrote: > I'm just getting familiar with kafka, currently I just save everything to

Re: Right Tool

2014-09-12 Thread Patrick Barker
I'm just getting familiar with kafka. Currently I just save everything to all my db's in a single transaction, and if any of them fail I roll them all back. However, this is slowing my app down. So, as I understand it, I could write to kafka, close the transaction, and then it would keep on publishing o
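The pattern being discussed (publish one event, let downstream consumers update each store asynchronously, instead of one synchronous three-database transaction) can be sketched without Kafka itself. Below, the topic is a plain in-memory list and the three databases are dicts; all names are illustrative stand-ins, not real Kafka API:

```python
# Sketch: the app appends one event to a log (standing in for a Kafka
# topic) instead of writing to three databases in one transaction.
# Separate consumers then apply the event to each store on their own time.
topic = []  # stand-in for a Kafka topic

def publish(event):
    topic.append(event)  # the app's only extra write

# Stand-ins for the sql / mongo / graph databases.
sql_db, mongo_db, graph_db = {}, {}, {}

def run_consumers():
    # In real life each consumer runs independently; here we apply
    # every event to every store in one pass for illustration.
    for event in topic:
        for db in (sql_db, mongo_db, graph_db):
            db[event["id"]] = event

publish({"id": 1, "user": "patrick"})
run_consumers()
```

The app's request path now does one append instead of three database writes; the cost is that the stores are eventually consistent rather than updated in lockstep.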

Re: Getting replicas back in sync

2014-09-12 Thread Joe Stein
Hey Stephen, two things on that. 1) You need to figure out what is the root cause making the leader election occur. Could be the brokers are having ZK timeouts and leader election is occurring as a result... if so you need to dig into why (look at all your logs... You should look for some type of fl

Re: Getting replicas back in sync

2014-09-12 Thread Stephen Sprague
I find this situation occurs frequently in my setup - only takes one day - and blam - the leader board is all skewed to a single one. Not really sure how to overcome that once it happens, so if there is a solution out there I'd be interested. On Fri, Sep 12, 2014 at 12:50 PM, Cory Watson wrote: > Wh

Re: Right Tool

2014-09-12 Thread Steve Morin
What record format are you writing to Kafka with? > On Sep 12, 2014, at 17:45, Patrick Barker wrote: > > O, I'm not trying to use it for persistence, I'm wanting to sync 3 > databases: sql, mongo, graph. I want to publish to kafka and then have it > update the db's. I'm wanting to keep this as e

Re: Right Tool

2014-09-12 Thread cac...@gmail.com
Right that makes much more sense. You will probably want to make sure that your updates are idempotent (or you could just accept the risk), though in the SQL case you could commit your offset to the DB as part of the same transaction (requires more custom stuff). Christian On Fri, Sep 12, 2014 at
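Christian's SQL suggestion - commit the consumer's offset in the same transaction as the data, so a replayed message is detected and skipped - can be sketched with sqlite. Table names, column names, and the single-partition setup here are made up for illustration:

```python
import sqlite3

# Sketch: the consumer stores its next offset in the same database, and
# commits data + offset atomically. Replaying an already-applied offset
# becomes a no-op, which makes redelivery safe (idempotent).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE offsets (part INTEGER PRIMARY KEY, next_offset INTEGER)")
conn.execute("INSERT INTO offsets VALUES (0, 0)")
conn.commit()

def apply(part, offset, row):
    cur = conn.execute("SELECT next_offset FROM offsets WHERE part = ?", (part,))
    if offset < cur.fetchone()[0]:
        return False  # already applied; skip the duplicate
    conn.execute("INSERT OR REPLACE INTO users VALUES (?, ?)",
                 (row["id"], row["name"]))
    conn.execute("UPDATE offsets SET next_offset = ? WHERE part = ?",
                 (offset + 1, part))
    conn.commit()  # data and offset commit in one transaction
    return True

applied_first = apply(0, 0, {"id": 1, "name": "patrick"})
applied_again = apply(0, 0, {"id": 1, "name": "patrick"})  # redelivery
```

As the thread notes, this trick only works where the target store is transactional (the SQL case); for mongo and graph you'd rely on the updates themselves being idempotent.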

Re: Right Tool

2014-09-12 Thread Patrick Barker
Oh, I'm not trying to use it for persistence, I'm wanting to sync 3 databases: sql, mongo, graph. I want to publish to kafka and then have it update the db's. I'm wanting to keep this as efficient as possible. On Fri, Sep 12, 2014 at 6:39 PM, cac...@gmail.com wrote: > I would say that it depends

Re: Right Tool

2014-09-12 Thread cac...@gmail.com
I would say that it depends upon what you mean by persistence. I don't believe Kafka is intended to be your permanent data store, but it would work if you were basically write once with appropriate query patterns. It would be an odd way to describe it though. Christian On Fri, Sep 12, 2014 at 4:0

Re: [Java New Producer Configuration] Maximum time spent in Queue in Async mode

2014-09-12 Thread Jun Rao
This is controlled by linger.ms in the new producer in trunk. Thanks, Jun On Thu, Sep 11, 2014 at 5:56 PM, Bhavesh Mistry wrote: > Hi Kafka team, > > How do I configure the max amount of time a message spends in the queue? In old > producer, there is property called queue.buffering.max.ms and it is not > p
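For reference, the two settings being compared look like this in a producer properties file (the values are examples only, not recommendations):

```properties
# Old (0.7/0.8 scala) producer: max time a message may sit in the async queue
queue.buffering.max.ms=5000

# New (java) producer in trunk: max time to wait for more records to batch
# before sending; small values trade throughput for latency
linger.ms=5
```

Note the semantics differ slightly: linger.ms bounds how long the new producer waits to fill a batch, which in practice caps the time a record spends buffered.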

Re: Setting log.default.flush.interval.ms and log.default.flush.scheduler.interval.ms

2014-09-12 Thread Neha Narkhede
Hemanth, Specifically, you'd want to monitor kafka:type=kafka.SocketServerStats:getMaxProduceRequestMs and kafka:type=kafka.LogFlushStats:getMaxFlushMs. If the broker is under load due to frequent flushes, it will almost certainly show up as spikes in the flush latency and consequently the produce

Re: Right Tool

2014-09-12 Thread Stephen Boesch
Hi Patrick, Kafka can be used at any scale including small ones (initially anyways). I personally ran into various issues with ZooKeeper management and a bug in deleting topics (is that fixed yet?). In any case you might try out Kafka - given it's highly performant, scalable, and flexi

Right Tool

2014-09-12 Thread Patrick Barker
Hey, I'm new to kafka and I'm trying to get a handle on how it all works. I want to integrate polyglot persistence into my application. Kafka looks like exactly what I want just on a smaller scale. I am currently only dealing with about 2,000 users, which may grow, but is kafka a good use case her

Re: Getting replicas back in sync

2014-09-12 Thread Cory Watson
What follows is a guess on my part, but here's what I *think* was happening: We hit an OOM that seems to've killed some of the replica fetcher threads. I had a mishmash of replicas that weren't making progress as determined by the JMX stats for the replica. The thread for which the JMX attribute w

Re: Getting replicas back in sync

2014-09-12 Thread Kashyap Paidimarri
We're seeing the same behaviour today on our cluster. It is not like a single broker went out of the cluster, rather a few partitions seem lazy on every broker. On Fri, Sep 12, 2014 at 9:31 PM, Cory Watson wrote: > I noticed this morning that a few of our partitions do not have their full > comp

Re: Setting log.default.flush.interval.ms and log.default.flush.scheduler.interval.ms

2014-09-12 Thread Jun Rao
One of the differences between 0.7.x and 0.8.x is that the latter does the I/O flushing in the background. So, in 0.7.x, more frequent I/O flushing will increase the producer latency. Thanks, Jun On Thu, Sep 11, 2014 at 5:48 PM, Hemanth Yamijala wrote: > Neha, > > Thanks. We are on 0.7.2. I have w

Getting replicas back in sync

2014-09-12 Thread Cory Watson
I noticed this morning that a few of our partitions do not have their full complement of ISRs:

Topic:migration  PartitionCount:16  ReplicationFactor:3  Configs:retention.bytes=32985348833280
  Topic: migration  Partition: 0  Leader: 1  Replicas: 1,4,5  Isr: 1,5,4
  Topic: migration  Partition: 1  Leader: 1  Rep

Re: Dynamic partitioning

2014-09-12 Thread Joe Stein
That command will change how many partitions the topic has. What you are looking for I think is https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool which allows you to change what partitions are running on which replicas and which replicas
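A sketch of how the reassignment tool referenced above is typically driven (the topic name, broker ids, and ZooKeeper address are placeholders taken from this thread, not a recommendation):

```shell
# reassignment.json says which brokers should hold each partition's replicas
cat > reassignment.json <<'EOF'
{"version": 1, "partitions": [
  {"topic": "test", "partition": 0, "replicas": [1, 2]}
]}
EOF

bin/kafka-reassign-partitions.sh --zookeeper zk.net:2181/stream \
  --reassignment-json-file reassignment.json --execute
```

Re-running with --verify in place of --execute reports whether the reassignment has completed.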

Dynamic partitioning

2014-09-12 Thread István
Hi all, My understanding is that with 0.8.1.x you can manually change the number of partitions on the broker, and this change is going to be picked up by the producers and consumers (high level). kafka-topics.sh --alter --zookeeper zk.net:2181/stream --topic test --partitions 3 Is that the case?

Re: Kafka High Level Consumer

2014-09-12 Thread Joe Stein
You want to use the createMessageStreamsByFilter and pass in a WhiteList with a regex that would include everything you want... here is e.g. how to use that https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/consumer/ConsoleConsumer.scala#L196 /**
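The whitelist semantics can be sketched outside Kafka: the consumer subscribes by regex, and any topic whose name matches is included, so newly created matching topics get picked up without being listed explicitly. The topic names below are invented examples:

```python
import re

# Sketch of whitelist-style topic selection: a regex decides which topics
# a consumer reads. Matching is anchored at the start of the topic name,
# mirroring how a whitelist pattern selects topics.
def matching_topics(pattern, topics):
    whitelist = re.compile(pattern)
    return [t for t in topics if whitelist.match(t)]

topics = ["orders", "orders.audit", "clicks", "internal.metrics"]
selected = matching_topics(r"orders.*", topics)
```

A pattern like ".*" would select every topic, which is what the question in this thread is after.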

Kafka High Level Consumer

2014-09-12 Thread Rahul Mittal
Hi, Is there a way in kafka for a consumer group to read data from all topics dynamically, without specifying the topics? That is, if new topics are created on the kafka brokers, the consumer group should figure it out and start reading from the new topic as well without explicitly defining new topic