Re: Using Kafka without persisting message to disk

2016-07-14 Thread Sharninder Khera
I'd second Tom here. Create a ram disk or just let Kafka write to disk. Use compression and batch messages and the OS fscache would take care of the rest. Kafka is pretty fast and you probably won't notice.  _ From: Tom Crayford

Re: Deleting a topic on the 0.8x brokers

2016-07-14 Thread Harsha Chintalapani
One way to delete is to delete the topic partition directories from disks and delete /broker/topics. If you just shutdown those brokers controller might try to replicate the topic onto brokers and since you don't have any leaders you might replica fetcher errors in the logs. Thanks, Harsha On

Re: Kafka Streams reducebykey and tumbling window - need final windowed KTable values

2016-07-14 Thread Srikanth
You are right. We are more likely to be interested in value when window expires and sometimes when retention limit is reached. I lost my "time sense" when I read the last email! I guess we can query a 12:00:00 window at 12:00:05(5 sec window). That will be some sort of poll(loop) as opposed to a

Re: KTable DSL join

2016-07-14 Thread Srikanth
I was looking for KTable-KTable semantics where both trigger updates. The result will be used to enrich a few KStreams. I'll keep an eye on this jira. Meanwhile, I'll use custom processor or like you said convert it to KStream-KTable join and continue with my test. Srikanth On Thu, Jul 14,

How to manually bring offline partition back?

2016-07-14 Thread Jun MA
Hi all, We have some partitions that only have 1 replication, and after the broker failed, partitions on that broker becomes unavailable. We set unclean.leader.election.enable=false, so the controller doesn’t bring those partitions back online even after that broker is up. We tried Preferred

Deleting a topic on the 0.8x brokers

2016-07-14 Thread Rajiv Kurian
We plan to stop using a particular Kafka topic running on a certain subset of a 0.82x cluster. This topic is served by 9 brokers (leaders + replicas) and these 9 brokers have no other topics on them. Once we have stopped sending and consuming traffic from this topic (and hence the 9 brokers) what

Re: Read all record from a Topic.

2016-07-14 Thread Jean-Baptiste Silvy
Thanks a lot James for the detailed article. This is exactly what I was looking for! This confirm me to use the seekToEnd method (option #5 in your post), as I'm using Kafka 0.10. Jean-Baptiste On 16-07-13 05:45 PM, James Cheng wrote: Jean-Baptiste, I wrote a blog post recently on this

Re: Using Kafka without persisting message to disk

2016-07-14 Thread Tom Crayford
Hi Jack, No, kafka doesn't support not writing to disk. If you're really 100% sure of yourself you could use a ramdisk and mount Kafka on it, but that's not supported. I'd recommend "just" writing to disk, it's plenty fast enough for nearly all use cases. Thanks Tom Crayford Heroku Kafka On

Using Kafka without persisting message to disk

2016-07-14 Thread Jack Huang
Hi all, Is there a way to make a topic to be stored in memory only and not writing to disk? If not, what's the best way to minimize writing to disk? For this application we only need the notion of partitions and a short retention time (1hr or so) from Kafka. We want to use Kafka because we want
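As the replies in this thread note, Kafka cannot skip the disk entirely; the closest approximations are pointing the log directories at a tmpfs mount (unsupported, and data is lost on reboot) and keeping retention short so little data accumulates. A hedged sketch of broker settings — the mount path and exact values here are illustrative, not a supported configuration:

```properties
# Illustrative server.properties fragment, not a supported setup.
# Point log dirs at a tmpfs mount so writes never reach a physical disk:
log.dirs=/mnt/kafka-tmpfs
# Keep roughly one hour of data, checked every minute:
log.retention.ms=3600000
log.retention.check.interval.ms=60000
```

With compression and batching on the producer side, the OS page cache absorbs most of the write load even on a real disk, which is why the replies suggest measuring before reaching for a ramdisk.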

Re: Kafka Streams reducebykey and tumbling window - need final windowed KTable values

2016-07-14 Thread Guozhang Wang
Hi Srikanth, In you do not care about the intermediate results but only want to query the results when the window is no longer retained, you can consider just querying the state stores at the time that the window is about to "expire" (i.e. it will no longer be retained). For example, with a

Re: KTable DSL join

2016-07-14 Thread Matthias J. Sax
Sorry that I can only point to a Jira right now :/ About your CustomProcessor. I doubt that this will work: CustomProcessor#init() will be executed after you call KafkaStream.start(); Thus, your table2.foreach will be executed after the topology was built and I guess, will just not have any

Re: Java 0.9.0.1 Consumer Does not failover

2016-07-14 Thread Michael Freeman
For future reference server the following is needed offsets.topic.replication.factor=3 Michael > On 14 Jul 2016, at 10:56, Michael Freeman wrote: > > Anyone have any ideas? Looks like the group coordinator is not failing over. > Or at least not detected by the Java

Re: KTable DSL join

2016-07-14 Thread Srikanth
Ah, I was hoping for a magic solution not a jira! Another thought was to embed table2 as KTable in custom processor and use that for lookup. class CustomProcesser(kStreamBuilder: KStreamBuilder) extends Processor[Int, Int] { private var kvStore: KeyValueStore[Int, Int] = _ private val table2

Re: KTable DSL join

2016-07-14 Thread Matthias J. Sax
My bad... I should have considered this in the first place. You are absolutely right. Supporting this kind of a join is work in progress. https://issues.apache.org/jira/browse/KAFKA-3705 Your custom solution (Option 1) might work... But as you mentioned, the problem will be that the first table

Re: KTable DSL join

2016-07-14 Thread Srikanth
I should have mentioned that I tried this. It worked in other case but will not work for this one. I'm pasting a sample from table1 that I gave in my first email. Table1 111 -> aaa 222 -> bbb 333 -> aaa Here value is not unique(aaa). So, I can't just make it a key. 333 will then override

Re: Kafka Streams reducebykey and tumbling window - need final windowed KTable values

2016-07-14 Thread Srikanth
Michael, > This allows Kafka Streams to retain old window buckets for a period of time in order to wait for the late arrival of records whose timestamps fall within the window interval. If a record arrives after the retention period has passed, the record cannot be processed and is dropped. So,

Re: KTable DSL join

2016-07-14 Thread Matthias J. Sax
You will need to set a new key before you do the re-partitioning. In your case, it seems you want to switch key and value. This can be done with a simple map > table1.toStream() > .map(new KeyValueMapper() { > public KeyValue apply(K key, V value) { >

Re: KTable DSL join

2016-07-14 Thread Srikanth
Michael, Thanks! Looking forward to the update. An interface like KTable is very conducive for joins. Hopefully, it will get more flexible. Srikanth On Thu, Jul 14, 2016 at 4:35 AM, Michael Noll wrote: > Srikant, > > > Its not a case for join() as the keys don't match.

Re: KTable DSL join

2016-07-14 Thread Srikanth
Matthias, With option 2, how would we perform join after re-partition. Although we re-partitioned with value, the key doesn't change. KTable joins always use keys and ValueJoiner get values from both table when keys match. Having data co-located will not be sufficient rt?? Srikanth On Thu, Jul

Re: __consumer_offsets rebalance

2016-07-14 Thread Anderson Goulart
Hi, Thanks for your help. It is part of our plan to move forward to 0.10, but we need to do it one step at a time. And rebalancing topics, including __consumer_offsets, are on the top as we had incidents associated with it. -- Anderson On 14/07/2016 13:59, Tom Crayford wrote: Also note

Re: KTable DSL join

2016-07-14 Thread Avi Flax
On 7/14/16, 04:35, "Michael Noll" wrote: > Also, a heads up: It turned out that user questions around joins in Kafka > Streams are pretty common. We are currently working on improving the > documentation for joins to make this more clear. Excellent!

Re: __consumer_offsets rebalance

2016-07-14 Thread Tom Crayford
Also note that there were a number of bugs fixed in the log cleaner thread between 0.8 and the latest release. I wouldn't be comfortable relying on kafka committed offsets on a version under 0.10 for a new production system, and would carefully consider an upgrade all the way to the latest

Coordinator failover

2016-07-14 Thread Michael Freeman
I'm running a three broker cluster. Do I need to have offsets.topic.replication.factor=3 set in order for co ordinator failover to occur? Michael

Re: __consumer_offsets rebalance

2016-07-14 Thread Todd Palino
It's safe to move the partitions of the offsets topic around. You'll move the consumer coordinators as you do, however, so the one thing you want to make sure of, especially running an older version, is that log compaction is working on your brokers and those partitions have been compacted. The

__consumer_offsets rebalance

2016-07-14 Thread Anderson Goulart
Hi, I am running kafka 0.8.2.1 under aws instances with multiple availability zones. As we want a rack aware partition replication, we have our own partition layout distribution, to make sure all partitions are well balanced between nodes, leaders and availability zones. The problem arises

Re: Java 0.9.0.1 Consumer Does not failover

2016-07-14 Thread Michael Freeman
Anyone have any ideas? Looks like the group coordinator is not failing over. Or at least not detected by the Java consumer. A new leader is elected so I'm at a loss. Michael > On 13 Jul 2016, at 20:58, Michael Freeman wrote: > > Hi, > I'm running a Kafka cluster

Re: KStream-to-KStream Join Example

2016-07-14 Thread Michael Noll
Vivek, another option for you is to replace the `map()` calls with `mapValues()`. Can you give that a try? Background: You are calling `map()` on your two input streams, but in neither of the two cases are you actually changing the key (it is always the userId before the map, and it stays the

Re: Kafka Streams reducebykey and tumbling window - need final windowed KTable values

2016-07-14 Thread Michael Noll
Srikanth, > This would be useful in place where we use a key-value store just to > duplicate a KTable for get() operations. > Any rough idea when this is targeted for release? We are aiming to add the queryable state feature into the next release of Kafka. > Its still not clear how to use this

Re: KTable DSL join

2016-07-14 Thread Michael Noll
Srikant, > Its not a case for join() as the keys don't match. Its more a lookup table. Yes, the semantics of streaming joins in Kafka Streams are bit different from joins in traditional RDBMS. See http://docs.confluent.io/3.0.0/streams/developer-guide.html#joining-streams. Also, a heads up: It

Re: KStream-to-KStream Join Example

2016-07-14 Thread Matthias J. Sax
Hi, Both streams need to be co-partitioned, ie, if you change the key of one join input, you need to re-partitioned this stream on the new key via .through(). You should create the topic you use in through() manually, before you start your Kafka Streams application. (For future release, this

Re: KTable DSL join

2016-07-14 Thread Matthias J. Sax
I would recommend re-partitioning as described in Option 2. -Matthias On 07/13/2016 10:53 PM, Srikanth wrote: > Hello, > > I'm trying the following join using KTable. There are two change log tables. > Table1 > 111 -> aaa > 222 -> bbb > 333 -> aaa > > Table2 > aaa -> 999 > bbb -> 888