Re: Convert a KStream to KTable

2016-10-07 Thread Matthias J. Sax
That is correct. On 10/7/16 8:03 PM, Elias Levy wrote: > Am I correct in assuming there is no way to convert a KStream into > a KTable, similar to KTable.toStream() but in the reverse > direction, other than using KStream.reduceByKey and a Reducer
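For reference, a minimal sketch of the reduceByKey route on the 0.10.0-era Streams API (topic and store names below are illustrative, not from the thread):

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;

    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> stream =
        builder.stream(Serdes.String(), Serdes.String(), "input-topic");

    // Keep only the latest value per key, turning the changelog-like
    // stream into a table.
    KTable<String, String> table = stream.reduceByKey(
        (oldValue, newValue) -> newValue,
        Serdes.String(), Serdes.String(),
        "to-table-store");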

I found kafka lost message

2016-10-07 Thread yangyuqi
Hello everybody, I built a Kafka cluster (5 nodes) using kafka_2.11-0.10.0.0 and kafka-python-1.3.1. I created a topic with 100 partitions and a replication factor of 2, then used 20 consumers to receive messages. But I found that Kafka sometimes loses messages! I found some partitions' offsets lost at

Convert a KStream to KTable

2016-10-07 Thread Elias Levy
Am I correct in assuming there is no way to convert a KStream into a KTable, similar to KTable.toStream() but in the reverse direction, other than using KStream.reduceByKey and a Reducer or looping back through Kafka and using KStreamBuilder.table?

Re: Deleting a message after all consumers have consumed it

2016-10-07 Thread Ali Akhtar
Also, you can set a retention period and have messages get auto-deleted after a certain time (default 1 week). On Sat, Oct 8, 2016 at 3:21 AM, Hans Jespersen wrote: > Kafka doesn’t work that way. Kafka is “Publish-subscribe messaging > rethought as a distributed commit log”.

Re: [VOTE] 0.10.1.0 RC0

2016-10-07 Thread Jason Gustafson
Quick update: I'm planning to cut a new RC on Monday due to https://issues.apache.org/jira/browse/KAFKA-4274. If you discover any new problems in the meantime, let me know on this thread. Thanks, Jason On Fri, Oct 7, 2016 at 9:36 AM, Vahid S Hashemian wrote: >

Re: Deleting a message after all consumers have consumed it

2016-10-07 Thread Hans Jespersen
Kafka doesn’t work that way. Kafka is “Publish-subscribe messaging rethought as a distributed commit log”. The messages in the log do not get deleted just because "all" clients have consumed the messages. Besides, you could always have a late-joining consumer come along, and if you mistakenly

Re: In Kafka Streaming, Serdes should use Optionals

2016-10-07 Thread Ali Akhtar
Hey G, Looks like the only difference is a valueSerde parameter. How does that prevent having to look for nulls in the consumer? E.g., I wrote a custom Serde which converts the messages (which are JSON strings) into a Java class using Jackson. If the JSON parse fails, it sends back a null.
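A minimal sketch of the kind of deserializer Ali describes (the Event POJO is hypothetical):

    import java.util.Map;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.apache.kafka.common.serialization.Deserializer;

    // Event is a stand-in for any Jackson-mappable POJO.
    public class EventDeserializer implements Deserializer<Event> {
        private final ObjectMapper mapper = new ObjectMapper();

        @Override
        public void configure(Map<String, ?> configs, boolean isKey) {}

        @Override
        public Event deserialize(String topic, byte[] data) {
            try {
                return mapper.readValue(data, Event.class);
            } catch (Exception e) {
                // A parse failure surfaces as a null that every
                // downstream consumer must remember to check for.
                return null;
            }
        }

        @Override
        public void close() {}
    }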

Deleting a message after all consumers have consumed it

2016-10-07 Thread Hysi, Lorenc
Hello, Thank you for your time. I wanted to ask whether it's possible to remove a message from a topic after making sure all consumers have gotten it. If so, what is the best way to achieve this? Also, how do I make sure that all consumers have received a message? Any way to do this in

punctuate() bug

2016-10-07 Thread David Garcia
OK, I found the bug. Basically, if there is an empty topic (in the list of topics being consumed), any partition-group with partitions from that topic will always return -1 as the smallest timestamp (see PartitionGroup.java). To reproduce, simply start a KStreams consumer with one or more empty

Data loss when ack != -1

2016-10-07 Thread Justin Lin
Hi everyone, I am currently running Kafka 0.8.1.1 in a cluster with 6 brokers, and I set the replication factor to 3. My producer sets the ack to 2 when producing messages. I recently came across a bad situation where I had to reboot one broker machine by cutting the power, and that caused data
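The thread title points at the usual remedy: acks of -1 (all). A sketch with the old 0.8.x producer API this setup implies (broker list and serializer are illustrative):

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.ProducerConfig;

    Properties props = new Properties();
    props.put("metadata.broker.list", "broker1:9092,broker2:9092");
    props.put("serializer.class", "kafka.serializer.StringEncoder");
    // -1 waits for all in-sync replicas. With ack=2 out of 3 replicas,
    // a hard broker failure plus a leader election can still drop
    // messages that only the failed replicas had.
    props.put("request.required.acks", "-1");

    Producer<String, String> producer = new Producer<>(new ProducerConfig(props));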

Re: In Kafka Streaming, Serdes should use Optionals

2016-10-07 Thread Guozhang Wang
Hello Ali, We do have corresponding overloaded functions for most KStream / KTable operators to avoid forcing users to specify "null"; in these cases the default serdes specified in the configs are then used. For example: KTable aggregate(Initializer initializer,

Re: punctuate() never called

2016-10-07 Thread David Garcia
Yeah, this is possible. We have run the application (and have confirmed data is being received) for over 30 mins…with a 60-second timer. So, do we need to just rebuild our cluster with bigger machines? -David On 10/7/16, 11:18 AM, "Michael Noll" wrote: David,

Re: [VOTE] 0.10.1.0 RC0

2016-10-07 Thread Vahid S Hashemian
Jason, Sure, I'll submit a patch for the trivial changes in the quick start. Do you recommend adding Windows versions of the commands along with the current commands? I'll also open a JIRA for the new consumer issue. --Vahid

Re: Same leader for all partitions for topics

2016-10-07 Thread David Garcia
Maybe someone already answered this…but you can use the repartitioner to fix that (it’s included with Kafka) As far as root cause, you probably had a few leader elections due to excessive latency. There is a cascading scenario that I noticed Kafka is vulnerable to. The events transpire as

Re: In Kafka Streaming, Serdes should use Optionals

2016-10-07 Thread Michael Noll
Ali, the Apache Kafka project still targets Java 7, which means we can't use Java 8 features just yet. FYI: there's an ongoing conversation about when Kafka would move from Java 7 to Java 8. On Fri, Oct 7, 2016 at 6:14 PM, Ali Akhtar wrote: > Since we're using Java 8 in

Re: punctuate() never called

2016-10-07 Thread Michael Noll
David, punctuate() is still data-driven at this point, even when you're using the WallClock timestamp extractor. To use an example: Imagine you have configured punctuate() to be run every 5 seconds. If there's no data being received for a minute, then punctuate won't be called -- even though
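To make that concrete, a minimal Processor sketch against the 0.10.0 API:

    import org.apache.kafka.streams.processor.AbstractProcessor;
    import org.apache.kafka.streams.processor.ProcessorContext;

    public class PeriodicProcessor extends AbstractProcessor<String, String> {
        @Override
        public void init(ProcessorContext context) {
            super.init(context);
            // Ask for punctuate() every 5 seconds of *stream* time; time
            // only advances as new records (and their timestamps) arrive.
            context.schedule(5000L);
        }

        @Override
        public void process(String key, String value) {
            // Per-record work; each record may advance stream time.
        }

        @Override
        public void punctuate(long timestamp) {
            // Periodic work; never invoked while no records are flowing,
            // regardless of which timestamp extractor is configured.
        }
    }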

In Kafka Streaming, Serdes should use Optionals

2016-10-07 Thread Ali Akhtar
Since we're using Java 8 in most cases anyway, Serdes / Serializers should use Optionals, to avoid having to deal with the lovely nulls.

Re: [VOTE] 0.10.1.0 RC0

2016-10-07 Thread Jason Gustafson
> > I suggest not having a "Fix version" set for issues that don't fix anything > (it's not part of any release really). Yeah, good call. On Fri, Oct 7, 2016 at 8:59 AM, Ismael Juma wrote: > On Fri, Oct 7, 2016 at 4:56 PM, Jason Gustafson > wrote: > > >

Re: Printing to stdout from KStreams?

2016-10-07 Thread Michael Noll
Nice idea. :-) Happy to hear it works for you, and thanks for sharing your workaround with the mailing list. On Fri, Oct 7, 2016 at 5:33 PM, Ali Akhtar wrote: > Thank you. > > I've resolved this by adding a run config in Intellij for running > streams-reset, and using the

punctuate() never called

2016-10-07 Thread David Garcia
Hello, I’m sure this question has been asked many times. We have a test-cluster (confluent 3.0.0 release) of 3 aws m4.xlarges. We have an application that needs to use the punctuate() function to do some work on a regular interval. We are using the WallClock extractor. Unfortunately, the

Re: [VOTE] 0.10.1.0 RC0

2016-10-07 Thread Jason Gustafson
@Vahid Thanks, do you want to submit a patch for the quickstart fixes? We won't need another RC if it's just doc changes. The exception is a little more troubling. Perhaps open a JIRA and we can begin investigation? It's especially strange that you say it's specific to the new consumer. @Henry

Re: [VOTE] 0.10.1.0 RC0

2016-10-07 Thread Ismael Juma
On Fri, Oct 7, 2016 at 4:56 PM, Jason Gustafson wrote: > @Vahid Thanks, do you want to submit a patch for the quickstart fixes? We > won't need another RC if it's just doc changes. The exception is a little > more troubling. Perhaps open a JIRA and we can begin investigation?

Re: Printing to stdout from KStreams?

2016-10-07 Thread Ali Akhtar
Thank you. I've resolved this by adding a run config in Intellij for running streams-reset, and using the same application id in all applications in development (transparently reading the application id from environment variables, so in my kubernetes config I can specify different app ids for
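The environment-variable part of that workaround amounts to something like this (the APP_ID variable name is illustrative):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    // One application id shared across all apps in development, overridden
    // per deployment via the environment (e.g. a Kubernetes pod spec).
    props.put(StreamsConfig.APPLICATION_ID_CONFIG,
              System.getenv().getOrDefault("APP_ID", "my-app-dev"));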

Problems encountered during consumer shutdown.

2016-10-07 Thread Maria Khan
I am using Spring Integration with the Kafka inbound channel adapter. While closing the consumer app, I close the adapter and call Metrics.defaultRegistry().shutdown(). I get the following exception: SEVERE [ContainerBackgroundProcessor[StandardEngine[Catalina]]]

Kafka consumer not getting old messages

2016-10-07 Thread Ajinkya Joshi
Hi, My Kafka consumer is not receiving older messages which were published prior to the consumer booting up. Setup: I am using the new consumer paradigm with Kafka 0.10; this is the only consumer in that consumer group; log.retention.hours=168. Following is the output on
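The thread does not include an answer, but with the new consumer the first thing to check is usually auto.offset.reset, which defaults to "latest"; a sketch, assuming that default is the culprit:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
    // Without this, a group with no committed offsets starts at the log
    // tail and never sees messages published before the consumer booted.
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
        props, new StringDeserializer(), new StringDeserializer());
    consumer.subscribe(Collections.singletonList("my-topic"));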

Re: Printing to stdout from KStreams?

2016-10-07 Thread Michael Noll
> Is it possible to have kafka-streams-reset be automatically called during > development? Something like streams.cleanUp() but which also does reset? Unfortunately this isn't possible (yet), Ali. I am also not aware of any plan to add such a feature in the short-term. On Fri, Oct 7, 2016 at

Re: kafka stream to new topic based on message key

2016-10-07 Thread Michael Noll
Great, happy to hear that, Gary! On Fri, Oct 7, 2016 at 3:30 PM, Gary Ogden wrote: > Thanks for all the help gents. I really appreciate it. It's exactly what I > needed. > > On 7 October 2016 at 06:56, Michael Noll wrote: > > > Gary, > > > > adding to

Re: kafka stream to new topic based on message key

2016-10-07 Thread Gary Ogden
Thanks for all the help gents. I really appreciate it. It's exactly what I needed. On 7 October 2016 at 06:56, Michael Noll wrote: > Gary, > > adding to what Guozhang said: Yes, you can programmatically create a new > Kafka topic from within your application. But how

Re: offset topics growing huge

2016-10-07 Thread Tom Crayford
On Mon, Oct 3, 2016 at 5:38 PM, Ali Akhtar wrote: > Newbie question, but what exactly does log.cleaner.enable=true do, and how > do I know if I need to set it to be true? > If you're using any compacted topics (including __consumer_offsets), it needs to be on. > > Also,

Re: Delayed Queue usecase

2016-10-07 Thread Tom Crayford
Kafka doesn't support time delays at all, no. On Thu, Oct 6, 2016 at 12:14 AM, Akshay Joglekar < akshay.jogle...@danalinc.com> wrote: > Hi, > > I have a use case where I need to process certain messages only after a > certain amount time has elapsed. Does Kafka have any support for time >

Using topic or partition for large size of consumers (subscribers)?

2016-10-07 Thread jupiter
Hi, I can run a producer to send data to a particular partition within a topic by specifying the topic ID and the partition ID, then a consumer can receive a particular partition within a topic by subscribing with the topic ID and partition ID. If I have n consumers, each subscribes to a
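Both halves of that setup in sketch form (topic name and partition number are illustrative):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    Properties common = new Properties();
    common.put("bootstrap.servers", "localhost:9092");

    // Producer side: target partition 3 of the topic explicitly.
    KafkaProducer<String, String> producer = new KafkaProducer<>(
        common, new StringSerializer(), new StringSerializer());
    producer.send(new ProducerRecord<>("my-topic", 3, "key", "value"));

    // Consumer side: assign() pins this consumer to that one partition,
    // bypassing consumer-group rebalancing entirely.
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
        common, new StringDeserializer(), new StringDeserializer());
    consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 3)));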

Re: Printing to stdout from KStreams?

2016-10-07 Thread Ali Akhtar
Is it possible to have kafka-streams-reset be automatically called during development? Something like streams.cleanUp() but which also does reset? On Fri, Oct 7, 2016 at 2:45 PM, Michael Noll wrote: > Ali, > > adding to what Matthias said: > > Kafka 0.10 changed the

Re: kafka stream to new topic based on message key

2016-10-07 Thread Michael Noll
Gary, adding to what Guozhang said: Yes, you can programmatically create a new Kafka topic from within your application. But how you'd do that varies a bit between current Kafka versions and the upcoming 0.10.1 release. As of today (Kafka versions before the upcoming 0.10.1 release), you would
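For the pre-0.10.1 route, a rough sketch using the internal AdminUtils class (it is internal Scala code, so the signature varies between versions; the RackAwareMode argument shown is the 0.10.0 form, and partition/replication counts are illustrative):

    import java.util.Properties;
    import kafka.admin.AdminUtils;
    import kafka.admin.RackAwareMode;
    import kafka.utils.ZkUtils;

    ZkUtils zkUtils = ZkUtils.apply("localhost:2181", 30000, 30000, false);
    // 4 partitions, replication factor 2.
    AdminUtils.createTopic(zkUtils, "my-topic", 4, 2,
                           new Properties(), RackAwareMode.Enforced$.MODULE$);
    zkUtils.close();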

Re: Printing to stdin from KStreams?

2016-10-07 Thread Michael Noll
Ali, adding to what Matthias said: Kafka 0.10 changed the message format to add so-called "embedded timestamps" to each Kafka message. The Java producer that ships with Kafka 0.10 embeds such a timestamp in every message it generates, as expected. However, other clients (like the go kafka
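If the embedded timestamps really are missing or invalid (as they can be with non-Java clients), one workaround is to fall back to processing-time semantics via the wall-clock extractor; a sketch against the 0.10.0 config, where props is the Streams configuration Properties:

    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.processor.WallclockTimestampExtractor;

    // Ignore the embedded record timestamps and use wall-clock time.
    props.put(StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG,
              WallclockTimestampExtractor.class.getName());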

Same leader for all partitions for topics

2016-10-07 Thread Misra, Rahul
Hi, I have been using a 3-node Kafka cluster for development for some time now. I have created some topics on this cluster. Yesterday I observed the following when I used 'describe' for the topics. The Kafka version I'm using is 0.9.0.1 (kafka_2.11-0.9.0.1). Topic:topicIc PartitionCount:3

Re: kafka stream to new topic based on message key

2016-10-07 Thread Guozhang Wang
If you can create a ZK client inside your processor implementation then you can definitely create topics by talking to ZK directly; it's just that the Kafka Streams public interface does not expose any more efficient way beyond that for now. Note that in KIP-4 we are trying to introduce the admin