Re: Do consumer offsets stored in zookeeper ever get cleaned up?

2016-06-04 Thread Ewen Cheslack-Postava
You would do this manually with the ConsumerGroupCommand (which also allows you to do deletion of offsets just by topic). -Ewen On Thu, May 19, 2016 at 4:16 PM, James Cheng wrote: > I know that when offsets get stored in Kafka, they get cleaned up based on > the

Re: Suggestions of pulling local application logs into Kafka

2016-06-04 Thread Ewen Cheslack-Postava
Kafka Connect can definitely be used for this -- it's one of the reasons we designed it with standalone mode ( http://docs.confluent.io/3.0.0/connect/userguide.html#workers). For the specific connector, we include a very simple File connector with Kafka which will just take each line and send it

Re: Kafka Connect: fork process from a SinkTask ?

2016-06-04 Thread Ewen Cheslack-Postava
I can't think of anything that would break except that your connector may not be able to run in some environments if certain syscalls are restricted. -Ewen On Wed, May 11, 2016 at 6:05 PM, Dean Arnold wrote: > I need to run an external filter program from a SinkTask. Is

Re: Resetting the Offset of a Kafka Sink Connector

2016-06-04 Thread Ewen Cheslack-Postava
Connectors don't perform any data copying and don't rewind offsets -- that's the job of Tasks. In your SinkTask implementation you have access to the SinkTaskContext via its context field. -Ewen On Tue, May 31, 2016 at 9:47 AM, Jack Lund wrote: > Yes, the one

Re: Kafka Windows Support

2016-06-04 Thread Ewen Cheslack-Postava
Microsoft runs Kafka on Windows at large scale: https://twitter.com/nehanarkhede/status/667903877769891840 -Ewen On Tue, May 17, 2016 at 7:20 PM, Murthy Kakarlamudi wrote: > Hello, > Have a question about installing Kafka on Windows. Our server farm is > totally Windows

Re: Yet another .NET client

2016-06-04 Thread Ewen Cheslack-Postava
Added to the clients page here: https://cwiki.apache.org/confluence/display/KAFKA/Clients Thanks! -Ewen On Wed, Jun 1, 2016 at 7:45 AM, Serge Danzanvilliers < serge.danzanvilli...@gmail.com> wrote: > Hi, > > Criteo has open sourced its Kafka .NET client. The driver focuses on the > producer but

Re: kafka connect - fetch avro data from the SinkRecord put method

2016-06-04 Thread Ewen Cheslack-Postava
There isn't currently a way to get at the intermediate Avro formatted data -- the point of Connect's generic data API is to decouple the connector implementations from the details of (de)serialization. This allows connectors to work with data written to Kafka in a variety of data formats without

Re: Are key.converter.schemas.enable and value.converter.schemas.enable of any use in Kafka connector?

2016-06-04 Thread Ewen Cheslack-Postava
key.converter and value.converter are namespace prefixes in this case. These settings are used by the JsonConverter https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L53 If schemas are enabled, all JSON messages are sent using an
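To illustrate what "namespace prefix" means here, the following is a minimal sketch in plain Java. It is not Connect's actual code: the class and method names are hypothetical, and the settings map is only an example. It shows how entries under the value.converter. prefix could be separated out and handed to the converter with the prefix removed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; Kafka Connect's own implementation differs.
class ConverterPrefixSketch {

    // Returns the entries of `config` whose keys start with `prefix`,
    // with the prefix removed from each key.
    static Map<String, String> withPrefixStripped(Map<String, String> config, String prefix) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(prefix) && key.length() > prefix.length()) {
                result.put(key.substring(prefix.length()), entry.getValue());
            }
        }
        return result;
    }

    // Example worker settings (values chosen for illustration).
    static Map<String, String> exampleWorkerConfig() {
        Map<String, String> config = new HashMap<>();
        config.put("key.converter", "org.apache.kafka.connect.json.JsonConverter");
        config.put("key.converter.schemas.enable", "false");
        config.put("value.converter", "org.apache.kafka.connect.json.JsonConverter");
        config.put("value.converter.schemas.enable", "true");
        return config;
    }

    public static void main(String[] args) {
        // Only the settings namespaced under "value.converter." are selected.
        System.out.println(withPrefixStripped(exampleWorkerConfig(), "value.converter."));
    }
}
```

In this sketch the converter for values would see only a schemas.enable entry, while the key.converter.* settings stay in their own namespace.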

Re: Does the Kafka Streams DSL support non-Kafka sources/sinks?

2016-06-04 Thread Ewen Cheslack-Postava
And to add yet more, re: usage of Connect. You're right that the custom WebSocket API would require a custom connector. I'd still suggest considering it: it takes care of all the Kafka pieces for you, so all you need to do is write the WebSocket API adapter. For the database side, custom schemas

Re: Schema registry question

2016-06-04 Thread Ewen Cheslack-Postava
In any case, to answer the question, I think there's just an omission in the docs. Getting by subject + version (GET /subjects/{subject}/versions/{version} - http://docs.confluent.io/3.0.0/schema-registry/docs/api.html#get--subjects-%28string-%20subject%29-versions-%28versionId-%20version%29) also
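As a rough sketch of the request path mentioned above, the following Java builds the URL for GET /subjects/{subject}/versions/{version}. The base URL, subject name, and class name are assumptions for illustration, and the sketch assumes the subject contains only URL-safe characters; it only constructs the address and does not send a request.

```java
import java.net.URI;

// Hypothetical helper; names and values are illustrative, not from the original thread.
class SchemaBySubjectVersion {

    // Builds the URI for fetching one version of a subject's schema.
    // Assumes `subject` needs no URL encoding.
    static URI schemaUri(String baseUrl, String subject, int version) {
        return URI.create(baseUrl + "/subjects/" + subject + "/versions/" + version);
    }

    public static void main(String[] args) {
        // "http://localhost:8081" and "my-topic-value" are placeholder values.
        System.out.println(schemaUri("http://localhost:8081", "my-topic-value", 1));
    }
}
```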

Re: Kafka behind a load balancer

2016-06-04 Thread Ewen Cheslack-Postava
Note, however, that a load balancer can be useful for bootstrapping purposes, i.e. use it for the bootstrap.servers setting to have a single consistent value for the setting but allow the broker list to change over time. From there, as Tom says, it'll start using broker hostnames and automatically
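A minimal sketch of the bootstrapping idea above, using only java.util.Properties: the client configuration names a single load-balanced address in bootstrap.servers instead of a list of brokers. The hostname kafka-vip.example.com:9092 and the class name are hypothetical.

```java
import java.util.Properties;

// Hypothetical illustration; the VIP address is a placeholder.
class BootstrapViaLoadBalancer {

    // Builds client properties whose only bootstrap address is the load balancer's VIP.
    static Properties clientProps(String vipHostPort) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", vipHostPort);
        return props;
    }

    public static void main(String[] args) {
        Properties props = clientProps("kafka-vip.example.com:9092");
        // The same value can stay in place even when the set of brokers behind the VIP changes.
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```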

Re: macbook air and kafka

2016-06-04 Thread Ewen Cheslack-Postava
Connect and Streams are both Java and compile very quickly -- almost all build time is in Scala. There are other things that can affect this too and may be one-offs, e.g. Gradle building its caches can be slow, but after the first build it is incremental and cheaper. -Ewen On Sat, May 28, 2016 at

Re: Kafka behind a load balancer

2016-06-04 Thread Todd Palino
Yep, what Ewen said. We have all of our Kafka clusters behind hardware load balancers. Producers (and eventually consumers, once we switch to the new consumer) get configured with those VIPs. It’s better than providing a list of brokers for the cluster, because we often change the particular

Maintaining message ordering using KafkaSpout/Bolt

2016-06-04 Thread Kanagha
Hi, I'm looking at the documentation for using KafkaSpout/KafkaBolt. https://github.com/apache/storm/tree/master/external/storm-kafka How is ordering guaranteed while reading messages from Kafka using KafkaSpout? Does the parallelism_hint set when a KafkaSpout is added to a topology need to