Re: Http Kafka producer

2015-08-26 Thread Ewen Cheslack-Postava
schema registry ? --regards Hemanth -Original Message- From: Ewen Cheslack-Postava [mailto:e...@confluent.io] Sent: Thursday, August 27, 2015 9:14 AM To: users@kafka.apache.org Subject: Re: Http Kafka producer Hemanth, Can you be a bit more specific about your setup? Do you have

Re: 0.8.2 consumer api docs

2015-09-06 Thread Ewen Cheslack-Postava
Those APIs are not implemented in 0.8.2. They were included because the APIs were being iterated on, but the implementation wasn't there yet. You can expect to see those APIs (with some modifications as they've been refined) in 0.8.3. -Ewen On Sun, Sep 6, 2015 at 10:42 AM, Phil Steitz

Re: Question regarding to reconnect.backoff.ms

2015-09-02 Thread Ewen Cheslack-Postava
Steve, I don't think there is a better solution at the moment. This is an easy issue to miss in unit testing because generally connections to localhost will be rejected immediately if there isn't anything listening on the port. If you're running in an environment where this happens normally, then

Re: Kafka

2015-09-02 Thread Ewen Cheslack-Postava
Muqtafi, There are corresponding Java or Scala classes. You can use them directly, but beware that they are not considered public interfaces, so there are no promises about compatibility. They could completely change between releases. (The command line tools themselves, however, are considered

Re: [VOTE] 0.8.2.2 Candidate 1

2015-09-09 Thread Ewen Cheslack-Postava
+1 non-binding. Verified artifacts, unit tests, quick start. On Wed, Sep 9, 2015 at 10:09 AM, Guozhang Wang wrote: > +1 binding, verified unit tests and quick start. > > On Wed, Sep 9, 2015 at 4:12 AM, Manikumar Reddy > wrote: > > > +1 (non-binding).

Re: How do I achieve round robin based partitioning for topic?

2015-09-21 Thread Ewen Cheslack-Postava
Are you using the old or new producer? That sounds like the behavior the old producer had -- it would stick to the same partition for a while (10 minutes if I remember correctly). The new producer does not have this behavior, preferring to round-robin the *available* brokers. Note that since it

Re: kafka connect(copycat) question

2015-12-08 Thread Ewen Cheslack-Postava
Svante, Just to clarify, the HDFS connector relies on some Avro translation code which is in a separate repository. You need the https://github.com/confluentinc/schema-registry repository built before the kafka-connector-hdfs repository to get that dependency. Confluent has now also released

Re: flush() vs close()

2015-12-01 Thread Ewen Cheslack-Postava
Kashif, The difference is that close() will also shut down the producer such that it can no longer send any messages. flush(), in contrast, is useful if you want to make sure that all the messages enqueued so far have been sent and acked, but also want to send more messages after that. -Ewen On
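The flush()/close() distinction described in this reply can be illustrated with a toy model. This is only a sketch: `ToyProducer` is a hypothetical in-memory stand-in, not the Kafka client API, and "delivery" here just means moving records from one list to another.

```python
class ToyProducer:
    """Hypothetical in-memory stand-in for a producer (not the Kafka API)."""

    def __init__(self):
        self.buffer = []      # records enqueued but not yet "sent"
        self.delivered = []   # records that have been "sent and acked"
        self.closed = False

    def send(self, record):
        if self.closed:
            raise RuntimeError("producer is closed")
        self.buffer.append(record)

    def flush(self):
        # Deliver everything enqueued so far; the producer stays usable.
        self.delivered.extend(self.buffer)
        self.buffer.clear()

    def close(self):
        # Deliver outstanding records, then refuse any further sends.
        self.flush()
        self.closed = True


producer = ToyProducer()
producer.send("a")
producer.flush()      # "a" is delivered, more sends are still allowed
producer.send("b")
producer.close()      # "b" is delivered, the producer is now unusable
```

After close(), a further send() on the toy object raises, which mirrors the point that a closed producer can no longer send messages.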

Re: kafka connect(copycat) question

2015-12-10 Thread Ewen Cheslack-Postava
uent.io/2.0.0/connect/devguide.html > > Thanks, > Roman > > > > On Wednesday, November 11, 2015 6:59 AM, Ewen Cheslack-Postava < > e...@confluent.io> wrote: > Hi Venkatesh, > > If you're using the default settings included in the sample configs, it'll > expect

Re: kafka-connect-jdbc: ids, timestamps, and transactions

2015-12-16 Thread Ewen Cheslack-Postava
Mark, There are definitely limitations to using JDBC for change data capture. Using a database-specific implementation, especially if you can read directly off the database's log, will be able to handle more situations like this. Cases like the one you describe are difficult to address

Re: Mirrormaker issue with Kafka 0.9 (confluent platform 2.0)

2015-12-10 Thread Ewen Cheslack-Postava
Meghana, It looks like this functionality was removed in https://issues.apache.org/jira/browse/KAFKA-1650, although I don't see explicit discussion of the issue in the JIRA so I'm not sure of the exact motivation. Maybe Becket or Guozhang can offer some insight about if it is necessary (that JIRA

Re: Kafka-Rest question

2016-01-07 Thread Ewen Cheslack-Postava
ET client, but in the proxy it > doesn’t appear to work like that. > > -Original Message- > From: Ewen Cheslack-Postava [mailto:e...@confluent.io] > Sent: Thursday, January 07, 2016 1:18 PM > To: users@kafka.apache.org > Subject: Re: Kafka-Rest question > > On Thu, Jan 7, 2016 at 12

Re: Server Logs

2016-01-01 Thread Ewen Cheslack-Postava
Chandra, If it's a separate app to collect logs, it would presumably run on the same server IIS is running on since that's where logs would be generated. -Ewen On Tue, Dec 29, 2015 at 9:22 PM, chandra sekar wrote: > Dear Ewen, > Where do I run the separate

Re: kafka-producer-perf-test.sh - 0.8.2.1

2016-01-08 Thread Ewen Cheslack-Postava
> have yet to find the right one. > > Thanks! > > Andrew > > On Fri, Jan 8, 2016 at 10:54 AM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Andrew, > > > > kafka-producer-perf-test.sh is just a wrapper around > > org.apache.kafka.c

Re: [ANN] kinsky: clojure 0.9.0 client

2016-01-07 Thread Ewen Cheslack-Postava
Looks good! I've added it to the clients page here: https://cwiki.apache.org/confluence/display/KAFKA/Clients -Ewen On Thu, Jan 7, 2016 at 9:12 AM, Dana Powers wrote: > Very nice! > On Jan 7, 2016 04:41, "Pierre-Yves Ritschard" wrote: > > > Hi list, >

Re: kafka-producer-perf-test.sh - 0.8.2.1

2016-01-08 Thread Ewen Cheslack-Postava
-class.sh > org.apache.kafka.clients.tools.ProducerPerformance > is single threaded? Or is there any way to specify number of threads? > > On Fri, Jan 8, 2016 at 1:24 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Ah, sorry, I missed the version number in your title. I think this too

Re: Server Logs

2015-12-28 Thread Ewen Cheslack-Postava
Chandra, If you're just serving files from IIS and want to collect logs, you'll probably want to run a separate application to collect the log files and report each log entry to Kafka. If you're running a web application, you can use the producer yourself to report events to Kafka. -Ewen On

Re: Protocol version upgrades in 0.9

2015-12-28 Thread Ewen Cheslack-Postava
is if you want to be able to mix consumers using different libraries in the same consumer group (consumers in different groups using different libraries should always be fine). -Ewen On Mon, Dec 28, 2015 at 4:16 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote: > Yes, version=0 should

Re: Protocol version upgrades in 0.9

2015-12-23 Thread Ewen Cheslack-Postava
Oleksiy, Where are you specifying the version? Unless I'm missing something, the JoinGroup protocol doesn't include versions so I'm not sure I understand the examples you are giving. Are the version numbers included in the per-protocol metadata? You can see exactly how the consumer coordinator

Re: [kafka-clients] The JIRA Awakens [KAFKA-1841]

2015-12-23 Thread Ewen Cheslack-Postava
Dana, Not sure about the old merge script, but with the new one used for GitHub PRs it tracks the branches you choose to commit to and can update the JIRA automatically, tagging the appropriate fix versions. So from now on, the appropriate JIRA queries for issues with, e.g., fix version 0.9.0.1

Re: Gradle build error

2015-12-23 Thread Ewen Cheslack-Postava
What version of Gradle are you using and can you give the exact command you're running? -Ewen On Wed, Dec 23, 2015 at 5:49 PM, Oliver Pačut wrote: > Hello, > > I am having trouble using Gradle to build Kafka. I get the error: > > > FAILURE: Build failed with an

Re: compatibility: 0.8.1.1 broker, 0.8.2.2 producer

2015-12-23 Thread Ewen Cheslack-Postava
producer with 0.8.1.1 brokers without problems. > > Version of scala matters if you are building with scala or some other > > components that use scala. > > Hope this helps. > > > > -- > > Andrey Yegorov > > > > On Wed, Dec 23, 2015 at 1:11 PM, Ew

Re: Protocol version upgrades in 0.9

2015-12-24 Thread Ewen Cheslack-Postava
----- > > Maybe I misunderstood the purpose of this version field? > > On Thu, 24 Dec 2015 at 00:27 Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Oleksiy, > > > > Where are you specifying the version? Unless I'm missing something, th

Re: 409 Conflict

2016-01-12 Thread Ewen Cheslack-Postava
Is the consumer registration failing, or the subsequent calls to read from the topic? From the error, it sounds like the latter -- a conflict during registration should generate a 40902 error. Can you give more info about the sequence of requests that causes the error? The set of commands you

Re: Do consumer offsets stored in zookeeper ever get cleaned up?

2016-06-04 Thread Ewen Cheslack-Postava
You would do this manually with the ConsumerGroupCommand (which also allows you to do deletion of offsets just by topic). -Ewen On Thu, May 19, 2016 at 4:16 PM, James Cheng wrote: > I know that when offsets get stored in Kafka, they get cleaned up based on > the

Re: Suggestions of pulling local application logs into Kafka

2016-06-04 Thread Ewen Cheslack-Postava
Kafka Connect can definitely be used for this -- it's one of the reasons we designed it with standalone mode ( http://docs.confluent.io/3.0.0/connect/userguide.html#workers). For the specific connector, we include a very simple File connector with Kafka which will just take each line and send it

Re: Kafka Connect: fork process from a SinkTask ?

2016-06-04 Thread Ewen Cheslack-Postava
I can't think of anything that would break except that your connector may not be able to run in some environments if certain syscalls are restricted. -Ewen On Wed, May 11, 2016 at 6:05 PM, Dean Arnold wrote: > I need to run an external filter program from a SinkTask. Is

Re: Resetting the Offset of a Kafka Sink Connector

2016-06-04 Thread Ewen Cheslack-Postava
Connectors don't perform any data copying and don't rewind offsets -- that's the job of Tasks. In your SinkTask implementation you have access to the SinkTaskContext via its context field. -Ewen On Tue, May 31, 2016 at 9:47 AM, Jack Lund wrote: > Yes, the one

Re: Kafka Windows Support

2016-06-04 Thread Ewen Cheslack-Postava
Microsoft runs Kafka on Windows at large scale: https://twitter.com/nehanarkhede/status/667903877769891840 -Ewen On Tue, May 17, 2016 at 7:20 PM, Murthy Kakarlamudi wrote: > Hello, > Have a question in installing Kafka on windows. Our server farm is > totally windows

Re: Yet another .NET client

2016-06-04 Thread Ewen Cheslack-Postava
Added to the clients page here: https://cwiki.apache.org/confluence/display/KAFKA/Clients Thanks! -Ewen On Wed, Jun 1, 2016 at 7:45 AM, Serge Danzanvilliers < serge.danzanvilli...@gmail.com> wrote: > Hi, > > Criteo has open sourced its Kafka .NET client. The driver focuses on the > producer but

Re: kafka connect - fetch avro data from the SinkRecord put method

2016-06-04 Thread Ewen Cheslack-Postava
There isn't currently a way to get at the intermediate Avro formatted data -- the point of Connect's generic data API is to decouple the connector implementations from the details of (de)serialization. This allows connectors to work with data written to Kafka in a variety of data formats without

Re: Are key.converter.schemas.enable and value.converter.schemas.enable of any use in Kafka connector?

2016-06-04 Thread Ewen Cheslack-Postava
key.converter and value.converter are namespace prefixes in this case. These settings are used by the JsonConverter https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L53 If schemas are enabled, all JSON messages are sent using an

Re: Does the Kafka Streams DSL support non-Kafka sources/sinks?

2016-06-04 Thread Ewen Cheslack-Postava
And to add yet more, re: usage of Connect. You're right that the custom WebSocket API would require a custom connector. I'd still suggest considering it; it takes care of all the Kafka pieces for you so all you need to do is write the WebSocket API adapter. For the database side, custom schemas

Re: Schema registry question

2016-06-04 Thread Ewen Cheslack-Postava
In any case, to answer the question, I think there's just an omission in the docs. Getting by subject + version (GET /subjects/{subject}/versions/{version} - http://docs.confluent.io/3.0.0/schema-registry/docs/api.html#get--subjects-%28string-%20subject%29-versions-%28versionId-%20version%29) also
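The subject + version lookup mentioned in this reply maps onto a simple URL pattern. A minimal sketch of building that URL follows; the helper name and the `http://localhost:8081` base address are assumptions for illustration, and no request is actually sent.

```python
from urllib.parse import quote


def schema_version_url(base_url, subject, version):
    """Build the URL for GET /subjects/{subject}/versions/{version}."""
    return "{}/subjects/{}/versions/{}".format(
        base_url.rstrip("/"),
        quote(subject, safe=""),  # escape characters such as "/" in the subject
        version,
    )


# Hypothetical registry address and subject name.
url = schema_version_url("http://localhost:8081", "my-topic-value", 1)
```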

Re: Kafka behind a load balancer

2016-06-04 Thread Ewen Cheslack-Postava
Note, however, that a load balancer can be useful for bootstrapping purposes, i.e. use it for the bootstrap.servers setting to have a single consistent value for the setting but allow the broker list to change over time. From there, as Tom says, it'll start using broker hostnames and automatically

Re: macbook air and kafka

2016-06-04 Thread Ewen Cheslack-Postava
Connect and Streams are both Java and compile very quickly -- almost all build time is in Scala. There are other things that can affect this too and may be one-offs, e.g. Gradle building its caches can be slow, but after the first build, subsequent builds are incremental and cheaper. -Ewen On Sat, May 28, 2016 at

Re: Not able to monitor consumer group lag with new consumer and kerberos

2016-06-07 Thread Ewen Cheslack-Postava
Pierre, I think you'll need the rest of the security-related configs you'd use for a normal client as well. You can use the --command-config flag to include additional settings stored in a property file. -Ewen On Tue, Jun 7, 2016 at 8:57 AM, Pierre LABIAUSSE wrote: >

Re: Rate that connect delivers messages

2016-06-09 Thread Ewen Cheslack-Postava
Barry, It might help to know whether you're hitting a (single threaded) CPU limit or if the bottleneck is elsewhere. Also, how large on average are the messages you are consuming? There's nothing that'll force batching like you're talking about. You can tweak any consumer settings via

Re: Using Multiple Kafka Producers for a single Kafka Topic

2016-05-25 Thread Ewen Cheslack-Postava
On Mon, Apr 25, 2016 at 6:34 AM, Joe San wrote: > I have an application that is currently running and is using Rx Streams to > move data. Now in this application, I have a couple of streams whose > messages I would like to write to a single Kafka topic. Given this, I

Re: Kafka sink connector to Hbase

2016-05-25 Thread Ewen Cheslack-Postava
Sorry for the slow reply -- thanks, and it has been listed here: http://www.confluent.io/product/connectors -Ewen On Fri, Apr 29, 2016 at 10:52 AM, Ravi Kiran wrote: > Hi , > I came up with a sink connector to HBase which is available at >

Re: Regex topics in kafka connect?

2016-06-11 Thread Ewen Cheslack-Postava
Barry, Actually, it's not exposed in 0.10 either. https://issues.apache.org/jira/browse/KAFKA-3073 is filed to track this. We know people will want this, it just hasn't made it in quite yet. -Ewen On Fri, Jun 10, 2016 at 4:40 AM, Barry Kaplan wrote: > The kafka connect

Re: Running kafka connector application

2016-06-13 Thread Ewen Cheslack-Postava
Kanagha, I'm not sure about that particular connector, but normally the build script would provide support for collecting the necessary dependencies. Then all you need to do is add something like /path/to/connector-and-deps/* to your classpath and it shouldn't be affected by versions in the

Re: session.timeout.ms was supplied but isn't a known config

2016-06-13 Thread Ewen Cheslack-Postava
Without more of the log it's hard to say, but I think you're probably seeing this reported as those configs are passed from the root config -> client configs -> serializers. There were some issues in the way we were tracking if config values had been used or not and Connect in particular was

Re: Starting Kafka Connector via JMX

2016-06-14 Thread Ewen Cheslack-Postava
No. Connectors are started either by passing properties files on the command line (standalone mode) or by submitting connectors to the REST API (in either standalone or distributed mode). -Ewen On Mon, Jun 13, 2016 at 5:42 PM, Abhinav Solan wrote: > Hi Everyone, > > Is
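Submitting a connector to the REST API amounts to a POST of a JSON document with a name and a config map. The sketch below only builds such a request with the standard library. The worker address `http://localhost:8083`, the connector name, the file path, and the topic are assumptions for illustration; the connector class is the simple file source connector that ships with Kafka.

```python
import json
from urllib.request import Request


def build_create_connector_request(base_url, name, config):
    """Build (but do not send) a POST /connectors request."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical worker address, connector name, file path, and topic.
request = build_create_connector_request(
    "http://localhost:8083",
    "local-file-source",
    {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/app.log",
        "topic": "app-logs",
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running worker; that step is left out here so the example does not depend on a live cluster.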

Re: Kafka Connect HdfsSink and the Schema Registry

2016-06-14 Thread Ewen Cheslack-Postava
On Tue, Jun 14, 2016 at 8:08 AM, Tauzell, Dave wrote: > I have been able to get my C# client to put avro records to a Kafka topic > and have the HdfsSink read and save them in files. I am confused about > interaction with the registry. The kafka message contains

Re: Kafka Connect HdfsSink and Poison Messages

2016-06-14 Thread Ewen Cheslack-Postava
Not today, although that's something we might want to add support for at the framework level (publish to a Kafka dead letter topic) and just provide hooks for to sinks so they don't all have to handle that case. Today the solution would be to reconfigure your connector/worker so it can handle the

Re: How to gracefully shutdown Kafka Connector

2016-06-14 Thread Ewen Cheslack-Postava
There's no API for connectors to shut themselves down because that doesn't really fit the streaming model that Kafka Connect works with -- it isn't a batch processing system. If you want to shut down a connector, you'd normally accomplish this via the REST API. Technically you *could* accomplish

Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api

2016-06-19 Thread Ewen Cheslack-Postava
Rohit, The 30s number sounds very suspicious because it is exactly the value of the session timeout. But if you are driving the consumer correctly, you shouldn't normally hit this timeout. Dana was asking about consumers leaving gracefully because that is one case where you can inadvertently

Re: General Question About Kafka

2016-06-19 Thread Ewen Cheslack-Postava
The most common use case for Kafka is within a data center, but you can absolutely produce data across the WAN. You may need to adjust some settings (e.g. timeouts, max in flight requests per connection if you want high throughput) to account for operating over the WAN, but you can definitely do

Re: kafka-producer-perf-test.sh - 0.8.2.1

2016-01-08 Thread Ewen Cheslack-Postava
Andrew, kafka-producer-perf-test.sh is just a wrapper around org.apache.kafka.clients.tools.ProducerPerformance and all command line options should be forwarded. Can you just pass a --producer-props to set max.request.size to a larger value? -Ewen On Fri, Jan 8, 2016 at 7:51 AM, Andrej

Re: Possible bug in Kafka Connect - Schema modified internally

2016-02-08 Thread Ewen Cheslack-Postava
If you're using the JsonConverter, it looks like you're seeing https://issues.apache.org/jira/browse/KAFKA-3055. It's been fixed and will be included in 0.9.0.1 (and 0.9.1.0, which is trunk). -Ewen On Sun, Feb 7, 2016 at 8:34 PM, Shiti Saxena wrote: > Hi, > > When

Re: WELCOME

2016-02-08 Thread Ewen Cheslack-Postava
If it's taking that long, you may be working on hardware (or a VM?) which is too underpowered to run some of the tests reliably. The README.md has instructions for how to build different components. In your case you want ./gradlew core:test Depending on your hardware, you may want to adjust the

Re: Number of concurrent consumers per data node

2016-02-05 Thread Ewen Cheslack-Postava
On Wed, Feb 3, 2016 at 3:57 PM, Shane MacPhillamy wrote: > Hi > > I’m just coming up to speed with Kafka. Some beginner questions, may be > point me to where I can find the answers please: > > 1. In a Kafka cluster what determines the maximum number of concurrent >

Re: Getting started with 0.9 client custom serialization

2016-02-05 Thread Ewen Cheslack-Postava
Gary, Here are a few concrete examples from Kafka and Confluent Platform: JSON (baked into Kafka Connect, not specifically designed for standalone serialization but they should work for that):

Re: kafka “stops working” after a large message is enqueued

2016-02-05 Thread Ewen Cheslack-Postava
The default max message size is 1MB. You'll probably need to increase a few settings -- the topic max message size on a per-topic basis on the broker (or broker-wide with message.max.bytes), the max.partition.fetch.bytes on the new consumer, etc. You need to make sure all of the producer, broker,
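Because the size limit has to be raised consistently on the producer, broker, and consumer, a quick sanity check over the relevant settings can help. This is a rough sketch: the helper is hypothetical, the numeric values are illustrative rather than actual defaults, protocol overhead is ignored, and `replica.fetch.max.bytes` is included on the assumption that replicated topics need it raised as well.

```python
def settings_too_small(message_bytes, producer, broker, consumer):
    """Return the names of settings that are smaller than the message size."""
    limits = {
        "producer max.request.size": producer["max.request.size"],
        "broker message.max.bytes": broker["message.max.bytes"],
        "broker replica.fetch.max.bytes": broker["replica.fetch.max.bytes"],
        "consumer max.partition.fetch.bytes": consumer["max.partition.fetch.bytes"],
    }
    return sorted(name for name, limit in limits.items() if limit < message_bytes)


# Illustrative values: producer and broker limit raised, the rest forgotten.
problems = settings_too_small(
    5000000,
    producer={"max.request.size": 6000000},
    broker={"message.max.bytes": 6000000, "replica.fetch.max.bytes": 1000000},
    consumer={"max.partition.fetch.bytes": 1000000},
)
# problems lists the replica fetch and consumer fetch settings
```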

Re: Kafka Consumer for 0.8.x.x

2016-02-09 Thread Ewen Cheslack-Postava
:36 PM, Joe San <codeintheo...@gmail.com> wrote: > Could anyone point me to an older version of the consumer client that I > could use to run against the 0.8.2 version of Kafka? > > On Tue, Feb 9, 2016 at 6:57 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: >

Re: Kafka Consumer for 0.8.x.x

2016-02-09 Thread Ewen Cheslack-Postava
. -Ewen On Tue, Feb 9, 2016 at 1:16 PM, Joe San <codeintheo...@gmail.com> wrote: > Can I do automatic offset commit using the high-level consumer? If so, where > is the offset being committed? > > On Tue, Feb 9, 2016 at 10:13 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrot

Re: Kafka Consumer for 0.8.x.x

2016-02-09 Thread Ewen Cheslack-Postava
The new consumer wasn't implemented until 0.9.0.0. The API was sketched out, but no implementation had been included yet. -Ewen On Tue, Feb 9, 2016 at 8:00 AM, Joe San wrote: > Is this intentional in the Kafka 0.8.2.0 version, >

Re: Kafka Rest Proxy health check URL

2016-02-09 Thread Ewen Cheslack-Postava
Yeah, you definitely want one that's cheaper than /topics. The root resource / is effectively a nop you can check for liveness of the service (but not for, e.g., more general health like connectivity to ZK or Kafka). The most it will ever do is return a list of available subresources, so it should

Re: Announcing ruby-kafka v0.1

2016-02-05 Thread Ewen Cheslack-Postava
Daniel, Awesome, Ruby folks could use more Kafka love! I added the library to the clients list here: https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-Ruby I'm also cc'ing this to the clients list since I think they'd be interested as well. Lots of folks are using the Java

Re: MongoDB Kafka Connect driver

2016-01-29 Thread Ewen Cheslack-Postava
Sunny, As I said on Twitter, I'm stoked to hear you're working on a Mongo connector! It struck me as a pretty natural source to tackle since it does such a nice job of cleanly exposing the op log. Regarding the problem of only getting deltas, unfortunately there is not a trivial solution here --

Re: MongoDB Kafka Connect driver

2016-01-29 Thread Ewen Cheslack-Postava
ou get the current value not > every necessarily every intermediate) but that should be okay for most > uses. > > -Jay > > On Fri, Jan 29, 2016 at 8:54 AM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Sunny, > > > > As I said on Twitter, I'm stoked t

Re: Accumulating data in Kafka Connect source tasks

2016-01-29 Thread Ewen Cheslack-Postava
On Fri, Jan 29, 2016 at 7:06 AM, Randall Hauch <rha...@gmail.com> wrote: > On January 28, 2016 at 7:07:02 PM, Ewen Cheslack-Postava ( > e...@confluent.io) wrote: > > Randall, > > Great question. Ideally you wouldn't need this type of state since it > should really

Re: Accumulating data in Kafka Connect source tasks

2016-01-28 Thread Ewen Cheslack-Postava
Randall, Great question. Ideally you wouldn't need this type of state since it should really be available in the source system. In your case, it might actually make sense to be able to grab that information from the DB itself, although that will also have issues if, for example, there have been

Re: Shutting down Producer

2016-01-27 Thread Ewen Cheslack-Postava
If you don't shut it down properly and there are outstanding requests (e.g. if you call producer.send() and don't call get() on the returned future), then you could potentially lose data. Calling producer.close() flushes all the data before returning, so shutting down properly ensures no data will

Re: Only interested in certain partitions

2016-01-27 Thread Ewen Cheslack-Postava
One option is to instantiate and invoke the DefaultPartitioner yourself (or whatever partitioner you've specified for partitioner.class). However, that will require passing in a Cluster object, which you'll need to construct yourself. This is just used to get the number of partitions for the topic

Re: Kafka connect HDFS conenctor

2016-02-23 Thread Ewen Cheslack-Postava
Consuming plain JSON is a bit tricky for something like HDFS because all the output formats expect the data to have a schema. You can read the JSON data with the provided JsonConverter, but it'll be returned without a schema. The HDFS connector will currently fail on this because it expects a

Re: new producer failed with org.apache.kafka.common.errors.TimeoutException

2016-02-23 Thread Ewen Cheslack-Postava
Kris, This is a bit surprising, but handling the bootstrap servers, broker failures/retirement, and cluster metadata properly is surprisingly hard to get right! https://issues.apache.org/jira/browse/KAFKA-1843 explains some of the challenges. https://issues.apache.org/jira/browse/KAFKA-3068

Re: Having multi-threaded Kafka Consumer per partition, is it possible and recommended, if so any sample snippet?

2016-01-21 Thread Ewen Cheslack-Postava
Hi, The new consumer is single threaded. You can layer multi-threaded processing on top of it, but you'll definitely need to be careful about how offset commits are handled to ensure a) processing of a message is actually *complete* not just passed off to another thread before committing an
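One way to handle point (a), committing only after processing is actually complete, is to track finished offsets per partition and commit only up to the end of the contiguous finished run. The sketch below is a hypothetical helper, not part of the consumer API; it relies on the convention that the committed offset is the next offset to be read.

```python
import threading


class OffsetTracker:
    """Tracks which offsets of a single partition are fully processed."""

    def __init__(self, first_offset):
        self._lock = threading.Lock()
        self._next_to_commit = first_offset  # lowest offset not yet finished
        self._done = set()                   # finished offsets beyond a gap

    def mark_done(self, offset):
        """Called by a worker thread once a record is completely processed."""
        with self._lock:
            if offset < self._next_to_commit:
                return  # already covered by an earlier advance
            self._done.add(offset)
            # Advance over the contiguous run of finished offsets.
            while self._next_to_commit in self._done:
                self._done.remove(self._next_to_commit)
                self._next_to_commit += 1

    def committable(self):
        """Offset that is safe to commit (the next offset to read)."""
        with self._lock:
            return self._next_to_commit


tracker = OffsetTracker(10)
tracker.mark_done(12)           # finished out of order, 10 and 11 still pending
first = tracker.committable()   # still 10
tracker.mark_done(10)
tracker.mark_done(11)
second = tracker.committable()  # now 13
```

The polling thread would periodically read `committable()` and commit that value, so a record handed to a worker but not yet finished is never covered by a commit.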

Re: Behaviour of KafkaConsumer.poll(long)

2016-01-26 Thread Ewen Cheslack-Postava
It's not an iterator (ConsumerRecords is a collection of records), but you also won't just get the entire set of messages all at once. You would have the same issue if you set auto.offset.reset to earliest for a new consumer -- everything that's in the topic will need to be consumed. Under the

Re: Zookeeper to Broker Count

2016-01-26 Thread Ewen Cheslack-Postava
No, you don't need to keep adding ZK nodes. You should have a 3 or 5 node ZK cluster. The more nodes you use, the slower write performance becomes, so adding more can hurt performance of any ZK-related operations. The tradeoff between 3 and 5 ZK nodes is fault tolerance (better with 5) vs write
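The fault-tolerance side of the 3 vs 5 node tradeoff follows from majority quorums: an ensemble stays available only while more than half of its nodes are up. A small worked sketch, with helper names chosen here for illustration:

```python
def quorum_size(nodes):
    """Smallest majority of an ensemble with the given number of nodes."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can fail while a majority is still available."""
    return nodes - quorum_size(nodes)


for n in (3, 4, 5):
    print(n, "nodes: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
# 3 nodes: quorum 2 tolerates 1
# 4 nodes: quorum 3 tolerates 1
# 5 nodes: quorum 3 tolerates 2
```

An even-sized ensemble tolerates no more failures than the odd size below it, which is why 3 and 5 are the usual choices.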

Re: Alternate persistence for Kafka

2016-01-26 Thread Ewen Cheslack-Postava
Nikhil, You should search the mailing list archives, but I'm not aware of any discussion around that. If you wanted to try something like that, you might be able to accomplish it via FUSE or similar. For example, this page lists ways you can mount HDFS as a normal filesystem, including fuse-based

Re: 0.9.0.1 RC1

2016-02-15 Thread Ewen Cheslack-Postava
Yeah, I saw kafka.network.SocketServerTest > tooBigRequestIsRejected FAILED java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113) at

Re: Re: [kafka-clients] java clients problem

2016-02-10 Thread Ewen Cheslack-Postava
Iman, The problem may or may not be with serialization -- it looks like it can't even find the SerializationException class, which is in the same jar as every other Kafka clients class. You probably just need to adjust your CLASSPATH to make sure the kafka-clients jar is available at runtime.

Re: SimpleConsumerShell not honouring all options

2016-03-08 Thread Ewen Cheslack-Postava
The character for the first dash in both --max-messages and --print-offsets doesn't look like a standard '-'. Is it possible those options simply aren't being parsed correctly? -Ewen On Tue, Mar 8, 2016 at 12:26 AM, Anishek Agarwal wrote: > Hello > > following doc @ > >

Re: Kafka Connect concept question

2016-04-07 Thread Ewen Cheslack-Postava
On Wed, Apr 6, 2016 at 8:56 AM, Uber Slacker wrote: > Hi folks. I'm pretty new to Kafka. I have spent a fair amount of time so > far understanding the Kafka system in general and how producers and > consumers work. I'm now trying to get a grasp on how Kafka Connect >

Re: Is there any behavioural change to connect local server and remote server?

2016-04-05 Thread Ewen Cheslack-Postava
Ratha, In EC2, you probably need to use the advertised.listeners setting (or advertised.host and advertised.port on older brokers). This is because EC2 has internal and external addresses for each instance. -Ewen On Tue, Apr 5, 2016 at 5:13 AM, Ratha v wrote: > Hi all;

Re: broker ids in AWS autoscaling group

2016-03-22 Thread Ewen Cheslack-Postava
Raj, You probably just want to use automatic broker ID generation. Take a look at the broker.id.generation.enable option. -Ewen On Tue, Mar 22, 2016 at 2:58 PM, Raj Tanneru wrote: > Hi All, > > We are setting up kafka cluster on AWS. We are using cloud formation >

Re: Consumer deadlock

2016-03-03 Thread Ewen Cheslack-Postava
from my iPhone > > > On Mar 3, 2016, at 03:51, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > > Take a look at the consumer.timeout.ms setting if you don't want the > > iterator to block indefinitely. > > > > And a better long term solution

Re: Announcing rdkafka-dotnet - C# Apache Kafka client

2016-03-03 Thread Ewen Cheslack-Postava
Andreas, Thanks, this looks great and I think its leveraging the existing functionality available in librdkafka is a great choice. I've added this to the wiki page for clients: https://cwiki.apache.org/confluence/display/KAFKA/Clients -Ewen On Wed, Mar 2, 2016 at 12:07 PM, Andreas Heider

Re: session.timeout.ms limit - Kafka Consumer

2016-03-03 Thread Ewen Cheslack-Postava
In fact, KIP-41 has been implemented in trunk -- see https://issues.apache.org/jira/browse/KAFKA-3007. Testing against a version including that change would be greatly appreciated to ensure it fully addresses the problems you're seeing. -Ewen On Wed, Mar 2, 2016 at 7:00 AM, Olson,Andrew

Re: Consumer deadlock

2016-03-03 Thread Ewen Cheslack-Postava
Take a look at the consumer.timeout.ms setting if you don't want the iterator to block indefinitely. And a better long term solution is to switch to the new consumer, but that obviously requires much more significant code changes. The new consumer API is a single-threaded poll-based API where you

Re: Poll Interval for Kafka Connect Source

2016-03-09 Thread Ewen Cheslack-Postava
Shiti, There's not a built-in parameter because not all sources are based on polling -- some may use Selector-like APIs which simply block indefinitely while waiting for new data (and support some sort of wakeup mechanism to support prompt shutdown). -Ewen On Tue, Mar 8, 2016 at 9:37 PM, Shiti

Re: About AMQP connector and Kafka Connect framework

2016-04-03 Thread Ewen Cheslack-Postava
On Fri, Apr 1, 2016 at 12:23 AM, Paolo Patierno wrote: > Hi Ewen, > > thanks for your reply. > > My objective here is to access Kafka through AMQP protocol (now I'm > working on a bridge from scratch without using Kafka Connect). > > Consider the following scenario ... > >

Re: Partition size for topic

2016-04-01 Thread Ewen Cheslack-Postava
Oleg, Normally the number of partitions doesn't change (or infrequently, at least) so regardless of how you got the number of partitions there shouldn't be an inconsistency. Are you actually seeing an inconsistency causing this exception? And is the number of partitions not changing? Is it

Re: About AMQP connector and Kafka Connect framework

2016-03-31 Thread Ewen Cheslack-Postava
On Thu, Mar 31, 2016 at 12:02 AM, Paolo Patierno wrote: > Hi all, > > after the following Twitter conversation ... > > https://twitter.com/jfield/status/715299287479877632 > > I'd like to explain better my concerns about using Kafka Connect for an > AMQP connector. > I

Re: [VOTE] 0.10.0.0 RC5

2016-05-17 Thread Ewen Cheslack-Postava
FYI, there's a blocker issue with this RC due to Apache licensing restrictions. One of Connect's dependencies transitively includes the findbugs annotations jar, which is used for static analysis. Luckily it doesn't affect functionality and looks like it can easily be filtered out. We also

Re: how to write kafka connect hdfs parquet sink.

2016-07-25 Thread Ewen Cheslack-Postava
/blob/master/avro-converter/src/main/java/io/confluent/connect/avro/AvroConverter.java > > > of confluent? > I think, I should also understand connect internal data structure which is > a bit complicated. > > - Kidong. > > > > 2016-07-26 2:54 GMT+09:00 Ewen Cheslack-

Re: Handling long commits during a rebalance

2016-07-27 Thread Ewen Cheslack-Postava
Joey, You'll probably want to look into https://cwiki.apache.org/confluence/display/KAFKA/KIP-62%3A+Allow+consumer+to+send+heartbeats+from+a+background+thread That should address the minutes-long timeout you're referring to with onPartitionsRevoked(). If you need to address it in the

Re: how to write kafka connect hdfs parquet sink.

2016-07-25 Thread Ewen Cheslack-Postava
If I'm understanding your setup properly, you need a way to convert your data from your own Avro format to Connect format. From there, the existing Parquet support in the HDFS connector should work for you. So what you need is your own implementation of an AvroConverter, which is what loads the

Re: More OS packages, please!

2016-07-23 Thread Ewen Cheslack-Postava
Confluent Platform includes RPM and Debian packages: http://www.confluent.io/download We tag them a bit differently due to different release schedules, but the CP builds are entirely open source and effectively map directly to Apache releases. Check out

Re: [kafka-connect] multiple or single clusters?

2016-07-23 Thread Ewen Cheslack-Postava
On Fri, Jun 24, 2016 at 11:16 AM, noah wrote: > I'm having some trouble figuring out the right way to run Kafka Connect in > production. We will have multiple sink connectors that we need to remain > running indefinitely and have at least once semantics (with as little >

Re: Colocating Kafka Connect on Kafka Broker

2016-07-23 Thread Ewen Cheslack-Postava
Generally we discourage colocating services with Kafka. Kafka relies heavily on the page cache. It's generally light on CPU (except maybe if it has to recompress messages), but may not play well with other services. For very light installations, colocating some services (e.g. both ZK and Kafka),

Re: Kafka cluster

2016-07-23 Thread Ewen Cheslack-Postava
2 is technically enough but you're at risk of losing data if there is a failure and the second broker fails while a replacement broker is replicating the data. In general, 3 brokers (and replicas) is a good minimum, but there are some cases that might warrant using fewer, even as few as 1. For

Re: Kafka Connect issues

2016-07-23 Thread Ewen Cheslack-Postava
That definitely sounds unusual -- rebalancing normally only happens either when a) there are new workers or b) there are connectivity issues/failures. Is it possible there's something causing large latencies? -Ewen On Sat, Jul 16, 2016 at 6:09 AM, Kristoffer Sjögren wrote: >

Re: Kafka does not preserve an offset on topic.

2016-07-23 Thread Ewen Cheslack-Postava
The parameter you want is AUTO_OFFSET_RESET_CONFIG. If setting that to latest isn't working, can you include some code that reproduces the issue? -Ewen On Wed, Jul 6, 2016 at 6:21 AM, Pawel Huszcza wrote: > Hello, > > I tried every different property I can think of
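
For context, AUTO_OFFSET_RESET_CONFIG is the Java client constant for the consumer property auto.offset.reset. The sketch below is a simplified model of how that setting affects the starting position, not the client's actual implementation; starting_offset and its parameters are hypothetical names. The key point it shows is that the setting only applies when there is no usable committed offset.

```python
def starting_offset(committed, log_start, log_end, auto_offset_reset="latest"):
    """Simplified model: pick the offset a consumer would start reading from.

    committed  -- last committed offset for the group, or None if there is none
    log_start  -- earliest offset still available in the partition
    log_end    -- offset of the next message to be written
    """
    if auto_offset_reset not in ("latest", "earliest", "none"):
        raise ValueError("unknown auto.offset.reset value: %r" % auto_offset_reset)
    # A valid committed offset always wins; the reset policy is not consulted.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    if auto_offset_reset == "latest":
        return log_end
    if auto_offset_reset == "earliest":
        return log_start
    # "none": the application has to handle the missing offset itself.
    raise LookupError("no valid committed offset and auto.offset.reset=none")
```

Under this model, a group that has already committed an offset resumes from it regardless of the setting, so changing the property alone does not move an existing group to the end of the log.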

Re: Monitoring Kafka Connect

2016-07-23 Thread Ewen Cheslack-Postava
On Wed, Jun 29, 2016 at 9:44 AM, Sumit Arora wrote: > Hello, > > We are currently building our data-pipeline using Confluent and as part of > this implementation, we have written couple of Kafka Connect Sink > Connectors for Azure and MS SQL server. To provide

Re: Rebalance and Failures

2016-07-23 Thread Ewen Cheslack-Postava
Since you mention ZK timeout, I think you might be confused about new vs old consumer semantics. With the new consumer, there's no ZK interaction. If one of the members dies after indicating membership but before the group protocol completes, it will simply be assigned data and not process it.

Re: Kafka consumer performance with large network delay

2016-07-23 Thread Ewen Cheslack-Postava
Kafka will batch messages, but if the rate of delivery is too slow it'll fall back to delivering only one message per batch. What is the total throughput per broker? -Ewen On Fri, Jul 15, 2016 at 5:21 PM, Boris Sorochkin wrote: > Hi All, > I have Kafka setup with default

Re: Nginx Logs to Kafka

2016-07-23 Thread Ewen Cheslack-Postava
Kafka Connect can also help you here. There's nothing nginx specific, but even a very simple file connector can help you ingest nginx logs into Kafka. -Ewen On Tue, Jul 19, 2016 at 11:22 AM, Steve Brandon wrote: > You can use the ELK stack to push your logs to Kafka,
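
As a rough illustration of the file-connector approach mentioned above, the sketch below builds a configuration for the FileStreamSource example connector that ships with Kafka. The connector name, log path and topic name are assumptions chosen for the example, not values from the thread.

```python
import json

# Connector definition in the shape accepted by the Kafka Connect REST API
# (POST /connectors on a worker). Name, file and topic are hypothetical.
connector = {
    "name": "nginx-access-log-source",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/var/log/nginx/access.log",
        "topic": "nginx-access-log",
    },
}

payload = json.dumps(connector, indent=2)
print(payload)
```

The printed JSON could be submitted to a running Connect worker, or the same keys could be placed in a properties file for standalone mode. The file connector is a simple example connector, so treat this as a starting point rather than a production log pipeline.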
