Re: [VOTE] 2.0.1 RC0

2018-11-02 Thread Ewen Cheslack-Postava
+1 -Ewen On Thu, Nov 1, 2018 at 10:10 AM Manikumar wrote: > We were waiting for the system test results. There were few failures: > KAFKA-7579, KAFKA-7559, KAFKA-7561 > they are not blockers for 2.0.1 release. We need more votes from > PMC/committers :) > > Thanks Stanislav! for the system

Re: Kafka Connect task re-balance repeatedly

2018-03-22 Thread Ewen Cheslack-Postava
The log is showing that the Connect worker is trying to make sure it has read the entire log and gets to offset 119, but some other worker says it has read to offset 169. The two are in inconsistent states, so the one that seems to be behind will not start work with potentially outdated

[ANNOUNCE] Apache Kafka 1.0.1 Released

2018-03-06 Thread Ewen Cheslack-Postava
, Ewen Cheslack-Postava, Filipe Agapito, fredfp, Guozhang Wang, huxihx, Ismael Juma, Jason Gustafson, Jeremy Custenborder, Jiangjie (Becket) Qin, Joel Hamill, Konstantine Karantasis, lisa2lisa, Logan Buckley, Manjula K, Matthias J. Sax, Nick Chiu, parafiend, Rajini Sivaram, Randall Hauch, Robert

[VOTE] 1.0.1 RC2

2018-02-21 Thread Ewen Cheslack-Postava
/kafka/tree/1.0.1-rc2 * Documentation: http://kafka.apache.org/10/documentation.html * Protocol: http://kafka.apache.org/10/protocol.html /** Thanks, Ewen Cheslack-Postava

Re: [VOTE] 1.0.1 RC1

2018-02-20 Thread Ewen Cheslack-Postava
>> > > Satish. > >> > > > >> > > > >> > > On Tue, Feb 13, 2018 at 11:30 PM, Damian Guy <damian@gmail.com> > >> > wrote: > >> > > > >> > > > +1 > >> > > > > >

Re: [VOTE] 1.0.1 RC1

2018-02-13 Thread Ewen Cheslack-Postava
t (org.apache.kafka:streams-quickstart-java:1.0.1) > > Something i'm missing? > > Thanks, > Damian > > On Tue, 13 Feb 2018 at 10:16 Manikumar <manikumar.re...@gmail.com> wrote: > > > +1 (non-binding) > > > > ran quick-start, unit tests on the src.

Re: [VOTE] 1.0.1 RC1

2018-02-12 Thread Ewen Cheslack-Postava
sn't been updated yet: > > https://repository.apache.org/content/groups/staging/org/ > apache/kafka/kafka-clients/ > > On Mon, Feb 12, 2018 at 10:16 AM, Ewen Cheslack-Postava <e...@confluent.io > > > wrote: > > > And of course I'm +1 since I've already done normal

Re: [VOTE] 1.0.1 RC1

2018-02-12 Thread Ewen Cheslack-Postava
And of course I'm +1 since I've already done normal release validation before posting this. -Ewen On Mon, Feb 12, 2018 at 10:15 AM, Ewen Cheslack-Postava <e...@confluent.io> wrote: > Hello Kafka users, developers and client-developers, > > This is the second candidate for re

[VOTE] 1.0.1 RC1

2018-02-12 Thread Ewen Cheslack-Postava
/documentation.html * Protocol: http://kafka.apache.org/10/protocol.html Thanks, Ewen Cheslack-Postava

Re: [VOTE] 1.0.1 RC0

2018-02-09 Thread Ewen Cheslack-Postava
om source and running the quickstart were successful on Ubuntu > and Windows 10. > > Thanks for running the release. > --Vahid > > > > From: Ewen Cheslack-Postava <e...@confluent.io> > To: d...@kafka.apache.org, users@kafka.apache.org, > kafka-clie...

[VOTE] 1.0.1 RC0

2018-02-05 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers, Sorry for a bit of delay, but I've now prepared the first candidate for release of Apache Kafka 1.0.1. This is a bugfix release for the 1.0 branch that was first released with 1.0.0 about 3 months ago. We've fixed 46 significant issues since

Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2018-01-04 Thread Ewen Cheslack-Postava
wrote: > Thanks Ewen, > I just edited the KIP to reflect the changes. > > Regards, > Umesh > > On Wed, 9 Aug 2017 at 11:00 Ewen Cheslack-Postava <e...@confluent.io> > wrote: > >> Great, looking good. I'd probably be a bit more concrete about the >> Propose

Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-08-08 Thread Ewen Cheslack-Postava
details to > it. > > Regards, > Umesh > > On Mon, 31 Jul 2017 at 21:51 Ewen Cheslack-Postava <e...@confluent.io> > wrote: > >> On Sun, Jul 30, 2017 at 10:21 PM, UMESH CHAUDHARY <umesh9...@gmail.com> >> wrote: >> >>> Hi Ewen, >>> Than

Re: struggling with runtime Schema in connect

2017-07-31 Thread Ewen Cheslack-Postava
t than any other serialization library/format! -Ewen On Wed, Jul 26, 2017 at 6:11 AM, Koert Kuipers <ko...@tresata.com> wrote: > just out of curiosity, why does kafka streams not use this runtime data api > defined in kafka connect? > > On Wed, Jul 26, 2017 at 3:10 AM,

Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-07-31 Thread Ewen Cheslack-Postava
now if you have some additional thoughts on this. > > Regards, > Umesh > > > > On Wed, 26 Jul 2017 at 09:27 Ewen Cheslack-Postava <e...@confluent.io> > wrote: > >> Umesh, >> >> Thanks for the KIP. Straightforward and I think it's a good change.

Re: Kafka Connect distributed mode rebalance

2017-07-26 Thread Ewen Cheslack-Postava
Btw, if you can share, I would be curious what connectors you're using and why you need so many. I'd be interested if a modification to the connector could also simplify things for you. -Ewen On Wed, Jul 26, 2017 at 12:33 AM, Ewen Cheslack-Postava <e...@confluent.io> wrote: > Stephen,

Re: Kafka Connect distributed mode rebalance

2017-07-26 Thread Ewen Cheslack-Postava
Stephen, Cool, that is a *lot* of connectors! Regarding rebalances, the reason this happens is that Kafka Connect is trying to keep the total work of the cluster balanced across the workers. If you add/remove connectors or the # of workers change, then we need to go through another round

Re: struggling with runtime Schema in connect

2017-07-26 Thread Ewen Cheslack-Postava
Stephen's explanation is great and accurate :) One of the design goals for Kafka Connect was to not rely on any specific serialization format since that is really orthogonal to getting/sending data from/to other systems. We define the generic *runtime* data API, which is what you'll find in the

Re: Kafka Connect Embedded API

2017-07-26 Thread Ewen Cheslack-Postava
The vast majority of KIP-26 has been implemented. Unfortunately, the embedded API is still one of the gaps that has not yet been implemented. It likely requires some additional design work as only a prototype API was proposed in the KIP describing the framework as a whole. -Ewen On Wed, Jul 12,

Re: Kafka connector throughput reduction upon avro schema change

2017-07-26 Thread Ewen Cheslack-Postava
What is your setting for schema.compatibility? I suspect the issue is probably that it is defaulting to NONE which would cause the connector to roll a new file when the schema changes (which will be frequent with data that is interleaved with different schemas). If you set it to BACKWARD then

Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-07-25 Thread Ewen Cheslack-Postava
Umesh, Thanks for the KIP. Straightforward and I think it's a good change. Unfortunately it is hard to tell how many people it would affect since we can't tell how many people have adjusted that config, but I think this is the right thing to do long term. A couple of quick things that might be

Re: [DISCUSS] KIP-163: Lower the Minimum Required ACL Permission of OffsetFetch

2017-07-24 Thread Ewen Cheslack-Postava
Vahid, Thanks for the KIP. I think we're mostly in violent agreement that the lack of any Write permissions on consumer groups is confusing. Unfortunately it's a pretty annoying issue to fix since it would require an increase in permissions. More generally, I think it's unfortunate because by

Re: Kafka broker startup issue

2017-05-23 Thread Ewen Cheslack-Postava
Version 2 of UpdateMetadataRequest does not exist in version 0.9.0.1. This suggests that you have a broker with a newer version of Kafka running against the same ZK broker. Do you have any other versions running? Or is it possible this is a shared ZK cluster and you're not using a namespace within

Re: Reg: [VOTE] KIP 157 - Add consumer config options to streams reset tool

2017-05-16 Thread Ewen Cheslack-Postava
+1 (binding) I mentioned this in the PR that triggered this: > KIP is accurate, though this is one of those things that we should probably get a KIP for a standard set of config options across all tools so additions like this can just fall under the umbrella of that KIP... I think it would be

Re: Kafka Connect CPU spikes bring down Kafka Connect workers

2017-05-16 Thread Ewen Cheslack-Postava
On Mon, May 15, 2017 at 2:06 PM, Phillip Mann wrote: > Currently, Kafka Connect experiences a spike in CPU usage which causes > Kafka Connect to crash. What kind of crash? Can you provide an error or stacktrace? > There is really no useful information from the logs to

Re: offset commitment from another client

2017-04-18 Thread Ewen Cheslack-Postava
Consumers are responsible for committing offsets, not brokers. See http://kafka.apache.org/documentation.html#design_consumerposition for more of an explanation of how this is tracked. The brokers help coordinate this/store the offsets, but it is the consumers that decide when to commit offsets

Re: even if i pass key no change in partition

2017-04-18 Thread Ewen Cheslack-Postava
Do you have more than 1 partition? You may have an auto-created topic with only 1 partition, in which case the partition of messages will *always* be the same, regardless of key. -Ewen On Fri, Mar 24, 2017 at 5:52 AM, Laxmi Narayan wrote: > Hi, > > I am passing key in
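
The point about single-partition topics can be illustrated with a minimal sketch. It assumes a hash-modulo partitioner; `zlib.crc32` is only a stand-in here (Kafka's Java client uses a different hash function), and `choose_partition` is a hypothetical helper, not a Kafka API.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration only; the real client does not use CRC32.
    return zlib.crc32(key) % num_partitions


keys = [b"user-1", b"user-2", b"order-42"]

# With one partition, anything modulo 1 is 0, so the key cannot matter.
single = {choose_partition(k, 1) for k in keys}

# With several partitions, the same key still maps to the same partition
# every time, but different keys may land on different partitions.
spread = {k: choose_partition(k, 6) for k in keys}
```

Whatever the hash function, the modulo step is what pins every record to partition 0 when the topic has only one partition.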

Re: offset.storage.filename configuration in kafka-connect-hdfs

2017-04-09 Thread Ewen Cheslack-Postava
The offset file is marked as required so we can fail early & fast, but if you only run sink connectors then offsets will be stored in Kafka's normal offsets topic and the file will never need to be created. (And the HDFS connector is even more unusual in that it doesn't even rely on Kafka offsets

Re: Kafka Connect behaves weird in case of zombie Kafka brokers. Also, zombie brokers?

2017-04-09 Thread Ewen Cheslack-Postava
Is that node the only bootstrap broker provided? If the Connect worker was pinned to *only* that broker, it wouldn't have any chance of recovering correct cluster information from the healthy brokers. It sounds like there was a separate problem as well (the broker should have figured out it was

Re: kafka connector for mongodb as a source

2017-04-09 Thread Ewen Cheslack-Postava
There is some log noise in there from Reflections, but it does look like your connector & task are being created: [2017-03-27 18:33:00,057] INFO Instantiated task mongodb-0 with version 0.10.0.1 of type org.apache.kafka.connect.mongodb.MongodbSourceTask

Re: Are there Connector artifacts in Confluent or any other Maven repository?

2017-03-21 Thread Ewen Cheslack-Postava
Yes, these get published to Confluent's maven repository. Follow the instructions here http://docs.confluent.io/current/installation.html#installation-maven for adding the Confluent maven repository to your project and then add a dependency for the connector to your project (e.g. for that

[ANNOUNCE] Apache Kafka 0.10.2.0 Released

2017-02-22 Thread Ewen Cheslack-Postava
Mahadevan, Ashish Singh, Balint Molnar, Ben Stopford, Bernard Leach, Bill Bejeck, Colin P. Mccabe, Damian Guy, Dan Norwood, Dana Powers, dasl, Derrick Or, Dong Lin, Dustin Cote, Edoardo Comar, Edward Ribeiro, Elias Levy, Emanuele Cesena, Eno Thereska, Ewen Cheslack-Postava, Flavio Junqueira, fpj

Re: [VOTE] 0.10.2.0 RC2

2017-02-18 Thread Ewen Cheslack-Postava
from the source and ran the quickstart successfully on Ubuntu, Mac, > Windows (64 bit). > > Thank you Ewen for running the release. > > --Vahid > > > > From: Ewen Cheslack-Postava <e...@confluent.io> > To: d...@kafka.apache.org, "users@k

[VOTE] 0.10.2.0 RC2

2017-02-14 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers, This is the third candidate for release of Apache Kafka 0.10.2.0. This is a minor version release of Apache Kafka. It includes 19 new KIPs. See the release notes and release plan (https://cwiki.apache.org/conf

[VOTE] 0.10.2.0 RC1

2017-02-10 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers, This is RC1 for release of Apache Kafka 0.10.2.0. This is a minor version release of Apache Kafka. It includes 19 new KIPs. See the release notes and release plan (https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.10.2.0) for

Re: When publishing to non existing topic, TimeoutException is thrown instead of UnknownTopicOrPartitionException

2017-01-30 Thread Ewen Cheslack-Postava
Stevo, Agreed that this seems broken if we're just timing out trying to fetch metadata if we should be able to tell that the topic will never be created. Clients can't explicitly tell whether auto topic creation is on. Implicit indication via the error code seems like a good idea. My only

Re: kafka_2.10-0.8.1 simple consumer retrieves junk data in the message

2017-01-30 Thread Ewen Cheslack-Postava
What are the 26 additional bytes? That sounds like a header that a decoder/deserializer is handling with the high level consumer. What class are you using to deserialize the messages with the high level consumer? -Ewen On Fri, Jan 27, 2017 at 10:19 AM, Anjani Gupta

Re: Upgrade questions

2017-01-30 Thread Ewen Cheslack-Postava
Note that the documentation that you linked to for upgrades specifically lists configs that you need to be careful to adjust in your server.properties. In fact, the server.properties shipped with Kafka is meant for testing only. There are some configs in the example server.properties that are not

Re: Kafka JDBC connector vs Sqoop

2017-01-30 Thread Ewen Cheslack-Postava
For MySQL you would either want to use Debezium's connector (which can handle bulk dump + incremental CDC, but requires direct access to the binlog) or the JDBC connector (does an initial bulk dump + incremental queries, but has limitations compared to a "true" CDC solution). Sqoop and the JDBC

Re: using kafka log compaction withour key

2017-01-30 Thread Ewen Cheslack-Postava
The log compaction functionality uses the key to determine which records to deduplicate. You can think of it (very roughly) as deleting entries from a hash map as the value for each key is overwritten. This functionality doesn't have much of a point unless you include keys in your records. -Ewen
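
The hash-map analogy above can be sketched roughly as follows. This is a simplified model, not the broker's log cleaner: it ignores segments, tombstones, and timing, and `compact` is a hypothetical name.

```python
def compact(log):
    """Keep only the latest record for each key.

    log: list of (offset, key, value) tuples in offset order.
    Returns the surviving records, still in offset order.
    """
    latest = {}
    for offset, key, value in log:
        # A later record with the same key overwrites the earlier one,
        # exactly like assigning into a hash map.
        latest[key] = (offset, value)
    return sorted((off, key, val) for key, (off, val) in latest.items())


log = [(0, "a", "1"), (1, "b", "2"), (2, "a", "3"), (3, "c", "4"), (4, "b", "5")]
compacted = compact(log)
```

The model only works because every record carries a key to group by, which is why compaction without keys has little purpose.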

Re: special characters in kafka log

2017-01-30 Thread Ewen Cheslack-Postava
Not sure what special characters you are referring to, but for data in the key and value fields in Kafka, it handles arbitrary binary data. "Special characters" aren't special because Kafka doesn't even inspect the data it is handling: clients tell it the length of the data and then it copies that

Re: kafka connect architecture

2017-01-30 Thread Ewen Cheslack-Postava
On Mon, Jan 30, 2017 at 8:24 AM, Koert Kuipers wrote: > i have been playing with kafka connect in standalone and distributed mode. > > i like standalone because: > * i get to configure it using a file. this is easy for automated deployment > (chef, puppet, etc.). configuration

Re: Ideal value for Kafka Connect Distributed tasks.max configuration setting?

2017-01-30 Thread Ewen Cheslack-Postava
On Fri, Jan 27, 2017 at 10:49 AM, Phillip Mann wrote: > I am looking to productionize and deploy my Kafka Connect application. > However, there are two questions I have about the tasks.max setting which > is required and of high importance but details are vague for what to >

Re: Automatic Offset Committing

2017-01-23 Thread Ewen Cheslack-Postava
topic test* > > > On 24 Jan. 2017 3:25 pm, "Ewen Cheslack-Postava" <e...@confluent.io> > wrote: > > > The new consumer only supports committing offsets to Kafka. (It doesn't > > even have connection info to ZooKeeper, which is a general trend in Kafk

Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-23 Thread Ewen Cheslack-Postava
> On Fri, Jan 6, 2017 at 1:26 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > It would return the earlier one, offset 0. > > > > -Ewen > > > > On Thu, Jan 5, 2017 at 10:15 PM, Vignesh <vignesh.v...@gmail.com> wrote: > > > >
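
The lookup rule discussed in this thread (return the earliest offset whose timestamp is at or after the requested time) can be sketched as below. It is a linear scan over made-up records, not how a broker actually searches its index, and `offset_for_time` is a hypothetical helper.

```python
def offset_for_time(records, target_ts):
    """records: list of (offset, timestamp) pairs in offset order.

    Returns the earliest offset whose timestamp is >= target_ts,
    or None if no such record exists.
    """
    for offset, ts in records:
        if ts >= target_ts:
            return offset
    return None


T1 = 1_000
# Three messages that all carry the same timestamp T1.
records = [(0, T1), (1, T1), (2, T1)]
earliest = offset_for_time(records, T1)
```

When several messages share the requested timestamp, the scan stops at the first of them, which matches the "offset 0" answer in the thread.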

Re: Kafka Protocol : about clients and number of TCP connections

2017-01-23 Thread Ewen Cheslack-Postava
The only other connections to brokers would be to the bootstrap brokers in order to collect cluster metadata. -Ewen On Wed, Jan 18, 2017 at 3:48 AM, Paolo Patierno wrote: > Hello, > > > I'd like to know the number of connections that Kafka clients establish. I > mean ... >

Re: min hardware requirement

2017-01-23 Thread Ewen Cheslack-Postava
Smaller servers/instances work fine for tests, as long as the workload is scaled down as well. Most memory on a Kafka broker will end up dedicated to page cache. For, e.g., 1GB of RAM just consider that you probably won't be leaving much room to cache the data so your performance may suffer a bit.

Re: Automatic Offset Committing

2017-01-23 Thread Ewen Cheslack-Postava
The new consumer only supports committing offsets to Kafka. (It doesn't even have connection info to ZooKeeper, which is a general trend in Kafka clients -- all details of ZooKeeper are being hidden away from clients, even administrative functions like creating topics.) -Ewen On Thu, Jan 19,

Re: Query on MirrorMaker Replication - Bi-directional/Failover replication

2017-01-23 Thread Ewen Cheslack-Postava
o replication. >> [Query]In order to lookup the offset deltas before initiating the >> consumers on the original cluster, is there any recommended >> mechanism/tooling that can be leveraged? >> > There isn't tooling for this, and the intent in this step is to le

Re: Kafka Connect: Paused connector but still processing data

2017-01-23 Thread Ewen Cheslack-Postava
There was this issue: https://issues.apache.org/jira/browse/KAFKA-4527 which was a test failure that had to do with updating the status as soon as the request to pause the connector was received rather than after it was processed. The corresponding PR fixed that (and will be released in 0.10.2.0).

Re: Kafka Connect requestTaskReconfiguration

2017-01-15 Thread Ewen Cheslack-Postava
This is currently expected. Internally the Connect cluster uses the same rebalancing process as consumer groups which means it has similar limitations -- all tasks must stop just as you would need to stop consuming from all partitions and commit offsets during a consumer group rebalance. There's

Re: java.lang.OutOfMemoryError: Java heap space while running kafka-consumer-perf-test.sh

2017-01-13 Thread Ewen Cheslack-Postava
Perhaps the default heap options aren't sufficient for your particular use of the tool. The consumer perf test defaults to 512MB. I'd simply try increasing the max heap usage: KAFKA_HEAP_OPTS="-Xmx1024M" to bump it up a bit. -Ewen On Wed, Jan 11, 2017 at 2:59 PM, Check Peck

Re: Taking a long time to roll a new log segment (~1 min)

2017-01-13 Thread Ewen Cheslack-Postava
> >>> > > > > >>> > > -Original Message- > > >>> > > From: Stephen Powis [mailto:spo...@salesforce.com] > > >>> > > Sent: Thursday, January 12, 2017 8:34 AM > > >>> > > To: users@kafka.a

Re: Kafka as a data ingest

2017-01-10 Thread Ewen Cheslack-Postava
hink so. In this case, Kafka connect implement has no advantages to read > single big file unless you also use mapreduce. > > Sent from my iPhone > > On Jan 10, 2017, at 02:41, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > >> However, I'm tryi

Re: Json to JDBC using Kafka JDBC connector Sink

2017-01-10 Thread Ewen Cheslack-Postava
Anything with a table structure is probably not going to handle schemaless data (i.e. JSON) very well without some extra help -- tables usually expect schemas and JSON doesn't have a schema. As it stands today, the JDBC sink connector will probably not handle your use case. To send schemaless

Re: Kafka as a data ingest

2017-01-09 Thread Ewen Cheslack-Postava
> However, I'm trying to figure out if I can use Kafka to read Hadoop file. The question is a bit unclear as to whether you mean "use Kafka to send data to a Hadoop file" or "use Kafka to read a Hadoop file into a Kafka topic". But in both cases, Kafka Connect provides a good option. The more

Re: Taking a long time to roll a new log segment (~1 min)

2017-01-09 Thread Ewen Cheslack-Postava
I can't speak to the exact details of why fds would be kept open longer in that specific case, but are you aware that the recommendation for production clusters for open fd limits is much higher? It's been suggested to be 100,000 as a starting point for quite a while:

Re: compaction + delete not working for me

2017-01-06 Thread Ewen Cheslack-Postava
On Fri, Jan 6, 2017 at 3:57 AM, Mike Gould wrote: > Hi > > I'm trying to configure log compaction + deletion as per KIP-71 in kafka > 0.10.1 but so far haven't had any luck. My tests show more than 50% > duplicate keys when reading from the beginning even several minutes

Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-06 Thread Ewen Cheslack-Postava
= 1 > Message3, Timestamp = T1, Offset = 2 > > > Would offsetForTimestamp(T1) return offset for earliest message with > timestamp T1 (i.e. Offset 0 in above example) ? > > > -Vignesh. > > On Thu, Jan 5, 2017 at 8:19 PM, Ewen Cheslack-Postava <e...@confluent.io> >

Re: One big kafka connect cluster or many small ones?

2017-01-06 Thread Ewen Cheslack-Postava
ented somewhere on the confluent website. I couldn’t find > it. > > On 6 January 2017 at 3:42:45 pm, Ewen Cheslack-Postava (e...@confluent.io) > wrote: > > On Thu, Jan 5, 2017 at 7:19 PM, Stephane Maarek < > steph...@simplemachines.com.au> wrote: > >> Thanks a lo

Re: Apache Kafka integration using Apache Camel

2017-01-05 Thread Ewen Cheslack-Postava
More generally, do you have any log errors/messages or additional info? It's tough to debug issues like this from 3rd party libraries if they don't provide logs/exception info that indicates why processing a specific message failed. -Ewen On Thu, Jan 5, 2017 at 8:29 PM, UMESH CHAUDHARY

Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Ewen Cheslack-Postava
ems reasonable for most folks since *ideally* you are *somewhat* standardized on a common serialization format). -Ewen > > On 6 January 2017 at 1:54:10 pm, Ewen Cheslack-Postava (e...@confluent.io) > wrote: > > On Thu, Jan 5, 2017 at 3:12 PM, Stephane Maarek < > steph...@simpl

Re: Consumer Rebalancing Question

2017-01-05 Thread Ewen Cheslack-Postava
p. This happens once and then the "issue" is resolved without any additional interruptions. -Ewen On Thu, Jan 5, 2017 at 3:01 PM, Pradeep Gollakota <pradeep...@gmail.com> wrote: > I see... doesn't that cause flapping though? > > On Wed, Jan 4, 2017 at 8:22 PM, Ewen

Re: MirrorMaker - Topics Identification and Replication

2017-01-05 Thread Ewen Cheslack-Postava
s wrong in my understanding. > > Thanks > > On Tue, 3 Jan 2017 at 23:24 Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Yes, the consumer will pick up the new topics when it refreshes metadata > > (defaults to every 5 min) and start subscrib

Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-05 Thread Ewen Cheslack-Postava
On Wed, Jan 4, 2017 at 11:54 PM, Vignesh wrote: > Hi, > > offsetsForTimes > KafkaConsumer.html#offsetsForTimes(java.util.Map)> > function > returns offset for a given timestamp. Does it use

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread Ewen Cheslack-Postava
The basic issue here is just that the auto.offset.reset defaults to latest, right? That's not a very good setting for a mirroring tool and this seems like something we might just want to change the default for. It's debatable whether it would even need a KIP. We have other settings in MM where we

Re: Metric meaning

2017-01-05 Thread Ewen Cheslack-Postava
There's not currently anything more detailed than what is included in http://kafka.apache.org/documentation/#monitoring There's some work trying to automate the generation of that documentation (https://issues.apache.org/jira/browse/KAFKA-3480). That combined with some addition to give longer

Re: Query on MirrorMaker Replication - Bi-directional/Failover replication

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 3:07 AM, Greenhorn Techie wrote: > Hi, > > We are planning to setup MirrorMaker based Kafka replication for DR > purposes. The base requirement is to have a DR replication from primary > (site1) to DR site (site2)using MirrorMaker, > > However,

Re: Kafka Connect offset.storage.topic not receiving messages (i.e. how to access Kafka Connect offset metadata?)

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 11:30 AM, Phillip Mann wrote: > I am working on setting up a Kafka Connect Distributed Mode application > which will be a Kafka to S3 pipeline. I am using Kafka 0.10.1.0-1 and Kafka > Connect 3.1.1-1. So far things are going smoothly but one aspect that

Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 3:12 PM, Stephane Maarek < steph...@simplemachines.com.au> wrote: > Hi, > > We like to operate in micro-services (dockerize and ship everything on ecs) > and I was wondering which approach was preferred. > We have one kafka cluster, one zookeeper cluster, etc, but when it

Re: Consumer Rebalancing Question

2017-01-04 Thread Ewen Cheslack-Postava
The coordinator will immediately move the group into a rebalance if it needs it. The reason LeaveGroupRequest was added was to avoid having to wait for the session timeout before completing a rebalance. So aside from the latency of cleanup/committing offsets/rejoining after a heartbeat, rolling

Re: Kafka Connect gets into a rebalance loop

2017-01-04 Thread Ewen Cheslack-Postava
.S. The version of Kafka Connect I'm running is > {"version":"0.10.0.0-cp1","commit":"7aeb2e89dbc741f6"} > On Sat, Dec 17, 2016 at 7:55 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > The message > > > > >

Re: Reg: Need info on Kafka Brokers

2017-01-03 Thread Ewen Cheslack-Postava
Unfortunately, I don't think it has been open sourced (it doesn't seem to be available on https://github.com/paypal). -Ewen On Tue, Jan 3, 2017 at 5:54 PM, Jhon Davis wrote: > Found an interesting Kafka monitoring tool but no information on whether > it's not open

Re: Why does consumer.subscribe(Pattern) require a ConsumerRebalanceListener?

2017-01-03 Thread Ewen Cheslack-Postava
Tbh, I can't remember the exact details around the discussion of the addition of this API, but I think this was to minimize API bloat. It's easy to end up with 83 overloads of methods to handle all the different combinations of parameters, but just a couple of shorthand overrides cover the vast

Re: MirrorMaker - Topics Identification and Replication

2017-01-03 Thread Ewen Cheslack-Postava
Yes, the consumer will pick up the new topics when it refreshes metadata (defaults to every 5 min) and start subscribing to the new topics. -Ewen On Tue, Jan 3, 2017 at 3:07 PM, Greenhorn Techie wrote: > Hi, > > I am new to Kafka and as well as MirrorMaker. So

Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-03 Thread Ewen Cheslack-Postava
the > __consumer_offsets topic and it doesn't have anything in it. Should I > provide write permissions to this topic for my Kafka client user? I am > running my consumer using a different user than Kafka user. > > Thanks, > Sri > > On Tue, Jan 3, 2017 at 3:40 PM,

Re: About Kafka Consumer : synchronous and blocking ?

2017-01-03 Thread Ewen Cheslack-Postava
would be fine, just be careful about committing offsets properly! -Ewen > > Thanks > Paolo > > Get Outlook for Android<https://aka.ms/ghei36> > > ____ > From: Ewen Cheslack-Postava <e...@confluent.io> > Sent: Tuesday, January

Re: how to ingest a database with a Kafka Connect cluster in parallel?

2017-01-03 Thread Ewen Cheslack-Postava
ctor? > > Have a nice day. > > Best regards, > Yang > > > 2017-01-03 20:55 GMT+01:00 Ewen Cheslack-Postava <e...@confluent.io>: > > > The unit of parallelism in connect is a task. It's only listing one task, > > so you only have one process copying data.

Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-03 Thread Ewen Cheslack-Postava
offset checker to see if any offsets are committed for the group. Also, is there anything in the logs that might indicate a problem with the consumer committing offsets? -Ewen > Thanks, > Sri > > On Tue, Jan 3, 2017 at 1:59 PM, Ewen Cheslack-Postava <e...@confluent.io> > wro

Re: About Kafka Consumer : synchronous and blocking ?

2017-01-03 Thread Ewen Cheslack-Postava
That's correct. Aside from commitAsync, all the consumer methods will block, although note that some are just local operations that affect subsequent method calls (e.g. seek() just sets some state locally). In fact, the only call that I think you'd need to actually worry about blocking is poll().

Re: Schema reg avro version

2017-01-03 Thread Ewen Cheslack-Postava
It just hasn't been updated recently. It isn't just the schema-registry that needs to be updated since other components use the library as well and we'd need to avoid potential classpath conflicts, but it should be straightforward to update. -Ewen On Tue, Jan 3, 2017 at 8:45 AM, Scott Ferguson

Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-03 Thread Ewen Cheslack-Postava
On Tue, Jan 3, 2017 at 8:38 AM, Srikrishna Alla wrote: > Hi, > > I am using Kafka/Kafka Connect to track certain events happening in my > application. This is how I have implemented it - > 1. My application is opening a KafkaProducer every time this event happens > and

Re: how to ingest a database with a Kafka Connect cluster in parallel?

2017-01-03 Thread Ewen Cheslack-Postava
The unit of parallelism in connect is a task. It's only listing one task, so you only have one process copying data. The connector can consume data from within a single *database* in parallel, but each *table* must be handled by a single task. Since your table whitelist only includes a single
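
A rough sketch of why a single whitelisted table yields a single task: tables are the unit of work handed to tasks, so the task count cannot exceed the table count. The grouping below is illustrative only; `group_tables` is a hypothetical helper, not the connector's actual code.

```python
def group_tables(tables, max_tasks):
    """Split tables into at most max_tasks groups, one group per task.

    Each table belongs to exactly one group, so the number of tasks
    is capped by the number of tables.
    """
    num_groups = min(len(tables), max_tasks)
    if num_groups == 0:
        return []
    groups = [[] for _ in range(num_groups)]
    for i, table in enumerate(tables):
        groups[i % num_groups].append(table)
    return groups


one_table = group_tables(["orders"], max_tasks=8)
three_tables = group_tables(["orders", "users", "items"], max_tasks=2)
```

Raising `tasks.max` alone therefore does not add parallelism when only one table is being copied.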

Re: kafka streams and broadcast topic

2017-01-02 Thread Ewen Cheslack-Postava
I think what you're describing could be handled in KStreams by a "global" KTable. This functionality is currently being discussed/voted on in a KIP discussion: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67633649 The list of interests would be a global KTable (shared globally

Re: Interesting error message du jour

2016-12-30 Thread Ewen Cheslack-Postava
Jon, This looks the same as https://issues.apache.org/jira/browse/KAFKA-4563, although for a different invalid transition. The temporary fix suggested there is to simply convert the exception to log a warning, which should be a pretty trivial patch against trunk. It seems there are some

Re: Processing time series data in order

2016-12-29 Thread Ewen Cheslack-Postava
The best you can do to ensure ordering today is to set:
  acks = all
  retries = Integer.MAX_VALUE
  max.block.ms = Long.MAX_VALUE
  max.in.flight.requests.per.connection = 1
This ensures there's only one outstanding produce request (batch of messages) at a time, it will be retried indefinitely on
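
As a small illustration, the four settings named in this message can be written out as a properties fragment. The sketch only renders text; the numeric values are the Java `Integer.MAX_VALUE` and `Long.MAX_VALUE` constants spelled out, and `to_properties` is a hypothetical helper.

```python
INT_MAX = 2**31 - 1   # Java Integer.MAX_VALUE
LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE

# The settings listed in the message, keyed by producer config name.
ordering_config = {
    "acks": "all",
    "retries": INT_MAX,
    "max.block.ms": LONG_MAX,
    "max.in.flight.requests.per.connection": 1,
}


def to_properties(config):
    """Render a config dict as key=value lines for a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


properties_text = to_properties(ordering_config)
```

The resulting text could be pasted into a producer properties file; whether these values suit a given deployment depends on how long the application can tolerate blocking.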

Re: Is it possible for consumers within a single consumer group to have different subscriptions?

2016-12-21 Thread Ewen Cheslack-Postava
It is possible for them to have different subscriptions. Consumers will only be assigned partitions from topics to which they are subscribed. So if you need to modify your app to use data from an additional topic, you can safely do a rolling deploy of the updated version and during the period where

Re: Producer connect timeouts

2016-12-19 Thread Ewen Cheslack-Postava
ult is > certainly very surprising behavior. It would also be nice not to have to > coordinate request timeouts, retries, and the max block configuration with > system-level configs. > > > On Sat, Dec 17, 2016 at 6:55 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote

Re: TLS

2016-12-19 Thread Ewen Cheslack-Postava
Ruben, There are step-by-step instructions explained here: http://docs.confluent.io/3.1.1/kafka/security.html For the purposes of configuring Kafka, the JAAS details basically boil down to a security configuration in a security configuration file. -Ewen On Mon, Dec 19, 2016 at 8:40 AM, Ruben

Re: Kafka Connect gets into a rebalance loop

2016-12-17 Thread Ewen Cheslack-Postava
The message "Wasn't unable to resume work after last rebalance" means that your previous iterations of the rebalance were somehow behind/out of sync with other members of the group, i.e. they had not read up to the same point in the config topic so it wouldn't be safe for this worker (or possibly

Re: __consumer_offsets topic acks

2016-12-17 Thread Ewen Cheslack-Postava
The default is -1 which means all replicas need to replicate the committed data before the ack will be sent to the consumer. See the offsets.commit.required.acks setting for the broker. min.insync.replicas applies to the offsets topic as well, but defaults to 1. You may want to increase this

Re: What does GetOffsetShell result represent

2016-12-17 Thread Ewen Cheslack-Postava
The tool writes output in the format <topic>:<partition>:<offset>. So in the case of your example with --time -1 that returned test-window-stream:0:724, it is saying that test-window-stream has partition 0 with a valid log segment which has the first offset = 724. Note that --time -1 is a special code for "only give the
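
An output line such as test-window-stream:0:724 can be split into its parts with a few lines of code. This is a minimal sketch: `parse_offset_line` is a hypothetical helper, and accepting a comma-separated list in the last field is an assumption about how the tool prints more than one offset.

```python
def parse_offset_line(line):
    """Split one 'topic:partition:offset' line into (topic, partition, offsets)."""
    # Split from the right so only the last two colons act as separators.
    topic, partition, offsets = line.strip().rsplit(":", 2)
    # Assumption: several offsets, if present, are separated by commas.
    return topic, int(partition), [int(value) for value in offsets.split(",") if value]


if __name__ == "__main__":
    print(parse_offset_line("test-window-stream:0:724"))
```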

Re: Kafka connect distributed mode not distributing the work

2016-12-17 Thread Ewen Cheslack-Postava
Hi Manjunath, I think you're seeing a case of this issue: https://issues.apache.org/jira/browse/KAFKA-4553 where the way round robin assignment works with an even # of workers and connectors that generate only 1 task generates uneven work assignments because connectors aren't really equivalent
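
The effect described here can be reproduced with a toy model. The sketch assumes that each connector and then its tasks are handed out in a single interleaved round-robin pass; the preview above is truncated, so that ordering is an assumption rather than a confirmed description of Connect's assignment code, and `interleaved_round_robin` and the worker names are hypothetical.

```python
from itertools import cycle


def interleaved_round_robin(workers, connectors):
    """Toy model: hand out each connector, then its tasks, in one round-robin pass.

    workers: list of worker names
    connectors: dict of connector name -> number of tasks
    """
    assignment = {worker: {"connectors": [], "tasks": []} for worker in workers}
    next_worker = cycle(workers)
    for name in sorted(connectors):
        assignment[next(next_worker)]["connectors"].append(name)
        for task_id in range(connectors[name]):
            assignment[next(next_worker)]["tasks"].append(f"{name}-{task_id}")
    return assignment


if __name__ == "__main__":
    # Two workers, two connectors with one task each.
    print(interleaved_round_robin(["worker-a", "worker-b"], {"c1": 1, "c2": 1}))
```

Under these assumptions, one worker ends up holding only connectors and the other only tasks. Since the tasks are what copy data, the second worker does essentially all of the work.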

Re: Producer connect timeouts

2016-12-17 Thread Ewen Cheslack-Postava
Without having dug back into the code to check, this sounds right. Connection management just fires off a request to connect and then subsequent poll() calls will handle any successful/failed connections. The timeouts wrt requests are handled somewhat differently (the connection request isn't

Re: How to disable auto commit for SimpleConsumer kafka 0.8.1

2016-12-10 Thread Ewen Cheslack-Postava
The simple consumer doesn't do auto-commit. It really only issues individual low-level Kafka protocol request types, so `commitOffsets` is the only way it should write offsets. Is it possible it crashed after the request was sent but before finishing reading the response? Side-note: I know you

Re: Best approach to frequently restarting consumer process

2016-12-10 Thread Ewen Cheslack-Postava
Consumer groups aren't going to handle 'let it crash' particularly well (and really any session-based services, but particularly consumer groups since a single failure affects the entire group). That said, 'let it crash' doesn't necessarily have to mean 'don't try to clean up at all'. The consumer

Re: Configuration for low latency and low cpu utilization? java/librdkafka

2016-12-10 Thread Ewen Cheslack-Postava
On the producer side, there's not much you can do to reduce CPU usage if you want low latency and don't have enough throughput to buffer multiple messages -- you're going to end up sending 1 record at a time in order to achieve your desired latency. Note, however, that the producer is thread safe,

Re: Upgrading from 0.10.0.1 to 0.10.1.0

2016-12-10 Thread Ewen Cheslack-Postava
Hagen, What does "new consumer doesn't like the old brokers" mean exactly? When upgrading MM, remember that it uses the clients internally so the same compatibility rules apply: you need to upgrade both sets of brokers before you can start using the new version of MM. -Ewen On Thu, Dec 8, 2016

Re: NotEnoughReplication

2016-12-10 Thread Ewen Cheslack-Postava
This error doesn't necessarily mean that a broker is down; it can also mean that too many replicas for that topic partition have fallen behind the leader. This indicates your replication is lagging for some reason. You'll want to be monitoring some of the metrics listed here:

Re: Running mirror maker between two different version of kafka

2016-12-10 Thread Ewen Cheslack-Postava
It's tough to read that stacktrace, but if I understand what you mean by "running the kafka mirroring in destination cluster which is 0.10.1.0 version of kafka", then the problem is that you cannot use 0.10.1.0 mirror maker with an 0.8.1. cluster. MirrorMaker is both a producer and consumer, so
