Re: WELCOME

2016-02-08 Thread Ewen Cheslack-Postava
If it's taking that long, you may be working on hardware (or a VM?) which is too underpowered to run some of the tests reliably. The README.md has instructions for how to build different components. In your case you want ./gradlew core:test. Depending on your hardware, you may want to adjust the

Re: Number of concurrent consumers per data node

2016-02-05 Thread Ewen Cheslack-Postava
On Wed, Feb 3, 2016 at 3:57 PM, Shane MacPhillamy wrote: > Hi > > I’m just coming up to speed with Kafka. Some beginner questions, may be > point me to where I can find the answers please: > > 1. In a Kafka cluster what determines the maximum number of concurrent >

Re: Getting started with 0.9 client custom serialization

2016-02-05 Thread Ewen Cheslack-Postava
Gary, Here are a few concrete examples from Kafka and Confluent Platform: JSON (baked into Kafka Connect, not specifically designed for standalone serialization but they should work for that):
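
As a rough sketch of what the standalone serializer plug-in point looks like in the 0.9 clients (illustrative only, not one of the serializers mentioned above, and it assumes Jackson is on the classpath): implement org.apache.kafka.common.serialization.Serializer and point the producer's value.serializer config at the class.

    import java.util.Map;
    import org.apache.kafka.common.errors.SerializationException;
    import org.apache.kafka.common.serialization.Serializer;
    import com.fasterxml.jackson.databind.ObjectMapper;

    // Illustrative serializer that writes map-shaped records as JSON bytes.
    public class JsonMapSerializer implements Serializer<Map<String, Object>> {
        private final ObjectMapper mapper = new ObjectMapper();

        @Override
        public void configure(Map<String, ?> configs, boolean isKey) { /* no configuration needed */ }

        @Override
        public byte[] serialize(String topic, Map<String, Object> data) {
            try {
                return data == null ? null : mapper.writeValueAsBytes(data);
            } catch (Exception e) {
                throw new SerializationException("Failed to serialize record as JSON", e);
            }
        }

        @Override
        public void close() { /* nothing to release */ }
    }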

Re: kafka “stops working” after a large message is enqueued

2016-02-05 Thread Ewen Cheslack-Postava
The default max message size is 1MB. You'll probably need to increase a few settings -- the topic max message size on a per-topic basis on the broker (or broker-wide with message.max.bytes), the max.partition.fetch.bytes on the new consumer, etc. You need to make sure all of the producer, broker,
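
As an illustration of the client-side settings involved (10 MB is an arbitrary example value, not a recommendation), with the matching broker-side settings noted in comments:

    import java.util.Properties;

    public class LargeMessageConfigs {
        public static void main(String[] args) {
            // Producer: raise the request size cap (the default is 1 MB).
            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "localhost:9092");
            producerProps.put("max.request.size", "10485760");

            // New consumer: each partition fetch must be able to hold the largest message.
            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "localhost:9092");
            consumerProps.put("group.id", "large-message-group");
            consumerProps.put("max.partition.fetch.bytes", "10485760");

            // Broker/topic side (server.properties or per-topic config) must be raised to match,
            // e.g. message.max.bytes=10485760 and replica.fetch.max.bytes=10485760 for replication.
        }
    }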

Re: Announcing ruby-kafka v0.1

2016-02-05 Thread Ewen Cheslack-Postava
Daniel, Awesome, Ruby folks could use more Kafka love! I added the library to the clients list here: https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-Ruby I'm also cc'ing this to the clients list since I think they'd be interested as well. Lots of folks are using the Java

Re: MongoDB Kafka Connect driver

2016-01-29 Thread Ewen Cheslack-Postava
Sunny, As I said on Twitter, I'm stoked to hear you're working on a Mongo connector! It struck me as a pretty natural source to tackle since it does such a nice job of cleanly exposing the op log. Regarding the problem of only getting deltas, unfortunately there is not a trivial solution here --

Re: MongoDB Kafka Connect driver

2016-01-29 Thread Ewen Cheslack-Postava
ou get the current value not > every necessarily every intermediate) but that should be okay for most > uses. > > -Jay > > On Fri, Jan 29, 2016 at 8:54 AM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Sunny, > > > > As I said on Twitter, I'm stoked t

Re: Accumulating data in Kafka Connect source tasks

2016-01-29 Thread Ewen Cheslack-Postava
On Fri, Jan 29, 2016 at 7:06 AM, Randall Hauch <rha...@gmail.com> wrote: > On January 28, 2016 at 7:07:02 PM, Ewen Cheslack-Postava ( > e...@confluent.io) wrote: > > Randall, > > Great question. Ideally you wouldn't need this type of state since it > should really

Re: Accumulating data in Kafka Connect source tasks

2016-01-28 Thread Ewen Cheslack-Postava
Randall, Great question. Ideally you wouldn't need this type of state since it should really be available in the source system. In your case, it might actually make sense to be able to grab that information from the DB itself, although that will also have issues if, for example, there have been

Re: Shutting down Producer

2016-01-27 Thread Ewen Cheslack-Postava
If you don't shut it down properly and there are outstanding requests (e.g. if you call producer.send() and don't call get() on the returned future), then you could potentially lose data. Calling producer.close() flushes all the data before returning, so shutting down properly ensures no data will
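
A minimal sketch of why this matters (broker address and topic name are placeholders):

    import java.util.Properties;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class CleanShutdown {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            // send() only enqueues; the record may still be sitting in the client buffer here.
            Future<RecordMetadata> future = producer.send(new ProducerRecord<>("events", "key", "value"));

            // Exiting now, without close() (or future.get()), could silently drop that record.
            // close() blocks until buffered records are sent and acked.
            producer.close();
        }
    }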

Re: Only interested in certain partitions

2016-01-27 Thread Ewen Cheslack-Postava
One option is to instantiate and invoke the DefaultPartitioner yourself (or whatever partitioner you've specified for partitioner.class). However, that will require passing in a Cluster object, which you'll need to construct yourself. This is just used to get the number of partitions for the topic

Re: Behaviour of KafkaConsumer.poll(long)

2016-01-26 Thread Ewen Cheslack-Postava
It's not an iterator (ConsumerRecords is a collection of records), but you also won't just get the entire set of messages all at once. You would have the same issue if you set auto.offset.reset to earliest for a new consumer -- everything that's in the topic will need to be consumed. Under the
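
For reference, a minimal poll loop (broker, group, and topic names are placeholders). Each poll() returns only the records fetched so far, so a large backlog is consumed across many calls rather than in one huge response:

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PollLoop {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "example-group");
            props.put("auto.offset.reset", "earliest"); // start from the beginning when there is no committed offset
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Arrays.asList("example-topic"));
            while (true) {
                // Returns whatever batch is currently available, not the entire topic.
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records)
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }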

Re: Zookeeper to Broker Count

2016-01-26 Thread Ewen Cheslack-Postava
No, you don't need to keep adding ZK nodes. You should have a 3 or 5 node ZK cluster. The more nodes you use, the slower write performance becomes, so adding more can hurt performance of any ZK-related operations. The tradeoff between 3 and 5 ZK nodes is fault tolerance (better with 5) vs write

Re: Alternate persistence for Kafka

2016-01-26 Thread Ewen Cheslack-Postava
Nikhil, You should search the mailing list archives, but I'm not aware of any discussion around that. If you wanted to try something like that, you might be able to accomplish it via FUSE or similar. For example, this page lists ways you can mount HDFS as a normal filesystem, including fuse-based

Re: Having multi-threaded Kafka Consumer per partition, is it possible and recommended, if so any sample snippet?

2016-01-21 Thread Ewen Cheslack-Postava
Hi, The new consumer is single threaded. You can layer multi-threaded processing on top of it, but you'll definitely need to be careful about how offset commits are handled to ensure a) processing of a message is actually *complete* not just passed off to another thread before committing an
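
One simplified way to layer threads on top of the single-threaded consumer (a sketch only; names are placeholders, and it ignores rebalance/session-timeout handling): poll on one thread, hand the batch to a pool, and commit only after every record in the batch has finished, so committed offsets never run ahead of processing.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ThreadedProcessing {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "worker-group");
            props.put("enable.auto.commit", "false"); // commit manually, only after processing completes
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            ExecutorService pool = Executors.newFixedThreadPool(4);
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Arrays.asList("work-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                List<Future<?>> inFlight = new ArrayList<>();
                for (ConsumerRecord<String, String> record : records)
                    inFlight.add(pool.submit(() -> process(record)));
                for (Future<?> f : inFlight)
                    f.get();           // wait for every record in the batch to be fully processed...
                consumer.commitSync(); // ...and only then commit, so a crash cannot skip unprocessed records
            }
        }

        static void process(ConsumerRecord<String, String> record) { /* application logic */ }
    }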

Re: 409 Conflict

2016-01-12 Thread Ewen Cheslack-Postava
Is the consumer registration failing, or the subsequent calls to read from the topic? From the error, it sounds like the latter -- a conflict during registration should generate a 40902 error. Can you give more info about the sequence of requests that causes the error? The set of commands you

Re: kafka-producer-perf-test.sh - 0.8.2.1

2016-01-08 Thread Ewen Cheslack-Postava
t; have yet to find the right one. > > Thanks! > > Andrew > > On Fri, Jan 8, 2016 at 10:54 AM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Andrew, > > > > kafka-producer-perf-test.sh is just a wrapper around > > orga.apache.kafka.c

Re: kafka-producer-perf-test.sh - 0.8.2.1

2016-01-08 Thread Ewen Cheslack-Postava
-class.sh > org.apache.kafka.clients.tools.ProducerPerformance > is single threaded? Or is there any way to specify number of threads? > > On Fri, Jan 8, 2016 at 1:24 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Ah, sorry, I missed the version number in your title. I think this too

Re: kafka-producer-perf-test.sh - 0.8.2.1

2016-01-08 Thread Ewen Cheslack-Postava
Andrew, kafka-producer-perf-test.sh is just a wrapper around org.apache.kafka.clients.tools.ProducerPerformance and all command line options should be forwarded. Can you just pass a --producer-props to set max.request.size to a larger value? -Ewen On Fri, Jan 8, 2016 at 7:51 AM, Andrej

Re: Kafka-Rest question

2016-01-07 Thread Ewen Cheslack-Postava
ET client, but in the proxy it > doesn’t appear to work like that. > > -Original Message- > From: Ewen Cheslack-Postava [mailto:e...@confluent.io] > Sent: Thursday, January 07, 2016 1:18 PM > To: users@kafka.apache.org > Subject: Re: Kafka-Rest question > > On Thu, Jan 7, 2016 at 12

Re: [ANN] kinsky: clojure 0.9.0 client

2016-01-07 Thread Ewen Cheslack-Postava
Looks good! I've added it to the clients page here: https://cwiki.apache.org/confluence/display/KAFKA/Clients -Ewen On Thu, Jan 7, 2016 at 9:12 AM, Dana Powers wrote: > Very nice! > On Jan 7, 2016 04:41, "Pierre-Yves Ritschard" wrote: > > > Hi list, >

Re: Server Logs

2016-01-01 Thread Ewen Cheslack-Postava
Chandra, If it's a separate app to collect logs, it would presumably run on the same server IIS is running on since that's where logs would be generated. -Ewen On Tue, Dec 29, 2015 at 9:22 PM, chandra sekar wrote: > Dear Ewen, > Where do i run the separate

Re: Server Logs

2015-12-28 Thread Ewen Cheslack-Postava
Chandra, If you're just serving files from IIS and want to collect logs, you'll probably want to run a separate application to collect the log files and report each log entry to Kafka. If you're running a web application, you can use the producer yourself to report events to Kafka. -Ewen On

Re: Protocol version upgrades in 0.9

2015-12-28 Thread Ewen Cheslack-Postava
is if you want to be able to mix consumers using different libraries in the same consumer group (consumers in different groups using different libraries should always be fine). -Ewen On Mon, Dec 28, 2015 at 4:16 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote: > Yes, version=0 should

Re: Protocol version upgrades in 0.9

2015-12-24 Thread Ewen Cheslack-Postava
----- > > Maybe I misunderstood the purpose of this version field? > > On Thu, 24 Dec 2015 at 00:27 Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Oleksiy, > > > > Where are you specifying the version? Unless I'm missing something, th

Re: Protocol version upgrades in 0.9

2015-12-23 Thread Ewen Cheslack-Postava
Oleksiy, Where are you specifying the version? Unless I'm missing something, the JoinGroup protocol doesn't include versions so I'm not sure I understand the examples you are giving. Are the version numbers included in the per-protocol metadata? You can see exactly how the consumer coordinator

Re: [kafka-clients] The JIRA Awakens [KAFKA-1841]

2015-12-23 Thread Ewen Cheslack-Postava
Dana, Not sure about the old merge script, but with the new one used for GitHub PRs it tracks the branches you choose to commit to and can update the JIRA automatically, tagging the appropriate fix versions. So from now on, the appropriate JIRA queries for issues with, e.g., fix version 0.9.0.1

Re: Gradle build error

2015-12-23 Thread Ewen Cheslack-Postava
What version of Gradle are you using and can you give the exact command you're running? -Ewen On Wed, Dec 23, 2015 at 5:49 PM, Oliver Pačut wrote: > Hello, > > I am having trouble using Gradle to build Kafka. I get the error: > > > FAILURE: Build failed with an

Re: compatibility: 0.8.1.1 broker, 0.8.2.2 producer

2015-12-23 Thread Ewen Cheslack-Postava
producer with 0.8.1.1 brokers without problems. > > Version of scala matters if you are building with scala or some other > > components that use scala. > > Hope this helps. > > > > -- > > Andrey Yegorov > > > > On Wed, Dec 23, 2015 at 1:11 PM, Ew

Re: kafka-connect-jdbc: ids, timestamps, and transactions

2015-12-16 Thread Ewen Cheslack-Postava
Mark, There are definitely limitations to using JDBC for change data capture. Using a database-specific implementation, especially if you can read directly off the database's log, will be able to handle more situations like this. Cases like the one you describe are difficult to address

Re: kafka connect(copycat) question

2015-12-10 Thread Ewen Cheslack-Postava
uent.io/2.0.0/connect/devguide.html > > Thanks, > Roman > > > > On Wednesday, November 11, 2015 6:59 AM, Ewen Cheslack-Postava < > e...@confluent.io> wrote: > Hi Venkatesh, > > If you're using the default settings included in the sample configs, it'll > expect

Re: Mirrormaker issue with Kafka 0.9 (confluent platform 2.0)

2015-12-10 Thread Ewen Cheslack-Postava
Meghana, It looks like this functionality was removed in https://issues.apache.org/jira/browse/KAFKA-1650, although I don't see explicit discussion of the issue in the JIRA so I'm not sure of the exact motivation. Maybe Becket or Guozhang can offer some insight about if it is necessary (that JIRA

Re: kafka connect(copycat) question

2015-12-08 Thread Ewen Cheslack-Postava
Svante, Just to clarify, the HDFS connector relies on some Avro translation code which is in a separate repository. You need the https://github.com/confluentinc/schema-registry repository built before the kafka-connector-hdfs repository to get that dependency. Confluent has now also released

Re: flush() vs close()

2015-12-01 Thread Ewen Cheslack-Postava
Kashif, The difference is that close() will also shut down the producer such that it can no longer send any messages. flush(), in contrast, is useful if you want to make sure that all the messages enqueued so far have been sent and acked, but also want to send more messages after that. -Ewen On
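
A small sketch of the distinction (broker address and topic name are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class FlushVsClose {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            producer.send(new ProducerRecord<>("metrics", "before-flush"));
            producer.flush(); // blocks until everything enqueued so far is sent and acked

            producer.send(new ProducerRecord<>("metrics", "after-flush")); // still allowed after flush()

            producer.close(); // also waits for outstanding sends, but no further send() calls are possible
        }
    }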

Re: kafka connect(copycat) question

2015-11-16 Thread Ewen Cheslack-Postava
jar:0.9.0.0, > io.confluent:kafka-connect-avro-converter:jar:2.0.0-SNAPSHOT, > io.confluent:common-config:jar:2.0.0-SNAPSHOT: Could not find artifact > org.apache.kafka:connect-api:jar:0.9.0.0 in confluent > > On Thu, Nov 12, 2015 at 2:59 PM, Ewen Cheslack-Postava <e...@confluent.io&g

Re: where do I get the Kafka classes

2015-11-16 Thread Ewen Cheslack-Postava
Hi Adaryl, First, it looks like you might be trying to use the old producer interface. That interface is going to be deprecated in favor of the new producer (under org.apache.kafka.clients.producer). I'd highly recommend using the new producer interface instead. Second, perhaps this repository

Re: kafka connect(copycat) question

2015-11-12 Thread Ewen Cheslack-Postava
he sink as part of a confluent > project ( > > https://github.com/confluentinc/copycat-hdfs/blob/master/src/main/java/io/confluent/copycat/hdfs/HdfsSinkConnector.java > ). > Does it mean that I build this project and add the jar to kafka libs ? > > > > > On Tue, Nov

Re: Ephemeral ports for Kafka broker

2015-11-11 Thread Ewen Cheslack-Postava
Passing 0 as the port should let you do this. This is how we get the tests to work without assuming a specific port is available. The KafkaServer.boundPort(SecurityProtocol) method can be used to get the port that was bound. -Ewen On Tue, Nov 10, 2015 at 11:23 PM, Hemanth Yamijala

Re: [VOTE] 0.9.0.0 Candiate 1

2015-11-10 Thread Ewen Cheslack-Postava
Jun, not sure if this is just because of the RC vs being published on the site, but the links in the release notes aren't pointing to issues.apache.org. They're relative URLs instead of absolute. -Ewen On Tue, Nov 10, 2015 at 3:38 AM, Flavio Junqueira wrote: > -1 (non-binding)

Re: kafka connect(copycat) question

2015-11-10 Thread Ewen Cheslack-Postava
Hi Venkatesh, If you're using the default settings included in the sample configs, it'll expect JSON data in a special format to support passing schemas along with the data. This is turned on by default because it makes it possible to work with a *lot* more connectors and data storage systems

Re: kafka connect(copycat) question

2015-11-10 Thread Ewen Cheslack-Postava
in > double-quotes. > 2) Once I hit the above JsonParser error on the SinkTask, the connector is > hung, doesn't take any more messages even proper ones. > > > On Tue, Nov 10, 2015 at 1:59 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Hi Venka

Re: Kafka 090 maven coordinate

2015-11-03 Thread Ewen Cheslack-Postava
0.9.0.0 is not released yet, but the last blockers are being addressed and release candidates should follow soon. The docs there are just staged as we prepare for the release (note, e.g., that the latest release on the downloads page http://kafka.apache.org/downloads.html is still 0.8.2.2). -Ewen

Re: Kafka topic message consumer fetch response time checking

2015-10-20 Thread Ewen Cheslack-Postava
If you want the full round-trip latency, you need to measure that at the client. The performance measurement tools should do this pretty accurately. For example, if you just want to know how long it takes to produce a message to Kafka and get an ack back, you can use the latency numbers reported
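
If you want to instrument it yourself rather than rely on the tools, a rough sketch of timing a single produce round trip (broker address and topic name are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ProduceAckLatency {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("acks", "all");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            long start = System.nanoTime();
            // get() blocks until the broker acks, so this covers the full produce round trip.
            producer.send(new ProducerRecord<>("latency-test", "ping")).get();
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println("produce+ack latency: " + micros + " us");
            producer.close();
        }
    }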

Re: kafka consumer shell scripts (and kafkacat) with respect to JSON pretty printing

2015-10-20 Thread Ewen Cheslack-Postava
You can accomplish this with the console consumer -- it has a formatter flag that lets you plug in custom logic for formatting messages. The default does not do any formatting, but if you write your own implementation, you just need to set the flag to plug it in. You can see an example of this in

Re: Difference between storing offset on Kafka and on Zookeeper server?

2015-10-15 Thread Ewen Cheslack-Postava
There are a couple of advantages. First, scalability. Writes to Kafka are cheaper than writes to ZK. Kafka-based offset storage is going to be able to handle significantly more consumers (and scale out as needed since writes are spread across all partitions in the offsets topic). Second, once you

Re: Kafka 0.9.0 release branch

2015-10-13 Thread Ewen Cheslack-Postava
Yes, 0.9 will include the new consumer. On Tue, Oct 13, 2015 at 12:50 PM, Rajiv Kurian wrote: > A bit off topic but does this release contain the new single threaded > consumer that supports the poll interface? > > Thanks! > > On Mon, Oct 12, 2015 at 1:31 PM, Jun Rao

Re: [kafka-clients] Kafka 0.9.0 release branch

2015-10-13 Thread Ewen Cheslack-Postava
Not sure if I'd call it a blocker, but if we can get it in I would *really* like to see some solution to https://issues.apache.org/jira/browse/KAFKA-2397 committed. Without an explicit leave group, even normal operation of groups can leave some partitions unprocessed for 30-60s at a time under

Re: Better API/Javadocs?

2015-10-11 Thread Ewen Cheslack-Postava
On Javadocs, both new clients (producer and consumer) have very thorough documentation in the javadocs. 0.9.0.0 will be the first release with the new consumer. On deserialization, the new consumer lets you specify deserializers just like you do for the new producer. But the old consumer supports
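
For example, with the new consumer the deserializers can be passed straight to the constructor (a sketch; broker address and group id are placeholders), or given as class names via the key.deserializer / value.deserializer configs:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class DeserializerConfig {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "docs-example");

            // Deserializer instances passed directly, mirroring the new producer's serializers.
            KafkaConsumer<String, String> consumer =
                new KafkaConsumer<>(props, new StringDeserializer(), new StringDeserializer());
            consumer.close();
        }
    }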

Re: Documentation for 0.8.2 kafka-client api?

2015-10-08 Thread Ewen Cheslack-Postava
ConsumerConnector is part of the old consumer API (which is what is currently released; new consumer is coming in 0.9.0). That class is not in kafka-clients, it is in the core Kafka jar, which is named with the Scala version you want to use, e.g. kafka_2.10. -Ewen On Thu, Oct 8, 2015 at 1:24 PM,

Re: Partition ownership with high-level consumer

2015-10-07 Thread Ewen Cheslack-Postava
And you can get the current assignment in the new consumer after the rebalance completes too: https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L593 On Tue, Oct 6, 2015 at 5:27 PM, Gwen Shapira wrote: > Ah,

Re: How do I achieve round robin based partitioning for topic?

2015-09-21 Thread Ewen Cheslack-Postava
Are you using the old or new producer? That sounds like the behavior the old producer had -- it would stick to the same partition for awhile (10 minutes if I remember correctly). The new producer does not have this behavior, preferring to round-robin the *available* brokers. Note that since it
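
A tiny sketch of the difference at the API level, assuming an already-configured new producer (KafkaProducer<String, String> named producer; topic name is a placeholder):

    // Keyed record: the key is hashed to pick the partition, so the same key always lands together.
    producer.send(new ProducerRecord<>("clicks", "user-42", "payload"));

    // No key: the new producer spreads records round-robin over the currently available partitions.
    producer.send(new ProducerRecord<>("clicks", "payload"));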

Re: [VOTE] 0.8.2.2 Candidate 1

2015-09-09 Thread Ewen Cheslack-Postava
+1 non-binding. Verified artifacts, unit tests, quick start. On Wed, Sep 9, 2015 at 10:09 AM, Guozhang Wang wrote: > +1 binding, verified unit tests and quick start. > > On Wed, Sep 9, 2015 at 4:12 AM, Manikumar Reddy > wrote: > > > +1 (non-binding).

Re: 0.8.2 consumer api docs

2015-09-06 Thread Ewen Cheslack-Postava
Those APIs are not implemented in 0.8.2. They were included because the APIs were being iterated on, but the implementation wasn't there yet. You can expect to see those APIs (with some modifications as they've been refined) in 0.8.3. -Ewen On Sun, Sep 6, 2015 at 10:42 AM, Phil Steitz

Re: Question regarding to reconnect.backoff.ms

2015-09-02 Thread Ewen Cheslack-Postava
Steve, I don't think there is a better solution at the moment. This is an easy issue to miss in unit testing because generally connections to localhost will be rejected immediately if there isn't anything listening on the port. If you're running in an environment where this happens normally, then

Re: Kafka

2015-09-02 Thread Ewen Cheslack-Postava
Muqtafi, There are corresponding Java or Scala classes. You can use them directly, but beware that they are not considered public interfaces, so there are no promises about compatibility. They could completely change between releases. (The command line tools themselves, however, are considered

Re: Http Kafka producer

2015-08-27 Thread Ewen Cheslack-Postava
, or are batches just held in memory before being sent to Kafka (or some other option)? Thanks! On Aug 26, 2015, at 9:50 PM, Ewen Cheslack-Postava e...@confluent.io wrote: Hemanth, The Confluent Platform 1.0 version of have JSON embedded format support (i.e. direct embedding of JSON messages

Re: Http Kafka producer

2015-08-26 Thread Ewen Cheslack-Postava
Hemanth, Can you be a bit more specific about your setup? Do you have control over the format of the request bodies that reach HAProxy or not? If you do, Confluent's REST proxy should work fine and does not require the Schema Registry. It supports both binary (encoded as base64 so it can be

Re: Http Kafka producer

2015-08-26 Thread Ewen Cheslack-Postava
schema registry ? --regards Hemanth -Original Message- From: Ewen Cheslack-Postava [mailto:e...@confluent.io] Sent: Thursday, August 27, 2015 9:14 AM To: users@kafka.apache.org Subject: Re: Http Kafka producer Hemanth, Can you be a bit more specific about your setup? Do you have

Re: bootstrap.servers for the new Producer

2015-08-22 Thread Ewen Cheslack-Postava
initiated (and as long as the dns entry is there) it only fails to connect in the poll() method. And in the poll() method the status is not reset to DISCONNECTED and so it not blacked out. On Fri, Aug 21, 2015 at 10:06 PM, Ewen Cheslack-Postava e...@confluent.io wrote: Are you seeing

Re: bootstrap.servers for the new Producer

2015-08-21 Thread Ewen Cheslack-Postava
Are you seeing this in practice or is this just a concern about the way the code currently works? If the broker is actually down and the host is rejecting connections, the situation you describe shouldn't be a problem. It's true that the NetworkClient chooses a fixed nodeIndexOffset, but the

Re: Kafka java consumer

2015-08-14 Thread Ewen Cheslack-Postava
There's not a precise date for the release, ~1.5 or 2 months from now. On Fri, Aug 14, 2015 at 3:45 PM, Abhijith Prabhakar abhi.preda...@gmail.com wrote: Thanks Ewen. Any idea when we can expect 0.8.3? On Aug 14, 2015, at 5:36 PM, Ewen Cheslack-Postava e...@confluent.io wrote: Hi

Re: Kafka java consumer

2015-08-14 Thread Ewen Cheslack-Postava
Hi Abhijith, You should be using KafkaProducer, but KafkaConsumer is not ready yet. The APIs are included in 0.8.2.1, but the implementation is not ready. Until 0.8.3 is released, you cannot rely only on kafka-clients if you want to write a consumer. You'll need to depend on the main kafka jar

Re: best way to call ReassignPartitionsCommand programmatically

2015-08-10 Thread Ewen Cheslack-Postava
It's not public API so it may not be stable between releases, but you could try using the ReassignPartitionsCommand class directly. Or, you can see that the code in that class is a very simple use of ZkUtils, so you could just make the necessary calls to ZkUtils directly. In the future, when

Re: How to read messages from Kafka by specific time?

2015-08-10 Thread Ewen Cheslack-Postava
You can use SimpleConsumer.getOffsetsBefore to get a list of offsets before a Unix timestamp. However, this isn't per-message. The offsets returned are for the log segments stored on the broker, so the granularity will depend on your log rolling settings. -Ewen On Wed, Aug 5, 2015 at 2:11 AM,
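
The call looks roughly like this (adapted from the SimpleConsumer example on the wiki; host, topic, and timestamp are placeholders):

    import java.util.HashMap;
    import java.util.Map;
    import kafka.api.PartitionOffsetRequestInfo;
    import kafka.common.TopicAndPartition;
    import kafka.javaapi.OffsetRequest;
    import kafka.javaapi.OffsetResponse;
    import kafka.javaapi.consumer.SimpleConsumer;

    public class OffsetsBeforeExample {
        public static void main(String[] args) {
            SimpleConsumer consumer = new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "offset-lookup");
            TopicAndPartition tp = new TopicAndPartition("my-topic", 0);

            long timestamp = System.currentTimeMillis() - 60 * 60 * 1000L; // offsets before one hour ago
            Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<>();
            requestInfo.put(tp, new PartitionOffsetRequestInfo(timestamp, 1));

            OffsetRequest request =
                new OffsetRequest(requestInfo, kafka.api.OffsetRequest.CurrentVersion(), "offset-lookup");
            OffsetResponse response = consumer.getOffsetsBefore(request);

            // Offsets correspond to log segment boundaries, so granularity depends on segment roll settings.
            long[] offsets = response.offsets("my-topic", 0);
            System.out.println("latest segment offset before timestamp: " + offsets[0]);
            consumer.close();
        }
    }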

Re: how to get single record from kafka topic+partition @ specified offset

2015-08-10 Thread Ewen Cheslack-Postava
:52 PM, Joe Lawson jlaw...@opensourceconnections.com wrote: Ewen, Do you have an example or link for the changes/plans that will bring the benefits you describe? Cheers, Joe Lawson On Aug 10, 2015 3:27 PM, Ewen Cheslack-Postava e...@confluent.io wrote: You can do this using

Re: how to get single record from kafka topic+partition @ specified offset

2015-08-10 Thread Ewen Cheslack-Postava
You can do this using the SimpleConsumer. See https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example for details with some code. When the new consumer is released in 0.8.3, this will get a *lot* simpler. -Ewen On Fri, Aug 7, 2015 at 9:26 AM, Padgett, Ben
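
A rough sketch of the SimpleConsumer fetch (again following the wiki example; host, topic, and offset are placeholders):

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import kafka.api.FetchRequest;
    import kafka.api.FetchRequestBuilder;
    import kafka.javaapi.FetchResponse;
    import kafka.javaapi.consumer.SimpleConsumer;
    import kafka.message.MessageAndOffset;

    public class FetchOneRecord {
        public static void main(String[] args) {
            SimpleConsumer consumer = new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "single-fetch");
            long targetOffset = 12345L; // the offset you want

            FetchRequest req = new FetchRequestBuilder()
                    .clientId("single-fetch")
                    .addFetch("my-topic", 0, targetOffset, 100000) // topic, partition, offset, maxBytes
                    .build();
            FetchResponse response = consumer.fetch(req);

            for (MessageAndOffset messageAndOffset : response.messageSet("my-topic", 0)) {
                if (messageAndOffset.offset() != targetOffset)
                    continue; // the fetch can return later messages too; keep only the requested one
                ByteBuffer payload = messageAndOffset.message().payload();
                byte[] bytes = new byte[payload.limit()];
                payload.get(bytes);
                System.out.println(new String(bytes, StandardCharsets.UTF_8));
            }
            consumer.close();
        }
    }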

Re: problem about the offset

2015-08-10 Thread Ewen Cheslack-Postava
Kafka doesn't track per-message timestamps. The request you're using gets a list of offsets for *log segments* with timestamps earlier than the one you specify. If you start consuming from the offset returned, you should find the timestamp you specified in the same log file. -Ewen On Mon, Aug

Re: Cache Memory Kafka Process

2015-07-29 Thread Ewen Cheslack-Postava
, Nilesh Chhapru. On Tuesday 28 July 2015 12:37 PM, Ewen Cheslack-Postava wrote: Nilesh, It's expected that a lot of memory is used for cache. This makes sense because under the hood, Kafka mostly just reads and writes data to/from files. While Kafka does manage some in-memory data, mostly

Re: Cache Memory Kafka Process

2015-07-28 Thread Ewen Cheslack-Postava
stops reading at times. When i do a free -m on my broker node after 1/2 - 1 hr the memory foot print is as follows. 1) Physical memory - 500 MB - 600 MB 2) Cache Memory - 6.5 GB 3) Free Memory - 50 - 60 MB Regards, Nilesh Chhapru. On Monday 27 July 2015 11:02 PM, Ewen Cheslack-Postava

Re: Best practices - Using kafka (with http server) as source-of-truth

2015-07-27 Thread Ewen Cheslack-Postava
Hi Prabhjot, Confluent has a REST proxy with docs that may give some guidance: http://docs.confluent.io/1.0/kafka-rest/docs/intro.html The new producer that it uses is very efficient, so you should be able to get pretty good throughput. You take a bit of a hit due to the overhead of sending data

Re: Cache Memory Kafka Process

2015-07-27 Thread Ewen Cheslack-Postava
Having the OS cache the data in Kafka's log files is useful since it means that data doesn't need to be read back from disk when consumed. This is good for the latency and throughput of consumers. Usually this caching works out pretty well, keeping the latest data from your topics in cache and

Re: deleting data automatically

2015-07-27 Thread Ewen Cheslack-Postava
log.segment.bytes? Thanks. On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava e...@confluent.io wrote: I think log.cleanup.interval.mins was removed in the first 0.8 release. It sounds like you're looking at outdated docs. Search for log.retention.check.interval.ms here: http

Re: deleting data automatically

2015-07-27 Thread Ewen Cheslack-Postava
log.retention.check.interval.ms, but there is log.cleanup.interval.mins, is that what you mean? If I set log.roll.ms or log.cleanup.interval.mins too small, will it hurt the throughput? Thanks. On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava e...@confluent.io wrote: You'll want to set the log retention

Re: Choosing brokers when creating topics

2015-07-27 Thread Ewen Cheslack-Postava
Try the --replica-assignment option for kafka-topics.sh. It allows you to specify which brokers to assign as replicas instead of relying on the assignments being made automatically. -Ewen On Mon, Jul 27, 2015 at 12:25 AM, Jilin Xie jilinxie1...@gmail.com wrote: Hi Is it possible to

Re: deleting data automatically

2015-07-24 Thread Ewen Cheslack-Postava
You'll want to set the log retention policy via log.retention.{ms,minutes,hours} or log.retention.bytes. If you want really aggressive collection (e.g., on the order of seconds, as you specified), you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and

Re: wow--kafka--why? unresolved dependency: com.eed3si9n#sbt-assembly;0.8.8: not found

2015-07-23 Thread Ewen Cheslack-Postava
,daemonRegistryDir=/root/.gradle/daemon,pid=26478,idleTimeout=12,daemonOpts=-XX:MaxPermSize=512m,-Xmx1024m,-Dfile.encoding=UTF-8,-Duser.country=PH,-Duser.language=en,-Duser.variant]}. 04:32:54.865 [DEBUG] [org.gradle.launch On Fri, Jul 24, 2015 at 3:34 AM, Ewen Cheslack-Postava e...@confluent.io

Re: wow--kafka--why? unresolved dependency: com.eed3si9n#sbt-assembly;0.8.8: not found

2015-07-23 Thread Ewen Cheslack-Postava
Also, the branch you're checking out is very old. If you want the most recent release, that's tagged as 0.8.2.1. Otherwise, you'll want to use the trunk branch. -Ewen On Thu, Jul 23, 2015 at 11:45 AM, Gwen Shapira gshap...@cloudera.com wrote: Sorry, we don't actually do SBT builds anymore.

Re: New producer hangs inifitely when it looses connection to Kafka cluster

2015-07-21 Thread Ewen Cheslack-Postava
This is a known issue. There are a few relevant JIRAs and a KIP: https://issues.apache.org/jira/browse/KAFKA-1788 https://issues.apache.org/jira/browse/KAFKA-2120 https://cwiki.apache.org/confluence/display/KAFKA/KIP-19+-+Add+a+request+timeout+to+NetworkClient -Ewen On Tue, Jul 21, 2015 at 7:05

Re: New consumer - poll/seek javadoc confusing, need clarification

2015-07-21 Thread Ewen Cheslack-Postava
On Tue, Jul 21, 2015 at 2:38 AM, Stevo Slavić ssla...@gmail.com wrote: Hello Apache Kafka community, I find new consumer poll/seek javadoc a bit confusing. Just by reading docs I'm not sure what the outcome will be, what is expected in following scenario: - kafkaConsumer is instantiated

Re: Retrieving lost messages produced while the consumer was down.

2015-07-21 Thread Ewen Cheslack-Postava
Since you mentioned consumer groups, I'm assuming you're using the high level consumer? Do you have auto.commit.enable set to true? It sounds like when you start up you are always getting the auto.offset.reset behavior, which indicates you don't have any offsets committed. By default, that
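
For reference, a sketch of the relevant high-level consumer settings (values are examples only): with auto.commit.enable=true the consumer periodically commits what it has consumed, and auto.offset.reset is only consulted when the group has no committed offset at all.

    import java.util.Properties;
    import kafka.consumer.ConsumerConfig;

    public class HighLevelConsumerProps {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181");
            props.put("group.id", "my-group");
            props.put("auto.commit.enable", "true");     // commit consumed offsets periodically
            props.put("auto.offset.reset", "smallest");  // only applies when no committed offset exists
            ConsumerConfig config = new ConsumerConfig(props);
        }
    }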

Re: Kafka brokers on HTTP port

2015-07-16 Thread Ewen Cheslack-Postava
, Chandrash3khar Kotekar Mobile - +91 8600011455 On Fri, Jul 17, 2015 at 4:57 AM, Ewen Cheslack-Postava e...@confluent.io wrote: Chandrashekhar, If the firewall rules allow any TCP connection on those ports, you can just use Kafka directly and change the default port. If they actually verify

Re: Kafka brokers on HTTP port

2015-07-16 Thread Ewen Cheslack-Postava
Chandrashekhar, If the firewall rules allow any TCP connection on those ports, you can just use Kafka directly and change the default port. If they actually verify that it's HTTP traffic then you'd have to use the REST Proxy Edward mentioned or another HTTP-based proxy. -Ewen On Thu, Jul 16, 2015 at

Re: latency performance test

2015-07-16 Thread Ewen Cheslack-Postava
, Thank you for your patient explaining. It is very helpful. Can we assume that the long latency of ProducerPerformance comes from queuing delay in the buffer and it is related to buffer size? Thank you! best, Yuheng On Thu, Jul 16, 2015 at 12:21 AM, Ewen Cheslack-Postava e...@confluent.io

Re: resources for simple consumer?

2015-07-15 Thread Ewen Cheslack-Postava
Hi Jeff, The simple consumer hasn't really changed, the info you found should still be relevant. The wiki page on it might be the most useful reference for getting started: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example And if you want a version all setup to

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Ewen Cheslack-Postava
Also worth mentioning is that the new producer doesn't have this behavior -- it will round robin over available partitions for records without keys. Available means it currently has a leader -- under normal cases this means it distributes evenly across all partitions, but if a partition is down

Re: latency performance test

2015-07-15 Thread Ewen Cheslack-Postava
The tests are meant to evaluate different things and the way they send messages is the source of the difference. EndToEndLatency works with a single message at a time. It produces the message then waits for the consumer to receive it. This approach guarantees there is no delay due to queuing. The

Re: kafka benchmark tests

2015-07-14 Thread Ewen Cheslack-Postava
at? Thanks. Yuheng On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava e...@confluent.io wrote: I implemented (nearly) the same basic set of tests in the system test framework we started at Confluent and that is going to move

Re: Data Structure abstractions over kafka

2015-07-13 Thread Ewen Cheslack-Postava
Tim, Kafka can be used as a key-value store if you turn on log compaction: http://kafka.apache.org/documentation.html#compaction You need to be careful with that since it's purely last-writer-wins and doesn't have anything like CAS that might help you manage concurrent writers, but the basic
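
A sketch of the idea, assuming an already-configured KafkaProducer<String, String> named producer and a topic created with cleanup.policy=compact (names are placeholders):

    // Always produce with the entity's id as the key.
    producer.send(new ProducerRecord<>("user-profiles", "user-42", "{\"name\":\"Alice\"}"));
    producer.send(new ProducerRecord<>("user-profiles", "user-42", "{\"name\":\"Alice B.\"}"));
    // After compaction runs, only the latest record for key "user-42" is retained
    // (last writer wins -- there is no compare-and-set to guard concurrent writers).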

Re: kafka benchmark tests

2015-07-13 Thread Ewen Cheslack-Postava
I implemented (nearly) the same basic set of tests in the system test framework we started at Confluent and that is going to move into Kafka -- see the wip patch for KIP-25 here: https://github.com/apache/kafka/pull/70 In particular, that test is implemented in benchmark_test.py:

Re: Batch producer latencies and flush()

2015-06-28 Thread Ewen Cheslack-Postava
The logic you're requesting is basically what the new producer implements. The first condition is the batch size limit and the second is linger.ms. The actual logic is a bit more complicated and has some caveats dealing with, for example, backing off after failures, but you can see in this code
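
Concretely, those two knobs are plain producer configs (values here are arbitrary examples): a partition's batch is sent once it reaches batch.size bytes, or after linger.ms even if it is not full, whichever comes first.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    public class BatchingConfig {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("batch.size", "65536"); // send a partition's batch once it holds 64 KB...
            props.put("linger.ms", "50");     // ...or after 50 ms, whichever comes first

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            producer.close();
        }
    }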

Re: No key specified when sending the message to Kafka

2015-06-23 Thread Ewen Cheslack-Postava
It does balance data, but is sticky over short periods of time (for some definition of short...). See this FAQ for an explanation: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified ? This behavior has been

Re: Consumer rebalancing based on partition sizes?

2015-06-23 Thread Ewen Cheslack-Postava
Current partition assignment only has a few limited options -- see the partition.assignment.strategy consumer option (which seems to be listed twice, see the second version for a more detailed explanation). There has been some discussion of making assignment strategies user extensible to support

Re: NoSuchMethodError with Consumer Instantiation

2015-06-18 Thread Ewen Cheslack-Postava
It looks like you have mixed up versions of the kafka jars: 4. kafka_2.11-0.8.3-SNAPSHOT.jar 5. kafka_2.11-0.8.2.1.jar 6. kafka-clients-0.8.2.1.jar I think org.apache.kafka.common.utils.Utils is very new, probably post 0.8.2.1, so it's probably caused by the kafka_2.11-0.8.3-SNAPSHOT.jar being

Re: Kafka as an event store for Event Sourcing

2015-06-13 Thread Ewen Cheslack-Postava
that could be rebuilt from the log or a snapshot. On lør. 13. jun. 2015 at 20.26 Ewen Cheslack-Postava e...@confluent.io wrote: Jay - I think you need broker support if you want CAS to work with compacted topics. With the approach you described you can't turn on compaction since that would make

Re: Kafka as an event store for Event Sourcing

2015-06-13 Thread Ewen Cheslack-Postava
: But wouldn't the key-offset table be enough to accept or reject a write? I'm not familiar with the exact implementation of Kafka, so I may be wrong. On lør. 13. jun. 2015 at 21.05 Ewen Cheslack-Postava e...@confluent.io wrote: Daniel: By random read, I meant not reading the data

Re: Kafka as an event store for Event Sourcing

2015-06-13 Thread Ewen Cheslack-Postava
Jay - I think you need broker support if you want CAS to work with compacted topics. With the approach you described you can't turn on compaction since that would make it last-writer-wins, and using any non-infinite retention policy would require some external process to monitor keys that might

Re: offsets.storage=kafka, dual.commit.enabled=false still requires ZK

2015-06-09 Thread Ewen Cheslack-Postava
The new consumer implementation, which should be included in 0.8.3, only needs a bootstrap.servers setting and does not use a zookeeper connection. On Tue, Jun 9, 2015 at 1:26 PM, noah iamn...@gmail.com wrote: We are setting up a new Kafka project (0.8.2.1) and are trying to go straight to

Re: How Producer handles Network Connectivity Issues

2015-05-26 Thread Ewen Cheslack-Postava
It's not being switched in this case because the broker hasn't failed. It can still connect to all the other brokers and zookeeper. The only failure is of the link between a client and the broker. Another way to think of this is to extend the scenario with more producers. If I have 100 other

Re: KafkaConsumer poll always returns null

2015-05-20 Thread Ewen Cheslack-Postava
-Postava - do you have an example you could post? From: Ewen Cheslack-Postava [e...@confluent.io] Sent: Tuesday, May 19, 2015 3:12 PM To: users@kafka.apache.org Subject: Re: KafkaConsumer poll always returns null The new consumer in trunk is functional

Re: KafkaConsumer poll always returns null

2015-05-19 Thread Ewen Cheslack-Postava
The new consumer in trunk is functional when used similarly to the old SimpleConsumer, but none of the functionality corresponding to the high level consumer is there yet (broker-based coordination for consumer groups). There's not a specific timeline for the next release (i.e. when it's ready).
