Re: KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-02-03 Thread Grant Henke
Thanks for proposing this Ismael. This makes sense to me.

In this KIP and the Java KIP you state:

> A reasonable policy is to support the 2 most recently released versions so
> that we can strike a good balance between supporting older versions,
> maintainability and taking advantage of language and library improvements.


What do you think about adjusting the KIP to instead vote on that as a
standard policy for Java and Scala going forward? Something along the lines
of:

"Kafka's policy is to support the 2 most recently released versions of Java
and Scala at a given time. When a new version becomes available, the
supported versions will be updated in the next major release of Kafka."


On Fri, Feb 3, 2017 at 8:30 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi all,
>
> I have posted a KIP for dropping support for Scala 2.10 in Kafka 0.11:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11
>
> Please take a look. Your feedback is appreciated.
>
> Thanks,
> Ismael
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Grant Henke
Looks good to me. Thanks for handling the KIP.

On Fri, Feb 3, 2017 at 8:49 AM, Damian Guy <damian@gmail.com> wrote:

> Thanks Ismael. Makes sense to me.
>
> On Fri, 3 Feb 2017 at 10:39 Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Hi all,
> >
> > I have posted a KIP for dropping support for Java 7 in Kafka 0.11:
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 118%3A+Drop+Support+for+Java+7+in+Kafka+0.11
> >
> > Most people were supportive when we last discussed the topic[1], but
> there
> > were a few concerns. I believe the following should mitigate the
> concerns:
> >
> > 1. The new proposal suggests dropping support in the next major version
> > instead of the next minor version.
> > 2. KIP-97 which is part of 0.10.2 means that 0.11 clients will support
> 0.10
> > brokers (0.11 brokers will also support 0.10 clients as usual), so there
> is
> > even more flexibility on incremental upgrades.
> > 3. Java 9 will be released shortly after the next Kafka release, so we'll
> > be supporting the 2 most recent Java releases, which is a reasonable
> > policy.
> > 4. 8 months have passed since the last proposal and the release after
> > 0.10.2 won't be out for another 4 months, which should hopefully be
> enough
> > time for Java 8 to be even more established. We haven't decided when the
> > next major release will happen, but we know that it won't happen before
> > June 2017.
> >
> > Please take a look at the proposal and share your feedback.
> >
> > Thanks,
> > Ismael
> >
> > [1] http://search-hadoop.com/m/Kafka/uyzND1oIhV61GS5Sf2
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [ANNOUNCE] New committer: Grant Henke

2017-01-12 Thread Grant Henke
Thanks everyone!

On Thu, Jan 12, 2017 at 2:58 AM, Damian Guy <damian@gmail.com> wrote:

> Congratulations!
>
> On Thu, 12 Jan 2017 at 03:35 Jun Rao <j...@confluent.io> wrote:
>
> > Grant,
> >
> > Thanks for all your contribution! Congratulations!
> >
> > Jun
> >
> > On Wed, Jan 11, 2017 at 2:51 PM, Gwen Shapira <g...@confluent.io> wrote:
> >
> > > The PMC for Apache Kafka has invited Grant Henke to join as a
> > > committer and we are pleased to announce that he has accepted!
> > >
> > > Grant contributed 88 patches, 90 code reviews, countless great
> > > comments on discussions, a much-needed cleanup to our protocol and the
> > > on-going and critical work on the Admin protocol. Throughout this, he
> > > displayed great technical judgment, high-quality work and willingness
> > > to contribute where needed to make Apache Kafka awesome.
> > >
> > > Thank you for your contributions, Grant :)
> > >
> > > --
> > > Gwen Shapira
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog
> > >
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [VOTE] KIP-106 - Default unclean.leader.election.enabled True => False

2017-01-11 Thread Grant Henke
+1

On Wed, Jan 11, 2017 at 1:23 PM, Sriram Subramanian <r...@confluent.io>
wrote:

> +1
>
> On Wed, Jan 11, 2017 at 11:10 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Thanks for raising this, +1.
> >
> > Ismael
> >
> > On Wed, Jan 11, 2017 at 6:56 PM, Ben Stopford <b...@confluent.io> wrote:
> >
> > > Looks like there was a good consensus on the discuss thread for KIP-106
> > so
> > > lets move to a vote.
> > >
> > > Please chime in if you would like to change the default for
> > > unclean.leader.election.enabled from true to false.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/%
> > > 5BWIP%5D+KIP-106+-+Change+Default+unclean.leader.
> > > election.enabled+from+True+to+False
> > >
> > > B
> > >
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: What's the difference between kafka_2.11 and kafka-client?

2016-06-23 Thread Grant Henke
kafka_2.11 contains the Kafka server (broker) code and the old Scala clients.
kafka-clients contains the new Java clients.
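
For example, the dependency coordinates look roughly like this (versions are
just illustrative):

  org.apache.kafka:kafka_2.11:0.10.0.0      <- broker/server code plus old Scala clients
  org.apache.kafka:kafka-clients:0.10.0.0   <- new Java producer and consumer

If you only need to produce or consume from Java, kafka-clients is the one to
depend on.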

Thanks,
Grant

On Thu, Jun 23, 2016 at 9:25 PM, BYEONG-GI KIM <bg...@bluedigm.com> wrote:

> Hello.
>
> I wonder what the difference is between kafka_2.11 and kafka-client on
> Maven Repo.
>
> Thank you in advance!
>
> Best regards
>
> KIM
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [DISCUSS] Java 8 as a minimum requirement

2016-06-16 Thread Grant Henke
> > > > > most phones will use that version sadly). This reduces (but does
> not
> > > > > eliminate) the chance that we would be the first project that would
> > > > cause a
> > > > > user to consider a Java upgrade.
> > > > >
> > > > > The main argument for not making the change is that a reasonable
> > number
> > > > of
> > > > > users may still be using Java 7 by the time Kafka 0.10.1.0 is
> > released.
> > > > > More specifically, we care about the subset who would be able to
> > > upgrade
> > > > to
> > > > > Kafka 0.10.1.0, but would not be able to upgrade the Java version.
> It
> > > > would
> > > > > be great if we could quantify this in some way.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Ismael
> > > > >
> > > > > [1] https://java.com/en/download/faq/java_7.xml
> > > > > [2]
> > https://blogs.oracle.com/thejavatutorials/entry/jdk_8_is_released
> > > > > [3] http://openjdk.java.net/projects/jdk9/
> > > > > [4] https://github.com/apache/cassandra/blob/trunk/README.asc
> > > > > [5]
> > > https://lucene.apache.org/#highlights-of-this-lucene-release-include
> > > > > [6] http://akka.io/news/2015/09/30/akka-2.4.0-released.html
> > > > > [7] https://issues.apache.org/jira/browse/HADOOP-11858
> > > > > [8] https://webtide.com/jetty-9-3-features/
> > > > > [9] http://markmail.org/message/l7s276y3xkga2eqf
> > > > > [10]
> > > > >
> > > > >
> > > >
> > >
> >
> https://intellij-support.jetbrains.com/hc/en-us/articles/206544879-Selecting-the-JDK-version-the-IDE-will-run-under
> > > > > [11] http://markmail.org/message/l7s276y3xkga2eqf
> > > > >
> > > >
> > >
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Topic relentlessly recreated

2016-06-16 Thread Grant Henke
Hi Jason,

> We encountered a corrupted topic and when we attempt to delete it, it comes
> back with some unusable defaults. It's really, really annoying.
>

It sounds like you may have auto topic creation enabled and a client is
constantly requesting that topic, causing it to be re-created. Try
setting auto.create.topics.enable=false. Note that this may cause that
client to fail.

> We tried creating the topic with one broker down as the topics is only
> created on one broker but the tool requires all brokers online so that
> doesn't work. As an aside, creating a topic without all brokers online
> should be possible...
> /usr/bin/kafka-topics.sh --zookeeper localhost --create --topic
> job_processing --partitions 4 --replication-factor 2


How many brokers do you have? It sounds like you may only have 2 brokers
total (and one is down). If that's the case, the tool is not able to reach
the requested replication factor of 2 and fails. If you had 3 brokers and 1
was down, this should work.

> Since delete doesn't do what most people would expect and actually delete,
> we can't delete once online so we are completely stuck.


It is true that the delete functionality could be improved. However, assuming
your cluster and topics are healthy, delete should work as you expect and
actually delete the topic. Do you have delete functionality enabled on your
cluster? Try setting delete.topic.enable=true.
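
If you do end up changing both settings, a minimal sketch (broker restart
required; the topic name is taken from your command):

  # server.properties on each broker
  auto.create.topics.enable=false
  delete.topic.enable=true

  # then retry the delete with all brokers up
  /usr/bin/kafka-topics.sh --zookeeper localhost --delete --topic job_processing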

Thanks,
Grant






On Thu, Jun 16, 2016 at 2:11 PM, Jason Kania <jason.ka...@ymail.com.invalid>
wrote:

> We encountered a corrupted topic and when we attempt to delete it, it
> comes back with some unusable defaults. It's really, really annoying.
> We are shutting down all the kafka brokers, removing the kafka log folder
> and contents on all nodes, removing the broker topic information from
> zookeeper and restarting everything, but it continues to happen.
> We tried creating the topic with one broker down as the topics is only
> created on one broker but the tool requires all brokers online so that
> doesn't work. As an aside, creating a topic without all brokers online
> should be possible...
> /usr/bin/kafka-topics.sh --zookeeper localhost --create --topic
> job_processing --partitions 4 --replication-factor 2
>
> Since delete doesn't do what most people would expect and actually delete,
> we can't delete once online so we are completely stuck.
> Any suggestions would be appreciated.
> Thanks,
> Jason




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Invalid Version for API key

2016-06-08 Thread Grant Henke
ApiKey 2 is the Offsets request. There is only a version 0 of that protocol,
since there has been no change in the protocol version for LIST_OFFSETS from
0.8 to 0.10. So the error that version 1 is invalid is correct.

What has changed in 0.10 is that the validation and error reporting for
incorrect API keys and versions has been improved.

Are you using any custom clients? Can you identify what commands (or client
code) trigger this error log?



On Wed, Jun 8, 2016 at 9:39 AM, Chris Barlock <barl...@us.ibm.com> wrote:

> Anyone?  Is there additional tracing that can be turned on to track down
> the source of these exceptions?
>
> Chris
>
>
>
>
> From:   Chris Barlock/Raleigh/IBM@IBMUS
> To: users@kafka.apache.org
> Date:   06/07/2016 12:45 PM
> Subject:Invalid Version for API key
>
>
>
> We are running some tests on upgrading from Kafka 2.10-0.8.2.1 to
> 2.11-0.10.00.  So far, we only upgraded the server.  All of the clients
> that I could verify are still using the Maven kafka-clients version
> 0.8.2.1.  There are a number of exceptions in the server.log:
>
> [2016-06-07 16:00:00,266] ERROR Closing socket for
> 172.20.8.19:9092-172.20.3.0:53901 because of error
> (kafka.network.Processor)
> kafka.network.InvalidRequestException: Error getting request for apiKey: 2
>
> and apiVersion: 1
> at
> kafka.network.RequestChannel$Request.liftedTree2$1(RequestChannel.scala:95)
> at
> kafka.network.RequestChannel$Request.(RequestChannel.scala:87)
> at
>
> kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:488)
> at
>
> kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:483)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at
> scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at
> kafka.network.Processor.processCompletedReceives(SocketServer.scala:483)
> at kafka.network.Processor.run(SocketServer.scala:413)
> at java.lang.Thread.run(Thread.java:785)
> Caused by: java.lang.IllegalArgumentException: Invalid version for API key
>
> 2: 1
> at
> org.apache.kafka.common.protocol.ProtoUtils.schemaFor(ProtoUtils.java:31)
> at
>
> org.apache.kafka.common.protocol.ProtoUtils.requestSchema(ProtoUtils.java:44)
> at
>
> org.apache.kafka.common.protocol.ProtoUtils.parseRequest(ProtoUtils.java:60)
> at
>
> org.apache.kafka.common.requests.ListOffsetRequest.parse(ListOffsetRequest.java:142)
> at
>
> org.apache.kafka.common.requests.AbstractRequest.getRequest(AbstractRequest.java:46)
> at
> kafka.network.RequestChannel$Request.liftedTree2$1(RequestChannel.scala:92)
> ... 10 more
>
> Any guidance on what the source of this might be?
>
> Chris
>
>
>
>
>
>


-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [ANNOUCE] Apache Kafka 0.10.0.0 Released

2016-05-24 Thread Grant Henke
Awesome! Thanks for managing the release Gwen!

On Tue, May 24, 2016 at 11:24 AM, Gwen Shapira <gwens...@apache.org> wrote:

> The Apache Kafka community is pleased to announce the release for Apache
> Kafka 0.10.0.0.
> This is a major release with exciting new features, including first
> release of KafkaStreams and many other improvements.
>
> All of the changes in this release can be found:
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.0.0/RELEASE_NOTES.html
>
> Apache Kafka is high-throughput, publish-subscribe messaging system
> rethought of as a distributed commit log.
>
> ** Fast => A single Kafka broker can handle hundreds of megabytes of reads
> and
> writes per second from thousands of clients.
>
> ** Scalable => Kafka is designed to allow a single cluster to serve as the
> central data backbone
> for a large organization. It can be elastically and transparently expanded
> without downtime.
> Data streams are partitioned and spread over a cluster of machines to allow
> data streams
> larger than the capability of any single machine and to allow clusters of
> co-ordinated consumers.
>
> ** Durable => Messages are persisted on disk and replicated within the
> cluster to prevent
> data loss. Each broker can handle terabytes of messages without performance
> impact.
>
> ** Distributed by Design => Kafka has a modern cluster-centric design that
> offers
> strong durability and fault-tolerance guarantees.
>
> You can download the source release from
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.0.0/kafka-0.10.0.0-src.tgz
>
> and binary releases from
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.0.0/kafka_2.10-0.10.0.0.tgz
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.0.0/kafka_2.11-0.10.0.0.tgz
>
> A big thank you for the following people who have contributed to the
> 0.10.0.0 release.
>
> Adam Kunicki, Aditya Auradkar, Alex Loddengaard, Alex Sherwin, Allen
> Wang, Andrea Cosentino, Anna Povzner, Ashish Singh, Atul Soman, Ben
> Stopford, Bill Bejeck, BINLEI XUE, Chen Shangan, Chen Zhu, Christian
> Posta, Cory Kolbeck, Damian Guy, dan norwood, Dana Powers, David
> Jacot, Denise Fernandez, Dionysis Grigoropoulos, Dmitry Stratiychuk,
> Dong Lin, Dongjoon Hyun, Drausin Wulsin, Duncan Sands, Dustin Cote,
> Eamon Zhang, edoardo, Edward Ribeiro, Eno Thereska, Ewen
> Cheslack-Postava, Flavio Junqueira, Francois Visconte, Frank Scholten,
> Gabriel Zhang, gaob13, Geoff Anderson, glikson, Grant Henke, Greg
> Fodor, Guozhang Wang, Gwen Shapira, Igor Stepanov, Ishita Mandhan,
> Ismael Juma, Jaikiran Pai, Jakub Nowak, James Cheng, Jason Gustafson,
> Jay Kreps, Jeff Klukas, Jeremy Custenborder, Jesse Anderson, jholoman,
> Jiangjie Qin, Jin Xing, jinxing, Jonathan Bond, Jun Rao, Ján Koščo,
> Kaufman Ng, kenji yoshida, Kim Christensen, Kishore Senji, Konrad,
> Liquan Pei, Luciano Afranllie, Magnus Edenhill, Maksim Logvinenko,
> manasvigupta, Manikumar reddy O, Mark Grover, Matt Fluet, Matt
> McClure, Matthias J. Sax, Mayuresh Gharat, Micah Zoltu, Michael Blume,
> Michael G. Noll, Mickael Maison, Onur Karaman, ouyangliduo, Parth
> Brahmbhatt, Paul Cavallaro, Pierre-Yves Ritschard, Piotr Szwed,
> Praveen Devarao, Rafael Winterhalter, Rajini Sivaram, Randall Hauch,
> Richard Whaling, Ryan P, Samuel Julius Hecht, Sasaki Toru, Som Sahu,
> Sriharsha Chintalapani, Stig Rohde Døssing, Tao Xiao, Tom Crayford,
> Tom Dearman, Tom Graves, Tom Lee, Tomasz Nurkiewicz, Vahid Hashemian,
> William Thurston, Xin Wang, Yasuhiro Matsuda, Yifan Ying, Yuto
> Kawamura, zhuchen1018
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> http://kafka.apache.org/
>
> Thanks,
>
> Gwen
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Tell us how you're using Kafka and what you'd like to see in the future

2016-04-28 Thread Grant Henke
Hi Neha,

Thanks for sending out this Survey! Hopefully we got a lot of responses
from the community. Are there any results to share?

Thank you,
Grant

On Wed, Apr 6, 2016 at 4:58 PM, Neha Narkhede <n...@confluent.io> wrote:

> Folks,
>
> We'd like to hear from community members about how you are using Kafka
> today, and what sort of features you'd like to see in the future.
>
> We put together this survey
> <http://www.surveygizmo.com/s3/2655428/a32ea76f74d9> to gather your
> feedback, which should take only a few minutes to complete.
>
> Please be candid about what’s working and what needs to be improved—this
> community is greater than the sum of its parts, and your responses will
> ultimately help to steer the development of Kafka moving forward.
>
> I will share the results of the survey for everyone to look at.
>
> Thank you!
>
> Best,
> Neha
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Documentation

2016-03-29 Thread Grant Henke
Thanks for the details Dana!

I think this sort of thing could be worked into the new "Protocol Guide"
documentation: http://kafka.apache.org/protocol.html

On Tue, Mar 29, 2016 at 11:25 AM, Gwen Shapira <g...@confluent.io> wrote:

> Awesome summary, Dana. I'd like to fit this into our docs, but I'm not sure
> where does step-by-step-description of the protocol fits. Maybe in "Design"
> section?
>
> Just one more thing:
> 8) At any time, the broker can respond to a fetch request with
> "Rebalancing" error code, at which point the assignment dance begins from
> scratch: consumers needs to send the sync request, leaders need to create
> an assignment, etc.
>
> Gwen
>
>
>
> On Tue, Mar 29, 2016 at 9:05 AM, Dana Powers <dana.pow...@gmail.com>
> wrote:
>
> > I also found the documentation difficult to parse when it came time to
> > implement group APIs. I ended up just reading the client source code and
> > trying api calls until it made sense.
> >
> > My general description from off the top of my head:
> >
> > (1) have all consumers submit a shared protocol_name string* and
> > bytes-serialized protocol_metadata, which includes the topic(s) for
> > subscription.
> > (2) All consumers should be prepared to parse a JoinResponse and identify
> > whether they have been selected as the group leader.
> > (3) The non-leaders go straight to SyncGroupRequest, with
> group_assignment
> > empty, and will wait get their partition assignments via the
> > SyncGroupResponse.
> > (4) Unlike the others, the consumer-leader will get the agreed
> > protocol_name and all member's metadata in its JoinGroupResponse.
> > (5) The consumer-leader must apply the protocol_name assignment strategy
> to
> > the member list + metadata (plus the cluster metadata), generating a set
> of
> > member assignments (member_id -> topic partitions).
> > (6)The consumer-leader then submits that member_assignment data in its
> > SyncGroupRequest.
> > (7) The broker will take the group member_assignment data and split it
> out
> > into each members SyncGroupResponse (including the consumer-leader).
> >
> > At this point all members have their assignments and can start chugging
> > along.
> >
> > I've tried to encapsulate the group request / response protocol info
> > cleanly into kafka-python, if that helps you:
> >
> > https://github.com/dpkp/kafka-python/blob/1.0.2/kafka/protocol/group.py
> >
> > I'd be willing to help improve the docs if someone points me at the
> "source
> > of truth". Or feel free to ping on irc #apache-kafka
> >
> > -Dana
> >
> > *for java client these are 'range' or 'roundrobin', but are opaque and
> can
> > be anything if you aren't interested in joining groups that have
> > heterogeneous clients. related, the protocol_metadata and member_metadata
> > are also opaque and technically can be anything you like. But sticking
> with
> > the documented metadata specs should in theory allow heterogenous clients
> > to cooperate.
> >
> > On Tue, Mar 29, 2016 at 8:21 AM, Heath Ivie <hi...@autoanything.com>
> > wrote:
> >
> > > Does anyone have better documentation around the group membership APIs?
> > >
> > > The information about the APIs are great at the beginning but get
> > > progressively  sparse towards then end.
> > >
> > > I am not finding enough information about the values of the request
> > fields
> > > to join / sync the group.
> > >
> > > Can anyone help me or send me the some additional documentation?
> > >
> > > Heath Ivie
> > > Solutions Architect
> > >
> > >
> > > Warning: This e-mail may contain information proprietary to
> AutoAnything
> > > Inc. and is intended only for the use of the intended recipient(s). If
> > the
> > > reader of this message is not the intended recipient(s), you have
> > received
> > > this message in error and any review, dissemination, distribution or
> > > copying of this message is strictly prohibited. If you have received
> this
> > > message in error, please notify the sender immediately and delete all
> > > copies.
> > >
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Fetch Response V1 incorrectly documented

2016-03-03 Thread Grant Henke
It looks like you are right, the throttle time does come first. I have a
WIP implementation (https://github.com/apache/kafka/pull/970) that
generates the protocol docs based on the protocol specification in the code
and the output for fetch response v2 is:

> Fetch Response (Version: 2) => throttle_time_ms [responses]
>   responses => topic [partition_responses]
> partition_responses => partition error_code high_watermark record_set
>   partition => INT32
>   error_code => INT16
>   high_watermark => INT64
>   record_set => BYTES
> topic => STRING
>   throttle_time_ms => INT32
>
I don't think that was intentional though, because the similar produce
response puts it on the end like documented:

> Produce Response (Version: 2) => [responses] throttle_time_ms
>   responses => topic [partition_responses]
> partition_responses => partition error_code base_offset timestamp
>   partition => INT32
>   error_code => INT16
>   base_offset => INT64
>   timestamp => INT64
> topic => STRING
>   throttle_time_ms => INT32
>
However, even if it was a mistake it won't change until at least the next
protocol bump for fetch. Is it important to you that it be at the end
functionally? Or just that the documentation is correct?

Thanks,
Grant



On Thu, Mar 3, 2016 at 2:49 AM, Oleksiy Krivoshey <oleks...@gmail.com>
wrote:

> It seems that Fetch Response V1 is not correctly documented:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>
> It says the response should be:
>
> FetchResponse => [TopicName [Partition ErrorCode HighwaterMarkOffset
> MessageSetSize MessageSet]] ThrottleTime
>
> But it actually is (as of Kafka 0.9.0.1):
>
> FetchResponse => ThrottleTime [TopicName [Partition ErrorCode
> HighwaterMarkOffset MessageSetSize MessageSet]]
>
> e.g. ThrottleTime comes first after the response header, not last.
>
> As a client library developer (https://github.com/oleksiyk/kafka) I would
> like to know if its an error in documentation or in Kafka server?
>
> Thanks!
>
> --
> Oleksiy Krivoshey
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [kafka-clients] 0.9.0.1 RC1

2016-02-16 Thread Grant Henke
+1 (non-binding)

On Tue, Feb 16, 2016 at 8:59 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> +1
>
> On Thu, Feb 11, 2016 at 6:55 PM, Jun Rao <j...@confluent.io> wrote:
>
> > This is the first candidate for release of Apache Kafka 0.9.0.1. This a
> > bug fix release that fixes 70 issues.
> >
> > Release Notes for the 0.9.0.1 release
> >
> https://home.apache.org/~junrao/kafka-0.9.0.1-candidate1/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Tuesday, Feb. 16, 7pm PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS in addition to the md5, sha1
> > and sha2 (SHA256) checksum.
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~junrao/kafka-0.9.0.1-candidate1/
> >
> > * Maven artifacts to be voted upon prior to release:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * scala-doc
> > https://home.apache.org/~junrao/kafka-0.9.0.1-candidate1/scaladoc/
> >
> > * java-doc
> > https://home.apache.org/~junrao/kafka-0.9.0.1-candidate1/javadoc/
> >
> > * The tag to be voted upon (off the 0.9.0 branch) is the 0.9.0.1 tag
> >
> >
> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2c17685a45efe665bf5f24c0296cb8f9e1157e89
> >
> > * Documentation
> > http://kafka.apache.org/090/documentation.html
> >
> > Thanks,
> >
> > Jun
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "kafka-clients" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to kafka-clients+unsubscr...@googlegroups.com.
> > To post to this group, send email to kafka-clie...@googlegroups.com.
> > Visit this group at https://groups.google.com/group/kafka-clients.
> > To view this discussion on the web visit
> >
> https://groups.google.com/d/msgid/kafka-clients/CAFc58G8VbhZ5Q0nVnUAg8qR0yEO%3DqhYrHFtLySpJo1Nha%3DoOxA%40mail.gmail.com
> > <
> https://groups.google.com/d/msgid/kafka-clients/CAFc58G8VbhZ5Q0nVnUAg8qR0yEO%3DqhYrHFtLySpJo1Nha%3DoOxA%40mail.gmail.com?utm_medium=email_source=footer
> >
> > .
> > For more options, visit https://groups.google.com/d/optout.
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Maximum Topic Length in Kafka

2015-11-29 Thread Grant Henke
Quotas (KIP-13) are actually included in the recent 0.9.0 release. More
about them can be read in the documentation here:

   - http://kafka.apache.org/documentation.html#design_quotas
   - http://kafka.apache.org/documentation.html#quotas
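
As a rough example, the 0.9.0 brokers expose cluster-wide default quotas via
broker configs along these lines (names and values from memory, the docs above
are authoritative):

  # server.properties, bytes/second per client-id
  quota.producer.default=1048576
  quota.consumer.default=2097152

Note that quotas throttle byte rates per client-id; they do not cap the total
size of a topic.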



On Sun, Nov 29, 2015 at 9:24 AM, Marko Bonaći <marko.bon...@sematext.com>
wrote:

> Yes, I thought you weren't interested in retention, but how to limit the
> amount of messages produced into a topic.
> Take a look at this Kafka Improvement Proposal (KIP):
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> But, AFAIK, there's nothing currently available for your use case.
>
> Perhaps you could check Consumer offsets from your Producer and then decide
> based on that information whether to throttle Producer or not. Could get
> complicated really fast, though.
>
> Marko Bonaći
> Monitoring | Alerting | Anomaly Detection | Centralized Log Management
> Solr & Elasticsearch Support
> Sematext <http://sematext.com/> | Contact
> <http://sematext.com/about/contact.html>
>
> On Sun, Nov 29, 2015 at 8:57 AM, Debraj Manna <subharaj.ma...@gmail.com>
> wrote:
>
> > Let me explain my use case:-
> >
> > We have a ELK setup in which logstash-forwarders pushes logs from
> different
> > services to a logstash. The logstash then pushes them to kafka. The
> > logstash consumer then pulls them out of Kafka and indexes them to
> > Elasticsearch cluster.
> >
> > We are trying to ensure that no single service logs doesn't overwhelm the
> > system. So I was thinking if each service logs go in their own topics in
> > kafka and if we can specify a maximum length in the topic then the
> producer
> > of that topic can block when a kafka topic is full.
> > AFAIK there is no such notion as maximum length of a topic, i.e. offset
> has
> > no limit, except Long.MAX_VALUE I think, which should be enough for a
> > couple of lifetimes (9 * 10E18, or quintillion or million trillions).
> >
> > What would be the purpose of that, besides being a nice foot-gun :)
> >
> > Marko Bonaći
> > Monitoring | Alerting | Anomaly Detection | Centralized Log Management
> > Solr & Elasticsearch Support
> > Sematext <http://sematext.com/> | Contact
> > <http://sematext.com/about/contact.html>
> >
> > On Sat, Nov 28, 2015 at 2:13 PM, Debraj Manna <subharaj.ma...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Can some one please let me know the following:-
> > >
> > >
> > >1. Is it possible to specify maximum length of a particular topic (
> in
> > >terms of number of messages ) in kafka ?
> > >2. Also how does Kafka behave when a particular topic gets full?
> > >3. Can the producer be blocked if a topic get full rather than
> > deleting
> > >old messages?
> > >
> > > I have gone through the documentation
> > > <http://kafka.apache.org/081/documentation.html#basic_ops_add_topic>
> but
> > > could not find anything of what I am looking for.
> > >
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: build error

2015-11-17 Thread Grant Henke
I too would like to understand the command you used to build and from what
directory you ran it.

That aside, it is likely best to test using the release candidate artifacts
that are described in Jun's email here:
http://search-hadoop.com/m/uyzND1bBATE1M3hVn/candidate=+VOTE+0+9+0+0+Candiate+2

This gives you all built artifacts and a maven repo
<https://repository.apache.org/content/groups/staging/> with the client
artifacts
<https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka-clients/0.9.0.0/>.
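
For example, to compile against the staged client artifacts from a Gradle
build, something along these lines should work (syntax from memory):

  repositories {
    mavenCentral()
    maven { url "https://repository.apache.org/content/groups/staging/" }
  }
  dependencies {
    compile "org.apache.kafka:kafka-clients:0.9.0.0"
  }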

Thanks,
Grant

On Tue, Nov 17, 2015 at 12:17 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Did you just use "./gradlew build" in root directory?
>
> Guozhang
>
> On Mon, Nov 16, 2015 at 6:41 PM, hsy...@gmail.com <hsy...@gmail.com>
> wrote:
>
> > The actual thing I want to do is I want to build and install in my local
> > maven repository so I can include new api in my dependencies. When the
> > release is officially out, I can have both my code ready with the
> official
> > maven dependency
> >
> > Thanks,
> > Siyuan
> >
> > On Monday, November 16, 2015, Grant Henke <ghe...@cloudera.com> wrote:
> >
> > > Hi Siyuan,
> > >
> > > My guess is that you are trying to build from a subdirectory. I have a
> > > minor patch available to fix this that has not been pulled in yet here:
> > > https://github.com/apache/kafka/pull/509
> > >
> > > In the mean time, if you need to build a subproject you can execute a
> > > command like the following:
> > > gradle clients:build
> > >
> > > Thanks,
> > > Grant
> > >
> > > On Mon, Nov 16, 2015 at 6:33 PM, Guozhang Wang <wangg...@gmail.com
> > > <javascript:;>> wrote:
> > >
> > > > Siyuan,
> > > >
> > > > Which command did you use to build?
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, Nov 16, 2015 at 4:01 PM, hsy...@gmail.com <javascript:;> <
> > > hsy...@gmail.com <javascript:;>>
> > > > wrote:
> > > >
> > > > > I got a build error on both trunk and 0.9.0 branch
> > > > >
> > > > > > docs/producer_config.html (No such file or directory)
> > > > >
> > > > > Do I miss anything before build
> > > > >
> > > > > Thanks,
> > > > > Siyuan
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> > >
> > >
> > > --
> > > Grant Henke
> > > Software Engineer | Cloudera
> > > gr...@cloudera.com <javascript:;> | twitter.com/gchenke |
> > > linkedin.com/in/granthenke
> > >
> >
>
>
>
> --
> -- Guozhang
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: build error

2015-11-17 Thread Grant Henke
Glad you resolved it. Note that we are currently on gradle 2.8, and likely
2.9 soon, so feel free to update whenever it makes sense for you.

Ewen recently updated the developer setup page here
<https://cwiki.apache.org/confluence/display/KAFKA/Developer+Setup> to show
how to use the wrapper effectively in the Kafka repository. Since we do not
check in the wrapper jar, you need to run the default gradle task (by simply
executing `gradle`) before running any gradle wrapper tasks.
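
In other words, a fresh checkout typically looks like this (assuming you have
a recent gradle installed):

  gradle            # runs the default task, which bootstraps the wrapper jar
  ./gradlew build   # from then on, use the wrapper as usual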

See here
<http://search-hadoop.com/m/uyzND1GtKrd1nxbcE1/The+wrapper+should+be+in+the+repository=+gradle+build+The+wrapper+should+be+in+the+repository>
for the related email chain.

On Tue, Nov 17, 2015 at 10:27 AM, hsy...@gmail.com <hsy...@gmail.com> wrote:

> I got main class not found error. So I installed gradle 2.5 and run gradle
> build (not the wrapper)
>
> On Mon, Nov 16, 2015 at 10:17 PM, Guozhang Wang <wangg...@gmail.com>
> wrote:
>
> > Did you just use "./gradlew build" in root directory?
> >
> > Guozhang
> >
> > On Mon, Nov 16, 2015 at 6:41 PM, hsy...@gmail.com <hsy...@gmail.com>
> > wrote:
> >
> > > The actual thing I want to do is I want to build and install in my
> local
> > > maven repository so I can include new api in my dependencies. When the
> > > release is officially out, I can have both my code ready with the
> > official
> > > maven dependency
> > >
> > > Thanks,
> > > Siyuan
> > >
> > > On Monday, November 16, 2015, Grant Henke <ghe...@cloudera.com> wrote:
> > >
> > > > Hi Siyuan,
> > > >
> > > > My guess is that you are trying to build from a subdirectory. I have
> a
> > > > minor patch available to fix this that has not been pulled in yet
> here:
> > > > https://github.com/apache/kafka/pull/509
> > > >
> > > > In the mean time, if you need to build a subproject you can execute a
> > > > command like the following:
> > > > gradle clients:build
> > > >
> > > > Thanks,
> > > > Grant
> > > >
> > > > On Mon, Nov 16, 2015 at 6:33 PM, Guozhang Wang <wangg...@gmail.com
> > > > <javascript:;>> wrote:
> > > >
> > > > > Siyuan,
> > > > >
> > > > > Which command did you use to build?
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, Nov 16, 2015 at 4:01 PM, hsy...@gmail.com <javascript:;> <
> > > > hsy...@gmail.com <javascript:;>>
> > > > > wrote:
> > > > >
> > > > > > I got a build error on both trunk and 0.9.0 branch
> > > > > >
> > > > > > > docs/producer_config.html (No such file or directory)
> > > > > >
> > > > > > Do I miss anything before build
> > > > > >
> > > > > > Thanks,
> > > > > > Siyuan
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Grant Henke
> > > > Software Engineer | Cloudera
> > > > gr...@cloudera.com <javascript:;> | twitter.com/gchenke |
> > > > linkedin.com/in/granthenke
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: issue with javaapi consumer

2015-11-17 Thread Grant Henke
Are you sure you mean 0.8.2.3? The latest version available is 0.8.2.2, and
it is completely compatible with 0.8.2.1.
http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.kafka%22%20a%3A%22kafka_2.10%22

If you still have an issue, what error are you seeing?

Thanks,
Grant

On Tue, Nov 17, 2015 at 5:47 AM, Kudumula, Surender <
surender.kudum...@hpe.com> wrote:

> Hi all
> Iam currently using kafka 0.8.2.1 consumer which has kafka java api
> consumer but I had to upgrade to kafka 0.8.2.3 and when I add the jars for
> kafka 0.8.2.3 my consumer code doesn't compile.
> consumerConnector = ConsumerConfig
>
>  .createJavaConsumerConnector(consumerConfig);
>
> Do I have to rewrite the whole consumer to use the new jar. Whats the best
> way forward please?
>
> Regards
>
> Surender Kudumula
>
>
>


-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: setting up kafka github

2015-11-16 Thread Grant Henke
Hi Mojhaha,

You will not have write access to the actual Apache Kafka repo. Everyone
contributes via their own fork, asking for the changes to be pulled into the
Apache Kafka repo via a pull request. The guide linked earlier is a great
resource for the Github process.
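
A rough sketch of the usual setup (remote and branch names are just examples):

  git clone https://github.com/<your-username>/kafka.git
  cd kafka
  git remote add upstream https://github.com/apache/kafka.git
  git fetch upstream
  git checkout -b my-fix upstream/trunk
  # ...make changes, commit...
  git push origin my-fix
  # then open a pull request from your fork's my-fix branch against apache/kafka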

Thanks,
Grant

On Sat, Nov 14, 2015 at 5:50 AM, mojhaha kiklasds <sesca.syst...@gmail.com>
wrote:

> Hello,
>
> In this approach, I think setting it up according to the second method that
> I described in my earlier email should work.
> But, is this method what other contributors are also using ?
> Or are they using the first methods that I described?
>
> Thanks,
> Mojhaha
>
> On Sat, Nov 14, 2015 at 4:45 PM, jeanbaptiste lespiau <
> jeanbaptiste.lesp...@gmail.com> wrote:
>
> > Hello,
> >
> > I'm new to kafka too, but I think this page can help you :
> > https://help.github.com/articles/using-pull-requests/
> >
> > It describes exactly the process to follow.
> >
> > Regards.
> >
> > 2015-11-14 11:49 GMT+01:00 mojhaha kiklasds <sesca.syst...@gmail.com>:
> >
> > > Hello,
> > >
> > > I'm new to github usage but I want to contribute to kafka. I am trying
> to
> > > setup my github repo based on the instructions mentioned here:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes#ContributingCodeChanges-PullRequest
> > >
> > > I have one doubt though. Which repo shall I configure as the remote -
> > > apache-kafka or my fork ?
> > >
> > > If I configure apache-kafka as remote, will I be able to submit pull
> > > requests?
> > >
> > > If I sync my committed changes to my fork (hosted on github), will I
> > issue
> > > pull requests from this fork to apache-kafka ?
> > >
> > > Thanks,
> > > Mojhaha
> > >
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: build error

2015-11-16 Thread Grant Henke
Hi Siyuan,

My guess is that you are trying to build from a subdirectory. I have a
minor patch available to fix this that has not been pulled in yet here:
https://github.com/apache/kafka/pull/509

In the mean time, if you need to build a subproject you can execute a
command like the following:
gradle clients:build

Thanks,
Grant

On Mon, Nov 16, 2015 at 6:33 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Siyuan,
>
> Which command did you use to build?
>
> Guozhang
>
> On Mon, Nov 16, 2015 at 4:01 PM, hsy...@gmail.com <hsy...@gmail.com>
> wrote:
>
> > I got a build error on both trunk and 0.9.0 branch
> >
> > > docs/producer_config.html (No such file or directory)
> >
> > Do I miss anything before build
> >
> > Thanks,
> > Siyuan
> >
>
>
>
> --
> -- Guozhang
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Question on 0.9 new consumer API

2015-11-12 Thread Grant Henke
The new consumer (0.9.0) will not be compatible with older brokers (0.8.2).
In general you should upgrade brokers before upgrading clients. The old
clients (0.8.2) will work on the new brokers (0.9.0).

Thanks,
Grant

On Thu, Nov 12, 2015 at 7:52 AM, Han JU <ju.han.fe...@gmail.com> wrote:

> Hello,
>
> Just want to know if the new consumer API coming with 0.9 will be
> compatible with 0.8 broker servers? We're looking at the new consumer
> because the new rebalancing listener is very interesting for one of our use
> case.
>
> Another question is that if we have to upgrade our brokers to 0.9, will
> they accept producers in 0.8.2?
>
> Thanks!
>
> --
> *JU Han*
>
> Software Engineer @ Teads.tv
>
> +33 061960
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Please add skyscanner to Powered by page please

2015-11-05 Thread Grant Henke
Hi Scott,

Added it here https://cwiki.apache.org/confluence/display/KAFKA/Powered+By.

Thank you,
Grant

On Thu, Nov 5, 2015 at 8:52 AM, Scott Krueger <scott.krue...@skyscanner.net>
wrote:

> Dear Kafka users,
>
>
> Could someone kindly add the following to the Kafka "Powered by" page
> please:?
>
>
> [skyscanner](http://www.skyscanner.net/) | [skyscanner](
> http://www.skyscanner.net/), the world's travel search engine, uses Kafka
> for real-time log and event ingestion. It is the integration point for of
> all stream-processing and data transportation services.
>
>
> Many thanks,
>
>
> Scott Krueger
>
>
> Data Architect
>
> skyscanner
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Kafka 090 maven coordinate

2015-11-03 Thread Grant Henke
Hi Fajar,

Please see my response to a similar email here:
http://search-hadoop.com/m/uyzND1ifZt65CCBS
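
The short version: once 0.9.0.0 is released, the new KafkaConsumer will ship
in the Java clients artifact, i.e. something like:

  org.apache.kafka:kafka-clients:0.9.0.0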

If you still have questions, please do not hesitate to ask.

Thank you,
Grant

On Tue, Nov 3, 2015 at 5:07 PM, Fajar Maulana Firdaus <faja...@gmail.com>
wrote:

> I see, thank you for your explanation, will the client of 0.9.0.0 be
> backward compatible with 0.8.2.2 kafka?
>
> On Wed, Nov 4, 2015 at 2:52 AM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
> > 0.9.0.0 is not released yet, but the last blockers are being addressed
> and
> > release candidates should follow soon. The docs there are just staged as
> we
> > prepare for the release (note, e.g., that the latest release on the
> > downloads page http://kafka.apache.org/downloads.html is still 0.8.2.2).
> >
> > -Ewen
> >
> > On Tue, Nov 3, 2015 at 10:57 AM, Fajar Maulana Firdaus <
> faja...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I saw that there is new kafka client 0.9.0 in here:
> >> http://kafka.apache.org/090/javadoc/index.html So what is the maven
> >> coordinate for this version? I am asking this because it has
> >> KafkaConsumer api which doesn't exist in 0.8.2
> >>
> >> Thank you
> >>
> >
> >
> >
> > --
> > Thanks,
> > Ewen
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: One more Kafka Meetup hosted by LinkedIn in 2015 (this time in San Francisco) - does anyone want to talk?

2015-11-03 Thread Grant Henke
Is there a place where we can find all previously streamed/recorded meetups?

Thank you,
Grant

On Tue, Nov 3, 2015 at 2:07 PM, Ed Yakabosky <eyakabo...@linkedin.com>
wrote:

> I'm sorry to hear that Lukas.  I have heard that people are starting to do
> carpools via rydeful.com for some of these meetups.
>
> Additionally, we will live stream and record the presentations, so you can
> participate remotely.
>
> Ed
>
> On Tue, Nov 3, 2015 at 10:43 AM, Lukas Steiblys <lu...@doubledutch.me>
> wrote:
>
> > This is sad news. I was looking forward to finally going to a Kafka or
> > Samza meetup. Going to Mountain View for a meetup is just unrealistic
> with
> > 2h travel time each way.
> >
> > Lukas
> >
> > -Original Message- From: Ed Yakabosky
> > Sent: Tuesday, November 3, 2015 10:36 AM
> > To: users@kafka.apache.org ; d...@kafka.apache.org ; Clark Haskins
> > Subject: Re: One more Kafka Meetup hosted by LinkedIn in 2015 (this time
> > in San Francisco) - does anyone want to talk?
> >
> > Hi all,
> >
> > Two corrections to the invite:
> >
> >   1. The invitation is for November 18, 2015.  *NOT 2016.*  I was a
> little
> >   hasty...
> >   2. LinkedIn has finished remodeling our broadcast room, so we are going
> >
> >   to host the meet up in Mountain View, not San Francisco.
> >
> > We've arranged for speakers from HortonWorks to talk about Security and
> > LinkedIn to talk about Quotas.  We are still looking for one more
> speaker,
> > so please let me know if you are interested.
> >
> > Thanks!
> > Ed
> >
> >
> >
> >
> >
> >
> >
> > On Fri, Oct 30, 2015 at 12:49 PM, Ed Yakabosky <eyakabo...@linkedin.com>
> > wrote:
> >
> > Hi all,
> >>
> >> LinkedIn is hoping to host one more Apache Kafka meetup this year on
> >> November 18 in our San Francisco office.  We're working on building the
> >> agenda now.  Does anyone want to talk?  Please send me (and Clark) a
> >> private email with a short description of what you would be talking
> about
> >> if interested.
> >>
> >> --
> >> Thanks,
> >>
> >> Ed Yakabosky
> >> ​Technical Program Management @ LinkedIn>
> >>
> >>
> >
> > --
> > Thanks,
> > Ed Yakabosky
> >
>
>
>
> --
> Thanks,
> Ed Yakabosky
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Release 0.9.0

2015-11-03 Thread Grant Henke
Hi Mohit,

http://search-hadoop.com/ is great for searching the mailing lists for
information like this.

A quick search for "0.9.0 release" (link:
http://search-hadoop.com/?q=0.9.0+release=144408960=144668160_project=Kafka)
shows some great threads on the current status.

The first thread is current and shows our plans:
http://search-hadoop.com/m/uyzND10Vvg61DdgA32/0.9.0+release=0+9+0+release+branch

If you have any follow up question, please don't hesitate to ask.

Thank you,
Grant

On Tue, Nov 3, 2015 at 5:36 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:

> Is there a tentative release date for Kafka 0.9.0?
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: kafka_2.9.2-0.8.2.0 - Topic marked for deletion

2015-11-03 Thread Grant Henke
Hi Avinash,

This is working as expected. Since topic deletion is not enabled, topics get
marked for deletion but are never actually deleted. Once you enable topic
deletion (delete.topic.enable=true), the marked topics will be deleted.

While a topic is only marked for deletion, its messages will still be
available; it is when the topic is actually deleted that messages can no
longer be consumed. However, in order for a topic to be safely deleted, all
producers and consumers of that topic should be stopped first. Not stopping
the clients can cause odd behavior, especially when auto topic creation is
enabled.
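
As a quick sketch (topic name taken from your list):

  # server.properties on each broker, then restart
  delete.topic.enable=true

  # pending deletes are tracked under /admin/delete_topics in ZooKeeper and
  # will be processed once deletion is enabled; new deletes then work directly:
  bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic aApp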

Thank you,
Grant

On Tue, Nov 3, 2015 at 3:15 PM, Kumar, Avinash <avinash.ku...@capitalone.com
> wrote:

> Hi,
>
> I am trying to delete the kafka topic, verison I am using is 2.9.2-0.8.2.
> I am using dockerized kafka to delete the topic already created and didn’t
> set the “delete.topic.enable=true”.  When I am listing the topics, it is
> giving as “ marked for deletion”.
>
> Is this a bug? However when I am setting “delete.topic.enable=true” as
> true, it gets deleted immediately. Apart from this I have one more
> question, once the topic is marked for deletion; will messages be still
> available for consumer?
>
> I do see regular updated in controller update with
>
> [2015-11-03 21:05:57,753] TRACE [Controller 1]: checking need to trigger
> partition rebalance (kafka.controller.KafkaController)
>
> [2015-11-03 21:05:57,754] DEBUG [Controller 1]: preferred replicas by
> broker Map(1 -> Map([abcdefghApp,0] -> List(1), [abcdefgApp,0] -> List(1),
> [abcdefApp,0] -> List(1), [adbceApp,0] -> List(1), [abcdApp,0] -> List(1),
> [abcApp,0] -> List(1), [abApp,0] -> List(1), [aApp,0] -> List(1)))
> (kafka.controller.KafkaController)
>
> [2015-11-03 21:05:57,754] DEBUG [Controller 1]: topics not in preferred
> replica Map() (kafka.controller.KafkaController)
>
> [2015-11-03 21:05:57,754] TRACE [Controller 1]: leader imbalance ratio for
> broker 1 is 0.00 (kafka.controller.KafkaController)
>
> Out of these aApp,abApp and abcApp topics has been marked for deletion.
>
> Thanks,
> Avinash
> 
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Topic per entity

2015-11-02 Thread Grant Henke
Hi Alex & Andrew,

There was a discussion with some pointers on this mailing list a bit ago
titled "mapping events to topics". I suggest taking a look at that thread:
http://search-hadoop.com/m/uyzND1vJsUuYtGD91/mapping+events+to+topics=mapping+events+to+topics

If you still have questions, don't hesitate to ask.

Thanks,
Grant



On Sat, Oct 31, 2015 at 3:19 AM, Andrew Stevenson <asteven...@outlook.com>
wrote:

> I too would be interested in any responses to this question.
>
> I'm using kafka for event notification and once secure will put real
> payload in it and take advantage of the durable commit log. I want to let
> users describe a DAG in orientdb and have the Kafka Client processor load
> and execute it. Each processor would then attach it's lineage and
> provenance back to the orientdbs graph store.
>
> This way I can let users replay stress scenarios, calculate VaR etc with
> one source of replayable truth. Compliance and regulatory authorities like
> this.
>
> Regards
>
> Andrew
> 
> From: Alex Buchanan<mailto:bucha...@gmail.com>
> Sent: ‎31/‎10/‎2015 05:30
> To: users@kafka.apache.org<mailto:users@kafka.apache.org>
> Subject: Topic per entity
>
> Hey Kafka community.
>
> I'm researching possible architecture for a distributed data processing
> system. In this system, there's a close relationship between a specific
> dataset and the processing code. The user might upload a few datasets and
> write code to run analysis on that data. In other words, frequently the
> analysis code pulls data from a specific entity.
>
> Kafka is attractive for lots of reasons:
> - I'll need messaging anyway
> - I want a model for immutability of data (mutable state and potential job
> failure don't mix)
> - cross-language clients
> - the change stream concept could have some nice uses (such as updating
> visualizations without rebuilding)
> - Samza's model of state management is a simple way to think of external
> data without worrying too-much about network-based RPC
> - as a source of truth data store, it's really simple. No mutability,
> complex queries, etc. Just a log. To me, that helps prevent abuse and
> mistakes.
> - it fits well with the concept of pipes, frequently found in data analysis
>
> But most of the Kafka examples are about processing a large stream of a
> specific _type_, not so much about processing specific entities. And I
> understand there are limits to topics (file/node limits on the filesystem
> and in zookeeper) and it's discouraged to model topics based on
> characteristics of data. In this system, it feels more natural to have a
> topic per entity so the processing code can connect directly to the data it
> wants.
>
> So I need a little guidance from smart people. Am I lost in the rabbit
> hole? Maybe I'm trying to force Kafka into this territory it's not suited
> for. Have I been reading too many (awesome) articles about the role of the
> log and streaming in distributed computing? Or am I on the right track and
> I just need to put in some work to jump the hurdles (such as topic storage
> and coordination)?
>
> It sounds like Cassandra might be another good option, but I don't know
> much about it yet.
>
> Thanks guys!
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Kafka console producer stuck

2015-10-28 Thread Grant Henke
Hi Sathya,

This was a bug that is now fixed in trunk and will be included in the next
release. See KAFKA-1711 <https://issues.apache.org/jira/browse/KAFKA-1711> for
more details.

You can safely ignore the warning and it should not impact the usage of the
console producer.

Thank you,
Grant

On Wed, Oct 28, 2015 at 5:49 AM, Sathyakumar Seshachalam <
sathyakumar_seshacha...@trimble.com> wrote:

> Hi, Its the same as in the Quick Start tutorial
>
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>
>
> On 10/28/15, 4:06 PM, "Prabhjot Bharaj" <prabhbha...@gmail.com> wrote:
>
> >Hi,
> >
> >It seems (from VerifiableProperties.scala -
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils
> >/VerifiableProperties.scala#L224)
> >that you are providing some property which kafka does not recognise
> >Please share the full command that you are trying to use
> >
> >Regards,
> >Prabhjot
> >
> >On Wed, Oct 28, 2015 at 2:25 PM, Sathyakumar Seshachalam <
> >sathyakumar_seshacha...@trimble.com> wrote:
> >
> >> Am trying to get started on Kafka and following instructions on
> >> http://kafka.apache.org/documentation.html#quickstart.
> >> Am setting up on a single broker. I got as far as creating topics and
> >> listing them, but when I try kafka-console-producer.sh  to add
> >>messages, I
> >> ended up in below error.
> >>
> >> [2015-10-28 14:14:46,140] WARN Property topic is not valid
> >> (kafka.utils.VerifiableProperties).
> >>
> >> My google searches generally hinted that I should set the right value
> >> for "advertised.host.name", But setting that has no effect.
> >> Any help overcoming this will be appreciated.
> >>
> >> Am running version kafka_2.11-0.8.2.1.tgz on OS X.
> >>
> >> Regards,
> >> Sathya
> >>
> >>
> >
> >
> >--
> >-
> >"There are only 10 types of people in the world: Those who understand
> >binary, and those who don't"
>
>


-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Kafka console producer stuck

2015-10-28 Thread Grant Henke
The docs don't necessarily make it clear, but if you don't pipe data into
the console-producer, it waits for your typed input (stdin). This is shown
in the docs with the lines just below "*bin/kafka-console-producer.sh
--broker-list localhost:9092 --topic test*" that show:

> This is a message
> This is another message


Otherwise, a quick command I like to use when playing around with the
console-producer is:

*vmstat -w -n -t 1 | kafka-console-producer --broker-list
my-broker-host:9092 --topic my-topic*

vmstat <http://linux.die.net/man/8/vmstat> and other similar linux
utilities (iostat <http://linux.die.net/man/1/iostat>) are a great way to
get quick and real data for experimentation.

Thanks,
Grant

On Wed, Oct 28, 2015 at 10:49 AM, Sathyakumar Seshachalam <
sathyakumar_seshacha...@trimble.com> wrote:

> Hi Grant,
>
> But then my console producer is not sending any message to the broker. It
> just prints that warning and keeps waiting.
>
> I have verified that console producer is "not" sending messages by
> actually running console consumer. And a Java client that sends messages
> works  and the console consumer is able to read them fine. It is just the
> console producer that doesn't seem to work (or at least as per the quick
> start link)
>
>
>
> Sent from my iPhone
>
> > On 28-Oct-2015, at 7:34 PM, Grant Henke <ghe...@cloudera.com> wrote:
> >
> > Hi Sathya,
> >
> > This was a bug that is now fixed in trunk and will be included in the
> next
> > release. See KAFKA-1711 <
> https://issues.apache.org/jira/browse/KAFKA-1711> for
> > more details.
> >
> > You can safely ignore the warning and it should not impact the usage of
> the
> > console producer.
> >
> > Thank you,
> > Grant
> >
> > On Wed, Oct 28, 2015 at 5:49 AM, Sathyakumar Seshachalam <
> > sathyakumar_seshacha...@trimble.com> wrote:
> >
> >> Hi, Its the same as in the Quick Start tutorial
> >>
> >> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
> >>
> >>
> >>> On 10/28/15, 4:06 PM, "Prabhjot Bharaj" <prabhbha...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> It seems (from VerifiableProperties.scala -
> >>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils
> >>> /VerifiableProperties.scala#L224)
> >>> that you are providing some property which kafka does not recognise
> >>> Please share the full command that you are trying to use
> >>>
> >>> Regards,
> >>> Prabhjot
> >>>
> >>> On Wed, Oct 28, 2015 at 2:25 PM, Sathyakumar Seshachalam <
> >>> sathyakumar_seshacha...@trimble.com> wrote:
> >>>
> >>>> Am trying to get started on Kafka and following instructions on
> >>>> http://kafka.apache.org/documentation.html#quickstart.
> >>>> Am setting up on a single broker. I got as far as creating topics and
> >>>> listing them, but when I try kafka-console-producer.sh  to add
> >>>> messages, I
> >>>> ended up in below error.
> >>>>
> >>>> [2015-10-28 14:14:46,140] WARN Property topic is not valid
> >>>> (kafka.utils.VerifiableProperties).
> >>>>
> >>>> My google searches generally hinted that I should set the right value
> >>>> for "advertised.host.name", But setting that has no effect.
> >>>> Any help overcoming this will be appreciated.
> >>>>
> >>>> Am running version kafka_2.11-0.8.2.1.tgz on OS X.
> >>>>
> >>>> Regards,
> >>>> Sathya
> >>>
> >>>
> >>> --
> >>> -
> >>> "There are only 10 types of people in the world: Those who understand
> >>> binary, and those who don't"
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: How Kafka Distribute partitions across the cluster

2015-10-27 Thread Grant Henke
Hi Nitin,

Partitions and replicas are assigned in a round robin fashion. You can read
more about it in the docs under "4.7 Replication
<http://kafka.apache.org/documentation.html#replication>" look for the
header for "Replica Management".

You can view the implementation here:
https://github.com/apache/kafka/blob/21443f214fc6f1f51037e27f8ece155cf1eb288c/core/src/main/scala/kafka/admin/AdminUtils.scala#L48-L65

I copy-pasted the relevant javadoc below for convenience:

/**
>* There are 2 goals of replica assignment:
>* 1. Spread the replicas evenly among brokers.
>* 2. For partitions assigned to a particular broker, their other
> replicas are spread over the other brokers.
>*
>* To achieve this goal, we:
>* 1. Assign the first replica of each partition by round-robin,
> starting from a random position in the broker list.
>* 2. Assign the remaining replicas of each partition with an increasing
> shift.
>*
>* Here is an example of assigning (10 partitions, replication = 3)
>* broker-0  broker-1  broker-2  broker-3  broker-4
>* p0p1p2p3p4   (1st replica)
>* p5p6p7p8p9   (1st replica)
>* p4p0p1p2p3   (2nd replica)
>* p8p9p5p6p7   (2nd replica)
>* p3p4p0p1p2   (3nd replica)
>* p7p8p9p5p6   (3nd replica)
>*/


Note that this is just the default implementation, you can manually assign
them however you like using the command line tools. There are also some
Jiras in progress to implement other options as well(KAFKA-2435
<https://issues.apache.org/jira/browse/KAFKA-2435>, KAFKA-1215
<https://issues.apache.org/jira/browse/KAFKA-1215>, KAFKA-1736
<https://issues.apache.org/jira/browse/KAFKA-1736>).
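As a rough illustration of the manual route (a sketch only; broker ids 0-2 and a
local zookeeper are assumed), kafka-topics.sh accepts an explicit replica list per
partition at creation time:

bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic my-topic \
  --replica-assignment 0:1,1:2,2:0

Each comma-separated entry describes one partition, with the colon-separated broker
ids being its replicas (the first id is the preferred leader), so the command above
creates 3 partitions with a replication factor of 2.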

Thanks,
Grant

On Tue, Oct 27, 2015 at 11:58 AM, nitin sharma <kumarsharma.ni...@gmail.com>
wrote:

> Hi All,
>
> I would like to know what kind of strategy Kafka adopts in splitting
> partitions of a topic to across the cluster.
>
> Ex: if i have 10 broker in a cluster. 1 topic with 5 partition and each
> with replication-factor of 2. then how would Kafka will assign the
> partitions ( 5*2 =10) across the cluster nodes?
>
>
> Regards,
> Nitin Kumar Sharma.
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Synchronous producer performance is very low

2015-10-19 Thread Grant Henke
The bottom line is synchronous will never be as fast as asynchronous. When
using asynchronous mode you get the benefits of batching, compression,
reduced network roundtrips, etc. You do lose the ability to achieve
absolute ordering, but do you really need that? Can you design your system
not to?

When you say the asynchronous producer took 1 second, do you mean
to receive the data on the other end? Or just to make all the send calls?
My guess is that it took 1 second because the call to send
returns immediately, but that does not necessarily mean the data was sent
yet.

More can be read about the new producer (make sure you are using the new
producer) in the javadocs here
<http://kafka.apache.org/082/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html>
.
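To make the distinction concrete, here is a minimal sketch against the new producer
API (kafka-clients 0.8.2; the broker, topic, and serializer choices are made up for
illustration). Blocking on the returned Future gives per-message synchronous behavior,
while passing a callback keeps sends asynchronous:

import java.util.Properties;
import org.apache.kafka.clients.producer.*;

public class SendModesSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "my-broker-host:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        ProducerRecord<String, String> record = new ProducerRecord<String, String>("my-topic", "hello");

        // "Synchronous": block until the broker has acknowledged this record
        producer.send(record).get();

        // Asynchronous: send() returns immediately; the callback runs once the request completes
        producer.send(record, new Callback() {
            public void onCompletion(RecordMetadata metadata, Exception e) {
                if (e != null) e.printStackTrace();
            }
        });
        producer.close();
    }
}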

Thank you,
Grant


Side note:
There have been quite a few posts (3) in both the dev and user list about
this. In a very similar format. If you are all on the same team, please
keep your questions to one post. Even if you are different teams, joining
in on an existing post instead of creating a new one is encouraged.

On Mon, Oct 19, 2015 at 10:15 AM, shanthu Venkatesan <
shanthuvenka...@gmail.com> wrote:

> Hi all,
>
> We tested both Asynchronous and Synchronous producers by producing 10k
> messages.
>
> Asynchronous producer took 1 sec to produce whereas synchronous took 50
> secs.
>
> Reliability is our primary concern hence we have planned to use
> synchronous producer.
>
> Please can any one of you suggest a way to improve the performance of
> sync producer.
>
> Thanks,
> Santhakumari
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Spamming kafka mailing list

2015-10-19 Thread Grant Henke
Hi Kiran,

Your enthusiasm for Kafka has been noticed. Here are a few resources you
can use to help answer your questions and improve your Kafka knowledge:

   - Kafka Documentation <http://kafka.apache.org/documentation.html>
   - Kafka Development Wiki
   
<https://cwiki.apache.org/confluence/display/KAFKA/Index;jsessionid=E35AEBFF17C065EE2D05D6E77BFE2855>
  - Note: Some of this may be out of date so be sure it applies to you
   - Search old user and development emails <http://search-hadoop.com/kafka>
  - It's a best practice to do this before posting a new message
   - Stackoverflow (Apache Kafka)
   
<http://stackoverflow.com/questions/tagged/apache-kafka?sort=votes=15>
   - Quora (Apache Kafka) <https://www.quora.com/Apache-Kafka>
   - Slideshare (Kafka)
   <http://www.slideshare.net/search/slideshow?searchfrom=header=kafka>

If you can't find what you need, or find something especially challenging,
don't hesitate to ask or suggest improvements.

Thank you,
Grant



On Mon, Oct 19, 2015 at 7:27 AM, jhon davis <jhon.dav.s...@gmail.com> wrote:

> Kiran,
> Please stop spamming the mailing list.
> There is enough Kafka documentation for basic queries.
> Please spend time finding the answers before sending queries to Kafka
> mailing list. In case you still not find answers Kafka community is there
> to help.
> Genuine queries do not get proper attention by all this clutter!
>
> Best,
> Jhon
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Facing Issue to create asyn producer in kafka 0.8.2

2015-10-14 Thread Grant Henke
Looks like you may be mixing the new producer with old producer configs.
See the new config documentation here:
http://kafka.apache.org/documentation.html#newproducerconfigs. You will
likely want to set the "batch.size" and "linger.ms" to achieve your goal.
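A minimal sketch of those two settings with the new producer (values are made up;
with ~77K messages the batch.size needs to be comfortably larger than one message
for any batching to happen at all):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
props.put("batch.size", "262144");  // max bytes batched per partition before a send is triggered
props.put("linger.ms", "5000");     // wait up to 5 seconds for a batch to fill before sending
Producer<String, byte[]> producer = new KafkaProducer<String, byte[]>(props);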

Thanks,
Grant

On Wed, Oct 14, 2015 at 1:29 PM, prateek arora <prateek.arora...@gmail.com>
wrote:

> Hi
>
> Thanks for help .
>
> but same behavior even after changing batch.size
>
> I have changes  batch.size value to 33554432.
>  props.put("batch.size","33554432");
>
>
>
> On Wed, Oct 14, 2015 at 11:09 AM, Zakee <kzak...@netzero.net> wrote:
>
> > Hi Prateek,
> >
> > Looks like you are using default batch.size which is ~16K and it forces
> > the send of messages immediately as your single message is larger than
> > that. Try using larger batch.size.
> >
> > Thanks
> > Zakee
> >
> >
> >
> > > On Oct 14, 2015, at 10:29 AM, prateek arora <
> prateek.arora...@gmail.com>
> > wrote:
> > >
> > > Hi
> > >
> > > I want to create async producer so i can buffer messages in queue and
> > send
> > > after every 5 sec .
> > >
> > > my kafka version is 0.8.2.0.
> > >
> > > and i am using  kafka-clients 0.8.2.0 to create kafka producer in java.
> > >
> > >
> > > below is my sample code :
> > >
> > > package com.intel.labs.ive.cloud.testKafkaProducerJ;
> > >
> > > import java.nio.charset.Charset;
> > > import java.util.HashMap;
> > >
> > > import java.util.Map;
> > >
> > > import org.apache.kafka.clients.producer.KafkaProducer;
> > > import org.apache.kafka.clients.producer.Producer;
> > > import org.apache.kafka.clients.producer.ProducerConfig;
> > > import org.apache.kafka.clients.producer.ProducerRecord;
> > > import org.apache.kafka.common.Metric;
> > > import org.apache.kafka.common.MetricName;
> > > import org.apache.kafka.common.serialization.Serializer;
> > > import org.apache.kafka.common.serialization.StringSerializer;
> > > import org.apache.kafka.common.serialization.ByteArraySerializer;
> > >
> > > import java.nio.file.DirectoryStream;
> > > import java.nio.file.Files;
> > > import java.nio.file.Path;
> > > import java.nio.file.Paths;
> > >
> > > public class TestKafkaProducer {
> > >
> > > Map<String, Object> props = new HashMap<String, Object>();
> > >props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
> > metadataBroker);
> > >props.put("producer.type", "async");
> > >props.put("queue.buffering.max.ms", "5000");
> > >
> > > Serializer keySerializer = new StringSerializer();
> > >Serializer<byte[]> valueSerializer = new ByteArraySerializer();
> > >
> > >producer = new KafkaProducer<String, byte[]>(props,
> keySerializer,
> > > valueSerializer);
> > >
> > > ProducerRecord<String, byte[]> imageRecord;
> > >
> > > while ( true ) {
> > > imageRecord = new ProducerRecord<String, byte[]>(topicName,
> > > recordKey,imageBytes);
> > >
> > >producer.send(imageRecord);
> > > }
> > > }
> > >
> > > size of my message is around 77K
> > >
> > > but its work like a synchronous producer , send every message to broker
> > .
> > > not buffering a message in to queue and send after 5 sec
> > >
> > >
> > > please help to find out a solution.
> > >
> > >
> > > Regards
> > > Prateek
> >
> > 
> > A Balance Transfer Card With An Outrageously Long Intro Rate And No
> > Balance Transfer Fees That Can Save You Thousands
> > http://thirdpartyoffers.netzero.net/TGL3231/561e9a75a77071a74763fst04vuc
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Kafka Offset remains at -1

2015-10-13 Thread Grant Henke
Are you sure your producer and consumer are pointed to the same topic? I
see RECUR_TRANS in your logs but you mentioned creating RECURRING_TRANS.

If that's not the issue, I have a few questions to get more information.

   - Can you list & describe the relevant topics using the kafka-topics
   script
   - Do you see any errors in the Kafka broker logs?
   - What are you using to produce to Kafka?

Thanks,
Grant

On Tue, Oct 13, 2015 at 3:03 AM, Gautam Anand  wrote:

> Dear Experts,
>
> I have a producer that pushes the Messages to kafka and a producer which is
> a spark algorithm to pick up the messages using a zookeeper instance.
>
> Below is the Spark log: Can anyone tell me what am I doing wrong?
>
> Kafka Topic details:
>
> *The command I used to create the Topic:*
>
> ./kafka-topics --create --zookeeper 192.168.229.113:2181
> --replication-factor 3 --partitions 3 --topic RECURRING_TRANS
>
> *Log:*
>
> 15/10/12 23:27:49 INFO consumer.ZookeeperConsumerConnector:
> [peAlgoConsumer_RECUR_TRANS_spark-master-01-1444717669491-f19c60c1],
> Connecting to zookeeper instance at 192.168.229.113:2181
> 15/10/12 23:27:49 INFO zkclient.ZkEventThread: Starting ZkClient event
> thread.
> 15/10/12 23:27:49 INFO zookeeper.ZooKeeper: Client
> environment:zookeeper.version=3.4.5-cdh5.4.7--1, built on 09/17/2015
> 09:14 GMT
> 15/10/12 23:27:49 INFO zookeeper.ZooKeeper: Client
> environment:host.name=spark-master-01.yodlee.com
> 15/10/12 23:27:49 INFO zookeeper.ZooKeeper: Client
> environment:java.version=1.7.0_67
> 15/10/12 23:27:49 INFO zookeeper.ZooKeeper: Client
> environment:java.vendor=Oracle Corporation
> 15/10/12 23:27:49 INFO zookeeper.ZooKeeper: Client
> environment:java.home=/usr/java/jdk1.7.0_67-cloudera/jre
> 15/10/12 23:27:49 INFO zookeeper.ZooKeeper: Client
>
> 

Re: Offset rollover/overflow?

2015-10-05 Thread Grant Henke
I can't be sure of how every client will handle it, it is probably not
likely, and there could potentially be unforeseen issues.

That said, given that offsets are stored in a (signed) Long, I would
suspect that it would roll over to negative values and increment from there.
That means instead of 9,223,372,036,854,775,807 potential offset values,
you actually have 18,446,744,073,709,551,614 potential values. To put that
into perspective, if we assign 1 byte to each offset that's just over 18
Exabytes.
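A quick illustration of the wraparound being described (plain two's-complement
arithmetic in Java, nothing Kafka-specific):

long offset = Long.MAX_VALUE;     // 9223372036854775807
System.out.println(offset + 1);   // prints -9223372036854775808, the most negative long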

You will likely run into many more issues other than offset rollover,
before you are able to retain 18 Exabytes in single Kafka topic. (And if
not, I would evaluate breaking up your topic into multiple smaller ones).

Thanks,
Grant


On Sat, Oct 3, 2015 at 8:58 PM, Li Tao <ahumbleco...@gmail.com> wrote:

> It will never happan.
>
> On Thu, Oct 1, 2015 at 4:22 AM, Chad Lung <chad.l...@gmail.com> wrote:
>
> > I seen a previous question (http://search-hadoop.com/m/uyzND1lrGUW1PgKGG
> )
> > on offset rollovers but it doesn't look like it was ever answered.
> >
> > Does anyone one know what happens when an offset max limit is reached?
> > Overflow, or something else?
> >
> > Thanks,
> >
> > Chad
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: How to verify offsets topic exists?

2015-10-05 Thread Grant Henke
Hi Stevo,

There are a couple of options to verify the topic exists:

   1. Consume from a topic with "offsets.storage=kafka". If it's not created
   already, this should create it.
   2. List and describe the topic using the Kafka topics script. Ex:

bin/kafka-topics.sh --zookeeper localhost:2181 --list

bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic
__consumer_offsets


   3. Check the ZNode exists in Zookeeper. Ex:

bin/zookeeper-shell.sh localhost:2181
ls /brokers/topics/__consumer_offsets

get /brokers/topics/__consumer_offsets


Thanks,
Grant

On Mon, Oct 5, 2015 at 10:44 AM, Stevo Slavić <ssla...@gmail.com> wrote:

> Hello Apache Kafka community,
>
> In my integration tests, with single 0.8.2.2 broker, for newly created
> topic with single partition, after determining through topic metadata
> request that partition has lead broker assigned, when I try to reset offset
> for given consumer group, I first try to discover offset coordinator and
> that lookup is throwing ConsumerCoordinatorNotAvailableException
>
> On
>
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommit/FetchAPI
> it is documented that broker returns ConsumerCoordinatorNotAvailableCode
> for consumer metadata requests or offset commit requests if the offsets
> topic has not yet been created.
>
> I wonder if this is really the case, that the offsets topic has not been
> created. Any tips how to ensure/verify that offsets topic exists?
>
> Kind regards,
>
> Stevo Slavic.
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: are 0.8.2.1 and 0.9.0.0 compatible?

2015-10-01 Thread Grant Henke
Hi Richard,

You are correct that the version will now be 0.9.0 and anything referencing
0.8.3 is being changed. You are also correct that there have been wire
protocol changes that break compatibility. However, backwards compatibility
exists and you should always upgrade your brokers before upgrading your
clients in order to avoid issues (In the future KIP-35
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version> may change that).

It's also worth noting that if you are performing a rolling upgrade of your
brokers, you need to be sure brokers running the new protocol know to
communicate with the old version to remain compatible during the bounce.
This is done using the inter.broker.protocol.version property. More on that
topic can be read here:
https://kafka.apache.org/083/documentation.html#upgrade
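As a sketch of what that looks like in server.properties during a rolling upgrade from
0.8.2 (the exact value string should be taken from the upgrade notes for your target
version):

# step 1: run the new broker code but keep speaking the old inter-broker protocol
inter.broker.protocol.version=0.8.2.X
# step 2: once every broker is on the new code, bump (or remove) this setting and
# restart the brokers one at a time so they switch to the new protocol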

Hopefully that helps clear things up.

Thank you,
Grant





On Thu, Oct 1, 2015 at 12:21 PM, Richard Lee <rd...@tivo.com> wrote:

> Note the 0.8.3-SNAPSHOT has recently been renamed 0.9.0.0-SNAPSHOT.
>
> In any event, the major version number change could indicate that there
> has, in fact, been some sort of incompatible change.  Using 0.9.0.0, I'm
> also unable to use the kafka-console-consumer.sh to read from a 0.8.2.1
> broker, but it works fine with a 0.9.0.0 broker.
>
> Some validation from a kafka expert that broker forward compatibility (or
> client backward compatibility) is not supported would be appreciated, and
> that this isn't just a case of some sort of local, fixable misconfiguration.
>
> Thanks!
> Richard
>
> On 09/30/2015 11:17 AM, Doug Tomm wrote:
>
>> hello,
>>
>> i've got a set of broker nodes running 0.8.2.1.  on my laptop i'm also
>> running 0.8.2.1, and i have a single broker node and mirrormaker there.
>> i'm also using kafka-console-consumer.sh on the mac to display messages on
>> a favorite topic being published from the broker nodes.  there are no
>> messages on the topic, but everything is well-behaved.  i can inject
>> messages with kafkacat and everything is fine.
>>
>> but then!
>>
>> on the laptop i switched everything to 0.8.3 but left the broker nodes
>> alone.  now when i run mirrormaker i see this:
>>
>> [2015-09-30 10:44:55,090] WARN
>> [ConsumerFetcherThread-tivo_kafka_110339-mbpr.local-1443635093396-c55cbafb-0-5],
>> Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@61cb11c5.
>> Possible cause: java.nio.BufferUnderflowException
>> (kafka.consumer.ConsumerFetcherThread)
>> [2015-09-30 10:44:55,624] WARN
>> [ConsumerFetcherThread-tivo_kafka_110339-mbpr.local-1443635093396-c55cbafb-0-5],
>> Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@3c7bb986.
>> Possible cause: java.nio.BufferUnderflowException
>> (kafka.consumer.ConsumerFetcherThread)
>> [2015-09-30 10:44:56,181] WARN
>> [ConsumerFetcherThread-tivo_kafka_110339-mbpr.local-1443635093396-c55cbafb-0-5],
>> Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@1d4fbd2c.
>> Possible cause: java.nio.BufferUnderflowException
>> (kafka.consumer.ConsumerFetcherThread)
>> [2015-09-30 10:44:56,726] WARN
>> [ConsumerFetcherThread-tivo_kafka_110339-mbpr.local-1443635093396-c55cbafb-0-5],
>> Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@59e67b2f.
>> Possible cause: java.nio.BufferUnderflowException
>> (kafka.consumer.ConsumerFetcherThread)
>>
>> if i use kafkacat to generate a message on the topic i see
>> IllegalArgumentExceptions instead.
>>
>> this suggests that the two versions of kafka aren't compatible. is this
>> the case?  does the whole ecosystem need to be on the same version?
>>
>> thank you,
>> doug
>>
>>
>


-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: log.retention.hours not working?

2015-09-22 Thread Grant Henke
All of the information Todd posted is important to know. There was also
jira related to this that has been committed trunk:
https://issues.apache.org/jira/browse/KAFKA-2436

Before that patch, log.retention.hours was used to calculate
KafkaConfig.logRetentionTimeMillis. But it was not used in LogManager to
decide when to delete a log. LogManager was only using the log.retention.ms
in the broker configuration.

Could you try setting log.retention.ms=3600000 (1 hour) instead of using the hours
config?

On Mon, Sep 21, 2015 at 10:33 PM, Todd Palino <tpal...@gmail.com> wrote:

> Retention is going to be based on a combination of both the retention and
> segment size settings (as a side note, it's recommended to use
> log.retention.ms and log.segment.ms, not the hours config. That's there
> for
> legacy reasons, but the ms configs are more consistent). As messages are
> received by Kafka, they are written to the current open log segment for
> each partition. That segment is rotated when either the log.segment.bytes
> or the log.segment.ms limit is reached. Once that happens, the log segment
> is closed and a new one is opened. Only after a log segment is closed can
> it be deleted via the retention settings. Once the log segment is closed
> AND either all the messages in the segment are older than log.retention.ms
> OR the total partition size is greater than log.retention.bytes, then the
> log segment is purged.
>
> As a note, the default segment limit is 1 gibibyte. So if you've only
> written in 1k of messages, you have a long way to go before that segment
> gets rotated. This is why the retention is referred to as a minimum time.
> You can easily retain much more than you're expecting for slow topics.
>
> -Todd
>
>
> On Mon, Sep 21, 2015 at 7:28 PM, allen chan <allen.michael.c...@gmail.com>
> wrote:
>
> > I guess that kind of makes sense.
> > The following section in the config is what confused me:
> > *"# The following configurations control the disposal of log segments.
> The
> > policy can*
> > *# be set to delete segments after a period of time, or after a given
> size
> > has accumulated.*
> > *# A segment will be deleted whenever *either* of these criteria are met.
> > Deletion always happens*
> > *# from the end of the log."*
> >
> > That makes it sound like deletion will happen if either of the criteria
> is
> > met.
> > I thought the whole idea of those two settings (time and bytes) is
> telling
> > the application when it will need to delete.
> >
> >
> >
> > On Mon, Sep 21, 2015 at 7:10 PM, noah <iamn...@gmail.com> wrote:
> >
> > > "minimum age of a log file to be eligible for deletion" Key word is
> > > minimum. If you only have 1k logs, Kafka doesn't need to delete
> anything.
> > > Try to push more data through and when it needs to, it will start
> > deleting
> > > old logs.
> > >
> > > On Mon, Sep 21, 2015 at 8:58 PM allen chan <
> allen.michael.c...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Just brought up new kafka cluster for testing.
> > > > Was able to use the console producers to send 1k of logs and received
> > it
> > > on
> > > > the console consumer side.
> > > >
> > > > The one issue that i have right now is that the retention period does
> > not
> > > > seem to be working.
> > > >
> > > > *# The minimum age of a log file to be eligible for deletion*
> > > > *log.retention.hours=1*
> > > >
> > > > I have waited for almost 2 hours and the 1k of logs are still in
> kafka.
> > > >
> > > > I did see these messages pop up on the console
> > > > *[2015-09-21 17:12:01,236] INFO Scheduling log segment 0 for log
> test-1
> > > for
> > > > deletion. (kafka.log.Log)*
> > > > *[2015-09-21 17:13:01,238] INFO Deleting segment 0 from log test-1.
> > > > (kafka.log.Log)*
> > > > *[2015-09-21 17:13:01,239] INFO Deleting index
> > > > /var/log/kafka/test-1/.index.deleted
> > > > (kafka.log.OffsetIndex)*
> > > >
> > > > I know the logs are still in there because i am using
> > > > the kafka-consumer-offset-checker.sh and it says how many messages
> the
> > > > logSize is.
> > > >
> > > > What am i missing in my configuration?
> > > >
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > --
> > > > Allen Michael Chan
> > > >
> > >
> >
> >
> >
> > --
> > Allen Michael Chan
> >
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Jumbled up ISR

2015-09-15 Thread Grant Henke
The first replica in the ISR is the preferred replica, but is not required
to be the leader at all times. If you execute a preferred leader election,
or enable auto.leader.rebalance.enable, then replica 4 will become the
leader again.
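For example, a preferred replica election can be triggered from the command line (a
sketch assuming a local zookeeper; without a --path-to-json-file argument it runs for
all partitions):

bin/kafka-preferred-replica-election.sh --zookeeper localhost:2181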

More can be read here:

   - http://kafka.apache.org/documentation.html#basic_ops_leader_balancing
   -
   
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.PreferredReplicaLeaderElectionTool


Thank you,
Grant

On Tue, Sep 15, 2015 at 5:26 AM, Prabhjot Bharaj <prabhbha...@gmail.com>
wrote:

> Hi,
>
> Topic:part_1_repl_3_3 PartitionCount:1 ReplicationFactor:3 Configs:
>
> Topic: part_1_repl_3_3 Partition: 0 Leader: 3 Replicas: 3,4,5 Isr: 4,3,5
>
>
> I see that the replicas are 3,4,5 but, the ISR is 4,3,5.
>
>
> I have this doubt:-
>
> When the leader is 3, can the ISR be 4, 3, 5 ?
>
> Does the ISR got to have the leader as the first replica ??
>
>
> Regards,
>
> Prabhjot
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Zookeeper use cases with Kafka

2015-08-18 Thread Grant Henke
Hi Prabcs,

Much of that information can be found in the documentation and on the wiki.
The remaining can be found in the code. Any improvements to the
documentation is not only welcome but encouraged. Below are a few links to
get you started:

Documentation (See Zookeeper Directories):
http://kafka.apache.org/documentation.html#distributionimpl
Wiki:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper

I also recommended that contribution oriented emails be sent to
the  Kafka development email list: d...@kafka.apache.org

Thank you,
Grant

On Tue, Aug 18, 2015 at 8:03 AM, Prabhjot Bharaj prabhbha...@gmail.com
wrote:

 Hello Folks,

 I wish to contribute to Kafka internals. And, one of the things which can
 help me do that is understanding how kafka uses zookeeper. I have some of
 these basic doubts:-

 1. Is zookeeper primarily used for locking ? If yes, in what cases and what
 kind of nodes does it use - sequential/ephemeral?

 2. Does kafka use zookeeper watches for any of functions ?

 3. What kind of state is stored in Zookeeper ? (I believe it has to be the
 leader information per partition, but is there anything apart from it?)
 What is the scale of data that is stored in Zookeeper ?

 Looking forward for your help.

 Thanks,
 prabcs




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Describe command does not update details about killed brokers

2015-08-17 Thread Grant Henke
I think you are running into KAFKA-972
https://issues.apache.org/jira/browse/KAFKA-972(MetadataRequest returns
stale list of brokers) which is marked to be fixed in the 0.8.3 release.

Thank you,
Grant

On Mon, Aug 17, 2015 at 3:52 AM, Priya Darsini pdmgnfc940...@gmail.com
wrote:

 Hi,
 I have 3 brokers and created a topic with replication factor 3.
 I described topic after killing the brokers one by one.
 When i deleted my last live broker and described topic it still shows value
 for leader and isr.
 I've enabled auto.leader.rebalance


 Any clarifications for this confusing behaviour of kafka broker are
 welcome.
 Thanks in advance.




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Grant Henke
+dev

Adding dev list back in. Somehow it got dropped.


On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke ghe...@cloudera.com wrote:

 Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I don't
 suspect all of these will (or should) make it into the release but this
 should be a relatively complete list to work from:

- KAFKA-2114 https://issues.apache.org/jira/browse/KAFKA-2114: Unable
to change min.insync.replicas default
- KAFKA-1702 https://issues.apache.org/jira/browse/KAFKA-1702:
Messages silently Lost by producer
- KAFKA-2012 https://issues.apache.org/jira/browse/KAFKA-2012:
Broker should automatically handle corrupt index files
- KAFKA-2406 https://issues.apache.org/jira/browse/KAFKA-2406: ISR
propagation should be throttled to avoid overwhelming controller.
- KAFKA-2336 https://issues.apache.org/jira/browse/KAFKA-2336:
Changing offsets.topic.num.partitions after the offset topic is created
breaks consumer group partition assignment
- KAFKA-2337 https://issues.apache.org/jira/browse/KAFKA-2337: Verify
that metric names will not collide when creating new topics
- KAFKA-2393 https://issues.apache.org/jira/browse/KAFKA-2393:
Correctly Handle InvalidTopicException in KafkaApis.getTopicMetadata()
- KAFKA-2189 https://issues.apache.org/jira/browse/KAFKA-2189: Snappy
compression of message batches less efficient in 0.8.2.1
- KAFKA-2308 https://issues.apache.org/jira/browse/KAFKA-2308: New
producer + Snappy face un-compression errors after broker restart
- KAFKA-2042 https://issues.apache.org/jira/browse/KAFKA-2042: New
producer metadata update always get all topics.
- KAFKA-1367 https://issues.apache.org/jira/browse/KAFKA-1367: Broker
topic metadata not kept in sync with ZooKeeper
- KAFKA-972 https://issues.apache.org/jira/browse/KAFKA-972: 
 MetadataRequest
returns stale list of brokers
- KAFKA-1867 https://issues.apache.org/jira/browse/KAFKA-1867: liveBroker
list not updated on a cluster with no topics
- KAFKA-1650 https://issues.apache.org/jira/browse/KAFKA-1650: Mirror
Maker could lose data on unclean shutdown.
- KAFKA-2009 https://issues.apache.org/jira/browse/KAFKA-2009: Fix
UncheckedOffset.removeOffset synchronization and trace logging issue in
mirror maker
- KAFKA-2407 https://issues.apache.org/jira/browse/KAFKA-2407: Only
create a log directory when it will be used
- KAFKA-2327 https://issues.apache.org/jira/browse/KAFKA-2327:
broker doesn't start if config defines advertised.host but not
advertised.port
- KAFKA-1788: producer record can stay in RecordAccumulator forever if
leader is no available
- KAFKA-2234 https://issues.apache.org/jira/browse/KAFKA-2234:
Partition reassignment of a nonexistent topic prevents future reassignments
- KAFKA-2096 https://issues.apache.org/jira/browse/KAFKA-2096:
Enable keepalive socket option for broker to prevent socket leak
- KAFKA-1057 https://issues.apache.org/jira/browse/KAFKA-1057: Trim
whitespaces from user specified configs
- KAFKA-1641 https://issues.apache.org/jira/browse/KAFKA-1641: Log
cleaner exits if last cleaned offset is lower than earliest offset
- KAFKA-1648 https://issues.apache.org/jira/browse/KAFKA-1648: Round
robin consumer balance throws an NPE when there are no topics
- KAFKA-1724 https://issues.apache.org/jira/browse/KAFKA-1724:
Errors after reboot in single node setup
- KAFKA-1758 https://issues.apache.org/jira/browse/KAFKA-1758:
corrupt recovery file prevents startup
- KAFKA-1866 https://issues.apache.org/jira/browse/KAFKA-1866:
LogStartOffset gauge throws exceptions after log.delete()
- KAFKA-1883 https://issues.apache.org/jira/browse/KAFKA-1883: 
 NullPointerException
in RequestSendThread
- KAFKA-1896 https://issues.apache.org/jira/browse/KAFKA-1896:
Record size funcition of record in mirror maker hit NPE when the message
value is null.
- KAFKA-2101 https://issues.apache.org/jira/browse/KAFKA-2101:
Metric metadata-age is reset on a failed update
- KAFKA-2112 https://issues.apache.org/jira/browse/KAFKA-2112: make
overflowWheel volatile
- KAFKA-2117 https://issues.apache.org/jira/browse/KAFKA-2117:
OffsetManager uses incorrect field for metadata
- KAFKA-2164 https://issues.apache.org/jira/browse/KAFKA-2164:
ReplicaFetcherThread: suspicious log message on reset offset
- KAFKA-1668 https://issues.apache.org/jira/browse/KAFKA-1668:
TopicCommand doesn't warn if --topic argument doesn't match any topics
- KAFKA-2198 https://issues.apache.org/jira/browse/KAFKA-2198:
kafka-topics.sh exits with 0 status on failures
- KAFKA-2235 https://issues.apache.org/jira/browse/KAFKA-2235:
LogCleaner offset map overflow
- KAFKA-2241 https://issues.apache.org/jira/browse/KAFKA-2241:
AbstractFetcherThread.shutdown() should not block

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Grant Henke
   - KAFKA-2345 https://issues.apache.org/jira/browse/KAFKA-2345: Attempt
   to delete a topic already marked for deletion throws ZkNodeExistsException
   - KAFKA-2353 https://issues.apache.org/jira/browse/KAFKA-2353:
   SocketServer.Processor should catch exception and close the socket properly
   in configureNewConnections.
   - KAFKA-1836 https://issues.apache.org/jira/browse/KAFKA-1836:
   metadata.fetch.timeout.ms set to zero blocks forever
   - KAFKA-2317 https://issues.apache.org/jira/browse/KAFKA-2317: De-register
   isrChangeNotificationListener on controller resignation

Note: KAFKA-2120 https://issues.apache.org/jira/browse/KAFKA-2120 and
KAFKA-2421 https://issues.apache.org/jira/browse/KAFKA-2421 were
mentioned in previous emails, but are not in the list because they are not
committed yet.

Hope that helps the effort.

Thanks,
Grant

On Mon, Aug 17, 2015 at 12:09 AM, Grant Henke ghe...@cloudera.com wrote:

 +1 to that suggestion. Though I suspect that requires a committer to do.
 Making it part of the standard commit process could work too.
 On Aug 16, 2015 11:01 PM, Gwen Shapira g...@confluent.io wrote:

 BTW. I think it will be great for Apache Kafka to have a 0.8.2 release
 manager who's role is to cherrypick low-risk bug-fixes into the 0.8.2
 branch and once enough bug fixes happened (or if sufficiently critical
 fixes happened) to roll out a new maintenance release (with every 3 month
 as a reasonable bugfix release target).

 This will add some predictability regarding how fast we release fixes for
 bugs.

 Gwen

 On Sun, Aug 16, 2015 at 8:09 PM, Jeff Holoman jholo...@cloudera.com
 wrote:

  +1 for the release and also including
 
  https://issues.apache.org/jira/browse/KAFKA-2114
 
  Thanks
 
  Jeff
 
  On Sun, Aug 16, 2015 at 2:51 PM, Stevo Slavić ssla...@gmail.com
 wrote:
 
   +1 (non-binding) for 0.8.2.2 release
  
   Would be nice to include in that release new producer resiliency bug
  fixes
   https://issues.apache.org/jira/browse/KAFKA-1788 and
   https://issues.apache.org/jira/browse/KAFKA-2120
  
   On Fri, Aug 14, 2015 at 4:03 PM, Gwen Shapira g...@confluent.io
 wrote:
  
Will be nice to include Kafka-2308 and fix two critical snappy
 issues
  in
the maintenance release.
   
Gwen
On Aug 14, 2015 6:16 AM, Grant Henke ghe...@cloudera.com wrote:
   
 Just to clarify. Will KAFKA-2189 be the only patch in the release?

 On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy 
  ku...@nmsworks.co.in
   
 wrote:

  +1  for 0.8.2.2 release
 
  On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk
 
wrote:
 
   I think this is a good idea as the change is minimal on our
 side
   and
it
  has
   been tested in production for some time by the reporter.
  
   Best,
   Ismael
  
   On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io
  wrote:
  
Hi, Everyone,
   
Since the release of Kafka 0.8.2.1, a number of people have
reported
 an
issue with snappy compression (
https://issues.apache.org/jira/browse/KAFKA-2189).
 Basically,
  if
 they
   use
snappy in 0.8.2.1, they will experience a 2-3X space
 increase.
   The
  issue
has since been fixed in trunk (just a snappy jar upgrade).
  Since
 0.8.3
  is
still a few months away, it may make sense to do an 0.8.2.2
   release
  just
   to
fix this issue. Any objections?
   
Thanks,
   
Jun
   
  
 



 --
 Grant Henke
 Software Engineer | Cloudera
 gr...@cloudera.com | twitter.com/gchenke |
  linkedin.com/in/granthenke

   
  
 
 
 
  --
  Jeff Holoman
  Systems Engineer
 




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Compaction per topic

2015-08-17 Thread Grant Henke
Hi Elias,

You can set compaction on a per topic basis while leaving
log.cleanup.policy=delete as the default on the broker. See Topic-level
configuration here:
http://kafka.apache.org/documentation.html#brokerconfigs

An example usage of the command line tools to do this is:

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic
--config cleanup.policy=compact
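For the topics that should instead be purged after some number of days, the retention
can be overridden per topic in the same way (a sketch; 604800000 ms is 7 days and
other-topic is a placeholder name):

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic other-topic
--config retention.ms=604800000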


Thanks,
Grant

On Mon, Aug 17, 2015 at 1:48 PM, Elias K mariguanit...@gmail.com wrote:

 Hi all,

 This is my first post here so please bear with me.

 I would like to have compaction enabled in some topics, but in others have
 purge after x amount of days.

 I did some searches and I couldn't find anything related to this and it
 appears that the compaction is enabled globally for all topics.

 Thanks for all the information
 Elias




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-16 Thread Grant Henke
+1 to that suggestion. Though I suspect that requires a committer to do.
Making it part of the standard commit process could work too.
On Aug 16, 2015 11:01 PM, Gwen Shapira g...@confluent.io wrote:

 BTW. I think it will be great for Apache Kafka to have a 0.8.2 release
 manager who's role is to cherrypick low-risk bug-fixes into the 0.8.2
 branch and once enough bug fixes happened (or if sufficiently critical
 fixes happened) to roll out a new maintenance release (with every 3 month
 as a reasonable bugfix release target).

 This will add some predictability regarding how fast we release fixes for
 bugs.

 Gwen

 On Sun, Aug 16, 2015 at 8:09 PM, Jeff Holoman jholo...@cloudera.com
 wrote:

  +1 for the release and also including
 
  https://issues.apache.org/jira/browse/KAFKA-2114
 
  Thanks
 
  Jeff
 
  On Sun, Aug 16, 2015 at 2:51 PM, Stevo Slavić ssla...@gmail.com wrote:
 
   +1 (non-binding) for 0.8.2.2 release
  
   Would be nice to include in that release new producer resiliency bug
  fixes
   https://issues.apache.org/jira/browse/KAFKA-1788 and
   https://issues.apache.org/jira/browse/KAFKA-2120
  
   On Fri, Aug 14, 2015 at 4:03 PM, Gwen Shapira g...@confluent.io
 wrote:
  
Will be nice to include Kafka-2308 and fix two critical snappy issues
  in
the maintenance release.
   
Gwen
On Aug 14, 2015 6:16 AM, Grant Henke ghe...@cloudera.com wrote:
   
 Just to clarify. Will KAFKA-2189 be the only patch in the release?

 On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy 
  ku...@nmsworks.co.in
   
 wrote:

  +1  for 0.8.2.2 release
 
  On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk
wrote:
 
   I think this is a good idea as the change is minimal on our
 side
   and
it
  has
   been tested in production for some time by the reporter.
  
   Best,
   Ismael
  
   On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io
  wrote:
  
Hi, Everyone,
   
Since the release of Kafka 0.8.2.1, a number of people have
reported
 an
issue with snappy compression (
https://issues.apache.org/jira/browse/KAFKA-2189).
 Basically,
  if
 they
   use
snappy in 0.8.2.1, they will experience a 2-3X space
 increase.
   The
  issue
has since been fixed in trunk (just a snappy jar upgrade).
  Since
 0.8.3
  is
still a few months away, it may make sense to do an 0.8.2.2
   release
  just
   to
fix this issue. Any objections?
   
Thanks,
   
Jun
   
  
 



 --
 Grant Henke
 Software Engineer | Cloudera
 gr...@cloudera.com | twitter.com/gchenke |
  linkedin.com/in/granthenke

   
  
 
 
 
  --
  Jeff Holoman
  Systems Engineer
 



Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Grant Henke
Just to clarify. Will KAFKA-2189 be the only patch in the release?

On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy ku...@nmsworks.co.in
wrote:

 +1  for 0.8.2.2 release

 On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk wrote:

  I think this is a good idea as the change is minimal on our side and it
 has
  been tested in production for some time by the reporter.
 
  Best,
  Ismael
 
  On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io wrote:
 
   Hi, Everyone,
  
   Since the release of Kafka 0.8.2.1, a number of people have reported an
   issue with snappy compression (
   https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they
  use
   snappy in 0.8.2.1, they will experience a 2-3X space increase. The
 issue
   has since been fixed in trunk (just a snappy jar upgrade). Since 0.8.3
 is
   still a few months away, it may make sense to do an 0.8.2.2 release
 just
  to
   fix this issue. Any objections?
  
   Thanks,
  
   Jun
  
 




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Topic auto deletion

2015-08-11 Thread Grant Henke
Hi Shahar,

This feature does not exist.

Thanks,
Grant


On Tue, Aug 11, 2015 at 2:28 AM, Shahar Danos shahar.da...@kaltura.com
wrote:

 Hi,

 I'm wondering whether Kafka 0.8.2 has a topic auto-deletion feature, but
 couldn't figure it out from documentation
 https://kafka.apache.org/082/ops.html.
 I'm aware of delete.topic.enable config parameter, but I'm asking about an
 expiration aspect; when a topic exists but no one used it (consume/produce)
 for the last X mins, it will be auto-removed. I wonder if I can configure
 topics to be expired after a time I choose.

 Many thanks in advance,
 Shahar.




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: AdminUtils addPartition, subsequent producer send exception

2015-08-07 Thread Grant Henke
Interesting use case. I would be interested to hear more. Are you assigning
1 partition per key incrementally? How does your consumer know which
partition has which key?

I don't think there is a way to manually invalidate the cached metadata in
the public producer api (I could be wrong), but the longest you should have
to wait is whatever is configured for metadata.max.age.ms.

metadata.max.age.ms is the period of time in milliseconds after which we
force a refresh of metadata even if we haven't seen any partition
leadership changes to proactively discover any new brokers or partitions.
http://kafka.apache.org/documentation.html#newproducerconfigs
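If the default refresh is too slow for your workflow, one option (a sketch, not a
recommendation) is to lower it on the producer, at the cost of more frequent metadata
requests:

// new producer (kafka-clients) config fragment; the value is purely illustrative
props.put("metadata.max.age.ms", "1000");  // refresh topic metadata at least once per second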




On Fri, Aug 7, 2015 at 12:34 PM, Gelinas, Chiara cgeli...@illumina.com
wrote:

 Hi All,

 We are looking to dynamically create partitions when we see a new piece of
 data that requires logical partitioning (so partitioning from a logical
 perspective rather than partitioning solely for load-based reasons). I
 noticed that when I create a partition via AdminUtils.addPartition, and
 then send the message (within milliseconds since it’s all happening on one
 thread execution), I get the following error:

 Invalid partition given with record: 3 is not in the range [0...3].

 Basically, the Producer can’t see the new partition. When I set a
 breakpoint just before the send (which essentially sleeps the thread), then
 all is well, and it pushes the message to the new partition with no issues.

 I am running just one zookeeper, one kafka (no replicas) – this is all
 local on my dev environment.

 Is this normal behavior or is there possibly some issue with how we are
 using addPartition? Also, once we have replicas in a more realistic
 production environment, should we expect this lag to increase?

 The only workaround I can envision for this is to have the thread check
 the partition count via AdminUtils and only move on when the partition
 count comes back as expected.

 Thanks,
 Chiara





-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: AdminUtils addPartition, subsequent producer send exception

2015-08-07 Thread Grant Henke
Glad to help. I will say, as you probably got from my interest/questions,
that is definitely outside of normal use (that I have seen). Why do you
need dynamic logical partitioning?

On Fri, Aug 7, 2015 at 1:20 PM, Gelinas, Chiara cgeli...@illumina.com
wrote:

 Thank you! We are new to Kafka, so this makes complete sense - the
 metadata refresh age.

 Yes, incrementally assigning 1 partitioner key - we are tracking it in a
 relational DB along with offset, etc, for the consumers.

  We haven't yet implemented the dynamic consumer side but there are several
 approaches from the brute/force quick/dirty (e.g. Poll DB at interval for
 updates), to using another Kafka topic for new partition event
 notifications  (of course, we favor the latter, but timelines will likely
 mean that the first option is what initially gets implemented).

 On 8/7/15, 10:56 AM, Grant Henke ghe...@cloudera.com wrote:

 Interesting use case. I would be interested to hear more. Are you
 assigning
 1 partition per key incrementally? How does your consumer know which
 partition has which key?
 
 I don't think there is a way to manually invalidate the cached metadata in
 the public producer api (I could be wrong), but the longest you should
 have
 to wait is whatever is configured for metadata.max.age.ms.
 
 metadata.max.age.ms is the period of time in milliseconds after which we
 force a refresh of metadata even if we haven't seen any partition
 leadership changes to proactively discover any new brokers or partitions.
 http://kafka.apache.org/documentation.html#newproducerconfigs
 
 
 
 
 On Fri, Aug 7, 2015 at 12:34 PM, Gelinas, Chiara cgeli...@illumina.com
 wrote:
 
  Hi All,
 
  We are looking to dynamically create partitions when we see a new piece
 of
  data that requires logical partitioning (so partitioning from a logical
  perspective rather than partitioning solely for load-based reasons). I
  noticed that when I create a partition via AdminUtils.addPartition, and
  then send the message (within milliseconds since it's all happening on
 one
  thread execution), I get the following error:
 
  Invalid partition given with record: 3 is not in the range [0...3].
 
  Basically, the Producer can't see the new partition. When I set a
  breakpoint just before the send (which essentially sleeps the thread),
 then
  all is well, and it pushes the message to the new partition with no
 issues.
 
  I am running just one zookeeper, one kafka (no replicas) - this is all
  local on my dev environment.
 
  Is this normal behavior or is there possibly some issue with how we are
  using addPartition? Also, once we have replicas in a more realistic
  production environment, should we expect this lag to increase?
 
  The only workaround I can envision for this is to have the thread check
  the partition count via AdminUtils and only move on when the partition
  count comes back as expected.
 
  Thanks,
  Chiara
 
 
 
 
 
 --
 Grant Henke
 Software Engineer | Cloudera
 gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: OffsetOutOfRangeError with Kafka-Spark streaming

2015-08-06 Thread Grant Henke
Looks like this is likely a case very similar to the case Parth mentioned
storm users have seen, when processing falls behind the retention period.

Perhaps Spark and Kafka can handle this scenario more gracefully. I would
be happy to do some investigation/testing and report back with findings and
potentially open a Jira to track any fix.

On Thu, Aug 6, 2015 at 6:48 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 retention.ms is actually millisecond, you want a value much larger then
 1440, which translates to 1.4 seconds.


 On 8/6/15, 4:35 PM, Cassa L lcas...@gmail.com wrote:

 Hi Grant,
 Yes, I saw exception in Spark and Kafka. In Kafka server logs I get this
 exception:
 kafka.common.OffsetOutOfRangeException: Request for offset 2823 but we
 only
 have log segments in the range 2824 to 2824.
 at kafka.log.Log.read(Log.scala:380)
 at
 kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.sc
 ala:530)
 at
 kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.
 apply(KafkaApis.scala:476)
 at
 kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.
 apply(KafkaApis.scala:471)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scal
 a:206)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scal
 a:206)
 at scala.collection.immutable.Map$Map1.foreach(Map.scala:105)
 at
 scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.Map$Map1.map(Map.scala:93)
 at
 kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.s
 cala:471)
 at
 kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:783)
 at
 kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:765)
 at
 kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.sc
 a
 
 Similar kind of exception comes to Spark Job.
 
 Here are my versions :
Spark - 1.4.1
 Kafka - 0.8.1
 
 I changed retention on config using this command :
 ./kafka-topics.sh --alter --zookeeper  XXX:2181  --topic MyTopic --config
 retention.ms=1440  (I believe this is in minutes)
 
 I am also noticing something in Kafka. When I run below command on broker:
 ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
 vdc-vm8.apple.com:9092 --topic MyTopic --time -2
 Earliest offset is being set to latest just in few seconds. Am I
 co-relating this issue correctly?
 
 Here is my example on a new Topic. Initial output of this command is
 ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
 vdc-vm8.apple.com:9092 --topic MyTopic --time -2
 SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder.
 SLF4J: Defaulting to no-operation (NOP) logger implementation
 SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
 details.
 MyTopic:0:60
 
 I published 4 messages to Kafka. Immediately after few seconds, command
 output is:
   MyTopic:0:64
 Isn't this supposed to stay at 60 for longer time based on retention
 policy?
 
 
 Thanks,
 Leena
 
 
 On Thu, Aug 6, 2015 at 12:09 PM, Grant Henke ghe...@cloudera.com wrote:
 
  Does this Spark Jira match up with what you are seeing or sound related?
  https://issues.apache.org/jira/browse/SPARK-8474
 
  What versions of Spark and Kafka are you using? Can you include more of
 the
  spark log? Any errors shown in the Kafka log?
 
  Thanks,
  Grant
 
  On Thu, Aug 6, 2015 at 1:17 PM, Cassa L lcas...@gmail.com wrote:
 
   Hi,
Has anyone tried streaming API of Spark with Kafka? I am
 experimenting
  new
   Spark API to read from Kafka.
   KafkaUtils.createDirectStream(...)
  
   Every now and then, I get following error spark
   kafka.common.OffsetOutOfRangeException and my spark script stops
  working.
   I have simple topic with just one partition.
  
   I would appreciate any clues on how to debug this issue.
  
   Thanks,
   LCassa
  
 
 
 
  --
  Grant Henke
  Software Engineer | Cloudera
  gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
 




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: OffsetOutOfRangeError with Kafka-Spark streaming

2015-08-06 Thread Grant Henke
Does this Spark Jira match up with what you are seeing or sound related?
https://issues.apache.org/jira/browse/SPARK-8474

What versions of Spark and Kafka are you using? Can you include more of the
spark log? Any errors shown in the Kafka log?

Thanks,
Grant

On Thu, Aug 6, 2015 at 1:17 PM, Cassa L lcas...@gmail.com wrote:

 Hi,
  Has anyone tried streaming API of Spark with Kafka? I am experimenting new
 Spark API to read from Kafka.
 KafkaUtils.createDirectStream(...)

 Every now and then, I get following error spark
 kafka.common.OffsetOutOfRangeException and my spark script stops working.
 I have simple topic with just one partition.

 I would appreciate any clues on how to debug this issue.

 Thanks,
 LCassa




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Kafka Simple Cosumer

2015-08-06 Thread Grant Henke
I think you are looking for commitOffsets. I pasted the doc code snippet
below:

  /**
   * Commit offsets for a topic
   * Version 0 of the request will commit offsets to Zookeeper and version
1 and above will commit offsets to Kafka.
   * @param request a [[kafka.api.OffsetCommitRequest]] object.
   * @return a [[kafka.api.OffsetCommitResponse]] object.
   */
  def commitOffsets(request: OffsetCommitRequest)

Thanks,
Grant

On Thu, Aug 6, 2015 at 12:09 AM, uvsd1 prabhakar.digumar...@gmail.com
wrote:

 Hi All ,

 Does kafka provides a way to store the offset within zookeper while using
 the simpleConsumer ?


 Thanks,
 Prabhakar




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Decomissioning a broker

2015-08-04 Thread Grant Henke
That's correct. Thanks for catching that.

On Tue, Aug 4, 2015 at 3:27 PM, Andrew Otto ao...@wikimedia.org wrote:

 Thanks!

  In fact if you use a Controlled Shutdown migrating the replicas and
  leaders should happen for you as well.

 Just to clarify, controlled shutdown will only move the leaders to other
 replicas, right?  It won’t actually migrate any replicas elsewhere.

 -Ao


  On Aug 4, 2015, at 13:00, Grant Henke ghe...@cloudera.com wrote:
 
  The broker will actually unregister itself from zookeeper. The brokers id
  path uses ephemeral nodes so they are automatically destroyed on
 shutdown.
  In fact if you use a Controlled Shutdown migrating the replicas and
  leaders should happen for you as well. Though, manual reassignment may be
  preferred in your case.
 
  Here is some extra information on controlled shutdowns:
  http://kafka.apache.org/documentation.html#basic_ops_restarting
 
  Thanks,
  Grant
 
  On Thu, Jul 30, 2015 at 4:37 PM, Andrew Otto ao...@wikimedia.org
 wrote:
 
  I’m sure this has been asked before, but I can’t seem to find the
 answer.
 
  I’m planning a Kafka cluster expansion and upgrade to 0.8.2.1.  In doing
  so, I will be decommissioning a broker.  I plan to remove this broker
 fully
  from the cluster, and then reinstall it and use it for a different
 purpose.
 
  I understand how to use the reassign-partitions tool to generate new
  partition assignments and to move partitions around so that the target
  broker no longer has any active replicas.  Once that is done, is there
  anything special that needs to happen?  I can shutdown the broker, but
 as
  far as I know that broker will still be registered in Zookeeper.
 Should I
  just delete the znode for that broker once it has been shut down?
 
  Thanks!
  -Andrew Otto
 
 
 
 
  --
  Grant Henke
  Software Engineer | Cloudera
  gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Lead Broker from kafka.message.MessageAndMetadata

2015-08-04 Thread Grant Henke
Hi Sreeni,

Using the SimpleConsumer you can send a TopicMetadataRequest for a topic
and the TopicMetadataResponse will contain TopicMetadata for each topic
requested (or all) which contains PartitionMetadata for all partitions.
The PartitionMetadata contains the leader, replicas, and isr.
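A rough sketch of that lookup with the javaapi SimpleConsumer (broker host, port, and
topic name are placeholders; error handling and retries are omitted):

import java.util.Collections;
import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class LeaderLookupSketch {
    public static void main(String[] args) {
        SimpleConsumer consumer = new SimpleConsumer("broker-host", 9092, 100000, 64 * 1024, "leaderLookup");
        TopicMetadataRequest request = new TopicMetadataRequest(Collections.singletonList("my-topic"));
        TopicMetadataResponse response = consumer.send(request);
        for (TopicMetadata topic : response.topicsMetadata()) {
            for (PartitionMetadata partition : topic.partitionsMetadata()) {
                String leader = partition.leader() != null ? partition.leader().host() : "none";
                System.out.println("partition " + partition.partitionId() + " leader: " + leader);
            }
        }
        consumer.close();
    }
}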

Is that what you are looking for?

Thanks,
Grant

On Mon, Aug 3, 2015 at 7:26 AM, Sreenivasulu Nallapati 
sreenu.nallap...@gmail.com wrote:

 Hello,

 Is there a way that we can find the lead broker
 from kafka.message.MessageAndMetadata class?

 My use case is simple, I have topic and partition and wanted to find out
 the lead broker for that partition.

 Please provide your insights


 Thanks
 Sreeni




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Kafka Zookeeper Issues

2015-08-04 Thread Grant Henke
The /brokers/ids nodes are ephemeral nodes that only exist while the
brokers maintain a session to zookeeper. There is more information on
Kafka's Zookeeper usage here:
   - http://kafka.apache.org/documentation.html
  - look for Broker Node Registry
   -
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper
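A quick way to watch the registrations from the Zookeeper side (same idea as the docs
above; adjust the connect string and broker id for your environment):

bin/zookeeper-shell.sh localhost:2181
ls /brokers/ids
get /brokers/ids/0

The znode under /brokers/ids disappears as soon as the owning broker's Zookeeper
session ends, so an empty listing there usually means the sessions were lost or the
brokers shut down.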

Hopefully that helps debug your issue.

Thank you,
Grant

On Mon, Aug 3, 2015 at 5:20 AM, Wollert, Fabian fabian.woll...@zalando.de
wrote:

 hi everyone,

 we are trying to deploy Kafka 0.8.2.1 and Zookeeper on AWS using
 Cloudformation, ASG's and other Services. For Zookeeper we are using
 Netflix' Exhibitor (V 1.5.5) to ensure failover stability.

 What we are observing right now is that after some days our Brokers are not
 registered anymore in the /brokers/ids path in Zookeeper. I was trying to
 see when they get deleted to check the logs, but the ZK Transaction logs
 only shows the create stmt, no deletes or something (though deletes are
 written down there). Can someone explain me how the mechanism works with
 registering and deregistering in Zookeeper or point me to a doc or even
 source code, where this happens? Or some one has even some idea what
 happens there.

 Any experience on what to take care of when deploying Kafka on AWS (or a
 cloud environment in general) would also be helpful.

 Cheers

 --
 *Fabian Wollert*
 Business Intelligence

 *POSTAL ADDRESS*
 Zalando SE
 11501 Berlin

 *OFFICE*
 Zalando SE
 Mollstraße 1
 10178 Berlin
 Germany

 Phone: +49 30 20968 1819
 Fax:   +49 30 27594 693
 E-Mail: fabian.woll...@zalando.de
 Web: www.zalando.de
 Jobs: jobs.zalando.de

 Zalando SE, Tamara-Danz-Straße 1, 10243 Berlin
 Company registration: Amtsgericht Charlottenburg, HRB 158855 B
 Tax ID: 29/560/00596 * VAT registration number: DE 260543043
 Management Board: Robert Gentz, David Schneider, Rubin Ritter
 Chairperson of the Supervisory Board: Cristina Stenbeck
 Registered office: Berlin




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Decommissioning a broker

2015-08-04 Thread Grant Henke
The broker will actually unregister itself from zookeeper. The broker's id
path uses an ephemeral node, so it is automatically destroyed on shutdown.
In fact, if you use a controlled shutdown, migrating the replicas and
leaders should happen for you as well, though manual reassignment may be
preferred in your case.

Here is some extra information on controlled shutdowns:
http://kafka.apache.org/documentation.html#basic_ops_restarting
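
For reference, the relevant broker settings look something like this in
server.properties (illustrative values; check the broker config docs for the
defaults in your version):

controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000

With that enabled, a normal broker shutdown will first try to transfer
leadership for the partitions the broker leads before the process exits.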

Thanks,
Grant

On Thu, Jul 30, 2015 at 4:37 PM, Andrew Otto ao...@wikimedia.org wrote:

 I’m sure this has been asked before, but I can’t seem to find the answer.

 I’m planning a Kafka cluster expansion and upgrade to 0.8.2.1.  In doing
 so, I will be decommissioning a broker.  I plan to remove this broker fully
 from the cluster, and then reinstall it and use it for a different purpose.

 I understand how to use the reassign-partitions tool to generate new
 partition assignments and to move partitions around so that the target
 broker no longer has any active replicas.  Once that is done, is there
 anything special that needs to happen?  I can shutdown the broker, but as
 far as I know that broker will still be registered in Zookeeper.  Should I
 just delete the znode for that broker once it has been shut down?

 Thanks!
 -Andrew Otto




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Log Deletion Behavior

2015-07-24 Thread Grant Henke
Also this stackoverflow answer may help: http://stackoverflow.com/a/29672325

On Fri, Jul 24, 2015 at 9:36 PM, Grant Henke ghe...@cloudera.com wrote:

 I would actually suggest only using the ms versions of the retention
 config. Be sure to check/set all the configs below and look at the
 documented explanations here
 http://kafka.apache.org/documentation.html#brokerconfigs.

 I am guessing the default log.retention.check.interval.ms of 5 minutes is
 too long for your case or you may have changed from the default
 log.cleanup.policy and it is no longer set to delete.

 log.retention.ms
 log.retention.check.interval.ms
 log.cleanup.policy

 On Fri, Jul 24, 2015 at 9:14 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com wrote:

 To add on, the main thing here is you should be using only one of these
 properties.

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com
  wrote:

  Yes. It should. Do not set other retention settings. Just use the
 hours
  settings.
  Let me know about this :)
 
  Thanks,
 
  Mayuresh
 
  On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
  Mayuresh, thanks for your comment. I won't be able to change these
  settings
  until next Monday, but just so confirm you are saying that if I restart
  the
  brokers my logs should delete themselves with respect to the newest
  settings, correct?
 
  On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
  gharatmayures...@gmail.com
   wrote:
 
   No. This should not happen. At Linkedin we just use the log retention
    hours. Try using that. Change it and bounce the broker. It should
 work.
   Also looking back at the config's I am not sure why we had 3
 different
   configs for the same property :
  
   log.retention.ms
   log.retention.minutes
   log.retention.hours
  
   We should probably be having just the milliseconds.
  
   Thanks,
  
   Mayuresh
  
   On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu
 wrote:
  
Hi all,
   
I have a few broad questions on how log deletion works,
 specifically
  in
conjunction with the log.retention.time setting. Say I published
 some
messages to some topics when the configuration was originally set
 to
something like log.retention.hours=168 (default). If I publish
 these
messages successfully, then later set the configuration to
 something
  like
log.retention.minutes=1, are those logs supposed to persist for the
   newest
settings or the old settings? Right now my logs are refusing to
 delete
themselves unless I specifically mark them for deletion -- is this
 the
correct/anticipated/wanted behavior?
   
Thanks for the help!
   
--
   
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
   
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
   
  
  
  
   --
   -Regards,
   Mayuresh R. Gharat
   (862) 250-7125
  
 
 
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 
 
 
 
  --
  -Regards,
  Mayuresh R. Gharat
  (862) 250-7125
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




 --
 Grant Henke
 Solutions Consultant | Cloudera
 ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Log Deletion Behavior

2015-07-24 Thread Grant Henke
I would actually suggest only using the ms versions of the retention
config. Be sure to check/set all the configs below and look at the
documented explanations here
http://kafka.apache.org/documentation.html#brokerconfigs.

I am guessing the default log.retention.check.interval.ms of 5 minutes is
too long for your case or you may have changed from the default
log.cleanup.policy and it is no longer set to delete.

log.retention.ms
log.retention.check.interval.ms
log.cleanup.policy
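
For example, a minimal server.properties sketch (the values are illustrative
only, not recommendations):

# keep data for roughly one minute
log.retention.ms=60000
# run the retention check every 30 seconds
log.retention.check.interval.ms=30000
# delete (rather than compact) old segments
log.cleanup.policy=delete

Keep in mind the check interval bounds how quickly a new retention setting
takes effect, and only closed log segments are deleted, so data still in the
active segment won't be removed until that segment rolls.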

On Fri, Jul 24, 2015 at 9:14 PM, Mayuresh Gharat gharatmayures...@gmail.com
 wrote:

 To add on, the main thing here is you should be using only one of these
 properties.

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com
  wrote:

  Yes. It should. Do not set other retention settings. Just use the hours
  settings.
  Let me know about this :)
 
  Thanks,
 
  Mayuresh
 
  On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
  Mayuresh, thanks for your comment. I won't be able to change these
  settings
  until next Monday, but just so confirm you are saying that if I restart
  the
  brokers my logs should delete themselves with respect to the newest
  settings, correct?
 
  On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
  gharatmayures...@gmail.com
   wrote:
 
   No. This should not happen. At Linkedin we just use the log retention
    hours. Try using that. Change it and bounce the broker. It should
 work.
   Also looking back at the config's I am not sure why we had 3 different
   configs for the same property :
  
   log.retention.ms
   log.retention.minutes
   log.retention.hours
  
   We should probably be having just the milliseconds.
  
   Thanks,
  
   Mayuresh
  
   On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu
 wrote:
  
Hi all,
   
I have a few broad questions on how log deletion works, specifically
  in
conjunction with the log.retention.time setting. Say I published
 some
messages to some topics when the configuration was originally set to
something like log.retention.hours=168 (default). If I publish these
messages successfully, then later set the configuration to something
  like
log.retention.minutes=1, are those logs supposed to persist for the
   newest
settings or the old settings? Right now my logs are refusing to
 delete
themselves unless I specifically mark them for deletion -- is this
 the
correct/anticipated/wanted behavior?
   
Thanks for the help!
   
--
   
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
   
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
   
  
  
  
   --
   -Regards,
   Mayuresh R. Gharat
   (862) 250-7125
  
 
 
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 
 
 
 
  --
  -Regards,
  Mayuresh R. Gharat
  (862) 250-7125
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Dropping support for Scala 2.9.x

2015-07-08 Thread Grant Henke
+1 for dropping 2.9

On Wed, Jul 8, 2015 at 9:15 AM, Sriharsha Chintalapani ka...@harsha.io
wrote:

 I am +1 on dropping 2.9.x support.

 Thanks,
 Harsha


 On July 8, 2015 at 7:08:12 AM, Ismael Juma (mli...@juma.me.uk) wrote:

 Hi,

 The responses in this thread were positive, but there weren't many. A few
 months passed and Sriharsha encouraged me to reopen the thread given that
 the 2.9 build has been broken for at least a week[1] and no-one seemed to
 notice.

 Do we want to invest more time so that the 2.9 build continues to work or
 do we want to focus our efforts on 2.10 and 2.11? Please share your
 opinion.

 Best,
 Ismael

 [1] https://issues.apache.org/jira/browse/KAFKA-2325

 On Fri, Mar 27, 2015 at 2:20 PM, Ismael Juma mli...@juma.me.uk wrote:

  Hi all,
 
  The Kafka build currently includes support for Scala 2.9, which means
 that
  it cannot take advantage of features introduced in Scala 2.10 or depend
 on
  libraries that require it.
 
  This restricts the solutions available while trying to solve existing
  issues. I was browsing JIRA looking for areas to contribute and I quickly
  ran into two issues where this is the case:
 
  * KAFKA-1351: String.format is very expensive in Scala could be solved
  nicely by using the String interpolation feature introduced in Scala
 2.10.
 
  * KAFKA-1595: Remove deprecated and slower scala JSON parser from
  kafka.consumer.TopicCount could be solved by using an existing JSON
  library, but both jackson-scala and play-json require 2.10 (argonaut
  supports Scala 2.9, but it brings other dependencies like scalaz). We can
  workaround this by writing our own code instead of using libraries, of
  course, but it's not ideal.
 
  Other features like Scala Futures and value classes would also be useful
  in some situations, I would think (for a more extensive list of new
  features, see
 
 http://scala-language.1934581.n4.nabble.com/Scala-2-10-0-now-available-td4634126.html
  ).
 
  Another pain point of supporting 2.9.x is that it doubles the number of
  build and test configurations required from 2 to 4 (because the 2.9.x
  series was not necessarily binary compatible).
 
  A strong argument for maintaining support for 2.9.x was the client
  library, but that has been rewritten in Java.
 
  It's also worth mentioning that Scala 2.9.1 was released in August 2011
  (more than 3.5 years ago) and the 2.9.x series hasn't received updates of
  any sort since early 2013. Scala 2.10.0, in turn, was released in January
  2013 (over 2 years ago) and 2.10.5, the last planned release in the
 2.10.x
  series, has been recently released (so even 2.10.x won't be receiving
  updates any longer).
 
  All in all, I think it would not be unreasonable to drop support for
 Scala
  2.9.x in a future release, but I may be missing something. What do others
  think?
 
  Ismael
 




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


queued.max.requests Configuration

2015-06-18 Thread Grant Henke
The default value of queued.max.requests is 500. However, the sample
production config in the documentation (
http://kafka.apache.org/documentation.html#prodconfig)
sets queued.max.requests to 16.

Can anyone elaborate on the recommended value of 16 and the trade-offs of
increasing or decreasing this value from the default?

Thank you,
Grant

-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: blocking KafkaProducer call

2015-04-01 Thread Grant Henke
They can. You can read more about configuring the new java producer here:
http://kafka.apache.org/documentation.html#newproducerconfigs
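
For example, a minimal (untested) sketch of setting those on the new producer;
the broker address and values are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class ProducerConfigExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        // linger.ms and batch.size control how long and how large batches may grow
        props.put("linger.ms", "0");
        props.put("batch.size", "16384");
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<byte[], byte[]>(props);
        producer.close();
    }
}

These settings apply whether or not you pass a callback, though if you block
on get() for every record you will effectively be sending one small batch at
a time.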

Thanks,
Grant

On Wed, Apr 1, 2015 at 12:34 PM, sunil kalva sambarc...@gmail.com wrote:

 Do these config params have an effect when I try to simulate sync mode by
 not passing a callback?

 On Wed, Apr 1, 2015 at 10:32 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com wrote:

 Whats your linger.ms and batch.size ?

 Thanks,

 Mayuresh

 On Wed, Apr 1, 2015 at 5:51 AM, sunil kalva sambarc...@gmail.com wrote:

  I am trying to simulate a sync call using the following code,

  try {

  Future<RecordMetadata> send = producer.send(new
  ProducerRecord<byte[],byte[]>("the-topic", key.getBytes(),
  value.getBytes()));

   send.get();

  System.out.println("Time = " + (System.currentTimeMillis() - b));
  } catch (Exception e) {

  }

  And I am using the new org.apache.kafka.clients.producer.KafkaProducer
  class for sending messages; each message is taking more than 100ms.
  Am I missing something? If I use the old kafka.javaapi.producer.Producer
  it gives the desired throughput.

  Please advise me how to fix this.
 
 
  On Tue, Mar 31, 2015 at 11:21 PM, sunil kalva sambarc...@gmail.com
  wrote:
 
   thanks ghenke, that was a quick response. I will test and will let you
   know if i have some questions.
  
   On Tue, Mar 31, 2015 at 11:17 PM, Grant Henke ghe...@cloudera.com
  wrote:
  
   I think you are looking at is this section:
  
If you want to simulate a simple blocking call you can do the
  following:
   
producer.send(new ProducerRecord<byte[],byte[]>("the-topic",
   key.getBytes(), value.getBytes())).get();
   
What that is doing is calling .get() on the Future returned by the
  send
   method. This will block until the message is sent or an exception is
   thrown.
  
   The documentation for Future is here:
  
  
 
 http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html#get()
  
   On Tue, Mar 31, 2015 at 12:30 PM, sunil kalva sambarc...@gmail.com
   wrote:
  
Hi
According to this
   
   
  
 
 http://kafka.apache.org/082/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
documentation, if i don't pass callback it will work as blocking
 call,
   Does
it mean that message will be immediately sent to kafka cluster and
 all
possible exceptions will be throws immediately if not able to send
 ?
   
--
SunilKalva
   
  
  
  
   --
   Grant Henke
   Solutions Consultant | Cloudera
   ghe...@cloudera.com | 920-980-8979
   twitter.com/ghenke http://twitter.com/gchenke |
   linkedin.com/in/granthenke
  
  
  
  
   --
   SunilKalva
  
 
 
 
  --
  SunilKalva
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




 --
 SunilKalva




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/ghenke http://twitter.com/gchenke |
linkedin.com/in/granthenke


Re: blocking KafkaProducer call

2015-03-31 Thread Grant Henke
I think you are looking at is this section:

 If you want to simulate a simple blocking call you can do the following:

 producer.send(new ProducerRecord<byte[],byte[]>("the-topic",
 key.getBytes(), value.getBytes())).get();

 What that is doing is calling .get() on the Future returned by the send
method. This will block until the message is sent or an exception is
thrown.

The documentation for Future is here:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html#get()
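
Here is a small, untested sketch showing both styles (the broker address,
topic, and payload are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class BlockingSendExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<byte[], byte[]>(props);

        ProducerRecord<byte[], byte[]> record =
            new ProducerRecord<byte[], byte[]>("the-topic", "key".getBytes(), "value".getBytes());

        // Blocking style: get() waits for the broker ack (or throws if the send failed)
        RecordMetadata meta = producer.send(record).get();
        System.out.println("Wrote to partition " + meta.partition() + " at offset " + meta.offset());

        // Non-blocking style: hand the result to a callback instead of calling get()
        producer.send(record, new Callback() {
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null) exception.printStackTrace();
            }
        });

        producer.close();
    }
}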

On Tue, Mar 31, 2015 at 12:30 PM, sunil kalva sambarc...@gmail.com wrote:

 Hi
 According to this

 http://kafka.apache.org/082/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
 documentation, if I don't pass a callback it will work as a blocking call. Does
 it mean that the message will be immediately sent to the kafka cluster and all
 possible exceptions will be thrown immediately if it is not able to send?

 --
 SunilKalva




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | 920-980-8979
twitter.com/ghenke http://twitter.com/gchenke | linkedin.com/in/granthenke


Re: Check topic exists after deleting it.

2015-03-23 Thread Grant Henke
What happens when producers or consumers are running while the topic
deletion is going on?

On Mon, Mar 23, 2015 at 10:02 AM, Harsha ka...@harsha.io wrote:

 DeleteTopic makes a node in zookeeper to let the controller know that there is
 a topic up for deletion. This doesn't immediately delete the topic; it can
 take time depending on whether all the partitions of that topic are online and
 the brokers are available. Once all the log files are deleted, the zookeeper
 node gets deleted as well.
 Also make sure you don't have any producers or consumers running while
 the topic deletion is going on.

 --
 Harsha


 On March 23, 2015 at 1:29:50 AM, anthony musyoki (
 anthony.musy...@gmail.com) wrote:

 On deleting a topic via TopicCommand.deleteTopic()

 I get Topic test-delete is marked for deletion.

 I follow up by checking if the topic exists by using
 AdminUtils.topicExists()
 which surprisingly returns true.

 I expected AdminUtils.TopicExists() to check both BrokerTopicsPath
 and DeleteTopicsPath before returning a verdict but it only checks
 BrokerTopicsPath

 Shouldn't a topic marked for deletion return false for topicExists() ?




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | 920-980-8979
twitter.com/ghenke http://twitter.com/gchenke | linkedin.com/in/granthenke