Re: shrinking isr for partitions followed by error.. (kafka 0.8.1.1)

2014-11-14 Thread Manish
Correction: server.log (not the zookeeper logs).

On Fri, Nov 14, 2014 at 5:04 PM, Manish  wrote:

> I had deleted the topic "sams_2014-10-30" successfully, and later on one
> of our kafka brokers (broker id 4) crashed.
>
> Now the topic is gone from zookeeper:
>
> ls /brokers/topics
>
> (doesn't show the topic anymore)
>
>
> but I still see the zookeeper logs filling up with these messages:
>
>
> [2014-11-14 16:57:27,837] INFO Partition [sams_2014-10-30,7] on broker 0:
> Shrinking ISR for partition [sams_2014-10-30,7] from 0,4 to 0
> (kafka.cluster.Partition)
>
> [2014-11-14 16:57:27,907] ERROR Conditional update of path
> /brokers/topics/sams_2014-10-30/partitions/7/state with data
> {"controller_epoch":20,"leader":0,"version":1,"leader_epoch":7,"isr":[0]}
> and expected version 490 failed due to
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> NoNode for /brokers/topics/sams_2014-10-30/partitions/7/state
> (kafka.utils.ZkUtils$)
>
>
>
> Any ideas?
>
>
>


shrinking isr for partitions followed by error.. (kafka 0.8.1.1)

2014-11-14 Thread Manish
I had deleted the topic "sams_2014-10-30" successfully, and later on one of
our kafka brokers (broker id 4) crashed.

Now the topic is gone from zookeeper:

ls /brokers/topics

(doesn't show the topic anymore)


but I still see the zookeeper logs filling up with these messages:


[2014-11-14 16:57:27,837] INFO Partition [sams_2014-10-30,7] on broker 0:
Shrinking ISR for partition [sams_2014-10-30,7] from 0,4 to 0
(kafka.cluster.Partition)

[2014-11-14 16:57:27,907] ERROR Conditional update of path
/brokers/topics/sams_2014-10-30/partitions/7/state with data
{"controller_epoch":20,"leader":0,"version":1,"leader_epoch":7,"isr":[0]}
and expected version 490 failed due to
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
NoNode for /brokers/topics/sams_2014-10-30/partitions/7/state
(kafka.utils.ZkUtils$)



Any ideas?
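
For anyone hitting the same thing: topic deletion in 0.8.1.1 was known to be
incomplete, so a broker can keep stale in-memory replica state for a deleted
topic until it is restarted. A minimal sketch for double-checking the znodes
from code, assuming the org.I0Itec.zkclient library that Kafka 0.8 bundles
and a placeholder ZooKeeper address:

import org.I0Itec.zkclient.ZkClient;

public class CheckTopicState {
    public static void main(String[] args) {
        // placeholder address; session/connection timeouts in ms
        ZkClient zk = new ZkClient("zkhost:2181", 10000, 10000);
        try {
            // true means the znode still exists in ZooKeeper
            System.out.println("topic znode exists: "
                + zk.exists("/brokers/topics/sams_2014-10-30"));
            System.out.println("partition 7 state znode exists: "
                + zk.exists(
                    "/brokers/topics/sams_2014-10-30/partitions/7/state"));
        } finally {
            zk.close();
        }
    }
}

If the znodes are gone but a broker still logs shrinking-ISR errors,
restarting that broker (broker 0 here) usually clears the stale state.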


Re: log4j dir?

2014-11-14 Thread hsy...@gmail.com
I think there is no way to specify a different log location without modifying
the shell script. The kafka.logs.dir setting in the log4j.properties file is
misleading.

On Fri, Nov 14, 2014 at 1:24 PM, Ben Drees  wrote:

> Hi,
>
> I had trouble with this as well.  The version of Kafka I'm running insists
> on using 'kafka/logs', so I create a soft link from there to the desired
> destination directory:
>
> # kafka scripts hard-code the logs dir, so point that path to where we want
> the logs to be.
> ln -s $STREAM_BUFFER_LOGS_DIR kafka/logs
>
> -Ben
>
>
> On Fri, Nov 14, 2014 at 11:17 AM, hsy...@gmail.com 
> wrote:
>
> > Does anyone have any idea how to configure the log4j file dir?
> >
> > On Thu, Nov 13, 2014 at 4:58 PM, hsy...@gmail.com 
> > wrote:
> >
> > > Hi guys,
> > >
> > > Just noticed that kafka.logs.dir in log4j.properties doesn't take effect
> > >
> > > It's always set to *$base_dir/logs* in kafka-run-class.sh
> > >
> > > LOG_DIR=$base_dir/logs
> > > KAFKA_LOG4J_OPTS="-Dkafka.logs.dir=$LOG_DIR $KAFKA_LOG4J_OPTS"
> > >
> > > Best,
> > > Siyuan
> > >
> >
>


Re: Enforcing Network Bandwidth Quota with New Java Producer

2014-11-14 Thread Jun Rao
We have a metric that measures the per-topic byte send rate (after
compression). You can get the values through the producer API.

Thanks,

Jun

On Fri, Nov 14, 2014 at 10:34 AM, Bhavesh Mistry  wrote:

> Hi Kafka Team,
>
> We would like to enforce a per-minute network bandwidth quota on the
> producer side.  How can I do this?  I need some way to count compressed
> bytes on the producer.  I know the callback does not give this ability.
> What is the best way?
>
>
>
> Thanks,
>
> Bhavesh
>
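
To make Jun's pointer concrete, a sketch that dumps the new producer's
metrics map so the per-topic, post-compression byte rate can be picked out;
the exact metric names vary by version, so treat the names printed here as
something to verify rather than a given:

import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class ProducerMetricsDump {
    // print every metric the producer exposes; the per-topic byte rate
    // (bytes/sec after compression) is among them
    public static void dump(KafkaProducer<byte[], byte[]> producer) {
        for (Map.Entry<MetricName, ? extends Metric> e
                : producer.metrics().entrySet()) {
            System.out.println(e.getKey().name() + " " + e.getKey().tags()
                + " = " + e.getValue().value());
        }
    }
}

Sampling that rate once a minute and pausing sends when it exceeds the quota
would be one way to enforce the limit on the producer side.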


Re: Consumer reading from same topic twice without committing

2014-11-14 Thread Chia-Chun Shih
Auto commit syncs the offset to ZooKeeper periodically. Without auto
commit, the consumer still maintains its offset locally, which explains
this behavior.

To be able to rewind, you need to use SimpleConsumer.
On 2014/11/15 at 2:18 AM, "dinesh kumar" wrote:

> Hi,
> I am trying to understand the behavior of the Kafka Consumer in Java.
>
> Consider a scenario where a consumer is reading from a topic with a single
> partition and auto.commit disabled. The consumer reads a message 'A' and
> sleeps for some time, say 10 seconds. Then it reads from the same partition
> once again, without committing the offset of the previous read, and gets a
> message 'B'.
>
> My assumption was that since the previous offset was not committed, we
> would get the same message again, i.e. message 'A' and message 'B' should
> be the same.
>
> I ran a simple test with this assumption but I was surprised to see that
> message 'A' and message 'B' were different (the producer produced message
> 'B' after message 'A').
>
> Can someone explain this behavior? Is there a way to make the consumer
> return message 'A' instead of message 'B' without restarting the consumer
> (which would trigger a rebalance in a multi-partition/multi-consumer
> scenario), say some parameter or a method call to rewind locally?
>
>
> Thanks,
> Dinesh
>
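
To make the SimpleConsumer pointer concrete, a minimal rewind sketch in the
style of the 0.8 SimpleConsumer wiki example; host, port, topic, partition
and the saved offset are placeholders:

import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class RewindExample {
    public static void main(String[] args) {
        SimpleConsumer consumer = new SimpleConsumer(
            "broker1", 9092, 100000, 64 * 1024, "rewind-client");
        long offsetOfA = 42L; // offset noted when 'A' was first read
        FetchRequest req = new FetchRequestBuilder()
            .clientId("rewind-client")
            .addFetch("my-topic", 0, offsetOfA, 100000)
            .build();
        FetchResponse resp = consumer.fetch(req);
        for (MessageAndOffset mo : resp.messageSet("my-topic", 0)) {
            // compressed message sets can include earlier offsets; skip them
            if (mo.offset() < offsetOfA) continue;
            System.out.println("fetched offset " + mo.offset());
        }
        consumer.close();
    }
}

The high-level consumer has no public rewind call in 0.8, so recording
offsets yourself and re-fetching with SimpleConsumer is the usual workaround.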


Re: Getting Simple consumer details using MBean

2014-11-14 Thread Jun Rao
So, you want to monitor the mbeans on the broker side? Take a look at
http://kafka.apache.org/documentation.html#monitoring

Thanks,

Jun

On Thu, Nov 13, 2014 at 10:58 PM, Madhukar Bharti 
wrote:

> Hi Jun Rao,
>
> Sorry to disturb you, but in my Kafka setup it is not showing. I am
> attaching screenshots taken from all brokers.
>
> In kafka.consumer it is listing only "ReplicaFetcherThread".
>
> As I said earlier, I am using the "2.10-0.8.1.1" version. Do I need to
> configure any extra parameter for this? I am simply using the same
> configuration as described in the wiki page.
>
>
>
> Thanks and Regards,
> Madhukar
>
>
> On Fri, Nov 14, 2014 at 1:17 AM, Jun Rao  wrote:
>
>> I tried running kafka-simple-consumer-shell. I can see the following
>> mbean.
>>
>>
>> "kafka.consumer":type="FetchRequestAndResponseMetrics",name="SimpleConsumerShell-AllBrokersFetchRequestRateAndTimeMs"
>>
>> Thanks,
>>
>> Jun
>>
>> On Wed, Nov 12, 2014 at 9:57 PM, Madhukar Bharti <
>> bhartimadhu...@gmail.com>
>> wrote:
>>
>> > Hi Jun Rao,
>> >
>> > Thanks for your quick reply.
>> >
>> > I am not able to see any bean named "SimpleConsumer". Is there any
>> > configuration related to this?
>> >
>> > How can I see this bean listed in the JConsole window?
>> >
>> >
>> > Thanks and Regards
>> > Madhukar
>> >
>> > On Thu, Nov 13, 2014 at 6:06 AM, Jun Rao  wrote:
>> >
>> > > Those are for 0.7. In 0.8, you should see something like
>> > > FetchRequestRateAndTimeMs in SimpleConsumer.
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Wed, Nov 12, 2014 at 5:14 AM, Madhukar Bharti <
>> > bhartimadhu...@gmail.com
>> > > >
>> > > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I want to get the simple consumer details using MBean as described
>> here
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Monitoring
>> > > > >.
>> > > > But these bean names are not showing in JConsole as well as while
>> > trying
>> > > to
>> > > > read from JMX.
>> > > >
>> > > > Please help me to get simple consumer details.
>> > > >
>> > > > I am using Kafka 0.8.1.1 version.
>> > > >
>> > > >
>> > > > Thanks and Regards,
>> > > > Madhukar Bharti
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks and Regards,
>> > Madhukar Bharti
>> > Mob: 7845755539
>> >
>>
>
>
>
> --
> Thanks and Regards,
> Madhukar Bharti
> Mob: 7845755539
>
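
These beans live in the JVM of the consumer process, not on the brokers,
which matches the brokers above only listing ReplicaFetcherThread (itself a
fetcher running inside the broker). As a shortcut to hunting through the
JConsole tree, a sketch that lists the kafka.consumer mbeans registered in
the JVM running the SimpleConsumer; note the fetch metrics beans are only
registered after the consumer has issued at least one request, which may be
why nothing shows up:

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class ListConsumerMBeans {
    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // dump every registered bean whose domain mentions kafka.consumer;
        // FetchRequestAndResponseMetrics should appear in this list
        for (ObjectName name : server.queryNames(null, null)) {
            if (name.getDomain().contains("kafka.consumer")) {
                System.out.println(name);
            }
        }
    }
}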


Re: Location of Logging Files/How To Turn On Logging For Kafka Components

2014-11-14 Thread Jun Rao
You need to set the location of log4j.properties in a JVM variable. Take a
look at the kafka scripts in bin/.

Thanks,

Jun

On Thu, Nov 13, 2014 at 10:53 PM, Alex Melville  wrote:

> Hi Jun,
>
> These are the two lines of log4j-related warnings I get when I try to run
> my producer:
>
> log4j:WARN No appenders could be found for logger
> (kafka.utils.VerifiableProperties).
>
> log4j:WARN Please initialize the log4j system properly.
>
>
> I have searched extensively online and have so far not found how to
> "initialize the log4j system" properly. All I want is to create debug
> logging so I can better see why my producer fails to send messages to the
> broker cluster.
>
>
>
> Alex
>
> On Thu, Nov 6, 2014 at 3:31 PM, Jun Rao  wrote:
>
> > The log4j entries before that error should tell you the cause of the
> error.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Nov 4, 2014 at 11:25 PM, Alex Melville 
> > wrote:
> >
> > > Background:
> > >
> > > I have searched for a while online, and through the files located in
> the
> > > kafka/logs directory, trying to find where kafka writes log output to
> in
> > > order to debug the SimpleProducer I wrote. My producer is almost
> > identical
> > > to the simple producer located here
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
> > >
> > > except that I'm using protocol buffers instead of Strings to publish
> > > data to a cluster. I'm receiving the following error when I try to run
> > > the SimpleProducer:
> > >
> > > Exception in thread "main" kafka.common.FailedToSendMessageException:
> > > Failed to send messages after 3 tries.
> > >
> > > at
> > >
> > >
> >
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
> > >
> > > at kafka.producer.Producer.send(Producer.scala:76)
> > >
> > > at kafka.javaapi.producer.Producer.send(Producer.scala:33)
> > >
> > > at stream.SimpleProducer.send(Unknown Source)
> > >
> > > at stream.SimpleProducer.main(Unknown Source)
> > >
> > >
> > > I know this isn't a network problem, because I ran the console-producer
> > and
> > > successfully published data to the same broker that my Simple Producer
> is
> > > trying to publish to. I now want to try to debug this error.
> > >
> > >
> > >
> > > Question:
> > >
> > > Where would my Simple Producer write info about its startup and
> eventual
> > > error, such that I can read it and try to reason as to why it failed?
> > > If it produces no log data on its own, what is the best way to write
> > > this data to somewhere I can use it to debug? I've noticed that log4j,
> > > which I understand is an often-used library for logging in Java, came
> > > with my kafka download. Am I supposed to use log4j for this info? I do
> > > not know very much about log4j, so any info on how to get this set up
> > > would also be appreciated.
> > >
> > >
> > > -Alex
> > >
> >
>
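
On the "No appenders could be found" warning: log4j needs either a
log4j.properties on the classpath (or one pointed to with
-Dlog4j.configuration=file:/path/to/log4j.properties) or programmatic
configuration before the producer is created. A minimal programmatic sketch:

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class ProducerWithLogging {
    public static void main(String[] args) {
        // console appender on the root logger, root level DEBUG
        BasicConfigurator.configure();
        // make the kafka.* loggers verbose so failed sends explain themselves
        Logger.getLogger("kafka").setLevel(Level.DEBUG);
        // ... build the ProducerConfig and send messages here ...
    }
}

With this in place the client logs which broker it tried, the error it got
back, and each retry, which usually pins down why the three send attempts
fail.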


Re: log4j dir?

2014-11-14 Thread Ben Drees
Hi,

I had trouble with this as well.  The version of Kafka I'm running insists
on using 'kafka/logs', so I create a soft link from there to the desired
destination directory:

# kafka scripts hard-code the logs dir, so point that path to where we want
the logs to be.
ln -s $STREAM_BUFFER_LOGS_DIR kafka/logs

-Ben


On Fri, Nov 14, 2014 at 11:17 AM, hsy...@gmail.com  wrote:

> Does anyone have any idea how to configure the log4j file dir?
>
> On Thu, Nov 13, 2014 at 4:58 PM, hsy...@gmail.com 
> wrote:
>
> > Hi guys,
> >
> > Just noticed that kafka.logs.dir in log4j.properties doesn't take effect
> >
> > It's always set to *$base_dir/logs* in kafka-run-class.sh
> >
> > LOG_DIR=$base_dir/logs
> > KAFKA_LOG4J_OPTS="-Dkafka.logs.dir=$LOG_DIR $KAFKA_LOG4J_OPTS"
> >
> > Best,
> > Siyuan
> >
>


Re: Offset manager movement (due to change in KAFKA-1469)

2014-11-14 Thread Joel Koshy
I just wanted to follow up on a fallout caused by the issue mentioned
below. After the offset manager moved, consumer offsets started going
to a different partition of the offsets topic.

However, the previous partition of the offsets topic can still have
the old offsets.  Those won't necessarily get compacted out (if the
dirtiness threshold is not met).

Now suppose multiple leader changes occur and both the new (correct)
offsets partition and the old offsets partition happen to move to the
same broker (in that order). We load the offsets into the offsets
cache on a leader change. So if the order of leader changes is new
partition followed by old partition, then the old offsets end up
overwriting the correct offsets in the cache with old (most likely out
of range) offsets.

The above issue affected some of our consumers (especially ones that
have auto.offset.reset set to smallest).

In order to fix the issue completely I ended up writing this tool to
purge bad offsets:

https://gist.github.com/jjkoshy/a3f64d67fe494da3c3a6

In order to produce the tombstones the broker needs to allow producer
requests to the __consumer_offsets topic; fortunately, we had not yet
picked up KAFKA-1580, so the above worked for us.

Thanks,

Joel


On Mon, Sep 22, 2014 at 03:36:46PM -0700, Joel Koshy wrote:
> I just wanted to send this out as an FYI but it does not affect any
> released versions.
> 
> This only affects those who release off trunk and use Kafka-based
> consumer offset management.  KAFKA-1469 fixes an issue in our
> Utils.abs code. Since we use this method in determining the offset
> manager for a consumer group, the fix can yield a different offset
> manager if you happen to run off trunk and upgrade across the fix.
> This won't affect all groups, but those that happen to hash to a value
> that is affected by the bug fixed in KAFKA-1469.
> 
> (Sort of related - we may want to consider not using hashcode on the
> group and switch to a more standard hashing algorithm but I highly
> doubt that hashcode values on a string will change in the future.)
> 
> Thanks,
> 
> -- 
> Joel
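
For context, a sketch of how the offsets partition is derived from the group
name, mirroring Utils.abs(group.hashCode) % offsets.topic.num.partitions; the
pre-KAFKA-1469 abs misbehaved for hash codes like Integer.MIN_VALUE, which is
why some groups ended up with a different offset manager after the fix:

public class OffsetsPartition {
    // post-KAFKA-1469 style abs: masking the sign bit means even
    // Integer.MIN_VALUE maps to a non-negative value, unlike Math.abs
    static int abs(int n) {
        return n & 0x7fffffff;
    }

    public static int partitionFor(String group, int numOffsetsPartitions) {
        return abs(group.hashCode()) % numOffsetsPartitions;
    }

    public static void main(String[] args) {
        // 50 is the default offsets.topic.num.partitions
        System.out.println(partitionFor("my-consumer-group", 50));
    }
}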



Re: Programmatic Kafka version detection/extraction?

2014-11-14 Thread Magnus Edenhill
Hi Gwen,

the protocol version in an InfoResponse message would be the maximum
supported mainline protocol version; each change to the
effective protocol specification bumps the version by one.


The current per-request-type version ID is unfortunately useless to the
client; it only allows the broker to act differently
on different versions of a known request type.

Since the broker currently does not handle unknown request types gracefully
- it simply disconnects the client, which from the client's perspective
could mean anything and nothing - the addition of an InfoRequest/Response
would allow the client to know if a desired broker feature is available or
not, and thus allows the client library to provide the application (and in
the end the user) with a usable error message (compare "Broker disconnected"
to "offsetCommit not supported by 0.8.0 broker").
Having the broker return a fabricated response with the header.err field
set to UNKNOWN_REQUEST_TYPE would be very useful as well, of course.


Regards,
Magnus



2014-11-14 18:17 GMT+01:00 Gwen Shapira :

> I'm not sure there's a single "protocol version". Each
> request/response has its own version ID embedded in the request
> itself.
>
> Broker version, broker ID and (as Joel suggested) git hash are all
> reasonable. And +1 for adding this to API to support non-JVM clients.
>
> Gwen
>
> On Fri, Nov 14, 2014 at 8:46 AM, Magnus Edenhill 
> wrote:
> > Please let's not add dependencies on more third party protocols/software,
> > the move away from Zookeeper dependence on the clients is very much
> welcomed
> > and it'd be a pity to see a new foreign dependency added for such a
> trivial
> > thing
> > as propagating the version of a broker to its client.
> >
> > My suggestion is to add a new protocol request which returns:
> >  - broker version
> >  - protocol version
> >  - broker id
> >
> > A generic future-proof solution would be the use of tags (or named TLVs):
> > InfoResponse: [InfoTag]
> > InfoTag:
> >intX tag  ( KAFKA_TAG_BROKER_VERSION, KAFKA_TAG_PROTO_VERSION,
> > KAFKA_TAG_SSL, ... )
> >intX type( KAFKA_TYPE_STR, KAFKA_TYPE_INT32, KAFKA_TYPE_INT64,
> ...)
> >intX len
> >bytes payload
> >
> > Local site/vendor tags could be defined in the configuration file,
> >
> >
> > This would also allow clients to enable/disable features based on
> protocol
> > version,
> > e.g., there is no point in trying offsetCommit on a 0.8.0 broker and the
> > client library
> > can inform the application about this early, rather than having its
> > offsetCommit requests
> > fail by connection teardown (which is not much of an error propagation).
> >
> >
> > My two cents,
> > Magnus
> >
> >
> > 2014-11-12 20:11 GMT+01:00 Mark Roberts :
> >
> >> I haven't worked much with JMX before, but some quick googling (10-20
> >> minutes) is very inconclusive as to how I would go about getting the
> server
> >> version I'm connecting to from a Python client.  Can someone please
> >> reassure me that it's relatively trivial for non Java clients to query
> JMX
> >> for the server version of every server in the cluster? Is there a reason
> >> not to include this in the API itself?
> >>
> >> -Mark
> >>
> >> On Wed, Nov 12, 2014 at 9:50 AM, Joel Koshy 
> wrote:
> >>
> >> > +1 on the JMX + gradle properties. Is there any (seamless) way of
> >> > including the exact git hash? That would be extremely useful if users
> >> > need help debugging and happen to be on an unreleased build (say, off
> >> > trunk)
> >> >
> >> > On Wed, Nov 12, 2014 at 09:34:35AM -0800, Gwen Shapira wrote:
> >> > > Actually, Jun suggested exposing this via JMX.
> >> > >
> >> > > On Wed, Nov 12, 2014 at 9:31 AM, Gwen Shapira <
> gshap...@cloudera.com>
> >> > wrote:
> >> > >
> >> > > > Good question.
> >> > > >
> >> > > > The server will need to expose this in the protocol, so Kafka
> clients
> >> > will
> >> > > > know what they are talking to.
> >> > > >
> >> > > > We may also want to expose this in the producer and consumer, so
> >> people
> >> > > > who use Kafka's built-in clients will know which version they
> have in
> >> > the
> >> > > > environment.
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Wed, Nov 12, 2014 at 9:09 AM, Mark Roberts 
> >> > wrote:
> >> > > >
> >> > > >> Just to be clear: this is going to be exposed via some Api the
> >> clients
> >> > > >> can call at startup?
> >> > > >>
> >> > > >>
> >> > > >> > On Nov 12, 2014, at 08:59, Guozhang Wang 
> >> > wrote:
> >> > > >> >
> >> > > >> > Sounds great, +1 on this.
> >> > > >> >
> >> > > >> >> On Tue, Nov 11, 2014 at 1:36 PM, Gwen Shapira <
> >> > gshap...@cloudera.com>
> >> > > >> wrote:
> >> > > >> >>
> >> > > >> >> So it looks like we can use Gradle to add properties to
> manifest
> >> > file
> >> > > >> and
> >> > > >> >> then use getResourceAsStream to read the file and parse it.
> >> > > >> >>
> >> > > >> >> The Gradle part would be something like:
> >> > > >> >> jar.manifest {
> >> > > >> >>attributes('Implem

Re: log4j dir?

2014-11-14 Thread hsy...@gmail.com
Does anyone have any idea how to configure the log4j file dir?

On Thu, Nov 13, 2014 at 4:58 PM, hsy...@gmail.com  wrote:

> Hi guys,
>
> Just noticed that kafka.logs.dir in log4j.properties doesn't take effect
>
> It's always set to *$base_dir/logs* in kafka-run-class.sh
>
> LOG_DIR=$base_dir/logs
> KAFKA_LOG4J_OPTS="-Dkafka.logs.dir=$LOG_DIR $KAFKA_LOG4J_OPTS"
>
> Best,
> Siyuan
>


Enforcing Network Bandwidth Quota with New Java Producer

2014-11-14 Thread Bhavesh Mistry
Hi Kafka Team,

We would like to enforce a per-minute network bandwidth quota on the
producer side.  How can I do this?  I need some way to count compressed
bytes on the producer.  I know the callback does not give this ability.
What is the best way?



Thanks,

Bhavesh


Consumer reading from same topic twice without committing

2014-11-14 Thread dinesh kumar
Hi,
I am trying to understand the behavior of the Kafka Consumer in Java.

Consider a scenario where a consumer is reading from a topic with a single
partition and auto.commit disabled. The consumer reads a message 'A' and
sleeps for some time, say 10 seconds. Then it reads from the same partition
once again, without committing the offset of the previous read, and gets a
message 'B'.

My assumption was that since the previous offset was not committed, we would
get the same message again, i.e. message 'A' and message 'B' should be the
same.

I ran a simple test with this assumption but I was surprised to see that
message 'A' and message 'B' were different (the producer produced message 'B'
after message 'A').

Can someone explain this behavior? Is there a way to make the consumer
return message 'A' instead of message 'B' without restarting the consumer
(which would trigger a rebalance in a multi-partition/multi-consumer
scenario), say some parameter or a method call to rewind locally?


Thanks,
Dinesh


Re: Broker keeps rebalancing

2014-11-14 Thread Chen Wang
We are using
zookeeper-3.4.6

more zookeeper logs here:
2014-11-14 09:42:28,233 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
client /10.93.83.43:43487 which had sessionid 0x149a4cc1b581b60
2014-11-14 09:42:28,234 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x149a4cc1b581b5f, likely client has closed socket
at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:780)
2014-11-14 09:42:28,234 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
client /10.93.83.43:43486 which had sessionid 0x149a4cc1b581b5f
2014-11-14 09:42:48,689 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket connection
from /10.93.83.50:46935
2014-11-14 09:42:49,280 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old
client /10.93.83.50:46935; will be dropped if server is in r-o mode
2014-11-14 09:42:49,280 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to establish
new session at /10.93.83.50:46935
2014-11-14 09:42:49,311 [myid:1] - INFO
 [CommitProcessor:1:ZooKeeperServer@617] - Established session
0x149a4cc1b581b62 with negotiated timeout 4 for client /
10.93.83.50:46935
2014-11-14 09:45:19,746 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x149a4cc1b581b61, likely client has closed socket
at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:780)
2014-11-14 09:45:19,747 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
client /10.93.83.50:46331 which had sessionid 0x149a4cc1b581b61
2014-11-14 09:47:25,995 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x149a4cc1b581b5b, likely client has closed socket
at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:780)
2014-11-14 09:47:25,996 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
client /10.93.83.50:44094 which had sessionid 0x149a4cc1b581b5b


Chen

On Thu, Nov 13, 2014 at 5:25 PM, Jun Rao  wrote:

> Which version of ZK are you using?
>
> Thanks,
>
> Jun
>
> On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang 
> wrote:
>
> > Thanks for the info.
> > It makes sense, however, I didn't see any "session timeout"/"expired"
> > entries in consumer log..
> > but do see lots of connection closed entry in zookeeper log:
> >
> > 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
> > client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> > 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket
> > connection
> > from /10.93.80.121:38437
> > 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old
> > client /10.93.80.121:38437; will be dropped if server is in r-o mode
> > 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to
> establish
> > new session at /10.93.80.121:38437
> > 2014-11-13 10:08:04,538 [myid:1] - INFO
> >  [CommitProcessor:1:ZooKeeperServer@617] - Established session
> > 0x149a4cc1b580e7e with negotiated timeout 4 for client /
> > 10.93.80.121:38437
> > 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
> > client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
> >
> > We are using -Xmx2048m for consumer, and I didn't see any GC related
> > exceptions
> >
> > Chen
> >
> >
> >
> > On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang 
> wrote:
> >
> > > Hey Chen,
> > >
> > > As Neha suggested, typical reason of too many rebalances is that your
> > > consumers kept being timed out from ZK, and you can verify this by
> > checking
> > 
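
For reference, a sketch of the high-level consumer settings usually tuned
when rebalances are driven by ZooKeeper session churn like the log above;
the values are illustrative, not recommendations:

import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class RebalanceTunedConsumer {
    public static ConsumerConnector create() {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zkhost:2182");
        props.put("group.id", "my-group");
        // longer session timeout rides out GC pauses and network blips
        props.put("zookeeper.session.timeout.ms", "30000");
        props.put("zookeeper.sync.time.ms", "2000");
        // retry rebalances more patiently before giving up
        props.put("rebalance.max.retries", "8");
        props.put("rebalance.backoff.ms", "4000");
        return Consumer.createJavaConsumerConnector(
            new ConsumerConfig(props));
    }
}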

Re: Programmatic Kafka version detection/extraction?

2014-11-14 Thread Gwen Shapira
I'm not sure there's a single "protocol version". Each
request/response has its own version ID embedded in the request
itself.

Broker version, broker ID and (as Joel suggested) git hash are all
reasonable. And +1 for adding this to API to support non-JVM clients.

Gwen

On Fri, Nov 14, 2014 at 8:46 AM, Magnus Edenhill  wrote:
> Please let's not add dependencies on more third party protocols/software,
> the move away from Zookeeper dependence on the clients is very much welcomed
> and it'd be a pity to see a new foreign dependency added for such a trivial
> thing
> as propagating the version of a broker to its client.
>
> My suggestion is to add a new protocol request which returns:
>  - broker version
>  - protocol version
>  - broker id
>
> A generic future-proof solution would be the use of tags (or named TLVs):
> InfoResponse: [InfoTag]
> InfoTag:
>intX tag  ( KAFKA_TAG_BROKER_VERSION, KAFKA_TAG_PROTO_VERSION,
> KAFKA_TAG_SSL, ... )
>intX type( KAFKA_TYPE_STR, KAFKA_TYPE_INT32, KAFKA_TYPE_INT64, ...)
>intX len
>bytes payload
>
> Local site/vendor tags could be defined in the configuration file,
>
>
> This would also allow clients to enable/disable features based on protocol
> version,
> e.g., there is no point in trying offsetCommit on a 0.8.0 broker and the
> client library
> can inform the application about this early, rather than having its
> offsetCommit requests
> fail by connection teardown (which is not much of an error propagation).
>
>
> My two cents,
> Magnus
>
>
> 2014-11-12 20:11 GMT+01:00 Mark Roberts :
>
>> I haven't worked much with JMX before, but some quick googling (10-20
>> minutes) is very inconclusive as to how I would go about getting the server
>> version I'm connecting to from a Python client.  Can someone please
>> reassure me that it's relatively trivial for non Java clients to query JMX
>> for the server version of every server in the cluster? Is there a reason
>> not to include this in the API itself?
>>
>> -Mark
>>
>> On Wed, Nov 12, 2014 at 9:50 AM, Joel Koshy  wrote:
>>
>> > +1 on the JMX + gradle properties. Is there any (seamless) way of
>> > including the exact git hash? That would be extremely useful if users
>> > need help debugging and happen to be on an unreleased build (say, off
>> > trunk)
>> >
>> > On Wed, Nov 12, 2014 at 09:34:35AM -0800, Gwen Shapira wrote:
>> > > Actually, Jun suggested exposing this via JMX.
>> > >
>> > > On Wed, Nov 12, 2014 at 9:31 AM, Gwen Shapira 
>> > wrote:
>> > >
>> > > > Good question.
>> > > >
>> > > > The server will need to expose this in the protocol, so Kafka clients
>> > will
>> > > > know what they are talking to.
>> > > >
>> > > > We may also want to expose this in the producer and consumer, so
>> people
>> > > > who use Kafka's built-in clients will know which version they have in
>> > the
>> > > > environment.
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Nov 12, 2014 at 9:09 AM, Mark Roberts 
>> > wrote:
>> > > >
>> > > >> Just to be clear: this is going to be exposed via some Api the
>> clients
>> > > >> can call at startup?
>> > > >>
>> > > >>
>> > > >> > On Nov 12, 2014, at 08:59, Guozhang Wang 
>> > wrote:
>> > > >> >
>> > > >> > Sounds great, +1 on this.
>> > > >> >
>> > > >> >> On Tue, Nov 11, 2014 at 1:36 PM, Gwen Shapira <
>> > gshap...@cloudera.com>
>> > > >> wrote:
>> > > >> >>
>> > > >> >> So it looks like we can use Gradle to add properties to manifest
>> > file
>> > > >> and
>> > > >> >> then use getResourceAsStream to read the file and parse it.
>> > > >> >>
>> > > >> >> The Gradle part would be something like:
>> > > >> >> jar.manifest {
>> > > >> >>attributes('Implementation-Title': project.name,
>> > > >> >>'Implementation-Version': project.version,
>> > > >> >>'Built-By': System.getProperty('user.name'),
>> > > >> >>'Built-JDK': System.getProperty('java.version'),
>> > > >> >>'Built-Host': getHostname(),
>> > > >> >>'Source-Compatibility': project.sourceCompatibility,
>> > > >> >>'Target-Compatibility': project.targetCompatibility
>> > > >> >>)
>> > > >> >>}
>> > > >> >>
>> > > >> >> The code part would be:
>> > > >> >>
>> > > >> >>
>> > > >>
>> >
>> this.getClass().getClassLoader().getResourceAsStream("/META-INF/MANIFEST.MF")
>> > > >> >>
>> > > >> >> Does that look like the right approach?
>> > > >> >>
>> > > >> >> Gwen
>> > > >> >>
>> > > >> >> On Tue, Nov 11, 2014 at 10:43 AM, Bhavesh Mistry <
>> > > >> >> mistry.p.bhav...@gmail.com
>> > > >> >>> wrote:
>> > > >> >>
>> > > >> >>> If is maven artifact then you will get following pre-build
>> > property
>> > > >> file
>> > > >> >>> from maven build called pom.properties under
>> > > >> >>> /META-INF/maven/groupid/artifactId/pom.properties folder.
>> > > >> >>>
>> > > >> >>> Here is sample:
>> > > >> >>> #Generated by Maven
>> > > >> >>> #Mon Oct 10 10:44:31 EDT 2011
>> > > >> >>> version=10.0.1
>> > > >> >>> groupId=com.google.gu

Re: Programmatic Kafka version detection/extraction?

2014-11-14 Thread Magnus Edenhill
Please let's not add dependencies on more third party protocols/software;
the move away from Zookeeper dependence on the clients is very much welcomed,
and it'd be a pity to see a new foreign dependency added for such a trivial
thing as propagating the version of a broker to its client.

My suggestion is to add a new protocol request which returns:
 - broker version
 - protocol version
 - broker id

A generic future-proof solution would be the use of tags (or named TLVs):
InfoResponse: [InfoTag]
InfoTag:
   intX tag  ( KAFKA_TAG_BROKER_VERSION, KAFKA_TAG_PROTO_VERSION,
KAFKA_TAG_SSL, ... )
   intX type( KAFKA_TYPE_STR, KAFKA_TYPE_INT32, KAFKA_TYPE_INT64, ...)
   intX len
   bytes payload

Local site/vendor tags could be defined in the configuration file.


This would also allow clients to enable/disable features based on the
protocol version; e.g., there is no point in trying offsetCommit on a 0.8.0
broker, and the client library can inform the application about this early,
rather than having its offsetCommit requests fail by connection teardown
(which is not much of an error propagation).


My two cents,
Magnus
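
To make the tagged-field idea concrete, a toy sketch of encoding a single
InfoTag; the tag and type constants are purely illustrative and not part of
any real Kafka protocol:

import java.nio.ByteBuffer;

public final class InfoTagEncoder {
    static final short TAG_BROKER_VERSION = 1; // hypothetical tag id
    static final short TYPE_STR = 1;           // hypothetical type id

    // tag, type and length headers followed by the raw payload bytes
    public static ByteBuffer encode(short tag, short type, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(2 + 2 + 4 + payload.length);
        buf.putShort(tag);
        buf.putShort(type);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer b = encode(TAG_BROKER_VERSION, TYPE_STR,
            "0.8.1.1".getBytes());
        System.out.println("encoded " + b.remaining() + " bytes");
    }
}

Unknown tags can simply be skipped using the length field, which is what
makes this kind of format future-proof.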


2014-11-12 20:11 GMT+01:00 Mark Roberts :

> I haven't worked much with JMX before, but some quick googling (10-20
> minutes) is very inconclusive as to how I would go about getting the server
> version I'm connecting to from a Python client.  Can someone please
> reassure me that it's relatively trivial for non Java clients to query JMX
> for the server version of every server in the cluster? Is there a reason
> not to include this in the API itself?
>
> -Mark
>
> On Wed, Nov 12, 2014 at 9:50 AM, Joel Koshy  wrote:
>
> > +1 on the JMX + gradle properties. Is there any (seamless) way of
> > including the exact git hash? That would be extremely useful if users
> > need help debugging and happen to be on an unreleased build (say, off
> > trunk)
> >
> > On Wed, Nov 12, 2014 at 09:34:35AM -0800, Gwen Shapira wrote:
> > > Actually, Jun suggested exposing this via JMX.
> > >
> > > On Wed, Nov 12, 2014 at 9:31 AM, Gwen Shapira 
> > wrote:
> > >
> > > > Good question.
> > > >
> > > > The server will need to expose this in the protocol, so Kafka clients
> > will
> > > > know what they are talking to.
> > > >
> > > > We may also want to expose this in the producer and consumer, so
> people
> > > > who use Kafka's built-in clients will know which version they have in
> > the
> > > > environment.
> > > >
> > > >
> > > >
> > > > On Wed, Nov 12, 2014 at 9:09 AM, Mark Roberts 
> > wrote:
> > > >
> > > >> Just to be clear: this is going to be exposed via some Api the
> clients
> > > >> can call at startup?
> > > >>
> > > >>
> > > >> > On Nov 12, 2014, at 08:59, Guozhang Wang 
> > wrote:
> > > >> >
> > > >> > Sounds great, +1 on this.
> > > >> >
> > > >> >> On Tue, Nov 11, 2014 at 1:36 PM, Gwen Shapira <
> > gshap...@cloudera.com>
> > > >> wrote:
> > > >> >>
> > > >> >> So it looks like we can use Gradle to add properties to manifest
> > file
> > > >> and
> > > >> >> then use getResourceAsStream to read the file and parse it.
> > > >> >>
> > > >> >> The Gradle part would be something like:
> > > >> >> jar.manifest {
> > > >> >>attributes('Implementation-Title': project.name,
> > > >> >>'Implementation-Version': project.version,
> > > >> >>'Built-By': System.getProperty('user.name'),
> > > >> >>'Built-JDK': System.getProperty('java.version'),
> > > >> >>'Built-Host': getHostname(),
> > > >> >>'Source-Compatibility': project.sourceCompatibility,
> > > >> >>'Target-Compatibility': project.targetCompatibility
> > > >> >>)
> > > >> >>}
> > > >> >>
> > > >> >> The code part would be:
> > > >> >>
> > > >> >>
> > > >>
> >
> this.getClass().getClassLoader().getResourceAsStream("/META-INF/MANIFEST.MF")
> > > >> >>
> > > >> >> Does that look like the right approach?
> > > >> >>
> > > >> >> Gwen
> > > >> >>
> > > >> >> On Tue, Nov 11, 2014 at 10:43 AM, Bhavesh Mistry <
> > > >> >> mistry.p.bhav...@gmail.com
> > > >> >>> wrote:
> > > >> >>
> > > >> >>> If is maven artifact then you will get following pre-build
> > property
> > > >> file
> > > >> >>> from maven build called pom.properties under
> > > >> >>> /META-INF/maven/groupid/artifactId/pom.properties folder.
> > > >> >>>
> > > >> >>> Here is sample:
> > > >> >>> #Generated by Maven
> > > >> >>> #Mon Oct 10 10:44:31 EDT 2011
> > > >> >>> version=10.0.1
> > > >> >>> groupId=com.google.guava
> > > >> >>> artifactId=guava
> > > >> >>>
> > > >> >>> Thanks,
> > > >> >>>
> > > >> >>> Bhavesh
> > > >> >>>
> > > >> >>> On Tue, Nov 11, 2014 at 10:34 AM, Gwen Shapira <
> > gshap...@cloudera.com
> > > >> >
> > > >> >>> wrote:
> > > >> >>>
> > > >>  In Sqoop we do the following:
> > > >> 
> > > >>  Maven runs a shell script, passing the version as a parameter.
> > > >>  The shell-script generates a small java class, which is then
> > built
> > > >> >> with a
> > >

Re: Programmatic Kafka version detection/extraction?

2014-11-14 Thread Yury Ruchin
Mark,
For non-Java clients an option could be to expose JMX via a REST API, using
Jolokia as an adapter. This may be helpful:
http://stackoverflow.com/questions/5106847/access-jmx-agents-from-non-java-clients

Joel,
I'm not familiar with the Kafka build infrastructure, but e.g. Jenkins can
easily propagate the Git commit hash it's building to the build runtime using
a predefined build parameter, GIT_COMMIT. Most likely there is a similar
option in other CI tools. This value can then be put into a properties file;
e.g., Maven can do this using its resource filtering mechanism. I believe
Gradle should be even more flexible in this regard.
For non-CI builds things are more tricky, but the idea might be to read the
git CLI output: "git rev-parse HEAD" gives the current commit hash, which
again can be passed to the build runtime via a property/parameter. Here I
assume that Git is installed on the build machine, but this will hold true
in most cases I can imagine.

- Yury

2014-11-12 22:11 GMT+03:00 Mark Roberts :

> I haven't worked much with JMX before, but some quick googling (10-20
> minutes) is very inconclusive as to how I would go about getting the server
> version I'm connecting to from a Python client.  Can someone please
> reassure me that it's relatively trivial for non Java clients to query JMX
> for the server version of every server in the cluster? Is there a reason
> not to include this in the API itself?
>
> -Mark
>
> On Wed, Nov 12, 2014 at 9:50 AM, Joel Koshy  wrote:
>
> > +1 on the JMX + gradle properties. Is there any (seamless) way of
> > including the exact git hash? That would be extremely useful if users
> > need help debugging and happen to be on an unreleased build (say, off
> > trunk)
> >
> > On Wed, Nov 12, 2014 at 09:34:35AM -0800, Gwen Shapira wrote:
> > > Actually, Jun suggested exposing this via JMX.
> > >
> > > On Wed, Nov 12, 2014 at 9:31 AM, Gwen Shapira 
> > wrote:
> > >
> > > > Good question.
> > > >
> > > > The server will need to expose this in the protocol, so Kafka clients
> > will
> > > > know what they are talking to.
> > > >
> > > > We may also want to expose this in the producer and consumer, so
> people
> > > > who use Kafka's built-in clients will know which version they have in
> > the
> > > > environment.
> > > >
> > > >
> > > >
> > > > On Wed, Nov 12, 2014 at 9:09 AM, Mark Roberts 
> > wrote:
> > > >
> > > >> Just to be clear: this is going to be exposed via some Api the
> clients
> > > >> can call at startup?
> > > >>
> > > >>
> > > >> > On Nov 12, 2014, at 08:59, Guozhang Wang 
> > wrote:
> > > >> >
> > > >> > Sounds great, +1 on this.
> > > >> >
> > > >> >> On Tue, Nov 11, 2014 at 1:36 PM, Gwen Shapira <
> > gshap...@cloudera.com>
> > > >> wrote:
> > > >> >>
> > > >> >> So it looks like we can use Gradle to add properties to manifest
> > file
> > > >> and
> > > >> >> then use getResourceAsStream to read the file and parse it.
> > > >> >>
> > > >> >> The Gradle part would be something like:
> > > >> >> jar.manifest {
> > > >> >>attributes('Implementation-Title': project.name,
> > > >> >>'Implementation-Version': project.version,
> > > >> >>'Built-By': System.getProperty('user.name'),
> > > >> >>'Built-JDK': System.getProperty('java.version'),
> > > >> >>'Built-Host': getHostname(),
> > > >> >>'Source-Compatibility': project.sourceCompatibility,
> > > >> >>'Target-Compatibility': project.targetCompatibility
> > > >> >>)
> > > >> >>}
> > > >> >>
> > > >> >> The code part would be:
> > > >> >>
> > > >> >>
> > > >>
> >
> this.getClass().getClassLoader().getResourceAsStream("/META-INF/MANIFEST.MF")
> > > >> >>
> > > >> >> Does that look like the right approach?
> > > >> >>
> > > >> >> Gwen
> > > >> >>
> > > >> >> On Tue, Nov 11, 2014 at 10:43 AM, Bhavesh Mistry <
> > > >> >> mistry.p.bhav...@gmail.com
> > > >> >>> wrote:
> > > >> >>
> > > >> >>> If is maven artifact then you will get following pre-build
> > property
> > > >> file
> > > >> >>> from maven build called pom.properties under
> > > >> >>> /META-INF/maven/groupid/artifactId/pom.properties folder.
> > > >> >>>
> > > >> >>> Here is sample:
> > > >> >>> #Generated by Maven
> > > >> >>> #Mon Oct 10 10:44:31 EDT 2011
> > > >> >>> version=10.0.1
> > > >> >>> groupId=com.google.guava
> > > >> >>> artifactId=guava
> > > >> >>>
> > > >> >>> Thanks,
> > > >> >>>
> > > >> >>> Bhavesh
> > > >> >>>
> > > >> >>> On Tue, Nov 11, 2014 at 10:34 AM, Gwen Shapira <
> > gshap...@cloudera.com
> > > >> >
> > > >> >>> wrote:
> > > >> >>>
> > > >>  In Sqoop we do the following:
> > > >> 
> > > >>  Maven runs a shell script, passing the version as a parameter.
> > > >>  The shell-script generates a small java class, which is then
> > built
> > > >> >> with a
> > > >>  Maven plugin.
> > > >>  Our code references this generated class when we expose
> > > >> "getVersion()".
> > > >> 
> > 
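
A small sketch of the properties-file approach described above: the build
writes a file with the version and GIT_COMMIT (the file name and keys here
are hypothetical), and the client reads it from the classpath at runtime:

import java.io.InputStream;
import java.util.Properties;

public final class BuildInfo {
    public static Properties load() throws Exception {
        Properties props = new Properties();
        // e.g. filtered by Maven/Gradle at build time with
        // version=${project.version} and git.commit=${GIT_COMMIT}
        try (InputStream in = BuildInfo.class
                .getResourceAsStream("/kafka-build.properties")) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        Properties p = load();
        System.out.println("version=" + p.getProperty("version")
            + " git.commit=" + p.getProperty("git.commit"));
    }
}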

Re: Pluggable metadata store

2014-11-14 Thread Stevo Slavić
Isn't it only for offset management?

Kind regards,
Stevo

On Fri, Nov 14, 2014 at 2:16 PM, Sharninder  wrote:

> I haven't been following closely but getting rid of zookeeper is in the
> pipeline. Look up 0.9 plans. They're somewhere on the wiki.
>
> Sent from my iPhone
>
> > On 14-Nov-2014, at 5:18 pm, Stevo Slavić  wrote:
> >
> > Hello Apache Kafka community,
> >
> > Is it already possible to configure/use a different metadata store
> (topics,
> > consumer groups, consumer to partition assignments, etc.) instead of
> > ZooKeeper?
> > If not, are there any plans to make it pluggable in the future?
> >
> > Kind regards,
> > Stevo Slavic
>


Kafka broker cannot start after running out of disk space

2014-11-14 Thread Yury Ruchin
Hello,

I've run into an issue with the Kafka 0.8.1.1 broker. The broker stopped
working after the disk it was writing to ran out of space. I freed up some
space and tried to restart the broker. It started some recovery procedure,
but after a short time I see the following strange error message in the logs:

FATAL kafka.server.KafkaServerStartable  - Fatal error during
KafkaServerStable startup. Prepare to shutdown
java.lang.InternalError: a fault occurred in a recent unsafe memory access
operation in compiled Java code
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
at
kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:188)
at
kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:165)
at
kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
at kafka.log.LogSegment.recover(LogSegment.scala:165)
at kafka.log.Log.recoverLog(Log.scala:179)
at kafka.log.Log.loadSegments(Log.scala:155)
at kafka.log.Log.<init>(Log.scala:64)
at
kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
at
kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at
scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
at
kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
at
kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at
scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at kafka.log.LogManager.loadLogs(LogManager.scala:105)
at kafka.log.LogManager.<init>(LogManager.scala:57)
at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
at
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
at kafka.Kafka$.main(Kafka.scala:46)
at kafka.Kafka.main(Kafka.scala)

and then everything starts over. I've been waiting for a while, but the
broker keeps restarting. How can I bring it back to life?

Thanks!


Re: Pluggable metadata store

2014-11-14 Thread Sharninder
I haven't been following closely but getting rid of zookeeper is in the 
pipeline. Look up 0.9 plans. They're somewhere on the wiki. 

Sent from my iPhone

> On 14-Nov-2014, at 5:18 pm, Stevo Slavić  wrote:
> 
> Hello Apache Kafka community,
> 
> Is it already possible to configure/use a different metadata store (topics,
> consumer groups, consumer to partition assignments, etc.) instead of
> ZooKeeper?
> If not, are there any plans to make it pluggable in the future?
> 
> Kind regards,
> Stevo Slavic


Pluggable metadata store

2014-11-14 Thread Stevo Slavić
Hello Apache Kafka community,

Is it already possible to configure/use a different metadata store (topics,
consumer groups, consumer to partition assignments, etc.) instead of
ZooKeeper?
If not, are there any plans to make it pluggable in the future?

Kind regards,
Stevo Slavic


Re: Kafka to transport binary files

2014-11-14 Thread Magnus Edenhill
Hi Rohit,

librdkafka will produce messages with zero copy (unless compression is
enabled).
And kafkacat will do this if you provide it one or more files as arguments
in producer mode:
  kafkacat -b <broker> -t <topic> -P <file1> <file2> ..

(each file is sent as one message)

Just make sure to keep message.max.bytes synchronized across all clients
and brokers.

Magnus


2014-11-14 2:34 GMT+01:00 Jun Rao :

> Both the Kafka client and broker need to allocate memory for the whole
> message. So, the larger the message, the more memory fragmentation it may
> cause, which can lead to GC/OOME issues.
>
> Thanks,
>
> Jun
>
> On Wed, Nov 12, 2014 at 9:23 PM, Rohit Pujari 
> wrote:
>
> > I'm thinking of using Kafka for transporting binary files (tiff, jpeg,
> > pdf). These files are anywhere between 10 KB and 5 MB. The thought behind
> > considering Kafka is that it serves as a staging area for the files and
> > facilitates asynchronous ingestion in near-real-time.
> >
> > Any thoughts on using Kafka for binary payloads? gotchas, watch outs?
> >
> > Thanks,
> > Rohit Pujari
> > Solutions Architect, Hortonworks
> > rpuj...@hortonworks.com
> >
> >
>
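
To close the loop, a sketch of sending one file per message with the 0.8
Java producer, in the spirit of the kafkacat example above; the broker
address and topic are placeholders, DefaultEncoder is assumed for raw byte[]
payloads, and message.max.bytes (broker) plus fetch.message.max.bytes
(consumers) must accommodate the largest file:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class FileProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092"); // placeholder
        props.put("serializer.class", "kafka.serializer.DefaultEncoder");
        props.put("request.required.acks", "1");
        Producer<byte[], byte[]> producer =
            new Producer<byte[], byte[]>(new ProducerConfig(props));
        // one message per file, as in the kafkacat invocation above
        byte[] payload = Files.readAllBytes(Paths.get(args[0]));
        producer.send(
            new KeyedMessage<byte[], byte[]>("binary-files", payload));
        producer.close();
    }
}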