Hi,
We are using Kafka 0.8.1.1 in our production cluster. I recently started
specifying the key as the message itself. I just realised that the key is
also written to the broker, which means the data is duplicated within a
keyed message. I am going to change the key. Stupid mistake.
However,
Hi all,
I'm using Kafka 0.8.2.1 in production.
My Kafka config is pretty much vanilla, so (as far as I understand) offsets
are being committed to ZooKeeper.
As recommended, I want to start committing offsets to Kafka instead of
ZooKeeper.
I was surprised to see that the __consumer_offsets topic
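(For reference, the documented migration path in 0.8.2 is a two-phase
consumer config change; the property names below are the standard ones:)
offsets.storage=kafka
dual.commit.enabled=true    # phase 1: commit to both Kafka and ZooKeeper
# once every consumer in the group is running with the settings above:
dual.commit.enabled=false   # phase 2: commit to Kafka only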
Thanks, Jiangjie,
Yes, we had reduced segment.index.bytes to 1K in order to maintain a more
frequent offset index, which was required for the ability to fetch start and
end offsets for a given span of time, say 15 mins. Ideally, changing only
index.interval.bytes to 1K should have been
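(Both settings are per-topic overrides in 0.8.x, so the change can be applied
with kafka-topics.sh; the topic name here is illustrative:)
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic mytopic --config index.interval.bytes=1024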
To make my question clearer:
I know how to increase the partitions and the replication factor of any
plain old topic.
I'm worried that making changes to this internal topic could cause
problems, so I'm looking for advice.
Thanks,
*Daniel Coldham*
On Tue, Jun 23, 2015 at 3:15 PM, Daniel
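(For the plain-topic case, the usual tools are kafka-topics.sh to add
partitions and kafka-reassign-partitions.sh with a hand-written assignment
JSON to raise the replication factor; whether the same is safe for the
internal __consumer_offsets topic is exactly the question. Host, topic, and
file names below are illustrative:)
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic mytopic --partitions 12
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-rf.json --execute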
Hey Mohit,
Unfortunately, I don't think there's any such configuration.
By the way, there are some pretty cool things you can do with keys in Kafka
(such as semantic partitioning and log compaction). I don't know if they
would help in your use case, but it might be worth checking out
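(As an illustration of semantic partitioning with the 0.8.x Scala producer:
all messages with the same key land in the same partition, and log compaction
retains the latest value per key. The broker address, topic, and key below
are illustrative:)
import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

val props = new Properties()
props.put("metadata.broker.list", "broker1:9092")               // illustrative
props.put("serializer.class", "kafka.serializer.StringEncoder")
val producer = new Producer[String, String](new ProducerConfig(props))

// "user-42" is a hypothetical key: every message for this user goes to
// the same partition, preserving per-user ordering.
producer.send(new KeyedMessage[String, String]("user-events", "user-42", "clicked:home"))
producer.close()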
Yes and no. We're running a version about a month behind trunk at any given
time here at LinkedIn. That's generally the amount of time we spend testing
and going through our release process internally (less if there are no
problems). So it can be done.
That said, we also have several Kafka
Hi Mohit,
If you instantiate the keyed message with
import kafka.producer.KeyedMessage
val topic = "my-topic"     // illustrative
val value = "my message"   // illustrative
val message = new KeyedMessage[String, String](topic, value)
Then the key in the KeyedMessage will be null.
Hope this helps!
Thanks,
Liquan
On Tue, Jun 23, 2015 at 8:18 AM, Mohit Kathuria
Yes, new features are a big part of it, and sometimes bug
fixes/improvements. The bug fixes are mostly a consequence of being on
trunk, but some aren't necessarily introduced on trunk. For example, we
would like to do a broader roll-out of the new producer, but KAFKA-2121
(adding a request timeout to NetworkClient)
Thanks, Joel. I remember a case where we had a difference like this between
two brokers, and it was not due to retention settings or some other obvious
problem, but I can't remember exactly what we determined the cause was.
-Todd
On Mon, Jun 22, 2015 at 4:22 PM, Joel Koshy jjkosh...@gmail.com wrote:
Out of curiosity, why do you want to run trunk?
General fondness for cutting edge stuff? Or are there specific
features in trunk that you need?
Gwen
On Tue, Jun 23, 2015 at 2:59 AM, Achanta Vamsi Subhash
achanta.va...@flipkart.com wrote:
I am planning to use it for the producer part. How stable is
Hi,
I was just wondering if there is any difference in the memory footprint of
a high level consumer when:
1. the consumer is live and continuously consuming messages with no backlog
2. the consumer has been down for quite some time and needs to be brought up
to clear the backlog.
My test case
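(A rough model for why the two cases differ, based on how the 0.8.x
high-level consumer prefetches: each stream buffers up to
queued.max.message.chunks fetch chunks of up to fetch.message.max.bytes
each. A consumer clearing a backlog tends to sit at that upper bound, while
a caught-up consumer receives mostly small chunks. Illustrative settings:)
fetch.message.max.bytes=1048576    # upper bound on a single fetched chunk
queued.max.message.chunks=2        # chunks buffered per stream
# worst-case buffered payload per stream is roughly the product of the two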
@Gwen
I want to backport this JIRA https://issues.apache.org/jira/browse/KAFKA-1865
to 0.8.2.1. Instead of patching, I was thinking we could run against trunk,
since I see other producer changes have also been pushed there. We are
facing latency problems with the current producer (sent out a separate
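(For what it's worth, the new Java producer in 0.8.2 exposes the
latency-relevant settings directly; a minimal sketch, with broker address
and values illustrative:)
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "broker1:9092")   // illustrative
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("linger.ms", "5")   // small batching delay; 0 minimizes latency
props.put("acks", "1")        // leader-only acks; -1 waits for all replicas

val producer = new KafkaProducer[String, String](props)
producer.send(new ProducerRecord[String, String]("my-topic", "hello"))
producer.close()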
Hello
Is there a good reference for best practices on running Java consumers?
I'm thinking a FAQ format.
- How should we run them? We are currently running them in Tomcat on
Ubuntu; are there other approaches, such as running them as services? Maybe
the service wrapper
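(One entry such a FAQ should probably have: wherever the consumer runs, shut
the connector down cleanly so offsets are committed and partitions are
released promptly. A minimal sketch with the 0.8.x high-level consumer;
group id and ZooKeeper address are illustrative:)
import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}

val props = new Properties()
props.put("zookeeper.connect", "localhost:2181")   // illustrative
props.put("group.id", "my-group")                  // illustrative
val connector = Consumer.create(new ConsumerConfig(props))

// Commit offsets and release partition ownership when the container
// (Tomcat, a service wrapper, etc.) stops the JVM.
sys.addShutdownHook {
  connector.shutdown()
}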
It seems you might have run that on the last log segment. Can you run
it on 21764229.log on both brokers and compare? I'm
guessing there may be a message-set with a different compression codec
that may be causing this.
Thanks,
Joel
On Tue, Jun 23, 2015 at 01:06:16PM +0530, nirmal
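(The invocation is along these lines; the log directory path below is
illustrative:)
bin/kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files /var/kafka-logs/mytopic-0/00000000000021764229.log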
I don't know of any such resource, but I'll be happy to help
contribute from my experience.
I'm sure others would too.
Do you want to start one?
Gwen
On Tue, Jun 23, 2015 at 2:03 PM, Tom McKenzie thomaswmcken...@gmail.com wrote:
Hello
Is there a good reference for best practices on running
I have the following code snippet that uses the Kafka producer to send a
message (no key is specified in the KeyedMessage):
val data = new KeyedMessage[String, String](topicName, msg)
Kafka_Producer.send(data)
Kafka_Producer is an instance of kafka.producer.Producer.
With the above code, I observed that
Hi,
I ran DumpLogSegments.
*Broker 1*
offset: 23077447 position: 1073722324 isvalid: true payloadsize: 431 magic: 0 compresscodec: NoCompressionCodec crc: 895349554
*Broker 2*
offset: 23077447 position: 1073740131 isvalid: true payloadsize: 431 magic: 0 compresscodec: NoCompressionCodec crc:
Hi,
We are using the batch producer of 0.8.2.1 and we are seeing very bad
latencies on our topics. We have ~40K partitions now in a 20-node cluster.
- We have many topics, and the publish rates vary widely; e.g., some topics
take 10k messages/sec and others 2,000/minute.
- We are seeing
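(If this is the old Scala producer in async mode, the settings that usually
govern batching latency are below; values are illustrative:)
producer.type=async
queue.buffering.max.ms=100    # max time a message waits in the async send queue
batch.num.messages=200        # batch size that forces a send before the timeout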
I am planning to use it for the producer part. How stable is trunk generally?
--
Regards
Vamsi Subhash
Hi,
I have 3 high-level consumers with the same group id. One of the consumers
goes down; I know a rebalance will kick in on the remaining two consumers.
What happens if one of the remaining consumers is very slow during
rebalancing and hasn't released ownership of some of the topics? Will the
It does balance data, but is sticky over short periods of time (for some
definition of short...). See this FAQ for an explanation:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified
This behavior has been
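(Note the old producer only consults partitioner.class when a key is
present; with a null key it uses the sticky random behavior described in
that FAQ entry. A minimal sketch of a custom partitioner, class name
illustrative:)
import kafka.producer.Partitioner
import kafka.utils.VerifiableProperties

// Hypothetical partitioner, enabled via partitioner.class=com.example.HashPartitioner.
class HashPartitioner(props: VerifiableProperties) extends Partitioner {
  override def partition(key: Any, numPartitions: Int): Int =
    (key.hashCode & 0x7fffffff) % numPartitions   // mask keeps the result non-negative
}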
Partition assignment currently has only a few limited options -- see the
partition.assignment.strategy consumer option (which seems to be listed
twice; see the second version for a more detailed explanation). There has
been some discussion of making assignment strategies user-extensible to
support
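(In 0.8.2 the documented values are "range" and "roundrobin", set in the
consumer config:)
partition.assignment.strategy=roundrobin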
Hello!
I'm working with a topic of highly variable partition sizes. My biggest
concern is that I have no control over which keys are assigned to which
consumers in my consumer group, as the amount of data a consumer sees is
directly reflected in its workload. Is there a way to distribute