Guozhang,
I guess you are referring to a scenario where noOfThreads < totalNoOfTasks.
We could have a KTable task and a KStream task running on the same thread, and
sleep would be counterproductive?
On this note, will a Processor always run on the same thread? Are process()
and punctuate() guaranteed
Well, that's something our user is asking for, so I'd like to learn how
Kafka Streams has done it. FYI, Gearpump checkpoints the mapping of the event
time clock to the Kafka offset for replay.
We are thinking about allowing users to configure a topic to recover from
after application upgrade.
Meanwhile,
Srikanth,
Note that the same thread may be used for fetching both the "semi-static"
KTable stream as well as the continuous KStream stream, so
sleep-on-no-match may not work.
I think setting timestamps for this KTable to make sure its values are
smaller than those of the KStream stream will work, and there
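A minimal sketch of that timestamp idea (the topic name "table-dump" is an assumption, as is the wiring): the pure decision rule below could be wrapped in a class implementing Kafka Streams' TimestampExtractor and registered via the "timestamp.extractor" config.

```java
// Sketch, not the actual implementation: records from the (assumed) table-dump
// topic get the smallest possible timestamp, so stream time considers them
// "older" than any live KStream record and processes them first.
public class TableFirstTime {
    public static final String TABLE_TOPic_PLACEHOLDER = "table-dump"; // assumed topic name

    // Pure decision rule: table records -> timestamp 0, others keep their own.
    public static long extractTimestamp(String topic, long recordTimestamp) {
        return TABLE_TOPic_PLACEHOLDER.equals(topic) ? 0L : recordTimestamp;
    }
}
```

Inside a real TimestampExtractor you would call this with `record.topic()` and `record.timestamp()`.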
Hi Jahn,
How is the load on Zookeeper? How often are you committing your offsets?
Could that be an issue?
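For reference, a sketch of the knobs involved, assuming the 0.8-era high-level consumer (which commits offsets to ZooKeeper):

```properties
# Values shown are the 0.8 defaults. Every auto-commit is a ZooKeeper write,
# so a much shorter interval, multiplied across many consumers and partitions,
# can load ZK noticeably.
auto.commit.enable=true
auto.commit.interval.ms=60000
```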
Cheers,
Jens
On Mon, 23 May 2016 at 18:12, Jahn Roux wrote:
> I have a large Kafka deployment on virtual hardware: 120 brokers on 32 GB
> memory, 8-core virtual machines.
Thanks Guozhang & Matthias!
For 1), it is going to be a common ask. So a DSL API will be good.
For 2), the source for the KTable is currently a file. Think of it as a dump
from a DB table.
We are thinking of ways to stream updates from this table. But for now it's
a new file every day or so.
I
Hi,
I have a use case that would benefit from reading from a Kafka topic in
reverse order. I saw some similar questions on this in the mailing list,
but didn't see a way to do this discussed with the 0.9 consumer API.
Is it possible to do this with the 0.9 consumer API? I don't need it to be
Thanks
Andy
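The 0.9 consumer has no reverse mode, but one workaround (a sketch under the assumption that approximate reverse order per partition is acceptable) is to seek backwards in fixed-size windows and reverse each fetched batch. The window arithmetic is shown below; the consumer calls (`seek()`, `poll()`, `position()`) are described in comments.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: compute [start, end) offset windows covering [beginning, end),
// last window first. For each window you would consumer.seek(tp, lo), poll
// until position() >= hi, keep records with offset < hi, and iterate that
// batch in reverse.
public class ReverseWindows {
    public static List<long[]> descendingWindows(long beginning, long end, long step) {
        List<long[]> windows = new ArrayList<>();
        for (long hi = end; hi > beginning; hi -= step) {
            long lo = Math.max(beginning, hi - step);
            windows.add(new long[] { lo, hi });
        }
        return windows;
    }
}
```

The beginning and end offsets themselves can be obtained with `seekToBeginning()`/`seekToEnd()` followed by `position()`.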
From: Kamal C
Date: Sunday, May 22, 2016 at 8:36 AM
Subject: Re: newbie: kafka 0.9.0.0 producer does not terminate after
producer.close()
> Andy,
>
> Kafka 0.9.0 server supports the
Hi,
I am running simple experiments to evaluate the scalability of Kafka
consumers with respect to the number of partitions. I assign every consumer
to a specific partition. Each consumer polls the records in its assigned
partition and prints the first one, then polls again from the offset of the
I created a Java producer on my local machine and set up a Kafka broker on
another machine, say M2, on the network (I can ping, SSH, and connect to this
machine). On the producer side, in the Eclipse console, I get "Message sent".
But when I check the console consumer on machine M2, I cannot see those
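One common cause of this symptom (an assumption, since the message is truncated): the producer fetches metadata from M2, but the broker advertises a hostname the client cannot resolve, and "Message sent" is printed without checking the Future returned by send(). On an 0.8/0.9 broker the advertised address can be pinned explicitly in server.properties:

```properties
# Placeholder value: use an address of M2 that the producer machine can resolve.
advertised.host.name=m2.example.internal
advertised.port=9092
```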
Hi All,
Does anyone have any performance metrics running Kafka on Ceph?
I briefly gathered at the 2016 Kafka Summit that there's ongoing work
between the Kafka community and Red Hat on getting Kafka running
successfully on Ceph. Is this correct? If so, what's the timeline for that?
Thanks
Hi Srikanth,
How do you define when the "KTable is read completely" to get to the "current
content"? Since, as you said, the table is not purely static but still has a
maybe-low-traffic update stream, I guess "catch up to current content" still
depends on some timestamp?
BTW about 1), we are
Hi Manu,
That is a good question. Currently, if users change their application logic
while upgrading, the intermediate state stores may no longer be valid, and
hence neither is the change log. In this case we need to wipe out the invalid
internal data and restart from scratch.
We are working on a better
Mike, the schema registry you’re using is a part of Confluent’s platform, which
builds upon but is not affiliated with Kafka itself. You might want to post
your question to Confluent’s mailing list:
https://groups.google.com/d/forum/confluent-platform
HTH,
Avi
On 5/20/16, 21:40, "Mike
I have a large Kafka deployment on virtual hardware: 120 brokers on 32 GB
memory, 8-core virtual machines. Gigabit network, RHEL 6.7. 4 topics, 1200
partitions each, replication factor of 2, running Kafka 0.8.1.2.
We are running into issues where our cluster is not keeping up. We have 4
sets
Matthias,
For (2), how do you achieve this using transform()?
Thanks,
Srikanth
On Sat, May 21, 2016 at 9:10 AM, Matthias J. Sax
wrote:
> Hi Srikanth,
>
> 1) there is no support on DSL level, but if you use Processor API you
> can do "anything" you like. So yes, a
That's one case. Well, if I get it right, the change log's topic name is
bound to the applicationId, so what I want to ask is how the change log works
for scenarios like an application upgrade (with a new applicationId, I guess).
Does the system handle that for the user?
On Mon, May 23, 2016 at 9:21 PM Matthias
If I understand correctly, you want a non-Kafka-Streams consumer
to read a change-log topic that was written by a Kafka Streams application?
That is certainly possible. Kafka is agnostic to Kafka Streams, i.e., all
topics are regular topics and can be read by any consumer.
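Concretely, Kafka Streams names its internal changelog topics after the application id and the store name, so a plain consumer only needs to subscribe to that name (a sketch; verify the convention against your Streams version):

```java
// Sketch: build the changelog topic name a Kafka Streams app uses for a
// given state store, following the "<applicationId>-<storeName>-changelog"
// convention. Any ordinary consumer can then subscribe to this topic.
public class ChangelogTopics {
    public static String changelogTopicFor(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }
}
```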
-Matthias
On
Are you sure the consumers are always up? When they are behind, they can
generate a lot of traffic in a short amount of time.
On Mon, May 23, 2016 at 9:11 AM Anishek Agarwal wrote:
> additionally all the read / writes are happening via storm topologies.
>
> On Mon, May 23, 2016
It puts things on several internal queues. I'd benchmark what kind of rates
you're looking at - we handily do a few hundred thousand per second per
process with a 2GB JVM heap.
On Mon, May 23, 2016 at 1:31 PM, Joe San wrote:
> When you say Threadsafe, I assume that the
When you say thread-safe, I assume that the calls to the send method are
synchronized? If that is the case, I see this as a bottleneck. We have a
high-frequency system, and the frequency at which calls are made to the
send method is very high. This was the reason why I came up with multiple
That's accurate. Why are you creating so many producers? The Kafka producer
is thread safe and *should* be shared to take advantage of batching, so I'd
recommend just having a single producer.
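The shared-producer advice above can be sketched as a tiny lazy holder (generic here so the sketch stays dependency-free; in real code `T` would be a `KafkaProducer<K, V>`):

```java
import java.util.function.Supplier;

// Sketch: construct one expensive, thread-safe client exactly once and hand
// the same instance to every caller, instead of building one per thread.
// Uses the standard double-checked-locking idiom with a volatile field.
public class SharedClient<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    public SharedClient(Supplier<T> factory) { this.factory = factory; }

    public T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    instance = local = factory.get();
                }
            }
        }
        return local;
    }
}
```

All threads then call `shared.get().send(record)` and benefit from the producer's internal batching.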
Thanks
Tom Crayford
Heroku Kafka
On Mon, May 23, 2016 at 10:41 AM, Joe San
Hi there,
You could probably wrangle this with log4j and filters. A single broker
doesn't really have a consistent view of "if the cluster goes down", so
it'd be hard to log that, but you could write some external monitoring that
checks that brokers are up via JMX and logs from there.
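A hypothetical log4j fragment for the filtering idea (logger names should be checked against your broker version before use):

```properties
# Route network/connection events to a dedicated audit file.
log4j.appender.auditAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.auditAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.auditAppender.File=/var/log/kafka/audit.log
log4j.appender.auditAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.auditAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.logger.kafka.network=INFO, auditAppender
```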
Thanks
Tom
Hi,
I am working on a 3-node Kafka broker BOSH release. I want the Kafka broker
events to be saved somewhere for audit logging.
By Kafka events, I mean connection and disconnection logs. Also, if
a node in the Kafka cluster goes down, the event should be logged.
Is there any
In one of our applications, we have the following setting:
# kafka configuration
# ~
kafka {
  # comma separated list of brokers
  # e.g., "localhost:9092,localhost:9032"
  brokers = "localhost:9092,localhost:9032"
  topic = "asset-computed-telemetry"
  isEnabled = true
  # for a detailed
Thanks Matthias. Is there a way to allow users to read change logs from a
previous application ?
On Mon, May 23, 2016 at 3:57 PM Matthias J. Sax
wrote:
> Hi Manu,
>
> Yes. If a StreamTask recovers, it will write to the same change log's
> topic partition. Log compaction
Hi,
you can use the "ProcessorContext" that is provided via init(). The context
object is dynamically adjusted to the currently processed record.
To get the offset, use "context.offset()".
-Matthias
On 05/22/2016 10:38 PM, Florian Hussonnois wrote:
> Hi everyone,
>
> Is there any way to retrieve
Hi Manu,
Yes. If a StreamTask recovers, it will write to the same change log's
topic partition. Log compaction is enabled by default for those topics.
You still might see some duplicates in your output. Currently, Kafka
Streams guarantees at-least-once processing (exactly-once processing is
on
additionally, all the reads / writes are happening via Storm topologies.
On Mon, May 23, 2016 at 12:17 PM, Anishek Agarwal wrote:
> Hello,
>
> we are using 4 kafka machines in production with 4 topics and each topic
> either 16/32 partitions and replication factor of 2. each
Hello,
we are using 4 Kafka machines in production with 4 topics; each topic has
either 16 or 32 partitions and a replication factor of 2. Each machine has 3
disks for Kafka logs.
We see a strange behaviour with high disk-usage spikes on one of
the disks on all machines. It varies over time,