Hey Guozhang,
Thanks a lot for these insights. We are facing the exact same problem as
Tianji. Our commit frequency is also quite high. We flush around 16K
messages per minute to Kafka at the end of the topology.
Another issue we are facing is that RocksDB is not deleting old data.
We did some more analysis on why the disk utilisation is continuously
increasing. Turns out it's the RocksDB WAL that's utilising most of the
disk space. The LOG.old WAL files are not getting deleted. Ideally they
should have been. RocksDB provides configuration options for purging WAL
files
Mahendra,
The WAL is turned off in KafkaStreams. This file is just the rocksdb log,
you can probably just delete the old ones:
https://github.com/facebook/rocksdb/issues/849
In 0.10.0.1 there is no way of configuring RocksDB via KafkaStreams.
Thanks,
Damian
On Mon, 20 Mar 2017 at 09:22 Mahendra K
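Damian's suggestion above (the `LOG.old.*` files are just RocksDB's rotated info logs, not data, so they can be deleted) could be scripted as a small cleanup task. A minimal sketch, assuming the state directory path is supplied by the caller; the glob pattern matches RocksDB's default rotated-log naming:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class RocksDbLogCleanup {
    // Deletes rotated RocksDB info-log files ("LOG.old.<timestamp>") under a
    // state directory. These are plain-text logs, not SST/data files, so
    // removing them is safe; the current "LOG" file is left untouched.
    public static int deleteOldLogs(Path stateDir) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(stateDir, "LOG.old.*")) {
            for (Path f : files) {
                Files.delete(f);
                deleted++;
            }
        }
        return deleted;
    }
}
```

Running this periodically (e.g. from a cron-triggered task) against each store's directory under `state.dir` keeps the info logs from accumulating.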
Is this possible? I'm wondering about gathering data from a stream into a
series of windowed aggregators: minute, hour and day. A separate process
would start at fixed intervals, query the appropriate state store for
available values and then hopefully clear / zero / reset everything for the
next in
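The minute/hour/day bucketing described above boils down to epoch-aligned bucket arithmetic, which is also how Kafka Streams aligns its time windows. A standalone sketch of that arithmetic (the constants and class name are illustrative):

```java
import java.util.concurrent.TimeUnit;

public class Buckets {
    // Epoch-aligned bucket start for a record timestamp:
    // start = ts - (ts % size). Every record with the same start
    // falls into the same minute/hour/day bucket.
    public static long bucketStart(long timestampMs, long bucketSizeMs) {
        return timestampMs - (timestampMs % bucketSizeMs);
    }

    public static final long MINUTE = TimeUnit.MINUTES.toMillis(1);
    public static final long HOUR   = TimeUnit.HOURS.toMillis(1);
    public static final long DAY    = TimeUnit.DAYS.toMillis(1);
}
```

A separate process running at fixed intervals could then compute the start of the bucket that just closed and query the corresponding window in the state store.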
Hi Guys,
Great information again as usual, very helpful!
Very appreciated, thanks so much!
Tianji
PS: The Kafka Community is simply great!
On Fri, Mar 17, 2017 at 3:00 PM, Guozhang Wang wrote:
> Tianji and Sachin (and also cc'ing people who I remember have reported
> similar RocksDB memory
And since you asked for a pointer, Ali:
http://docs.confluent.io/current/streams/concepts.html#windowing
On Mon, Mar 20, 2017 at 5:43 PM, Michael Noll wrote:
> Late-arriving and out-of-order data is only treated specially for windowed
> aggregations.
>
> For stateless operations such as `KStrea
Late-arriving and out-of-order data is only treated specially for windowed
aggregations.
For stateless operations such as `KStream#foreach()` or `KStream#map()`,
records are processed in the order they arrive (per partition).
-Michael
On Sat, Mar 18, 2017 at 10:47 PM, Ali Akhtar wrote:
> >
Can windows only be used for aggregations, or can they also be used for
foreach(), and such?
And is it possible to get metadata on the message, such as whether or not
it's late, its index/position within the other messages, etc?
On Mon, Mar 20, 2017 at 9:44 PM, Michael Noll wrote:
> And since yo
Hi,
I am trying to use kafka as a messaging platform for my microservices.
There is a problem I am facing in real time: when I try to bring in a new
consumer group to consume from a certain topic, I have to restart the
producer; only then does the new consumer group start consuming. Is there
Jon,
the windowing operation of Kafka's Streams API (in its DSL) aligns
time-based windows to the epoch [1]:
Quoting from e.g. hopping windows (sometimes called sliding windows in
other technologies):
> Hopping time windows are aligned to the epoch, with the lower interval
bound
> being inclusiv
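The epoch alignment quoted above means a record's timestamp deterministically maps to a set of hopping windows. A minimal sketch of that mapping (class and method names are illustrative, not part of the Streams API):

```java
import java.util.ArrayList;
import java.util.List;

public class HoppingWindows {
    // Start times of all epoch-aligned hopping windows [start, start + size)
    // that contain the given timestamp, for a window of sizeMs advancing by
    // advanceMs. Window starts are multiples of the advance interval, with
    // the lower bound inclusive and the upper bound exclusive.
    public static List<Long> windowStartsFor(long ts, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long last = ts - (ts % advanceMs); // latest window starting at or before ts
        for (long s = last; s >= 0 && s + sizeMs > ts; s -= advanceMs) {
            starts.add(s);
        }
        return starts;
    }
}
```

For a 5-minute window advancing by 1 minute, a record at t=130s belongs to the windows starting at 120s, 60s, and 0s.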
Does the consumer or producer log anything when you connect the new
consumer? Is there anything logged in the broker logs?
On Mon, Mar 20, 2017 at 10:58 AM, jeffrey venus
wrote:
> Hi,
> I am trying to use kafka as a messaging platform for my microservices.
> There is a problem I am facing
Hello,
I have configured auto.create.topics.enable=true in my server.properties,
but I'm having issues with it.
When I launch Kafka for the first time and send a message to a topic (to be
created), it doesn't arrive at the topic; but if I launch Kafka and wait a few
minutes to send the mes
Hi,
I am looking to send log files periodically to Kafka. Before I set out to
write my own producer, I was wondering what everyone uses to send log files to
Kafka through an HTTP API. Ideally it should also prune the log and have some
sort of error handling.
Thanks very much.
Tony S. Wu
Hi,
I get ERROR Error when sending message to topic my-topic with key: null,
value: ... bytes with error: (org.apache.kafka.clients.producer.internals.
ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 11 record(s) for
my-topic-0: 1732 ms has passed since last append
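The "Expiring N record(s)" TimeoutException means batches sat in the producer's accumulator longer than the allowed timeout before they could be sent. Settings that commonly help are sketched below; the values are illustrative examples, not recommendations:

```java
import java.util.Properties;

public class ProducerTuning {
    // Example producer settings relevant to batch-expiry TimeoutExceptions.
    // The broker address is a placeholder; tune the values to your workload.
    public static Properties timeoutTolerantConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("request.timeout.ms", "60000"); // allow batches more time before expiry
        props.put("retries", "5");                // retry transient broker errors
        props.put("linger.ms", "5");              // small batching delay
        props.put("max.block.ms", "60000");       // how long send() may block on metadata/memory
        return props;
    }
}
```

If the error persists, it usually points at a slow or unreachable broker for that partition rather than at the producer config itself.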
Hi,
I am Manasa, currently working on a project that requires processing data
from multiple topics at the same time. I am looking for advice on how to
approach this problem. Below is the use case.
We have 4 topics, with data coming in at a different rate in each topic,
but the messages in eac
Are you saying that it should process all messages from topic 1, then
topic 2, then topic 3, then 4?
Or that they need to be processed exactly at the same time?
On Mon, Mar 20, 2017 at 10:05 PM, Manasa Danda
wrote:
> Hi,
>
> I am Manasa, currently working on a project that requires processing
> Can windows only be used for aggregations, or can they also be used for
> foreach(), and such?
As of today, you can use windows only in aggregations.
> And is it possible to get metadata on the message, such as whether or not
> it's late, its index/position within the other messages, etc?
If you
It would be helpful to know the 'start' and 'end' of the current window,
so if an out of order message arrives late, and is being processed in
foreach(), you'd know which window / bucket it belongs to, and can handle
it accordingly.
I'm guessing that's not possible at the moment.
(My use case i
cc: users list
Forwarded Message
Subject: Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API
Date: Mon, 20 Mar 2017 11:51:01 -0700
From: Matthias J. Sax
Organization: Confluent Inc
To: d...@kafka.apache.org
I want to push this discussion further.
Guozhang's argument abo
Ali,
what you describe is (roughly!) how Kafka Streams implements the internal
state stores to support windowing.
Some users have been following a similar approach as you outlined, using
the Processor API.
On Mon, Mar 20, 2017 at 7:54 PM, Ali Akhtar wrote:
> It would be helpful to know the '
Hmm, I must admit I don't like this last update all too much.
Basically we would have:
StreamsBuilder builder = new StreamsBuilder();
// And here you'd define your...well, what actually?
// Ah right, you are composing a topology here, though you are not
// aware of it.
KafkaStreams
Yeah, windowing seems perfect, if only I could find out the current
window's start time (so I can log the current bucket's start & end times)
and process window messages individually rather than as aggregates.
It doesn't seem like I can get this metadata from ProcessorContext though,
from looking
Hello,
I am new to Kafka and am looking for a way for consumers to be able to identify
the producer of each message in a topic. There are a large number of producers
(let's say on the order of millions), and each producer would be connecting via
SSL and using a unique client certificate. Essenti
You can configure Kafka with ACLs that only allow certain users to
produce/consume to certain topics but if multiple producers are allowed to
produce to a shared topic then you cannot identify them without adding
something to the messages.
For example, you can have each producer digitally sign (or
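The per-producer signing Hans starts to describe can be sketched with the JDK's own crypto API: each producer signs the payload with its private key (the same key pair as its client certificate, or a dedicated one) and ships the signature alongside the message; consumers verify against the producer's known public key. A minimal sketch, with illustrative class and method names:

```java
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class MessageSigner {
    // Signs a message payload; the resulting signature would travel with the
    // message (e.g. in an envelope field the consumer can read).
    public static byte[] sign(byte[] payload, PrivateKey key) throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(key);
        sig.update(payload);
        return sig.sign();
    }

    // Verifies a payload against a signature using the producer's public key.
    public static boolean verify(byte[] payload, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(key);
        sig.update(payload);
        return sig.verify(signature);
    }
}
```

With millions of producers, the consumer-side key lookup (e.g. keyed by a producer ID carried in the message) is the part that needs the most design attention.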
I would recommend to try out Kafka's Streams API instead of Spark Streaming.
http://docs.confluent.io/current/streams/index.html
-Matthias
On 3/20/17 11:32 AM, Ali Akhtar wrote:
> Are you saying, that it should process all messages from topic 1, then
> topic 2, then topic 3, then 4?
>
> Or tha
Thanks, Hans.
Signing messages is a good idea. Other than that, is there possibly an
extension point in Kafka itself on the receiving of records, before they are
stored/distributed? I was thinking along the lines of
org.apache.kafka.clients.producer.ProducerInterceptor
but on the server side?
Hi Tony,
We are using a custom logging API that wraps the Kafka producer as a singleton
for each app. The custom API takes structured log data and converts it to JSON,
then writes to an app-specific Kafka topic. That topic is bridged to a Logstash
consumer and the logs get ingested into Elasticsearch. Now these lo
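The singleton-wrapper shape described above can be sketched in plain Java. Here the Kafka producer is stood in for by a pluggable `Consumer<String>` sink (an assumption made so the sketch is self-contained); in the real wrapper the sink would call `producer.send(new ProducerRecord<>(appTopic, json))`:

```java
import java.util.function.Consumer;

public final class AppLogger {
    // Singleton logging wrapper: one instance per app, lazily initialized.
    private static volatile AppLogger instance;
    private final Consumer<String> sink; // stand-in for the Kafka producer

    private AppLogger(Consumer<String> sink) { this.sink = sink; }

    public static AppLogger init(Consumer<String> sink) {
        if (instance == null) {
            synchronized (AppLogger.class) {
                if (instance == null) instance = new AppLogger(sink);
            }
        }
        return instance;
    }

    // Converts a structured event to a minimal JSON line and emits it.
    // (Real code would use a JSON library to handle escaping.)
    public void log(String level, String message) {
        String json = String.format("{\"level\":\"%s\",\"message\":\"%s\"}", level, message);
        sink.accept(json);
    }
}
```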
Nothing on the broker today but if you use Kafka Connect API in 0.10.2 and
above there is a pluggable interface called Transformations.
See org.apache.kafka.connect.transforms in
https://kafka.apache.org/0102/javadoc/index.html?org/apache/kafka/connect
Source Connector transformations happen b
OK, thank you for that. I looked at
org.apache.kafka.connect.transforms.Transformation and
org.apache.kafka.connect.source.SourceRecord, but am not discovering where the
authenticated username (or Principal) might be available for the call to
Transformation.apply()… or does Connect even support