Thanks Sini!
I intended to create a new JIRA, but then changed my mind and just
piggy-backed it onto the existing one, as it's highly related and we might
be able to tackle it in one effort.
-Matthias
On 5/18/17 12:12 AM, Peter Sinoros Szabo wrote:
Hi Michal,
yes, I know it's beyond the scope of KIP-138, but from previous messages
from Matthias I thought that he would create a new ticket; it seems
that instead he added it to KAFKA-3514. I will update that ticket with my
thoughts.
Thanks,
Sini
From: Michal Borowiecki
Hi Sini,
This is beyond the scope of KIP-138, but
https://issues.apache.org/jira/browse/KAFKA-3514 exists to track such
improvements
Thanks,
Michal
On 17 May 2017 5:10 p.m., Peter Sinoros Szabo
wrote:
Hi,
I see, now it's clear why the repeated punctuations use the same time value
in that case.
Do you have a JIRA ticket to track improvement ideas for that?
It would be great to have an option to:
- advance the stream time before calling process() on a new record -
this would prevent to
We use Kafka Streams for quite a few aggregation tasks. For instance,
counting the number of messages with a particular status in a 1-minute time
window.
We have noticed that whenever we restart a stream, we see a sudden spike in
the aggregated numbers. After a few minutes, things are back to
I added the feedback to https://issues.apache.org/jira/browse/KAFKA-3514
-Matthias
On 5/12/17 10:38 AM, Thomas Becker wrote:
Thanks. I think the system-time-based punctuation scheme we were discussing
would not result in repeated punctuations like this, but even using stream time
it seems a bit odd. If you do anything in a punctuate call that is relatively
expensive, it's especially bad.
Thanks for sharing.
As punctuate is called with "stream time", you see the same time value
multiple times. It's again due to the coarse-grained advance of "stream
time".
@Thomas: I think, the way we handle it just simplifies the
implementation of punctuations. I don't see any other "advantage".
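The coarse-grained advance described above can be illustrated without Kafka. The class below is a minimal simulation (not Kafka code; the class name and the firing rule are my assumptions based on this thread) of a scheduler that, when stream time jumps by several intervals at once, fires one punctuation per missed interval, each with the already-advanced stream time:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of coarse-grained "stream time" advance causing
// repeated punctuations that all carry the same timestamp.
public class StreamTimeSim {
    static final long INTERVAL = 1000L;   // as in context.schedule(1000)
    static long streamTime = 0L;
    static long nextPunctuation = INTERVAL;
    static final List<Long> punctuationTimestamps = new ArrayList<>();

    // Called after a record is processed: stream time jumps to the record's
    // timestamp, then every missed interval fires a punctuation, each one
    // observing the *current* (already advanced) stream time.
    static void advanceStreamTime(long recordTimestamp) {
        streamTime = Math.max(streamTime, recordTimestamp);
        while (nextPunctuation <= streamTime) {
            punctuationTimestamps.add(streamTime);  // same value each firing
            nextPunctuation += INTERVAL;
        }
    }

    public static void main(String[] args) {
        advanceStreamTime(500);    // below first interval: no punctuation
        advanceStreamTime(3200);   // jumps past 3 intervals: fires 3 times
        System.out.println(punctuationTimestamps);  // [3200, 3200, 3200]
    }
}
```

All three firings see 3200, not 1000/2000/3000, which is the behavior Peter observed.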
Well, this is also a good question, because it is triggered with the same
timestamp 3 times, so in order to create my update for all three seconds,
I will have to count the number of punctuations and calculate the missed
stream times myself. It's OK for me to trigger it 3 times, but the
I'm a bit troubled by the fact that it fires 3 times despite the stream time
being advanced all at once; is there a scenario when this is beneficial?
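The counting workaround mentioned above (tracking how many times punctuate fired and back-computing the stream times each firing should have had) could be sketched like this. `intendedTimestamps` is a hypothetical helper of mine, not a Kafka API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: given the last punctuation timestamp actually
// handled, the (repeated) timestamp now being reported, and the schedule
// interval, reconstruct the timestamps the firings *should* have carried.
public class PunctuationRebase {
    static List<Long> intendedTimestamps(long lastHandled, long reported, long interval) {
        List<Long> out = new ArrayList<>();
        // one entry per missed interval, ending at or before the reported time
        for (long t = lastHandled + interval; t <= reported; t += interval) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        // punctuate reported 3200 three times after last handling time 0:
        System.out.println(intendedTimestamps(0L, 3200L, 1000L)); // [1000, 2000, 3000]
    }
}
```

The caller would pop one entry per punctuate invocation, turning three identical callbacks back into per-second updates.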
From: Matthias J. Sax [matth...@confluent.io]
Sent: Friday, May 12, 2017 12:38 PM
To:
Hi,
It is actually critical for me (and of course I would like to understand
it too), because using this design the "first message" will be processed
in a bad window/segment (I mean the timeframe between the punctuate()
calls): it will be processed in the "3 seconds before segment" instead of
Hi Peter,
It's by design. Streams internally tracks time progress (the so-called
"stream time"). "Stream time" gets advanced *after* processing a record.
Thus, in your case, "stream time" is still at its old value before it
processed the first message of the "burst" you sent. After that, "stream
time"
Hi,
Let's assume the following case.
- a stream processor that uses the Processor API
- context.schedule(1000) is called in the init()
- the processor reads only one topic that has one partition
- using custom timestamp extractor, but that timestamp is just a wall
clock time
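The setup listed above can also be simulated without Kafka. The sketch below encodes two assumptions of mine from this thread (segments are the intervals between punctuate() calls, and stream time advances only *after* process() returns) to show how the first record of a burst lands in the old segment:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch (assumptions, not Kafka code): each record is assigned to the
// segment that is open at process() time. Because stream time advances
// only after a record is processed, the first record of a burst is
// attributed to the segment that was open before the burst.
public class SegmentSim {
    static final long INTERVAL = 1000L;  // as in context.schedule(1000)
    static long streamTime = 0L;
    static final List<String> log = new ArrayList<>();

    static void process(long recordTs) {
        // segment open *before* stream time advances
        long openSegment = (streamTime / INTERVAL) * INTERVAL;
        log.add("record@" + recordTs + " -> segment " + openSegment);
        streamTime = Math.max(streamTime, recordTs);  // advance afterwards
    }

    public static void main(String[] args) {
        process(500);   // record@500  -> segment 0
        process(3200);  // record@3200 -> segment 0 (stream time was still 500)
        process(3300);  // record@3300 -> segment 3000
        System.out.println(log);
    }
}
```

The second record carries timestamp 3200 but is booked into segment 0, which is the "processed in the 3-seconds-before segment" effect discussed above.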
Imagine the