You can look into KStreamReduceProcessor#process() and
KStreamAggregateProcessor#process(). This should help you see the input
data for each instance.
count()/aggregate() will use KStreamAggregateProcessor, while reduce()
will use KStreamReduceProcessor.
Hope this helps.
-Matthias
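To make the reduce()/aggregate() distinction above concrete, here is a minimal plain-Java simulation of the DSL semantics (this is not Kafka Streams code, and the HashMap merely stands in for the local state store): reduce() combines two values of the same type per key, while aggregate() folds into a separately typed accumulator, with count() being a special case of aggregate().

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java sketch of what KStreamReduceProcessor#process() does per
// record: look up the key's current value in the (local) store and
// merge the new value in with the user-supplied reducer.
public class ReduceVsAggregate {
    static <K, V> Map<K, V> reduce(Iterable<Map.Entry<K, V>> records,
                                   BiFunction<V, V, V> reducer) {
        Map<K, V> store = new HashMap<>(); // stands in for the local state store
        for (Map.Entry<K, V> r : records) {
            store.merge(r.getKey(), r.getValue(), reducer);
        }
        return store;
    }

    public static void main(String[] args) {
        // Two records for "li-1", one for "li-2" (hypothetical keys).
        Map<String, Integer> out = reduce(
            List.of(Map.entry("li-1", 1), Map.entry("li-1", 1), Map.entry("li-2", 1)),
            Integer::sum);
        System.out.println(out.get("li-1") + " " + out.get("li-2")); // 2 1
    }
}
```

The same skeleton with a differently typed accumulator (e.g. Long for count()) is what the aggregate path does.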
On 6/20/17 10:54 PM, Sameer Kumar wrote:
Hi Matthias,
I am working on 10.2.1, and I do see the same outputs on the same keys,
albeit not always and somewhat at random.
I shall try to code a simpler example, free from domain-level code, and
share it with you in a while.
I am willing to debug this further as well, if you could tell me the
classes that I should look into.
Hi Sameer,
With regard to
>>> What I saw was that while on Machine1, the counter was 100 , another
>>> machine it was at 1. I saw it as inconsistent.
If you really see the same key on different machines, that would be
incorrect. All records with the same key must be processed by the same
machine.
Continued from my last mail...
The code snippet that I shared was after joining the impression and
notification logs. Here I am picking the line item and concatenating it
with the date. You can also see there is a check for a
TARGETED_LINE_ITEM; I am not emitting the data otherwise.
-Sameer.
On Sat, Jun
The example I gave was just for illustration. I have impression logs and
notification logs. Notification logs are essentially tied to impressions
served. An impression would serve multiple items.
I was just trying to aggregate across a single line item, which means I
am always generating a single key.
I just double-checked your example code from an earlier email. There you
are using:
stream.flatMap(...)
      .groupBy((k, v) -> k, Serdes.String(), Serdes.Integer())
      .reduce((value1, value2) -> value1 + value2,
In your last email, you say that you want to count on a category that is
contained in the value.
OK, let me try to explain it again.
So, let's say my source processor has a different key. The value it
carries contains an identifier type, which basically denotes the
category, and I am counting on different categories. A specific case
would be: I do a filter and output only a specific
I'm not sure if I fully understand this but let me check:
- if you start 2 instances, one instance will process half of the partitions,
the other instance will process the other half
- for any given key, like key 100, it will only be processed on one of the
instances, not both.
Does this help?
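The per-key determinism described above comes from how a record's key is mapped to a partition. A minimal sketch of that mapping follows; note that Kafka's DefaultPartitioner actually applies murmur2 to the serialized key bytes, so String.hashCode() here is only a stand-in used to show the determinism, not the real hash.

```java
// Simplified illustration of why a given key always lands on the same
// partition, and therefore on the same instance: the partition is a
// pure function of the key and the partition count.
public class KeyToPartition {
    static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit so the result is a valid partition index.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p1 = partitionFor("key-100", 60);
        int p2 = partitionFor("key-100", 60);
        // The same key always maps to the same partition, and each
        // partition is owned by exactly one instance at a time.
        System.out.println(p1 == p2); // true
    }
}
```

Since each partition is assigned to exactly one instance, a single key can never legitimately be counted on two machines at once.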
Also, I am writing a single key to the output all the time. I believe
machine 2 will also have to write that key, and since a state store is
local, it wouldn't know about the counter state on the other machine.
So, I guess this will happen.
-Sameer.
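The symptom Sameer describes (counter at 100 on one machine, 1 on the other) can be illustrated with a plain-Java sketch: if the same key were, incorrectly, processed on two instances, each instance's local state store would keep its own independent counter. The key and counts below are hypothetical; in a correctly partitioned setup this divergence should never occur, because one key is owned by exactly one instance.

```java
import java.util.HashMap;
import java.util.Map;

// Each HashMap stands in for one instance's local state store.
public class LocalStoreDivergence {
    public static void main(String[] args) {
        Map<String, Long> storeMachine1 = new HashMap<>();
        Map<String, Long> storeMachine2 = new HashMap<>();

        // Suppose machine 1 saw 100 records for the key and machine 2
        // saw 1 (the inconsistent split reported in the thread).
        for (int i = 0; i < 100; i++) {
            storeMachine1.merge("li-1|2017-06-13", 1L, Long::sum);
        }
        storeMachine2.merge("li-1|2017-06-13", 1L, Long::sum);

        // The stores never see each other's updates.
        System.out.println(storeMachine1.get("li-1|2017-06-13")); // 100
        System.out.println(storeMachine2.get("li-1|2017-06-13")); // 1
    }
}
```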
On Thu, Jun 15, 2017 at 11:11 AM, Sameer Kumar wrote:
The input topic contains 60 partitions, and the data is well distributed
across partitions on different keys. During consumption, I am doing some
filtering and writing only single-key data.
Output on Machine 1 would be something of the form:
2017-06-13 16:49:10 INFO LICountClickImprMR2:116 - l
Hi,
I witnessed some strange behaviour in Kafka Streams and need help
understanding it.
I created an application for aggregating clicks per user, and I want to
process it only for one user (I was writing only a single key).
When I ran the application on one machine, it was running fine. Now, to
load balance