Ah. Sorry.
You are right. Nevertheless, you can set a non-null dummy value like
`byte[0]` instead of the actual "tuple" so that you don't blow up your
storage requirements.
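
To illustrate why the marker must be non-null, here is a minimal sketch in plain Python (not the Kafka Streams API) of how a changelog is replayed into a table. The function name `materialize` and the sample ids are hypothetical; the only behaviour taken from this thread is that a null value is interpreted as a delete.

```python
from typing import Dict, Iterable, Optional, Tuple


def materialize(changelog: Iterable[Tuple[str, Optional[bytes]]]) -> Dict[str, bytes]:
    """Replay (key, value) records into a table; a None value acts as a delete."""
    table: Dict[str, bytes] = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)  # tombstone: drop the key if it is present
        else:
            table[key] = value    # upsert, even when the value is empty
    return table


# Hypothetical changelog of tracked ids: b"" plays the role of `byte[0]`.
tracked = materialize([
    ("id-1", b""),   # tracked, stored with an empty non-null marker
    ("id-2", b""),   # tracked for now
    ("id-2", None),  # null value: id-2 is removed from the table
])
```

In this sketch only `id-1` remains in `tracked`, and its stored value costs zero bytes of payload.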
-Matthias
On 4/30/17 10:24 AM, Michal Borowiecki wrote:
Apologies, I must not have made myself clear.
I meant the values in the records coming from the input topic (which in
turn are coming from Kafka Connect in the example at hand)
and not the records coming out of the join.
My intention was to warn against sending null values from Kafka Connect
Your observation is correct.
If you use an inner KStream-KTable join, the join implements the
filter automatically, as it will not return any result for stream
records whose key is not in the table.
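
A rough sketch of that filtering effect, written as a plain Python model rather than with the Kafka Streams API: the names `inner_join`, `events` and `tracked` are made up for illustration, and the table is a fixed dict here, whereas a real KTable changes over time.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def inner_join(
    stream: Iterable[Tuple[str, str]],
    table: Dict[str, bytes],
    joiner: Callable[[str, bytes], str],
) -> Iterator[Tuple[str, str]]:
    """Emit a joined record only when the stream record's key exists in the table."""
    for key, value in stream:
        if key in table:
            yield key, joiner(value, table[key])
        # No match: nothing is emitted, so the record is filtered out.


tracked = {"id-1": b""}  # table of tracked ids with dummy values
events = [("id-1", "a"), ("id-9", "b"), ("id-1", "c")]

# The joiner ignores the dummy table value and passes the event through.
kept = list(inner_join(events, tracked, lambda event, _marker: event))
```

Here `kept` holds only the two `id-1` events; the `id-9` event produces no output at all.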
-Matthias
On 4/30/17 7:23 AM, Michal Borowiecki wrote:
I have something working on the same principle (except not using
connect), that is, I put ids to filter on into a ktable and then (inner)
join a kstream with that ktable.
I don't believe the value can be null though. In a changelog, a null value
is interpreted as a delete so won't be put into a kt
>> I'd like to avoid repeated trips to the db, and caching a large amount of
>> data in memory.
Lookups to the DB would be hard to get right anyway. I.e., they would not
perform well, as all your calls would need to be synchronous...
>> Is it possible to send a message w/ the id as the partition key
I'd like to avoid repeated trips to the db, and caching a large amount of
data in memory.
Is it possible to send a message w/ the id as the partition key to a topic,
and then use the same id as the key, so the same node which will receive
the data for an id is the one which will process it?
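
As a sketch of the idea behind this question: records with the same key land in the same partition when the partition is derived from a hash of the key. Kafka's default partitioner uses a murmur2 hash of the serialized key; the example below substitutes `zlib.crc32` purely as a stand-in, and `partition_for` is a hypothetical helper, not a Kafka API.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition deterministically (crc32 is a stand-in hash)."""
    return zlib.crc32(key) % num_partitions


# The data topic and the id topic must use the same partition count and the
# same hashing scheme, otherwise equal keys may end up on different nodes.
NUM_PARTITIONS = 8  # assumed value for illustration
data_partition = partition_for(b"id-42", NUM_PARTITIONS)
lookup_partition = partition_for(b"id-42", NUM_PARTITIONS)
```

Because both topics are keyed by the same id and partitioned the same way, `data_partition` and `lookup_partition` are equal, so one node sees both the id and its data.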
On F
The recommended solution would be to use Kafka Connect to load your DB
data into a Kafka topic.
With Kafka Streams you read your db-topic as a KTable and do an (inner)
KStream-KTable join to look up the IDs.
-Matthias
On 4/27/17 2:22 PM, Ali Akhtar wrote:
I have a Kafka topic which will receive a large amount of data.
This data has an 'id' field. I need to look up the id in an external db,
see if we are tracking that id, and if yes, we process that message, if
not, we ignore it.
99% of the data will be for ids which are not being tracked - 1% or s