Hey Karsten,

There could be several reasons this could happen.
1. Did you check the error logs? There are several reasons why the Kafka
stream app may drop incoming messages. Use exactly-once semantics to limit
such cases.
2. Are you sure there was no error when deserializing the records from
`folderTopicName`. You mentioned that it happens only when you start
processing and the other table snapshot works fine. This gives me a feeling
that the first records in the topic might not be deserialized properly.

Regards,
Mangat

On Tue, Mar 26, 2024 at 8:45 AM Karsten Stöckmann <
karsten.stoeckm...@gmail.com> wrote:

> Hi,
>
> thanks for getting back. I'll try and illustrate the issue.
>
> I've got an input topic 'folderTopicName' fed by a database CDC system.
> Messages then pass a series of FK left joins and are eventually sent to an
> output topic like this ('agencies' and 'documents' being KTables):
>
>
>             streamsBuilder //
>                     .table( //
>                             folderTopicName, //
>                             Consumed.with( //
>                                     folderKeySerde, //
>                                     folderSerde)) //
>                     .leftJoin( //
>                             agencies, //
>                             Folder::agencyIdValue, //
>                             AggregateFolder::new, //
>                             TableJoined.as("folder-to-agency"), //
>                             Materializer //
>                                     .<FolderId,
> AggregateFolder>named("folder-to-agency-materialized") //
>                                     .withKeySerde(folderKeySerde) //
>                                     .withValueSerde(aggregateFolderSerde))
> //
>                     .leftJoin( //
>                             documents, //
>                     .toStream(...
>                     .to(...
>
> ...
>
> As far as I understand, left join sematics should be similar to those of
> relational databases, i.e. the left hand value always passes the join with
> the right hand value set as <null> if not present. Whereas what I am
> observing is this: lots of messages on the input topic are even absent on
> the first left join changelog topic
> ('folder-to-agency-materialized-changelog'). But: this seems to happen only
> in case the Streams application is fired up for the first time, i.e.
> intermediate topics do not yet exist. When streaming another table snapshot
> to the input topic, things seem (!) to work as expected...
>
> Best wishes,
> Karsten
>
> Bruno Cadonna <cado...@apache.org> schrieb am Mo., 25. März 2024, 17:01:
>
> > Hi,
> >
> > That sounds worrisome!
> >
> > Could you please provide us with a minimal example that shows the issue
> > you describe?
> >
> > That would help a lot!
> >
> > Best,
> > Bruno
> >
> > On 3/25/24 4:07 PM, Karsten Stöckmann wrote:
> > > Hi,
> > >
> > > are there circumstances that might lead to messages silently (i.e.
> > without
> > > any logged warnings or errors) disappearing from a topology?
> > >
> > > Specifically, I've got a rather simple topology doing a series of FK
> left
> > > joins and notice severe message loss in case the application is fired
> up
> > > for the first time, i.e. intermediate topics not existing yet. I'd
> > > generally expect the message count on the output topic resemble that
> from
> > > the input topic, yet it doesn't (about half only).
> > >
> > > Any hints on this?
> > >
> > > Best wishes
> > > Karsten
> > >
> >
>

Reply via email to