GitHub user tomatophantastico commented on the issue:
https://github.com/apache/metamodel/pull/178
Wrt 1) That unfolding would IMHO add a lot of benefit. In the case of Kafka
this will be especially difficult, as virtually any kind of data can be
encountered. IIRC, in the case of MongoDB the compromise for dealing with the
lack of a schema is to scan a certain number of documents per collection, so
maybe a similar approach can be taken here; a rough sketch of that idea
follows below.
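To make the sampling idea concrete, here is a minimal sketch of how the MongoDB-style compromise could look for Kafka, assuming JSON-valued messages read as strings. The class and names (`KafkaSchemaSampler`, `SAMPLE_SIZE`, `inferColumns`) are hypothetical and not part of this PR or of MetaModel's API:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.LinkedHashSet;
import java.util.Set;

public class KafkaSchemaSampler {

    private static final int SAMPLE_SIZE = 1000; // assumption: how many messages to scan

    /**
     * Derives a column set from the first SAMPLE_SIZE messages, given a
     * consumer already subscribed and positioned at the topic's beginning.
     */
    public static Set<String> inferColumns(KafkaConsumer<String, String> consumer) {
        ObjectMapper mapper = new ObjectMapper();
        Set<String> columns = new LinkedHashSet<>();
        int sampled = 0;
        while (sampled < SAMPLE_SIZE) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            if (records.isEmpty()) {
                break; // topic exhausted before the sample size was reached
            }
            for (ConsumerRecord<String, String> record : records) {
                try {
                    JsonNode node = mapper.readTree(record.value());
                    if (node.isObject()) {
                        // the union of all field names seen so far becomes the schema
                        node.fieldNames().forEachRemaining(columns::add);
                    } else {
                        columns.add("value"); // scalar/array payload: one opaque column
                    }
                } catch (Exception e) {
                    columns.add("value"); // non-JSON payload: one opaque column
                }
                if (++sampled >= SAMPLE_SIZE) {
                    break;
                }
            }
        }
        return columns;
    }
}
```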
Wrt 2) My objection here is that Kafka is built to scale, up to TBs of
data. While the full dataset doesn't have to be held in memory all at once,
all data to be filtered has to be transferred and materialized in local
memory, right? So if there are 10^9 messages on the server and you want to
repeatedly query for a certain key, you'll have to copy all 10^9 messages on
every query.

Your concern was that you might bog down Kafka with too many subscriber
IDs. I just want to point out that this is probably a much bigger performance
issue.
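To illustrate the concern, here is a minimal sketch of what client-side filtering amounts to, assuming a String-keyed, String-valued topic. Every record crosses the network on every query; the names (`FullScanFilter`, `valuesForKey`) and the bootstrap address are hypothetical, not MetaModel or PR code:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class FullScanFilter {

    public static List<String> valuesForKey(String topic, String wantedKey) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        List<String> matches = new ArrayList<>();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Assign all partitions and rewind to the start, so the whole
            // topic is re-read for this one query.
            List<TopicPartition> partitions = new ArrayList<>();
            for (PartitionInfo info : consumer.partitionsFor(topic)) {
                partitions.add(new TopicPartition(topic, info.partition()));
            }
            consumer.assign(partitions);
            consumer.seekToBeginning(partitions);

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                if (records.isEmpty()) {
                    break; // naive end-of-topic heuristic, good enough for a sketch
                }
                for (ConsumerRecord<String, String> record : records) {
                    // Every record is transferred and materialized here, even
                    // though only the matches for wantedKey are kept.
                    if (wantedKey.equals(record.key())) {
                        matches.add(record.value());
                    }
                }
            }
        }
        return matches;
    }
}
```

With 10^9 messages, this loop runs to the end of the topic on every single query, which is the cost I mean.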
---