GitHub user koeninger commented on the issue:
https://github.com/apache/spark/pull/15102
> We are not giving the developer the option to manually configure a
> consumer in this way for this PR precisely because I don't think we can while
> still maintaining the semantics that structured streaming has promised.
You've got this backwards. As soon as you give someone the ability to set
auto.offset.reset to largest, it opens up the can of worms as to whether resets
should happen at the beginning of a stream, during a stream, and/or when a
partition is added. Giving people the ability to configure a consumer doesn't
cause that problem, it allows them to solve that problem until such time as the
Kafka project has a unified way to solve it. Similarly, as soon as you allow
pattern subscriptions, it opens up the can of worms as to adding/removing
topics and whether the sql Offset interface as is makes sense for Kafka. Just
saying you aren't going to handle deletions for right now doesn't solve that
problem.
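To make the three reset cases concrete, here is a minimal sketch. It is not Spark or Kafka code: `ResetEvent`, `resolve_offset`, and the `policy` dict are hypothetical names invented for illustration, and the offsets are made up. It only shows that "where do I start reading" can reasonably have a different answer at stream start, mid-stream, and when a partition is added, which a single reset setting cannot express.

```python
from enum import Enum


class ResetEvent(Enum):
    """Hypothetical names for the three situations discussed above."""
    STREAM_START = "stream_start"        # no stored offset when the stream first starts
    MID_STREAM = "mid_stream"            # stored offset became invalid while running
    PARTITION_ADDED = "partition_added"  # a new partition appeared after startup


def resolve_offset(event, policy, earliest, latest):
    """Pick a starting offset for one partition, given a per-event policy.

    `policy` maps each ResetEvent to "earliest", "latest", or "fail".
    """
    choice = policy[event]
    if choice == "earliest":
        return earliest
    if choice == "latest":
        return latest
    raise RuntimeError(f"offset reset not allowed on {event.value}")


# One possible policy (an assumption, not a recommendation): start from the
# newest data, never silently skip data mid-stream, and read a newly added
# partition from its beginning.
policy = {
    ResetEvent.STREAM_START: "latest",
    ResetEvent.MID_STREAM: "fail",
    ResetEvent.PARTITION_ADDED: "earliest",
}

print(resolve_offset(ResetEvent.STREAM_START, policy, earliest=0, latest=500))
print(resolve_offset(ResetEvent.PARTITION_ADDED, policy, earliest=0, latest=500))
```

Someone with direct access to the consumer can implement whichever of these policies they need; someone limited to one global setting cannot.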
If you really don't want to consider changing the Offset interface, and
want to tell people who need the details of Kafka in order to get work done to use
the DStream, then you should probably eliminate all configuration options
except brokers, a list of topics, and maybe SSL.
I'll try one more time, and then I'm done:
- Months ago you came up with an interface that realistically will only
work with Kafka / Kinesis / lookalikes, yet had no implementation for any of
those.
- Actually attempting an implementation raised some notable differences
between what the interface allowed for and what the implementation needed.
- I offered some specific suggestions, including considering changes to the
interface
- I offered to help with implementation
Your response, from my point of view, has been
- Decline to consider changes to the interface
- Decline any assistance with actual implementation
- Only (re)implement a subset of Kafka functionality that you can see is
"safe", regardless of whether it's congruent with the way Kafka is already
being used by users.
Under those circumstances, I'm happy to answer specific directed questions
you may have, but I'm not interested in continuing to argue. If you guys say
you've got this and you're going to do it your way, then you've got it.
Let me know if you change your mind, I'll still be around.