Github user tdas commented on the issue:
https://github.com/apache/spark/pull/11863
1. I didn't quite get what you meant by "But your description of what the
code is currently doing is not accurate, and your recommendation does not
meet the use cases." I just collapsed the three cases into two: when the
user has NO PREFERENCES (the system SHOULD figure out how to schedule
partitions on the same executors consistently), and SOME PREFERENCES
(because of co-located brokers, or skew, or whatever). Why doesn't this
recommendation meet the criteria?
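To make the two cases concrete, here is a hypothetical sketch (the names `LocationPreference`, `NoPreference`, `SomePreferences`, and `assign` are illustrative, not Spark's actual API) of how the no-preference case can still yield a consistent partition-to-executor mapping, while the some-preference case honors user-supplied hosts:

```scala
// Hypothetical sketch of the two scheduling cases from point 1.
sealed trait LocationPreference

// NO PREFERENCES: the system should map each partition to the same
// executor consistently across batches.
case object NoPreference extends LocationPreference

// SOME PREFERENCES: the user supplies a host per (topic, partition),
// e.g. for co-located brokers or to handle skew.
final case class SomePreferences(hostFor: Map[(String, Int), String])
    extends LocationPreference

// For NoPreference, hash the topic-partition onto the executor list so
// the mapping is stable as long as the executor list is stable.
def assign(pref: LocationPreference,
           topicPartition: (String, Int),
           executors: IndexedSeq[String]): Option[String] = pref match {
  case NoPreference if executors.nonEmpty =>
    val n = executors.size
    val idx = ((topicPartition.hashCode % n) + n) % n // non-negative index
    Some(executors(idx))
  case NoPreference => None
  case SomePreferences(hostFor) => hostFor.get(topicPartition)
}
```

The point of the collapse is that both cases fit one interface: "no preferences" is just a deterministic default rather than a third special case.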
2. I agree with the argument that there is a whole lot of stuff you cannot
do without exposing a () => Consumer function. But that's where the question
of API stability comes in. At this late stage of the 2.0 release, I would
much rather provide a simpler API for simpler use cases that we know will
not break, than an API that supports everything but is more prone to
breaking if Kafka breaks its API. We can always start simple and then add
more advanced interfaces in the future.
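The trade-off in point 2 can be sketched as follows. This is an illustration, not Spark's actual API; `Consumer` here is a stand-in trait for a third-party client class, and the two `createStream*` functions are hypothetical:

```scala
// Stand-in for a third-party client class (e.g. a Kafka consumer).
trait Consumer { def poll(): Seq[String] }

// Option A: maximally flexible, but the signature leaks the third-party
// Consumer type. Any incompatible change to that type breaks this API.
def createStreamFlexible(consumerFactory: () => Consumer): Seq[String] =
  consumerFactory().poll()

// Option B: a simpler, stable surface. The caller passes plain
// parameters; the third-party type never appears in the signature, so
// it can evolve without breaking callers. (The body here is a dummy
// that just echoes the sorted keys, to keep the sketch self-contained.)
def createStreamSimple(params: Map[String, String]): Seq[String] = {
  val consumer: Consumer = new Consumer {
    def poll(): Seq[String] = params.keys.toSeq.sorted
  }
  consumer.poll()
}
```

Option A supports everything the underlying client supports; Option B supports only what the parameters express, but its signature cannot be broken by upstream changes.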
3. Wrapping things up with extra Spark classes and interfaces is a cost we
have to pay in order to prevent API breakage in the future. It is an
investment we are making in every part of Spark - SparkSession (using a
builder pattern instead of exposing a constructor), SQL data sources (never
exposing any 3rd-party classes), etc. It's a hard-learnt lesson.
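The SparkSession-style builder mentioned in point 3 can be sketched like this (a generic illustration with a hypothetical `Session` class, not Spark's actual code): the constructor stays private, so the project can change construction details later without breaking callers.

```scala
// Builder pattern sketch: only the builder can construct a Session,
// so the constructor's shape is free to change in future versions.
final class Session private (val settings: Map[String, String])

object Session {
  final class Builder private[Session] () {
    private var settings = Map.empty[String, String]

    // Each setter returns the builder for chaining.
    def config(key: String, value: String): Builder = {
      settings += (key -> value)
      this
    }

    def getOrCreate(): Session = new Session(settings)
  }

  def builder(): Builder = new Builder
}

// Usage: Session.builder().config("master", "local").getOrCreate()
```

Because user code only ever touches `builder()`, `config`, and `getOrCreate()`, new construction options can be added without a single breaking change.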