[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

marmbrus Thu, 22 Sep 2016 17:29:34 -0700

Github user marmbrus commented on the issue:

    https://github.com/apache/spark/pull/15102
  
    For streaming you already know what the global order is, because you know 
when you asked for A and B.  I agree that we should probably remove the 
comparable requirement from `Offset` in favor of just having equality.  At the 
time it was a useful safeguard, but it's clearly causing more confusion than 
anything.
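
To make the equality-only idea concrete, here is a minimal sketch in Python rather than Spark's actual Scala code. The class name `KafkaOffsetSketch`, its `positions` field, and the `retrieved` list are hypothetical and exist only for illustration. The sketch shows an offset type that supports `==` but defines no ordering, with the global order coming from the sequence in which the offsets were retrieved.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (topic, partition); hypothetical alias used only in this sketch
TopicPartition = Tuple[str, int]


@dataclass
class KafkaOffsetSketch:
    """Hypothetical stand-in for a source offset: equality only, no ordering."""
    positions: Dict[TopicPartition, int]


# Two offsets that cover different topics: there is no sensible "less than".
a = KafkaOffsetSketch({("topicA", 0): 5})
b = KafkaOffsetSketch({("topicB", 0): 1})

# The stream already knows the global order: it is the order of retrieval.
retrieved = [a, b]


def retrieved_before(x: KafkaOffsetSketch, y: KafkaOffsetSketch) -> bool:
    """Order offsets by when they were asked for, not by their contents."""
    return retrieved.index(x) < retrieved.index(y)
```

With this shape, `a == b` is well defined, while `a < b` raises a `TypeError` because no ordering was declared; `retrieved_before(a, b)` answers the ordering question from the retrieval sequence alone.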
    
    Assuming `A` was retrieved before `B`, it seems like you would emit a 
warning that data was possibly missed from A (since it was deleted before we 
could get it) and start a new batch on topic B from offsets 0-1.  Right?
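
The behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the Kafka source's implementation: `plan_next_batch`, the dictionary layout, and the warning text are all hypothetical, and unseen partitions are assumed to start at offset 0.

```python
from typing import Dict, List, Tuple

# (topic, partition) -> offset; hypothetical layout used only in this sketch
TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]


def plan_next_batch(
    earlier: Offsets, later: Offsets
) -> Tuple[List[str], Dict[TopicPartition, Tuple[int, int]]]:
    """Compare two offset snapshots using only the order they were retrieved in.

    Returns warnings for partitions that vanished, plus the (start, end)
    offset range to read for each partition present in the later snapshot.
    """
    warnings = []
    for tp in sorted(earlier):
        if tp not in later:
            # The partition disappeared before we could read up to its end.
            warnings.append(
                f"{tp[0]}-{tp[1]}: data after offset {earlier[tp]} may have been missed"
            )

    ranges = {}
    for tp in sorted(later):
        # Assumption: a partition we have never seen starts at offset 0.
        start = earlier.get(tp, 0)
        ranges[tp] = (start, later[tp])
    return warnings, ranges


# Snapshot A was retrieved first; topic A was then deleted and topic B created.
a = {("A", 0): 5}
b = {("B", 0): 1}
warnings, ranges = plan_next_batch(a, b)
```

Here `warnings` holds one entry for the vanished partition of topic A, and `ranges` maps `("B", 0)` to `(0, 1)`, matching the "new batch on topic B from offsets 0-1" case. No comparison between the two snapshots is needed beyond knowing which one came first.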
    
    > Or do you actually think that stuff like option("assign", 
"topicA:1:1,topicA:2:2,topicB:3:3") makes it clear what the arguments are?
    
    We don't support assign.  When we do add that support, that format is not 
super easy to follow, though I don't feel strongly that it's better or worse 
than JSON, as long as it's unambiguous.
    
    Are there arguments that we do support that you think are confusing?




