GitHub user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/11115#issuecomment-202639815
I just talked with @rxin and we decided that this flag isn't quite ready to
be exposed yet. The problem is that the default value (false) has surprising
semantics on stage retries: if you're counting the number of rows in your data
and your stage was run twice, you might get 2X the actual number of rows. This
is what #11105 is looking at, and we might proceed from there.
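To make the failure mode concrete, here's a minimal sketch of how a retried
stage can double-count. It uses the Spark 1.x accumulator API; the job itself
is made up for illustration and is not from this PR:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RetryDoubleCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("retry-double-count").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Accumulator updated inside a transformation (not an action).
    val rowCount = sc.accumulator(0L, "rowCount")

    sc.parallelize(1 to 1000, 4)
      .map { x =>
        rowCount += 1L // applied once per task *attempt*, not per task
        x
      }
      .count()

    // If any task or stage is re-executed (e.g. after a fetch failure),
    // its accumulator updates are applied again, so rowCount.value can
    // exceed the true row count -- up to 2X if the whole stage runs twice.
    println(s"rowCount = ${rowCount.value}")

    sc.stop()
  }
}
```

Accumulators updated inside transformations have no exactly-once guarantee
across retries, which is exactly the semantics the proposed flag's default
would expose to users.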
Either way I think we should try to fix the default behavior before
exposing a flag that we won't be able to change in the future. Let's keep the
issue open but close this PR for now.