GitHub user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/12655#issuecomment-214101076
  
    I think some of the terminology used in this and related PRs is confusing 
the issues.  When @kayousterhout and I ask about "correctness", what we are 
fundamentally concerned about is whether evaluation of the DAG produces the 
correct data elements.  I don't think that your description of "incorrect" or 
"illegal" graphs is meant to imply that incorrect data is produced from their 
evaluation.  Correct me if I am wrong, but I think that you are talking 
exclusively about graphs that are not optimal, causing duplication of effort 
and preventing further optimizations -- graphs that are taking longer to 
evaluate than is necessary, not graphs that are producing incorrect data 
elements.
    
    If I am thinking correctly about this, then the entire effect of this and 
related PRs is to improve or optimize the DAGScheduler, not to create graphs 
and schedules that produce different end results than the DAGScheduler does now.
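
    To make the distinction concrete, here is a toy sketch in plain Python. It is not Spark or DAGScheduler code: the diamond-shaped `GRAPH`, the `evaluate` helper, and the counters are all hypothetical names invented for illustration. It evaluates the same graph twice, once recomputing a shared parent and once reusing it, and records how often each node runs. Under these assumptions both runs yield the same data, and only the amount of work differs, which is the sense of "not optimal" as opposed to "incorrect" described above.

```python
# Toy model only: nodes stand in for stages, and this is NOT how Spark schedules work.
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical diamond graph: "d" depends on "b" and "c", which both depend on "a".
GRAPH: Dict[str, Tuple[List[str], Callable[..., int]]] = {
    "a": ([], lambda: 10),
    "b": (["a"], lambda a: a + 1),
    "c": (["a"], lambda a: a * 2),
    "d": (["b", "c"], lambda b, c: b + c),
}


def evaluate(node: str, counts: Dict[str, int],
             cache: Optional[Dict[str, int]] = None) -> int:
    """Evaluate `node`, counting how many times each node is computed.

    With cache=None every dependency is recomputed on each use; with a dict,
    an already computed node is reused instead of being evaluated again.
    """
    if cache is not None and node in cache:
        return cache[node]
    parents, fn = GRAPH[node]
    args = [evaluate(p, counts, cache) for p in parents]
    counts[node] = counts.get(node, 0) + 1
    result = fn(*args)
    if cache is not None:
        cache[node] = result
    return result


# Non-optimal evaluation: the shared parent "a" is computed once per child.
naive_counts: Dict[str, int] = {}
naive_result = evaluate("d", naive_counts)

# Deduplicated evaluation: "a" is computed once and reused.
shared_counts: Dict[str, int] = {}
shared_result = evaluate("d", shared_counts, cache={})

print(naive_result == shared_result)           # same data elements either way
print(naive_counts["a"], shared_counts["a"])   # but different amounts of work
```

    A change that moved evaluation from the first form to the second would be an optimization in this sense; a correctness bug would instead show up as the two results differing.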


