[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

HyukjinKwon Sun, 15 Apr 2018 18:20:07 -0700

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21060
  
    I am not saying we shouldn't be careful. I am trying to be careful when I 
backport. So, your reasons are:
    
    - no behaviour change should be backported, and that is the basic backport 
rule
    
      I disagree unless it's clearly documented as a rule. Even if it is, I 
would like to make this an exception because the fix is minimally invasive, 
the issue looks like a bug, it affects an actual user group, and the change 
makes the behaviour sensible. That's the practice I have been used to so far.
    
    - the query execution listener is not clearly defined
    
      I see that `collect` is included in the original commit - 
https://github.com/apache/spark/commit/15ff85b3163acbe8052d4489a00bcf1d2332fcf0.
 I don't see a reason to specifically exclude PySpark's case since Scala and R 
also work. I don't think we would have excluded it on purpose.
    
    - It's not a critical issue nor a regression.
    
      I don't think we should backport only critical issues or regressions. 
Those are strong reasons to backport, but based on my understanding and 
observations there are other cases that can be backported too. If something is 
quite clearly a bug and it affects an actual user group, I think it is worth 
backporting. The fix here is straightforward, minimally invasive and small.




