[GitHub] spark pull request #16792: [SPARK-19453][PYTHON][SQL][DOC] Correct and exten...

holdenk Sat, 04 Feb 2017 16:11:49 -0800

Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16792#discussion_r99478139
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -1272,16 +1272,18 @@ def replace(self, to_replace, value, subset=None):
             """Returns a new :class:`DataFrame` replacing a value with another value.
             :func:`DataFrame.replace` and :func:`DataFrameNaFunctions.replace` are
             aliases of each other.
    +        Values `to_replace` and `value` should be homogeneous. Mixed string and numeric
    --- End diff --
    
    We can leave the warning about truncation off if you think it's unnecessary, 
but I do like the description you came up with for it.
    
    I think the current proposed text on the expected types is a little vague 
(as you said), and I think we can have something that is precise, accurate, and 
easy to read, so I'd like us to give it a shot :)
    
    How about something like: "The element(s) `to_replace` and `value` should 
be the same type(s) (either all numerics, all booleans, or all strings)." This 
way we've made it clear that mixing different numeric types is OK, since they 
all get converted to doubles at the end of the day. I'm open to other 
suggestions too, though.
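
    To make the proposed wording concrete, here is a rough sketch of the rule in 
plain Python. It is only an illustration of "all numerics, all booleans, or all 
strings", not the actual PySpark implementation; the names `_kind` and 
`check_homogeneous` are hypothetical.

```python
def _kind(x):
    """Classify a single element as 'boolean', 'numeric', or 'string'."""
    # bool is a subclass of int in Python, so it must be tested first;
    # otherwise True/False would be classified as numeric.
    if isinstance(x, bool):
        return "boolean"
    if isinstance(x, (int, float)):
        return "numeric"
    if isinstance(x, str):
        return "string"
    raise TypeError("unsupported element type: %s" % type(x).__name__)


def check_homogeneous(to_replace, value):
    """Hypothetical helper: return the shared kind of all elements.

    Raises ValueError if the elements of `to_replace` and `value`
    mix kinds (for example strings and numerics).
    """
    def as_list(x):
        return list(x) if isinstance(x, (list, tuple)) else [x]

    kinds = {_kind(v) for v in as_list(to_replace) + as_list(value)}
    if len(kinds) != 1:
        raise ValueError(
            "to_replace and value should be all numerics, all booleans, "
            "or all strings; got: %s" % sorted(kinds))
    return kinds.pop()
```

    Under this sketch, `check_homogeneous([1, 2.5], 3)` is accepted because ints 
and floats both count as numerics, while `check_homogeneous(["a", 1], "b")` is 
rejected as mixed.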




