[GitHub] spark pull request #20464: [SPARK-23291][SQL][R] R's substr should not reduc...

felixcheung Tue, 06 Mar 2018 22:25:44 -0800

Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20464#discussion_r172750404
  
    --- Diff: docs/sparkr.md ---
    @@ -663,3 +663,7 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
      - The `stringsAsFactors` parameter was previously ignored with `collect`, for example, in `collect(createDataFrame(iris), stringsAsFactors = TRUE))`. It has been corrected.
      - For `summary`, option for statistics to compute has been added. Its output is changed from that from `describe`.
      - A warning can be raised if versions of SparkR package and the Spark JVM do not match.
    +
    +## Upgrading to Spark 2.4.0
    +
    + - The `start` parameter of `substr` method was wrongly subtracted by one, previously. In other words, the index specified by `start` parameter was considered as 0-base. This can lead to inconsistent substring results and also does not match with the behaviour with `substr` in R. It has been fixed so the `start` parameter of `substr` method is now 1-base, e.g., `substr(df$a, 2, 5)` should be changed to `substr(df$a, 1, 4)`.
    --- End diff --
    
    Could you add the following, so the example says which result is being preserved?
    `method is now 1-base, e.g., therefore to get the same result as substr(df$a, 2, 5), it should be changed to substr(df$a, 1, 4)`
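
To illustrate the index shift discussed here, the following is a minimal sketch in base R, not SparkR code. `old_substr` is a hypothetical helper that emulates the previous behaviour under the assumption, consistent with the example in the diff, that `start` was shifted down by one while the substring length (`stop - start + 1`) was kept. It only covers `start >= 2`; the string `"abcdef"` is made up for the example.

```r
# Base R substr(x, start, stop) is 1-based and inclusive on both ends.
s <- "abcdef"

# Hypothetical helper (not part of SparkR): emulates the old behaviour
# for start >= 2, where start was treated as 0-based and the length
# (stop - start + 1) was unchanged.
old_substr <- function(x, start, stop) {
  substr(x, start - 1, stop - 1)
}

new_result <- substr(s, 2, 5)      # 1-based start, as after the fix
old_result <- old_substr(s, 2, 5)  # 0-based start, as before the fix

# To keep the old result under 1-based indexing, shift both
# arguments down by one.
migrated <- substr(s, 1, 4)

print(c(new = new_result, old = old_result, migrated = migrated))
stopifnot(identical(old_result, migrated))
```

Under these assumptions, the old call and the migrated call return the same four characters, while the unchanged call now starts one character later.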


