Hi Everyone,

I'd like to start a discussion about the possibility of adding magrittr
(https://magrittr.tidyverse.org/) as an explicit dependency for SparkR.
For those not familiar with the package, it provides a number of small
utilities, the most important of which is the %>% function, similar to
pipe-forward (|>) in F# or the thread-first macro (->) in Clojure. In other
words, it allows us to replace:

df <- createDataFrame(iris)
df_filtered <- filter(df, df$Sepal_Width > df$Petal_Length)
df_projected <- select(df_filtered, min(df$Sepal_Width - df$Petal_Length))

or

df_projected <- select(
  filter(createDataFrame(iris), column("Sepal_Width") > column("Petal_Length")),
  min(column("Sepal_Width") - column("Petal_Length"))
)

with

df_projected <- createDataFrame(iris) %>% 
  filter(.$Sepal_Width > .$Petal_Length) %>%
  select(min(.$Sepal_Width - .$Petal_Length))
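
To make the equivalence concrete without a Spark session, here is a
minimal sketch on a plain R data.frame. It assumes only that magrittr is
installed; keep_rows is a hypothetical helper written for this
illustration (it is not part of SparkR or magrittr), and the column names
use dots because the local iris data set is not renamed the way
createDataFrame renames it.

```r
library(magrittr)

# x %>% f(y) is evaluated as f(x, y): the left-hand side becomes the
# first argument of the call on the right-hand side.
piped_total  <- c(1, 4, 9) %>% sqrt() %>% sum()
nested_total <- sum(sqrt(c(1, 4, 9)))

# Hypothetical helper (not part of SparkR or magrittr): keeps the rows
# of a data.frame selected by a logical vector, using standard evaluation.
keep_rows <- function(df, mask) df[mask, , drop = FALSE]

# Inside the right-hand side, `.` refers to the left-hand side. Because
# the dot appears only inside nested calls here, the left-hand side is
# still inserted as the first argument.
piped_rows  <- iris %>% keep_rows(.$Sepal.Width > .$Petal.Length)
nested_rows <- keep_rows(iris, iris$Sepal.Width > iris$Petal.Length)

# Compare the piped and nested forms.
print(identical(piped_total, nested_total))
print(identical(piped_rows, nested_rows))
```

The SparkR example above relies on the same two rules: the piped
SparkDataFrame is passed as the first argument to filter and select, and
`.` stands for that same object inside the column expressions.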

magrittr is widely used (see the reverse dependencies section at
https://cran.r-project.org/web/packages/magrittr/index.html), stable, and
pretty much a core element of idiomatic R code these days.

Why we might want to add it:

  * Improve the readability of SparkR examples, which, subjectively
    speaking, can look a bit archaic.
  * Reduce the verbosity of the SparkR codebase.


Possible risks:

  * It is an additional dependency for the CI pipeline.

    A: magrittr is already a transitive dependency for SparkR tests (it
    is required by testthat), its API is extremely stable, and it has no
    dependencies of its own.
  * It is an additional dependency for SparkR installations.

    A: Given its widespread usage (over 1200 reverse imports, including
    some of the most popular packages), it is probably already part of
    any but the most minimal R installation.

    While it's just anecdotal evidence, most of the SparkR applications
    I've seen out there already use magrittr.


Non-goals:

  * Supporting non-standard evaluation.


Thanks in advance for your input.

-- 
Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
Keybase: https://keybase.io/zero323
Gigs: https://www.codementor.io/@zero323
PGP: A30CEF0C31A501EC
