[
https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990182#comment-13990182
]
Dmitriy Lyubimov edited comment on MAHOUT-1490 at 5/6/14 1:26 AM:
------------------------------------------------------------------
Ok, let's start with something simple, like select() and mutate() from dplyr
[1].
Part of the problem here is that R allows dynamic call expressions, e.g.
{code}
d %.% mutate( gain = ArrDelay - DepDelay,
speed = Distance / AirTime * 60)
{code}
We therefore can't translate that verbatim, because the Scala language would
require names such as gain, ArrDelay and DepDelay to be defined identifiers.
We also can't easily have compile-time verified expressions.
A first approximation is perhaps something like
{code}
d.mutate( let("gain") equal { col("ArrDelay") - col("DepDelay") } )
{code}
or
{code}
d.mutate( let("gain") = col("ArrDelay") - col("DepDelay") )
{code}
Is this too ugly?
Is there a way to make it look less ugly?
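For reference, here is a minimal sketch of how both spellings above could be made to compile in plain Scala. Everything in it (FrameDslSketch, Expr, Assignment, LetTarget, Frame, eval) is a hypothetical name made up for illustration and is not an existing Mahout API; rows are modeled as simple Map[String, Double] values just to keep it self-contained. The second spelling relies on Scala rewriting `f(args) = e` into `f.update(args, e)`.

```scala
object FrameDslSketch {
  // Expression tree built by the DSL. Column names are plain strings,
  // so they are only checked when the expression is evaluated.
  sealed trait Expr {
    def -(that: Expr): Expr = Sub(this, that)
    def /(that: Expr): Expr = Div(this, that)
  }
  case class Col(name: String) extends Expr
  case class Sub(left: Expr, right: Expr) extends Expr
  case class Div(left: Expr, right: Expr) extends Expr

  // One "new column = expression" pair handed to mutate().
  case class Assignment(target: String, expr: Expr)

  // Supports the infix form: let("gain") equal { ... }
  class LetTarget(name: String) {
    def equal(expr: Expr): Assignment = Assignment(name, expr)
  }

  object let {
    def apply(name: String): LetTarget = new LetTarget(name)
    // Supports the form: let("gain") = ..., which Scala rewrites
    // to let.update("gain", ...).
    def update(name: String, expr: Expr): Assignment = Assignment(name, expr)
  }

  def col(name: String): Expr = Col(name)

  // Evaluate an expression against one row; an unknown column name
  // fails here, at run time, with NoSuchElementException.
  def eval(e: Expr, row: Map[String, Double]): Double = e match {
    case Col(n)    => row(n)
    case Sub(l, r) => eval(l, row) - eval(r, row)
    case Div(l, r) => eval(l, row) / eval(r, row)
  }

  // Toy stand-in for a data frame: a sequence of name -> value rows.
  case class Frame(rows: Seq[Map[String, Double]]) {
    def mutate(assignments: Assignment*): Frame =
      Frame(rows.map { row =>
        // Apply assignments left to right so later ones can use earlier ones.
        assignments.foldLeft(row)((r, a) => r + (a.target -> eval(a.expr, r)))
      })
  }
}
```

With `import FrameDslSketch._` in scope, both `d.mutate( let("gain") equal { col("ArrDelay") - col("DepDelay") } )` and `d.mutate( let("gain") = col("ArrDelay") - col("DepDelay") )` build the same Assignment. Note that a misspelled column name still only fails when the expression is evaluated, which is the compile-time verification problem mentioned above.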
[1] http://cran.r-project.org/web/packages/dplyr/vignettes/introduction.html
> Data frame R-like bindings
> --------------------------
>
> Key: MAHOUT-1490
> URL: https://issues.apache.org/jira/browse/MAHOUT-1490
> Project: Mahout
> Issue Type: New Feature
> Reporter: Saikat Kanjilal
> Assignee: Dmitriy Lyubimov
> Fix For: 1.0
>
> Original Estimate: 20h
> Remaining Estimate: 20h
>
> Create Data frame R-like bindings for spark