[ 
https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990182#comment-13990182
 ] 

Dmitriy Lyubimov edited comment on MAHOUT-1490 at 5/6/14 1:26 AM:
------------------------------------------------------------------

Ok, let's start with something simple, like select() and mutate() from dplyr 
[1].

Part of the problem here is that R allows dynamic call expressions, e.g. 
{code}
d %.% mutate( gain = ArrDelay - DepDelay,
  speed = Distance / AirTime * 60)
{code}

We therefore can't translate that verbatim, because Scala would require 
names such as gain and ArrDelay to be defined identifiers in scope.

We also can't easily verify such expressions at compile time.

A first approximation could perhaps be something like

{code}
d.mutate( let("gain") equal { col("ArrDelay") - col("DepDelay") } )
{code}

or 

{code}
d.mutate( let("gain") =  col("ArrDelay") - col("DepDelay")  )
{code}

Is this too ugly? 

Is there a way to make it look less ugly?
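
To make the question concrete, here is a minimal sketch of what could sit behind the let(...) equal { ... } form above. Everything in it (FrameDsl, Row, Expr, Assignment, Frame, and the in-memory Map rows) is a hypothetical stand-in for illustration, not an existing Mahout or Spark API.

```scala
// Hypothetical sketch: none of these names exist in Mahout.
object FrameDsl {
  // One row is modelled as column name -> numeric value (an assumption).
  type Row = Map[String, Double]

  // A column expression, built now and evaluated later against each row.
  sealed trait Expr {
    def eval(row: Row): Double
    def -(that: Expr): Expr = BinOp(this, that, _ - _)
    def /(that: Expr): Expr = BinOp(this, that, _ / _)
    def *(k: Double): Expr = BinOp(this, Const(k), _ * _)
  }
  final case class Col(name: String) extends Expr {
    // Column names are plain strings, so a typo only fails at runtime.
    def eval(row: Row): Double = row(name)
  }
  final case class Const(value: Double) extends Expr {
    def eval(row: Row): Double = value
  }
  final case class BinOp(l: Expr, r: Expr, f: (Double, Double) => Double) extends Expr {
    def eval(row: Row): Double = f(l.eval(row), r.eval(row))
  }

  // "target column := expression", produced by let(...) equal { ... }.
  final case class Assignment(target: String, expr: Expr)

  final class Let(target: String) {
    def equal(expr: Expr): Assignment = Assignment(target, expr)
  }

  def col(name: String): Expr = Col(name)
  def let(target: String): Let = new Let(target)

  final case class Frame(rows: Seq[Row]) {
    // Applies assignments left to right; later ones can see earlier results.
    def mutate(assignments: Assignment*): Frame =
      Frame(rows.map { row =>
        assignments.foldLeft(row)((r, a) => r + (a.target -> a.expr.eval(r)))
      })
  }
}

object FrameDslExample {
  import FrameDsl._

  val d = Frame(Seq(Map(
    "ArrDelay" -> 10.0, "DepDelay" -> 4.0,
    "Distance" -> 300.0, "AirTime" -> 120.0)))

  // Mirrors the dplyr mutate() call quoted earlier.
  val result = d.mutate(
    let("gain") equal { col("ArrDelay") - col("DepDelay") },
    let("speed") equal { col("Distance") / col("AirTime") * 60 }
  )
}
```

In this sketch column names stay strings, so nothing is checked at compile time, which is the limitation noted above. The let("gain") = ... variant would lean on Scala rewriting f(x) = e into f.update(x, e), so let would have to be an object with an update method rather than a plain function.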


[1] http://cran.r-project.org/web/packages/dplyr/vignettes/introduction.html


> Data frame R-like bindings
> --------------------------
>
>                 Key: MAHOUT-1490
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1490
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Saikat Kanjilal
>            Assignee: Dmitriy Lyubimov
>             Fix For: 1.0
>
>   Original Estimate: 20h
>  Remaining Estimate: 20h
>
> Create Data frame R-like bindings for spark



--
This message was sent by Atlassian JIRA
(v6.2#6252)
