[jira] [Commented] (MAHOUT-1490) Data frame R-like bindings

Dmitriy Lyubimov (JIRA) Tue, 20 May 2014 17:37:27 -0700

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004160#comment-14004160
 ]


Dmitriy Lyubimov commented on MAHOUT-1490:
------------------------------------------

Also, in a realistic case we will be reading the frame blocks off media that 
does not internally use that compression (most likely, the media would be 
row-wise). So the compression step will be streaming in uncompressed data and 
will already hit the memory bottleneck on that first read. To justify 
compression in these scenarios, we therefore need to make sure that the 
compressed source will be iterated over more than once. Again, this is all 
just a programming model. 

For example, there might be an API that says "build fast iterative source" 
explicitly, rather than always assuming it is a good thing. I kind of suspect 
that's what H2O's "freeze" concept encompasses.
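
To make the idea concrete, here is a minimal Python sketch of what such an 
explicit opt-in could look like. It is only an illustration of the programming 
model, not Mahout's or H2O's API: RowSource, CompressedColumnSource and 
build_fast_iterative_source are hypothetical names, and zlib over packed 
doubles stands in for whatever column compression would really be used.

```python
import zlib
from array import array


class RowSource:
    """Hypothetical row-wise source; every iteration re-reads the 'media'."""

    def __init__(self, rows):
        self._rows = [tuple(r) for r in rows]
        self.passes = 0  # number of full scans of the underlying media

    def __iter__(self):
        self.passes += 1
        return iter(self._rows)

    def build_fast_iterative_source(self):
        """Explicit opt-in: scan the media once and keep compressed columns."""
        columns = list(zip(*self))  # single pass; transpose rows into columns
        return CompressedColumnSource(columns)


class CompressedColumnSource:
    """Hypothetical in-memory source holding one compressed blob per column."""

    def __init__(self, columns):
        # Assumption: all values are numeric and fit a packed array of doubles.
        self._blobs = [zlib.compress(array("d", col).tobytes()) for col in columns]

    def __iter__(self):
        columns = []
        for blob in self._blobs:
            col = array("d")
            col.frombytes(zlib.decompress(blob))
            columns.append(col)
        return iter(zip(*columns))  # hand rows back to the caller


source = RowSource([(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)])
fast = source.build_fast_iterative_source()  # the only scan of the media

# An iterative algorithm now makes its repeated passes over the compressed copy.
for _pass in range(3):
    total = sum(x for x, _y in fast)
```

The build step pays for one uncompressed scan up front, so it only makes sense 
when the caller knows there will be several passes; that is why the sketch 
leaves the decision to the caller instead of compressing on every read.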

> Data frame R-like bindings
> --------------------------
>
>                 Key: MAHOUT-1490
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1490
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Saikat Kanjilal
>            Assignee: Dmitriy Lyubimov
>             Fix For: 1.0
>
>   Original Estimate: 20h
>  Remaining Estimate: 20h
>
> Create Data frame R-like bindings for spark



--
This message was sent by Atlassian JIRA
(v6.2#6252)
