[
https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004160#comment-14004160
]
Dmitriy Lyubimov commented on MAHOUT-1490:
------------------------------------------
Also, in a realistic case we will be reading the frame blocks off media that
does not internally use that compression (most likely, the media would be
row-wise). So the compression step will stream in uncompressed data and will
already hit the memory bottleneck. To justify compression in these
scenarios, we need to make sure that the compressed source will be iterated
over more than once. Again, this is all just a programming model.
For example, there might be an API that says "build fast iterative source"
explicitly, rather than always assuming it is a good thing. I kind of suspect
that's what the h2o "freeze" concept encompasses.
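
To make the programming-model point concrete, here is a rough sketch of what
such an explicit opt-in could look like. It is only an illustration: it is in
Python rather than the Mahout Scala bindings, and every name in it
(`build_fast_iterative_source`, `CompressedColumnSource`, `read_rows`) is
hypothetical, not an existing Mahout or h2o API. zlib over JSON-encoded
columns merely stands in for whatever column compression would really be used.

```python
import json
import zlib


class CompressedColumnSource:
    """Hypothetical column-wise, compressed, re-iterable copy of a row source."""

    def __init__(self, rows):
        # One pass over the uncompressed, row-wise input. This is the up-front
        # cost that only pays off if the result is iterated more than once.
        columns = [list(col) for col in zip(*rows)]
        self._blocks = [
            zlib.compress(json.dumps(col).encode("utf-8")) for col in columns
        ]

    def __iter__(self):
        # Every pass decompresses the columns and re-assembles the rows.
        columns = [
            json.loads(zlib.decompress(block).decode("utf-8"))
            for block in self._blocks
        ]
        return iter(zip(*columns))


def build_fast_iterative_source(rows):
    """Explicit opt-in: the caller asks for the compressed form (assumed API)."""
    return CompressedColumnSource(rows)


def read_rows():
    # Stand-in for row-wise media; each row can be read only once.
    yield from [(1, 2.5, "a"), (2, 3.5, "b"), (3, 4.5, "c")]


# The caller decides that several passes justify the one-time compression.
source = build_fast_iterative_source(read_rows())
for _ in range(3):
    total = sum(row[1] for row in source)
```

The design choice here is that nothing is compressed unless the caller asks
for it: a single-pass job would keep consuming `read_rows()` directly and
never pay the conversion cost.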
> Data frame R-like bindings
> --------------------------
>
> Key: MAHOUT-1490
> URL: https://issues.apache.org/jira/browse/MAHOUT-1490
> Project: Mahout
> Issue Type: New Feature
> Reporter: Saikat Kanjilal
> Assignee: Dmitriy Lyubimov
> Fix For: 1.0
>
> Original Estimate: 20h
> Remaining Estimate: 20h
>
> Create Data frame R-like bindings for spark