[jira] [Commented] (SPARK-560) Specialize RDDs / iterators

Reynold Xin (JIRA) Fri, 06 Feb 2015 11:34:47 -0800

    [ https://issues.apache.org/jira/browse/SPARK-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309723#comment-14309723 ]


Reynold Xin commented on SPARK-560:
-----------------------------------

We should close this one. It is much easier to do with DataFrame, and we will 
just make DataFrame optimized for this.

> Specialize RDDs / iterators
> ---------------------------
>
>                 Key: SPARK-560
>                 URL: https://issues.apache.org/jira/browse/SPARK-560
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Matei Zaharia
>
> When you're working on in-memory data, the overhead of boxing / unboxing 
> starts to matter, and it looks like specializing would give a 2-4x speedup. 
> We can't just throw in @specialized though because Scala's Iterator is not 
> specialized. We probably need to make our own and also ensure that the right 
> methods get called remotely when you have a chain of RDDs (i.e. it doesn't 
> "lose" its specialization).
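
For readers unfamiliar with the problem described in the issue, here is a
minimal sketch of what a hand-rolled specialized iterator might look like. It
assumes Scala 2, where @specialized generates primitive variants of a class.
The names SpecIterator, IntArrayIterator, MappedIntIterator and
SpecIteratorDemo are hypothetical and are not part of Spark or the Scala
standard library; this is an illustration, not the design Spark adopted.

```scala
// Hypothetical sketch: none of these names exist in Spark or the Scala library.
// scala.collection.Iterator[A] is not specialized, so next() on an
// Iterator[Int] returns a boxed java.lang.Integer. Declaring our own trait
// with @specialized asks the Scala 2 compiler to also generate primitive
// variants of its methods.
trait SpecIterator[@specialized(Int, Long, Double) A] {
  def hasNext: Boolean
  def next(): A
}

// A source iterator over a primitive array.
final class IntArrayIterator(xs: Array[Int]) extends SpecIterator[Int] {
  private var i = 0
  def hasNext: Boolean = i < xs.length
  def next(): Int = { val v = xs(i); i += 1; v }
}

// One link in a chain of transformations. Both the parent and the result are
// typed as SpecIterator[Int], so the element type stays Int at every step
// instead of being widened to a generic A.
final class MappedIntIterator(parent: SpecIterator[Int], f: Int => Int)
    extends SpecIterator[Int] {
  def hasNext: Boolean = parent.hasNext
  def next(): Int = f(parent.next())
}

object SpecIteratorDemo {
  // Consumes the iterator with a plain while loop over Int values.
  def sumInts(it: SpecIterator[Int]): Long = {
    var acc = 0L
    while (it.hasNext) acc += it.next()
    acc
  }

  def main(args: Array[String]): Unit = {
    val doubled = new MappedIntIterator(new IntArrayIterator(Array(1, 2, 3, 4)), _ * 2)
    println(sumInts(doubled))
  }
}
```

The sketch only covers a single element type on one machine. The harder part
raised in the issue, keeping the specialized methods in use across a chain of
generic RDDs, is not addressed here.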



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
