[
https://issues.apache.org/jira/browse/SPARK-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Patrick Wendell updated SPARK-1912:
-----------------------------------
Fix Version/s: 0.9.2
> Compression memory issue during reduce
> --------------------------------------
>
> Key: SPARK-1912
> URL: https://issues.apache.org/jira/browse/SPARK-1912
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Reporter: Wenchen Fan
> Assignee: Wenchen Fan
> Fix For: 0.9.2, 1.0.1, 1.1.0
>
>
> When we need to read a compressed block, we first create a compression
> stream instance (LZF or Snappy) and use it to wrap that block.
> Say a reducer task needs to read 1000 local shuffle blocks: it will
> first prepare to read all 1000 blocks, which means creating 1000
> compression stream instances to wrap them. But initializing a compression
> instance allocates some memory, so having many compression instances
> alive at the same time is a problem.
> Since the reducer actually reads the shuffle blocks one by one, why
> create all the compression instances up front? We could do it lazily:
> create the compression instance for a block only when that block is
> first read.
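>
> A minimal sketch of the lazy approach (illustrative only, not the
> actual patch; LazyCompressedStream and the open() thunk are
> hypothetical names). The compression stream is created on the first
> read of the block rather than when the block is prepared:
> {code:scala}
> import java.io.InputStream
>
> // Defer creating the memory-hungry compression stream until the
> // block is actually read, instead of eagerly wrapping all blocks.
> class LazyCompressedStream(open: () => InputStream) extends InputStream {
>   // Only materialized on first access.
>   private lazy val underlying: InputStream = open()
>
>   override def read(): Int = underlying.read()
>   override def read(b: Array[Byte], off: Int, len: Int): Int =
>     underlying.read(b, off, len)
>   override def close(): Unit = underlying.close()
> }
> {code}
> With this, preparing 1000 blocks allocates 1000 cheap wrappers, and
> only the block currently being read holds a live compression buffer.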
--
This message was sent by Atlassian JIRA
(v6.2#6252)