[
https://issues.apache.org/jira/browse/SPARK-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568312#comment-14568312
]
Cory Nguyen edited comment on SPARK-5841 at 6/2/15 12:58 AM:
-------------------------------------------------------------
We are using Luigi with Spark 1.3.1 to manage our jobs.
However, we run into a rare case where the following conditions trigger this
resolved Block Manager bug:
- Dataset is relatively large (~1.5 TB)
- Spark job is run with Luigi
- Output is saved to local HDFS
The Spark job processes the data and mappings just fine until the very end;
when it proceeds to save the files to local HDFS, that is when it triggers this
bug.
However, the job saves the data and completes successfully if the output is
written to an s3:// location.
Wondering what might cause this resolved bug to trigger when the job is run
with Luigi and saves to local HDFS, but not when it saves to S3 with Luigi or
runs without Luigi?
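For context, the resolved bug below leaks memory when a long-running JVM
repeatedly creates and destroys SparkContext instances. A minimal, hypothetical
Scala sketch of that driver-side pattern (illustrative only, not the Luigi
workflow itself) might look like this:
{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object RepeatedContexts {
  def main(args: Array[String]): Unit = {
    for (i <- 1 to 100) {
      val conf = new SparkConf().setAppName(s"job-$i").setMaster("local[*]")
      val sc = new SparkContext(conf)
      try {
        // stand-in for one job's real work
        sc.parallelize(1 to 1000).count()
      } finally {
        // before the 1.3.0 fix, the DiskBlockManager shutdown hook kept a hard
        // reference to driver state even after stop(), so memory grew with
        // each iteration
        sc.stop()
      }
    }
  }
}
{code}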
> Memory leak in DiskBlockManager
> -------------------------------
>
> Key: SPARK-5841
> URL: https://issues.apache.org/jira/browse/SPARK-5841
> Project: Spark
> Issue Type: Bug
> Components: Block Manager
> Affects Versions: 1.2.1
> Reporter: Matt Whelan
> Assignee: Matt Whelan
> Fix For: 1.3.0
>
>
> DiskBlockManager registers a Runtime shutdown hook, which creates a hard
> reference to the entire Driver ActorSystem. If a long-running JVM repeatedly
> creates and destroys SparkContext instances, it leaks memory.
> I suggest we deregister the shutdown hook if DiskBlockManager.stop is called.
> It's redundant at that point.
> PR coming.
> See also
> http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCA+kjH+w_DDTEBE9XB6NrPxLTUXD=nc_d-3ogxtumk_5v-e0...@mail.gmail.com%3E
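For illustration, a minimal Scala sketch of the leak pattern and the suggested
fix described above (the class, field, and method names here are hypothetical,
not Spark's actual DiskBlockManager code):
{code:scala}
class DiskManagerSketch(heavyDriverState: AnyRef) {

  // Registering a JVM shutdown hook gives the Runtime a hard reference to this
  // object and, through the closure, to `heavyDriverState`, for the life of
  // the process unless the hook is removed.
  private val shutdownHook = new Thread(new Runnable {
    override def run(): Unit = cleanup()
  })
  Runtime.getRuntime.addShutdownHook(shutdownHook)

  private def cleanup(): Unit = {
    // delete temporary block files, release resources, etc.
  }

  def stop(): Unit = {
    cleanup()
    // the fix suggested in the issue: the hook is redundant once stop() has
    // run, so deregister it and allow the driver state to be collected
    Runtime.getRuntime.removeShutdownHook(shutdownHook)
  }
}
{code}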