Imran Rashid created SPARK-14168:
------------------------------------

             Summary: Managed Memory Leak Msg Should Only Be a Warning
                 Key: SPARK-14168
                 URL: https://issues.apache.org/jira/browse/SPARK-14168
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 1.6.1
            Reporter: Imran Rashid
            Assignee: Imran Rashid
            Priority: Minor


When a task completes, executors check whether all managed memory for the 
task was correctly released, and log an error when it wasn't.  However, it 
turns out it's OK for some memory to remain unreleased when an Iterator 
isn't read to completion, e.g., with {{rdd.take()}}.  This results in a scary 
error message in the executor logs:

{noformat}
16/01/05 17:02:49 ERROR Executor: Managed memory leak detected; size = 16259594 
bytes, TID = 24
{noformat}
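For instance, a benign early-exit read like the following (a hypothetical repro sketched for {{spark-shell}}; the exact size and TID will vary, and the precise operator mix needed to acquire managed memory is an assumption here) is enough to trigger the message:

{code}
// groupByKey's shuffle-read aggregation acquires managed memory;
// take(1) stops consuming the iterator early, so that memory is only
// reclaimed at task cleanup -- producing the "Managed memory leak
// detected" log line even though nothing actually leaked.
sc.parallelize(0 to 10000000, 2).map(x => x % 10000 -> x).groupByKey.take(1)
{code}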

Furthermore, if a task fails for any reason, this message is also triggered.  
This can lead users to believe that the failure was caused by the memory leak, 
when the root cause could be entirely different.  E.g., the same error message 
appears in executor logs when this clearly broken user code is run with 
{{spark-shell --master 'local-cluster[2,2,1024]'}}:

{code}
sc.parallelize(0 to 10000000, 2)
  .map(x => x % 10000 -> x)
  .groupByKey
  .mapPartitions { it => throw new RuntimeException("user error!") }
  .collect
{code}

We should downgrade the message to a warning and link to a more detailed 
explanation.

See https://issues.apache.org/jira/browse/SPARK-11293 for more reports from 
users (and perhaps a true fix).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
