Tathagata Das created SPARK-5523:
------------------------------------

             Summary: TaskMetrics and TaskInfo have innumerable copies of the 
hostname string
                 Key: SPARK-5523
                 URL: https://issues.apache.org/jira/browse/SPARK-5523
             Project: Spark
          Issue Type: Bug
            Reporter: Tathagata Das


 TaskMetrics and TaskInfo objects have the hostname associated with the task. 
As these are created (directly or through deserialization of RPC messages), 
each of them gets a separate String object for the hostname, even though most 
of them hold the same string data. This results in thousands of duplicate 
String objects, increasing the memory requirement of the driver. 
This can be easily deduped when deserializing a TaskMetrics object, or when 
creating a TaskInfo object (in TaskSchedulerImpl).
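
A minimal sketch of one way such deduplication could look, written in Java 
for illustration. The class HostnameInterner and its dedup method are 
hypothetical names, not existing Spark APIs; the sketch assumes the set of 
distinct hostnames is small (one per executor host), so an unbounded map is 
acceptable.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Spark: returns one canonical String
// instance per distinct hostname value, so equal hostnames share one object.
final class HostnameInterner {
    // Maps each hostname value to the first String instance seen with that value.
    private static final ConcurrentHashMap<String, String> CACHE =
        new ConcurrentHashMap<>();

    private HostnameInterner() {}

    static String dedup(String hostname) {
        if (hostname == null) {
            return null;
        }
        // putIfAbsent returns the previously stored instance, or null if
        // this call stored the given instance.
        String existing = CACHE.putIfAbsent(hostname, hostname);
        return existing != null ? existing : hostname;
    }
}
```

Under these assumptions, the hostname would be passed through 
HostnameInterner.dedup(...) at the points mentioned above (after 
deserializing a TaskMetrics object, or when constructing a TaskInfo), so 
that later copies of the same hostname can be garbage collected.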

 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

