[ 
https://issues.apache.org/jira/browse/PIG-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Tolar updated PIG-5418:
-----------------------------
    Description: 
A minor issue: I noticed that Utils.parseSchema() and parseConstant() leak 
memory. I noticed this while running a unit test for a UDF several thousand 
times and checking the heap. 

Links are to latest commit as of creating this ticket: 

https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256


{{new PigContext()}} [creates a MapReduce 
ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269].
 
This creates a 
[MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34].
 
This registers a [Hadoop shutdown 
hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105]
 which doesn't go away until the JVM dies. See: 
https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html

I will attach a proposed patch. From my reading of the code and from running 
the tests, the existing schema parse APIs do not actually use anything from 
this dummy PigContext, and with a minor tweak it can be passed in as null, 
avoiding the creation of these extra resources. 
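
To illustrate the chain described above, here is a minimal, self-contained 
sketch. It does not use any Pig or Hadoop classes: HookRegistry, Launcher, 
DummyContext, parseLeaky() and parseWithoutContext() are hypothetical 
stand-ins for ShutdownHookManager, MapReduceLauncher, PigContext and the 
schema parse calls. It only models the retention pattern and is not the 
attached patch. 

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ShutdownHookLeakSketch {

    // Hypothetical stand-in for a JVM-wide shutdown hook manager:
    // a static collection that holds every registered hook until exit.
    static final class HookRegistry {
        private static final Set<Runnable> HOOKS = ConcurrentHashMap.newKeySet();
        static void register(Runnable hook) { HOOKS.add(hook); }
        static int size() { return HOOKS.size(); }
    }

    // Hypothetical stand-in for a launcher that registers itself as a hook
    // on construction. Once registered, it can never be garbage collected.
    static final class Launcher implements Runnable {
        private final byte[] state = new byte[1024]; // memory pinned by the hook
        Launcher() { HookRegistry.register(this); }
        @Override public void run() { /* cleanup work at shutdown */ }
    }

    // Hypothetical stand-in for a context whose constructor builds a launcher.
    static final class DummyContext {
        final Launcher launcher = new Launcher();
    }

    // Leaky variant: builds a throwaway context on every call, so every
    // call adds one more permanently reachable hook.
    static String parseLeaky(String schema) {
        DummyContext unused = new DummyContext();
        return schema.trim();
    }

    // Variant in the spirit of the proposal: the parse step never needed
    // the context, so none is created and nothing is registered.
    static String parseWithoutContext(String schema) {
        return schema.trim();
    }

    static int registeredHooks() { return HookRegistry.size(); }

    public static void main(String[] args) {
        int start = registeredHooks();
        for (int i = 0; i < 1000; i++) parseLeaky("a:int");
        System.out.println("hooks added by leaky parse: " + (registeredHooks() - start));

        start = registeredHooks();
        for (int i = 0; i < 1000; i++) parseWithoutContext("a:int");
        System.out.println("hooks added without context: " + (registeredHooks() - start));
    }
}
```

In the sketch the per-call cost is one small object, but each registered hook 
stays strongly reachable from a static collection, which is why the growth 
only becomes visible after many thousands of calls. 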

  was:
A minor issue: I noticed that Utils.parseSchema() and parseConstant() leak 
memory. I noticed this while running a unit test for a UDF many thousand times 
and checking the heap. 

Links is to latest commit as of creating this ticket: 

https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256


{{new PigContext()}} [creates a MapReduce 
ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269].
 
This creates a 
[MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34].
 
This registers a [Hadoop shutdown 
hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105]
 which doesn't go away until the JVM dies. See: 
https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html

I will attach a proposed patch. From my reading of the code and from running 
the tests, the existing schema parse APIs do not actually use anything from 
this dummy PigContext, and with a minor tweak it can be passed in as null, 
avoiding the creation of these extra resources. 


> Utils.parseSchema(String), parseConstant(String) leak memory
> ------------------------------------------------------------
>
>                 Key: PIG-5418
>                 URL: https://issues.apache.org/jira/browse/PIG-5418
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jacob Tolar
>            Priority: Minor
>         Attachments: PIG-5418.patch
>
>
> A minor issue: I noticed that Utils.parseSchema() and parseConstant() leak 
> memory. I noticed this while running a unit test for a UDF several thousand 
> times and checking the heap. 
> Links are to latest commit as of creating this ticket: 
> https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256
> {{new PigContext()}} [creates a MapReduce 
> ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269].
>  
> This creates a 
> [MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34].
>  
> This registers a [Hadoop shutdown 
> hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105]
>  which doesn't go away until the JVM dies. See: 
> https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html
> I will attach a proposed patch. From my reading of the code and from running 
> the tests, the existing schema parse APIs do not actually use anything from 
> this dummy PigContext, and with a minor tweak it can be passed in as null, 
> avoiding the creation of these extra resources. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
