[ 
https://issues.apache.org/jira/browse/PIG-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264146#comment-13264146
 ] 

Rohini Palaniswamy commented on PIG-2672:
-----------------------------------------

I have used every version of Hadoop from 0.20 and 0.20S through 0.23 and have 
never seen the jars unjarred. I verified this by checking the cache directories 
of production task trackers on both 0.20.205 and 0.23: the jars are not 
unjarred, and we are certainly using "tmpjars". 

However, the code in TrackerDistributedCacheManager definitely appears to 
unjar them, so I am not sure why that did not happen here. I am confused and 
need to dig deeper. 
                
> Optimize the use of DistributedCache
> ------------------------------------
>
>                 Key: PIG-2672
>                 URL: https://issues.apache.org/jira/browse/PIG-2672
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>
> Pig currently copies jar files to a temporary location in HDFS and then adds 
> them to DistributedCache for each job launched. This is inefficient in terms 
> of:
>    * Space - The jars are distributed to task trackers for every job, taking 
> up a lot of local temporary space on the task trackers.
>    * Performance - The jar distribution adds to the job launch time.


        
