[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258901#comment-15258901
 ] 

Ray Chiang commented on MAPREDUCE-6441:
---------------------------------------

FYI, I just filed and closed MAPREDUCE-6685 today as a duplicate.  Some 
comments from my experiments on that JIRA:

- The JobID or the process ID were the only unique values I could think of 
across separate processes.

- The only downside I saw to using UUID#randomUUID() was that it was still 
based on a timestamp (granted, with 100ns resolution).  I'm guessing that it 
might be more CPU intensive than Random#nextLong(), assuming the difference 
comes from System#currentTimeMillis() vs System#nanoTime().

- The solution in MAPREDUCE-6685 instead seeded the Random with the process ID 
and skipped modifying the method signature (either approach is fine as far as I 
can tell); a rough sketch of that idea follows below.

That being said, this solution looks fine to me.

Minor nitpick: please run "git add" and verify with "git status" before 
generating the patch (either that, or overwrite the patch file instead of 
appending to it).  The resulting file is much easier to read.

> LocalDistributedCacheManager for concurrent sqoop processes fails to create 
> unique directories
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6441
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6441
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: William Watson
>            Assignee: William Watson
>         Attachments: HADOOP-10924.02.patch, 
> HADOOP-10924.03.jobid-plus-uuid.patch
>
>
> Kicking off many sqoop processes in different threads results in:
> {code}
> 2014-08-01 13:47:24 -0400:  INFO - 14/08/01 13:47:22 ERROR tool.ImportTool: 
> Encountered IOException running import job: java.io.IOException: 
> java.util.concurrent.ExecutionException: java.io.IOException: Rename cannot 
> overwrite non empty destination directory 
> /tmp/hadoop-hadoop/mapred/local/1406915233073
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:149)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> java.security.AccessController.doPrivileged(Native Method)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> javax.security.auth.Subject.doAs(Subject.java:415)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:239)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:645)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> 2014-08-01 13:47:24 -0400:  INFO -    at 
> org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> {code}
> This error occurs if two are kicked off in the same second. The issue is the 
> following lines of code in the org.apache.hadoop.mapred.LocalDistributedCacheManager class: 
> {code}
>     // Generating unique numbers for FSDownload.
>     AtomicLong uniqueNumberGenerator =
>        new AtomicLong(System.currentTimeMillis());
> {code}
> and 
> {code}
> Long.toString(uniqueNumberGenerator.incrementAndGet())),
> {code}
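
To illustrate the race described above: two local-mode processes started in the 
same millisecond seed the AtomicLong with the same value and therefore pick the 
same local cache directory name. Below is a minimal sketch of the collision and 
of the direction discussed in this issue; the jobId value is a hypothetical 
placeholder, not the actual patch:

{code}
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

public class CacheDirCollisionSketch {
  public static void main(String[] args) {
    // Current behaviour: processes that reach this line in the same millisecond
    // start from the same counter value and generate the same directory name.
    AtomicLong uniqueNumberGenerator = new AtomicLong(System.currentTimeMillis());
    String colliding = Long.toString(uniqueNumberGenerator.incrementAndGet());

    // Direction discussed in this issue: make the name process-unique by mixing
    // in the job ID and a random UUID (jobId is a hypothetical placeholder).
    String jobId = "job_local_0001";
    String unique = jobId + "_" + UUID.randomUUID() + "_"
        + uniqueNumberGenerator.incrementAndGet();

    System.out.println(colliding + " -> " + unique);
  }
}
{code}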



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
