[ 
https://issues.apache.org/jira/browse/CRUNCH-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963217#comment-14963217
 ] 

Gabriel Reid commented on CRUNCH-575:
-------------------------------------

This issue (or one very similar to it) is discussed in CRUNCH-515. 

[~srowen] could you take a quick look at that one first and check whether the 
underlying problem you're encountering is the same as the one described on that 
ticket (crashing pipelines, or pipelines that never called pipeline.done())? 
An additional data point would help determine whether this change would just 
hide a different issue (one that leaves behind a huge number of temp 
directories), or whether we are simply running into the limits of 32 bits.
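
To put a rough number on "the limits of 32 bits", here is a minimal sketch of 
the usual birthday-bound approximation. The class and method names are made up 
for illustration and are not part of Crunch; it assumes a nonnegative int gives 
2^31 possible names and a nonnegative long gives 2^63.

```java
final class BirthdayBound {

    /**
     * Approximate number of uniformly random draws from a space of
     * spaceSize values at which the chance of at least one collision
     * reaches 50%: n ~= sqrt(2 * N * ln 2).
     */
    static double drawsForHalfCollision(double spaceSize) {
        return Math.sqrt(2.0 * spaceSize * Math.log(2.0));
    }

    /**
     * Approximate probability of at least one collision after n draws:
     * p ~= 1 - exp(-n * (n - 1) / (2 * N)).
     */
    static double collisionProbability(double n, double spaceSize) {
        return -Math.expm1(-n * (n - 1.0) / (2.0 * spaceSize));
    }

    public static void main(String[] args) {
        double nonNegativeInt = Math.pow(2, 31);   // assumed size of the int name space
        double nonNegativeLong = Math.pow(2, 63);  // assumed size of the long name space

        System.out.printf("int:  50%% collision chance near %.0f dirs%n",
                drawsForHalfCollision(nonNegativeInt));
        System.out.printf("long: 50%% collision chance near %.0f dirs%n",
                drawsForHalfCollision(nonNegativeLong));
    }
}
```

This only models independent, uniformly random names that are never cleaned up, 
so it says nothing about which of the two causes we are actually seeing.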

On the other hand, if we really want to avoid collisions (and this isn't caused 
by pipelines that aren't being cleaned up correctly), a UUID may be (even) 
better than a long as the randomizer in the temp dir name.
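
As a rough sketch of the three options (not the actual DistributedPipeline 
code; the class name, method names, and the use of plain strings instead of 
Hadoop Path objects are all assumptions made for illustration):

```java
import java.util.Random;
import java.util.UUID;

final class TempDirNames {

    private static final Random RANDOM = new Random();

    /** Nonnegative 32-bit suffix: at most 2^31 distinct names. */
    static String intSuffix(String baseDir) {
        return baseDir + "/crunch-" + (RANDOM.nextInt() & Integer.MAX_VALUE);
    }

    /**
     * Nonnegative 64-bit suffix. Note that java.util.Random has a 48-bit
     * seed, so nextLong() cannot produce every possible long value.
     */
    static String longSuffix(String baseDir) {
        return baseDir + "/crunch-" + (RANDOM.nextLong() & Long.MAX_VALUE);
    }

    /** Random (type 4) UUID suffix: 122 random bits. */
    static String uuidSuffix(String baseDir) {
        return baseDir + "/crunch-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(intSuffix("/tmp"));
        System.out.println(longSuffix("/tmp"));
        System.out.println(uuidSuffix("/tmp"));
    }
}
```

The UUID variant gives longer directory names, but it sidesteps the question of 
how many of the 64 bits are really random.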

> DistributedPipeline temp dir choice can collide with itself
> -----------------------------------------------------------
>
>                 Key: CRUNCH-575
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-575
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.12.0
>            Reporter: Sean Owen
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH_575.patch
>
>
> We've observed that Crunch jobs can fail because the output temp dir already 
> exists:
> {code}
> 2015-04-02 04:45:49,208 INFO org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /tmp/crunch-686245394/p2/output already exists
> at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
> {code}
> One possible cause is the choice of random directory name, which is based on 
> a random nonnegative 32-bit int. The chance of collision is more than 50% at 
> about 55,000 temp dirs, which is not unimaginable.
> A suggested fix, at least for that theoretical cause, is to generate a much 
> larger random value. 64 bits should put this firmly in the realm of extremely 
> improbable (billions, not tens of thousands).
> (HT [~wilfreds] / CC [~tomwhite])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
