[ https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192217#comment-13192217 ]

Lawrence Simpson commented on HBASE-5210:
-----------------------------------------

@Todd:
Two questions about your solution:
1. If we were to form a file name from just the numeric digits of the task 
attempt ID, that would be 23 digits.  As I look at the file names for HBase 
tables, they seem to be 18-19 digits long.  Do you know if there are any 
assumptions made in other HBase code about the length of file names for store 
files?
2. In the unlikely event that there was a name conflict with an HFile created 
by a reducer, what should happen then?  (The job number looks like it might 
roll at 10000 jobs - I don't know if anyone has gotten that far without 
restarting Map/Reduce.)  

It still seems to me that the safest solution is a change to HFileOutputFormat 
to use a new output committer class that adds rename logic to 
moveTaskOutputs().  These changes could be implemented strictly in the HBase 
code tree without having to involve the underlying Hadoop implementation. 
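The rename logic proposed above could look something like the following minimal sketch. This is not the actual Hadoop FileOutputCommitter API; the class and method names here are hypothetical, and it uses plain java.nio.file in place of the Hadoop FileSystem API purely to illustrate the check-and-rename idea a custom committer's moveTaskOutputs() could apply:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating rename-on-conflict: before moving a
// task's HFile into the final output folder, check whether a file with
// the same name already exists there and, if so, pick a fresh name
// instead of silently overwriting (which is what the released
// moveTaskOutputs() does).
public class ConflictSafeMover {
    /** Returns a path in destDir that does not collide with an existing file. */
    public static Path resolveConflict(Path destDir, String fileName) {
        Path candidate = destDir.resolve(fileName);
        int suffix = 0;
        while (Files.exists(candidate)) {
            // Append a counter to the conflicting name; a real committer
            // might instead regenerate a random name and log a warning.
            candidate = destDir.resolve(fileName + "_" + (++suffix));
        }
        return candidate;
    }
}
```

A committer subclass would call such a check once per file as it moves task output into the job's final directory, so the second reducer's file lands under a new name rather than replacing the first reducer's HFile.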
                
> HFiles are missing from an incremental load
> -------------------------------------------
>
>                 Key: HBASE-5210
>                 URL: https://issues.apache.org/jira/browse/HBASE-5210
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.2
>         Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>            Reporter: Lawrence Simpson
>         Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>
> We run an overnight map/reduce job that loads data from an external source 
> and adds that data to an existing HBase table.  The input files have been 
> loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
> TotalOrderPartitioner) to create HFiles which are subsequently added to the 
> HBase table.  On at least two separate occasions (that we know of), a range 
> of output would be missing for a given day.  The range of keys for the 
> missing values corresponded to those of a particular region.  This implied 
> that a complete HFile somehow went missing from the job.  Further 
> investigation revealed the following:
> Two different reducers (running in separate JVMs and thus separate class 
> loaders) on the same server can end up using the same file names for their 
> HFiles.  The scenario is as follows:
>    1.      Both reducers start near the same time.
>    2.      The first reducer reaches the point where it wants to write its 
> first file.
>    3.      It uses the StoreFile class, which contains a static Random object 
> that is initialized by default using a timestamp.
>    4.      The file name is generated using the random number generator.
>    5.      The file name is checked against other existing files.
>    6.      The file is written into temporary files in a directory named 
> after the reducer attempt.
>    7.      The second reduce task reaches the same point, but its StoreFile 
> class (which is now in the file system's cache) gets loaded within the time 
> resolution of the OS and thus initializes its Random object with the same 
> seed as the first task.
>    8.      The second task also checks for an existing file with the name 
> generated by the random number generator and finds no conflict, because each 
> task is writing files in its own temporary folder.
>    9.      The first task finishes and gets its temporary files committed to 
> the "real" folder specified for output of the HFiles.
>    10.     The second task then reaches its own conclusion and commits its 
> files (moveTaskOutputs).  The released Hadoop code just overwrites any files 
> with the same name, with no warning messages or anything.  The first task's 
> HFiles just go missing.
> Note:  The reducers here are NOT different attempts at the same reduce task.  
> They are different reduce tasks, so data is really lost.
> I am currently testing a fix in which I have added code to the Hadoop 
> FileOutputCommitter.moveTaskOutputs method to check for a conflict with
> an existing file in the final output folder and to rename the HFile if
> needed.  This may not be appropriate for all uses of FileOutputFormat.
> So I have put this into a new class which is then used by a subclass of
> HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
> more of a problem due to private declarations.
> I don't know if my approach is the best fix for the problem.  If someone
> more knowledgeable than myself deems that it is, I will be happy to share
> what I have done and by that time I may have some information on the
> results.
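The seed collision at the heart of the scenario above is easy to reproduce: two java.util.Random instances constructed with the same seed emit identical sequences, so two JVMs that initialize StoreFile's static Random within the clock's resolution will generate the same "random" file names. The randomName helper below is a stand-in for illustration, not the actual StoreFile.getRandomFilename() implementation:

```java
import java.util.Random;

// Demonstrates the failure mode: same seed => same "random" file names.
public class SeedCollisionDemo {
    // Stand-in for a random hex file name generator (hypothetical,
    // not StoreFile's real implementation).
    public static String randomName(Random rand) {
        return Long.toHexString(rand.nextLong());
    }

    public static void main(String[] args) {
        long seed = System.currentTimeMillis();
        Random reducer1 = new Random(seed);
        Random reducer2 = new Random(seed); // second JVM, same timestamp seed
        System.out.println(randomName(reducer1).equals(randomName(reducer2))); // true
    }
}
```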

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
