[ 
https://issues.apache.org/jira/browse/HBASE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Dougan updated HBASE-8367:
--------------------------------

    Status: Patch Available  (was: Open)

Attached a patch with the proposed changes against trunk.
                
> LoadIncrementalHFiles does not return an error code or throw Exception when 
> failures occur due to timeouts.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8367
>                 URL: https://issues.apache.org/jira/browse/HBASE-8367
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.92.2, 0.92.1
>         Environment: Red Hat 6.2 
> Java 1.6.0_26
> Hadoop 2.0.0-mr1-cdh4.1.1
> HBase 0.92.1-cdh4.1.1
>            Reporter: Brian Dougan
>            Priority: Minor
>             Fix For: 0.94.8
>
>         Attachments: LoadIncrementalHFiles-HBASE-8367.patch
>
>
> The LoadIncrementalHFiles (completebulkload) command exits with a success 
> code (and throws no Exception) even when one or more HFiles fail to be 
> imported. This can happen in several ways, mainly when timeouts occur.  The 
> command only writes error messages to the log, which makes it difficult to 
> automate the import of HFiles programmatically.
> The heart of the LoadIncrementalHFiles class (doBulkLoad) returns void and 
> has essentially the following structure.
> {code:title=LoadIncrementalHFiles.java}
> try {
>   ...
> } finally {
>   pool.shutdown();
>   if (queue != null && !queue.isEmpty()) {
>     StringBuilder err = new StringBuilder();
>     err.append("-------------------------------------------------\n");
>     err.append("Bulk load aborted with some files not yet loaded:\n");
>     err.append("-------------------------------------------------\n");
>     for (LoadQueueItem q : queue) {
>       err.append("  ").append(q.hfilePath).append('\n');
>     }
>     LOG.error(err);
>   }
> }
> {code}
> As you can see, instead of returning an error code, a success indicator, or 
> simply throwing an Exception, an error message is sent to the log.  This 
> results in something like the following in the logs.
> {quote}
> Bulk load aborted with some files not yet loaded:
> -------------------------------------------------
>   hdfs://prmdprod/user/userxxx/hfile/TABLE-1365721885510/record/_tmp/TABLE,2.bottom
>   hdfs://prmdprod/user/userxxx/hfile/TABLE-1365721885510/record/_tmp/TABLE,2.top
>   hdfs://prmdprod/user/userxxx/hfile/TABLE-1365721885510/record/_tmp/TABLE,1.bottom
>   hdfs://prmdprod/user/userxxx/hfile/TABLE-1365721885510/record/_tmp/TABLE,1.top
> {quote}
> Without some sort of indication, it's not currently possible to chain this 
> command to another or to programmatically consume this class and be certain 
> of a successful import.
> This class should be enhanced to return non-success in whatever way makes 
> sense to the community.  I don't really have a strong preference, but one of 
> the following should work fine (at least for my needs).
> * boolean return value on doBulkLoad (non-zero exit code from the run method 
> on failure)
> * Response object on doBulkLoad detailing the files that failed (non-zero 
> exit code from the run method)
> * throw an Exception in the finally block when files failed, after the error 
> is written to the log (should automatically cause a non-zero exit code from 
> the run method)
> It would also be nice to get this into the 0.94.x stream so it gets included 
> in the next Cloudera release.  Thanks!
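
As an illustration of the third option in the description (failing with an Exception when files remain in the queue, which the run method turns into a non-zero exit code), here is a minimal sketch written against plain Java collections rather than HBase. The names BulkLoadSketch, BulkLoadIncompleteException, and the maxLoads parameter are hypothetical; none of this is taken from the attached patch or from the HBase API.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for LoadIncrementalHFiles; not the HBase class or the attached patch.
class BulkLoadSketch {

  // Hypothetical exception type that carries the paths left in the queue.
  static class BulkLoadIncompleteException extends IOException {
    final List<String> unloadedPaths;

    BulkLoadIncompleteException(List<String> unloadedPaths) {
      super("Bulk load aborted with " + unloadedPaths.size() + " file(s) not yet loaded");
      this.unloadedPaths = unloadedPaths;
    }
  }

  // Simulates doBulkLoad: loads at most maxLoads items (an assumption standing in
  // for a timeout), then reports anything still queued by throwing.
  static void doBulkLoad(Deque<String> queue, int maxLoads) throws IOException {
    int loaded = 0;
    while (!queue.isEmpty() && loaded < maxLoads) {
      queue.poll(); // pretend this HFile was loaded successfully
      loaded++;
    }
    if (!queue.isEmpty()) {
      List<String> remaining = new ArrayList<>(queue);
      // Log first, as the existing code does, then fail loudly.
      System.err.println("Bulk load aborted with some files not yet loaded: " + remaining);
      throw new BulkLoadIncompleteException(remaining);
    }
  }

  // Mirrors a Tool-style run method: 0 on success, non-zero on failure.
  static int run(Deque<String> queue, int maxLoads) {
    try {
      doBulkLoad(queue, maxLoads);
      return 0;
    } catch (IOException e) {
      return 1;
    }
  }

  public static void main(String[] args) {
    Deque<String> queue = new ArrayDeque<>(Arrays.asList("TABLE,1.bottom", "TABLE,1.top"));
    System.exit(run(queue, 1)); // one file is left unloaded, so the exit code is non-zero
  }
}
```

The sketch checks the queue after the loading loop rather than inside a finally block, because an exception thrown from finally would replace any exception already propagating from the try body. Which placement suits doBulkLoad is a question for the patch review.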

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
