[ 
https://issues.apache.org/jira/browse/HBASE-15271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-15271:
--------------------------------
      Resolution: Fixed
    Release Note: 
When using the bulk load helper provided by the hbase-spark module, output 
files will now be written into temporary files and only made available when the 
executor has successfully completed.

Previously, failed executors would leave their files in place, where a bulk 
load command would pick them up. As a result, retries after a failure could 
include spurious copies of some cells.
          Status: Resolved  (was: Patch Available)

Pushed. Thanks Ted M for the fix. Please close your reviewboard as "submitted".

> Spark Bulk Load: Need to write HFiles to tmp location then rename to protect 
> from Spark Executor Failures
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15271
>                 URL: https://issues.apache.org/jira/browse/HBASE-15271
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15271.1.patch, HBASE-15271.2.patch, 
> HBASE-15271.3.patch, HBASE-15271.4.patch
>
>
> With the current code, if an executor fails before the HFile is closed, it 
> will cause problems.  This JIRA changes the writer to first write to a file 
> whose name starts with an underscore.  When the HFile is complete, it is 
> renamed and the underscore is removed.
> The underscore is important because the bulk load functionality skips 
> files whose names start with an underscore.
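
The write-then-rename pattern described in the issue can be sketched roughly as
follows. This is a minimal illustration on a local file system, not the code
from the patch: the function names (write_then_rename, loadable_files,
failing_chunks, demo) and the file names are hypothetical, and the real
hbase-spark helper works on HFiles through Hadoop's file system API rather
than Python's os module.

```python
import os
import tempfile


def write_then_rename(directory, name, chunks):
    """Write chunks to '_<name>', then rename to '<name>' once the write finishes."""
    tmp_path = os.path.join(directory, "_" + name)
    final_path = os.path.join(directory, name)
    with open(tmp_path, "wb") as out:
        for chunk in chunks:  # an exception here leaves only the '_' file behind
            out.write(chunk)
    os.replace(tmp_path, final_path)  # reached only after a complete write
    return final_path


def loadable_files(directory):
    """Stand-in for a loader that skips names starting with an underscore."""
    return sorted(n for n in os.listdir(directory) if not n.startswith("_"))


def failing_chunks():
    """Yield one chunk, then fail, imitating an executor dying mid-write."""
    yield b"partial"
    raise RuntimeError("simulated executor failure")


def demo():
    """Run one successful write and one failed write; report what a loader would see."""
    with tempfile.TemporaryDirectory() as d:
        write_then_rename(d, "hfile-a", [b"row1", b"row2"])
        try:
            write_then_rename(d, "hfile-b", failing_chunks())
        except RuntimeError:
            pass  # the partial file stays under its underscore name
        return loadable_files(d), sorted(os.listdir(d))
```

In this sketch the interrupted write leaves only the underscore-prefixed file
on disk, so the stand-in loader never sees it, while the completed file
appears under its final name.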


