wenhao created HBASE-29382:
------------------------------

             Summary: The always.copy.files parameter does not take effect in 
some bulkload scenarios
                 Key: HBASE-29382
                 URL: https://issues.apache.org/jira/browse/HBASE-29382
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 2.5.11, 2.0.0
            Reporter: wenhao
         Attachments: 1.jpg, 2.jpg, 3.jpg, 4.jpg

When bulk loading HFiles into a table whose region boundaries differ from 
those of the source table, the source HFiles must be split to match the 
target table's regions. In this case, even when -Dalways.copy.files is 
specified, the source table's HFiles are still cleaned up, and the region 
directory on HDFS is left without any recognizable HFiles, so the data 
becomes unavailable.

As shown in the attached images, after the bulkload the original HFile is 
deleted, while the split HFiles (under .tmp) are retained because of 
-Dalways.copy.files. However, the files under .tmp are not recognized. For 
example, after an unassign/assign of the region, its data volume becomes 0.
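
For anyone trying to reproduce this, the invocation would look roughly like 
the sketch below. The HFile directory and table name are hypothetical 
placeholders, not values from this report, and the sketch assumes the 2.x 
LoadIncrementalHFiles tool class, which reads always.copy.files from the 
configuration. It only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own paths and table name.
HFILE_DIR="${HFILE_DIR:-/tmp/source_table_hfiles}"  # HFiles laid out per column family
TARGET_TABLE="${TARGET_TABLE:-target_table}"        # table whose region boundaries differ

# always.copy.files is a configuration property, so it is passed as a -D
# option ahead of the positional arguments (HFile directory, table name).
CMD="hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles -Dalways.copy.files=true $HFILE_DIR $TARGET_TABLE"

# Print by default; set RUN=1 to execute against a real cluster.
if [ "${RUN:-0}" = "1" ]; then
  $CMD
else
  echo "$CMD"
fi
```

The issue should only show up when at least one HFile spans a region 
boundary of the target table, so that the loader has to split it. Listing 
the source directory and the target region directory on HDFS before and 
after the load shows whether the original HFiles were removed.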



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
