[ https://issues.apache.org/jira/browse/HBASE-28836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani resolved HBASE-28836.
----------------------------------
    Fix Version/s: 2.7.0
                   3.0.0-beta-2
                   2.5.11
                   2.6.2
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Parallelize the archival of compacted files
> --------------------------------------------
>
>                 Key: HBASE-28836
>                 URL: https://issues.apache.org/jira/browse/HBASE-28836
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 2.5.10
>            Reporter: Aman Poonia
>            Assignee: Aman Poonia
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.5.11, 2.6.2
>
> While splitting a region, HBase has to clean up the compacted files for bookkeeping.
>
> Currently we do this sequentially, which is good enough on HDFS because archiving a file there is a fast operation. When the same code runs against S3, it becomes an issue, so we need to parallelize the loop to make it faster.
> {code:java}
> for (File file : toArchive) {
>   // if it's a file, archive it
>   try {
>     LOG.trace("Archiving {}", file);
>     if (file.isFile()) {
>       // attempt to archive the file
>       if (!resolveAndArchiveFile(baseArchiveDir, file, startTime)) {
>         LOG.warn("Couldn't archive " + file + " into backup directory: " + baseArchiveDir);
>         failures.add(file);
>       }
>     } else {
>       // otherwise it's a directory, and we need to archive all of its files
>       LOG.trace("{} is a directory, archiving children files", file);
>       // so we add the directory name to the base archive path
>       Path parentArchiveDir = new Path(baseArchiveDir, file.getName());
>       // and then get all the files from that directory and attempt to
>       // archive those too
>       Collection<File> children = file.getChildren();
>       failures.addAll(resolveAndArchive(fs, parentArchiveDir, children, start));
>     }
>   } catch (IOException e) {
>     LOG.warn("Failed to archive {}", file, e);
>     failures.add(file);
>   }
> } {code}
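For reference, a minimal sketch of what parallelizing this loop could look like. This is illustrative only, not the committed patch: the class name, the pool sizing, and the archiveOne stand-in for resolveAndArchiveFile are all assumptions.

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelArchiveSketch {

  // Hypothetical stand-in for the per-file move done by resolveAndArchiveFile;
  // returns true when the file was archived successfully.
  static boolean archiveOne(String file) {
    return !file.isEmpty(); // pretend empty names fail, to exercise the failure path
  }

  // Fan the sequential archive loop out over a fixed-size pool and collect
  // the files that could not be archived, like the original "failures" list.
  static List<String> archiveInParallel(List<String> toArchive, int poolSize)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    List<String> failures = Collections.synchronizedList(new ArrayList<>());
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (String file : toArchive) {
        pending.add(pool.submit(() -> {
          // each task archives one file; any error marks the file as failed,
          // mirroring the IOException branch of the sequential loop
          try {
            if (!archiveOne(file)) {
              failures.add(file);
            }
          } catch (Exception e) {
            failures.add(file);
          }
        }));
      }
      // wait for every task to finish before reporting failures
      for (Future<?> f : pending) {
        try {
          f.get();
        } catch (Exception ignored) {
          // the task already recorded its own failure
        }
      }
    } finally {
      pool.shutdown();
    }
    return failures;
  }

  public static void main(String[] args) throws Exception {
    List<String> files = List.of("hfile-1", "hfile-2", "");
    System.out.println("failed to archive: " + archiveInParallel(files, 4));
  }
}
{code}

On S3 a "rename" is a copy plus delete rather than a cheap metadata operation, so issuing the per-file moves concurrently hides most of that latency; the pool size would need to be bounded so that a region with many compacted files does not flood the filesystem with requests.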