[jira] [Resolved] (HBASE-28836) Parallelize the archival of compacted files

2025-02-18 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-28836.
-----------------------------------------
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.2
   2.5.12
   Resolution: Fixed

> Parallelize the archival of compacted files 
> 
>
> Key: HBASE-28836
> URL: https://issues.apache.org/jira/browse/HBASE-28836
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 2.5.10
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.2, 2.5.12
>
>
> When a region is split, HBase has to clean up the compacted files it has been 
> tracking for bookkeeping.
>  
> Currently we archive them sequentially, which is good enough on HDFS because 
> each archival is a fast rename. Against S3, where a rename is a copy-and-delete, 
> the same sequential loop becomes a problem. We need to parallelize the archival 
> to make it faster. 
> {code:java}
> for (File file : toArchive) {
>   // if it's a file, archive it
>   try {
>     LOG.trace("Archiving {}", file);
>     if (file.isFile()) {
>       // attempt to archive the file
>       if (!resolveAndArchiveFile(baseArchiveDir, file, startTime)) {
>         LOG.warn("Couldn't archive {} into backup directory: {}", file, baseArchiveDir);
>         failures.add(file);
>       }
>     } else {
>       // otherwise it's a directory and we need to archive all of its files
>       LOG.trace("{} is a directory, archiving children files", file);
>       // so we add the directory name to the base archive path
>       Path parentArchiveDir = new Path(baseArchiveDir, file.getName());
>       // and then get all the files from that directory and attempt to
>       // archive those too
>       Collection<File> children = file.getChildren();
>       failures.addAll(resolveAndArchive(fs, parentArchiveDir, children, start));
>     }
>   } catch (IOException e) {
>     LOG.warn("Failed to archive {}", file, e);
>     failures.add(file);
>   }
> } {code}
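For illustration, here is a minimal sketch of how a per-file archive loop like the one above could be parallelized with a fixed thread pool. This is not the actual HBASE-28836 patch; `archiveOne` is a hypothetical stand-in for `resolveAndArchiveFile`, and files are modeled as plain strings so the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelArchiveSketch {

  // Hypothetical stand-in for resolveAndArchiveFile: returns true on success.
  // Here we simulate a failure for any "file" whose name ends in ".bad".
  static boolean archiveOne(String file) {
    return !file.endsWith(".bad");
  }

  // Submit one archival task per file, then collect the names of the files
  // that failed to archive, preserving submission order.
  static List<String> archiveAll(List<String> toArchive, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (String file : toArchive) {
        // each task returns null on success, or the file name on failure
        futures.add(pool.submit(() -> archiveOne(file) ? null : file));
      }
      List<String> failures = new ArrayList<>();
      for (Future<String> f : futures) {
        try {
          String failed = f.get(); // blocks until this task completes
          if (failed != null) {
            failures.add(failed);
          }
        } catch (InterruptedException | ExecutionException e) {
          throw new RuntimeException(e);
        }
      }
      return failures;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> files = List.of("a.hfile", "b.hfile", "c.bad");
    System.out.println(archiveAll(files, 4));
  }
}
```

The key design point is that each file's archival is independent, so slow S3 operations can overlap; the failure list still comes back in a deterministic order because results are collected from the futures in submission order.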



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28836) Parallelize the archival of compacted files

2024-12-09 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28836.
--
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.5.11
   2.6.2
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Parallelize the archival of compacted files 
> 
>
> Key: HBASE-28836
> URL: https://issues.apache.org/jira/browse/HBASE-28836
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 2.5.10
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.5.11, 2.6.2
>


