[ https://issues.apache.org/jira/browse/HDDS-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sumit Agrawal resolved HDDS-12707.
----------------------------------
    Fix Version/s: 2.1.0
       Resolution: Fixed

> Recon - Streaming-Based Approach for Fetch and Extraction of Recon OM DB 
> Snapshot Tar SST files
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDDS-12707
>                 URL: https://issues.apache.org/jira/browse/HDDS-12707
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: Ozone Recon
>            Reporter: Devesh Kumar Singh
>            Assignee: Devesh Kumar Singh
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.1.0
>
>
> Instead of having Recon store the full TAR file and wait for the transfer to 
> complete, let's *extract files as they arrive* using 
> {{TarArchiveInputStream}}.
> This will:
>  * Reduce disk I/O
>  * Start processing sooner
>  * Avoid extra storage needs
> h4. Why This is More Efficient
> h5. No Temporary TAR File
>  * Directly extracts files *while streaming*, eliminating the need to 
> store the full TAR.
> h5. Starts Extracting Immediately
>  * No waiting for the full file to be received; extraction happens *as data 
> arrives*.
> h5. Lower Disk I/O & Storage Needs
>  * Removes the unnecessary {{FileUtils.copyInputStreamToFile()}} call.
>  * Avoids writing and re-reading the TAR file.
> h5. Handles Both Files & Directories
>  * Ensures correct directory structure before writing files.
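
The streaming idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: to stay free of external dependencies it parses the basic 512-byte tar header by hand, whereas in Recon the same loop would presumably iterate over entries from {{TarArchiveInputStream}}. The class name StreamingTarExtractor and its extract method are hypothetical, and only regular files and directories are handled (no long-name or pax extensions).

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: extracts a tar stream entry by entry, never storing the archive itself. */
public class StreamingTarExtractor {
  private static final int BLOCK = 512; // tar headers and data padding use 512-byte blocks

  /** Reads entries as bytes arrive and returns the number of regular files written. */
  public static int extract(InputStream in, Path destDir) throws IOException {
    DataInputStream data = new DataInputStream(in);
    Path root = destDir.toAbsolutePath().normalize();
    byte[] header = new byte[BLOCK];
    int filesWritten = 0;
    while (true) {
      try {
        data.readFully(header);
      } catch (EOFException e) {
        break; // stream ended without the usual zero-block trailer
      }
      if (header[0] == 0) {
        break; // an empty name marks the end-of-archive block
      }
      String name = field(header, 0, 100);                          // entry name
      long size = Long.parseLong(field(header, 124, 12).trim(), 8); // size in octal
      boolean isDir = header[156] == '5' || name.endsWith("/");     // typeflag '5' = directory
      Path target = root.resolve(name).normalize();
      if (!target.startsWith(root)) {
        throw new IOException("Entry escapes destination: " + name);
      }
      if (isDir) {
        Files.createDirectories(target);
      } else {
        Files.createDirectories(target.getParent()); // directory structure first
        try (OutputStream out = Files.newOutputStream(target)) {
          copy(data, out, size);
        }
        filesWritten++;
      }
      long padding = (BLOCK - size % BLOCK) % BLOCK; // data is padded to a full block
      data.readFully(new byte[(int) padding]);
    }
    return filesWritten;
  }

  /** Decodes a NUL-terminated ASCII header field. */
  private static String field(byte[] h, int off, int len) {
    int end = off;
    while (end < off + len && h[end] != 0) {
      end++;
    }
    return new String(h, off, end - off, StandardCharsets.US_ASCII);
  }

  /** Copies exactly size bytes of entry data from the stream to the output file. */
  private static void copy(InputStream in, OutputStream out, long size) throws IOException {
    byte[] buf = new byte[8192];
    long remaining = size;
    while (remaining > 0) {
      int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
      if (n < 0) {
        throw new EOFException("Truncated tar entry");
      }
      out.write(buf, 0, n);
      remaining -= n;
    }
  }
}
```

Because the input is consumed as a plain InputStream, the HTTP response body could be passed in directly, so each file lands on disk as soon as its bytes have been received and no intermediate TAR file is written.
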
> h4. Using Multithreading for Parallel Extraction
> To extract files in *parallel*, we need to:
>  # *Use a thread pool* to process multiple files at the same time.
>  # *Extract files asynchronously* while maintaining order and efficiency.
>  # *Ensure correct handling of directories before writing files.*
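
The three steps above might look roughly like the sketch below. It rests on one assumption worth stating: a tar stream can only be read sequentially, so a single reader thread pulls each entry's bytes off the stream and only the disk writes are handed to the thread pool. The class ParallelEntryWriter and its methods are hypothetical names chosen for illustration and are not taken from the Ozone code base.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: writes already-read archive entries to disk on a thread pool. */
public class ParallelEntryWriter implements AutoCloseable {
  private final Path root;
  private final ExecutorService pool;
  private final List<Future<Path>> pending = new ArrayList<>();

  public ParallelEntryWriter(Path destDir, int threads) {
    this.root = destDir.toAbsolutePath().normalize();
    this.pool = Executors.newFixedThreadPool(threads);
  }

  /** Called by the single reader thread, in archive order, once an entry's bytes are read. */
  public void submit(String entryName, byte[] content) throws IOException {
    Path target = root.resolve(entryName).normalize();
    if (!target.startsWith(root)) {
      throw new IOException("Entry escapes destination: " + entryName);
    }
    pending.add(pool.submit(() -> {
      // createDirectories tolerates a directory that already exists
      Files.createDirectories(target.getParent());
      Files.write(target, content);
      return target;
    }));
  }

  /** Waits for every write and returns the written paths in submission (archive) order. */
  public List<Path> awaitAll() throws IOException, InterruptedException {
    List<Path> written = new ArrayList<>();
    for (Future<Path> f : pending) {
      try {
        written.add(f.get());
      } catch (ExecutionException e) {
        throw new IOException("Failed to write entry", e.getCause());
      }
    }
    pending.clear();
    return written;
  }

  @Override
  public void close() {
    pool.shutdown();
  }
}
```

Buffering a whole entry in memory before submitting it trades memory for parallelism. With large SST files, a bounded queue or a cap on in-flight bytes would probably be needed, but that is a design choice the issue text does not settle.
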



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
