[ 
https://issues.apache.org/jira/browse/COMPRESS-540?focusedWorklogId=462522&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-462522
 ]

ASF GitHub Bot logged work on COMPRESS-540:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jul/20 11:49
            Start Date: 23/Jul/20 11:49
    Worklog Time Spent: 10m 
      Work Description: theobisproject commented on pull request #113:
URL: https://github.com/apache/commons-compress/pull/113#issuecomment-662962115


   I know you are working on this library in your free time, so I expected 
this big change to take some time before being integrated.
   
   >   IMHO this would be a nice addition even though I don't expect many 
people to require random access to a raw tar archive. Most people likely deal 
with tar archives that have been compressed and so you'd need to decompress it 
to a temporary file (or memory) for random access.
   
   As you pointed out, this only improves speed in some special cases, e.g. 
random access or something like the `TarLister`, where no content is read and 
the stream therefore has to read and discard all data.  
   For compressed archives specifically, it can even take longer, depending on 
where the file is decompressed to.
   
   >  have you considered making `TarArchiveEntry` implement 
`EntryStreamOffsets`?
   
   Seems like I missed this interface. Since it is exactly what I wanted to 
implement, I will change the code to implement the 
`EntryStreamOffsets` interface.
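
   To illustrate the idea, here is a minimal sketch of an entry type that exposes its data offset. It is not the code from the pull request: `SketchTarEntry` and `setDataOffset` are hypothetical names, and the interface is redeclared locally as a stand-in for `org.apache.commons.compress.archivers.EntryStreamOffsets` so the snippet compiles without the library.

```java
// Local stand-in for org.apache.commons.compress.archivers.EntryStreamOffsets,
// declared here only so the sketch compiles on its own.
interface EntryStreamOffsets {
    long OFFSET_UNKNOWN = -1;

    long getDataOffset();

    boolean isStreamContiguous();
}

// Hypothetical entry class; the real TarArchiveEntry has many more fields.
class SketchTarEntry implements EntryStreamOffsets {
    private final String name;
    private long dataOffset = OFFSET_UNKNOWN; // unknown until a header was parsed

    SketchTarEntry(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Called by whoever parses the header once the data position is known.
    void setDataOffset(long dataOffset) {
        if (dataOffset < 0) {
            throw new IllegalArgumentException("offset must be non-negative");
        }
        this.dataOffset = dataOffset;
    }

    @Override
    public long getDataOffset() {
        return dataOffset;
    }

    // Assumption: the data of a plain tar entry is one contiguous run of bytes.
    @Override
    public boolean isStreamContiguous() {
        return true;
    }
}
```

   With something like this, code that already understands the interface for zip entries could treat tar entries the same way.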
   
   > I'd be in favour of moving stream implementations to the utils package, in 
particular if you copy code from the zip package
   
   The code I moved out of the `ZipFile` class is already located in the utils 
package. I only kept the tar-specific streams out of the utils package 
because, in my opinion, nobody outside the library should access them, and 
currently no other archive format needs similar streams.
   
   > nitpicking: the methods you've moved to `TarUtils` probably don't need to 
be public.
   
   I agree and will make the methods that were moved to `TarUtils` 
non-public.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 462522)
    Time Spent: 1h 40m  (was: 1.5h)

> Random access on Tar archive
> ----------------------------
>
>                 Key: COMPRESS-540
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-540
>             Project: Commons Compress
>          Issue Type: Improvement
>            Reporter: Robin Schimpf
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The TarArchiveInputStream only provides sequential access. If only a small 
> number of files from the archive is needed, a large amount of data in the 
> input stream needs to be skipped.
> Therefore I was working on an implementation to provide random access to 
> TarFiles, similar to the ZipFile API. The basic idea behind the 
> implementation is the following:
>  * Random access is backed by a SeekableByteChannel
>  * Read all headers of the tar file and save the position of each entry's 
> data
>  * User can request an input stream for any entry in the archive multiple 
> times
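
The three points above can be sketched roughly as follows. This is an illustration only, not the code from the pull request: `TarIndexSketch`, `Location`, `index`, and `open` are hypothetical names, and the parser assumes plain ustar-style headers (name in bytes 0-99, size as octal ASCII in bytes 124-135, 512-byte blocks). Long names, sparse files, and base-256 sizes are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of header indexing over a SeekableByteChannel.
class TarIndexSketch {
    static final int BLOCK = 512; // tar header and padding block size

    // Where an entry's data starts in the channel and how long it is.
    static final class Location {
        final long dataOffset;
        final long size;

        Location(long dataOffset, long size) {
            this.dataOffset = dataOffset;
            this.size = size;
        }
    }

    // Walk the headers only, seeking past the data of every entry.
    static Map<String, Location> index(SeekableByteChannel ch) throws IOException {
        Map<String, Location> entries = new LinkedHashMap<>();
        ByteBuffer header = ByteBuffer.allocate(BLOCK);
        long pos = 0;
        while (pos + BLOCK <= ch.size()) {
            header.clear();
            ch.position(pos);
            while (header.hasRemaining() && ch.read(header) >= 0) {
                // keep reading until the header block is complete
            }
            byte[] h = header.array();
            if (h[0] == 0) {
                break; // simplification: an empty name marks the end of the archive
            }
            String name = field(h, 0, 100);
            long size = Long.parseLong(field(h, 124, 12).trim(), 8); // octal ASCII
            long dataOffset = pos + BLOCK;
            entries.put(name, new Location(dataOffset, size));
            // Entry data is padded to a multiple of 512 bytes.
            pos = dataOffset + (size + BLOCK - 1) / BLOCK * BLOCK;
        }
        return entries;
    }

    // Read a NUL-terminated ASCII field from the header block.
    static String field(byte[] h, int off, int len) {
        int end = off;
        while (end < off + len && h[end] != 0) {
            end++;
        }
        return new String(h, off, end - off, StandardCharsets.US_ASCII);
    }

    // Open a stream for any indexed entry, in any order, as often as needed.
    // Simplification: the entry is buffered in memory instead of being
    // streamed through a bounded view of the channel.
    static InputStream open(SeekableByteChannel ch, Location loc) throws IOException {
        ByteBuffer data = ByteBuffer.allocate((int) loc.size);
        ch.position(loc.dataOffset);
        while (data.hasRemaining() && ch.read(data) >= 0) {
            // keep reading until the entry is complete
        }
        return new ByteArrayInputStream(data.array(), 0, data.position());
    }
}
```

A caller would open the archive with `Files.newByteChannel(path)`, call `index` once, and then pass any `Location` to `open` without re-reading the preceding entries.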



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
