[
https://issues.apache.org/jira/browse/COMPRESS-540?focusedWorklogId=460703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460703
]
ASF GitHub Bot logged work on COMPRESS-540:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 18/Jul/20 09:04
Start Date: 18/Jul/20 09:04
Worklog Time Spent: 10m
Work Description: PeterAlfredLee commented on a change in pull request #113:
URL: https://github.com/apache/commons-compress/pull/113#discussion_r456765778
##########
File path:
src/main/java/org/apache/commons/compress/archivers/tar/TarArchiveInputStream.java
##########
@@ -1106,35 +917,8 @@ public int compare(final TarArchiveStructSparse p, final TarArchiveStructSparse
}
}
- if (sparseInputStreams.size() > 0) {
+ if (!sparseInputStreams.isEmpty()) {
currentSparseInputStreamIndex = 0;
}
}
-
- /**
- * This is an inputstream that always return 0,
- * this is used when reading the "holes" of a sparse file
- */
- private static class TarArchiveSparseZeroInputStream extends InputStream {
Review comment:
When I was writing it, I did think about making
`TarArchiveSparseZeroInputStream` a separate class, as you do now, instead of
an inner class. I made it a private inner class because I thought this class
could and should only be used here in `TarArchiveInputStream`.
I'm not sure whether making `TarArchiveSparseZeroInputStream` a separate class
is a good idea.
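To make the trade-off concrete, here is a minimal sketch of the private nested class option. The names `SparseReaderSketch`, `ZeroInputStream` and `readHole` are hypothetical stand-ins and not the code from the pull request; the sketch only assumes that the holes of a sparse file read as zero bytes, as the removed Javadoc says.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical holder class standing in for TarArchiveInputStream.
class SparseReaderSketch {

    // Private and nested: nothing outside the enclosing class can create it.
    private static final class ZeroInputStream extends InputStream {
        @Override
        public int read() {
            return 0; // every byte of a hole is zero
        }

        @Override
        public long skip(final long n) {
            return Math.max(0, n); // zeros never run out, so any skip succeeds
        }
    }

    // Reads `length` bytes of a hole; the returned array contains only zeros.
    static byte[] readHole(final int length) {
        final byte[] buf = new byte[length];
        try (InputStream in = new ZeroInputStream()) {
            int off = 0;
            while (off < length) {
                final int n = in.read(buf, off, length - off);
                if (n < 0) {
                    break;
                }
                off += n;
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf;
    }
}
```

Moving `ZeroInputStream` to its own file would let other classes reuse it, at the cost of widening its visibility beyond the one place that needs it today.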
##########
File path:
src/main/java/org/apache/commons/compress/utils/BoundedNIOInputStream.java
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+package org.apache.commons.compress.utils;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.ByteBuffer;
+
+/**
+ * NIO backed bounded input stream for reading a predefined amount of data from.
+ * @since 1.21
+ */
+public abstract class BoundedNIOInputStream extends InputStream {
Review comment:
`BoundedInputStream` is renamed to `BoundedNIOInputStream` here. But
whether it is an NIO input stream depends on the implementation of the
abstract method `read`, which is decided by the inheriting class.
I don't like the name `BoundedNIOInputStream` because the word 'NIO' is
confusing: a subclass has to implement an NIO read to make this stream an NIO
input stream.
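A rough sketch of the naming concern, using hypothetical classes (`BoundedStreamSketch`, `ArrayBackedBoundedStream`) rather than the ones in the pull request: the abstract base only enforces the bounds, and a subclass that copies from a plain byte array satisfies the same contract without touching any channel.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical base class: tracks the readable range, nothing else.
abstract class BoundedStreamSketch extends InputStream {
    private final long end;
    private long loc;

    BoundedStreamSketch(final long start, final long length) {
        this.loc = start;
        this.end = start + length;
    }

    @Override
    public int read() throws IOException {
        final byte[] one = new byte[1];
        final int n = read(one, 0, 1);
        return n <= 0 ? -1 : one[0] & 0xff;
    }

    @Override
    public int read(final byte[] b, final int off, final int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (loc >= end) {
            return -1; // the bounded range is exhausted
        }
        final int bounded = (int) Math.min(len, end - loc);
        final int n = read(loc, ByteBuffer.wrap(b, off, bounded));
        if (n > 0) {
            loc += n;
        }
        return n;
    }

    // Where the bytes come from is entirely up to the subclass.
    protected abstract int read(long pos, ByteBuffer buf) throws IOException;
}

// Hypothetical subclass that reads from a byte array, not from a channel.
final class ArrayBackedBoundedStream extends BoundedStreamSketch {
    private final byte[] data;

    ArrayBackedBoundedStream(final byte[] data, final long start, final long length) {
        super(start, length);
        this.data = data;
    }

    @Override
    protected int read(final long pos, final ByteBuffer buf) {
        if (pos >= data.length) {
            return -1;
        }
        final int n = Math.min(buf.remaining(), data.length - (int) pos);
        buf.put(data, (int) pos, n);
        return n;
    }
}
```

The only NIO type left in the base class is `ByteBuffer` in the signature of the abstract method, which is why a name that describes the bounding behavior may fit better than one that mentions NIO.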
##########
File path:
src/main/java/org/apache/commons/compress/utils/BoundedSeekableByteChannelInputStream.java
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+package org.apache.commons.compress.utils;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.SeekableByteChannel;
+
+/**
+ * InputStream that delegates requests to the underlying SeekableByteChannel, making sure that only bytes from a certain
+ * range can be read.
+ * @since 1.21
+ */
+public class BoundedSeekableByteChannelInputStream extends BoundedNIOInputStream {
Review comment:
It seems that `BoundedSeekableByteChannelInputStream`, which extends
`BoundedNIOInputStream`, is exactly the same as the original inner class
`BoundedInputStream` in `ZipFile`. It also looks like
`BoundedSeekableByteChannelInputStream` is only used in `ZipFile` now. Is that so?
##########
File path: src/main/java/org/apache/commons/compress/archivers/tar/TarFile.java
##########
@@ -0,0 +1,712 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+package org.apache.commons.compress.archivers.tar;
+
+import java.io.ByteArrayOutputStream;
+import java.io.Closeable;
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.ByteBuffer;
+import java.nio.channels.SeekableByteChannel;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.commons.compress.archivers.zip.ZipEncoding;
+import org.apache.commons.compress.archivers.zip.ZipEncodingHelper;
+import org.apache.commons.compress.utils.ArchiveUtils;
+import org.apache.commons.compress.utils.BoundedInputStream;
+import org.apache.commons.compress.utils.BoundedNIOInputStream;
+import org.apache.commons.compress.utils.BoundedSeekableByteChannelInputStream;
+import org.apache.commons.compress.utils.SeekableInMemoryByteChannel;
+
+/**
+ * The TarFile provides random access to UNIX to archives.
+ * @since 1.21
+ */
+public class TarFile implements Closeable {
Review comment:
Many methods in `TarFile` look very similar to the ones in
`TarArchiveInputStream` (e.g. `buildSparseInputStreams`, `getLongNameData`).
Maybe we can find some way to share these methods?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 460703)
Time Spent: 1h 10m (was: 1h)
> Random access on Tar archive
> ----------------------------
>
> Key: COMPRESS-540
> URL: https://issues.apache.org/jira/browse/COMPRESS-540
> Project: Commons Compress
> Issue Type: Improvement
> Reporter: Robin Schimpf
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> The TarArchiveInputStream only provides sequential access. If only a small
> number of files from the archive is needed, a large amount of data in the
> input stream needs to be skipped.
> Therefore I was working on an implementation to provide random access to
> TarFiles, similar to the ZipFile API. The basic idea behind the
> implementation is the following:
> * Random access is backed by a SeekableByteChannel
> * Read all headers of the tar file and save the position of the data for
> every header
> * User can request an input stream for any entry in the archive multiple
> times
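The three points above can be sketched roughly as follows. This is not the implementation from the pull request: `TarIndexSketch` and `Location` are hypothetical names, and the sketch assumes plain headers only (name in the first 100 bytes, octal size at offset 124, 512-byte blocks), with no long names, PAX headers or sparse entries. It also copies each entry into memory, where a real implementation would more likely hand out a bounded stream over the channel.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: index a tar archive once, then read entries at random.
final class TarIndexSketch {
    private static final int BLOCK = 512;

    // Position and size of one entry's data inside the archive.
    static final class Location {
        final long dataOffset;
        final long size;

        Location(final long dataOffset, final long size) {
            this.dataOffset = dataOffset;
            this.size = size;
        }
    }

    private final SeekableByteChannel channel;
    private final Map<String, Location> index = new LinkedHashMap<>();

    TarIndexSketch(final SeekableByteChannel channel) throws IOException {
        this.channel = channel;
        final ByteBuffer header = ByteBuffer.allocate(BLOCK);
        long pos = 0;
        // Simplification: a short read or a header whose first byte is zero ends the scan.
        while (readFully(pos, header) == BLOCK && header.get(0) != 0) {
            final byte[] h = header.array();
            final String name = field(h, 0, 100);
            final long size = Long.parseLong(field(h, 124, 12).trim(), 8);
            index.put(name, new Location(pos + BLOCK, size));
            // Entry data is padded to a multiple of the block size.
            pos += BLOCK + (size + BLOCK - 1) / BLOCK * BLOCK;
        }
    }

    // May be called any number of times for the same entry.
    InputStream getInputStream(final String name) throws IOException {
        final Location loc = index.get(name);
        if (loc == null) {
            throw new IOException("No such entry: " + name);
        }
        final ByteBuffer data = ByteBuffer.allocate((int) loc.size);
        readFully(loc.dataOffset, data);
        return new ByteArrayInputStream(data.array(), 0, data.position());
    }

    private int readFully(final long pos, final ByteBuffer buf) throws IOException {
        buf.clear();
        channel.position(pos);
        while (buf.hasRemaining() && channel.read(buf) > 0) {
            // keep reading until the buffer is full or the channel is exhausted
        }
        return buf.position();
    }

    // Decodes a NUL-terminated ASCII header field.
    private static String field(final byte[] h, final int off, final int len) {
        int end = off;
        while (end < off + len && h[end] != 0) {
            end++;
        }
        return new String(h, off, end - off, StandardCharsets.US_ASCII);
    }
}
```

Because the index stores offsets rather than stream state, requesting an entry never requires skipping over the data of the entries before it.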
--
This message was sent by Atlassian Jira
(v8.3.4#803005)