[jira] [Created] (COMPRESS-394) Zip - Local `Version Needed To Extract` does not match Central Directory

2017-05-13 Thread Plamen Totev (JIRA)
Plamen Totev created COMPRESS-394:
-

 Summary: Zip - Local `Version Needed To Extract` does not match 
Central Directory
 Key: COMPRESS-394
 URL: https://issues.apache.org/jira/browse/COMPRESS-394
 Project: Commons Compress
  Issue Type: Bug
  Components: Archivers
Reporter: Plamen Totev


Hi,

This is a follow-up on an issue reported against Plexus Archiver -
https://github.com/codehaus-plexus/plexus-archiver/issues/57

Plexus Archiver uses {{ZipArchiveOutputStream}} to create zip archives. It
constructs the {{ZipArchiveOutputStream}} on top of a {{BufferedOutputStream}}.
As a result the output does not provide random access and additional data
descriptor records are added. Unfortunately this leads to different values
being set for the {{version needed to extract}} field in the local file header
and in the central directory. It looks like the root cause is the way the local
header {{version needed to extract}} field value is calculated:
{code:java}
if (phased && !isZip64Required(entry.entry, zip64Mode)) {
    putShort(INITIAL_VERSION, buf, LFH_VERSION_NEEDED_OFFSET);
} else {
    putShort(versionNeededToExtract(zipMethod, hasZip64Extra(ze)), buf,
             LFH_VERSION_NEEDED_OFFSET);
}
{code}
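
For context, here is a minimal sketch of the construction pattern that triggers
this code path (the file name is illustrative): wrapping the target in a plain
{{OutputStream}} such as {{BufferedOutputStream}} means
{{ZipArchiveOutputStream}} cannot seek back to patch the local file header, so
it falls back to writing data descriptor records.

{code:java}
import java.io.BufferedOutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream;

// Stream-backed construction: the output is not seekable, so sizes and CRC
// cannot be patched into the local file header after the entry is written.
ZipArchiveOutputStream zos = new ZipArchiveOutputStream(
        new BufferedOutputStream(Files.newOutputStream(Paths.get("out.zip"))));
{code}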

As the local file header snippet above shows, the need for data descriptors is
not taken into account. On the other hand, when the central directory is
created, the following is used to determine the minimum required version:

{code:java}
private int versionNeededToExtract(final int zipMethod, final boolean zip64) {
    if (zip64) {
        return ZIP64_MIN_VERSION;
    }
    // requires version 2 as we are going to store length info
    // in the data descriptor
    return isDeflatedToOutputStream(zipMethod)
        ? DATA_DESCRIPTOR_MIN_VERSION
        : INITIAL_VERSION;
}
{code}

As a side note: I'm not a zip expert by any means so I could be wrong, but my
understanding is that if Deflate compression is used then the minimum required
version should be 2.0 regardless of whether data descriptors are used or not.
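
For what it's worth, one hypothetical shape for a consistent calculation - a
sketch of the idea only, not a tested patch, and the extra
{{usesDataDescriptor}} parameter is my invention - would be to let both the
local file header and the central directory derive the value from a single
helper that also knows whether a data descriptor will be written:

{code:java}
// Hypothetical sketch, not the actual fix: one helper for both headers.
private int versionNeededToExtract(final int zipMethod, final boolean zip64,
                                   final boolean usesDataDescriptor) {
    if (zip64) {
        return ZIP64_MIN_VERSION;
    }
    // a data descriptor requires at least version 2.0 of the format
    if (usesDataDescriptor || isDeflatedToOutputStream(zipMethod)) {
        return DATA_DESCRIPTOR_MIN_VERSION;
    }
    return INITIAL_VERSION;
}
{code}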





[jira] [Updated] (COMPRESS-394) Zip - Local `Version Needed To Extract` does not match Central Directory

2017-05-13 Thread Plamen Totev (JIRA)

 [ 
https://issues.apache.org/jira/browse/COMPRESS-394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Totev updated COMPRESS-394:
--
Priority: Minor  (was: Major)

> Zip - Local `Version Needed To Extract` does not match Central Directory
> 
>
> Key: COMPRESS-394
> URL: https://issues.apache.org/jira/browse/COMPRESS-394
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Reporter: Plamen Totev
>Priority: Minor
>





[jira] [Created] (COMPRESS-395) Zip - Do not add data descriptor record when CRC and size value are known

2017-05-13 Thread Plamen Totev (JIRA)
Plamen Totev created COMPRESS-395:
-

 Summary: Zip - Do not add data descriptor record when CRC and size 
value are known
 Key: COMPRESS-395
 URL: https://issues.apache.org/jira/browse/COMPRESS-395
 Project: Commons Compress
  Issue Type: Improvement
Reporter: Plamen Totev
Priority: Minor


Hi,

Currently {{ZipArchiveOutputStream}} adds a data descriptor record when the
output does not provide random access. But if you add an entry using
{{addRawArchiveEntry}} then the CRC, compressed size and uncompressed size may
already be known, and there is no need for a data descriptor record as those
values can be set in the local file header. The current implementation does
both - it sets the correct values in the local file header and adds an
additional data descriptor record. Here is the relevant code from
{{ZipArchiveOutputStream#putArchiveEntry}}:
{code:java}
// just a placeholder, real data will be in data
// descriptor or inserted later via SeekableByteChannel
ZipEightByteInteger size = ZipEightByteInteger.ZERO;
ZipEightByteInteger compressedSize = ZipEightByteInteger.ZERO;
if (phased) {
    size = new ZipEightByteInteger(entry.entry.getSize());
    compressedSize = new ZipEightByteInteger(entry.entry.getCompressedSize());
} else if (entry.entry.getMethod() == STORED
        && entry.entry.getSize() != ArchiveEntry.SIZE_UNKNOWN) {
    // actually, we already know the sizes
    size = new ZipEightByteInteger(entry.entry.getSize());
    compressedSize = size;
}
z64.setSize(size);
z64.setCompressedSize(compressedSize);
{code}

Maybe {{ZipArchiveOutputStream}} could be improved to not add a data descriptor
record when the CRC and size values are known in advance.
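
To illustrate the scenario, here is a sketch (names and values are
illustrative, and {{zos}} stands for a {{ZipArchiveOutputStream}} over a
non-seekable stream): an entry added via {{addRawArchiveEntry}} already carries
its CRC and both sizes, so the local file header can be written with final
values up front.

{code:java}
// Sketch of the raw-copy scenario (names and values are illustrative); the
// raw deflated bytes would normally be copied out of another archive.
long crc = 0x3610a686L, size = 1024L, compressedSize = 512L;
InputStream rawDeflated = Files.newInputStream(Paths.get("entry.deflated"));

ZipArchiveEntry entry = new ZipArchiveEntry("copied.txt");
entry.setMethod(ZipEntry.DEFLATED);
entry.setCrc(crc);                        // known up front
entry.setSize(size);                      // known up front
entry.setCompressedSize(compressedSize);  // known up front

// All three values are final before a single byte is written, so the data
// descriptor record that currently follows the entry adds no information.
zos.addRawArchiveEntry(entry, rawDeflated);
{code}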





[jira] [Updated] (COMPRESS-395) Zip - Do not add data descriptor record when CRC and size values are known

2017-05-13 Thread Plamen Totev (JIRA)

 [ 
https://issues.apache.org/jira/browse/COMPRESS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Totev updated COMPRESS-395:
--
Summary: Zip - Do not add data descriptor record when CRC and size values 
are known  (was: Zip - Do not add data descriptor record when CRC and size 
value are known)

> Zip - Do not add data descriptor record when CRC and size values are known
> --
>
> Key: COMPRESS-395
> URL: https://issues.apache.org/jira/browse/COMPRESS-395
> Project: Commons Compress
>  Issue Type: Improvement
>Reporter: Plamen Totev
>Priority: Minor
>





[jira] [Commented] (COMPRESS-394) [Zip] Local `Version Needed To Extract` does not match Central Directory

2017-05-16 Thread Plamen Totev (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012922#comment-16012922
 ] 

Plamen Totev commented on COMPRESS-394:
---

Stefan, thank you for looking into this and COMPRESS-395 - I really appreciate
it and the fast reaction. I know that neither issue has a big impact, but some
tools still issue warnings about these inconsistencies.

I really would love to help with patches, but I'm in the middle of writing my
master's thesis, so I'm afraid I don't have the time :(

> [Zip] Local `Version Needed To Extract` does not match Central Directory
> 
>
> Key: COMPRESS-394
> URL: https://issues.apache.org/jira/browse/COMPRESS-394
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Reporter: Plamen Totev
>Priority: Minor
>  Labels: zip
> Fix For: 1.15
>
>





[jira] [Commented] (COMPRESS-394) [Zip] Local `Version Needed To Extract` does not match Central Directory

2017-05-22 Thread Plamen Totev (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020108#comment-16020108
 ] 

Plamen Totev commented on COMPRESS-394:
---

Thanks. I've tried it and the version fields are now consistent.

> [Zip] Local `Version Needed To Extract` does not match Central Directory
> 
>
> Key: COMPRESS-394
> URL: https://issues.apache.org/jira/browse/COMPRESS-394
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Reporter: Plamen Totev
>Priority: Minor
>  Labels: zip
> Fix For: 1.15
>
>





[jira] [Commented] (COMPRESS-395) [Zip] Do not add data descriptor record when CRC and size values are known

2017-05-22 Thread Plamen Totev (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020120#comment-16020120
 ] 

Plamen Totev commented on COMPRESS-395:
---

Thanks. I can confirm that it's fixed.

> [Zip] Do not add data descriptor record when CRC and size values are known
> --
>
> Key: COMPRESS-395
> URL: https://issues.apache.org/jira/browse/COMPRESS-395
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Plamen Totev
>Priority: Minor
>  Labels: zip
> Fix For: 1.15
>
>





[jira] [Created] (COMPRESS-375) Allow the clients of ParallelScatterZipCreator to provide ZipArchiveEntryRequestSupplier

2016-11-30 Thread Plamen Totev (JIRA)
Plamen Totev created COMPRESS-375:
-

 Summary: Allow the clients of ParallelScatterZipCreator to provide 
ZipArchiveEntryRequestSupplier
 Key: COMPRESS-375
 URL: https://issues.apache.org/jira/browse/COMPRESS-375
 Project: Commons Compress
  Issue Type: Improvement
  Components: Archivers
Reporter: Plamen Totev


Currently clients of {{ParallelScatterZipCreator}} can provide a
{{ZipArchiveEntry}} and an {{InputStreamSupplier}} through
{{ParallelScatterZipCreator#addArchiveEntry}}. From those two a
{{ZipArchiveEntryRequest}} is created. Providing an {{InputStreamSupplier}}
solves the problem of opening too many files - streams are opened just in time,
when an entry is compressed, not when it is submitted.
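
For context, a minimal sketch of the current API (the file name is
illustrative); note that the entry has to be fully described at submission time
and only the stream itself is deferred:

{code:java}
ParallelScatterZipCreator creator = new ParallelScatterZipCreator();

ZipArchiveEntry entry = new ZipArchiveEntry("data.txt");
entry.setMethod(ZipEntry.DEFLATED); // must be set before submission

// Only the InputStream is deferred; it is opened when the entry is compressed.
creator.addArchiveEntry(entry, () -> {
    try {
        return Files.newInputStream(Paths.get("data.txt"));
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
});
{code}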

But there are use cases in which the stream itself carries information needed
to build the {{ZipArchiveEntry}}. In those cases creating the
{{ZipArchiveEntry}} before the {{InputStream}} is opened won't work. An option
to supply both the {{ZipArchiveEntry}} and the {{InputStreamSupplier}} together
(a {{ZipArchiveEntryRequest}}) would solve the issue.

There is a bug in Plexus Archiver
(https://github.com/codehaus-plexus/plexus-archiver/issues/53) that is an
example of such a use case. Plexus Archiver has an option that allows entries
that are already zip files to be stored instead of compressed
({{AbstractZipArchiver.recompressAddedZips}}). To detect whether a given entry
is a zip archive, {{AbstractZipArchiver}} has to read the first several bytes
of the stream. So creating the {{ZipArchiveEntry}} before the stream is opened
is not useful - the compression mode is not yet known. Opening the stream when
the {{ZipArchiveEntry}} is created won't work either: entries can be added to
{{ParallelScatterZipCreator}} much faster than they can be compressed, so too
many files could end up open at once. And I don't think opening and closing the
stream twice is an option, as such operations could be relatively expensive in
the general case. But if the client could supply both the {{ZipArchiveEntry}}
and the {{InputStream}} just in time (by passing a
{{ZipArchiveEntryRequestSupplier}} to {{ParallelScatterZipCreator}}) then the
problem is solved.

What do you think? Does the addition of
{{ParallelScatterZipCreator#addArchiveEntry(ZipArchiveEntryRequestSupplier)}}
make sense?
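
For illustration, a sketch of how the proposed overload might look from the
caller's side - the exact shape is of course up for discussion, and
{{openAndSniff}}/{{looksLikeZip}} are hypothetical helpers standing in for
Plexus Archiver's detection logic:

{code:java}
// Hypothetical usage of the proposed method: both the entry and the stream
// are produced just in time, right before the entry is compressed.
creator.addArchiveEntry(() -> {
    InputStream in = openAndSniff("maybe-a-zip.jar");  // hypothetical helper
    ZipArchiveEntry entry = new ZipArchiveEntry("maybe-a-zip.jar");
    // store already-compressed archives, deflate everything else
    entry.setMethod(looksLikeZip(in) ? ZipEntry.STORED : ZipEntry.DEFLATED);
    return ZipArchiveEntryRequest.createZipArchiveEntryRequest(entry, () -> in);
});
{code}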





[jira] [Updated] (COMPRESS-375) Allow the clients of ParallelScatterZipCreator to provide ZipArchiveEntryRequestSupplier

2016-11-30 Thread Plamen Totev (JIRA)

 [ 
https://issues.apache.org/jira/browse/COMPRESS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Totev updated COMPRESS-375:
--
Description: (minor wording edits; the description text is otherwise identical
to the original report above)


> Allow the clients of ParallelScatterZipCreator to provide 
> ZipArchiveEntryRequestSupplier
> 
>
> Key: COMPRESS-375
> URL: https://issues.apache.org/jira/browse/COMPRESS-375
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Plamen Totev
>

[jira] [Updated] (COMPRESS-375) Allow the clients of ParallelScatterZipCreator to provide ZipArchiveEntryRequestSupplier

2016-11-30 Thread Plamen Totev (JIRA)

 [ 
https://issues.apache.org/jira/browse/COMPRESS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Totev updated COMPRESS-375:
--
Description: (minor wording edits; the description text is otherwise identical
to the original report above)


> Allow the clients of ParallelScatterZipCreator to provide 
> ZipArchiveEntryRequestSupplier
> 
>
> Key: COMPRESS-375
> URL: https://issues.apache.org/jira/browse/COMPRESS-375
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Plamen Totev
>