[ 
https://issues.apache.org/jira/browse/VFS-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064341#comment-16064341
 ] 

Guido Schnepp commented on VFS-637:
-----------------------------------

A small update: I have another ZIP file here that is compressed with a method 
java.util.zip does not know. Search for the error message 
"java.util.zip.ZipException: invalid CEN header (bad signature)". Despite the 
message, the file itself is valid; it merely uses a compression method that 
j.u.zip does not support. It seems that any bug report regarding this is 
closed immediately with the not very helpful answer that java.util.zip only 
supports the compression methods DEFLATE and STORE, without further comment. 
Unfortunately, according to Wikipedia, many more compression methods are 
defined.

I will look into switching to commons-compress, even for my alternate 
OwnZipFileSystem.
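
To check which compression method such a file actually uses, one option is to read the method field straight from the first local file header. The following is only a minimal sketch: the class ZipMethodPeek and its methods are hypothetical names, and it assumes the data begins with a local file header (signature "PK\3\4", method id at byte offset 8 as a little-endian 16-bit value, per the ZIP APPNOTE). Self-extracting or empty archives do not start that way, and the sketch then reports -1.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of java.util.zip or commons-vfs2.
final class ZipMethodPeek {

    // Returns the compression method id stored in the first local file header,
    // or -1 if the data does not start with the local header signature "PK\3\4".
    static int firstEntryMethod(byte[] zipBytes) {
        if (zipBytes.length < 10
                || zipBytes[0] != 'P' || zipBytes[1] != 'K'
                || zipBytes[2] != 3 || zipBytes[3] != 4) {
            return -1;
        }
        // Bytes 8 and 9 hold the method as an unsigned little-endian 16-bit value.
        return (zipBytes[8] & 0xFF) | ((zipBytes[9] & 0xFF) << 8);
    }

    // java.util.zip handles only STORED (0) and DEFLATED (8).
    static boolean supportedByJavaUtilZip(int method) {
        return method == ZipEntry.STORED || method == ZipEntry.DEFLATED;
    }

    // Builds a tiny in-memory archive with the JDK, for demonstration only.
    static byte[] sampleDeflatedZip() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("hello.txt"));
            zip.write("hello".getBytes(StandardCharsets.US_ASCII));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

For a real file, the bytes could come from Files.readAllBytes, and a result other than 0 or 8 would suggest falling back to a library with broader method support.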


> Zip files with legacy encoding and special characters let VFS crash
> -------------------------------------------------------------------
>
>                 Key: VFS-637
>                 URL: https://issues.apache.org/jira/browse/VFS-637
>             Project: Commons VFS
>          Issue Type: Bug
>         Environment: Windows 10 64 Bit, Java 8
>            Reporter: Guido Schnepp
>              Labels: easyfix
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Oracle reworked the ZipFile class in Java 7. Since then, the default 
> constructor used by commons-vfs2 2.1 is more restrictive than it was in 
> Java 6. The ZipFile constructor now accepts a second parameter (Charset) to 
> specify explicitly the legacy charset to use if the ZIP file does not declare 
> its UTF-8 compliance internally. This affects all ZIP files that use a legacy 
> charset for filename encoding instead of UTF-8, which is common today. An 
> example is a ZIP file whose archived filenames contain German umlauts or 
> Russian characters.
> To support this new parameter with (more or less) default values, the class 
> org.apache.commons.vfs2.provider.zip.ZipFileSystem has to be extended with a 
> default charset parameter, getter or setter (as you like) that forwards this 
> setting to the java.util.zip.ZipFile constructor.
> My quick workaround was to create a new OwnZipFileProvider that refers to an 
> equally new OwnZipFileSystem (extending ZipFileSystem) with the following 
> modified method. The change is highlighted:
> {{    protected ZipFile createZipFile(final File file) throws 
> FileSystemException {
>               try {
>                       return new ZipFile(file{color:red}*, 
> Charset.forName("IBM437")*{color});
>               } catch (final IOException ioe) {
>                       throw new 
> FileSystemException("vfs.provider.zip/open-zip-file.error", file, ioe);
>               }
>       }
> }}
> Presetting charset 437 as the legacy default charset seems to be a good 
> workaround, as stated in Appendix D here: 
> https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT :
> "D.1 The ZIP format has historically supported only the original IBM PC 
> character encoding set, commonly referred to as IBM Code Page 437.  This 
> limits storing file name characters to only those within the original MS-DOS 
> range of values and does not properly support file names in other character 
> encodings, or  languages. [...]"
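
The effect of the Charset parameter can be tried out with the JDK alone, independent of commons-vfs2. The sketch below is only an illustration under stated assumptions: the class LegacyZipNames and its methods are hypothetical names, and it assumes the chosen charset (for example IBM437, which is an optional charset and may be missing from reduced runtimes) is available in the running JVM. It writes an archive whose entry name is encoded in a legacy charset and reads the name back with the two-argument ZipFile constructor.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of commons-vfs2.
final class LegacyZipNames {

    // Writes a one-entry archive whose entry name is encoded with the given charset.
    static void writeZip(File target, String entryName, Charset charset) throws IOException {
        try (OutputStream out = Files.newOutputStream(target.toPath());
             ZipOutputStream zip = new ZipOutputStream(out, charset)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(new byte[] {42});
            zip.closeEntry();
        }
    }

    // Lists entry names, decoding names not flagged as UTF-8 with the given charset.
    static List<String> listNames(File archive, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive, charset)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    // Writes a temporary archive, reads the names back, and removes the file again.
    static List<String> roundTrip(String entryName, Charset charset) {
        try {
            File tmp = File.createTempFile("legacy-names", ".zip");
            try {
                writeZip(tmp, entryName, charset);
                return listNames(tmp, charset);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling roundTrip with a name containing umlauts and Charset.forName("IBM437") should return that name unchanged. Opening the same archive with the single-argument ZipFile constructor would decode the name as UTF-8 instead, which is the situation described above.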



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
