[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-28 Thread Wurstbrot mit Senf (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218158#comment-13218158
 ] 

Wurstbrot mit Senf commented on COMPRESS-176:
-

Seems to be OK. Got a directory ‖ with the file ‖.txt in it.

 ArchiveInputStream#getNextEntry(): Problems with WinZip directories with 
 Umlauts
 

 Key: COMPRESS-176
 URL: https://issues.apache.org/jira/browse/COMPRESS-176
 Project: Commons Compress
  Issue Type: Bug
  Components: Archivers
Affects Versions: 1.3
 Environment: Windows 7
Reporter: Wurstbrot mit Senf
Assignee: Stefan Bodewig
 Fix For: 1.4

 Attachments: MkZip.java, test-7zip.zip, test-doublevertical.zip, 
 test-windows.zip, test-winzip.zip, testzap-winzip.zip


 There is a problem when handling a WinZip-created zip with Umlauts in 
 directories.
 I'm accessing a zip file created with WinZip containing a directory with an 
 umlaut (ä) with ArchiveInputStream. When creating the zip file the 
 unicode-flag of winzip had been active.
 The following problem occurs when accessing the entries of the zip:
 the ArchiveEntry for a directory containing an umlaut is not marked as a 
 directory and the file names for the directory and all files contained in 
 that directory contain backslashes instead of slashes (i.e. completely 
 different to all other files in directories with no umlaut in their path).
 There is no difference when letting the ArchiveStreamFactory decide which 
 ArchiveInputStream to create or when using the ZipArchiveInputStream 
 constructor with the correct encoding (I've tried different encodings CP437, 
 CP850, ISO-8859-15, but still the problem persisted).
 This problem does not occur when using the very same zip file but compressed 
 by 7zip or the built-in Windows 7 zip functionality.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-28 Thread Wurstbrot mit Senf (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218161#comment-13218161
 ] 

Wurstbrot mit Senf commented on COMPRESS-176:
-

But 7Zip and the Windows built-in zip both create a directory named %U2016 
with a file named %U2016.txt in it.





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-27 Thread Stefan Bodewig (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217122#comment-13217122
 ] 

Stefan Bodewig commented on COMPRESS-176:
-

Whether we need forward slashes in Unicode extra fields can only be answered by 
somebody using WinZIP.  The best approach would be to create a test archive 
with a directory whose name contains a character that is not part of CP437 - 
and, to be safe, not part of the platform's default encoding either.





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-27 Thread Stefan Bodewig (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217899#comment-13217899
 ] 

Stefan Bodewig commented on COMPRESS-176:
-

Workaround and tests are in svn revision 1294460.

I'll look into creating a test archive for the opposite direction today.





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-27 Thread Wurstbrot mit Senf (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217974#comment-13217974
 ] 

Wurstbrot mit Senf commented on COMPRESS-176:
-

Hi all, sounds promising. Thanks a lot, I'm looking forward to the next release.

And by the way, how could you tell that the name's a fake? ;-)






[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-26 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216739#comment-13216739
 ] 

Sebb commented on COMPRESS-176:
---

Excellent!
Since \ and / are not allowed in file or folder names on Windows systems, there 
should be no case where a \ is incorrectly replaced.
And it would still work if Winzip fixes its implementation to use /, and would 
also work with other applications that use / for the extra fields.

==

There's still potentially the reverse problem - can Winzip handle / in the 
unicode extra field, or does it expect only \ ?
If it expects only \, then I guess we might need to make the generated extra 
fields configurable to use \.
I don't have the required version of Winzip to check that.





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-25 Thread Stefan Bodewig (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216446#comment-13216446
 ] 

Stefan Bodewig commented on COMPRESS-176:
-

AFAIK what we have written down based on findings by Wolfgang Glas in 
http://commons.apache.org/compress/zip.html still stands: WinZIP is the only 
one using Unicode extra fields, all other implementations have switched to the 
language encoding flag.  The only exceptions are Windows compressed folders, 
which don't understand either, and InfoZIP based tools if they are compiled 
to use the extra fields.

A question to the original reporter (I'm German so I know the name's a fake 
8-): since you also have an installation of 7zip, what does 7zip think of your 
WinZIP created archive?





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-25 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216468#comment-13216468
 ] 

Sebb commented on COMPRESS-176:
---

I have 7zip installed, and it reads the archive OK.

However, I don't think that proves anything, since the plain names are correct.

I guess we could look at the 7zip source code to see if it uses the extra 
fields.

A better test would be to create a zip file using a filename that cannot be 
represented in CP437, i.e. only the extra field would show the correct name.
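
Whether a candidate file name qualifies for such a test can be checked before building the archive. The following is only a minimal sketch, not part of Commons Compress: the class and method names are made up for illustration, and it assumes the running JDK provides the IBM437 charset (the JDK's name for CP437), which is not guaranteed on every runtime.

```java
import java.nio.charset.Charset;

/** Hypothetical helper for picking test file names; not Commons Compress code. */
final class Cp437Check {

    /** Returns true if every character of the name can be encoded in CP437. */
    public static boolean fitsCp437(String name) {
        // Assumption: the IBM437 charset is available in this JDK.
        return Charset.forName("IBM437").newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // U+00E4 (a-umlaut) is part of CP437, so the plain name can carry it.
        System.out.println(fitsCp437("\u00e4.txt"));
        // U+2016 (double vertical line) is not part of CP437, so the plain
        // name cannot represent it.
        System.out.println(fitsCp437("\u2016.txt"));
    }
}
```

A name for which `fitsCp437` returns false is the kind of name that would make the extra field the only place holding the correct spelling.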





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-25 Thread Stefan Bodewig (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216647#comment-13216647
 ] 

Stefan Bodewig commented on COMPRESS-176:
-

OK, this means nobody except for Commons Compress and InfoZIP tools seems to 
read the Unicode extra field.

This is what I get when trying to extract the original ZIP on Linux:

{noformat}
stefan@birdy:~/Desktop$ unzip test-winzip.zip 
Archive:  test-winzip.zip
  inflating: doc.txt.gz  
 extracting: doc2.txt
warning:  test-winzip.zip appears to use backslashes as path separators
   creating: ??/
  inflating: ??/??zip.zip
 extracting: ??/??.txt  
{noformat}

and it creates an ä directory.  I'll try to look through InfoZIP's sources 
to see what it bases its heuristics on; maybe we can use the same in Commons 
Compress to turn backslashes into slashes.






[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-25 Thread Stefan Bodewig (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216652#comment-13216652
 ] 

Stefan Bodewig commented on COMPRESS-176:
-

In extract.c of unzip60 line 1310ff there is this code that replaces 
backslashes with slashes.  It only replaces them in names that don't contain 
forward slashes (MBSCHR looks up a character in a character array) and only if 
hostnum indicates a FAT system.

{noformat}
/* for files from DOS FAT, check for use of backslash instead
 *  of slash as directory separator (bug in some zipper(s); so
 *  far, not a problem in HPFS, NTFS or VFAT systems)
 */
#ifndef SFX
if (G.pInfo->hostnum == FS_FAT_ && !MBSCHR(G.filename, '/')) {
    char *p=G.filename;

    if (*p) do {
        if (*p == '\\') {
            if (!G.reported_backslash) {
                Info(slide, 0x21, ((char *)slide,
                  LoadFarString(BackslashPathSep), G.zipfn));
                G.reported_backslash = TRUE;
                if (!error_in_archive)
                    error_in_archive = PK_WARN;
            }
            *p = '/';
        }
    } while (*PREINCSTR(p));
}
#endif /* !SFX */
{noformat}

hostnum is the upper byte of the "version made by" field inside the central 
directory header - this is ZipArchiveEntry's get/setPlatform - and FS_FAT_ is 0 
(ZipArchiveEntry#PLATFORM_FAT).  We'd have all the pieces together to emulate 
this.
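
In Java, the same heuristic could look roughly like the sketch below. It is a standalone illustration under stated assumptions, not the code committed to Commons Compress: the class and method names are hypothetical, the platform value is passed in as a plain int rather than read from a ZipArchiveEntry, and the warning that unzip prints is left out.

```java
/** Hypothetical sketch of the InfoZIP heuristic; not Commons Compress code. */
final class BackslashHeuristic {

    /** Platform code for DOS/FAT, the value given above for FS_FAT_. */
    public static final int PLATFORM_FAT = 0;

    /** The host system is the upper byte of the 16-bit "version made by" field. */
    public static int hostFromVersionMadeBy(int versionMadeBy) {
        return (versionMadeBy >> 8) & 0xFF;
    }

    /**
     * Replaces backslashes with slashes, but only for FAT entries whose
     * name contains no forward slash at all, mirroring the unzip check.
     */
    public static String normalize(String name, int platform) {
        if (platform == PLATFORM_FAT && name.indexOf('/') < 0) {
            return name.replace('\\', '/');
        }
        return name;
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4";
        // FAT entry without any '/': backslashes are treated as separators.
        System.out.println(normalize(umlaut + "\\" + umlaut + ".txt", PLATFORM_FAT));
        // Name already contains '/': left untouched.
        System.out.println(normalize("a/b\\c.txt", PLATFORM_FAT));
        // Non-FAT platform (3 is used here as an arbitrary non-zero value): untouched.
        System.out.println(normalize(umlaut + "\\", 3));
    }
}
```

Restricting the replacement to names without any forward slash keeps a mixed name such as `a/b\c.txt` intact, which is the same conservative choice the unzip code makes.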





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-24 Thread Wurstbrot mit Senf (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215664#comment-13215664
 ] 

Wurstbrot mit Senf commented on COMPRESS-176:
-

Btw.: I have no problems handling this jar using java.util.zip (Java 6) for 
some reason :-(





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-24 Thread Stefan Bodewig (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215704#comment-13215704
 ] 

Stefan Bodewig commented on COMPRESS-176:
-

This is what InfoZIP's zip on Linux says:

{noformat}
stefanb@brick:~$ zip -Tv Desktop/test-winzip.zip 
Archive:  Desktop/test-winzip.zip
testing: doc.txt.gz     OK
testing: doc2.txt       OK
testing: ??\            OK
testing: ??\??zip.zip   OK
testing: ??\??.txt      OK
No errors detected in compressed data of Desktop/test-winzip.zip.
test of Desktop/test-winzip.zip OK
{noformat}

The entry for the directory contains a Unicode extra field with 0xc3 0xa4 0x5c 
as UTF-8 encoded name.  This actually is ä\.
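
The decoding can be reproduced with the JDK alone. This is a small sketch for illustration only: the class and method names are hypothetical, and the trailing-slash test is a simplified stand-in for the directory rule described in this comment, not the actual Commons Compress implementation.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of the bytes quoted above; not Commons Compress code. */
final class UnicodeNameDemo {

    /** Decodes the UTF-8 encoded name taken from a Unicode extra field. */
    public static String decode(byte[] utf8Name) {
        return new String(utf8Name, StandardCharsets.UTF_8);
    }

    /** ZIP convention: the name of a directory entry ends with a forward slash. */
    public static boolean looksLikeDirectory(String name) {
        return name.endsWith("/");
    }

    public static void main(String[] args) {
        byte[] fromExtraField = {(byte) 0xc3, (byte) 0xa4, (byte) 0x5c};
        String name = decode(fromExtraField);
        // Two characters: a-umlaut followed by a backslash.
        System.out.println(name.length());
        // The backslash variant does not pass the directory check ...
        System.out.println(looksLikeDirectory(name));
        // ... while the same name with a forward slash does.
        System.out.println(looksLikeDirectory(name.replace('\\', '/')));
    }
}
```

The three bytes decode to two characters, and the second one is a backslash, so a check based on a trailing forward slash fails for this entry.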

Since directory names in ZIP archives must end with /, Compress doesn't detect 
this as a directory.  It may be possible to create a workaround like 'if the 
plain name ends with a / and the Unicode name uses a \, then bend it', but I 
can't say I'd like that.

Java6 likely works because it doesn't have any idea about unicode extra fields 
and simply uses the plain name.  You'd get the same behavior from 
ZipArchiveInputStream by setting useUnicodeExtraFields to false in the 
constructor.





[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-24 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215732#comment-13215732
 ] 

Sebb commented on COMPRESS-176:
---

The plain names use / and look OK when using CP437.

For some odd reason, the unicode extra fields use \ instead of /
I think that may be a Winzip bug - it does not make sense to use a different 
separator for the extra fields.

To confirm this is a bug, it would be useful to see how other zip tools use the 
extra fields - are there any?
Apart from Ant or other code based on Commons Compress, of course!

Alternatively, find some documentation as to the correct contents of the field.

My version of Winzip is too old to support the fields; if you have purchased a 
more recent one perhaps you could e-mail their support desk?

A possible work-round would be to make the \ => / behaviour optional; I agree 
we should not do this by default.

 ArchiveInputStream#getNextEntry(): Problems with WinZip directories with 
 Umlauts
 

 Key: COMPRESS-176
 URL: https://issues.apache.org/jira/browse/COMPRESS-176
 Project: Commons Compress
  Issue Type: Bug
  Components: Archivers
Affects Versions: 1.3
 Environment: Windows 7
Reporter: Wurstbrot mit Senf
 Attachments: test-7zip.zip, test-windows.zip, test-winzip.zip







[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-21 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13212641#comment-13212641
 ] 

Sebb commented on COMPRESS-176:
---

Thanks, but you have not granted the ASF licence to use the file, which means 
we cannot include it in our test suite.

Please could you delete and reattach it?

Also, we will need the equivalent 7zip and Win7 archives for comparison.

 ArchiveInputStream#getNextEntry(): Problems with WinZip directories with 
 Umlauts
 

 Key: COMPRESS-176
 URL: https://issues.apache.org/jira/browse/COMPRESS-176
 Project: Commons Compress
  Issue Type: Bug
  Components: Archivers
Affects Versions: 1.3
 Environment: Windows 7
Reporter: Wurstbrot mit Senf
 Attachments: test-winzip.zip







[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-21 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13213223#comment-13213223
 ] 

Sebb commented on COMPRESS-176:
---

Thanks.

I'm beginning to wonder if WinZip is faulty.
The Unicode filename that is stored uses \ whereas the base name uses /.

 ArchiveInputStream#getNextEntry(): Problems with WinZip directories with 
 Umlauts
 

 Key: COMPRESS-176
 URL: https://issues.apache.org/jira/browse/COMPRESS-176
 Project: Commons Compress
  Issue Type: Bug
  Components: Archivers
Affects Versions: 1.3
 Environment: Windows 7
Reporter: Wurstbrot mit Senf
 Attachments: test-7zip.zip, test-windows.zip, test-winzip.zip







[jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts

2012-02-20 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211794#comment-13211794
 ] 

Sebb commented on COMPRESS-176:
---

Could you attach minimal sample archives which show the problem?

 ArchiveInputStream#getNextEntry(): Problems with WinZip directories with 
 Umlauts
 

 Key: COMPRESS-176
 URL: https://issues.apache.org/jira/browse/COMPRESS-176
 Project: Commons Compress
  Issue Type: Bug
  Components: Archivers
Affects Versions: 1.3
 Environment: Windows 7
Reporter: Wurstbrot mit Senf

