[issue4621] zipfile returns string but expects binary

2011-05-18 Thread Tor Arvid Lund

Tor Arvid Lund torar...@gmail.com added the comment:

I was wondering what has prevented Eddie's patch from being included in 
Python. Has nobody volunteered to verify that it works? I would be willing to 
do that, though I have never compiled Python on any platform before.

It just seems a bit silly to me that Python cannot work with zip files with 
Unicode file names... I just now had to do 'os.system("unzip.exe ...")' because 
zipfile did not work for me...

--
nosy: +talund

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4621
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue4621] zipfile returns string but expects binary

2011-05-18 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

This issue looks to be a duplicate of #10801 which was only fixed 
(33543b4e0e5d) in Python 3.2. See also #12048: similar issue in Python 3.1.

--




[issue4621] zipfile returns string but expects binary

2011-05-18 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

The initial problem is clearly a duplicate of issue #10801 which is now fixed 
in Python 3.1+ (I just backported the fix to Python 3.1).

> I just discovered that attempting to open zip member test\file
> fails where attempting to open test/file works. (...)
> It seems pretty clear that zipfile should do that for me, though.

@v+python: I don't think so, but others may agree with you. Please open a new 
issue, because it is unrelated to the initial bug report.

I'm closing this issue because the initial problem is now fixed.

For the x.zip problem (UTF-8 encoded filenames with the Unicode flag), 
issue #10614 already handles this case.

--
resolution:  -> fixed
status: open -> closed




[issue4621] zipfile returns string but expects binary

2010-03-26 Thread Glenn Linderman

Glenn Linderman v+pyt...@g.nevcal.com added the comment:

I just discovered that attempting to open zip member test\file fails where 
attempting to open test/file works.  Granted the zip contains / not \ 
characters, but using the os.path stuff (on Windows) to manipulate the names 
before attempting to open the zip member produces \ characters.  Clearly, I 
could switch them back.  It seems pretty clear that zipfile should do that for 
me, though.

A small, self-contained zip file test case is attached; it is a zip file 
with a .py extension.

My testing used Python 3.1.1.
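
The workaround mentioned above (switching the separators back before opening the member) can be sketched as follows. This is only an illustration: to_member_name is a hypothetical helper, not part of zipfile, and it assumes the input is a Windows-style relative path.

```python
from pathlib import PureWindowsPath


def to_member_name(path: str) -> str:
    """Convert a Windows-style relative path to the '/'-separated form used in zip archives."""
    # PureWindowsPath accepts both '\\' and '/' as separators and works on any OS.
    return PureWindowsPath(path).as_posix()


print(to_member_name("test\\file"))
```

The result could then be passed to ZipFile.open or ZipFile.read in place of the os.path-generated name.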

--
nosy: +v+python
Added file: http://bugs.python.org/file16674/testzip.py




[issue4621] zipfile returns string but expects binary

2008-12-20 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

In the ZIP file format, a filename is a byte string because we don't 
know the encoding. You cannot guess the encoding because it's not 
stored in the ZIP file and it depends on the OS and the OS 
configuration. So t1.filename has to be a byte string and 
testzip.read() has to use bytes and not str.

--
nosy: +haypo




[issue4621] zipfile returns string but expects binary

2008-12-20 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Oh, I see that zipfile.py uses the following code to choose the 
filename encoding:

    if flags & 0x800:
        # UTF-8 file names extension
        filename = filename.decode('utf-8')
    else:
        # Historical ZIP filename encoding
        filename = filename.decode('cp437')

So maybe I'm wrong: the encoding is known using a flag?
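
The selection logic quoted above can be tried in isolation. The following is a minimal sketch, not the actual zipfile code: decode_member_name and UTF8_FLAG are names chosen here for illustration.

```python
UTF8_FLAG = 0x800  # general-purpose bit 11: the file name is stored as UTF-8


def decode_member_name(raw: bytes, flags: int) -> str:
    """Decode a raw zip file name according to the UTF-8 flag bit."""
    if flags & UTF8_FLAG:
        return raw.decode("utf-8")
    # Without the flag, fall back to the historical cp437 encoding.
    return raw.decode("cp437")


# ascii() keeps the output printable on any console encoding.
print(ascii(decode_member_name(b"t\xc3\xa9st.xml", 0x800)))
print(ascii(decode_member_name(b"t\x82st.xml", 0)))
```

Both calls should yield the same text, which shows that the flag only tells the reader which codec to apply; it cannot help when an archiver stores UTF-8 bytes without setting it.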




[issue4621] zipfile returns string but expects binary

2008-12-20 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Test on Ubuntu Gutsy (utf8 file system) with zip 2.32:
$ mkdir x
$ touch x/hé
$ zip -r x.zip x
  adding: x/ (stored 0%)
  adding: x/hé (stored 0%)

$ python # 3.0 trunk
>>> import zipfile
>>> testzip = zipfile.ZipFile('x.zip')
>>> testzip.infolist()[1].filename
'x/hé'
>>> print(ascii(testzip.infolist()[1].filename))
'x/h\u251c\u2310'

Using my own file parser (hachoir-wx), I can see that flags=0 and 
filename=bytes {78 2f 68 c3 a9} (x/hé in UTF-8).

You can try x.zip: I attached the file.

Added file: http://bugs.python.org/file12406/x.zip




[issue4621] zipfile returns string but expects binary

2008-12-20 Thread Eddie

Eddie skr...@gmail.com added the comment:

The problem is not about reading the filenames, but about reading the
contents of a file whose filename has non-ASCII characters.




[issue4621] zipfile returns string but expects binary

2008-12-20 Thread Eddie

Eddie skr...@gmail.com added the comment:

I read again what STINNER Victor wrote, and I think that he found another bug.

When listing the filenames of that zip file, the names are not
displayed correctly. In fact
'x/h├⌐' == 'x/hé'.encode('utf-8').decode('cp437')

So there is again a problem with encodings when reading the contents.

The problem here is that, when reading, one cannot give the filename,
because it is not a key in the NameToInfo dictionary.
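
The identity above also suggests how such a name could be repaired after the fact. A minimal sketch, assuming the name really was UTF-8 bytes decoded as cp437 (repair_name is a hypothetical helper, not a zipfile function):

```python
def repair_name(name: str) -> str:
    """Undo a cp437 decode that was applied to UTF-8 bytes."""
    try:
        return name.encode("cp437").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The name does not look like mis-decoded UTF-8; leave it untouched.
        return name


mangled = "x/hé".encode("utf-8").decode("cp437")
print(ascii(mangled))
print(ascii(repair_name(mangled)))
```

This is a heuristic only: a genuine cp437 name whose bytes happen to form valid UTF-8 would be altered as well, so it is not a substitute for knowing the real encoding.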




[issue4621] zipfile returns string but expects binary

2008-12-20 Thread Eddie

Eddie skr...@gmail.com added the comment:

Attached is a patch that solves (I hope) the initial problem, the one
from Francesco Ricciardi.

--
keywords: +patch
Added file: http://bugs.python.org/file12409/patch.diff




[issue4621] zipfile returns string but expects binary

2008-12-18 Thread Eddie

Eddie skr...@gmail.com added the comment:

Sorry, my bad.
I did try it, but with the wrong version (2.5), where it worked perfectly.

So sorry again for my mistake.

Anyway, I've found the error.

The problem is caused by different encodings used when zipping.

In open, the method is comparing b't\x82st.xml' against
b't\xc3\xa9st.xml', and of course they are different.
But they are not so different, because b't\x82st.xml' is
'tést.xml'.encode('cp437') and b't\xc3\xa9st.xml' is
'tést.xml'.encode('utf-8').

The problem arises because the open method assumes the filename is
UTF-8 encoded, whereas __init__ picks the encoding based on the flags:

    if flags & 0x800:
        filename = filename.decode('utf-8')
    else:
        filename = filename.decode('cp437')
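
The mismatch described above is easy to reproduce without any zip file. A short check, using the file name from the original report:

```python
name = "tést.xml"

as_cp437 = name.encode("cp437")
as_utf8 = name.encode("utf-8")

print(as_cp437)             # b't\x82st.xml'
print(as_utf8)              # b't\xc3\xa9st.xml'
print(as_cp437 == as_utf8)  # False

# Both byte strings decode back to the same text when the matching codec is used.
print(as_cp437.decode("cp437") == as_utf8.decode("utf-8"))  # True
```

Comparing the raw bytes is therefore only meaningful when both sides were encoded with the same codec.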




[issue4621] zipfile returns string but expects binary

2008-12-17 Thread Francesco Ricciardi

Francesco Ricciardi francesco.riccia...@hp.com added the comment:

If that is what is requested, then the manual entry for ZipFile.read
must be corrected, because it states:

ZipFile.read(name[, pwd])  name is the name of the file in the
archive, or a ZipInfo object.


However, Eddie, you haven't tried what you suggested, because this is
what you would get:

>>> import zipfile
>>> testzip = zipfile.ZipFile('test.zip')
>>> t1 = testzip.infolist()[0]
>>> t1.filename
'tést.xml'
>>> data = testzip.read(t1.filename)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python30\lib\zipfile.py", line 843, in read
    return self.open(name, "r", pwd).read()
  File "C:\Python30\lib\zipfile.py", line 883, in open
    % (zinfo.orig_filename, fname))
zipfile.BadZipfile: File name in directory 'tést.xml' and header
b't\x82st.xml' differ.




[issue4621] zipfile returns string but expects binary

2008-12-10 Thread Francesco Ricciardi

New submission from Francesco Ricciardi [EMAIL PROTECTED]:

Each entry of a zip file, as read by the zipfile module, can be accessed
via a ZipInfo object. The filename attribute of ZipInfo is a string.
However, the read method of a ZipFile object expects a binary as
argument, or at least this is what I can deduce from the following behavior:

>>> import zipfile
>>> testzip = zipfile.ZipFile('test.zip')
>>> t1 = testzip.infolist()[0]
>>> t1.filename
'tést.xml'
>>> data = testzip.read(testzip.infolist()[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python30\lib\zipfile.py", line 843, in read
    return self.open(name, "r", pwd).read()
  File "C:\Python30\lib\zipfile.py", line 883, in open
    % (zinfo.orig_filename, fname))
zipfile.BadZipfile: File name in directory 'tést.xml' and header
b't\x82st.xml' differ.

The test.zip file is attached as help in reproducing this error.
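
For comparison, the two access paths discussed in this thread (by name and by ZipInfo object) can be exercised against an archive built in memory. This is a sketch only: the archive contents are made up and are not the attached test.zip, and because zipfile itself writes the archive here, it may not trigger the reported error.

```python
import io
import zipfile

# Build a small archive in memory with a non-ASCII member name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("tést.xml", b"<root/>")

# Read the member back both ways.
with zipfile.ZipFile(buffer) as archive:
    info = archive.infolist()[0]
    by_name = archive.read(info.filename)  # pass the name as str
    by_info = archive.read(info)           # pass the ZipInfo object

print(by_name == by_info)
```

On an interpreter that handles the name consistently, both calls are expected to return the same bytes.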

--
components: Library (Lib)
files: test.zip
messages: 77555
nosy: francescor
severity: normal
status: open
title: zipfile returns string but expects binary
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file12319/test.zip
