[issue26039] More flexibility in zipfile interface

2016-03-27 Thread Марк Коренберг

Марк Коренберг added the comment:

Also, Python has problems with streaming reads of zip archives. I mean the
ability to read (in some form, iterate over) an archive when seeking is not
available.

I mean iteration like the one available for TAR archives.
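
For comparison, here is a minimal sketch of the kind of TAR iteration meant
here, using tarfile's stream mode ("r|"), which reads strictly forward. The
NonSeekable class and the helper names are hypothetical, made up only for this
illustration; they stand in for a pipe or socket.

```python
import io
import tarfile


class NonSeekable(io.RawIOBase):
    """Hypothetical stand-in for a pipe or socket: readable, not seekable."""

    def __init__(self, data):
        super().__init__()
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        return self._buf.readinto(b)


def build_tar():
    """Build a small in-memory tar archive to stream from."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tf:
        for name, payload in [("a.txt", b"alpha"), ("b.txt", b"beta")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tf.addfile(info, io.BytesIO(payload))
    return raw.getvalue()


def stream_members(stream):
    """Iterate over the archive front to back, without ever seeking."""
    out = []
    with tarfile.open(fileobj=stream, mode="r|") as tf:
        for member in tf:
            # In stream mode each member must be read before moving on.
            out.append((member.name, tf.extractfile(member).read()))
    return out


members = stream_members(NonSeekable(build_tar()))
```

The request in this comment is for zipfile to offer something comparable;
nothing above implies that it currently does.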

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26039] More flexibility in zipfile interface

2016-03-27 Thread Марк Коренберг

Марк Коренберг added the comment:

I had the same problem, and wrote a monkey-patch myself BEFORE seeing this
issue (!)

An example of how I do that is attached under the name "socketpair.py".

It would be nice if you took up my idea. After that, streaming of zip files
would be possible.

--
nosy: +mmarkk
Added file: http://bugs.python.org/file42311/main.py




[issue26039] More flexibility in zipfile interface

2016-03-08 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Yes, using the lock in write() or writestr() is equally compatible.

--




[issue26039] More flexibility in zipfile interface

2016-03-07 Thread Martin Panter

Martin Panter added the comment:

Acquiring a lock in open(mode="w") and releasing it in close() doesn’t seem 
like a particularly useful feature to me. Maybe it would be better (and equally 
compatible?) to just use the lock in the internal write() or writestr() 
implementations.
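
A toy sketch of the alternative suggested here, where the lock is held only
for the duration of each individual write. PerWriteLockedWriter is a
hypothetical class for illustration and is not zipfile's code; the seek before
writing is an assumption about why a shared file object needs the lock at all.

```python
import io
import threading


class PerWriteLockedWriter:
    """Toy writer (not zipfile's code): lock held only inside write()."""

    def __init__(self, fp, lock):
        self._fp = fp
        self._lock = lock

    def write(self, data):
        with self._lock:
            # Someone else sharing fp may have moved the position,
            # so seek and write as one protected step.
            self._fp.seek(0, io.SEEK_END)
            return self._fp.write(data)


def demo():
    fp = io.BytesIO()
    writer = PerWriteLockedWriter(fp, threading.Lock())
    chunks = [bytes([65 + i]) * 1000 for i in range(8)]
    threads = [threading.Thread(target=writer.write, args=(c,))
               for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return fp.getvalue(), chunks
```

Each chunk lands intact, but chunks from different threads may appear in any
order, so this protects individual writes rather than whole archive members.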

--




[issue26039] More flexibility in zipfile interface

2016-03-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Sorry for the delay, Thomas. This is a complex issue that is important to me,
and I want to be attentive to it.

I think we should preserve long-existing behavior even if it is not documented
(documenting or deprecating it is another issue). Concurrent reading, and
writing with concurrent reading, have always worked (at least since
ZipFile.open() was added), but were never documented nor tested. Concurrent
writing was added rather as a side effect of issue14099. If there is a benefit
from getting rid of it, we can break it.

To preserve the current behavior, ZipFile.open(mode='w') should acquire the
lock, and _ZipWriteFile.close() should release it.
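
A toy sketch of this scheme, where the lock is taken when the handle is opened
and released only in close(). HandleLockedWriter and write_entry are
hypothetical names for illustration, not the patch's code, and a plain
threading.Lock is assumed here for simplicity.

```python
import io
import threading


class HandleLockedWriter:
    """Toy writer (not zipfile's code): lock held from open to close()."""

    def __init__(self, fp, lock):
        self._fp = fp
        self._lock = lock
        self._lock.acquire()  # blocks while another handle is still open
        self._closed = False

    def write(self, data):
        return self._fp.write(data)

    def close(self):
        if not self._closed:
            self._closed = True
            self._lock.release()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


def write_entry(fp, lock, byte, pieces=50):
    """Write one 'entry' as many small pieces through a single handle."""
    with HandleLockedWriter(fp, lock) as handle:
        for _ in range(pieces):
            handle.write(byte * 10)


def demo():
    fp = io.BytesIO()
    lock = threading.Lock()
    threads = [threading.Thread(target=write_entry, args=(fp, lock, b))
               for b in (b"A", b"B", b"C")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return fp.getvalue()
```

Because the lock spans the lifetime of the handle, all pieces of one entry stay
contiguous even though three threads write at once; the order of the entries
is still arbitrary.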

I have added other comments on Rietveld. The patch needs to be updated to 
resolve conflicts with committed zipinfo-from-file5.patch.

--




[issue26039] More flexibility in zipfile interface

2016-03-03 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Serhiy, have you had a chance to look at what the zf.open(mode='w') patch does 
with the lock?

--




[issue26039] More flexibility in zipfile interface

2016-02-25 Thread Martin Panter

Martin Panter added the comment:

When you say concurrent writes should be impossible, I guess that only applies 
to a single-threaded program. There is no lock protecting the “self._fileRefCnt 
> 1” check and related manipulation (not that I am saying there should be).

As for serializing concurrent writes to a single handle: if that is intended,
it should be documented. If it is not intended, maybe it should be removed (my
preference)?

It would be good to wait and see whether Serhiy can explain the purpose of the
lock, seeing he was involved in adding it and probably knows a lot more about
this module than I.

--




[issue26039] More flexibility in zipfile interface

2016-02-24 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Oh, I see test_interleaved now, which does test overlapping reads of two files 
from the same zip file.

Do you want that clarified in the docs - which don't currently mention the lock 
at all - or in a comment in the module?
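
For readers unfamiliar with the pattern, this is roughly what overlapping reads
of two members look like. It is a sketch only, run from a single thread, and
the member names are made up; it does not reproduce test_interleaved itself.

```python
import io
import zipfile


def interleaved_read():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ones", b"1" * 4000)
        zf.writestr("twos", b"2" * 4000)

    with zipfile.ZipFile(buf) as zf:
        # Two read handles open at once on the same underlying file object.
        with zf.open("ones") as f1, zf.open("twos") as f2:
            out1 = f1.read(500)
            out2 = f2.read(500)
            out1 += f1.read(500)
            out2 += f2.read(500)
            out1 += f1.read()
            out2 += f2.read()
    return out1, out2


ones, twos = interleaved_read()
```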

--




[issue26039] More flexibility in zipfile interface

2016-02-24 Thread Thomas Kluyver

Thomas Kluyver added the comment:

My initial patch would have allowed passing a readable file-like object into 
zipfile. I was persuaded that allowing ZipFile.open() to return a writable 
object was a more intuitive and flexible API.

Concurrent writes with zf.open(mode='w') should be impossible, because it only 
allows one open handle at a time. It still uses the lock inside _ZipWriteFile, 
so concurrent writes to a single handle should be serialised.

I would not recommend anyone try to do concurrent access to a single ZipFile 
object from multiple threads or coroutines. It's quite stateful, there is no 
mention of concurrency in the docs, and no tests I can see that try concurrent 
access. The only thing that might be safe is reading from two files inside the 
zip file (which shouldn't be changed by this), but I wouldn't want to guarantee 
even that.

--




[issue26039] More flexibility in zipfile interface

2016-02-23 Thread Martin Panter

Martin Panter added the comment:

Thanks for the pointer Dhiraj. I prefer the open(mode="w") version proposed 
here, as being more flexible. This way you could wrap the writer object in e.g. 
TextIOWrapper. The other patch requires passing in a file reader object.
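
A minimal sketch of the TextIOWrapper idea, assuming a Python version in which
ZipFile.open() accepts mode="w" (3.6 or later, as far as I know). The member
name and helper function are invented for the example.

```python
import io
import zipfile


def write_text_member(lines):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Closing the text wrapper flushes it and closes the write handle.
        with io.TextIOWrapper(zf.open("notes.txt", "w"),
                              encoding="utf-8", newline="\n") as text:
            for line in lines:
                text.write(line + "\n")

    with zipfile.ZipFile(buf) as zf:
        return zf.read("notes.txt")


stored = write_text_member(["héllo", "wörld"])
```

With the file-reader approach, the caller would instead have to encode the text
and present it as a readable binary stream first.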

Having another look at zipfile-open-w4.patch, I have some thoughts about 
locking and the writing-while-reading restriction:

The lock seems to be designed to serialize reads and writes (which operate on 
the common underlying file object). See revision 4973ccd46e32, and 
, although it would be good to 
document this, or at the minimum add a comment explaining the purpose and scope 
of the lock.

Currently, it appears that write() and writestr() acquire the lock, so I 
presume it is intended that these methods can be called multiple times 
concurrently, and also while the zip file is being read. With the patch, 
writestr() still preserves the lock usage, but write() does not because it is 
now implemented in terms of the new open(mode="w") method.

I think it would be good to clarify that the lock does _not_ protect concurrent 
writes via open(mode="w").

--




[issue26039] More flexibility in zipfile interface

2016-02-23 Thread Dhiraj

Dhiraj added the comment:

Please have a look at issue 11980:

http://bugs.python.org/issue11980
It already has a patch.

--
nosy: +DhirajMishra




[issue26039] More flexibility in zipfile interface

2016-02-23 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Ping! ;-)

--




[issue26039] More flexibility in zipfile interface

2016-02-15 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Hi Serhiy, any more comments on the zf.open() patch?

--




[issue26039] More flexibility in zipfile interface

2016-02-08 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Thanks Serhiy! I'll keep an eye out for comments on the other patch.

--




[issue26039] More flexibility in zipfile interface

2016-02-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Committed zipinfo-from-file5.patch. Now I'm starting to review
zipfile-open-w4.patch (I concurred with most of Martin's comments on the
previous patches).

--




[issue26039] More flexibility in zipfile interface

2016-02-07 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 7fea2cebc604 by Serhiy Storchaka in branch 'default':
Issue #26039: Added zipfile.ZipInfo.from_file() and zipinfo.ZipInfo.is_dir().
https://hg.python.org/cpython/rev/7fea2cebc604

--
nosy: +python-dev




[issue26039] More flexibility in zipfile interface

2016-02-04 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Is there anything more I should be doing with either of these patches? I think 
I've incorporated all review comments I've seen. Thanks!

--




[issue26039] More flexibility in zipfile interface

2016-01-30 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Updated version of the ZipInfo.from_file() patch attached.

--
Added file: http://bugs.python.org/file41758/zipinfo-from-file5.patch




[issue26039] More flexibility in zipfile interface

2016-01-29 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Here's a new version of the zf.open() patch following Martin's review (thanks 
Martin!).

I agree that it feels a bit awkward having two completely different actions for 
zf.open(), but it is a familiar interface, and since the mode parameter is 
already there, it requires a minimum of new public API. But I'm happy to add a 
new method like open_write() or write_handle() if people prefer that.

The comments on the other patch are minimal, I'll put a new version of that 
together as well.

--
Added file: http://bugs.python.org/file41752/zipfile-open-w3.patch




[issue26039] More flexibility in zipfile interface

2016-01-29 Thread Thomas Kluyver

Changes by Thomas Kluyver :


Added file: http://bugs.python.org/file41753/zipinfo-from-file3.patch




[issue26039] More flexibility in zipfile interface

2016-01-29 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Thanks, Serhiy, for the review comments.

--
Added file: http://bugs.python.org/file41754/zipinfo-from-file4.patch




[issue26039] More flexibility in zipfile interface

2016-01-27 Thread Thomas Kluyver

Thomas Kluyver added the comment:

The '2' versions of the two different patches include some docs and tests for 
these new features.

--
Added file: http://bugs.python.org/file41726/zipfile-open-w2.patch




[issue26039] More flexibility in zipfile interface

2016-01-26 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Serhiy, any chance you'd have some time to review my patch(es)? Or is there 
someone else interested in zipfile I might interest? Thanks :-)

--




[issue26039] More flexibility in zipfile interface

2016-01-26 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Thanks! I will work on docs and tests.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26039] More flexibility in zipfile interface

2016-01-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Your patches are great, Thomas! This is just what I wanted to implement. It
will take some time to make a careful review. Apart from possible corrections,
I think these features will be added in 3.6. But new features need tests and
documentation.

--
assignee:  -> serhiy.storchaka
stage:  -> test needed




[issue26039] More flexibility in zipfile interface

2016-01-26 Thread Thomas Kluyver

Changes by Thomas Kluyver :


Added file: http://bugs.python.org/file41722/zipinfo-from-file2.patch




[issue26039] More flexibility in zipfile interface

2016-01-17 Thread Thomas Kluyver

Thomas Kluyver added the comment:

zipinfo-from-file.patch has an orthogonal but related change: the code in 
ZipFile.write() to construct a ZipInfo object from a filesystem file is pulled 
out to a classmethod ZipInfo.from_file().

Together, these changes make it much easier to control how a file is written to 
a zip file, like this:

zi = ZipInfo.from_file(blah)
# ... manipulate zi...
with open(blah, 'rb') as src, zf.open(zi, 'w') as dest:
    # copy of the read/write loop - maybe this should be
    # pulled out separately as well?
    shutil.copyfileobj(src, dest)  # stand-in for the loop; needs "import shutil"
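
A fuller, self-contained sketch of the same idea, assuming a Python version
where both ZipInfo.from_file() and ZipFile.open(mode='w') are available (3.6 or
later, as far as I know). The file names, the comment text and the helper
functions are invented for the example, and shutil.copyfileobj() is just one
convenient way to do the copy.

```python
import os
import shutil
import tempfile
import zipfile


def add_file_with_custom_info(src_path, zip_path, arcname):
    zi = zipfile.ZipInfo.from_file(src_path, arcname=arcname)
    # Manipulate the metadata before anything is written.
    zi.compress_type = zipfile.ZIP_DEFLATED
    zi.comment = b"added via ZipInfo.from_file()"
    with zipfile.ZipFile(zip_path, "w") as zf:
        with open(src_path, "rb") as src, zf.open(zi, "w") as dest:
            shutil.copyfileobj(src, dest)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "data.bin")
        with open(src_path, "wb") as f:
            f.write(b"x" * 10000)
        zip_path = os.path.join(tmp, "out.zip")
        add_file_with_custom_info(src_path, zip_path, "renamed.bin")
        with zipfile.ZipFile(zip_path) as zf:
            info = zf.getinfo("renamed.bin")
            return info.compress_type, info.comment, zf.read("renamed.bin")
```

The point of the split is that the caller controls the archive name,
compression and other metadata per file, without ZipFile.write() deciding them.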

If these changes make it in, I might put a backported copy of the module on 
PyPI so I can start using it without waiting for Python 3.6.

--
Added file: http://bugs.python.org/file41637/zipinfo-from-file.patch




[issue26039] More flexibility in zipfile interface

2016-01-15 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Attached is a first go at a patch enabling ZipFile.open(blah, mode='w').

Files must be written sequentially, so you have to close one writing handle 
before opening another. If you try to open a second one before closing the 
first, it will raise RuntimeError. I considered doing something where it would 
write to temporary files and add them to the zip file when they were closed, 
but it seemed like a bad idea.

You can almost certainly break this by reading from a zip file while there's an 
open writing handle. Resolving this is tricky because there's a disconnect in 
the requirements for reading and writing: writing allows for a non-seekable 
output stream, but reading assumes that you can seek freely. The simplest fix 
is to block reading while there is an open writing handle. I don't think many
people will need to read one file from a zip while writing another, anyway.
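
A small probe of the two rules described above: one writing handle at a time,
and no reading while a writing handle is open. It assumes a Python version with
ZipFile.open(mode='w'). The patch described here raises RuntimeError; released
versions raise ValueError as far as I know, so the sketch accepts either. The
member names are made up.

```python
import io
import zipfile


def probe_handle_rules():
    results = {}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("existing.txt", b"already stored")

        with zf.open("first.txt", "w") as first:
            first.write(b"one")
            # A second writing handle while the first is still open.
            try:
                zf.open("second.txt", "w")
            except (ValueError, RuntimeError) as exc:  # type varies, see above
                results["second_write"] = type(exc).__name__
            # Reading another member while the writing handle is open.
            try:
                zf.open("existing.txt")
            except (ValueError, RuntimeError) as exc:
                results["read_while_writing"] = type(exc).__name__

        # Once the first handle is closed, the next one can be opened.
        with zf.open("second.txt", "w") as second:
            second.write(b"two")

    with zipfile.ZipFile(buf) as zf:
        results["names"] = sorted(zf.namelist())
    return results


results = probe_handle_rules()
```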

I have used the lock, but I haven't thought carefully about thread safety, so 
that should still be checked carefully.

--
Added file: http://bugs.python.org/file41626/zipfile-open-w.patch
