[Python-Dev] LZW support in tarfile ?
Hello,

I want to remove the usage of the tar command in Distutils in favor of the tarfile module. But there's an option in Distutils.make_archive to create a tarball using the compress [1] program rather than gzip or bzip2. With tar -Z, the archive is piped to the compress program if it is present. This program implements the LZW algorithm [2].

LZW used to be patented, but the patent seems to have expired in every country now [3].

On the Distutils side I can work things out so that the tar archive can be piped to an arbitrary compression program when it is not compressed using bzip2 or gzip. But I was wondering whether we should add LZW support to tarinfo, besides gzip and bzip2, although this compression standard doesn't seem to be used much these days.

Regards
Tarek

[1] http://en.wikipedia.org/wiki/Compress
[2] http://en.wikipedia.org/wiki/LZW
[3] http://www.unisys.com/about__unisys/lzw

--
Tarek Ziadé | http://ziade.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
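The piping approach described in this message could look roughly like the sketch below. It is only an illustration, not Distutils code: the helper names make_tar_bytes and pipe_through are hypothetical, and it uses subprocess.run, which is a much later API than anything available when this thread was written. The external program is passed in as an argument list, so something like ["compress", "-c"] would only work where that program is installed.

```python
import io
import subprocess
import sys
import tarfile


def make_tar_bytes(files):
    """Build an uncompressed tar archive in memory from a {name: bytes} dict."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def pipe_through(command, data):
    """Feed data to an external filter program on stdin and return its stdout."""
    result = subprocess.run(command, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    tar_bytes = make_tar_bytes({"hello.txt": b"hello"})
    # Stand-in filter that copies stdin to stdout; on a system that ships
    # compress, ["compress", "-c"] could be substituted here (assumption).
    identity = [sys.executable, "-c",
                "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]
    print(len(pipe_through(identity, tar_bytes)), "bytes written by the filter")
```

Keeping the tar creation in tarfile and treating the compressor as an opaque filter means the Python side never needs to know the compression format, at the cost of depending on an external program.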
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
Alexander Shigin wrote:
> On Sat, 16/05/2009 at 23:15 +0100, MRAB wrote:
>> FYI, on RISC OS '/' is a valid filename character and '.' is used as the directory separator.
>>
>> I'd probably say that TAB is a reasonable character to use, even though it's OK in POSIX; after all, should anyone really be using a control character in a filename?
>
> The '\0' char is invalid in both Windows and POSIX. I don't know if it is valid on RISC OS.

'\0' isn't a valid filename character on RISC OS.
Re: [Python-Dev] LZW support in tarfile ?
Tarek Ziadé ziade.tarek at gmail.com writes:
> But I was wondering whether we should add LZW support to tarinfo, besides gzip and bzip2, although this compression standard doesn't seem to be used much these days.

It would be more useful to add LZMA / xz support. I don't think compress is used anymore, except perhaps on old legacy systems. On my Linux system, I have lots of .gz, .bz2 and .lzma files, but absolutely no .Z file.

Regards

Antoine.
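For readers coming to this thread later: the tarfile module in Python 3.3 and newer accepts xz modes, which did not exist when this message was written. A minimal sketch follows, assuming an interpreter built with the lzma module; the helper names make_tar_xz and list_members are made up for the example.

```python
import io
import tarfile


def make_tar_xz(files):
    """Build an xz-compressed tar archive in memory from a {name: bytes} dict."""
    buf = io.BytesIO()
    # "w:xz" asks tarfile to wrap the output in an LZMA (xz) stream.
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(archive_bytes):
    """Return the member names of an xz-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:xz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    archive = make_tar_xz({"README": b"example payload"})
    print(list_members(archive))
```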
Re: [Python-Dev] LZW support in tarfile ?
Antoine Pitrou wrote:
> Tarek Ziadé ziade.tarek at gmail.com writes:
>> But I was wondering whether we should add LZW support to tarinfo, besides gzip and bzip2, although this compression standard doesn't seem to be used much these days.
>
> It would be more useful to add LZMA / xz support. I don't think compress is used anymore, except perhaps on old legacy systems. On my Linux system, I have lots of .gz, .bz2 and .lzma files, but absolutely no .Z file.

I've seen the occasional .Z file in recent years, but never, that I recall, for a Python package.

As plugging in external compression tools is less likely to work cross-platform, wouldn't it be both easier and better to deprecate (and not replace) the compress support? If there is a huge outcry, adding LZW support to tarfile can be reconsidered.

Michael Foord

> Regards
>
> Antoine.

--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog
Re: [Python-Dev] LZW support in tarfile ?
> But, there's an option in Distutils.make_archive to create a tarball using the compress [1] program rather than gzip or bzip2. Using tar -Z, it will pipe it to the compress program if present. This program implements the LZW algorithm [2].

As everybody else says: it might be best to just remove that option. For compatibility, perhaps deprecate it in 2.7 and 3.1, and remove it in 3.2.

Regards,
Martin
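A deprecation step like the one suggested here is usually done with the warnings module. The sketch below is hypothetical: make_tarball is a simplified stand-in, not the real Distutils function, and the message text is invented. It only shows the pattern of warning on the old option while still honouring it.

```python
import warnings

# Suffix added after ".tar" for each supported option (illustrative values).
_EXTENSIONS = {"gzip": ".gz", "bzip2": ".bz2", "compress": ".Z", None: ""}


def make_tarball(base_name, compress="gzip"):
    """Return the archive name that would be produced (hypothetical helper)."""
    if compress not in _EXTENSIONS:
        raise ValueError("unknown compression option: %r" % (compress,))
    if compress == "compress":
        # stacklevel=2 makes the warning point at the caller's line.
        warnings.warn(
            "the 'compress' option is deprecated and will be removed",
            DeprecationWarning,
            stacklevel=2,
        )
    return base_name + ".tar" + _EXTENSIONS[compress]


if __name__ == "__main__":
    warnings.simplefilter("always")
    print(make_tarball("pkg-1.0", compress="compress"))
```

Existing callers keep working during the deprecation period and see a DeprecationWarning, so the option can be removed in a later release without a silent break.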
Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Ned Deily n...@acm.org (ND) wrote:
> In article m2ocueq6mm@cs.uu.nl, Piet van Oostrum p...@cs.uu.nl wrote:
>> Ronald Oussoren ronaldousso...@mac.com (RO) wrote:
>>> For what it's worth, the OSX APIs seem to behave as follows:
>>> * If you create a file with a non-UTF8 name on an HFS+ filesystem, the system automatically encodes the name.
>>> That is, open(chr(255), 'w') will silently create a file named '%FF' instead of the name you'd expect on a unix system.
>>
>> Not for me (I am using Python 2.6.2).
>>
>>     f = open(chr(255), 'w')
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     IOError: [Errno 22] invalid mode ('w') or filename: '\xff'
>
> What version of OSX are you using? On Tiger 10.4.11 I see the failure you see, but on Leopard 10.5.6 the behavior Ronald reports.

Yes, I am using Tiger (10.4.11). Interesting that it has changed on Leopard.

--
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
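As background for the PEP 383 subject of this thread: in Python 3.1 and later the PEP is available as the "surrogateescape" error handler, which lets a byte such as 0xFF survive a round trip through str. The snippet below is a small sketch of that behaviour on the Python 3 side only; it says nothing about what a given OSX version does with the resulting file name, and the function names are invented for the example.

```python
def bytes_name_to_text(raw_name):
    """Decode a file name, mapping undecodable bytes to lone surrogates."""
    return raw_name.decode("utf-8", "surrogateescape")


def text_name_to_bytes(text_name):
    """Encode a file name, turning escaped surrogates back into raw bytes."""
    return text_name.encode("utf-8", "surrogateescape")


if __name__ == "__main__":
    raw = b"\xff"                    # not valid UTF-8 on its own
    text = bytes_name_to_text(raw)   # byte 0xFF becomes U+DCFF
    print(ascii(text))
    print(text_name_to_bytes(text) == raw)
```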
[Python-Dev] PEP 384: Defining a Stable ABI
Thomas Wouters reminded me of a long-standing idea; I finally found the time to write it down. Please comment!

Regards,
Martin

PEP: 384
Title: Defining a Stable ABI
Version: $Revision: 72754 $
Last-Modified: $Date: 2009-05-17 21:14:52 +0200 (So, 17. Mai 2009) $
Author: Martin v. Löwis mar...@v.loewis.de
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 17-May-2009
Python-Version: 3.2
Post-History:

Abstract
========

Currently, each feature release introduces a new name for the Python DLL on Windows, and may cause incompatibilities for extension modules on Unix. This PEP proposes to define a stable set of API functions which are guaranteed to be available for the lifetime of Python 3, and which will also remain binary-compatible across versions. Extension modules and applications embedding Python can work with different feature releases as long as they restrict themselves to this stable ABI.

Rationale
=========

The primary source of ABI incompatibility is changes to the layout of in-memory structures. For example, the way in which string interning works, or the data type used to represent the size of an object, have changed during the life of Python 2.x. As a consequence, extension modules making direct access to fields of strings, lists, or tuples would break if their code is loaded into a newer version of the interpreter without recompilation: offsets of other fields may have changed, making the extension modules access the wrong data.

In some cases, the incompatibilities only affect internal objects of the interpreter, such as frame or code objects. For example, the way line numbers are represented has changed in the 2.x lifetime, as has the way in which local variables are stored (due to the introduction of closures). Even though most applications probably never used these objects, changing them required changing PYTHON_API_VERSION.
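The offset problem described in the rationale can be illustrated from Python with ctypes. The two structures below are deliberately simplified, hypothetical layouts (they are not CPython's real object headers); they only mimic a size field growing from 32 to 64 bits, which shifts every field after it.

```python
import ctypes


class HeaderV1(ctypes.Structure):
    """Hypothetical 'old' layout with a 32-bit size field."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_int64),
        ("ob_size", ctypes.c_int32),
        ("ob_state", ctypes.c_int32),
    ]


class HeaderV2(ctypes.Structure):
    """Hypothetical 'new' layout: the size field has grown to 64 bits."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_int64),
        ("ob_size", ctypes.c_int64),
        ("ob_state", ctypes.c_int32),
    ]


def read_with_old_layout(new_obj):
    """Reinterpret the memory of a V2 object using the V1 field offsets."""
    raw = bytes(new_obj)[:ctypes.sizeof(HeaderV1)]
    return HeaderV1.from_buffer_copy(raw)


if __name__ == "__main__":
    print("ob_state offset, old layout:", HeaderV1.ob_state.offset)
    print("ob_state offset, new layout:", HeaderV2.ob_state.offset)
    new_obj = HeaderV2(ob_refcnt=1, ob_size=5, ob_state=7)
    stale = read_with_old_layout(new_obj)
    # Code compiled against the old layout looks at the wrong bytes.
    print("ob_state seen through the old layout:", stale.ob_state)
```

An extension compiled against the first layout and loaded into an interpreter using the second would read ob_state from the old offset, which is the kind of silent breakage the PEP wants to rule out by hiding such fields.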
On Linux, changes to the ABI are often not much of a problem: the system will provide a default Python installation, and many extension modules are already provided pre-compiled for that version. If additional modules are needed, or additional Python versions, users can typically compile them themselves on the system, resulting in modules that use the right ABI.

On Windows, multiple simultaneous installations of different Python versions are common, and extension modules are compiled by their authors, not by end users. To reduce the risk of ABI incompatibilities, Python currently introduces a new DLL name pythonXY.dll for each feature release, whether or not ABI incompatibilities actually exist.

With this PEP, it will be possible to reduce the dependency of binary extension modules on a specific Python feature release, and applications embedding Python can be made to work with different releases.

Specification
=============

The ABI specification falls into two parts: an API specification, specifying what function (groups) are available for use with the ABI, and a linkage specification specifying what libraries to link with. The actual ABI (layout of structures in memory, function calling conventions) is not specified, but implied by the compiler. A specific ABI is recommended for selected platforms.

During evolution of Python, new ABI functions will be added. Applications using them will then have a requirement on a minimum version of Python; this PEP provides no mechanism for such applications to fall back when the Python library is too old.

Terminology
-----------

Applications and extension modules that want to use this ABI are collectively referred to as "applications" from here on.

Header Files and Preprocessor Definitions
-----------------------------------------

Applications shall only include the header file Python.h (before including any system headers), or, optionally, include pyconfig.h, and then Python.h.
During the compilation of applications, the preprocessor macro Py_LIMITED_API must be defined. Doing so will hide all definitions that are not part of the ABI.

Structures
----------

Only the following structures and structure fields are accessible to applications:

- PyObject (ob_refcnt, ob_type)
- PyVarObject (ob_base, ob_size)
- Py_buffer (buf, obj, len, itemsize, readonly, ndim, shape, strides, suboffsets, smalltable, internal)
- PyMethodDef (ml_name, ml_meth, ml_flags, ml_doc)
- PyMemberDef (name, type, offset, flags, doc)
- PyGetSetDef (name, get, set, doc, closure)

The accessor macros to these fields (Py_REFCNT, Py_TYPE, Py_SIZE) are also available to applications.

The following types are available, but opaque (i.e. incomplete):

- PyThreadState
- PyInterpreterState

Type Objects
------------

The structure of type objects is not available to applications; declaration of static type objects is not possible anymore (for applications using this ABI). Instead, type objects get created dynamically. To allow an easy creation of types (in particular, to be able to fill out function pointers
Re: [Python-Dev] PEP 384: Defining a Stable ABI
On Sun, May 17, 2009 at 10:54 PM, Martin v. Löwis mar...@v.loewis.de wrote:
> Excluded Functions
> ------------------
>
> Functions declared in the following header files are not part of the ABI:
>
> - cellobject.h
> - classobject.h
> - code.h
> - frameobject.h
> - funcobject.h
> - genobject.h
> - pyarena.h
> - pydebug.h
> - symtable.h
> - token.h
> - traceback.h

What kind of effect does this have on optimization efforts, for example all the stuff done by Antoine Pitrou over the last few months, and the first few results from Unladen Swallow? Will it mean we won't get to the good optimizations until 4.0? Or does it just mean Unladen Swallow takes longer to come back to trunk (until 4.0) and every extension author who wants to be compatible with it will basically have the same burden as now?

Cheers,
Dirkjan
Re: [Python-Dev] PEP 384: Defining a Stable ABI
>> Functions declared in the following header files are not part of the ABI:
>>
>> - cellobject.h
>> - classobject.h
>> - code.h
>> - frameobject.h
>> - funcobject.h
>> - genobject.h
>> - pyarena.h
>> - pydebug.h
>> - symtable.h
>> - token.h
>> - traceback.h
>
> What kind of effect does this have on optimization efforts, for example all the stuff done by Antoine Pitrou over the last few months, and the first few results from unladen?

I fail to see the relationship, so: no effect that I can see.

Why do you think that optimization efforts could be related to the PEP 384 proposal?

Regards,
Martin
Re: [Python-Dev] PEP 384: Defining a Stable ABI
Martin v. Löwis wrote:
> Header Files and Preprocessor Definitions
> -----------------------------------------
>
> Applications shall only include the header file Python.h (before including any system headers), or, optionally, include pyconfig.h, and then Python.h.

What about structmember.h? It's not yet included with Python.h AFAICS.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Re: [Python-Dev] PEP 384: Defining a Stable ABI
On Mon, May 18, 2009 at 12:07 AM, Martin v. Löwis mar...@v.loewis.de wrote:
> I fail to see the relationship, so: no effect that I can see.
>
> Why do you think that optimization efforts could be related to the PEP 384 proposal?

It would seem to me that optimizations are likely to require data structure changes, for exactly the kind of core data structures that you're talking about locking down. But that's just a high-level view, I might be wrong.

Cheers,
Dirkjan
Re: [Python-Dev] PEP 384: Defining a Stable ABI
>> Header Files and Preprocessor Definitions
>> -----------------------------------------
>>
>> Applications shall only include the header file Python.h (before including any system headers), or, optionally, include pyconfig.h, and then Python.h.
>
> What about structmember.h? It's not yet included with Python.h AFAICS.

Right - I think it should be, though. Is there a reason why it's not included?

The only reason I can see is that it isn't completely namespace-safe, e.g. it defines a constant READONLY. Not sure whether the T_ constants would need to be changed as well.

So if that's the rationale, I would propose to make it namespace-safe under a different file name, and add alias #defines in structmember.h for compatibility.

I also think this should happen independently of PEP 384. See also issue 2897 - perhaps we can even fix it for 3.1.

Regards,
Martin
Re: [Python-Dev] PEP 384: Defining a Stable ABI
Dirkjan Ochtman dirkjan at ochtman.nl writes:
> It would seem to me that optimizations are likely to require data structure changes, for exactly the kind of core data structures that you're talking about locking down. But that's just a high-level view, I might be wrong.

Unless I'm misunderstanding something, Martin doesn't advocate locking data structures down (except a couple of outliers such as Py_buffer). An ABI-compliant application mustn't tinker directly with Python's data structures, but use the ABI functions.

Regards

Antoine.
Re: [Python-Dev] PEP 384: Defining a Stable ABI
On Mon, May 18, 2009 at 12:43 AM, Antoine Pitrou solip...@pitrou.net wrote:
> Unless I'm misunderstanding something, Martin doesn't advocate locking data structures down (except a couple of outliers such as Py_buffer). An ABI-compliant application mustn't tinker directly with Python's data structures, but use the ABI functions.

Right. Sorry about the noise, then.

Cheers,
Dirkjan
Re: [Python-Dev] PEP 384: Defining a Stable ABI
Dirkjan Ochtman wrote:
> On Mon, May 18, 2009 at 12:07 AM, Martin v. Löwis mar...@v.loewis.de wrote:
>> I fail to see the relationship, so: no effect that I can see.
>>
>> Why do you think that optimization efforts could be related to the PEP 384 proposal?
>
> It would seem to me that optimizations are likely to require data structure changes, for exactly the kind of core data structures that you're talking about locking down. But that's just a high-level view, I might be wrong.

Ah. It's exactly the opposite: The purpose of the PEP is not to lock the data structures down, but to allow more flexible evolution of them - by completely hiding them from extension modules.

Currently, any data structure change must be weighed for its impact on binary compatibility. With the PEP, changing structures can be done fairly freely - with the exception of the very few structures that do get locked down. In particular, the list of header files that you quoted precisely contains the structures that can be modified with no impact on the ABI.

I'm not aware that any of the structures that I propose to lock would be relevant for optimization - but I might be wrong. If so, I'd like to know, and it would be possible to add accessor functions in cases where extension modules might still legitimately want to access certain fields.

Certain changes to the VM would definitely be binary-incompatible, such as removal of reference counting. However, such a change would probably have a much wider effect, breaking not just binary compatibility, but also source compatibility. It would be justified to call a Python release that makes such a change 4.0.

Regards,
Martin
Re: [Python-Dev] PEP 384: Defining a Stable ABI
Dino Viehland wrote:
> Dirkjan Ochtman wrote:
>> It would seem to me that optimizations are likely to require data structure changes, for exactly the kind of core data structures that you're talking about locking down. But that's just a high-level view, I might be wrong.
>
> In particular I would guess that ref counting is the biggest issue here. I would think not directly exposing the field and having inc/dec ref functions (real methods, not macros) for it would give a lot more ability to change the API in the future.

In the context of optimization, I'm skeptical that introducing functions for the reference counting would be useful. Making the INCREF/DECREF macros functions just in case the reference counting goes away is IMO an unacceptable performance cost.

Instead, such a change should go through the regular deprecation procedure and/or cause the release of Python 4.0.

> It also might make it easier for alternate implementations to support the same API so some modules could work cross implementation - but I suspect that's a non-goal of this PEP :).

Indeed :-) I'm also skeptical that this would actually allow cross-implementation modules to happen. The list of functions that an alternate implementation would have to provide is fairly long.

The memory management APIs in particular also assume a certain layout of Python objects in general, namely that they start with a header whose size is a compile-time constant. Again, making this more flexible just in case would also impact performance, and probably fairly badly so.

> Other fields directly accessed (via macros or otherwise) might have similar problems but they don't seem as core as ref counting.

Access to the type object reference is probably similar. All the other structs are used directly in C code, with no accessor macros.

Regards,
Martin
Re: [Python-Dev] PEP 384: Defining a Stable ABI
Dirkjan Ochtman wrote:
> It would seem to me that optimizations are likely to require data structure changes, for exactly the kind of core data structures that you're talking about locking down. But that's just a high-level view, I might be wrong.

In particular I would guess that ref counting is the biggest issue here. I would think not directly exposing the field and having inc/dec ref functions (real methods, not macros) for it would give a lot more ability to change the API in the future. It also might make it easier for alternate implementations to support the same API so some modules could work cross implementation - but I suspect that's a non-goal of this PEP :).

Other fields directly accessed (via macros or otherwise) might have similar problems but they don't seem as core as ref counting.
Re: [Python-Dev] PEP 384: Defining a Stable ABI
Martin v. Löwis wrote:
> Dino Viehland wrote:
>> Dirkjan Ochtman wrote:
>>> It would seem to me that optimizations are likely to require data structure changes, for exactly the kind of core data structures that you're talking about locking down. But that's just a high-level view, I might be wrong.
>>
>> In particular I would guess that ref counting is the biggest issue here. I would think not directly exposing the field and having inc/dec ref functions (real methods, not macros) for it would give a lot more ability to change the API in the future.
>
> In the context of optimization, I'm skeptical that introducing functions for the reference counting would be useful. Making the INCREF/DECREF macros functions just in case the reference counting goes away is IMO an unacceptable performance cost.
>
> Instead, such a change should go through the regular deprecation procedure and/or cause the release of Python 4.0.
>
>> It also might make it easier for alternate implementations to support the same API so some modules could work cross implementation - but I suspect that's a non-goal of this PEP :).
>
> Indeed :-) I'm also skeptical that this would actually allow cross-implementation modules to happen. The list of functions that an alternate implementation would have to provide is fairly long.

Just in case you're unaware of it: the company I work for has an open source project called Ironclad. This *is* a reimplementation of the Python C API and gives us binary compatibility with [some subset of] Python C extensions for use from IronPython.

http://www.resolversystems.com/documentation/index.php/Ironclad.html

It's an ambitious project, but it is now at the stage where thousands of the NumPy and SciPy tests pass when run from IronPython. I don't think this PEP impacts the project, but it is not completely unfeasible for the alternative implementations to do this.

In particular we have had to address the issue of the GIL and extensions (IronPython has no GIL) and reference counting (which IronPython also doesn't use).

Michael Foord

--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog
Re: [Python-Dev] PEP 384: Defining a Stable ABI
On May 17, 2009, at 4:54 PM, Martin v. Löwis wrote:
> Currently, each feature release introduces a new name for the Python DLL on Windows, and may cause incompatibilities for extension modules on Unix. This PEP proposes to define a stable set of API functions which are guaranteed to be available for the lifetime of Python 3, and which will also remain binary-compatible across versions. Extension modules and applications embedding Python can work with different feature releases as long as they restrict themselves to this stable ABI.

It seems like a good ideal to strive for.

But I think this is too strong a promise. IMO it would be better to say that ABI compatibility across releases is a goal. If someone does make a change that breaks the ABI, I'd expect whoever is proposing it to put forth a fairly strong argument for why it's a worthwhile change. But it should be possible and allowed, given the right circumstances, because I think it's pretty much inevitable that it *will* need to happen sometime.

(Of course there will need to be ABI tests, so that any potential ABI breakages are known about when they occur.)

Python is much more defined by its source language than its C extension API, so tying the Python major version number to the C ABI might not be the best idea from a marketing standpoint. (I can see it now... Python 4.0 major new features: we changed the C method definition struct layout incompatibly :)

James