[Python-Dev] LZW support in tarfile ?

2009-05-17 Thread Tarek Ziadé
Hello,

I want to remove the usage of the tar command in Distutils in favor
of the tarfile module.

But, there's an option in Distutils.make_archive to create a tarball
using the compress [1] program rather than gzip or bzip2.
Using tar -Z, it will pipe it to the compress program if present. This
program implements the LZW algorithm [2].

LZW used to be patented, but the patent seems to have expired in
every country now [3].

On the Distutils side I can work things out so that the tar archive
created can be piped to an arbitrary compression program when it is
not compressed using bzip2 or gzip.

But I was wondering whether we should add LZW support to tarfile,
besides gzip and bzip2?

That said, this compression format doesn't seem to be used much these days.
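
For reference, a minimal sketch of what building a tarball with the tarfile
module (instead of the tar command) could look like. The helper name
make_tarball and the in-memory file mapping are assumptions made for this
illustration; they are not Distutils code. Only the gzip and bzip2 modes that
tarfile already supports are used.

```python
import io
import tarfile


def make_tarball(files, compression="gz"):
    """Build a tar archive in memory (hypothetical helper, not Distutils API).

    files maps archive member names to their contents as bytes.
    compression is "gz", "bz2", or None for a plain tar stream.
    """
    buf = io.BytesIO()
    mode = "w:" + compression if compression else "w"
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(archive_bytes):
    """Return the member names of an archive, detecting compression."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return tar.getnames()


example = {"pkg/setup.py": b"# setup\n", "pkg/README": b"hello\n"}
gz_archive = make_tarball(example, "gz")
print(list_members(gz_archive))
```

With compression=None the helper returns the uncompressed tar stream, which is
the form that would be handed to an external compression program.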

Regards
Tarek

[1] http://en.wikipedia.org/wiki/Compress
[2] http://en.wikipedia.org/wiki/LZW
[3] http://www.unisys.com/about__unisys/lzw


-- 
Tarek Ziadé | http://ziade.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 376 : Changing the .egg-info structure

2009-05-17 Thread MRAB

Alexander Shigin wrote:
> On Sat, 16/05/2009 at 23:15 +0100, MRAB wrote:
>> FYI, on RISC OS '/' is a valid filename character and '.' is used as
>> the directory separator.
>>
>> I'd probably say that TAB is a reasonable character to use, even
>> though it's OK in POSIX; after all, should anyone really be using a
>> control character in a filename?
>
> The '\0' char is invalid in both Windows and POSIX. I don't know whether
> it is valid on RISC OS.


'\0' isn't a valid filename character on RISC OS.
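
As an illustration of the point about '\0', here is a small sketch showing
that current Python 3 refuses a filename with an embedded NUL before the
operating system is even consulted. The helper name python_rejects is made up
for this example, and the behaviour shown is that of CPython 3, not of the
Python versions discussed in this thread.

```python
import os


def python_rejects(name):
    """Return True if Python itself refuses the name (embedded NUL)."""
    try:
        os.stat(name)
    except ValueError:
        # Raised for an embedded null character, on every platform.
        return True
    except OSError:
        # The name was passed to the OS; the file just may not exist
        # (or the OS has its own objections to the name).
        return False
    return False


print(python_rejects("record\0field"))
print(python_rejects("record\tfield"))
```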


Re: [Python-Dev] LZW support in tarfile ?

2009-05-17 Thread Antoine Pitrou
Tarek Ziadé <ziade.tarek at gmail.com> writes:
>
> But I was wondering whether we should add LZW support to tarfile,
> besides gzip and bzip2?
>
> That said, this compression format doesn't seem to be used much these days.

It would be more useful to add LZMA / xz support.
I don't think compress is used anymore, except perhaps on old legacy systems.
On my Linux system, I have lots of .gz, .bz2 and .lzma files, but absolutely no
.Z files.

Regards

Antoine.




Re: [Python-Dev] LZW support in tarfile ?

2009-05-17 Thread Michael Foord

Antoine Pitrou wrote:
> Tarek Ziadé <ziade.tarek at gmail.com> writes:
>> But I was wondering whether we should add LZW support to tarfile,
>> besides gzip and bzip2?
>>
>> That said, this compression format doesn't seem to be used much these days.
>
> It would be more useful to add LZMA / xz support.
> I don't think compress is used anymore, except perhaps on old legacy systems.
> On my Linux system, I have lots of .gz, .bz2 and .lzma files, but absolutely no
> .Z files.
  


I've seen the occasional .Z file in recent years, but never that I 
recall for a Python package.


As plugging in external compression tools is less likely to work
cross-platform, wouldn't it be both easier and better to deprecate (and
not replace) the compress support?


If there is a huge outcry, adding LZW support to tarfile can be reconsidered.

Michael Foord





--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog




Re: [Python-Dev] LZW support in tarfile ?

2009-05-17 Thread Martin v. Löwis
> But, there's an option in Distutils.make_archive to create a tarball
> using the compress [1] program rather than gzip or bzip2.
> Using tar -Z, it will pipe it to the compress program if present. This
> program implements the LZW algorithm [2].

As everybody else says: it might be best to just remove that option.
For compatibility, perhaps deprecate it in 2.7 and 3.1, and remove it
in 3.2.

Regards,
Martin


Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-05-17 Thread Piet van Oostrum
 Ned Deily n...@acm.org (ND) wrote:

ND In article m2ocueq6mm@cs.uu.nl, Piet van Oostrum p...@cs.uu.nl 
ND wrote:
  Ronald Oussoren ronaldousso...@mac.com (RO) wrote:
 RO For what it's worth, the OSX APIs seem to behave as follows:
 RO * If you create a file with a non-UTF-8 name on an HFS+ filesystem the
 RO system automatically encodes the name.
 
 RO That is, open(chr(255), 'w') will silently create a file named '%FF'
 RO instead of the name you'd expect on a Unix system.
 
 Not for me (I am using Python 2.6.2).
 
 >>> f = open(chr(255), 'w')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 IOError: [Errno 22] invalid mode ('w') or filename: '\xff'
 >>>

ND What version of OSX are you using?  On Tiger 10.4.11 I see the failure 
ND you see but on Leopard 10.5.6 the behavior Ronald reports.

Yes, I am using Tiger (10.4.11). Interesting that it has changed on Leopard.
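
For context on the PEP under discussion: the mechanism PEP 383 proposes is the
"surrogateescape" error handler, which is available in Python 3.1 and later
(not in the Python 2.6 session shown above). The sketch below assumes a UTF-8
filesystem encoding and uses made-up helper names; it only shows that an
undecodable byte such as 0xFF survives a decode/encode round trip.

```python
def decode_filename(raw):
    """Decode bytes from the OS, smuggling undecodable bytes as lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def encode_filename(name):
    """Turn such a string back into the original bytes."""
    return name.encode("utf-8", "surrogateescape")


raw = b"\xff"
name = decode_filename(raw)
print(ascii(name))
print(encode_filename(name) == raw)

# A strict encode refuses the lone surrogate, so the escaped name cannot
# silently leak into ordinary UTF-8 output.
try:
    name.encode("utf-8")
except UnicodeEncodeError:
    print("strict encoding rejects the escaped name")
```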
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


[Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Martin v. Löwis
Thomas Wouters reminded me of a long-standing idea; I finally
found the time to write it down.

Please comment!

Regards,
Martin

PEP: 384
Title: Defining a Stable ABI
Version: $Revision: 72754 $
Last-Modified: $Date: 2009-05-17 21:14:52 +0200 (So, 17. Mai 2009) $
Author: Martin v. Löwis mar...@v.loewis.de
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 17-May-2009
Python-Version: 3.2
Post-History:

Abstract
========

Currently, each feature release introduces a new name for the
Python DLL on Windows, and may cause incompatibilities for extension
modules on Unix. This PEP proposes to define a stable set of API
functions which are guaranteed to be available for the lifetime
of Python 3, and which will also remain binary-compatible across
versions. Extension modules and applications embedding Python
can work with different feature releases as long as they restrict
themselves to this stable ABI.

Rationale
=========

The primary source of ABI incompatibility is changes to the layout
of in-memory structures. For example, the way in which string interning
works, or the data type used to represent the size of an object, have
changed during the life of Python 2.x. As a consequence, extension
modules making direct access to fields of strings, lists, or tuples,
would break if their code is loaded into a newer version of the
interpreter without recompilation: offsets of other fields may have
changed, making the extension modules access the wrong data.
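
The effect described above can be sketched with ctypes. The two structures
below are hypothetical layouts invented for this illustration (they are not
CPython's real object headers); they only show how widening a size field moves
every later field, so that code compiled against the old offsets reads the
wrong data.

```python
import ctypes


class ListV1(ctypes.Structure):
    # Hypothetical "old" layout: 32-bit reference count and size.
    _fields_ = [
        ("refcnt", ctypes.c_int32),
        ("size", ctypes.c_int32),
        ("items", ctypes.c_void_p),
    ]


class ListV2(ctypes.Structure):
    # Hypothetical "new" layout: the same fields widened to 64 bits.
    _fields_ = [
        ("refcnt", ctypes.c_int64),
        ("size", ctypes.c_int64),
        ("items", ctypes.c_void_p),
    ]


def read_size_with_old_layout(obj):
    """Reinterpret a ListV2 object's bytes using the ListV1 field offsets."""
    stale_view = ListV1.from_buffer_copy(bytes(obj))
    return stale_view.size


new_obj = ListV2(refcnt=1, size=3, items=None)
print("items offset:", ListV1.items.offset, "->", ListV2.items.offset)
print("size seen by new code:", new_obj.size)
print("size seen by stale code:", read_size_with_old_layout(new_obj))
```

An extension that only calls accessor functions never embeds these offsets,
which is why it keeps working when the layout changes.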

In some cases, the incompatibilities only affect internal objects of
the interpreter, such as frame or code objects. For example, the way
line numbers are represented has changed in the 2.x lifetime, as has
the way in which local variables are stored (due to the introduction
of closures). Even though most applications probably never used these
objects, changing them required changing PYTHON_API_VERSION.

On Linux, changes to the ABI are often not much of a problem: the
system will provide a default Python installation, and many extension
modules are already provided pre-compiled for that version. If additional
modules are needed, or additional Python versions, users can typically
compile them themselves on the system, resulting in modules that use
the right ABI.

On Windows, multiple simultaneous installations of different Python
versions are common, and extension modules are compiled by their
authors, not by end users. To reduce the risk of ABI incompatibilities,
Python currently introduces a new DLL name pythonXY.dll for each
feature release, whether or not ABI incompatibilities actually exist.

With this PEP, it will be possible to reduce the dependency of binary
extension modules on a specific Python feature release, and applications
embedding Python can be made to work with different releases.

Specification
=============

The ABI specification falls into two parts: an API specification,
specifying which functions (or groups of functions) are available for
use with the ABI, and a linkage specification specifying which libraries
to link with. The actual ABI (layout of structures in memory, function
calling conventions) is not specified, but implied by the
compiler. In addition, a specific ABI is recommended for
selected platforms.

During evolution of Python, new ABI functions will be added.
Applications using them will then have a requirement on a minimum
version of Python; this PEP provides no mechanism for such
applications to fall back when the Python library is too old.

Terminology
-----------

Applications and extension modules that want to use this ABI
are collectively referred to as applications from here on.

Header Files and Preprocessor Definitions
-----------------------------------------

Applications shall only include the header file Python.h (before
including any system headers), or, optionally, include pyconfig.h, and
then Python.h.

During the compilation of applications, the preprocessor macro
Py_LIMITED_API must be defined. Doing so will hide all definitions
that are not part of the ABI.

Structures
----------

Only the following structures and structure fields are accessible to
applications:

- PyObject (ob_refcnt, ob_type)
- PyVarObject (ob_base, ob_size)
- Py_buffer (buf, obj, len, itemsize, readonly, ndim, shape,
  strides, suboffsets, smalltable, internal)
- PyMethodDef (ml_name, ml_meth, ml_flags, ml_doc)
- PyMemberDef (name, type, offset, flags, doc)
- PyGetSetDef (name, get, set, doc, closure)

The accessor macros to these fields (Py_REFCNT, Py_TYPE, Py_SIZE)
are also available to applications.

The following types are available, but opaque (i.e. incomplete):

- PyThreadState
- PyInterpreterState

Type Objects
------------

The structure of type objects is not available to applications;
declaration of static type objects is not possible anymore
(for applications using this ABI).
Instead, type objects get created dynamically. To allow an
easy creation of types (in particular, to be able to fill out
function pointers 

Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Dirkjan Ochtman
On Sun, May 17, 2009 at 10:54 PM, Martin v. Löwis mar...@v.loewis.de wrote:
> Excluded Functions
> ------------------
>
> Functions declared in the following header files are not part
> of the ABI:
> - cellobject.h
> - classobject.h
> - code.h
> - frameobject.h
> - funcobject.h
> - genobject.h
> - pyarena.h
> - pydebug.h
> - symtable.h
> - token.h
> - traceback.h

What kind of effect does this have on optimization efforts, for
example all the stuff done by Antoine Pitrou over the last few months,
and the first few results from unladen? Will it mean we won't get to
the good optimizations until 4.0? Or does it just mean unladen swallow
takes longer to come back to trunk (until 4.0) and every extension
author who wants to be compatible with it will basically have the same
burden as now?

Cheers,

Dirkjan


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Martin v. Löwis
>> Functions declared in the following header files are not part
>> of the ABI:
>> - cellobject.h
>> - classobject.h
>> - code.h
>> - frameobject.h
>> - funcobject.h
>> - genobject.h
>> - pyarena.h
>> - pydebug.h
>> - symtable.h
>> - token.h
>> - traceback.h
>
> What kind of effect does this have on optimization efforts, for
> example all the stuff done by Antoine Pitrou over the last few months,
> and the first few results from unladen?

I fail to see the relationship, so: no effect that I can see.

Why do you think that optimization efforts could be related to
the PEP 384 proposal?

Regards,
Martin


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Georg Brandl
Martin v. Löwis schrieb:

> Header Files and Preprocessor Definitions
> -----------------------------------------
>
> Applications shall only include the header file Python.h (before
> including any system headers), or, optionally, include pyconfig.h, and
> then Python.h.

What about structmember.h?  It's not yet included with Python.h AFAICS.

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Dirkjan Ochtman
On Mon, May 18, 2009 at 12:07 AM, Martin v. Löwis mar...@v.loewis.de wrote:
> I fail to see the relationship, so: no effect that I can see.
>
> Why do you think that optimization efforts could be related to
> the PEP 384 proposal?

It would seem to me that optimizations are likely to require data
structure changes, for exactly the kind of core data structures that
you're talking about locking down. But that's just a high-level view,
I might be wrong.

Cheers,

Dirkjan


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Martin v. Löwis
>> Header Files and Preprocessor Definitions
>> -----------------------------------------
>>
>> Applications shall only include the header file Python.h (before
>> including any system headers), or, optionally, include pyconfig.h, and
>> then Python.h.
>
> What about structmember.h?  It's not yet included with Python.h AFAICS.

Right - I think it should be, though. Is there a reason why it's not
included?

The only reason I can see is that it isn't completely namespace-safe,
e.g. it defines a constant READONLY. Not sure whether the T_ constants
would need to be changed as well.

So if that's the rationale, I would propose to make it namespace-safe
under a different file name, and add alias #defines in structmember.h
for compatibility.

I also think this should happen independent of PEP 384.

See also issue 2897 - perhaps we can even fix it for 3.1.

Regards,
Martin


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Antoine Pitrou
Dirkjan Ochtman <dirkjan at ochtman.nl> writes:
>
> It would seem to me that optimizations are likely to require data
> structure changes, for exactly the kind of core data structures that
> you're talking about locking down. But that's just a high-level view,
> I might be wrong.

Unless I'm misunderstanding something, Martin doesn't advocate locking data
structures down (except a couple of outliers such as Py_buffer). An
ABI-compliant application mustn't tinker directly with Python's data structures,
but use the ABI functions.

Regards

Antoine.




Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Dirkjan Ochtman
On Mon, May 18, 2009 at 12:43 AM, Antoine Pitrou solip...@pitrou.net wrote:
> Unless I'm misunderstanding something, Martin doesn't advocate locking data
> structures down (except a couple of outliers such as Py_buffer). An
> ABI-compliant application mustn't tinker directly with Python's data
> structures, but use the ABI functions.

Right. Sorry about the noise, then.

Cheers,

Dirkjan


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Martin v. Löwis
Dirkjan Ochtman wrote:
> On Mon, May 18, 2009 at 12:07 AM, Martin v. Löwis mar...@v.loewis.de
> wrote:
>> I fail to see the relationship, so: no effect that I can see.
>>
>> Why do you think that optimization efforts could be related to
>> the PEP 384 proposal?
>
> It would seem to me that optimizations are likely to require data
> structure changes, for exactly the kind of core data structures that
> you're talking about locking down. But that's just a high-level view,
> I might be wrong.

Ah. It's exactly the opposite: The purpose of the PEP is not to lock
the data structures down, but to allow more flexible evolution of
them - by completely hiding them from extension modules.

Currently, any data structure change must be weighed for its impact
on binary compatibility. With the PEP, changing structures can
be done fairly freely - with the exception of the very few structures
that do get locked down. In particular, the list of header files
that you quoted precisely contains the structures that can be
modified with no impact on the ABI.

I'm not aware that any of the structures that I propose to lock
would be relevant for optimization - but I might be wrong. If so,
I'd like to know, and it would be possible to add accessor functions
in cases where extension modules might still legitimately want to
access certain fields.

Certain changes to the VM would definitely be binary-incompatible,
such as removal of reference counting. However, such a change would
probably have a much wider effect, breaking not just binary
compatibility, but also source compatibility. It would be justified
to call a Python release that makes such a change 4.0.

Regards,
Martin


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Martin v. Löwis
Dino Viehland wrote:
> Dirkjan Ochtman wrote:
>> It would seem to me that optimizations are likely to require data
>> structure changes, for exactly the kind of core data structures that
>> you're talking about locking down. But that's just a high-level view,
>> I might be wrong.
>
> In particular I would guess that ref counting is the biggest issue here.
> I would think not directly exposing the field and having inc/dec ref
> Functions (real methods, not macros) for it would give a lot more
> ability to change the API in the future.

In the context of optimization, I'm skeptical that introducing functions
for the reference counting would be useful. Making the INCREF/DECREF
macros functions just in case the reference counting goes away is IMO
an unacceptable performance cost.

Instead, such a change should go through the regular deprecation
procedure and/or cause the release of Python 4.0.

> It also might make it easier for alternate implementations to support
> the same API so some modules could work cross implementation - but I
> suspect that's a non-goal of this PEP :).

Indeed :-) I'm also skeptical that this would actually allow
cross-implementation modules to happen. The list of functions that
an alternate implementation would have to provide is fairly long.

The memory management APIs in particular also assume a certain layout
of Python objects in general, namely that they start with a header
whose size is a compile-time constant. Again, making this more flexible
just in case would also impact performance, and probably fairly badly
so.

> Other fields directly accessed (via macros or otherwise) might have similar
> problems but they don't seem as core as ref counting.

Access to the type object reference is probably similar. All the other
structs are used directly in C code, with no accessor macros.

Regards,
Martin


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Dino Viehland
Dirkjan Ochtman wrote:

> It would seem to me that optimizations are likely to require data
> structure changes, for exactly the kind of core data structures that
> you're talking about locking down. But that's just a high-level view,
> I might be wrong.



In particular I would guess that ref counting is the biggest issue here.
I would think not directly exposing the field and having inc/dec ref
Functions (real methods, not macros) for it would give a lot more
ability to change the API in the future.

It also might make it easier for alternate implementations to support
the same API so some modules could work cross implementation - but I
suspect that's a non-goal of this PEP :).

Other fields directly accessed (via macros or otherwise) might have similar
problems but they don't seem as core as ref counting.


Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread Michael Foord

Martin v. Löwis wrote:
> Dino Viehland wrote:
>> Dirkjan Ochtman wrote:
>>> It would seem to me that optimizations are likely to require data
>>> structure changes, for exactly the kind of core data structures that
>>> you're talking about locking down. But that's just a high-level view,
>>> I might be wrong.
>>
>> In particular I would guess that ref counting is the biggest issue here.
>> I would think not directly exposing the field and having inc/dec ref
>> Functions (real methods, not macros) for it would give a lot more
>> ability to change the API in the future.
>
> In the context of optimization, I'm skeptical that introducing functions
> for the reference counting would be useful. Making the INCREF/DECREF
> macros functions just in case the reference counting goes away is IMO
> an unacceptable performance cost.
>
> Instead, such a change should go through the regular deprecation
> procedure and/or cause the release of Python 4.0.
>
>> It also might make it easier for alternate implementations to support
>> the same API so some modules could work cross implementation - but I
>> suspect that's a non-goal of this PEP :).
>
> Indeed :-) I'm also skeptical that this would actually allow
> cross-implementation modules to happen. The list of functions that
> an alternate implementation would have to provide is fairly long.

  


Just in case you're unaware of it; the company I work for has an open 
source project called Ironclad. This *is* a reimplementation of the 
Python C API and gives us binary compatibility with [some subset of] 
Python C extensions for use from IronPython.


http://www.resolversystems.com/documentation/index.php/Ironclad.html

It's an ambitious project but it is now at the stage where 1000s of the
Numpy and Scipy tests pass when run from IronPython. I don't think this
PEP impacts the project, but it is not completely infeasible for the
alternative implementations to do this.


In particular we have had to address the issue of the GIL and extensions
(IronPython has no GIL) and reference counting (which IronPython also
doesn't use).


Michael Foord



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog




Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread James Y Knight


On May 17, 2009, at 4:54 PM, Martin v. Löwis wrote:

> Currently, each feature release introduces a new name for the
> Python DLL on Windows, and may cause incompatibilities for extension
> modules on Unix. This PEP proposes to define a stable set of API
> functions which are guaranteed to be available for the lifetime
> of Python 3, and which will also remain binary-compatible across
> versions. Extension modules and applications embedding Python
> can work with different feature releases as long as they restrict
> themselves to this stable ABI.



It seems like a good ideal to strive for.

But I think this is too strong a promise. IMO it would be better to
say that ABI compatibility across releases is a goal. If someone does
make a change that breaks the ABI, I'd expect whoever is proposing it
to put forth a fairly strong argument for why it's a worthwhile
change. But it should be possible and allowed, given the right
circumstances. Because I think it's pretty much inevitable that it
*will* need to happen, sometime.


(of course there will need to be ABI tests, so that any potential ABI  
breakages are known about when they occur)


Python is much more defined by its source language than its C  
extension API, so tying the python major version number to the C ABI  
might not be the best idea from a marketing standpoint. (I can see  
it now...Python 4.0 major new features: we changed the C method  
definition struct layout incompatibly :)


James