Re: [Python-Dev] PEP 355 status

2006-10-25 Thread stephen
Scott Dial writes:
 > [EMAIL PROTECTED] wrote:
 > > Talin writes:
 > >  > (one additional postscript - One thing I would be interested in is an 
 > >  > approach that unifies file paths and URLs so that there is a consistent 
 > >  > locator scheme for any resource, whether they be in a filesystem, on a 
 > >  > web server, or stored in a zip file.)
 > > 
 > > +1

 > It would make more sense to register protocol handlers to this magical 
 > unification of resource manipulation.

I don't think it's that magical, and it's not manipulation, it's
location.

The question is, register where and on what?  For example on my Mac
there are some PDFs I want to open in Preview and others in Acrobat.
To the extent that I have some classes which are one or the other, I
might want to register the handler to a wildcard path object.
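
A minimal sketch of what registering a handler against a wildcard path could look like. The registry, the `register` and `handler_for` names, and the example patterns are all hypothetical; nothing in this thread or in the stdlib defines them.

```python
from fnmatch import fnmatchcase

# Hypothetical registry: an ordered list of (wildcard pattern, handler name).
_handlers = []

def register(pattern, handler):
    """Associate a handler with every path matching a shell-style pattern."""
    _handlers.append((pattern, handler))

def handler_for(path, default=None):
    """Return the first registered handler whose pattern matches the path."""
    for pattern, handler in _handlers:
        if fnmatchcase(path, pattern):
            return handler
    return default

# More specific patterns are registered first so that they win.
register("/Users/*/Papers/*.pdf", "Acrobat")
register("*.pdf", "Preview")
```

With this ordering, `handler_for("/Users/steve/Papers/x.pdf")` picks the Acrobat entry, while any other PDF falls through to Preview.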

 > But allow me to perform my first channeling of Guido.. YAGNI.

True, but only because when I do need that kind of stuff I'm normally
writing Emacs Lisp, not Python.  We have a wide variety of functions
for manipulating path strings, and they make exactly the distinction
between path and inode/content that Talin does (where a path is being
manipulated, the function has "filename" in its name, where a file or
its metadata is being accessed, the function's name contains "file").
Nonetheless there are two or three places where programmers I respect
have chosen to invent path classes to handle hairy special cases.
These classes are very useful in those special cases.

One place where this gets especially hairy is in the TRAMP package,
which allows you to construct "remote paths" involving (for example)
logging into host A by ssh, from there to host B by ssh, and finally a
"relay download" of the content from host C to the local host by scp.
The net effect is that you can specify the path in your "open file"
dialog, and Emacs does the rest automatically; the only differences
the user sees between that and a local file are the length of the path
string and the time it takes to actually access the contents.

Once you've done that, that process is embedded into Emacs's notion of
the "current directory", so you can list the directory containing the
resource, or access siblings, very conveniently.

I don't expect to reproduce that functionality in Python personally,
but such use cases do exist.  Whether a general path class can be
invented that doesn't accumulate cruft faster than use cases is
another issue.

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Talin
Scott Dial wrote:
> [EMAIL PROTECTED] wrote:
>> Talin writes:
>>  > (one additional postscript - One thing I would be interested in is
>>  > an approach that unifies file paths and URLs so that there is a
>>  > consistent locator scheme for any resource, whether they be in a
>>  > filesystem, on a web server, or stored in a zip file.)
>>
>> +1
>>
>> But doesn't file:/// do that for files, and couldn't we do something
>> like zipfile:///nantoka.zip#foo/bar/baz.txt?  Of course, we'd want to
>> do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too.  That
>> way leads to madness
>>
> 
> It would make more sense to register protocol handlers to this magical 
> unification of resource manipulation. But allow me to perform my first 
> channeling of Guido.. YAGNI.
> 

I'm thinking that it was a tactical error on my part to throw in the 
whole "unified URL / filename namespace" idea, which really has nothing 
to do with the topic. Let's drop it, or start another topic, and let this 
thread focus on critiques of the path module, which is probably more 
relevant at the moment.

-- Talin


Re: [Python-Dev] __str__ bug?

2006-10-25 Thread M.-A. Lemburg
Mike Krell wrote:
>> class S(str):
>>     def __str__(self): return "S.__str__"
>>
>> class U(unicode):
>>     def __str__(self): return "U.__str__"
>>
>> print str(S())
>> print str(U())
>>
>> This script prints:
>>
>> S.__str__
>> U.__str__
> 
> Yes, but "print U()" prints nothing, and the explicit str() should not
> be necessary.

The main difference here is that the string object defines
a tp_print slot, while Unicode doesn't.

As a result, tp_print for the string subtype is called and
this does an extra check for subtypes:

if (! PyString_CheckExact(op)) {
    int ret;
    /* A str subclass may have its own __str__ method. */
    op = (PyStringObject *) PyObject_Str((PyObject *)op);
    if (op == NULL)
        return -1;
    ret = string_print(op, fp, flags);
    Py_DECREF(op);
    return ret;
}

For Unicode, the PyObject_Print() API defaults to using
PyObject_Str() which uses the tp_str slot. This maps
directly to a Unicode API that works on the internals
and doesn't apply any extra checks to see if it was called
on a subtype.

Note that this is true for many of the __special__
slot methods you can implement on subtypes of built-in
types - they don't always work as you might expect.

Now in this rather common case, I guess we could add
support to the Unicode object to do extra checks like the
string object does.

Ditto for the %-formatting.
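
A small check that readers can run on their own interpreter, assuming Python 3, where `str` is the Unicode type and there is no separate `unicode`. It only shows how a `__str__` override on a string subclass is picked up by `str()` and `print()` there; it says nothing about the 2.x `tp_print` path discussed above.

```python
import io

class S(str):
    def __str__(self):
        return "S.__str__"

# Capture what print() writes instead of sending it to the terminal.
buf = io.StringIO()
print(S("payload"), file=buf)

explicit = str(S("payload"))   # explicit conversion calls the override
printed = buf.getvalue()       # print() converts via str() as well
```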

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Nick Coghlan
Talin wrote:
> Part 3: Does this mean that the current API cannot be improved?
> 
> Certainly not! I think everyone (well, almost) agrees that there is much 
> room for improvement in the current APIs. They certainly need to be 
> refactored and recategorized.
> 
> But I don't think that the solution is to take all of the path-related 
> functions and drop them into a single class, or even a single module.

+1 from me.

(for both the fraction I quoted and everything else you said, including the 
locator/inode/file distinction - although I'd also add that 'symbolic link' 
and 'directory' exist at a similar level as 'file').

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] __str__ bug?

2006-10-25 Thread Martin v. Löwis
Mike Krell schrieb:
>> Based on the behaviour of str and the fact that overriding unicode.__repr__
>> works just fine, I'd say file a bug on SF.
> 
> Done.  This is item 1583863.

Of course, it would be even better if you could also include a patch.

Regards,
Martin


Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Talin
Nick Coghlan wrote:
> Talin wrote:
>> Part 3: Does this mean that the current API cannot be improved?
>>
>> Certainly not! I think everyone (well, almost) agrees that there is 
>> much room for improvement in the current APIs. They certainly need to 
>> be refactored and recategorized.
>>
>> But I don't think that the solution is to take all of the path-related 
>> functions and drop them into a single class, or even a single module.
> 
> +1 from me.
> 
> (for both the fraction I quoted and everything else you said, including 
> the locator/inode/file distinction - although I'd also add that 
> 'symbolic link' and 'directory' exist at a similar level as 'file').

I would tend towards classifying directory operations as inode-level 
operations, in that you are working at the "filesystem as graph" level 
rather than the "stream of bytes" level. When you iterate over a 
directory, what you get back is effectively inodes (well, directory 
entries are distinct from inodes in the underlying filesystem, but from 
Python there's no practical distinction).

If I could draw a UML diagram in ASCII, I would have "inode --> points 
to --> directory or file" and "directory --> contains * --> inode". That 
would hopefully make things clearer.
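
The "directory contains entries, each entry points to a directory or a file" picture can be sketched with `os.scandir`, which postdates this thread (Python 3.5+). The `describe` helper and the file names are made up for illustration.

```python
import os
import tempfile

def describe(directory):
    """Map each entry name in a directory to 'symlink', 'directory' or 'file'."""
    kinds = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Each entry is a graph-level object; no file contents are read.
            if entry.is_symlink():
                kinds[entry.name] = "symlink"
            elif entry.is_dir():
                kinds[entry.name] = "directory"
            else:
                kinds[entry.name] = "file"
    return kinds

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    with open(os.path.join(root, "data.txt"), "w") as fh:
        fh.write("bytes live at the file level\n")
    kinds = describe(root)
```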

Symbolic links I am not so sure about; in some ways, hard links are 
easier to classify.

---

Having done a path library myself (in C++, for our code base at work), 
the trickiest part is getting the Windows path manipulations right, and 
fitting them into a model that allows writing of platform-agnostic code. 
This is especially vexing when you realize that it's often useful to 
manipulate unix-style paths even when running under Win32 and vice 
versa. A prime example is that I have a lot of Python code at work that 
manipulates Perforce client specs files. The path specifications in 
these files are platform-agnostic, and use forward slashes regardless of 
the host platform, so "os.path.normpath" doesn't do the right thing for me.

> Cheers,
> Nick.



Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Phillip J. Eby
At 09:49 AM 10/25/2006 -0700, Talin wrote:
>Having done a path library myself (in C++, for our code base at work),
>the trickiest part is getting the Windows path manipulations right, and
>fitting them into a model that allows writing of platform-agnostic code.
>This is especially vexing when you realize that its often useful to
>manipulate unix-style paths even when running under Win32 and vice
>versa. A prime example is that I have a lot of Python code at work that
>manipulates Perforce client specs files. The path specifications in
>these files are platform-agnostic, and use forward slashes regardless of
>the host platform, so "os.path.normpath" doesn't do the right thing for me.

You probably want to use the posixpath module directly in that case, though 
perhaps you've already discovered that.
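
A minimal sketch of that suggestion: `posixpath` can be imported on any platform and always treats "/" as the separator. The depot path below is an invented example, not taken from a real client spec.

```python
import posixpath

# A forward-slash path of the kind found in a platform-agnostic spec file.
spec = "//depot/main/../lib/./util.py"

# Normalization keeps forward slashes regardless of the host OS.
normalized = posixpath.normpath(spec)
joined = posixpath.join("//depot/lib", "tests", "test_util.py")
```

Note that `posixpath.normpath` preserves exactly two leading slashes, which happens to suit depot-style paths.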



Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Talin
Phillip J. Eby wrote:
> At 09:49 AM 10/25/2006 -0700, Talin wrote:
>> Having done a path library myself (in C++, for our code base at work),
>> the trickiest part is getting the Windows path manipulations right, and
>> fitting them into a model that allows writing of platform-agnostic code.
>> This is especially vexing when you realize that its often useful to
>> manipulate unix-style paths even when running under Win32 and vice
>> versa. A prime example is that I have a lot of Python code at work that
>> manipulates Perforce client specs files. The path specifications in
>> these files are platform-agnostic, and use forward slashes regardless of
>> the host platform, so "os.path.normpath" doesn't do the right thing 
>> for me.
> 
> You probably want to use the posixpath module directly in that case, 
> though perhaps you've already discovered that.

Never heard of it. It's not in the standard library, is it? I don't see 
it in the table of contents or the index.


Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Fred L. Drake, Jr.
On Wednesday 25 October 2006 13:16, Talin wrote:
 > Never heard of it. Its not in the standard library, is it? I don't see
 > it in the table of contents or the index.

This is a documentation bug.  :-(  I'd thought they were mentioned 
*somewhere*, but it looks like I'm wrong.

os.path is an alias for one of several different real modules; which is 
selected depends on the platform.  I see the following: macpath, ntpath, 
os2emxpath, riscospath.  (ntpath is used for all Windows versions, not just 
NT.)
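
The aliasing is easy to confirm interactively. This sketch assumes a current CPython, where (to the best of my knowledge) only `posixpath` and `ntpath` remain as implementations.

```python
import os
import ntpath
import posixpath

# os.path is simply whichever implementation module matches the host platform.
implementation = os.path.__name__
is_alias = os.path is posixpath or os.path is ntpath
```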


  -Fred

-- 
Fred L. Drake, Jr.   


Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Phillip J. Eby
At 10:16 AM 10/25/2006 -0700, Talin wrote:
>Phillip J. Eby wrote:
> > At 09:49 AM 10/25/2006 -0700, Talin wrote:
> >> Having done a path library myself (in C++, for our code base at work),
> >> the trickiest part is getting the Windows path manipulations right, and
> >> fitting them into a model that allows writing of platform-agnostic code.
> >> This is especially vexing when you realize that its often useful to
> >> manipulate unix-style paths even when running under Win32 and vice
> >> versa. A prime example is that I have a lot of Python code at work that
> >> manipulates Perforce client specs files. The path specifications in
> >> these files are platform-agnostic, and use forward slashes regardless of
> >> the host platform, so "os.path.normpath" doesn't do the right thing
> >> for me.
> >
> > You probably want to use the posixpath module directly in that case,
> > though perhaps you've already discovered that.
>
>Never heard of it. Its not in the standard library, is it? I don't see
>it in the table of contents or the index.

posixpath, ntpath, macpath, et al are the platform-specific path 
manipulation modules that are aliased to os.path.  However, each of these 
modules' string path manipulation functions can be imported and used on any 
platform.  See below:

Linux:

Python 2.3.5 (#1, Aug 25 2005, 09:17:44)
[GCC 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> os.path

 >>> import ntpath
 >>> dir(ntpath)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', 'abspath', 
'altsep', 'basename', 'commonprefix', 'curdir', 'defpath', 'dirname', 
'exists', 'expanduser', 'expandvars', 'extsep', 'getatime', 'getctime', 
'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink', 'ismount', 
'join', 'normcase', 'normpath', 'os', 'pardir', 'pathsep', 'realpath', 
'sep', 'split', 'splitdrive', 'splitext', 'splitunc', 'stat', 
'supports_unicode_filenames', 'sys', 'walk']

Windows:

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
 >>> import os
 >>> os.path

 >>> import posixpath
 >>> dir(posixpath)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '_varprog', 
'abspath', 'altsep', 'basename', 'commonprefix', 'curdir', 'defpath', 
'dirname', 'exists', 'expanduser', 'expandvars', 'extsep', 'getatime', 
'getctime', 'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink', 
'ismount', 'join', 'normcase', 'normpath', 'os', 'pardir', 'pathsep', 
'realpath', 'samefile', 'sameopenfile', 'samestat', 'sep', 'split', 
'splitdrive', 'splitext', 'stat', 'supports_unicode_filenames', 'sys', 'walk']


Note, therefore, that any "path object" system should also allow you to 
create and manipulate foreign paths.  That is, it should have variants for 
each path type, rather than being locked to the local platform's path 
strings.  Of course, the most common need for this is manipulating posix 
paths on non-posix platforms, but sometimes one must deal with Windows 
paths on Unix, too.
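
For readers on a modern Python: the later `pathlib` module (3.4+) provides this kind of per-flavour variant through its "pure" path classes. The sketch below is not part of PEP 355, and the example paths are invented.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths never touch the filesystem, so either flavour works on any host.
win = PureWindowsPath(r"C:\Projects\demo\src\main.py")
posix = PurePosixPath("/srv/projects/demo/src/main.py")

win_parent = win.parent          # Windows rules: drive letter, backslashes
posix_name = posix.name          # POSIX rules: "/" is the only separator
win_as_posix = win.as_posix()    # same path rendered with forward slashes
```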



Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Fredrik Lundh
Talin wrote:

>> You probably want to use the posixpath module directly in that case, 
>> though perhaps you've already discovered that.
> 
> Never heard of it. Its not in the standard library, is it? I don't see 
> it in the table of contents or the index.

http://effbot.org/librarybook/posixpath.htm





[Python-Dev] Fwd: Re: ANN compiler2 : Produce bytecode from Python 2.5 AST

2006-10-25 Thread Michael Spencer

Martin v. Löwis wrote:
> Georg Brandl schrieb:
>> Perhaps you can bring up a discussion on python-dev about your improvements
>> and how they could be integrated into the standard library...
> 
> Let me second this. The compiler package is largely unmaintained and
> was known to be broken (and perhaps still is). A replacement
> implementation, especially if it comes with a new maintainer, would
> be welcome.
> 
> Regards,
> Martin

Hello python-dev.

I use AST-based code inspection and manipulation, and I've been looking forward 
to using v2.5 ASTs for their increased accuracy, consistency and speed.  However, 
there is as yet no Python-exposed mechanism for compiling v2.5 ASTs to bytecode.

So to meet my own need and interest I've been implementing 'compiler2', similar 
in scope to the stdlib compiler package, but generating code from Python 2.5 
_ast.ASTs.  The code has evolved considerably from the compiler package: in 
aggregate the changes amount to a re-write.  More about the package and its 
status below.

I'm introducing this project here to discuss whether and how these changes 
should be integrated with the stdlib.

I believe there is a prima facie need to have a builtin/stdlib capability for 
compiling v2.5 ASTs from Python, and there is some advantage to having that be 
implemented in Python.  There is also a case for deprecating the v2.4 ASTs to 
ease maintenance and reduce the confusion associated with two different AST 
formats.
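
For context, here is what such a capability looks like in a current Python 3 (3.8 or later), where the built-in `compile()` accepts `ast` objects directly. This is a sketch using the modern `ast` module, not compiler2 and not the 2.5 `_ast` API; the `DoubleConstants` transformer is a toy invented for illustration.

```python
import ast

source = "result = 6 * 7\n"
tree = ast.parse(source, filename="<demo>", mode="exec")

class DoubleConstants(ast.NodeTransformer):
    """Toy transformation: double every integer literal."""
    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node

# Rewrite the tree, then compile the AST (not the source text) to bytecode.
tree = ast.fix_missing_locations(DoubleConstants().visit(tree))
code = compile(tree, "<demo>", "exec")

namespace = {}
exec(code, namespace)   # runs "result = 12 * 14"
```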

If there is interest, I'm willing to make compiler2 stdlib-ready.  I'm also open 
to alternative approaches, including doing nothing.

compiler2 Objectives and Status
===============================
My goal is to get compiler2 to produce identical output to __builtin__.compile 
(at least optionally), while also providing an accessible framework for 
AST-manipulation, experimental compiler optimizations and customization.

compiler2 is not finished - there are some unresolved bugs, and open questions 
on interface design - but already it produces identical output to 
__builtin__.compile for all of the stdlib modules and their tests (except for 
the stackdepth attribute which is different in 12 cases). All but three stdlib 
modules pass their tests after being compiled using compiler2.  More on goals, 
status, known issues etc... in the project readme.txt at: 
http://svn.brownspencer.com/pycompiler/branches/new_ast/readme.txt

Code is available in Subversion at 
http://svn.brownspencer.com/pycompiler/branches/new_ast/

The main test script is test/test_compiler.py which compiles all the modules in 
/Lib and /Lib/test and compares the output with __builtin__.compile.
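
The comparison idea can be sketched in a few lines on a current Python 3: compile the same source once directly and once via an AST, then compare the resulting code objects. This is only an illustration of the approach, not the contents of test/test_compiler.py.

```python
import ast

source = "def add(a, b):\n    return a + b\n"

# Reference result: the built-in compiler working from source text.
direct = compile(source, "<demo>", "exec")

# Candidate result: the same source routed through an explicit AST.
via_ast = compile(ast.parse(source, filename="<demo>"), "<demo>", "exec")

same_bytecode = direct.co_code == via_ast.co_code
same_names = direct.co_names == via_ast.co_names
```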


Best regards

Michael Spencer









Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Greg Ewing
Talin wrote:
> (Actually, the OOP approach has a slight advantage in terms of the 
> amount of syntactic sugar available,

Even if you don't use any operator overloading, there's
still the advantage that an object provides a namespace
for its methods. Without that, you either have to use
fairly verbose function names or keep qualifying them
with a module name. Code that uses the current path
functions tends to contain a lot of
os.path.this(os.path.that(...)) stuff which is quite
tedious to write and read.
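
To make the contrast concrete, here is the same computation in both styles. The object version uses `pathlib.PurePosixPath` from later Python versions (3.4+) purely as a stand-in for "some path object"; the file name is invented.

```python
import posixpath
from pathlib import PurePosixPath

filename = "/usr/lib/python/site-packages/pkg/module.py"

# Function style: every call is qualified by the module name.
func_result = posixpath.join(
    posixpath.dirname(posixpath.dirname(filename)), "README")

# Method style: the object supplies the namespace.
meth_result = PurePosixPath(filename).parent.parent / "README"
```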

Another consideration is that having paths be a
distinct data type allows for the possibility of file
system references that aren't just strings. In
Classic MacOS, for example, the definitive way of
referencing a file is by a (volRefNum, dirID, name)
tuple, and textual paths aren't guaranteed to be
unique or even to exist.
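
A minimal sketch of such a non-textual reference as a distinct type. The `FileRef` class, its field names and the numeric values are hypothetical and only mirror the shape of the tuple described above.

```python
from typing import NamedTuple

class FileRef(NamedTuple):
    """Hypothetical non-textual file reference: volume, directory, name."""
    vol_ref_num: int
    dir_id: int
    name: str

# Two files with the same name on different volumes stay distinguishable,
# even if a textual path built from volume names would be ambiguous.
a = FileRef(vol_ref_num=-1, dir_id=2048, name="Notes")
b = FileRef(vol_ref_num=-2, dir_id=2048, name="Notes")
```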

> (I should note that the Java Path API does *not* follow my scheme of 
> separation between locators and inodes, while the C# API does, which is 
> another reason why I prefer the C# approach.)

A compromise might be to have all the "path algebra"
operations be methods, and everything else functions
which operate on path objects. That would make sense,
because the path algebra ought to be a closed set
of operations that's tightly coupled to the platform's
path semantics.
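
One way to picture that split, as a sketch only: the `AlgebraPath` class and the `exists` function are hypothetical names, and POSIX semantics are assumed for the algebra.

```python
import os
import posixpath

class AlgebraPath:
    """Hypothetical path type exposing only the closed 'path algebra'."""
    def __init__(self, text):
        self.text = posixpath.normpath(text)

    def join(self, *parts):
        return AlgebraPath(posixpath.join(self.text, *parts))

    def parent(self):
        return AlgebraPath(posixpath.dirname(self.text))

    def __str__(self):
        return self.text

def exists(path):
    """Filesystem access stays a plain function that accepts a path object."""
    return os.path.exists(str(path))

p = AlgebraPath("/usr/local").join("lib", "python")
```

The methods never touch the disk, so they can be tested on any platform; only `exists` depends on the local filesystem.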

--
Greg


Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Talin
Greg Ewing wrote:
> Talin wrote:
>> (Actually, the OOP approach has a slight advantage in terms of the 
>> amount of syntactic sugar available,
> 
> Even if you don't use any operator overloading, there's
> still the advantage that an object provides a namespace
> for its methods. Without that, you either have to use
> fairly verbose function names or keep qualifying them
> with a module name. Code that uses the current path
> functions tends to contain a lot of
> os.path.this(os.path.that(...)) stuff which is quite
> tedious to write and read.

Given the flexibility that Python allows in naming the modules that you 
import, I'm not sure that this is a valid objection -- you can make the 
module name as short as you feel comfortable with.

> Another consideration is that having paths be a
> distinct data type allows for the possibility of file
> system references that aren't just strings. In
> Classic MacOS, for example, the definitive way of
> referencing a file is by a (volRefNum, dirID, name)
> tuple, and textual paths aren't guaranteed to be
> unique or even to exist.

That's true of textual paths in general - i.e. even on unix, textual 
paths aren't guaranteed to be unique or exist.

It's been a while since I used classic MacOS - how do you handle things 
like configuration files with path names in them?

>> (I should note that the Java Path API does *not* follow my scheme of 
>> separation between locators and inodes, while the C# API does, which 
>> is another reason why I prefer the C# approach.)
> 
> A compromise might be to have all the "path algebra"
> operations be methods, and everything else functions
> which operate on path objects. That would make sense,
> because the path algebra ought to be a closed set
> of operations that's tightly coupled to the platform's
> path semantics.

Personally, this is one of those areas where I am strongly tempted to 
violate TOOWTDI - I can see use cases where string-based paths would be 
more convenient and less typing, and other use cases where object-based 
paths would be more convenient and less typing.

If I were designing a path library, I would create a string-based system 
as the lowest level, and an object-based system on top of it (the reason 
for doing it that way is simply so that people who want to use strings 
don't have to suffer the cost of creating temporary path objects to do 
simple things like joins). Moreover, I would keep the naming conventions 
of the two systems similar, if at all possible - thus, the object methods 
would have the same (short) names as the functions within the module.

So for example:

# Import new, refactored module io.path
from io import path

# Case 1 using strings
path1 = path.join( "/Library/Frameworks", "Python.Framework" )
parent = path.parent( path1 )

# Case 2 using objects
pathobj = path.Path( "/Library/Frameworks" )
pathobj += "Python.Framework"
parent = pathobj.parent()
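
A runnable approximation of the two layers, for anyone who wants to experiment. There is no `io.path` module; the `join`, `parent` and `Path` names below are defined locally, and `posixpath` is assumed as the underlying string implementation.

```python
import posixpath

# String layer: plain functions on str.
def join(*parts):
    return posixpath.join(*parts)

def parent(path):
    return posixpath.dirname(path)

# Object layer: a thin wrapper that reuses the string functions.
class Path:
    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        # Supports both `p + "x"` and `p += "x"`.
        return Path(join(self.text, str(other)))

    def parent(self):
        # Inside the method body, `parent` refers to the module-level function.
        return Path(parent(self.text))

    def __str__(self):
        return self.text

# Case 1 using strings
path1 = join("/Library/Frameworks", "Python.framework")

# Case 2 using objects
pathobj = Path("/Library/Frameworks")
pathobj += "Python.framework"
```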

Let me riff on this just a bit more - don't take this all too seriously 
though:

Refactored organization of path-related modules (under a new name
so as not to conflict with existing modules):

io.path -- path manipulations
io.dir -- directory functions, including dirwalk
io.fs -- dealing with filesystem objects (inodes, symlinks, etc.)
io.file -- file read / write streams

# Import directory module
import io.dir

# String based API
for entry in io.dir.listdir( "/Library/Frameworks" ):
   print entry  # Entry is a string

# Object based API
dir = io.dir.Directory( "/Library/Frameworks" )
for entry in dir: # Iteration protocol on dir object
   print entry  # entry is an obj, but __str__() returns path text

# Dealing with various filesystems: pass in a format parameter
dir = io.dir.Directory( "/Library/Frameworks" )
for entry in dir:
   print entry.path( format="NT" ) # entry printed in NT format

# Or you can just use a format specifier for PEP 3101 string format:
print "Path in local system format is {0}".format( entry )
print "Path in NT format is {0:NT}".format( entry )
print "Path in OS X format is {0:OSX}".format( entry )
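
The format-specifier idea maps onto the `__format__` hook that PEP 3101 defines. A minimal sketch, assuming a hypothetical `Entry` class that stores path components and treats the "NT" and "OSX" specifiers as nothing more than a choice of separator:

```python
import os

class Entry:
    """Hypothetical directory entry stored as a tuple of path components."""
    _separators = {"": os.sep, "NT": "\\", "OSX": "/"}

    def __init__(self, *parts):
        self.parts = parts

    def __format__(self, spec):
        # str.format() passes whatever follows the colon as `spec`.
        try:
            sep = self._separators[spec]
        except KeyError:
            raise ValueError("unknown path format: %r" % spec)
        return sep.join(self.parts)

entry = Entry("Library", "Frameworks", "Python.framework")
nt_text = "Path in NT format is {0:NT}".format(entry)
osx_text = "Path in OS X format is {0:OSX}".format(entry)
```

A real implementation would also have to deal with drive letters, roots and quoting, which this sketch ignores.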

Anyway, off the top of my head, that's what a refactored path API would 
look like if I were doing it :)

(Yes, the names are bad, can't think of better ATM.)

-- Talin


Re: [Python-Dev] PEP 355 status

2006-10-25 Thread Greg Ewing
Talin wrote:
> Ideally, you should be able to pass 
> "file:///..." to a regular "open" function.

I'm not so sure about that. Consider that "file:///foo.bar"
is a valid relative pathname on Unix to a file called "foo.bar"
in a directory called "file:".

That's not to say there shouldn't be a function available
that understands it, but I wouldn't want it built into
all functions that take pathnames.
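
The ambiguity is easy to demonstrate with the stdlib: the same text parses cleanly both as a POSIX path and as a URL. This sketch uses Python 3 module names (`urllib.parse`).

```python
import posixpath
from urllib.parse import urlsplit

text = "file:///foo.bar"

# Read as a POSIX path: relative, with a first component named "file:".
is_absolute = posixpath.isabs(text)
as_path = posixpath.normpath(text)

# Read as a URL: scheme "file", absolute path "/foo.bar".
as_url = urlsplit(text)
```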

--
Greg