On Thu, 11 Dec 2008, Ulrich Eckhardt wrote:

On Thursday 11 December 2008, Steve Holden wrote:
Ulrich Eckhardt wrote:
Seems to me this just threatens to add to the confusion.

If you know what your filesystem produces, you can take the appropriate
action to convert it into a type that makes sense to the user. If you
don't, then at least if you have the string in its bytes form you can
                                      ^^^^^^^^^^^^^^^^^^^

There are operating systems that don't use bytes to represent a file path,
namely all the MS Windows variants. Even worse, when you use a byte string
there, it typically means that you want to use the obsolete encoding that is
based on codepages.

Why can we not preserve the representation of a path as it is? Why do we
_have_ to convert it to anything at all, without even knowing if this
conversion is needed? I just want to do something to a file's content, why
does its path have to be converted to something and then be converted back in
order for the system to digest it?

re-present it to the filesystem to manipulate the file. What are we
supposed to do with the "special type"?

You receive from readdir() and pass it to stat(), simple as that. No
conversions from the native representation needed. If you need a textual
representation, then you have to convert it and you have to do so explicitly
according to whatever logic your application requires.
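
As a rough illustration of the readdir()-to-stat() flow described above, here is a minimal sketch in Python 3 terms. It assumes a POSIX-style system where the native path form is bytes, and relies on os.listdir() returning bytes names when given a bytes argument. The helper names sizes_by_raw_name and display_name are made up for this example; the only conversion to text is the explicit one in display_name.

```python
import os
import tempfile


def sizes_by_raw_name(directory: bytes) -> dict:
    """Map each raw entry name (bytes) to its size without decoding it."""
    sizes = {}
    for name in os.listdir(directory):  # bytes argument -> bytes names
        # The name goes straight back to the OS in the form it arrived in.
        sizes[name] = os.stat(os.path.join(directory, name)).st_size
    return sizes


def display_name(raw: bytes, encoding: str = "utf-8") -> str:
    """Explicit, lossy conversion for showing a name to a user (hypothetical helper)."""
    return raw.decode(encoding, errors="replace")


if __name__ == "__main__":
    scratch = os.fsencode(tempfile.mkdtemp())
    with open(os.path.join(scratch, b"report.txt"), "wb") as handle:
        handle.write(b"hello")
    for raw, size in sizes_by_raw_name(scratch).items():
        print(display_name(raw), size)
```

On Windows the native form is not bytes, so there the str-based calls would be the closer analogue of "no conversion".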

Not only would this address the issue with the local filesystem, it would also provide a principled way to deal with remote filesystems. For example, an FTP interface library for Python could use this type to return paths of the sort actually supported by the raw FTP protocol.

Thinking of "the" filesystem is actually a misconception; always referring to "a" filesystem opens up all sorts of possibilities. Allowing this takes a lot of coding, but letting programs work with paths and files in the local filesystem, remote filesystems, and filesystems constructed from others (e.g., by expanding symlinks, changing the root as chroot does, or encoding/decoding pathnames) would be worth it, not least for better test environments.
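
To make the "a filesystem" idea concrete, here is a small sketch of what such an interface might look like. Everything in it is hypothetical (the Filesystem protocol, MemoryFilesystem, RootedFilesystem and their methods are invented for illustration, not an existing or proposed API). It shows an in-memory filesystem suitable for tests and a wrapper that re-roots another filesystem, roughly in the spirit of chroot.

```python
from typing import Dict, List, Protocol


class Filesystem(Protocol):
    """Hypothetical minimal interface; paths are raw bytes throughout."""

    def listdir(self, path: bytes) -> List[bytes]: ...
    def read(self, path: bytes) -> bytes: ...


class MemoryFilesystem:
    """In-memory filesystem for tests: a mapping of absolute path to content."""

    def __init__(self, files: Dict[bytes, bytes]) -> None:
        self._files = dict(files)

    def listdir(self, path: bytes) -> List[bytes]:
        prefix = path.rstrip(b"/") + b"/"
        names = {
            p[len(prefix):].split(b"/", 1)[0]
            for p in self._files
            if p.startswith(prefix)
        }
        return sorted(names)

    def read(self, path: bytes) -> bytes:
        return self._files[path]


class RootedFilesystem:
    """Present a subtree of another filesystem as if it were the root.

    Illustration only: ".." components are not handled, so this is not
    a security boundary.
    """

    def __init__(self, inner: Filesystem, root: bytes) -> None:
        self._inner = inner
        self._root = root.rstrip(b"/")

    def _translate(self, path: bytes) -> bytes:
        return self._root + b"/" + path.lstrip(b"/")

    def listdir(self, path: bytes) -> List[bytes]:
        return self._inner.listdir(self._translate(path))

    def read(self, path: bytes) -> bytes:
        return self._inner.read(self._translate(path))
```

Code written against the interface would not need to know whether it is talking to a local disk, an FTP server, or a fixture built in a unit test.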

This is an interesting case of separating byte strings from character strings. As long as the two are conflated, everything appears simple. But when they are separated, not only are there two types where before there was only one, it turns out that which type is correct in some circumstances depends on the platform. Also, many objects which are byte strings at the protocol level are usually or always meant to be character strings of some sort, but how to translate them simply cannot be nailed down once and for all.
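
A short sketch of why the translation cannot be fixed once and for all: the same bytes yield different text, or none at all, depending on which encoding is assumed. The sample name is made up. The last lines use Python 3's "surrogateescape" error handler, which is one way (an assumption about what an application might choose, not the only option) to carry undecodable bytes through a str and get them back unchanged.

```python
raw = b"caf\xe9"  # hypothetical file name as received from some system

# Two single-byte encodings both "succeed" but disagree about the text.
as_latin1 = raw.decode("latin-1")
as_cp1251 = raw.decode("cp1251")

# UTF-8 rejects the same bytes outright.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# surrogateescape smuggles the bad byte through a str and back, losslessly.
smuggled = raw.decode("utf-8", errors="surrogateescape")
round_tripped = smuggled.encode("utf-8", errors="surrogateescape")
```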

Isaac Morland                   CSCF Web Guru
DC 2554C, x36650                WWW Software Specialist