Talin wrote:
> 1) Does os.path need to be refactored at all?

Yes. Functions are scattered arbitrarily across six modules: os, os.path, shutil, stat, glob, fnmatch. You have to search through five scattered doc pages in the Python library to find your function, plus the os module doc is split into five sections. You may think 'shutil' has to do with shells, not paths. shutil.copy2 is ridiculously named: what's so "2" about it? Why is 'split' in os.path but 'stat' and 'mkdir' and 'remove' are in os? Don't they all operate on paths?

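To make the scattering concrete, here is a minimal sketch of one small, ordinary task that ends up touching five of the six modules. It is written for present-day Python 3, and the helper name backup_python_files is made up for illustration; it is not part of any proposal.

```python
import fnmatch
import os
import shutil
import stat


def backup_python_files(src_dir, dst_dir):
    """Copy every regular *.py file in src_dir into dst_dir, keeping timestamps.

    Hypothetical helper; returns the sorted list of file names copied.
    """
    copied = []
    for name in sorted(os.listdir(src_dir)):          # os
        if not fnmatch.fnmatch(name, "*.py"):         # fnmatch
            continue
        src = os.path.join(src_dir, name)             # os.path
        if not stat.S_ISREG(os.stat(src).st_mode):    # stat (and os again)
            continue                                  # skip directories, sockets, ...
        shutil.copy2(src, dst_dir)                    # shutil
        copied.append(name)
    return copied
```

Nothing here is difficult, but each line sends you to a different doc page.
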
The lack of method chaining means you have to use nested functions, which must be read "inside out" rather than left-to-right like paths normally go. Say you want to add the absolute path of "../../lib" to the Python path in a platform-independent manner, relative to an absolute path (__file__):

    # Assume __file__ is "/toplevel/app1/bin/main_program.py".
    # Result is "/toplevel/app1/lib".
    p = os.path.join(os.path.dirname(os.path.dirname(__file__)), "lib")

PEP 355 proposes a much easier-to-read form:

    # The Path object emulates "/toplevel/app1/bin/main_program.py".
    p = Path(__file__).parent.parent.join("lib")

Noam Raphael's directory-component object would make this even more straightforward:

    # The Path object emulates ("/", "toplevel", "app1", "bin", "main_program.py")
    p = Path(__file__)[:-2] + "lib"

Stat handling has grown cruft over the years. To check the modify time of a file:

    os.path.getmtime("/FOO")
    os.stat("/FOO").st_mtime
    os.stat("/FOO")[stat.ST_MTIME]    # List subscript, deprecated usage.

If you want to check whether a file is a type for which there is no os.path.is*() method:

    stat.S_ISSOCK( os.stat("/FOO").st_mode )    # Is the file a socket?

Compare to the directory-component proposal:

    Path("/FOO").stat().issock

os.path functions are too low-level. Say you want to recursively delete a path you're about to overwrite, no matter whether it exists or is a file or directory. You can't do it in one line of code, darn, you gotta write this function or inline the code everywhere you use it:

    import os, shutil

    def purge(p):
        if os.path.isdir(p):
            shutil.rmtree(p)    # Raises error if nonexistent or not a directory.
        elif os.path.exists(p):
            # isfile follows symlinks and returns False for special files, so it's
            # not a reliable guide of whether we can call os.remove.
            os.remove(p)        # Raises error if nonexistent or a directory.

> 2) is there anything that the existing os.path *won't do* that we desperately
> need it to do?

For filesystem files, no.
Though you really mean all six modules above and not just os.path. It has been proposed to support non-filesystem directories (zip files, CVS/Subversion sandboxes, URLs, FTP objects) under a new Path API.

> 3) Assuming that the answer to #1 is "yes", the next question is:
> "evolution or revolution?"

Revolution. It needs a clean new API. However, this can live alongside the existing functions if necessary: posixpath.PosixPath, path.Path, etc.

> 4) My third question is: Who are we going to steal our ideas from? Boost,
> Java, C# and others - all are worthy of being the, ahem, target of our
> inspiration. Or we have some alternative that's so cool that it makes
> sense to "Think Different(ly)"?

Java is the only one I'm familiar with. The existing Python proposals are summarized below.

> 5) Must there be one ring to rule them all? I suggested earlier that we
> might have a "low-level" and a "high-level" API, one built on top of the
> other. Is this a good idea, or a non-starter?

It's worth discussing. One question is whether the dichotomy does anything useful or just adds unnecessary complexity. But that can only be answered for a specific API proposal. Whatever we do will be "low-level" compared to third-party extensions that will be built on top of it, so we should plan for extensibility.

* * * *

Here's a summary of the existing Python proposals in chronological order.

Jason Orendorff's path.py
http://www.jorendorff.com/articles/python/path/src/path.py

This provides an OO path object encompassing the six modules above. The object is a string subclass.

PEP 355
http://www.python.org/dev/peps/pep-0355/

An update to Orendorff's code. This was rejected by the BDFL because: (A) we haven't proven an OO interface is superior to os.path et al, (B) it mixes abstract path operations and filesystem-dependent operations, and (C) it's not radical enough for many OO proponents.
However, it does separate filesystem-dependent operations to an extent: they must be methods, while abstract operations may be attributes. Orendorff's module does not separate these at all.

Noam Raphael's directory-based class
Introduction: http://wiki.python.org/moin/AlternativePathClass
Feature discussion: http://wiki.python.org/moin/AlternativePathDiscussion
Reference implementation: http://wiki.python.org/moin/AlternativePathModule

Noam's class emulates a sequence of components (a la os.path.split) rather than a string. This expresses slicing and joining by Python's [] and + operators, eliminating several named methods. Each component emulates a unicode string with methods to extract the name and extension(s). The first component of absolute paths is a "root object" representing the Posix root ("/"), a Windows drive-absolute ("C:\"), a Windows drive-relative ("C:"), a network share, etc.

This proposal has a reference implementation but a PEP has not been written, and the discussion page has many feature alternatives that aren't coded. A robust reference implementation would have methods for all alternatives so they can be compared in real programs.

There have been a few additional ideas on python-dev but no code. Nick Coghlan suggested separate classes for (A) string manipulation, (B) abstract path operations, (C) read-only inspection of filesystem, (D) add/remove files/directories/links. Someone else suggested four classes for (A) abstract path operations, (B) file, (C) directory, (D) symlink. Talin proposed an elaborate set of classes and functions to do this. I pointed out that each class would need 3+ versions to accommodate platform differences, so somebody wanting to make a generic subclass would potentially have to make 12 versions to accommodate all the superclass possibilities (4 path classes * 3 platforms). Functions would cut down the need for multiple classes and duplicated methods between them, but functions would make "subclassing Path" more difficult.

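As a rough illustration of the directory-component idea in Noam Raphael's proposal described above, here is a minimal Posix-only sketch in present-day Python 3. The class name ComponentPath and everything about its behavior are my own assumptions for illustration; this is not Noam's reference implementation, and it ignores root objects for Windows drives and network shares entirely.

```python
import posixpath


class ComponentPath(tuple):
    """Hypothetical Posix-only path modelled as a tuple of components."""

    def __new__(cls, path=()):
        # Accept either a string such as "/a/b" or an iterable of components.
        if isinstance(path, str):
            is_abs = path.startswith("/")
            parts = [part for part in path.split("/") if part]
            path = (["/"] if is_abs else []) + parts
        return super().__new__(cls, path)

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # A slice stays a path; a single index yields one component string.
        return ComponentPath(result) if isinstance(index, slice) else result

    def __add__(self, other):
        # Joining is spelled "+"; a plain string is split into components first.
        if isinstance(other, str):
            other = ComponentPath(other)
        return ComponentPath(tuple(self) + tuple(other))

    def __str__(self):
        # The empty path is rendered as the current directory.
        return posixpath.join(*self) if self else "."


# The "../../lib" example from earlier in this message:
p = ComponentPath("/toplevel/app1/bin/main_program.py")
lib = p[:-2] + "lib"
print(str(lib))
```

Even this toy version shows the appeal: slicing and joining need no named methods at all. It also shows what is left to argue about, since the root component, extensions and every filesystem operation are missing here.
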
I would like to see one or more implementations tested and widely used as soon as possible, so that we'd have confidence using them in our programs until a general Python solution emerges. But first we need to see if we can achieve a common API, as Talin said when starting this thread.

--
Mike Orr <[EMAIL PROTECTED]>