On 2009-Aug-18, at 7:20 am, Timothy S. Nelson wrote:
On Tue, 18 Aug 2009, David Green wrote:
Some ways in which different paths can be considered equivalent: Spelling: ... Simplification: ... Resolution: ... Content-wise: ...
Ok, my next commit will have "canonpath" (stolen directly from p5's File::Spec documentation), which will do "No physical check on the filesystem, but a logical cleanup of a path", and "realpath" (idea taken from p5's Cwd documentation), which will resolve symlinks, etc, and provide an absolute path. Oh, and "resolvepath", which does both. I'm not quite sure I followed all your discussion above -- have I left something out?
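
To make the "logical cleanup" versus "physical resolution" distinction concrete, here is a rough analogy in Python rather than Perl 6, since the Perl 6 names above are still being settled. It assumes that `posixpath.normpath` is a fair stand-in for the proposed "canonpath" and `os.path.realpath` for "realpath". The file names are invented for illustration, and the sketch assumes a platform where creating symlinks is permitted.

```python
import os
import posixpath
import tempfile

# Logical cleanup only: no filesystem access, so the path need not exist.
cleaned = posixpath.normpath("/foo/./bar/../baz//qux")

# Physical resolution: create a real symlink and ask the OS to resolve it.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")   # hypothetical file name
    link = os.path.join(tmp, "link.txt")       # hypothetical symlink name
    with open(target, "w") as fh:
        fh.write("hello\n")
    os.symlink(target, link)

    # After resolving symlinks, both names lead to the same absolute path.
    same_after_resolution = os.path.realpath(link) == os.path.realpath(target)

    # Logical cleanup never looks at the disk, so the two names stay distinct.
    same_after_cleanup = os.path.normpath(link) == os.path.normpath(target)
```

The point of the sketch is only that the two operations answer different questions, so a method that "does both" has to choose an order in which to apply them.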

I think there's a difference between "canonical" as in a webpage with <link rel="canonical">, and "cleanup" as in Windows turning PROGRA~1 into "Program Files". There could also be other types of normalisation depending on the FS, but we probably shouldn't concern ourselves with them, other than having some way to get to such native calls.

Anyway, my assumption is that there should be a number of comparison options. Since we do Str, we should get string comparison for free. I'm expecting other options at other levels, but have no idea how or what at this point.

As Leon Timmermans keeps reminding us, that really should be delegated to the OS/FS. I think $file1 =:= $file2 should ask the OS whether it thinks those are the same item or not (it can check paths, it can check inodes, whatever is its official way to compare file-thingies). Similarly, $file1.name === $file2.name should ask the OS whether it thinks those names mean the same thing. And if you want to compare the canonical paths or anything else, just say $file1.name.canonical === $file2.name.canonical, or use 'eq', or whatever you want to do, just do it explicitly.
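
As an illustration of "ask the OS", here is a small Python sketch, not the proposed Perl 6 semantics. It assumes `os.path.samefile` is a reasonable analogue of what `=:=` would do (on POSIX it compares device and inode numbers), and it uses a hard link so that two different names really do refer to one item. The helper name `same_item` and the file names are made up.

```python
import os
import tempfile

def same_item(path_a, path_b):
    """Ask the OS whether two paths name the same underlying file."""
    return os.path.samefile(path_a, path_b)

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "data.txt")   # hypothetical file name
    alias = os.path.join(tmp, "alias.txt")     # hypothetical second name
    with open(original, "w") as fh:
        fh.write("x")
    os.link(original, alias)   # hard link: a second name for the same inode

    equal_as_strings = original == alias         # explicit string comparison
    equal_per_os = same_item(original, alias)    # the OS's own answer
```

Here the string comparison says the two differ while the OS says they are the same item, which is why the choice of comparison should be explicit.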

According to my last commit, p{} will return a Path object that just stores the path, but has methods attached for accessing all the metadata. But it doesn't do file opening or things like that (unless you use the :T and :B thingies, which read the first block and try to guess whether it's text or binary -- these are in Perl 5 too).

There are two things going on here. The first is the user-friendly syntax for casual use, which we basically agree should be something short and pithy, although we have but begun to shed this bike, I'm sure.

    $file = io "/foo/bar";
    $file = p{/foo/bar};
    $file = Q:p/foo/bar/;
    $file = File("/foo/bar");

The second is what that syntax gives us: however we end up spelling it, we want unified access to the separate inside parts:

    IO::Data            # contents of file
    IO::Handle          # filehandle for using manually
    IO::Metadata
    IO::Path

I'm not sure why Path isn't actually just part of IO::Metadata... maybe it's just handy to have it out on its own because pathnames are so prominent. In any case, $file.size would just be shorthand for something like $file.io.metadata{size}. The :T and :B tests probably ought to be part of IO::Data, since they require opening the file to look at it; I'd rather put them there (vs. ::Metadata, which is all "outside" info) since plain ol' $file abstracts over that detail anyway. You can say $file.r, $file.x, $file.T, $file.B, and not care where those tests live under the hood.
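
Roughly what I mean by "abstracts over that detail", sketched in Python since none of the Perl 6 classes exist yet. Every class and method name here (`Metadata`, `Data`, `File`, `looks_like_text`) is hypothetical, and the text test is a crude null-byte check that I am assuming for illustration, not Perl 5's actual -T heuristic.

```python
import os
import tempfile

class Metadata:
    """'Outside' information, obtained without opening the file."""
    def __init__(self, path):
        self._path = path

    def __getitem__(self, key):
        if key == "size":
            return os.stat(self._path).st_size
        if key == "readable":
            return os.access(self._path, os.R_OK)
        raise KeyError(key)

class Data:
    """'Inside' information; answering requires opening the file."""
    def __init__(self, path):
        self._path = path

    def looks_like_text(self, block_size=512):
        # Read only the first block and guess (assumed heuristic: no NUL bytes).
        with open(self._path, "rb") as fh:
            block = fh.read(block_size)
        return b"\0" not in block

class File:
    """The casual-use object: short names that delegate to the parts."""
    def __init__(self, path):
        self.path = path
        self.metadata = Metadata(path)
        self.data = Data(path)

    @property
    def size(self):          # shorthand for metadata["size"]
        return self.metadata["size"]

    @property
    def T(self):             # lives in Data, but the caller need not care
        return self.data.looks_like_text()

with tempfile.TemporaryDirectory() as tmp:
    name = os.path.join(tmp, "note.txt")   # hypothetical file name
    with open(name, "wb") as fh:
        fh.write(b"hello")
    f = File(name)
    size_via_shorthand = f.size
    size_via_metadata = f.metadata["size"]
    is_text = f.T
```

The caller only ever writes `f.size` or `f.T`; which part answers is an implementation detail that can move later without breaking anyone.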

We might actually want to distinguish IO::Metadata::Stat from IO::Metadata::Xattr or something... but that's probably too FS-specific. I don't think I mind much whether it's IO::Path or IO::Metadata::Path, or whether they both exist as synonyms....

I think we want many of the same things, I'm just expressing them slightly differently. Let's keep working on this, and hopefully we end up with something great.

Yes.  A great mess!  Er, wait, no........

And there's no perfect solution, but it would be useful for Perl to stick as closely to the FS/OS's idea of types as it can. Sometimes that would mean looking up an extension; it might mean using (or emulating) "file" magic; it might mean querying the FS for a MIME-type or a UTI. After all, the filename extension may not actually match the correct type of the file.

My suggestion would be that it's an interesting idea, but should maybe be left to a module, since it's not a small problem. Of course, I'm happy to be overruled by a higher power :). I'd like the feature, I'm just unsure it deserves core status.

Well, it's all modules anyway... certainly we'll have to rely on IO::Filesystem::XXX, but I do think this is another area to defer to the OS's own type-determining functions rather than try to do it all internally. What we should have, though, is a standard way to represent the types in Perl so that users know how to deal with them. I think roles are the obvious choice: if the OS tells you that a file is HTML, then $file would do IO::Datatype::HTML, which means in turn it would also do IO::Datatype::Plaintext, and so on.

Of course, if the OS tells you you've got a file that does IO::Datatype::Illudium-phosdex, and you want to *do* something with it, you'll need a module that knows what to do with that kind of file. Perl by itself knows only how to treat it as a string of raw bytes. Well, or as plain text. So you can treat your HTML file as plain text, or you can use HTML::Doc::Tree and treat it as something fancier.
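
A sketch of the roles idea, using Python class inheritance as a stand-in for Perl 6 roles. The class names, the `TYPE_TABLE` mapping and `datatype_for` are all invented for illustration. The type lookup uses Python's `mimetypes.guess_type`, which only looks at the extension; that is an assumption standing in for whatever the OS's own type query would be.

```python
import mimetypes

class RawBytes:
    """What can be done with any file: treat it as a string of raw bytes."""
    def __init__(self, raw):
        self.raw = raw

class Plaintext(RawBytes):
    """Adds the ability to read the bytes as text."""
    def text(self):
        return self.raw.decode("utf-8", errors="replace")

class HTML(Plaintext):
    """Anything that does HTML also does Plaintext."""

# Hypothetical mapping from an OS-reported type to a datatype class.
TYPE_TABLE = {"text/html": HTML, "text/plain": Plaintext}

def datatype_for(filename):
    """Pick the richest known datatype, falling back to raw bytes."""
    mime, _encoding = mimetypes.guess_type(filename)
    return TYPE_TABLE.get(mime, RawBytes)

doc = datatype_for("index.html")(b"<p>hi</p>")
blob = datatype_for("mystery.illudium-phosdex")(b"\x00\x01")
```

`doc` can be handled as HTML or as plain text, while `blob` is of an unknown type and so only offers its raw bytes until some module registers a class for it.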


-David
