On Tue, 18 Aug 2009, David Green wrote:
On 2009-Aug-17, at 8:36 am, Jon Lang wrote:
Timothy S. Nelson wrote:
Well, my main thought in this context is that the stuff that can be
done to the inside of a file can also be done to other streams -- TCP
sockets for example (I know, there are differences, but the two are much
the same), whereas metadata makes less sense in the context of TCP
sockets;
But any IO object might have metadata; some different from the metadata you
traditionally get with files, and some the same, e.g. $io.size,
$io.times{modified}, $io.charset, $io.type.
Ok, now you're giving me ideas :).
[snipped a bit and moved it further down the e-mail]
I guess what I'm saying here is that I think we can do the things
without people having to worry about the objects being separate unless
they care. So, separate objects, but hide it as much as possible. Is that
something you're fine with?
Yes -- to me that means some class/role that wraps up all the pieces
together, but all the separate components are still there underneath. But
I'm not too bothered about how it's implemented as long as it's transparent
for casual use.
    my $file = io p[/some/file];
    my $contents = $file.data;
    my $mod-date = $file.times{modified};
    my $size = $file.size;
That sounds like the kind of thing I'm heading for.
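As a rough illustration of the "one wrapper, separate pieces underneath" idea, here is a minimal sketch in Python rather than Perl 6, since the Perl 6 API is still being designed in this thread. The class name File and its attributes data, size and times are hypothetical, chosen only to mirror the example above; nothing here is the proposed S32 interface.

```python
from datetime import datetime
from pathlib import Path


class File:
    """Hypothetical wrapper exposing name, data and metadata through one object."""

    def __init__(self, path):
        # Only the name is stored here; the filesystem is not touched yet.
        self.path = Path(path)

    @property
    def data(self):
        # Contents are read only when somebody asks for them.
        return self.path.read_text()

    @property
    def size(self):
        # Metadata comes from a separate stat() call on the same name.
        return self.path.stat().st_size

    @property
    def times(self):
        st = self.path.stat()
        return {
            "modified": datetime.fromtimestamp(st.st_mtime),
            "accessed": datetime.fromtimestamp(st.st_atime),
        }
```

Casual code only sees one object, while the path, the contents and the stat data remain separate pieces underneath that can be used on their own.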
Pathnames still are strings, so that's fine. In fact, there are
different
As for pathnames being strings, you may be right FSVO string. But
I'd say that, while they may be strings, they're not Str, but they do Str
Agreed, pathnames are "almost" strings, but worth distinguishing
conceptually. There should be a URL type that does Str.
Actually, there are other differences, like case-insensitivity and illegal
chars. Unfortunately, those depend on the given filesystem. As long as
you're dealing with one FS at a time, that's OK; it probably means we have
IO::Name::ext3, IO::Name::NTFS, IO::Name::HFS, etc. But what happens when
you cross FS-barriers? Does a case-sensitive name match a case-insensitive
one? Is filename-equality not commutative or not transitive? If you're
looking for a filename "foo" on Mac/Win, then a file actually called "FOO"
matches; but on Unix it wouldn't.
(Actually, Macs can do both IO::Name::HFS::case-insensitive and
IO::Name::HFS::case-sensitive. Eek.)
I think it should depend on the set of constraints involved.
I'd like Perl 6's treatment of filenames to be smart enough that
smart-matching any of these pairs of "alternative spellings" would result
in a successful match. So while I'll agree that filenames are string-like,
I really don't want them to _be_ strings.
Well, the *files* are the same, but the pathnames are different. I'm not
sure whether some differences in "spelling" should be ignored by default or
not. There are actually several different kinds; S32 has a method
"realpath", but I think "canonical" is a better name, because aliases can be
just as "real" as the canonical path, e.g. a web page with multiple
addresses. Or hard links rather than soft links -- though in that case,
there is no one "canonical" path. It may not even be possible to easily tell
if there is one or not.
Some ways in which different paths can be considered equivalent:
    Spelling: C:\PROGRA~1, case-insensitivity
    Simplification: foo/../bar/ to bar/
    Resolution: of symlinks/shortcuts
    Content-wise: hard links/multiple addresses
Depending on the circumstances, you might want any of those to count as the
"same" file; or none of them. We'll need methods for each sort of
transformation, $path.canonical, $path.normalize, $path.simplify, etc. Two
high-level IO objects are "the same", regardless of path, if $file1 =:=
$file2 (which might compare inodes, etc.). There should be a way to set what
level of sameness applies in a given lexical scope; perhaps the first two
listed above are a reasonable default to start with.
Ok, my next commit will have "canonpath" (stolen directly from p5's
File::Spec documentation), which will do "No physical check on the filesystem,
but a logical cleanup of a path", and "realpath" (idea taken from p5's Cwd
documentation), which will resolve symlinks, etc, and provide an absolute
path. Oh, and "resolvepath", which does both. I'm not quite sure I followed
all your discussion above -- have I left something out?
Anyway, my assumption is that there should be a number of comparison
options. Since we do Str, we should get string comparison for free. I'm
expecting other options at other levels too, but have no idea how or what
at this point.
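One way to picture "a number of comparison options" is a single comparison routine that takes the desired level of sameness as a parameter. The following is a sketch in Python, not Perl 6; the function names and level names are invented for illustration, and the mapping onto os.path functions is an assumption about what each level would mean. It does not handle Windows 8.3 short names such as C:\PROGRA~1.

```python
import os


def normalise(path, level):
    """Rewrite a pathname according to a hypothetical level of sameness."""
    if level == "string":
        return path                     # plain string comparison
    if level == "spelling":
        # Case folding happens only on Windows; this is a no-op on POSIX.
        return os.path.normcase(path)
    if level == "simplify":
        # Purely lexical cleanup, e.g. foo/../bar/ becomes bar.
        return os.path.normpath(os.path.normcase(path))
    if level == "resolve":
        # Consults the filesystem: symlinks resolved, path made absolute.
        return os.path.normcase(os.path.realpath(path))
    raise ValueError("unknown level: " + level)


def same_path(a, b, level="simplify"):
    """Report whether two pathnames count as the same at the given level."""
    if level == "content":
        # Compares device and inode, so hard links match. Both files must exist.
        return os.path.samefile(a, b)
    return normalise(a, level) == normalise(b, level)
```

With this shape, string comparison stays the cheapest option, and each further level costs more (up to filesystem access) in exchange for treating more spellings as equal.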
There's something that slightly jars me here... I don't like the
quotation returning an IO object.
But doesn't normal quoting return a Str object? And regex quoting return
an object (Regex? Match? Something, anyway).
Certainly, but a regex doesn't produce a Signature object, say. I don't
object to objects, just to creating objects, then doing something with them,
then returning another kind of object, and calling that "parsing". If we're
parsing the characters, we should end up with an IO::Name. If we end up with
an IO::actual-file/stream-whatever, then we should call it something else
(like an "io constructor").
According to my last commit, p{} will return a Path object that just
stores the path, but has methods attached for accessing all the metadata. But
it doesn't do file opening or things like that (unless you use the :T and :B
thingies, which read the first block and try to guess whether it's text or
binary -- these are in Perl 5 too).
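For readers unfamiliar with the text/binary guess, here is a rough sketch in Python of the general idea: read only the first block and classify it. It is loosely modelled on the behaviour described for Perl 5's -T and -B tests, but the function name, the block size, the one-third threshold as applied here, and the decision not to count high-bit bytes as odd are all assumptions of this example, not a port of Perl's code.

```python
# Control bytes that commonly appear in text: tab, newline, return, etc.
TEXT_CONTROLS = set(b"\t\n\r\f\b\x1b")


def looks_like_text(path, block_size=512):
    """Guess whether a file is text by inspecting only its first block."""
    with open(path, "rb") as fh:
        block = fh.read(block_size)
    if not block:
        return True                      # assumption: an empty file counts as text
    if b"\0" in block:
        return False                     # a zero byte is treated as binary
    # Count control bytes (and DEL) that are not ordinary whitespace.
    odd = sum(1 for byte in block
              if (byte < 32 and byte not in TEXT_CONTROLS) or byte == 127)
    return odd / len(block) <= 1 / 3
```

Because only one block is read, the guess is cheap but can be wrong for files whose first block is unrepresentative.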
[This bit was further up the e-mail, but I moved it here]
if (path{/path/to/file}.e) {
    @lines = slurp(path{/path/to/file});
}
(I'm using one of David's suggested syntaxes above, but I'm not
closely attached to it).
I suggested variations along the lines of: io "/path/to/file". It amounts to
much the same thing, but it's important conceptually to distinguish a
pathname from the thing it names. (A path doesn't have a modification date,
a file does.) Also, special quoting/escaping could apply to other things,
not limited to "filenames". That said, I don't think it's unreasonable to
want to combine both operations for brevity, but the io-constructor should
have built-in path parsing, not the other way around.
Did my answer above answer the concerns here?
The difference in our approaches is that you seem keen to integrate
closely the data and the metadata, whereas I'm trying to integrate the
paths and the metadata.
Well, paths are just metadata too, although typically the most important
kind. (You could even have an IO without a path or name.) I want a view
that integrates all of them, because that's how people ordinarily think about
files, unless they have a specific reason not to.
I think we want many of the same things, I'm just expressing them
slightly differently. Let's keep working on this, and hopefully we end up
with something great.
I was wanting to replace the "glob" language with something more like
XPath, but that idea was vetoed by people who didn't want Tree-related
objects to be part of the core, so I'm doing that as a library.
I'm all for some tree-related fun(ctions). A tree is basically a hash of
hashes, so I'm surprised we don't have a few functions for traversing them
and other very basic hashy concepts. But I would like to see XPath-type
stuff hashed out [pun intended] anyway -- whether it ends up in a third-party
module or not isn't such a big deal when it comes to P6, and somebody will
have to figure out how to do it in a perlish way eventually.
That would be me. I have some code, but I'm waiting on improvements
in Rakudo (and btw, thanks to the Rakudo guys for doing a wonderful job).
if $file.type ~~ MIME("text/plain") {...}
Cool idea. How would the type be determined? Are you thinking of
the algorithms in the unix "file" utility? Please tell me you're not
planning to use filename extensions -- that's bad :).
Wouldn't $file.type be metadata?
Yes; and yes, filename extensions are evil, but of course thanks to primitive
filesystems, we're stuck with them to a large extent. And there's no perfect
solution, but it would be useful for Perl to stick as closely to the FS/OS's
idea of types as it can. Sometimes that would mean looking up an extension;
it might mean using (or emulating) "file" magic; it might mean querying the
FS for a MIME-type or a UTI. After all, the filename extension may not
actually match the correct type of the file.
My suggestion would be that it's an interesting idea, but should maybe
be left to a module, since it's not a small problem. Of course, I'm happy to
be overruled by a higher power :). I'd like the feature, I'm just unsure it
deserves core status.
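To show why this is "not a small problem" even in outline, here is a small Python sketch of a layered type guess: content signatures first, then the filename extension, then a generic fallback. The function name sniff_type, the tiny signature table and the order of the checks are assumptions made for the example. It leaves out the third option mentioned above, querying the filesystem for a stored MIME type or UTI.

```python
import mimetypes

# A deliberately tiny, illustrative table of leading-byte signatures.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_type(path):
    """Guess a MIME type: content signature, then extension, then a default."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    for signature, mime in MAGIC:
        if head.startswith(signature):
            return mime                  # the content wins over the name
    mime, _encoding = mimetypes.guess_type(str(path))
    return mime or "application/octet-stream"
```

Checking content before the extension means a PNG saved as picture.txt is still reported as image/png, which is the mismatch case raised above.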
Anyway, HTH,
---------------------------------------------------------------------
| Name: Tim Nelson | Because the Creator is, |
| E-mail: wayl...@wayland.id.au | I am |
---------------------------------------------------------------------