[EMAIL PROTECTED] (Nathan Wiger) wrote on 12.08.00 in <[EMAIL PROTECTED]>:
> Here's a step-by-step process for supporting both that I propose.
> Suggestions welcome:
>
> 1. If called as an indirect object function (in the form currently
> proposed), steps 2-3 are skipped and the handler specified is
> automatically invoked.
>
> 2. The top-level open() looks to see the file is a valid URI, namely:
>
> [method] :// [resource]
>
> If so, method becomes the handler, the :// is stripped, and the rest
> is passed to the handler. If no handler by that name has been
> registered, undef is returned.
Bad. Not all URIs follow that syntax. (Think mailto: or news:, for
example.) There are also scheme-less relative URIs, but let's not go there.
Of course, if you just look at the ":", you get into trouble with every
filename that can contain a ":".
So I think a better way would be:
2. The top-level open() checks whether the filename looks like a URI, namely:
     ($scheme, $scheme_specific_part) = /^([a-z][a-z0-9.+-]*):(.*)$/i
   and whether there is a registered handler for (lc $scheme).
   If so, (lc $scheme) becomes the handler, and $scheme_specific_part
   is passed to the handler.
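To make the revised step 2 concrete, here is a rough sketch. The %handlers registry, its contents, and the name find_handler are made up for illustration; how handlers actually get registered is not decided anywhere in this thread.

```perl
use strict;
use warnings;

# Hypothetical registry: lowercased scheme name => handler.
# Plain strings stand in for real handler objects here.
my %handlers = (
    http   => 'http handler',
    mailto => 'mailto handler',
);

# Return (handler, scheme-specific part) if $name looks like a URI *and*
# a handler is registered for its scheme; otherwise return an empty list,
# so the caller can fall back to treating $name as a plain filename.
sub find_handler {
    my ($name) = @_;
    my ($scheme, $scheme_specific_part) =
        $name =~ /^([a-z][a-z0-9.+-]*):(.*)$/i
        or return;
    my $handler = $handlers{ lc($scheme) }
        or return;
    return ($handler, $scheme_specific_part);
}

my @http   = find_handler('HTTP://www.example.com/'); # ('http handler', '//www.example.com/')
my @mailto = find_handler('mailto:kai@example.com');  # ('mailto handler', 'kai@example.com')
my @drive  = find_handler('C:\\autoexec.bat');        # (): scheme-like, but no 'c' handler
my @plain  = find_handler('notes.txt');               # (): no scheme at all
```

Note that the "//" is not stripped here; what to do with the scheme-specific part is left entirely to the handler.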
For details on URI syntax, see (IETF) RFC 2396, and yes, case *is*
insignificant in URI schemes. Note that drive letters (and many MacOS
volume names) are followed by ":" and thus *do* look like URI schemes;
this is why the above tests for a registered handler. (You can register 26
default handlers under Windows/DOS, but I trust you don't want to register
a default handler per mounted MacOS volume. And then there are filenames
like "LPT1:" ... please let's not go there.)
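The collision is easy to see by running the scheme pattern over a few plain names. The sketch below also shows one possible escape hatch for relative Unix-style paths: prefixing "./" so the name no longer starts with a letter. Both helper names are invented for illustration, and nothing in this thread has settled on that convention; it also does nothing for drive letters or device names.

```perl
use strict;
use warnings;

# True if $name starts with something that is syntactically a URI scheme.
sub looks_like_scheme {
    my ($name) = @_;
    return $name =~ /^[a-z][a-z0-9.+-]*:/i ? 1 : 0;
}

# Hypothetical helper: make a relative Unix-style path unambiguous by
# prefixing "./" whenever it would otherwise be taken for a URI.
sub as_plain_path {
    my ($path) = @_;
    return looks_like_scheme($path) ? "./$path" : $path;
}

# Drive letter, DOS device name, a directory literally named "http:",
# and an ordinary filename.
for my $name ('C:\\temp\\x.txt', 'LPT1:', 'http:/www.example.com/', 'notes.txt') {
    printf "%-26s scheme-like=%d\n", $name, looks_like_scheme($name);
}

# "./http:/www.example.com/" starts with ".", so the pattern cannot match.
my $safe = as_plain_path('http:/www.example.com/');
```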
I might note in passing that I have a web mirroring application that uses
Web URLs as-is as Unix filenames, yielding directories named "http:" and
"ftp:". I'd obviously need to explicitly disambiguate this.
There's also the current "<", "+>", etc. mode-prefix syntax. Do we want to
do something about it? What?
> 3. If no handler was found (the path was not a URI), then the default
> 'file' handler is used.
Well, maybe not. The syntax of file: URIs is not (necessarily) exactly the
same as that of native filenames, so maybe this should not be the same
handler. (Look at DOS/Windows filenames for pretty obvious examples. Then
there are % escapes.)
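The % escapes alone show the difference. Below is a minimal sketch of turning a file: URI into a Unix-style path; the function name is made up, and it deliberately ignores hosts, drive letters, and character-encoding questions.

```perl
use strict;
use warnings;

# Sketch only: extract the path from a file: URI and decode %XX escapes.
# Returns nothing if $uri is not a file: URI with an absolute path.
sub file_uri_to_path {
    my ($uri) = @_;
    # Optional "//authority" part, then an absolute path.
    my ($path) = $uri =~ m{^file:(?://[^/]*)?(/.*)$}i
        or return;
    $path =~ s/%([0-9a-f]{2})/chr(hex($1))/gei;
    return $path;
}

my $from_uri = file_uri_to_path('file:///tmp/my%20notes.txt');  # "/tmp/my notes.txt"
my $native   = '/tmp/my%20notes.txt';  # as a native name, the "%20" is literal
```

So the same string of characters names two different files depending on whether it is read as a file: URI path or as a native filename, which is the argument for keeping the two handlers separate.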
But this is only a nit.
> 4. The handler's open is called. Any special operations are done by
> the individual handler. For example, the 'file' handler might be
> equipped to auto-determine if something is a directory and, if so,
> fork the 'dir' handler. The handler's open returns a fileobject
> or undef.
I have the uncomfortable feeling that somewhere, out there, we'll hit a
filesystem that can have files and directories with exactly identical
names. Especially as Linus was just seen writing in support of that idea
on the linux-kernel mailing list.
MfG Kai