[EMAIL PROTECTED] (Nathan Wiger)  wrote on 12.08.00 in <[EMAIL PROTECTED]>:

> Here's a step-by-step process for supporting both that I propose.
> Suggestions welcome:
>
> 1. If called as an indirect object function (in the form currently
>    proposed), steps 2-3 are skipped and the handler specified is
>    automatically invoked.
>
> 2. The top-level open() looks to see the file is a valid URI, namely:
>
>         [method] :// [resource]
>
>    If so, method becomes the handler, the :// is stripped, and the rest
>    is passed to the handler. If no handler by that name has been
>    registered, undef is returned.

Bad. Not all URIs follow that syntax. (Think mailto: or news:, for
example.) There are also scheme-less relative URIs, but let's not go there.

Of course if you just look at the ":", you get in trouble with everything  
that can have a : in the filename.

So I think a better way would be:

 2. The top-level open() checks whether the file name looks like a URI,
    namely:

         ($scheme, $scheme_specific_part) = /^([a-z][a-z0-9.+-]*):(.*)$/i

    and whether there is a registered handler for (lc $scheme).

    If so, (lc $scheme) becomes the handler, and $scheme_specific_part
    is passed to the handler.

For details on URI syntax, see (IETF) RFC 2396, and yes, case *is*
insignificant in URI schemes. Note that drive letters (and many MacOS
volume names) are followed by ":" and thus *do* look like URI schemes;
this is why the above tests for a registered handler. (You can register 26
default handlers under Windows/DOS, but I trust you don't want to register
a default handler per mounted MacOS volume. And then there are filenames
like "LPT1:" ... please let's not go there.)
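
To make the proposed step 2 concrete, here is a minimal sketch of the
lookup in present-day Perl. The names %handler, register_handler and
resolve_open_target are made up for illustration, and so is the 'native'
fallback; none of them is part of any proposal. Only the regex is taken
from the text above.

```perl
use strict;
use warnings;

# Hypothetical registry: lower-cased scheme name => handler code ref.
my %handler;

sub register_handler {
    my ($scheme, $code) = @_;
    $handler{lc $scheme} = $code;
}

# Returns (handler name, argument to pass to that handler).
# A name is treated as a URI only if it matches "scheme:rest" AND a
# handler is registered for the lower-cased scheme. Everything else
# goes, unchanged, to an assumed 'native' filename handler.
sub resolve_open_target {
    my ($name) = @_;
    my ($scheme, $scheme_specific_part) =
        $name =~ /^([a-z][a-z0-9.+-]*):(.*)$/i;
    if (defined $scheme && exists $handler{lc $scheme}) {
        return (lc $scheme, $scheme_specific_part);
    }
    return ('native', $name);
}

# Placeholder handlers, just so that something is registered.
register_handler(http   => sub { "http handler got $_[0]" });
register_handler(mailto => sub { "mailto handler got $_[0]" });
```

With these two registrations, 'HTTP://example.org/' resolves to the http
handler with '//example.org/' as its argument, and
'mailto:someone@example.org' resolves to the mailto handler. A name like
'C:\AUTOEXEC.BAT' matches the regex too, but since nobody registered a
'c' handler it falls through to the native handler untouched.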

I might note in passing that I have a web mirroring application that uses  
Web URLs as-is as Unix filenames, yielding directories named "http:" and  
"ftp:". I'd obviously need to explicitly disambiguate this.

There's also the current < +> etc. syntax. Do we want to do something  
about it? What?

> 3. If no handler was found (the path was not a URI), then the default
>    'file' handler is used.

Well, maybe not. The syntax of file: URIs is not (necessarily) exactly the  
same as that of native filenames, so maybe this should not be the same  
handler. (Look at DOS/Windows filenames for pretty obvious examples. Then  
there are % escapes.)
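
To illustrate the % escape point, here is a rough sketch of what a
separate file: handler might have to do before it can hand a name to the
native layer. The helper name file_uri_part_to_path is hypothetical, and
the sketch only covers Unix-style paths with an empty or "localhost"
authority; DOS/Windows drive letters and other hosts are deliberately
left out.

```perl
use strict;
use warnings;

# Hypothetical helper: convert the scheme-specific part of a file: URI
# (everything after "file:") into a Unix-style path.
sub file_uri_part_to_path {
    my ($part) = @_;
    # Drop a leading "//" or "//localhost" when a path follows it.
    $part =~ s{^//(?:localhost)?(?=/)}{};
    # Decode %XX escapes into the characters they stand for.
    $part =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $part;
}
```

Given '///tmp/my%20file.txt' this returns '/tmp/my file.txt'. A native
filename that literally contains "%20" must not go through this
decoding, which is one reason the two handlers may need to differ.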

But this is only a nit.

> 4. The handler's open is called. Any special operations are done by
>    the individual handler. For example, the 'file' handler might be
>    equipped to auto-determine if something is a directory and, if so,
>    fork the 'dir' handler. The handler's open returns a fileobject
>    or undef.

I have the uncomfortable feeling that somewhere, out there, we'll hit a  
filesystem that can have files and directories with exactly identical  
names. Especially as Linus was just seen writing in support of that idea  
on the linux-kernel mailing list.


MfG Kai
