Filesystems and files [Was: Re: The obligation of free stuff: Google Storage]
If I might offer a late viewpoint after reading Aaron's expanded email (attached below).

When I originally suggested using 'open' instead of 'connect', the aim was to keep consistency with the paradigm of files on the local system. However, as Aaron's post suggests, when dealing with remote 'files' there is an additional layer of functionality that must be introduced, namely the need to 'connect' to the filesystem so that files on it can be opened. Thus it is wrong to conflate 'open' with 'connect'.

It is normally implied that a program already has a 'local' environment, including a 'local' filesystem. Thus the syntax

    my $fn = open('/path/to/directory/filename', :r) or die $!;

implies a local file system.

The idea of an implied local system suggests an implied local environment. The contents of %*ENV and @*INC seem to be assumed to be local, though this is not specified. Given the development of the internet, this is an assumption I think should be made explicit, as well as the mechanism for adding remote resources via paths through a network.

Would it make sense to define $*FS as the implied local file system, and thus that a bare 'open' is sugar for

    my $fh = $*FS.open('/path/to/directory/filename', :r);

This then means that there is an implicit

    $*FS.connect();

that makes the local system available to the program.

I wonder whether this would also be a way of unifying the program interface for different operating systems, in that a program running on a *nix machine would have $*FS of type IO::Filesystem::POSIX, while $*FS for a Windows machine would have type IO::Filesystem::Windows, etc.

Then it would be possible, as Aaron has suggested, to have

    my $remote-fs = IO::Filesystem::Google.connect(%args);
    my $fh = $remote-fs.open($path-or-url, :r);

and then treat $fh as specified elsewhere.
Moreover, it would then be possible to do

    $*FS = $remote-fs;

I would propose that this sort of flexibility would be useful for programs that are embedded in other virtual environments, such as browser plugins, or programs intended to run on thin clients that do not have their own filesystems.

Another possibility would be to have

    my $windows-from-linux = IO::Filesystem::Windows.connect(%args);
    my $linux-system = $*FS;
    $*FS = $windows-from-linux;

and then the files on a dual-boot system can be accessed by the program.

On 06/21/10 02:35, Aaron Sherman wrote:

First off, I again have to caution that this thread is conflating open with filesystem interaction. While open is one of many ways of interacting with a filesystem, it isn't even remotely sufficient (nor my immediate focus). One can ask for and modify filesystem metadata, security information, and so on as well as that for individual objects within the filesystem (which in the POSIX model is mostly files and directories).

In a traditional POSIX/Unix model, programs (other than key OS utilities) don't usually do much with the structure of the filesystem. That's meant as an interactive task for an admin. However, in building a cloud-storage aware VFS layer, managing the filesystem in terms of layout, allocation, security (access methods, authorization and authentication), payment models, and many other features is expected to be embodied in the access model. Just as an example, choosing and laying out what Amazon calls buckets is the equivalent of partitioning. That does need an interface.

Now, we can just translate the Python bindings for Google Storage (and I believe there are already Perl 5 bindings for Amazon S3), but my inclination is to build a generic VFS that can handle POSIX-like filesystems as well as everything else from Windows/Mac specific features to full-blown cloud storage to more user-oriented storage options (Dropbox comes to mind).
Every addressable storage model which could be treated as a filesystem should have a place in the Perl 6 VFS.

Now, as to the question of overloading open... I'm not sure. I mean, it's pretty easy to say:

    URI.new($path).open(:ro)

or

    open(URI.new($path), :ro)

when what you want is a VFS object, and I kind of like the idea of the standard open on a string having POSIX semantics.

Now, to your question, C.J.

On Fri, Jun 18, 2010 at 3:03 PM, C.J. Adams-Collier c...@colliertech.org wrote:

Define opening a file for me. Is it something that's associated with a filehandle, by definition? Do TCP sockets count?

Opening a file isn't a well defined operation. You have to be more specific. In your question you're conflating the evaluation of a filesystem namespace token (which is one of many possible modes of filesystem interaction), returning a filehandle object that represents access to the named object with evaluation of a socket namespace token and returning a similar filehandle object that represents that object.

There are, of course, crossovers (filesystem pipes) and other operations that yield filehandle objects (various IPC operations that aren't exactly
Re: Filesystems and files [Was: Re: The obligation of free stuff: Google Storage]
Sounds like a sound generalization to make.

<bikeshedding>

On Wed, Jun 30, 2010 at 1:29 AM, Richard Hainsworth rich...@rusrating.ru wrote:
> This then means that there is an implicit $*FS.connect(); that makes the local system available to the program.

mount is the jargon to make a filesystem available, looking backwards, though perhaps connect is more accurate going forwards.

</bikeshedding>

-y
Re: Filesystems and files [Was: Re: The obligation of free stuff: Google Storage]
On Wed, Jun 30, 2010 at 10:29 AM, Richard Hainsworth rich...@rusrating.ru wrote:
> Would it make sense to define $*FS as the implied local file system, and thus that a bare 'open' is sugar for
>
>     my $fh = $*FS.open('/path/to/directory/filename', :r);
>
> This then means that there is an implicit $*FS.connect(); that makes the local system available to the program.
>
> I wonder whether this would also be a way of unifying the program interface for different operating systems, in that a program running on a *nix machine would have $*FS of type IO::Filesystem::POSIX, while $*FS for a Windows machine would have type IO::Filesystem::Windows, etc.
>
> Then it would be possible, as Aaron has suggested, to have
>
>     my $remote-fs = IO::Filesystem::Google.connect(%args);
>     my $fh = $remote-fs.open($path-or-url, :r);
>
> and then treat $fh as specified elsewhere.
>
> Moreover, it would then be possible to do
>
>     $*FS = $remote-fs;
>
> I would propose that this sort of flexibility would be useful for programs that are embedded in other virtual environments, such as browser plugins, or programs intended to run on thin clients that do not have their own filesystems.

I like this idea. It's flexible and extendable but has sane defaults.

Leon
Re: Filesystems and files [Was: Re: The obligation of free stuff: Google Storage]
On Wed, Jun 30, 2010 at 4:29 AM, Richard Hainsworth rich...@rusrating.ru wrote:
> It is normally implied that a program already has a 'local' environment, including a 'local' filesystem. Thus the syntax
>
>     my $fn = open('/path/to/directory/filename', :r) or die $!;
>
> implies a local file system.
>
> The idea of an implied local system suggests an implied local environment. The contents of %*ENV and @*INC seem to be assumed to be local, though this is not specified. Given the development of the internet, this is an assumption I think should be made explicit, as well as the mechanism for adding remote resources via paths through a network.
>
> Would it make sense to define $*FS as the implied local file system, and thus that a bare 'open' is sugar for
>
>     my $fh = $*FS.open('/path/to/directory/filename', :r);

Yep, that makes perfect sense. Once I have a working VFS object that could be stored in there, that's probably the best way to go, unless someone proposes another way between now and then.

--
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs
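The $*FS idea discussed in this thread (a bare open resolved against a replaceable default filesystem object) can be illustrated with a small sketch. It is written in Python purely as an analogy, because the Perl 6 interface above is only a proposal. Every name here (LocalFilesystem, InMemoryFilesystem, current_fs, fs_open, demo) is hypothetical, and the in-memory class merely stands in for a remote backend such as the suggested IO::Filesystem::Google; it talks to no real service.

```python
from contextvars import ContextVar
import io
import os
import tempfile


class LocalFilesystem:
    """Hypothetical stand-in for the proposed IO::Filesystem::POSIX / ::Windows."""

    def open(self, path, mode="r"):
        return open(path, mode)


class InMemoryFilesystem:
    """Toy stand-in for a remote store; assumption: read-only, backed by a dict."""

    def __init__(self, files):
        self._files = dict(files)

    @classmethod
    def connect(cls, **args):
        # A real backend would authenticate here; this one only copies a dict.
        return cls(args.get("files", {}))

    def open(self, path, mode="r"):
        if mode != "r":
            raise ValueError("this sketch is read-only")
        return io.StringIO(self._files[path])


# Plays the role of $*FS: the filesystem a bare open is resolved against.
current_fs = ContextVar("current_fs", default=LocalFilesystem())


def fs_open(path, mode="r"):
    """Bare 'open' as sugar for current_fs.open(...)."""
    return current_fs.get().open(path, mode)


def demo():
    # Default case: fs_open goes to the local filesystem.
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, "hello.txt")
        with open(local_path, "w") as fh:
            fh.write("local data")
        with fs_open(local_path) as fh:
            local = fh.read()

    # Analogue of: $*FS = $remote-fs;
    remote_fs = InMemoryFilesystem.connect(files={"/bucket/hello.txt": "remote data"})
    token = current_fs.set(remote_fs)
    try:
        with fs_open("/bucket/hello.txt") as fh:
            remote = fh.read()
    finally:
        current_fs.reset(token)  # restore the local default
    return local, remote


if __name__ == "__main__":
    print(demo())
```

The context variable is one way to approximate a dynamically scoped variable: code that calls fs_open does not change when the default is swapped, which is the "sane defaults" property Leon mentions.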
S06 -- grammatical categories and macros
See below for the S06 section I'm referring to.

I'm wondering how we should be reading the description of user-defined operators. For example, sub infix:<(c)> doesn't describe the precedence level of this new op, so how is it parsed? Is there a default?

Right now, this doesn't work as I'd expect in Rakudo for all categories. For example:

    $ ./perl6 -e 'sub infix:<i> ($a,$b) { return $a+($b*1i); } ; say 3 i 2'
    3 + 2i

Correct.

    $ ./perl6 -e 'sub term:<i> () { return 1i; } ; say i'
    error:imcc:syntax error, unexpected '\n' in file 'EVAL_1' line 105914178

Eh? What newline? And line 105914178?

OK, so that's a bug, but the question is, should I expect it to work?

Things get a bit more strange when I try to wrap my head around macros. From the example:

    macro circumfix:«<!-- -->» ($text) is parsed / .*? / { "" }

OK, so when a circumfix would be allowed (any expression?) we accept <!-- --> and the result is an empty string which is... now, follow me here, 'cause I get lost myself... re-parsed within the context of what we've already parsed, and its resulting AST is then returned as if via make.

But doesn't that mean that in order to chain two comments, I would need something to join the new null expressions, e.g.:

    <!-- --> ; <!-- -->

In a way, I'd find that comforting. It's not as useful for creating comments as I'd have wanted, but at least it behaves like any other circumfix: category operator.

Along the lines of macros, am I correct in my assumption that a macro will either exist within a grammatical category that it names, or will be evaluated as an expression, just like a subroutine invocation? That is, there will be no way to do something like:

    macro endofline() { ; }

since there's no way to change the state of the parser that invoked the macro, other than to return an AST.

I think there's also a bug in the examples when it comes to ±. That can be a method, sure, that makes sense, but in which case I don't think it should be taking a parameter.
Wouldn't that be:

    method prefix:<±> (--> Num) { return +self.myintvalue | -self.myintvalue }

So that it would be used like so:

    my $x = MyInt.new(:myintvalue(5));
    say ±$x;

which I would expect to yield:

    any(5, -5)

From S06:

Operators are just subroutines with special names and scoping. An operator name consists of a grammatical category name followed by a single colon followed by an operator name specified as if it were one or more strings. So any of these indicates the same binary addition operator:

    infix:<+>
    infix:«+»
    infix:<<+>>
    infix:['+']
    infix:["+"]

This seems to imply that we can define our own operators. Use the & sigil just as you would on ordinary subs.

Unary operators are defined as prefix or postfix:

    sub prefix:<OPNAME> ($operand) {...}
    sub postfix:<OPNAME> ($operand) {...}

Binary operators are defined as infix:

    sub infix:<OPNAME> ($leftop, $rightop) {...}

Bracketing operators are defined as circumfix where a term is expected or postcircumfix where a postfix is expected. A two-element slice containing the leading and trailing delimiters is the name of the operator.

    sub circumfix:<LEFTDELIM RIGHTDELIM> ($contents) {...}
    sub circumfix:['LEFTDELIM','RIGHTDELIM'] ($contents) {...}

Contrary to Apocalypse 6, there is no longer any rule about splitting an even number of characters. You must use a two-element slice. Such names are canonicalized to a single form within the symbol table, so you must use the canonical name if you wish to subscript the symbol table directly (as in PKG::{'infix:<+>'}). Otherwise any form will do. (Symbolic references do not count as direct subscripts since they go through a parsing process.) The canonical form always uses angle brackets and a single space between slice elements. The elements are escaped on brackets, so PKG::circumfix:['<','>'] is canonicalized to PKG::{'circumfix:<\< \>>'}, and decanonicalizing may always be done left-to-right.

Operator names can be any sequence of non-whitespace characters including Unicode characters.
For example:

    sub infix:<(c)> ($text, $owner) { return $text but Copyright($owner) }
    method prefix:<±> (Num $x --> Num) { return +$x | -$x }
    multi sub postfix:<!> (Int $n) { $n < 2 ?? 1 !! $n*($n-1)! }
    macro circumfix:«<!-- -->» ($text) is parsed / .*? / { "" }

    my $document = $text (c) $me;
    my $tolerance = ±7!;
    <!-- This is now a comment -->

Whitespace may never be part of the name (except as separator within a <...> or «...» slice subscript, as in the example above).

A null operator name does not define a null or whitespace operator, but a default matching subrule for that syntactic category, which is useful when there is no fixed string that can be recognized, such as tokens beginning with digits. Such an operator *must* supply an is parsed trait. The Perl grammar uses a default subrule for the :1st, :2nd, :3rd, etc. regex modifiers, something like this:

    sub regex_mod_external:<> ($x) is parsed(token {
Re: S06 -- grammatical categories and macros
Aaron Sherman wrote:
> See below for the S06 section I'm referring to.
>
> I'm wondering how we should be reading the description of user-defined operators. For example, sub infix:<(c)> doesn't describe the precedence level of this new op, so how is it parsed? Is there a default?

The default is same as infix:<+> for infix ops, however the is prec trait (and some other related ones) should also be available (but not yet implemented in Rakudo).

> Right now, this doesn't work as I'd expect in Rakudo for all categories. For example:
>
>     $ ./perl6 -e 'sub infix:<i> ($a,$b) { return $a+($b*1i); } ; say 3 i 2'
>     3 + 2i
>
> Correct.
>
>     $ ./perl6 -e 'sub term:<i> () { return 1i; } ; say i'
>     error:imcc:syntax error, unexpected '\n' in file 'EVAL_1' line 105914178
>
> Eh? What newline? And line 105914178?

Yeah, our handling of categories we don't handle yet is crappy; it ends up boiling down to a code generation fail.

> OK, so that's a bug, but the question is, should I expect it to work?

Yes, I believe it should.

skipping macros bit for somebody else who groks them :-)

> I think there's also a bug in the examples when it comes to ±. That can be a method, sure, that makes sense, but in which case I don't think it should be taking a parameter. Wouldn't that be:
>
>     method prefix:<±> (--> Num) { return +self.myintvalue | -self.myintvalue }
>
> So that it would be used like so:
>
>     my $x = MyInt.new(:myintvalue(5));
>     say ±$x;
>
> which I would expect to yield:
>
>     any(5, -5)

The method case makes no sense to me. It almost certainly won't be any use unless the method gets exported, since operator dispatches are always sub dispatch. Maybe that example is a fossil that should go away. And if not, then yes, it most certainly would want to be written in terms of self, not have a parameter. So something is wonky with the spec here.

Hope this helps a little,

Jonathan
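Jonathan's answer, that a user-defined infix operator with no precedence trait parses at the same level as infix:<+>, can be made concrete with a toy precedence-climbing evaluator. This is a Python analogy under stated assumptions, not Rakudo's parser: the class InfixTable, the method define_infix, the numeric precedence levels 10, 20 and 30, and the example operator avg are all invented for illustration, and every operator is treated as left-associative.

```python
import operator


class InfixTable:
    """Toy table of infix operators with numeric precedence (higher binds tighter)."""

    def __init__(self):
        # Assumed levels: additive = 10, multiplicative = 20.
        self.ops = {"+": (10, operator.add), "*": (20, operator.mul)}

    def define_infix(self, name, fn, prec=None):
        # With no precedence given, default to the level of "+".
        if prec is None:
            prec = self.ops["+"][0]
        self.ops[name] = (prec, fn)

    def evaluate(self, source):
        # Tokens must be separated by whitespace: "3 i 2", "1 + 2 * 3".
        tokens = source.split()
        pos = 0

        def parse_term():
            nonlocal pos
            value = float(tokens[pos])
            pos += 1
            return value

        def parse_expr(min_prec):
            nonlocal pos
            left = parse_term()
            while pos < len(tokens):
                prec, fn = self.ops[tokens[pos]]
                if prec < min_prec:
                    break
                pos += 1
                # prec + 1 makes every operator left-associative.
                right = parse_expr(prec + 1)
                left = fn(left, right)
            return left

        return parse_expr(0)


if __name__ == "__main__":
    table = InfixTable()
    table.define_infix("i", lambda a, b: complex(a, b))
    print(table.evaluate("3 i 2"))

    table.define_infix("avg", lambda a, b: (a + b) / 2)
    print(table.evaluate("2 avg 4 * 2"))  # avg at the level of +, so * groups first

    table.define_infix("avg", lambda a, b: (a + b) / 2, prec=30)
    print(table.evaluate("2 avg 4 * 2"))  # avg now binds tighter than *
```

With the default level, "2 avg 4 * 2" groups as 2 avg (4 * 2); giving avg a tighter level regroups it as (2 avg 4) * 2, which is the kind of control a precedence trait would provide.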