Re: dis-junctive patterns
HaloO,

Gaal Yahas wrote:

In pugs, r7961:

    my @pats = /1/, /2/;
    say "MATCH" if 1 ~~ any @pats; # MATCH
    say "MATCH" if 0 ~~ any @pats; # no match

So far so good. But:

    my $junc = any @pats;
    say "MATCH" if 1 ~~ $junc; # no match
    say "MATCH" if 0 ~~ $junc; # no match

Bug? Feature?

Ohh, interesting. This reminds me of my proposal that junctions are code types and exert their magic only when recognized as such. The any(@pats) form constructs such a code object right in the match, while the $junc var hides it. My idea was to explicitly request a code evaluation by one of:

    my &junc = any @pats; # 1: use code sigil
    say "MATCH" if 1 ~~ junc;

    say "MATCH" if 1 ~~ do $junc; # 2: do operator

    say "MATCH" if 1 ~~ $junc(); # 3: call operator

But this might just be wishful thinking on my side.
Re: apo5
On Mon, Nov 21, 2005 at 12:08:08PM -0800, Larry Wall wrote: On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote: : There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the : A5-page.

Hmm, well, thanks--I went to fix it and I see Patrick beat me to the fix. But in one of the updates, it says: +[Update: Actually, that's now written C<< <+alpha+digit> >>, avoiding +the mistaken impression entirely.]

I went ahead and added the update while fixing the typos. :-)

And it occurs to me that we could probably allow <alpha+digit> there since there's no ambiguity what alpha means, and we're already claiming the next character after the opening word to decide how to process the rest of the text inside angles. Even if someone writes <alpha + digit> that would fail under the current policy of treating + digit as rule, since you can't start a rule with +.

Somehow I prefer the explicit leading + or -, so that we *know* this is a rule composition of some sort. It also fits in well with the convention that the first character after the '<' lets you know what kind of assertion is being created.

Unfortunately, though, <identchar - digit> would be ambiguous, and/or wrong. Could allow whitespace there if we picked an explicit "this is rule" character. Did we remove "this is string"?

I didn't recall seeing anything that removed "this is string", so it's currently implemented in PGE. It's kind of a nice shortcut: <bracketed: []()> but it would be no real problem to eliminate it and go strictly with: <bracketed('[]()')>

"This is rule" is currently whitespace; whatever follows is taken to be a pattern. But let me know what you decide so I can make the appropriate changes. :-)

Pm
Re: \x{123a 123b 123c}
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote: : There's also sp, unless someone redefines the sp subrule. But you can't use sp in a character class. Well, that is, unless you write it: +[ a..z ]+sp or some such. Maybe that's good enough. Er, that's now +[ a..z ]+sp, unless you're now changing it back. : And in the general case that's a slightly more expensive mechanism : to get a space (it involves at least a subrule lookup). Perhaps : we could also create a visible meta sequence for it, in the same : way that we have visible metas for \e, \f, \r, \t. But I have : no idea what letter we might use there. Something to be said for \_ in that regard. Yes, I thought of \_ but mentally I still have trouble classifying _ along with the alphabetics -- '_' looks more like punctuation to me. And in general we use backslashes in front of metacharacters to remove their meta meaning (or when we aren't sure if a character has a meta meaning), so that \_ somehow seems like it ought to be a literal underscore, guarding against the possibility that the unescaped underscore has a meta meaning. (And yes, I can shoot holes in this line of thinking along with everyone else.) Whatever shortcuts we introduce, I'll be happy if we can just rule that backslash+space (i.e., \ ) is a literal space character -- i.e., keeping the principle that placing a backslash in front of a metacharacter removes that character's meta behavior. I dunno. If «...» in ordinary code does shell quoting, maybe «...» in rules does filename globbing or some such. I can see some issues with anchoring semantics. Makes more sense on a string as a whole, but maybe can anchor on element boundaries if used on a list of filenames. I suppose one could even go as far as rule jpeg :i « *.jp{e,}g » or whatever the right glob syntax is. 
Since we already have :perl5, I'd think that we'd want globbing to be something like rule jpeg :i :glob /*.jp{e,}g/ or, for something intra-rule-ish: m :w / mv (:glob *.c)+ dir / And perhaps we'd want a general form for specifying other pattern syntaxes; i.e., :perl5 and :glob are shortcuts for :syntax('perl5') and :syntax('glob') or something like that. Pm
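As a rough illustration of what a `:glob` pattern such as `*.jp{e,}g` under `:i` would be expected to accept, here is a minimal Python sketch. It is not PGE or Perl 6 code: `expand_braces` and `glob_match_ci` are hypothetical helpers, the brace handling covers only a single `{a,b}` group, and the wildcard matching is delegated to Python's standard `fnmatch` module.

```python
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand one {a,b,...} group into separate patterns (hypothetical helper)."""
    start = pattern.find("{")
    end = pattern.find("}", start)
    if start == -1 or end == -1:
        return [pattern]
    head, tail = pattern[:start], pattern[end + 1:]
    return [head + alt + tail for alt in pattern[start + 1:end].split(",")]


def glob_match_ci(name, pattern):
    """Case-insensitive glob match, loosely mirroring ':i :glob' (an assumption)."""
    return any(
        fnmatchcase(name.lower(), candidate.lower())
        for candidate in expand_braces(pattern)
    )


alternatives = expand_braces("*.jp{e,}g")
print(alternatives)
print(glob_match_ci("PHOTO.JPG", "*.jp{e,}g"))
print(glob_match_ci("photo.png", "*.jp{e,}g"))
```

The sketch treats the glob as a match against the whole name, which sidesteps the anchoring questions raised in the message.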
Re: apo5
On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote: There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the A5-page. Now fixed. Besides, you have to be able to distinguish s/^/foo/ from s/$/foo/. 's/$/foo/' becomes 's/<after .*>/foo/' <g> Uh, no, because <after> is still a zero-width assertion. :-) Pm
Re: apo5
On Mon, Nov 21, 2005 at 11:19:48PM +0100, Ruud H.G. van Tol wrote: Patrick R. Michaud: 's/$/foo/' becomes 's/<after .*>/foo/' <g> Uh, no, because <after> is still a zero-width assertion. :-) That's why I chose it. It is not at the end-of-string? Because .* matches "", /<after .*>/ would be true at every position in the string, including the beginning, and this is where foo would be substituted. Pm
Re: apo5
On Tue, Nov 22, 2005 at 01:09:40AM +0100, Ruud H.G. van Tol wrote: 's/$/foo/' becomes 's/<after .*>/foo/' Uh, no, because <after> is still a zero-width assertion. :-) That's why I chose it. It is not at the end-of-string? Because .* matches "", /<after .*>/ would be true at every position in the string, including the beginning, and this is where foo would be substituted. I expected greediness, also because <after .*?> could behave non-greedy. ... But why does <after .*> behave non-greedy?

I think you may be misreading what <after .*> does -- it's a lookbehind assertion. An assertion such as <after pattern> attempts to match pattern to the sequence immediately preceding the current match position. It does not mean "skip over pattern and then match whatever comes afterwards".

The greediness of the .* subpattern in <after .*> doesn't affect things at all -- <after .*> is still a zero-width assertion. Since .* can match at every position, <after .*> will be a successful zero-width match (i.e., a null string) at every position in the target string, including the beginning. So, s/<after .*>/foo/ matches the first null string it finds -- the one at the beginning of the string -- and replaces it with foo. It's the same as if you had written s/<null>/foo/, since <after .*> and <null> will both end up matching exactly the same (i.e., a zero-width string at any position).

If this still doesn't make any sense, contact me off-list and I'll try and explain it there.

Pm
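The point about zero-width matches can be tried out in Python, with the caveat that this is only an analogy: Python's `re` module rejects variable-width lookbehind, so the sketch below uses an empty pattern as a stand-in for a zero-width assertion that succeeds everywhere. It shows that such a pattern matches at every position and that a single substitution lands at the beginning of the string, unlike an end-of-string anchor.

```python
import re

text = "abc"

# An empty pattern is a zero-width match that succeeds at every position,
# including the very beginning and the very end of the string.
positions = [m.start() for m in re.finditer(r"", text)]

# Replacing only the first zero-width match inserts at the beginning ...
first_null = re.sub(r"", "foo", text, count=1)

# ... whereas an end-of-string anchor inserts at the end.
at_end = re.sub(r"$", "foo", text, count=1)

print(positions, first_null, at_end)
```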
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On Mon, 21 Nov 2005, Larry Wall wrote: I would like to publicly apologize for my remarks, which were far too harsh for the circumstances. I can only plead that I was trying to be far too clever, and not thinking about how it would come across. No, to be perfectly honest, it was more culpable than that. I had a niggling feeling I was being naughty, and I ignored it. Shame on me. I will try to pay better attention to my conscience in the future. Oh, I'm not the person you were responding to, and probably the least entitled one to speak in the name of everyone else here, but I feel like doing so to say that in all earnestness I'm quite sure no one took any offense at your words. Despite the slight harshness, they're above all witty. Just as usual: and that's the style we all like! Michele -- Life is like a box of chocolates: a banal gift. - written on a wall, V.le Sabotino - Milano.
Re: \x{123a 123b 123c}
On Mon, Nov 21, 2005 at 11:25:20AM -0600, Patrick R. Michaud wrote: : On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote: : : There's also sp, unless someone redefines the sp subrule. : : But you can't use sp in a character class. Well, that is, unless : you write it: : : +[ a..z ]+sp : : or some such. Maybe that's good enough. : : Er, that's now +[ a..z ]+sp, unless you're now changing it back. No, just me going senile. : : And in the general case that's a slightly more expensive mechanism : : to get a space (it involves at least a subrule lookup). Perhaps : : we could also create a visible meta sequence for it, in the same : : way that we have visible metas for \e, \f, \r, \t. But I have : : no idea what letter we might use there. : : Something to be said for \_ in that regard. : : Yes, I thought of \_ but mentally I still have trouble : classifying _ along with the alphabetics -- '_' looks more : like punctuation to me. And in general we use backslashes : in front of metacharacters to remove their meta meaning : (or when we aren't sure if a character has a meta meaning), : so that \_ somehow seems like it ought to be a literal : underscore, guarding against the possibility that the unescaped : underscore has a meta meaning. (And yes, I can shoot : holes in this line of thinking along with everyone else.) I think we'll leave both _ and \_ meaning the same thing, just to avoid that confusion path--I've seen people backwhacking anything remotely resembling punctuation just in case it's a metacharacter, and if they are confused about _, they might backwhack it. More to the point, I think sp and +sp are about the right Huffman length, given that matching a single space is usually wrong. You usually want \s or \s*. 
: Whatever shortcuts we introduce, I'll be happy if we can just : rule that backslash+space (i.e., \ ) is a literal space : character -- i.e., keeping the principle that placing a backslash : in front of a metacharacter removes that character's meta : behavior. Yes, that will be a space. : I dunno. If «...» in ordinary code does shell quoting, maybe «...» in : rules does filename globbing or some such. I can see some issues with : anchoring semantics. Makes more sense on a string as a whole, but maybe : can anchor on element boundaries if used on a list of filenames. : I suppose one could even go as far as : : rule jpeg :i « *.jp{e,}g » : : or whatever the right glob syntax is. : : Since we already have :perl5, I'd think that we'd want globbing : to be something like : : rule jpeg :i :glob /*.jp{e,}g/ : : or, for something intra-rule-ish: : : m :w / mv (:glob *.c)+ dir / Yep, that's what I decided in my other message that was thinking about using ... for word boundaries and ... for capturing $. : And perhaps we'd want a general form for specifying other : pattern syntaxes; i.e., :perl5 and :glob are shortcuts for : :syntax('perl5') and :syntax('glob') or something like that. Maybe. Or maybe it's enough that there are syntactic categories for adding rule modifiers. Doesn't seem like you'd want to parameterize the current language very often. Larry
Re: \x{123a 123b 123c}
On Tue, Nov 22, 2005 at 07:52:24AM -0800, Larry Wall wrote: I think we'll leave both _ and \_ meaning the same thing, just to avoid that confusion path [...] Yay! : Whatever shortcuts we introduce, I'll be happy if we can just : rule that backslash+space (i.e., \ ) is a literal space : character -- i.e., keeping the principle that placing a backslash : in front of a metacharacter removes that character's meta : behavior. Yes, that will be a space. Yay! : Since we already have :perl5, I'd think that we'd want globbing : to be something like : rule jpeg :i :glob /*.jp{e,}g/ : or, for something intra-rule-ish: : m :w / mv (:glob *.c)+ dir / Yep, that's what I decided in my other message that was thinking about using ... for word boundaries and ... for capturing $. Yay! (Our messages on this crossed in the mail; mine was moderated for some reason but that's been corrected.) : And perhaps we'd want a general form for specifying other : pattern syntaxes; i.e., :perl5 and :glob are shortcuts for : :syntax('perl5') and :syntax('glob') or something like that. Maybe. Or maybe it's enough that there are syntactic categories for adding rule modifiers. Doesn't seem like you'd want to parameterize the current language very often. At least within PGE, I'm starting to come across the situation where each application and host language wants its own slight variations of the regular expression syntax (for compatibility reasons). And I figured that since we (conjecturally) have C:lang('PIR'), C:lang('Python') and C:lang('TCL') to indicate the language to be used for the closures within a rule, it might be nice to have a similar parameterized modifier for the pattern syntax itself. I was also thinking that one of the tricky parts to custom rule modifiers such as :perl and :glob is that they actually change the parsing for whatever follows, so it might be nice to have a parameterized form to hook into rather than defining a custom modifier for each syntax variant. 
But on thinking about it further from an implementation perspective I guess it all comes out the same anyway... Pm
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On Tue, Nov 22, 2005 at 10:12:00AM +0100, Michele Dondi wrote: : Oh, I'm not the person you were responding to, and probably the least : entitled one to speak in the name of everyone else here, but I feel like : doing so to say that in all earnestness I'm quite sure no one took any : offense at your words. Despite the slight harshness, they're above all : witty. Just as usual: and that's the style we all like! I like witty sayings as much as the next guy, but wit can hurt when misdirected. If people want me to be a machine for cranking out quote file fodder, I'll do my best. But I also care about my friends. Larry
Re: \x{123a 123b 123c}
Patrick wrote: Since we already have :perl5, I'd think that we'd want globbing to be something like rule jpeg :i :glob /*.jp{e,}g/ or, for something intra-rule-ish: m :w / mv (:glob *.c)+ dir / Hear! Hear! And perhaps we'd want a general form for specifying other pattern syntaxes; i.e., :perl5 and :glob are shortcuts for :syntax('perl5') and :syntax('glob') or something like that. Agreed. Damian
Re: \x{123a 123b 123c}
On Tue, Nov 22, 2005 at 08:19:04PM +1100, Damian Conway wrote: : And perhaps we'd want a general form for specifying other : pattern syntaxes; i.e., :perl5 and :glob are shortcuts for : :syntax('perl5') and :syntax('glob') or something like that. : : Agreed. But the language in the following lexical scope is a constant, so what can :syntax($foo) possibly mean? [Wait, this is Damian I'm talking to.] Nevermind, don't answer that... And there aren't that many regexish languages anyway. So I think :syntax is relatively useless except for documentation, and in practice people will almost always omit it, which makes it even less useful, and pretty nearly kicks it over into the category of multiplied entities for me. Larry
Re: \x{123a 123b 123c}
Larry Wall wrote: And there aren't that many regexish languages anyway. So I think :syntax is relatively useless except for documentation, and in practice people will almost always omit it, which makes it even less useful, and pretty nearly kicks it over into the category of multiplied entities for me. It's surprising how many are out there. Even if we ignore the various dialects of standard rexen, we can find interesting examples such as PSL, a language for specifying temporal assertions, for hardware design: http://www.project-veripage.com/psl_tutorial_5.php. Whether one would want to fold this syntax into a C<rule> is a different question. There are actually a number of competing languages in this space. E.g. http://www.pslsugar.org/papers/pslandsva.pdf.
Re: \x{123a 123b 123c}
On Tue, Nov 22, 2005 at 09:46:59AM -0800, Dave Whipp wrote: : Larry Wall wrote: : : And there aren't that many regexish languages anyway. So I think :syntax : is relatively useless except for documentation, and in practice people : will almost always omit it, which makes it even less useful, and pretty : nearly kicks it over into the category of multiplied entities for me. : : It's surprising how many are out there. We can certainly add a :syntax() modifier as easily as a :foolang modifier, if we decide at some point we really need one, or if PGE could make good use of it even if Perl 6 doesn't want it. Larry
Re: \x{123a 123b 123c}
On Tue, Nov 22, 2005 at 10:30:20AM -0800, Larry Wall wrote: On Tue, Nov 22, 2005 at 09:46:59AM -0800, Dave Whipp wrote: : Larry Wall wrote: : : And there aren't that many regexish languages anyway. So I think :syntax : is relatively useless except for documentation, and in practice people : will almost always omit it, which makes it even less useful, and pretty : nearly kicks it over into the category of multiplied entities for me. : : It's surprising how many are out there. We can certainly add a :syntax() modifier as easily as a :foolang modifier, if we decide at some point we really need one, or if PGE could make good use of it even if Perl 6 doesn't want it. I'm agreeing with Larry on this one -- let's wait to decide this until we actually feel like we need it. Pm
Re: \x{123a 123b 123c}
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote: On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote: : On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote: : We already have, from A5, \x[0a;0d], so you can supposedly say : \x[123a;123b;123c] : : Hmm, I hadn't caught that particular syntax in A05. AFAIK it's not : in S05, so I should probably add it, or whatever syntax we end up : adopting. Yes. Out of curiosity (and so I can update S05 and PGE), what syntax are we adopting? Is it semicolon, comma, space, any combination of the three, or ...? Pm
Re: \x{123a 123b 123c}
On Tue, Nov 22, 2005 at 12:48:39PM -0600, Patrick R. Michaud wrote: : On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote: : On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote: : : On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote: : : We already have, from A5, \x[0a;0d], so you can supposedly say : : \x[123a;123b;123c] : : : : Hmm, I hadn't caught that particular syntax in A05. AFAIK it's not : : in S05, so I should probably add it, or whatever syntax we end up : : adopting. : : Yes. : : Out of curiosity (and so I can update S05 and PGE), what syntax : are we adopting? Is it semicolon, comma, space, any combination of the : three, or ...? S02.pod currently has it as comma. Larry
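To make the separator question concrete, here is a small Python sketch of what a multi-codepoint escape such as \x[0d,0a] would expand to. The function name `expand_hex_list` and its `sep` parameter are invented for illustration; the default follows the comma form attributed to S02.pod above, and passing `sep=";"` mimics the A05 spelling.

```python
def expand_hex_list(spec, sep=","):
    """Turn a list of hex codepoints such as '0d,0a' into a string.

    Hypothetical helper: 'sep' selects the separator under discussion.
    """
    return "".join(chr(int(part.strip(), 16)) for part in spec.split(sep))


crlf = expand_hex_list("0d,0a")               # comma form
lfcr = expand_hex_list("0a;0d", sep=";")      # semicolon form from A05
three = expand_hex_list("123a,123b,123c")

print(repr(crlf), repr(lfcr), len(three))
```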
Re: dis-junctive patterns
On Tue, Nov 22, 2005 at 09:31:27AM +0200, Gaal Yahas wrote:
: In pugs, r7961:
:
:     my @pats = /1/, /2/;
:     say "MATCH" if 1 ~~ any @pats; # MATCH
:     say "MATCH" if 0 ~~ any @pats; # no match
:
: So far so good. But:
:
:     my $junc = any @pats;
:     say "MATCH" if 1 ~~ $junc; # no match
:     say "MATCH" if 0 ~~ $junc; # no match
:
: Bug? Feature?

Feels like a bug to me. The junction should autothread the ~~ even if ~~ weren't dwimmy. And ~~ ought to be dwimmy about junctions even if they didn't autothread. Maybe they're just doing the hallway dance.

Larry
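The expected behaviour, that a junction matches the same way whether it is built at the match site or stored in a variable first, can be modelled loosely in Python. This is an analogy rather than pugs semantics: `AnyJunction` and its `smartmatch` method are hypothetical names, and "autothreading" is reduced to trying each pattern in turn.

```python
import re


class AnyJunction:
    """Hypothetical stand-in for a Perl 6 any() junction over patterns."""

    def __init__(self, patterns):
        self.patterns = [re.compile(p) for p in patterns]

    def smartmatch(self, value):
        # "Autothread" the match: succeed if any single pattern matches.
        return any(p.search(str(value)) for p in self.patterns)


pats = ["1", "2"]

# Junction built inline at the match site.
inline_1 = AnyJunction(pats).smartmatch(1)
inline_0 = AnyJunction(pats).smartmatch(0)

# Junction stored in a variable first; the result should not change.
junc = AnyJunction(pats)
stored_1 = junc.smartmatch(1)
stored_0 = junc.smartmatch(0)

print(inline_1, inline_0, stored_1, stored_0)
```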
Re: syntax for accessing multiple versions of a module
On Tue, Oct 18, 2005 at 07:38:19PM -0400, Stevan Little wrote: I have been meaning to do some kind of p5 prototype of this, I can push it up the TODO list if it would help you. As you can probably infer from the amount of time that it has taken for me to realise that I've failed to reply to you, I think that I already have rather too much going on to be able to take advantage of anything in the near future. So thanks for the offer, but please do things in the order that is most logical to you. Nicholas Clark
type sigils redux, and new unary ^ operator
I'm changing my mind about type sigils. After playing around with ^ for a while, I find it's useful only in signatures and declarations, and I'm generally forced to omit it when using it within inner declarations, or it would redeclare the type. Taking that together with the fact that it installs a local :: symbol anyway, I think we can safely go back to the position that the :: sigil in a signature or declaration captures a parametric type, and otherwise is a no-op. The problem that worried me (about wanting to refer to a type that will exist but hasn't been declared yet) does not arise often in practice, and can be solved with a symbolic ref in any event, or by predeclaring a stub type. What tipped me over the edge, however, is that I want ^$x back for a unary operator that is short for 0..^$x, that is, the range from 0 to $x - 1. I kept wanting such an operator in revising S09. It also makes it easy to write for ^5 { say } # 0, 1, 2, 3, 4 Now, while it's true that ^5 is an illegal type name, a unary operator takes an expression, and that could start with an alpha: ^rand(5). We could conceivably keep the type sigil if we forced you to say instead ^(rand(5)) but that seems like a bad non-orthogonality. So let's go back to ::T for a parametric type, at least until I change my mind again. Sorry if you feel jerked around. Larry
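For readers more familiar with other languages, the proposed unary ^$x (short for 0..^$x) corresponds closely to a half-open range starting at zero. A minimal Python sketch, where `upto` is a made-up name for illustration only:

```python
def upto(n):
    """Hypothetical helper mirroring ^$x, i.e. 0..^$x: 0 up to n - 1."""
    return range(0, n)


values = list(upto(5))
print(values)  # the same values the message lists for ^5
```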
Re: Perl 6 Summary for 2005-11-14 through 2005-11-21
On Nov 22, 2005, at 1:40, Matt Fowles wrote: Call Frame Access Chip began to pontificate about how one should access call frames. Chip suggested using a PMC, but Leo thought that would be too slow. No, not really. It'll be slower, yes. But my argument was: whenever you start introspecting a call frame, by almost whatever means, this will keep the call frame alive[1] (see Continuation or Closure). That is: timely destruction doesn't work for example and the introspection feature is adding another level of complexity that isn't needed per se, because 2 other solutions are already there (or at least implemented mostly). leo [1] a call frame PMC could be stored elsewhere and reused later, referring to then-dead contents. Autrijus mentioned that this will need weak references to work properly.
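The effect described here, that holding on to an introspected call frame keeps its contents alive and defeats timely destruction, can be observed by analogy in CPython. This says nothing about how Parrot implements frames; it is a sketch that relies on CPython's reference counting and on `sys._getframe`, and the names `Resource` and `run` are invented for the example.

```python
import sys
import weakref


class Resource:
    """Placeholder for something we would like destroyed promptly."""


def run(keep_frame):
    res = Resource()
    probe = weakref.ref(res)  # lets us observe whether res is still alive
    # Introspection: grab a handle on the current call frame if requested.
    frame = sys._getframe() if keep_frame else None
    return probe, frame


probe_plain, _ = run(keep_frame=False)
probe_kept, kept_frame = run(keep_frame=True)

# Without the frame handle the local is gone as soon as run() returns;
# with it, the frame's locals (and so the resource) stay reachable.
print(probe_plain() is None, probe_kept() is not None)
```

The weak reference is used only as a probe; it mirrors the footnote's remark that weak references are the tool for referring to a frame without keeping it alive.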
Re: type sigils redux, and new unary ^ operator
On 11/22/05, Larry Wall wrote: What tipped me over the edge, however, is that I want ^$x back for a unary operator that is short for 0..^$x, that is, the range from 0 to $x - 1. I kept wanting such an operator in revising S09. It also makes it easy to write for ^5 { say } # 0, 1, 2, 3, 4 I read this and I'm trying to figure out why P6 needs a unary operator for something that is an additional character written the more legible way. To me, ^ indicates XOR, so unary ^ should really be the bit-flip of the operand. So, ^0 would be -1 (under 2's complement) and ^1 would be -2. I'm not sure where this would be useful, but that's what comes to mind when discussing a unary ^. Thanks, Rob
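The bit-flip reading can be checked against a language that already has such an operator. In Python the unary `~` is bitwise NOT on arbitrary-precision integers, which behaves like two's complement, so `~n` equals `-n - 1`. The snippet is only an analogy for the suggested meaning of unary ^, not Perl 6 code.

```python
# Bitwise NOT of a few small integers; ~n is always -n - 1.
flipped = [(n, ~n) for n in (0, 1, 5)]

for n, inverted in flipped:
    print(n, inverted)
```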
Re: Perl 6 Summary for 2005-11-14 through 2005-11-21
On Wed, 2005-11-23 at 01:39 +0100, Leopold Toetsch wrote: But my argument was: whenever you start introspecting a call frame, by almost whatever means, this will keep the call frame alive[1] (see Continuation or Closure). That is: timely destruction doesn't work for example... Destruction or finalization? That is, if I have a filehandle I really want to close at the end of a scope but I don't care when GC drags it into the void, will the close happen even if there's introspection somewhere? -- c