Re: S5: substitutions
On 10/9/06, Jonathan Lang [EMAIL PROTECTED] wrote:
> Smylers wrote:
> > To be consistent your proposal should also suggest that these become
> > equivalent:
> >
> >   * { function() }
> >   * qq[ {function() }]
> >   * qq{ function() }
> >   * eval function()
>
> How so?  AFAIK, string literal syntax requires you to prepend a sigil
> on the front of any embedded closure that you want to interpolate a
> value from; otherwise, it isn't a closure - it's just a pair of
> curly-brace characters.
>
> My proposal isn't "curly braces _always_ act like closures, no matter
> what"; it's "the second part of a s[] construct doesn't have to be a
> literal; it can be anything that can be evaluated as needed by the
> algorithm to provide substitute text."

According to S02 bare curlies do interpolate in double-quoted strings:

S02> =item *
S02>
S02> A bare closure also interpolates in double-quotish context.  It may
S02> not be followed by any dereferencers, since you can always put them
S02> inside the closure.  The expression inside is evaluated in scalar
S02> (string) context.  You can force list context on the expression using
S02> the C<list> operator if necessary.

--
Markus Laire
Re: S5: substitutions
Markus Laire wrote:
> According to S02 bare curlies do interpolate in double-quoted strings:

Yeah; that was subsequently pointed out to me.  Oops.

--
Jonathan Dataweaver Lang
Re: S5: substitutions
Larry Wall wrote:
> On Sat, Oct 07, 2006 at 07:49:48PM -0700, Jonathan Lang wrote:
> : Another possibility: make it work.  Add a "delayed" parameter trait
> : that causes evaluation of that trait to be postponed until the first
> : time that the parameter actually gets used in the routine.  If it
> : never gets used, then it never gets evaluated.  I could see uses for
> : this outside of the narrow scope of implementing substitutions.
>
> Tell me how you plan to do MMD on a value you don't have yet.

MMD is based on types, not values; and you don't necessarily have to
evaluate something in order to know its type.  Also, you don't
necessarily need every argument in order to do MMD: if there's a
semi-colon in any of the candidates' signatures prior to the argument
in question, MMD stands a decent chance of selecting a candidate
before the question of its type comes up.  Worst case scenario (a Code
object without a return type being compared to a non-Code parameter),
you can treat the argument's type as Any, and let the method
redispatch once the type is known, if it's appropriate to do so.

That said, the real problem here is figuring out what to do if some
candidates ask for a given parameter to be lazily evaluated and others
don't.  It would probably be best to restrict the lazy evaluation
option to the prototype's parameters, so that it always applies across
the board.

--

Consider this as another option: instead of a parameter trait, apply a
trait to the method prototype.  With this trait in play, all parameter
evaluations are postponed as long as possible.  If the first candidate
needs only the first two parameters to test its viability, only
evaluate the first two parameters before testing it.  If the dispatch
succeeds, the other parameters remain unevaluated until they actually
get used in the body.  If all of the two-parameter candidates fail,
evaluate the next batch of parameters and go from there.

This approach doesn't guarantee that a given parameter won't be
evaluated before its first appearance within the routine; but it does
remove the guarantee that it will be.

--

In the case of subst, there's an additional wrinkle: you can't always
evaluate the expression without making reference to the pattern's
Match object, which won't be known until the pattern is applied to the
invocant.  In particular, closures that refer to $0, $1, etc. will
only work properly if called by the method itself, and only after $0,
$1, etc. have been set.

All things considered, the best solution for subst might be to treat
the timing of quote evaluation in a manner analogous to regex
evaluation.

--
Jonathan Dataweaver Lang
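Editor's sketch of the "postpone evaluation as long as possible" idea above, written in Python only because the Perl 6 syntax in this thread is speculative. Everything here is hypothetical: `dispatch`, `make_thunk`, and the `(types, body)` candidate shape are illustrative names, not anything proposed on the list. Note that this toy forces an argument in order to learn its type, which is exactly the case Larry asks about; Jonathan's point that a type may be known without evaluating the value is not modelled.

```python
def dispatch(candidates, thunks):
    """Try candidates that inspect the fewest arguments first.

    candidates: list of (types, body); types is a tuple of classes that the
    leading arguments must be instances of, body receives an accessor.
    thunks: zero-argument callables, one per argument, evaluated on demand.
    """
    values = {}

    def arg(i):
        # Force argument i at most once and remember the result.
        if i not in values:
            values[i] = thunks[i]()
        return values[i]

    for types, body in sorted(candidates, key=lambda c: len(c[0])):
        # all() short-circuits, so later arguments stay unevaluated
        # once an earlier one fails the type test.
        if all(isinstance(arg(i), t) for i, t in enumerate(types)):
            return body(arg), sorted(values)
    raise TypeError("no candidate matched")


evaluated = []  # records which arguments were actually forced


def make_thunk(index, value):
    def thunk():
        evaluated.append(index)
        return value
    return thunk


thunks = [make_thunk(0, "text"), make_thunk(1, 7), make_thunk(2, [1, 2])]
candidates = [
    ((str, int, list), lambda arg: "three-arg: %r" % (arg(2),)),
    ((str, int), lambda arg: "two-arg: %s %d" % (arg(0), arg(1))),
]

result, forced = dispatch(candidates, thunks)
print(result, forced, evaluated)
```

Here the two-parameter candidate is tested first and succeeds, so the third argument is never evaluated. If it had failed, the loop would have moved on to the three-parameter candidate and forced the next argument only then.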
Re: S5: substitutions
Jonathan Lang writes:
> Translating this to perl 6, I'm hoping that perl6 is smart enough to
> let me say:
>
>    s(pattern) { doit() }
>
> Instead of
>
>    s(pattern) { { doit() } }

That special case is nasty if you don't know about it -- you
inadvertently execute as code something which you just expected to be a
string.  Not a good trap to have in the language.

Smylers
Re: S5: substitutions
Jonathan Lang writes:
> Smylers wrote:
> > Jonathan Lang writes:
> > > Translating this to perl 6, I'm hoping that perl6 is smart enough
> > > to let me say:
> > >
> > >    s(pattern) { doit() }
> > >
> > > Instead of
> > >
> > >    s(pattern) { { doit() } }
> >
> > That special case is nasty if you don't know about it -- you
> > inadvertently execute as code something which you just expected to
> > be a string.  Not a good trap to have in the language.
>
> If you expected it to be a string, why did you use curly braces?

Because it isn't possible to learn all of Perl (5 or 6) in one go.  And
in general you learn rules before exceptions to rules.

In general in Perl the replacement part of a substitution is a string,
one that takes interpolation like double-quoted strings do.

In general in Perl if the default delimiter for something is
inconvenient you can pick a different delimiter -- this includes
patterns, and also strings.  And if you pick any sort of brackets for
your delimiters then they match -- which is handy, cos it means that
they can still be used even if the string inside contains some of those
brackets.

So it's quite possible for somebody to have picked up all the above,
and have got used to using C<qq[long string]> or C<qq{long string}>
when he wishes to quote long strings.  The form with braces has the
advantage that they are relatively uncommon in text (and HTML, and SQL,
and many other typically encountered long strings).

At which point if he wants to do substitution with slashes in at least
one of the pattern or the replacement text (perhaps it's a URL or a
filename) then he's likely to pick some other arbitrary characters for
doing the quoting.  And braces seem as likely to be picked as anything
else.  Unless he specifically knows about an exception there's no
reason not to pick them.

I refer simply to "Perl" above.  The above situation could just as
easily arise (or already have arisen) in Perl 5 -- in which case the
programmer's expectations would've been met and the code interpreted
fine.

Your proposal would make that no longer the case in Perl 6.  And, apart
from people learning Perl fresh, there's also a large number of
existing Perl 5 programmers who also won't be expecting this exception.

Yes, Perl 6 isn't supposed to be compatible with Perl 5, and obviously
a Perl 5 coder is going to have to learn lots of new things anyway.
But usually they are significantly different, or the old way of doing
things will be a syntax error.

This is a situation where the old syntax continues to work but does
something quite different.  That's unfortunate, but probably liveable
with in general.

But in this particular case the particular behaviour involves
_executing as Perl code something which the programmer never intended
to be code in the first place_.  That's crazily dangerous.

It's like having a Perl 5 to Perl 6 translator that randomly sticks
eval statements in front of some of your double-quoted strings.

> While I'm completely on board with the idea that _pattern_ delimiters
> shouldn't affect the _pattern's_ semantics, the second half of the
> search-and-replace syntax isn't a pattern.  Conceptually, it's either
> a string or an expression that returns a string.

Sure.  Or rather, it's a string (but braces inside strings can be used
to embed expressions in them).

To be consistent your proposal should also suggest that these become
equivalent:

  * { function() }
  * qq[ {function() }]
  * qq{ function() }
  * eval function()

and, naturally, that these no longer are:

  * string
  * qq[string]
  * qq{string}

And if braces are special as delimiters for C<qq> consistency would say
they should be for C<q> as well -- effectively just another way of
spelling C<eval>, but one that doesn't stand out so much.

Smylers
Re: S5: substitutions
Smylers wrote:
> in this particular case the particular behaviour involves
> _executing as Perl code something which the programmer never intended
> to be code in the first place_.  That's crazily dangerous.

I wouldn't mind eval() being off by default, so that you have to put a
"use eval" in every block that needs it.

--
Affijn, Ruud

Gewoon is een tijger.
Re: S5: substitutions
Jonathan Lang wrote 2006-10-07 15:07 (-0700):
> Translating this to perl 6, I'm hoping that perl6 is smart enough to
> let me say:
>
>    s(pattern) { doit() }
>
> Instead of
>
>    s(pattern) { { doit() } }

I would personally hope that Perl isn't that clever, but treats all
bracketing delimiters the same there.  Partly for future-proofness,
partly for least surprise.

--
korajn salutojn,
juerd waalboer: perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
convolution: ict solutions and consultancy [EMAIL PROTECTED]
Ik vertrouw stemcomputers niet. Zie http://www.wijvertrouwenstemcomputersniet.nl/.
Re: S5: substitutions
On Sat, Oct 07, 2006 at 03:07:49PM -0700, Jonathan Lang wrote:
: S5 says:
: There is no /e evaluation modifier on substitutions; instead use:
:
:     s/pattern/{ doit() }/
:
: Instead of /ee say:
:
:     s/pattern/{ eval doit() }/
:
: In my perl5 code, I would occasionally take advantage of the pairs of
: brackets quoting mechanism to do something along the lines of:
:
:    s(pattern) { doit() }e
:
: Translating this to perl 6, I'm hoping that perl6 is smart enough to let me
: say:
:
:    s(pattern) { doit() }

Well, the () are illegal without intervening whitespace because that
makes s() a function call, but we'll leave that alone.

: Instead of
:
:    s(pattern) { { doit() } }

Perl 5 let certain choose-your-own quotes introduce various kinds of
odd semantics, and that was generally viewed as a mistake.  That is why
S02 says:

    For these "q" forms the choice of delimiters has no influence on
    the semantics.  That is, C<''>, C<"">, C<< <> >>, C<«»>, C<``>,
    C<()>, C<[]>, and C<{}> have no special significance when used in
    place of C<//> as delimiters.

We could make an exception for the second part of s///, but certainly
for this case I think it's easy enough to write:

    .subst(/pattern/, { doit })

However, taken as a macro, s/// is a rather odd fish.  The right side
isn't just a string, but a deferred string, which implies that there
are always curlies there, much like the right side of implies deferred
evaluation.

: In a similar vein, I tend to write other perl5 substitutions using
: parentheses for the pattern so that I can use double-quotes for the
: substitution expression:
:
:    s(pattern) "expression"

Because the right side must be deferred, the .subst form of that would
be:

    .subst(/pattern/, {"expression"})

Otherwise, the double quotes interpolate too early.  That's getting a
little more cumbersome.

: This highlights to me the fact that the expression is _not_ a pattern,
: and uses a syntax more akin to interpolated strings than to patterns.
: The above bit about executables got me to thinking: _if_ perl6 is
: smart enough to recognize curly braces and automatically treat the
: second argument as an executable expression, would there be any
: benefit to letting perl6 apply customized quoting semantics to the
: second argument as well, based on the choice of delimiters?  e.g.,
: using single quotes would disable variable substitutions and the like
: (useful in cases where the substitution doesn't make use of the
: captures done by the pattern, if any).

Well, again, that's maybe just:

    .subst(/pattern/, {'expression'})

or even, since we don't need to delay evaluation:

    .subst(/pattern/, 'expression')

But it's possible that some syntactic relief of a dwimmy sort is in
order here.  One could view s[pattern] as a kind of metaprefix on the
following expression, sort of a self-contained unary .  I wonder how
often we'd have to explain why s/pattern/ expression doesn't do that,
though.  'Course, it's already like that in Perl 5.

Unlike in Perl 5, this approach would rule out things like:

    s[pattern] !foo!

which would instead have to be written:

    s[pattern] qq!foo!

As a unary lazy prefix, you could even just say

    s[pattern] doit();

Of course, then people will wonder why

    .subst(/pattern/, doit())

doesn't work.  Which makes me want to build it into the pattern
somewhere where there's already deferred evaluation that just happens
to be triggered at the right moment:

    /pattern {subst doit}/
    /pattern {subst ($0)}/
    /pattern {subst q:to'END'}/
        a new line
        END

We can give the user even more rope to shoot themselves in the dark
with:

    /pattern {$/ = doit}/
    /pattern {$0 = ($0)}/
    /pattern {$() = q:to'END'}/
        a new line
        END

The possibilities are endless...

Well, not quite.  One syntax we *can't* allow is /pattern/{ doit }
because that's already used to pull named captures out of the match
object.

Well, enough random braindump for now.

Larry
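Editor's sketch of why the right side has to be deferred, using Python's `re.sub` as a stand-in for `.subst`. The `subst` wrapper and `doit` below are hypothetical helpers written for this illustration; only `re.sub` and its acceptance of a callable replacement are real library behaviour. A replacement passed as a plain value is computed once, before any match exists, while a callable runs per match and can see the captures.

```python
import re


def subst(text, pattern, replacement):
    """Toy analogue of .subst: replacement is a string or a callable."""
    if callable(replacement):
        # Deferred: called once per match, with the match object.
        return re.sub(pattern, lambda m: str(replacement(m)), text)
    # Eager: the value was already computed by the caller; wrapping it in
    # a lambda keeps re.sub from treating backslashes in it as escapes.
    return re.sub(pattern, lambda m: replacement, text)


calls = []


def doit():
    calls.append(1)
    return "<%d>" % len(calls)


# Like .subst(/pattern/, doit()): doit() runs once, before matching.
eager = subst("a b c", r"\w", doit())

calls.clear()

# Like .subst(/pattern/, { doit }): doit() runs for every match.
deferred = subst("a b c", r"\w", lambda m: doit())

# Captures exist only at match time, so only the deferred form can use them.
swapped = subst("key=val", r"(\w+)=(\w+)",
                lambda m: m.group(2) + "=" + m.group(1))

print(eager, "|", deferred, "|", swapped)
```

The eager call produces the same text for every match, which is the "interpolates too early" problem described above. The deferred call corresponds to wrapping the expression in curlies.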
Re: S5: substitutions
Larry Wall wrote:
> Jonathan Lang wrote:
> : Translating this to perl 6, I'm hoping that perl6 is smart enough to let me
> : say:
> :
> :    s(pattern) { doit() }
>
> Well, the () are illegal without intervening whitespace because that
> makes s() a function call, but we'll leave that alone.

Thank you; I noticed this after I had sent it.

> Perl 5 let certain choose-your-own quotes introduce various kinds of
> odd semantics, and that was generally viewed as a mistake.  That is
> why S02 says:
>
>     For these "q" forms the choice of delimiters has no influence on
>     the semantics.  That is, C<''>, C<"">, C<< <> >>, C<«»>, C<``>,
>     C<()>, C<[]>, and C<{}> have no special significance when used in
>     place of C<//> as delimiters.
>
> We could make an exception for the second part of s///, but certainly
> for this case I think it's easy enough to write:
>
>     .subst(/pattern/, { doit })
>
> However, taken as a macro, s/// is a rather odd fish.  The right side
> isn't just a string, but a deferred string, which implies that there
> are always curlies there, much like the right side of implies deferred
> evaluation.

Perhaps quotes should be given the same "defer or evaluate as
appropriate to the context" capability that regexes and closures have?
That is, 'q (text)' is always a Quote object, which may be evaluated
immediately in certain contexts and be passed as an object in others.

As a first cut, consider using the same rule for this that regexes use:
in a value context (void, boolean, string, or numeric) or as an
explicit argument of ~~, a quote is immediately evaluated; otherwise,
it's passed as an object to be evaluated later.

The main downside I see to this is that there's no way to force one
approach or the other; a secondary issue has to do with the usefulness
of an unevaluated string: with regexes and closures, the unevaluated
versions are useful in part because they can be made to do different
things when evaluated, based on the circumstances: $x ~~ $regex will do
something different than $y ~~ $regex, and closures can potentially be
fed arguments that allow one closure to do many things.  A quote, OTOH,
isn't necessarily that flexible.

Or is it?  Is there benefit to extending the analogy all the way,
letting someone define a parameterized quote?

> But it's possible that some syntactic relief of a dwimmy sort is in
> order here.  One could view s[pattern] as a kind of metaprefix on the
> following expression, sort of a self-contained unary .  I wonder how
> often we'd have to explain why s/pattern/ expression doesn't do that,
> though.  'Course, it's already like that in Perl 5.

Probably not too often - although I _would_ recommend that you
emphasize the distinction between standard regex notation used
everywhere else and the extended regex notation used by s///.

I _do_ like the idea of reserving this behavior to situations where the
pattern delimiters are a matched set, letting you freely choose some
other delimiter for the expression.  In particular, I'm not terribly
fond of the idea of s'pattern'expression' applying single-quote
semantics to the expression.

> Unlike in Perl 5, this approach would rule out things like:
>
>     s[pattern] !foo!
>
> which would instead have to be written:
>
>     s[pattern] qq!foo!

Fine by me.  This would also let you easily apply quote modifiers to
the expression.

> As a unary lazy prefix, you could even just say
>
>     s[pattern] doit();
>
> Of course, then people will wonder why
>
>     .subst(/pattern/, doit())
>
> doesn't work.

Perhaps.  But people quickly learn that different approaches in perl
often have their own unique quirks; this would just be one more
example.

> Which makes me want to build it into the pattern somewhere where
> there's already deferred evaluation that just happens to be triggered
> at the right moment:
>
>     /pattern {subst doit}/
>     /pattern {subst ($0)}/
>     /pattern {subst q:to'END'}/
>         a new line
>         END
>
> We can give the user even more rope to shoot themselves in the dark
> with:
>
>     /pattern {$/ = doit}/
>     /pattern {$0 = ($0)}/
>     /pattern {$() = q:to'END'}/
>         a new line
>         END
>
> The possibilities are endless...

These aren't syntaxes that I'd want to use; but then, TIMTOWTDI.  The
main problem that I have with this approach is that it could interfere
with being able to use the venerable s/pattern/expression/ notation;
I'm looking to open up new possibilities, not to remove a perfectly
workable existing one.

> Well, not quite.  One syntax we *can't* allow is /pattern/{ doit }
> because that's already used to pull named captures out of the match
> object.

...which brings up another potential conflict with the s/// notation:
how _do_ you pull named captures out of the match object in s///?

--

On a related subject: it seems to me that the notion of extending the
pattern notation to include a replace clause is at the heart of the
issue here.  In addition to the above issues, it seems to be too much
of a one-trick pony as currently
Re: S5: substitutions
Larry Wall wrote:
> As a unary lazy prefix, you could even just say
>
>     s[pattern] doit();
>
> Of course, then people will wonder why
>
>     .subst(/pattern/, doit())
>
> doesn't work.

Another possibility: make it work.  Add a "delayed" parameter trait
that causes evaluation of that trait to be postponed until the first
time that the parameter actually gets used in the routine.  If it never
gets used, then it never gets evaluated.  I could see uses for this
outside of the narrow scope of implementing substitutions.

--
Jonathan Dataweaver Lang
Re: S5: substitutions
Jonathan Lang wrote:
> Another possibility: make it work.  Add a "delayed" parameter trait...

...although "lazy" might be a better name for it.  :)

--
Jonathan Dataweaver Lang
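Editor's sketch of what a "lazy" parameter could behave like, in Python since no such Perl 6 trait existed at the time of this thread. The `Lazy` class and `routine` are hypothetical illustrations. The caller wraps the argument expression in a thunk, the routine forces it on first use, and the result is cached so that later uses do not re-evaluate it.

```python
class Lazy:
    """Wrap an expression; evaluate it on first use and cache the result."""

    _UNSET = object()  # sentinel, so that None is a valid cached value

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Lazy._UNSET

    def force(self):
        if self._value is Lazy._UNSET:
            self._value = self._thunk()
        return self._value


log = []


def expensive():
    log.append("evaluated")
    return 42


def routine(flag, arg):
    # arg is forced only on the branch that actually uses it.
    if flag:
        return arg.force() + arg.force()
    return 0


unused_result = routine(False, Lazy(expensive))
evaluations_when_unused = len(log)

used_result = routine(True, Lazy(expensive))
evaluations_when_used = len(log)

print(unused_result, evaluations_when_unused,
      used_result, evaluations_when_used)
```

In the first call the parameter is never used, so `expensive` never runs. In the second it is used twice but evaluated once.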
Re: S5: substitutions
On Sat, Oct 07, 2006 at 07:49:48PM -0700, Jonathan Lang wrote:
: Another possibility: make it work.  Add a "delayed" parameter trait
: that causes evaluation of that trait to be postponed until the first
: time that the parameter actually gets used in the routine.  If it
: never gets used, then it never gets evaluated.  I could see uses for
: this outside of the narrow scope of implementing substitutions.

Tell me how you plan to do MMD on a value you don't have yet.

Larry