On Wed, May 10, 2006 at 05:58:57PM -0700, Allison Randal wrote:
> To summarize a phone call today, the more intelligent defaults we add to
> differently named rule keywords the more comfortable I am with having
> different names. So, here's what we have so far (posted both as an FYI
> and to confirm that we have the coherent solution I think we have):
> [...]
> skip:
> - We keep :words as shorthand for :skip(/<ws>/)
> - And :skip is shorthand for :skip(/<skip>/)
> [...]

Please, describe these with <?ws> and <?skip> to make clear their
non-capturing semantic. :-)

But Allison's message helps me to crystallize what has been bugging
me about the term ":skip" (and to a lesser extent ":words") in
describing what they do. So, I'll offer my thoughts here in case
anyone wants to pick it up before we go a-changing S05 yet again.
(If no one picks it up, I'll just wait for S05 to be updated to
whatever is decided and implement that. :-)

Whitespace in regexes and rules is metasyntactic, in that it is not
matched literally. Effectively, what the :w (or :words or :skip)
option does is change the metasyntactic meaning of any whitespace
found in the regex. Or, another way of thinking of it: as S05
currently stands, 'regex' and 'token' cause the pattern whitespace
to be treated as <?null>, while 'rule' causes the pattern whitespace
to become <?ws>.

So what we're really doing with this option, whatever we call it,
is specifying what the whitespace _in the pattern_ should match.
Somehow ":skip" and <?skip> don't carry that meaning for me.

In some sense it seems to me that the correct adverb is more along
the lines of :ws, :white, or :whitespace, in that it says what to do
with the whitespace in the pattern. It doesn't have to say anything
about whether the pattern's whitespace is actually matching \s*
(although the default rule for :ws/:white/:whitespace could certainly
provide that semantic).

I can fully see the argument that people will still confuse :ws and
<?ws> with "whitespace in the target", when in reality they specify
the meaning of whitespace in the regex pattern, so :ws might not be
the right choice for the adverb. But I think that something more
closely meaning "whitespace in the pattern means /this/" would be a
better adverb than :skip.

If someone *really* wants to use "skip", there's always
:ws(/<?skip>/) (or whatever we choose), which means "whitespace in
the regex matches <?skip>".
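
To make the "whitespace in the pattern means /this/" idea concrete,
here is a minimal sketch in Python rather than in S05 or PGE syntax.
Everything in it is an assumption made for illustration: the helper
name compile_with_ws, the constants NULL and WS, and the use of \s*
as a rough stand-in for <?ws> (the real <?ws> rule is more subtle).
It only shows that one option can decide what each run of whitespace
in the pattern is replaced with.

```python
import re

# Hypothetical stand-ins, not part of S05 or PGE:
NULL = ""      # like <?null>: pattern whitespace matches nothing
WS = r"\s*"    # rough approximation of <?ws>: optional whitespace


def compile_with_ws(pattern, ws=NULL):
    """Compile `pattern`, treating each run of whitespace in it as `ws`.

    Simplifications: leading and trailing pattern whitespace is dropped,
    and backslash-escaped spaces are not treated specially.
    """
    atoms = pattern.split()          # split on runs of pattern whitespace
    return re.compile(ws.join(atoms))


source = r"\d+ \+ \d+"

# 'regex'/'token'-like behaviour: pattern whitespace is ignored.
token_like = compile_with_ws(source, ws=NULL)

# 'rule'-like behaviour: pattern whitespace matches optional whitespace.
rule_like = compile_with_ws(source, ws=WS)

# A caller-supplied meaning, in the spirit of :ws(/<?skip>/):
# pattern whitespace also skips '#' comments that run to end of line.
skip_like = compile_with_ws(source, ws=r"(?:\s|#[^\n]*\n)*")

print(bool(token_like.fullmatch("1+2")))
print(bool(token_like.fullmatch("1 + 2")))
print(bool(rule_like.fullmatch("1 + 2")))
print(bool(skip_like.fullmatch("1 # left operand\n+ 2")))
```

The same pattern source is used in all three cases; only the meaning
assigned to the whitespace inside it changes, which is the distinction
an adverb name along the lines of :ws would be trying to convey.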

> - <sp> is a single character of obligatory whitespace

This one has bugged me since the day I first saw it implemented in
PGE. We _already_ have \s, <blank>, and <space> to represent the
notion of "a whitespace character". Do we really need a separate
<sp> form also?

(An idle thought: perhaps "sp" is better used as an :sp adverb and
a corresponding <?sp> regex?)

Pm
