Re: apo5 (was: Re: \x{123a 123b 123c})

Larry Wall Mon, 21 Nov 2005 09:28:25 -0800

On Mon, Nov 21, 2005 at 05:49:59PM +0100, Ruud H.G. van Tol wrote:
: Larry Wall:
: > Juerd:
: >> Ruud:
: 
: >>> Maybe
: >>>     "\x{123a 123b 123c}"
: >>> is a nice alternative of
: >>>     "\x{123a} \x{123b} \x{123c}".
: >>
: >> Hmm, very cute and friendly! Can we keep it, please? Please?
: 
: Thanks for the support.


Hey, this ain't exactly a popularity contest here...  :-)

: > We already have, from A5, \x[0a;0d], so you can supposedly say
: >     "\x[123a;123b;123c]"
: 
: <rereading apo5 />
: Found it in the old/new table on page 7. For me the semicolon is fine.

The fact that you say "page 7" leads me to guess that you're reading
it from perl.com.  That's going to be the most out-of-date version.
Better would be

    dev.perl.org        one day latency but html-ified
    svn.perl.org        up to the minute but only in pod

In particular, the Apocalypses have little [Update:] sections that are
supposed to alert you to things that have changed since the Apo
was written.  (Though some of those are a little out of date right now
too--I'm just working my way through A12 again.)

: I am using character names more and more, and between those, semicolons
: are less cluttery. Character names can contain spaces, but semicolons
: too? If not then
: \c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe
: better not, or more like
: \c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even
: \c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO').

None of the current names contain either semicolon or comma, so I expect
they're avoiding them by policy.
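
One way to check that claim against whatever Unicode database is at hand, sketched here in Python rather than Perl 6. The helper names names_containing and chars_from_names are made up for illustration, and the semicolon-separated list format only imitates the \c[...] idea discussed above; it is not an implementation of it.

```python
import sys
import unicodedata


def names_containing(chars):
    """Return every character name that contains any of the given characters."""
    offenders = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (controls, surrogates, unassigned) yield "".
        name = unicodedata.name(chr(cp), "")
        if any(c in name for c in chars):
            offenders.append(name)
    return offenders


def chars_from_names(spec):
    """Turn a semicolon-separated list of character names into a string."""
    return "".join(unicodedata.lookup(part.strip()) for part in spec.split(";"))
```

If names_containing(";,") comes back empty, splitting a name list on semicolons as chars_from_names does is unambiguous for the names in that database.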

: Something else:
: The '^' could be used for both the ultimate start- and end-of-string.
: This frees the '$'.

I think this is one of those aspects of regex culture that is too
entrenched to remove.  Besides, you have to be able to distinguish
s/^/foo/ from s/$/foo/.
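
A small illustration of that distinction, using Python's re module as a stand-in for the Perl substitutions (the variable names are arbitrary, and the input deliberately has no trailing newline):

```python
import re

line = "bar"

# Zero-width match at the start of the string: the text goes in front.
prefixed = re.sub(r"^", "foo", line)

# Zero-width match at the end of the string: the text is appended.
suffixed = re.sub(r"$", "foo", line)

print(prefixed)  # foobar
print(suffixed)  # barfoo
```

If a single symbol stood for both positions, these two substitutions could not be told apart.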

: There is still the '$$' that matches before embedded newlines, and since
: '^^' matches after those newlines, the '^^' and '$$' can only be unified
: to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or
: just '\n') there.

But then if you use it within a capture, you get an extra newline you
probably don't want.
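
A rough sketch of the problem, again in Python rather than Perl 6. Here Python's multi-line '$' plays the part of a zero-width '$$', and a literal '\n' plays the part of the proposed one-width '^^'; that mapping is an assumption made for illustration only.

```python
import re

text = "one\ntwo"

# Zero-width end-of-line assertion inside the capture: no newline captured.
zero_width = re.search(r"(\w+$)", text, re.M).group(1)

# One-width line boundary inside the capture: the newline comes along.
one_width = re.search(r"(\w+\n)", text).group(1)

print(repr(zero_width))  # 'one'
print(repr(one_width))   # 'one\n'
```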

: At start- and end-of-string the '^^' can still be a zero-width match.
: I am not sure about greedy (meaning to try one-width first) or
: non-greedy.
: 
: Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
: Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
: might be worth it.

I don't think it's any clearer.  In fact, I find all the ^'s there
are a little too visually confusing and contextual.

Larry
