I often want to clean up the residue you sometimes get when copying
text from a web page -- bits of Unicode, or special characters, like
the "\?B0" my screen uses to show the "degrees" symbols in this line:
        Start Totality  01:33 pm        67.2°   178.0°

I think that long, long ago, I could find those characters using \P,
but that was before vile's shorthand search notations were brought
into line with the X/Open classes.  With that change, the TAB character
lost its "printable" status, so \P finds tabs as well as true non-printables.

What I think I want is a shorthand for [:ascii:] (meaning "8th bit clear").
Is this available in some way that I'm missing?

Would it be possible to add this, perhaps bound to \y or \z?  Even if
it weren't bound to a shorthand, if [[:ascii:]] were available as
part of a search string, that would be useful enough.
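
To make the difference concrete, here is a minimal sketch in Python (not
vile) of the two searches.  The patterns NON_ASCII and NON_PRINT and the
helper find_offsets() are hypothetical stand-ins written as explicit
ranges; they are not vile's definitions of its classes, and the sample
line is only modeled on the one above.

```python
import re

# Hypothetical stand-ins for the classes discussed above, written as
# explicit ranges.  These are assumptions, not vile's own definitions.
NON_ASCII = re.compile(r'[^\x00-\x7f]')   # "8th bit set"
NON_PRINT = re.compile(r'[^\x20-\x7e]')   # roughly "not printable" within ASCII

def find_offsets(pattern, text):
    """Return (offset, character) pairs for every match of pattern in text."""
    return [(m.start(), m.group()) for m in pattern.finditer(text)]

# A sample line with tabs and two degree signs (U+00B0).
line = "\tStart Totality  01:33 pm\t67.2\u00b0   178.0\u00b0"

non_ascii_hits = find_offsets(NON_ASCII, line)   # only the degree signs
non_print_hits = find_offsets(NON_PRINT, line)   # the tabs as well
```

The "not printable" search stops on every tab, while the "not ascii"
search stops only on the characters I actually want to clean up.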

(Oddly, if I search for "[[:ascii:]]" today, it finds instances of ":]".
Not sure why.)
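
One possible explanation, which I have not checked against vile's regexp
code: if the class name isn't recognized, "[[:ascii:]" may be read as an
ordinary bracket set containing the characters "[", ":", "a", "s", "c"
and "i", followed by a literal "]".  The Python sketch below spells out
that assumed reading explicitly (the names FALLBACK and first_match are
made up for the illustration) and shows that it would match ":]".

```python
import re

# Assumption: "[[:ascii:]]" gets parsed as the set of characters
# "[", ":", "a", "s", "c", "i", followed by a literal "]".
FALLBACK = re.compile(r'[\[:asci]\]')

def first_match(text):
    """Return the first substring matched under the assumed parse, or None."""
    m = FALLBACK.search(text)
    return m.group() if m else None

result_pattern = first_match("[[:ascii:]]")   # the pattern text itself
result_colon = first_match("x:]")             # any ":]" in a buffer
result_plain = first_match("plain text")      # no "]" at all
```

Under that reading, any of those six characters followed by "]" would
match, which is consistent with the ":]" hits.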


Current classes and shorthands:
   \i \I  [:alnum:]
   \a \A  [:alpha:]
   \b \B  [:blank:]
   \c \C  [:cntrl:]
   \d \D  [:digit:]
   \f \F  [:file:]
   \g \G  [:graph:]
   \w \W  [:ident:], alphanumeric (plus '_')
   \l \L  [:lower:]
   \o \O  [:octal:]
   \p \P  [:print:], printable (note that space is printable)
   \q \Q  [:punct:]
   \s \S  [:space:]
   \u \U  [:upper:]
   \x \X  [:xdigit:]


=----------------------
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 40.1 degrees)
