In a regular expression search, if [:space:] and [:digit:] only work
when followed by +, *, or {}, that's a bug.
[:space:] by itself, all alone, should match exactly one whitespace
character.
Square brackets enclose a "character class". A character class is a list
of "atoms". An atom is a matchable entity. The only thing the square
brackets do is allow any one of the atoms inside to match at a given
point in a string. The whole square bracket list is itself an atom
(note: just one atom!); it's just an atom that can match different
target characters at different positions in a target string.
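To make the "one bracket list, one atom" idea concrete, here is a minimal sketch using Python's re module. Python is only my choice for illustration; the search dialog discussed in this thread has its own syntax. The pattern and the sample string are made up.

```python
import re

# A bracket list is ONE atom: at each position it consumes exactly one
# character, which may be any of the characters listed inside it.
vowel = re.compile(r"[aeiou]")

sample = "regex atom"  # made-up sample text
hits = [(m.start(), m.group()) for m in vowel.finditer(sample)]

for position, char in hits:
    print(position, repr(char))
```

Every hit is a single character, even though the bracket list names five candidates.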
"+" and "*" are "quantifiers". A quantifier means "how many of".
So an atom (including a square bracket atom) plus a quantifier means
"how many of this". But an atom without any quantifier at all ought to
be interpreted as "one of this".
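Here is a rough sketch of the difference a quantifier makes, again in Python's re module. Note the assumption: Python does not use the [:space:] and [:digit:] spellings, so I substitute its own shorthands \s and \d for the same two classes. The sample text is invented.

```python
import re

text = "a  b\t42"  # two spaces, then a tab, then two digits

# No quantifier: each match is exactly one character.
single_whites = re.findall(r"\s", text)
single_digits = re.findall(r"\d", text)

# With a quantifier: each match is a whole run of such characters.
white_runs = re.findall(r"\s+", text)
digit_runs = re.findall(r"\d+", text)
digit_pairs = re.findall(r"\d{2}", text)

print(single_whites, white_runs)
print(single_digits, digit_runs, digit_pairs)
```

The unquantified patterns still match; they simply match one character at a time, which is the behaviour I would expect from [:space:] alone.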
I think maybe what Uwe meant was that it is not possible to search for
:space: by itself. This is (almost) true. :space: will not locate a
whitespace character in a target string.
(Without the square brackets, :space: is 7 atoms: a colon, a character
s, a character p, and so on, all in the exact order listed, and it will
match that literal 7-character sequence in a target string. But that was
not the original intent.)
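The literal reading of the unbracketed form can be checked with a small sketch in Python's re module (my choice of engine for illustration, with invented sample strings). None of the seven characters is special in a pattern, so the pattern matches only that exact text.

```python
import re

# Without brackets, :space: is seven ordinary characters in a row.
literal = re.compile(r":space:")

# Matches where the literal text ":space:" appears ...
found_literal = literal.search("type :space: here")

# ... but finds nothing in a string that merely contains blanks.
found_in_blanks = literal.search("only blanks   here")

print(found_literal.group(), found_literal.start())
print(found_in_blanks)
```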
But :space: inside square brackets is not a 7-character sequence. Square
brackets create a special context in which :space: is recognized as
shorthand for the list of whitespace characters of the current alphabet,
and the whole list becomes a single atom. As a valid atom, it may take a
quantifier, but it should *not* require one.
Another thing Uwe might have been trying to say is that it is not valid
to quantify a quantifier. This is completely true. It is not valid to
try to match +* for example. This would mean something like "any number
of at least one of", except that it's meaningless. A regular expression
can match a count of things, but it's not possible to just match a
count, much less a count of counts. To search for literal plus
characters, you must "escape" the + character with a backslash:
"\+" (but without the quotes). This strips off the quantifier meaning of
the plus character, reverting it temporarily to its literal alphabet
character meaning. Thus, "\+*" (without the quotes) means "zero or more
consecutive literal plus characters", "\++" (without the quotes) means
"one or more consecutive literal plus characters", and "\+\+" (without
the quotes) means "exactly two consecutive literal plus characters".
But [:space:] is not a quantifier, it's an atom, and it is legal, but
not required, to quantify an atom.
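The three escaped-plus patterns above, and the rejected quantifier-on-quantifier case, can be tried out with this sketch in Python's re module. The helper name whole() is hypothetical, made up for this example. Other engines may report the invalid pattern differently, so treat the error handling as specific to Python.

```python
import re

def whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# \+*  : zero or more literal plus characters
zero_or_more = [whole(r"\+*", s) for s in ("", "+", "+++")]
# \++  : one or more literal plus characters
one_or_more = [whole(r"\++", s) for s in ("", "+", "+++")]
# \+\+ : exactly two literal plus characters
exactly_two = [whole(r"\+\+", s) for s in ("+", "++", "+++")]

# A quantifier with nothing to quantify is rejected at compile time.
try:
    re.compile(r"+*")
    bare_quantifiers_ok = True
except re.error:
    bare_quantifiers_ok = False

print(zero_or_more, one_or_more, exactly_two, bare_quantifiers_ok)
```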
When I've written my own book on regular expressions, it will make all
of this crystal clear. ;)
Uwe Fischer wrote:
> Andrew Douglas Pitonyak wrote:
>> What else can I say besides "Does [:space:] work with regular
>> expressions?"
>> I can use regular expressions, but I can not make it find a space
>> using this syntax, which is documented. I also tested [:digit:], which
>> does not work for me. [0-9] works just fine, however. In other words,
>> I can use some regular expressions, just not all. I see no issues for
>> this. Depending on the answer, I will open an issue.
>> I am using 2.02 on Linux. I investigated this based on a question here:
>> http://www.oooforum.org/forum/viewtopic.phtml?p=154379#154379
> please use [:space:]+ or [:space:]* as search term.
> [:space:] by itself is a regular expression for "any white space" (look
> up Wiki or Google what a white space is).
> You cannot search for a regular expression by itself, in the same sense
> as you cannot search for something as "between 3 and 6 times". You
> always must give a parameter what you mean by using the regular expression.
> You can find all this in a very short list in Online Help. Be aware that
> a complete discussion of "regular expressions" can fill a book of 500
> pages, see amazon.com for that keyword.
> Regards
> Uwe