Andrzej Bialecki wrote:
Tiago Silveira wrote:
IMHO, using "cat cat?" or even "cat cat? cat??" is so simple that it
doesn't justify keeping the old, undocumented, arguably incorrect behavior.
I have a different view on this issue - IMHO treating "?" as "exactly
one character" is counterintuitive for people familiar with the use of
wildcards: in all popular regular expression languages, and also in
DTD/XML world, a single "?" metacharacter means "zero or one", which
is probably why the original behavior was introduced (or at least it
was more compatible with the use of "?" in other contexts).
There are two distinctly different traditions for ?, *, and +. One is
globbing (standard in UNIX shells) and the other is regular expressions.
In globbing, ? has always stood for exactly one character, * stands for
zero or more characters, and + is not defined. In regular expressions,
these modify the preceding expression to mean 0 or 1; 0 or more; and
1 or more.
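
To make the difference concrete, here is a minimal sketch in Python (my
choice of language for illustration, not anything Lucene itself uses). It
relies on the standard fnmatch and re modules; the helper names glob_match
and regex_match and the word list are made up for this example.

```python
import fnmatch
import re


def glob_match(pattern, word):
    # Shell-style globbing: ? is exactly one character, * is zero or more.
    return fnmatch.fnmatchcase(word, pattern)


def regex_match(pattern, word):
    # Regex: ? and * modify the preceding element (here, the letter "t").
    return re.fullmatch(pattern, word) is not None


words = ["ca", "cat", "cats", "catty"]

glob_q = [w for w in words if glob_match("cat?", w)]
regex_q = [w for w in words if regex_match("cat?", w)]
glob_star = [w for w in words if glob_match("cat*", w)]
regex_star = [w for w in words if regex_match("cat*", w)]

print("glob  cat? :", glob_q)      # ['cats']
print("regex cat? :", regex_q)     # ['ca', 'cat']
print("glob  cat* :", glob_star)   # ['cat', 'cats', 'catty']
print("regex cat* :", regex_star)  # ['ca', 'cat']
```

The same pattern text picks out different words under each tradition, which
is exactly where the confusion comes from.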
Lucene seems to support globbing (trailing) and not regex. To me this is
clear in the documentation.
That said, a search seems to be a kind of regex, and blending these two
traditions leads to confusion. The first time I tried Lucene to do a
search, I used these metacharacters as if they were regex modifiers,
not globbing characters. (Natural behavior of a Perl programmer!) It did
not work as expected. This led me to read the docs, and then I understood
the error of my ways.
Personally, I don't want an either/or. I want a both/and. Modern unix
shells provide both/and, albeit with different syntax.
I see this more as a feature request than an argument as to the
usefulness or properness of either. Both are useful. Both are proper.
Both are intuitive. Both are counterintuitive. It all depends on your
"tradition".
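
As a rough sketch of what both/and could look like (again in Python, and
purely hypothetical: the /.../ delimiter convention and the term_matches
name are my own assumptions, not existing or proposed Lucene syntax), a
term wrapped in slashes is treated as a regex and anything else as a glob:

```python
import fnmatch
import re


def term_matches(term, word):
    # Hypothetical convention: /.../ marks a regex, anything else is a glob.
    if len(term) >= 2 and term.startswith("/") and term.endswith("/"):
        return re.fullmatch(term[1:-1], word) is not None
    return fnmatch.fnmatchcase(word, term)


print(term_matches("cat?", "cats"))    # glob: ? is exactly one character
print(term_matches("/cat?/", "cat"))   # regex: ? makes the "t" optional
```

The point is only that distinct syntax lets each tradition keep its own
meaning for ? without ambiguity.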