Andrzej Bialecki wrote:
Tiago Silveira wrote:
IMHO, using "cat cat?" or even "cat cat? cat??" is so simple that it doesn't
justify keeping the old, undocumented, arguably incorrect behavior.

I have a different view on this issue - IMHO treating "?" as "exactly one character" is counterintuitive for people familiar with the use of wildcards: in all popular regular expression languages, and also in the DTD/XML world, a single "?" metacharacter means "zero or one", which is probably why the original behavior was introduced (or at least it was more compatible with the use of "?" in other contexts).

There are two distinctly different traditions for ?, *, and +. One is globbing (standard in UNIX shells) and the other is regular expressions. In globbing, ? has always stood for exactly one character, * stands for zero or more characters, and + is not defined. In regular expressions, these modify the preceding expression to mean 0 or 1; 0 or more; and 1 or more.
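To make the difference concrete, here is a small sketch that runs the same two patterns through both traditions. It uses Python's standard fnmatch and re modules purely as stand-ins for "glob" and "regex"; it is not Lucene code, and the word list is made up for illustration.

```python
import fnmatch
import re

# Made-up terms, for illustration only.
words = ["ca", "cat", "cats", "catt", "catty"]

# Globbing: "?" is exactly one character, "*" is zero or more characters.
glob_q = [w for w in words if fnmatch.fnmatchcase(w, "cat?")]
glob_star = [w for w in words if fnmatch.fnmatchcase(w, "cat*")]

# Regex: "?" and "*" modify the preceding item (here the letter "t"),
# meaning "zero or one t" and "zero or more t" respectively.
regex_q = [w for w in words if re.fullmatch(r"cat?", w)]
regex_star = [w for w in words if re.fullmatch(r"cat*", w)]

print("glob  cat? ->", glob_q)      # ['cats', 'catt']
print("glob  cat* ->", glob_star)   # ['cat', 'cats', 'catt', 'catty']
print("regex cat? ->", regex_q)     # ['ca', 'cat']
print("regex cat* ->", regex_star)  # ['ca', 'cat', 'catt']
```

Note that the glob "cat?" does not match "cat" itself, which is why the "cat cat?" query mentioned earlier in the thread is needed to cover both cases.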

Lucene seems to support globbing (trailing) and not regex. To me this is clear in the documentation.

That said, a search seems to be a kind of regex, and blending these two traditions leads to confusion. The first time I tried a search in Lucene, I used these metacharacters as if they were regex modifiers, not globbing characters. (Natural behavior for a Perl programmer!) It did not work as expected. This led me to read the docs, and then I understood the error of my ways.

Personally, I don't want an either/or. I want a both/and. Modern Unix shells provide both/and, albeit with different syntax.
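As a rough idea of what both/and could look like, here is a sketch in which the caller states which tradition a pattern belongs to and both are compiled down to one matcher. The function names and the "style" argument are hypothetical, and this says nothing about how Lucene's query parser works or should work; it only relies on Python's fnmatch.translate, which converts a glob into an equivalent regex.

```python
import fnmatch
import re

def compile_pattern(pattern, style):
    """Compile a pattern in the named tradition (hypothetical helper)."""
    if style == "glob":
        # fnmatch.translate turns a shell-style glob into a regex string.
        return re.compile(fnmatch.translate(pattern))
    if style == "regex":
        return re.compile(pattern)
    raise ValueError("unknown pattern style: " + style)

def matches(pattern, style, term):
    """Return True if the whole term matches the pattern."""
    return compile_pattern(pattern, style).fullmatch(term) is not None

# Same pattern text, two traditions, two different answers.
print(matches("cat?", "glob", "cat"))    # False: glob "?" needs one more character
print(matches("cat?", "glob", "cats"))   # True
print(matches("cat?", "regex", "cat"))   # True: the "t" is optional
print(matches("cat?", "regex", "cats"))  # False
```

The point is only that the two syntaxes need not compete: once the tradition is stated explicitly, the same pattern text is unambiguous.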

I see this more as a feature request than an argument as to the usefulness or properness of either. Both are useful. Both are proper. Both are intuitive. Both are counterintuitive. It all depends on your "tradition".


