Re: about my proposed patch for bug 7782

Eric D. Friedman Sat, 13 Apr 2002 19:05:45 -0700

Hi Otis,

On Sat, 13 Apr 2002, Otis Gospodnetic wrote:


> I assume you are right about the need to call close() on TokenStream.
> Have you encontered problems with the query parser before your changes?
>  I'm asking because I haven't (yet).

Yes, because my Analyzers cache TokenStream instances that are
expensive to set up.  Calling close is required for the shared instance
to be made available to the next user.

> It seems like the change is the addition of a few finally blocks and
> some formatting changes.  Ah, talking about that - they make the patch
> longer and harder to check by just glancing at the code.  I think the
> formatting changes are sometimes in order, but it's probably better not
> to mix them with code/functionality changes.  I was once reprimanded
> for doing that. :)

Two things here:  one, I specified a context level of 5 (cvs diff -c5)
so that the diffs would be both longer and clearer.  I'm happy to
upload a shorter diff if that's more helpful.

Two, I made the formatting changes to address inconsistencies introduced
by someone else.  That is, the person who wrote the (broken) code for
which my patch applies used a different coding style than is found in
the rest of the file (and in the rest of Lucene, for that matter).

> I think the change is fine, but since it's not backwards compatible
> (IOException) we should, I think, apply it only after the upcoming
> release.

My opinion here is that code which silently swallows exceptions is not
acceptable in a production release and shouldn't be allowed to ship.
As it stands, the QueryParser allows users to unwittingly assemble a
Query that is inconsistent with the input they provided.  That is very
bad.

> Perhaps it would be good to get rid of TokenMgrException and wrap it in
> ParseException at that time as well.  What do you think, people?
> Subsequently we could also apply Hungarian Peter's enhancement that
> allows one to speciify whether OR or AND should be used by default by
> the query parser.

TokenMgrException and ParseException are two different things.  The
former reports a problem in the lexer that prevented it from reading a
valid token; the latter reports a problem encountered in the parser,
which is where the syntax of the query language is enforced.

To make a source code analogy:  "return if break;" contains a set of
valid tokens for the Java language, but the syntax is invalid, making a
ParseException the right kind of error to raise.  Conversely, "swAtch
(c) { }" contains characters which cannot be recognized by the lexer as
a legal token, so a TokenMgrException is appropriate.

I would strongly urge against blurring the distinction between these
two classes of error, as they really are not the same thing.

Eric


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Re: about my proposed patch for bug 7782

Reply via email to