Hi Otis,
On Sat, 13 Apr 2002, Otis Gospodnetic wrote:
> I assume you are right about the need to call close() on TokenStream.
> Have you encontered problems with the query parser before your changes?
> I'm asking because I haven't (yet).
Yes, because my Analyzers cache TokenStream instances that are
expensive to set up. Calling close is required for the shared instance
to be made available to the next user.
> It seems like the change is the addition of a few finally blocks and
> some formatting changes. Ah, talking about that - they make the patch
> longer and harder to check by just glancing at the code. I think the
> formatting changes are sometimes in order, but it's probably better not
> to mix them with code/functionality changes. I was once reprimanded
> for doing that. :)
Two things here: one, I specified a context level of 5 (cvs diff -c5)
so that the diffs would be both longer and clearer. I'm happy to
upload a shorter diff if that's more helpful.
Two, I made the formatting changes to address inconsistencies introduced
by someone else. That is, the person who wrote the (broken) code for
which my patch applies used a different coding style than is found in
the rest of the file (and in the rest of Lucene, for that matter).
> I think the change is fine, but since it's not backwards compatible
> (IOException) we should, I think, apply it only after the upcoming
> release.
My opinion here is that code which silently swallows exceptions is not
acceptable in a production release and shouldn't be allowed to ship.
As it stands, the QueryParser allows users to unwittingly assemble a
Query that is inconsistent with the input they provided. That is very
bad.
> Perhaps it would be good to get rid of TokenMgrException and wrap it in
> ParseException at that time as well. What do you think, people?
> Subsequently we could also apply Hungarian Peter's enhancement that
> allows one to speciify whether OR or AND should be used by default by
> the query parser.
TokenMgrException and ParseException are two different things. The
former reports a problem in the lexer that prevented it from reading a
valid token; the latter reports a problem encountered in the parser,
which is where the syntax of the query language is enforced.
To make a source code analogy: "return if break;" contains a set of
valid tokens for the Java language, but the syntax is invalid, making a
ParseException the right kind of error to raise. Conversely, "swAtch
(c) { }" contains characters which cannot be recognized by the lexer as
a legal token, so a TokenMgrException is appropriate.
I would strongly urge against blurring the distinction between these
two classes of error, as they really are not the same thing.
Eric
--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>