[Answering my own question]
I think a reasonable solution is a generic query-time analyzer that
wraps my application's choice of analyzer and automatically filters
out whatever it identifies as stop words. It would initialize itself
from an IndexReader and create a StopFilter for those terms whose
document frequency exceeds a given threshold.
This approach seems reasonable because:
a) The stop word filter is automatically adaptive and doesn't need
manual tuning.
b) I can live with the disk space overhead of the few "killer" terms
which will make it into the index.
c) "Silent" failure (i.e. removal of terms from the query) is probably
generally preferable to the throw-an-exception approach taken by
BooleanQuery if clause limits are exceeded.
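
To make the idea concrete, here is a rough sketch of the thresholding
logic in plain Java. It deliberately makes no Lucene calls: the
docFreqs map stands in for the term/document-frequency pairs that
would be read from the IndexReader, and the names AutoStopWords,
maxDocFreq and filter are hypothetical, chosen only for illustration.
In the real thing the resulting stop set would be handed to a
StopFilter inside the wrapping analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: derives a stop word set
// from document frequencies and strips those words from query terms.
final class AutoStopWords {
    private final Set<String> stopWords = new HashSet<>();

    // docFreqs: term -> number of documents containing it (assumed to
    // have been collected from the index beforehand).
    // maxDocFreq: terms with a strictly greater frequency become stop words.
    AutoStopWords(Map<String, Integer> docFreqs, int maxDocFreq) {
        for (Map.Entry<String, Integer> e : docFreqs.entrySet()) {
            if (e.getValue() > maxDocFreq) {
                stopWords.add(e.getKey());
            }
        }
    }

    // Returns the query terms with the derived stop words silently
    // removed, preserving the original order.
    List<String> filter(List<String> queryTerms) {
        List<String> kept = new ArrayList<>();
        for (String term : queryTerms) {
            if (!stopWords.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    // Read-only view of the derived stop words.
    Set<String> stopWords() {
        return Collections.unmodifiableSet(stopWords);
    }
}
```

Expressing the threshold as a fraction of the total document count
rather than an absolute number would probably be more portable across
indexes of different sizes, but that is a design choice I have not
settled on.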