BufferingAnalyzer (or something like that)

2007-11-07 Thread Grant Ingersoll
From time to time, I have run across analysis problems where I want to only analyze a particular field once, but I also want to "pluck" certain tokens (one or more) out of the stream and then use them as the basis for another field. For example, say I have a token filter that can identify

[jira] Commented: (LUCENE-693) ConjunctionScorer - more tuneup

2007-11-07 Thread Mike Klaas (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540913 ] Mike Klaas commented on LUCENE-693: Paul wrote: > As just discussed on java-dev, the creation of an object during

[jira] Resolved: (LUCENE-1036) Unreleased 2.3 version of IndexWriter.optimize() consistly throws java.lang.IllegalArgumentException out-of-the-box

2007-11-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1036. Resolution: Fixed > Unreleased 2.3 version of IndexWriter.optimize() consistly th

[jira] Commented: (LUCENE-1036) Unreleased 2.3 version of IndexWriter.optimize() consistly throws java.lang.IllegalArgumentException out-of-the-box

2007-11-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540899 ] Michael McCandless commented on LUCENE-1036: Woops, sorry, I somehow missed your posts here. OK that is

[jira] Commented: (LUCENE-1036) Unreleased 2.3 version of IndexWriter.optimize() consistly throws java.lang.IllegalArgumentException out-of-the-box

2007-11-07 Thread R Giles (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540891 ] R Giles commented on LUCENE-1036: Michael, I have not seen any response to my question. I chang

Re: Term pollution from binary data

2007-11-07 Thread Doug Cutting
Chuck Williams wrote: It appears that termIndexInterval is factored into the stored index and thus cannot be changed dynamically to work around the problem after an index has become polluted. Other than identifying the documents containing binary data, deleting them, and then optimizing the wh

Re: FuzzyQuery using termDocs() to reduce count of Boolean Queries

2007-11-07 Thread Timo Nentwig
On Wednesday 07 November 2007 10:51:32 Timo Nentwig wrote: > Hi! > > I asked this one already on the user mailing list but maybe it's more > appropriate here: > > As a simple example imagine every document in your index to have a > field "language" and "country". A tuple of language+country is what

FuzzyQuery using termDocs() to reduce count of Boolean Queries

2007-11-07 Thread Timo Nentwig
Hi! I asked this one already on the user mailing list but maybe it's more appropriate here: As a simple example imagine every document in your index to have a field "language" and "country". A tuple of language+country is what I call a context. You want to search context-specific, i.e. langua