"lucene user",

Look at MemoryIndex (in Lucene contrib) for the "alert about new documents that 
match an interest" part of the problem.


Otis 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: lucene user <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Thursday, May 22, 2008 5:44:29 AM
> Subject: Handeling when a field does not exist in the document
> 
> We have a requirement to inform users on a regular basis of new material on
> which they have expressed interest. How are we to know what is "new" from
> the point of view of a particular user? Our idea is to tag each new item in
> some way (perhaps a date/time stamp in the lucene index indicating when the
> new document was indexed) and remember when the last time we sent out an
> alert to that user.
> How should we tag the documents? With a date/time of indexing stamp? An
> incrementing batch import ID number? Does it matter much?
> 
> *I am reminded that ranges of dates and numbers, (as well as wild cards) are
> evaluated as if they were a large OR query covering all the values that
> exist in the index. Lucene only finds exact matches - it does not do
> comparisons. This means that ranges with lots of different values in them
> are bad - and can actually crash with a 'too many clauses' exception if
> there are enough distinct values to push the number of clauses over 1024. Do
> I understand this correctly?*
> 
> *How do we handle existing documents which do not have such a new field
> associated with them? Can we provide a default value for the existing
> documents? *
> 
> I did not find the place in the Lucene Documentation where it explains what
> you get when you try to retrieve or search on a field that does not exist in
> the document. I remember it not being a problem, but I couldn't find it. How
> do I do this? What should I read?
> 
> Thanks!


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to