Hi all,
I have an indexing task which indexes thousands of records with Lucene 3.0.1.
What confuses me is that Lucene always creates a .cfx and a .cfs file in the file
system, sometimes more, while I expected a single .cfs file after
optimizing the index. Is this by design? If yes,
Index all into a directory and determine the size of all files in it.
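A minimal sketch of the "determine the size of all files" step, assuming the index lives in a plain, flat file-system directory. The class and method names (IndexSize, totalBytes) are made up for illustration and are not part of Lucene.

```java
import java.io.File;

class IndexSize {
    /**
     * Sums the lengths of the regular files directly inside dir.
     * Subdirectories are skipped, on the assumption that the index
     * directory is flat.
     */
    static long totalBytes(File dir) {
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null if dir is not a directory or cannot be read
            throw new IllegalArgumentException("Not a readable directory: " + dir);
        }
        long total = 0L;
        for (File f : files) {
            if (f.isFile()) {
                total += f.length();
            }
        }
        return total;
    }
}
```

Calling IndexSize.totalBytes(new File(indexPath)) after indexing has finished gives the on-disk size in bytes; indexPath here stands for whatever directory the index was written to.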
From http://lucene.apache.org/java/3_0_1/fileformats.html
Starting with Lucene 2.3, doc store files (stored field values and term
vectors) can be shared in a single set of files for more than one segment. When
compound file
You could tell the searching part of your app, via some notification
or messaging call. Or call IndexReader.isCurrent() from time to time,
or even on every search, and reopen() if necessary. See the javadocs
and don't forget to close the old reader when you do call reopen.
--
Ian.
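The check-and-reopen advice above can be sketched roughly as follows. To keep the sketch compilable without the Lucene jar, ReaderLike is a hypothetical stand-in for the three IndexReader calls mentioned (isCurrent(), reopen(), close()); ReaderHolder and FakeReader are likewise invented for illustration. In Lucene 3.0, reopen() returns the same instance when nothing has changed, which is why the identity check matters.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for the IndexReader calls the advice relies on. */
interface ReaderLike extends Closeable {
    boolean isCurrent() throws IOException;
    ReaderLike reopen() throws IOException;
}

/** Holds the reader used for searching and swaps it when the index has changed. */
class ReaderHolder {
    private ReaderLike reader;

    ReaderHolder(ReaderLike initial) {
        this.reader = initial;
    }

    /** Call from time to time, or before each search. */
    synchronized ReaderLike refreshIfStale() throws IOException {
        if (!reader.isCurrent()) {
            ReaderLike fresh = reader.reopen();
            if (fresh != reader) {
                reader.close();   // don't forget to close the old reader
                reader = fresh;
            }
        }
        return reader;
    }
}

/** In-memory fake, only there to exercise the holder. */
class FakeReader implements ReaderLike {
    static int latestVersion = 1; // bumped to simulate a commit by the indexer
    final int version;
    boolean closed = false;

    FakeReader(int version) {
        this.version = version;
    }

    public boolean isCurrent() {
        return version == latestVersion;
    }

    public ReaderLike reopen() {
        return isCurrent() ? this : new FakeReader(latestVersion);
    }

    public void close() {
        closed = true;
    }
}
```

One caveat with this simple version: it closes the old reader immediately, so an application with searches still running against that reader would need to delay the close until they finish.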
On Wed,
Lucene considers an index with a single .cfx and a single .cfs as optimized.
Also, note that how Lucene stores files in the index is an impl detail
-- it can change from release to release -- so relying on any of these
details is dangerous.
That said, with recent Lucene versions, if you really
On May 2, 2010, at 5:50 AM, Avi Rosenschein wrote:
On 4/30/10, Grant Ingersoll gsing...@apache.org wrote:
On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote:
Also, tuning the algorithms to the users can be very important. For
instance, we have found that in a basic search functionality,
Thanks, Peter.
Can you share what kind of evaluations you did to determine that the end user
believed the results were equally relevant? How formal was that process?
-Grant
On May 3, 2010, at 11:08 AM, Peter Keegan wrote:
We discovered very soon after going to production that Lucene's
The feedback came directly from customers and customer facing support folks.
Here is an example of a query with keywords: nurse, rn, nursing, hospital.
The top 2 hits have scores of 26.86348 and 26.407215. To the customer, both
results were equally relevant because all of their keywords were in
On Wed, May 5, 2010 at 5:08 PM, Grant Ingersoll gsing...@apache.org wrote:
On May 2, 2010, at 5:50 AM, Avi Rosenschein wrote:
On 4/30/10, Grant Ingersoll gsing...@apache.org wrote:
On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote:
Also, tuning the algorithms to the users can be very
Hi Robert,
thank you very much for your quick response. I have a couple of questions:
did you read the papers that I mentioned in my e-mail?
Do you think that Lucene's ranking function could have this problem?
My concern is not about how to implement different kinds of ranking
functions for Lucene,
2010/5/5 José Ramón Pérez Agüera jose.agu...@gmail.com
Hi Robert,
thank you very much for your quick response. I have a couple of questions:
did you read the papers that I mentioned in my e-mail?
Yes.
Do you think that Lucene's ranking function could have this problem?
I know it does.
Hi Robert,
the problem is not the linear combination of fields; the problem is
applying the boost factor per field after the term frequency saturation
function and then making the linear combination of fields. Every system
that implements BM25F, including Terrier, takes care of that, because if
you
2010/5/5 José Ramón Pérez Agüera jose.agu...@gmail.com
Hi Robert,
the problem is not the linear combination of fields; the problem is
applying the boost factor per field after the term frequency saturation
function and then making the linear combination of fields. Every system
that implements
Hi Robert,
I will be very happy to see this problem fixed :-) I cannot imagine
what reasons people have to use software with bugs; I guess that
other bugs in Lucene get removed. Anyway, if you are finally going to
fix the problem, this is good news :-) Thank you very much for your
time.
jose
2010/5/5 José Ramón Pérez Agüera jose.agu...@gmail.com:
[...]
The consequence is that a document
matching a single query term over several fields could score much
higher than a document matching several query terms in one field only,
One partial workaround that people use is
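The ordering issue discussed in this thread can be illustrated with a toy calculation. This is a sketch under stated assumptions, not Lucene's or Terrier's actual scoring: the saturation function tf / (tf + k1) with k1 = 1, the omission of IDF and length normalization, and all names (SaturationOrder and its methods) are chosen only for illustration.

```java
class SaturationOrder {
    static final double K1 = 1.0; // assumed saturation parameter

    /** Simple saturating function of term frequency; saturate(0) is 0. */
    static double saturate(double tf) {
        return tf / (tf + K1);
    }

    /**
     * Saturates each field's tf first, then applies the field boost and sums.
     * tf[t][f] is the frequency of query term t in field f.
     */
    static double boostAfterSaturation(double[][] tf, double[] boost) {
        double score = 0.0;
        for (double[] termTf : tf) {
            for (int f = 0; f < boost.length; f++) {
                score += boost[f] * saturate(termTf[f]);
            }
        }
        return score;
    }

    /**
     * BM25F-style order: combines the boosted tf across fields first,
     * then saturates once per query term.
     */
    static double boostBeforeSaturation(double[][] tf, double[] boost) {
        double score = 0.0;
        for (double[] termTf : tf) {
            double weighted = 0.0;
            for (int f = 0; f < boost.length; f++) {
                weighted += boost[f] * termTf[f];
            }
            score += saturate(weighted);
        }
        return score;
    }
}
```

Take two fields (title, body) with boosts {3, 1} and a two-term query. Document A contains only the first term, once in each field: tf = {{1, 1}, {0, 0}}. Document B contains both terms, once each, in the body only: tf = {{0, 1}, {0, 1}}. With the boost applied after saturation, A scores 2.0 and B scores 1.0, so the single-term document wins. With the boost applied before saturation, A scores about 0.8 and B scores 1.0, so the document matching both query terms ranks first.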
Thank you Mike.
Garry
- Original Message -
From: Michael McCandless luc...@mikemccandless.com
To: java-user@lucene.apache.org
Sent: Wednesday, May 05, 2010 8:24 PM
Subject: Re: How can I merge .cfx and .cfs into a single cfs file?
Lucene considers an index with a single .cfx and a
You may look at this:
private static IndexSearcher indexSearcher = null;

public synchronized IndexSearcher newIndexSearcher() {
    try {
        if (null == indexSearcher) {
            Directory directory = FSDirectory.open(
                    new File(Config.DB_DIR + "/rssindex"));
            indexSearcher = new