How to use Sort class in Lucene 2.4.0?

2008-11-05 Thread 장용석
hi. I have a question :) In Lucene 2.3.x I used the Sort class like this: Sort sort = new Sort("FIELDNAME", true); Hits hits = searcher.search(query, sort); but in Lucene 2.4.0 the search(Query, Sort) method is deprecated. I searched the API and found this method: search(query, filter, n, sort)

Re: Performance of never optimizing

2008-11-05 Thread Tomer Gabel
Justus Pendleton-2 wrote: > > 1. Why does the merge factor of 4 appear to be faster than the merge > factor of 2? > > 2. Why does non-optimized searching appear to be faster than optimized > searching once the index hits ~500,000 documents? > > 3. There appears to be a fairly sizable perfo

Re: Performance of never optimizing

2008-11-05 Thread Yonik Seeley
On Wed, Nov 5, 2008 at 9:47 AM, Tomer Gabel <[EMAIL PROTECTED]> wrote: > 1. Higher merge factor => more segments. Right, and it's also important to note that it's only "on average" more segments. The number of segments goes up and down with merging, so at particular points in time, an index with a h

Re: No segment files found/ Searcher error

2008-11-05 Thread JulieSoko
Yesterday, when typing a reply, I noticed an error: the CheckIndex.check was not checking a good directory, so the "segments not found" error was not correct. I apologize. I made the correction, and this is the correct error that occurs when an I/O error is thrown. I ran the CheckIndex.check me
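For reference, a minimal sketch of running the index checker the way this thread describes, in Java. The index path is a placeholder, and the static CheckIndex.check(Directory, boolean doFix) form is an assumption based on the method name mentioned above; the same tool can also be run from the command line as java org.apache.lucene.index.CheckIndex <indexDir>.

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class CheckIndexSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder path: point this at the same directory the searcher opens.
        Directory dir = FSDirectory.getDirectory("/path/to/index");
        // doFix = false (assumed 2.4 signature): report problems only, never drop segments.
        CheckIndex.check(dir, false);
        dir.close();
    }
}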

Re: Performance of never optimizing

2008-11-05 Thread Michael McCandless
Tomer Gabel wrote: Since you're using an 8-core Mac Pro I also assume you have some sort of RAID setup, which means your storage subsystem can physically handle more than one concurrent request, which can only come into play with multiple segments. This is an important point: a multi-seg

Re: Performance of never optimizing

2008-11-05 Thread Michael McCandless
Justus Pendleton wrote: On 05/11/2008, at 4:36 AM, Michael McCandless wrote: If possible, you should try to use a larger corpus (eg Wikipedia) rather than multiply Reuters by N, which creates unnatural term frequency distribution. I'll replicate the tests with the wikipedia corpus over th

Re: Performance of never optimizing

2008-11-05 Thread Michael McCandless
Otis Gospodnetic wrote: Our current default behaviour is a merge factor of 4. We perform an optimization on the index every 4000 additions. We also perform an optimize at midnight. I wouldn't optimize every 4000 additions - you are killing IO, rewriting the whole index, while trying
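To make the settings under discussion concrete, a minimal sketch using the 2.4 IndexWriter API: merge factor 4 as described above, with optimize() left to the nightly job rather than every 4000 additions. The index path and analyzer are placeholders.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class MergeSettingsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder path and analyzer.
        IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory("/path/to/index"),
                new StandardAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);

        // Merge factor 4, as in the setup described above (the default is 10).
        writer.setMergeFactor(4);

        // ... add / update documents here ...

        // Per the advice above, skip the per-4000-additions optimize; at most run
        // writer.optimize() from a quiet-period job such as the midnight task.

        writer.close();
    }
}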

Re: How to use Sort class in Lucene 2.4.0?

2008-11-05 Thread Todd Benge
I think it's like this: Sort sort = new Sort("FIELDNAME", true); TopFieldDocs docs = searcher.search(query, null, n, sort); // n is the number of documents you want to retrieve ScoreDoc[] hits = docs.scoreDocs; for (int i = 0; i: > hi. > I have a question :) > > In lucene 2.3.X I did use Sort

Re: No segment files found/ Searcher error

2008-11-05 Thread Michael McCandless
If you are able to leave the IndexSearcher open and resubmit the query, and it most likely works the 2nd time around, then it really seems like something intermittent is going wrong with your IO system. That exception from CheckIndex is just happening when Lucene is trying to read bytes from

Re: How to use Sort class in Lucene 2.4.0?

2008-11-05 Thread Grant Ingersoll
On Nov 5, 2008, at 12:04 PM, Todd Benge wrote: I think it's like this: Sort sort = new Sort("FIELDNAME", true); TopFieldDocs docs = searcher.search(query, null, n, sort); // n is the number of documents you want to retrieve ScoreDoc[] hits = docs.scoreDocs; for (int i = 0; i FYI: totalHi
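Putting the corrected snippet together, a minimal self-contained sketch of sorted search against the 2.4 API. The index path, query field, and n = 10 are illustrative, and FIELDNAME must be indexed for sorting (and stored if you want to print it back).

import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopFieldDocs;
import org.apache.lucene.store.FSDirectory;

public class SortedSearchSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder index location and query.
        IndexSearcher searcher = new IndexSearcher(FSDirectory.getDirectory("/path/to/index"));
        Query query = new TermQuery(new Term("contents", "lucene"));

        // Sort descending on FIELDNAME, as in the original 2.3 code.
        Sort sort = new Sort("FIELDNAME", true);

        int n = 10; // number of top documents to retrieve
        TopFieldDocs docs = searcher.search(query, null, n, sort);

        System.out.println("totalHits = " + docs.totalHits);
        ScoreDoc[] hits = docs.scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            Document doc = searcher.doc(hits[i].doc);
            System.out.println(doc.get("FIELDNAME")); // assumes the field is stored
        }
        searcher.close();
    }
}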

Re: How to use Sort class in Lucene 2.4.0?

2008-11-05 Thread Todd Benge
Yup - goofed that up. Thanks, Todd 2008/11/5 Grant Ingersoll <[EMAIL PROTECTED]>: > > On Nov 5, 2008, at 12:04 PM, Todd Benge wrote: > >> I think it's like this: >> >> Sort sort = new Sort("FIELDNAME", true); >> TopFieldDocs docs = searcher.search(query, null, n, sort); // n is >> the number

memory leak getting docs

2008-11-05 Thread Marc Sturlese
Hey there, I have posted about this problem before but I think I didn't explain myself very well. I'll try to explain my problem in context: I get ids from a database and I look for the documents in an index that correspond to each id. There is just one match for every id. Once I have the doc

Re: memory leak getting docs

2008-11-05 Thread bruno da silva
Hello Marc, I'd suggest you create the IndexSearcher outside of your method and pass the IndexReader as a parameter, like: private Document getDocumentData(IndexReader reader, String id). You don't have a memory leak; you have intensive use of memory. On Wed, Nov 5, 2008 at 3:11 PM, Marc S
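A minimal sketch of that shape: the reader is opened once and passed in, and a lightweight IndexSearcher wraps it per lookup. The index path, the untokenized "id" field, and the stand-in id list are assumptions about the setup described above.

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class DocByIdSketch {

    // Per the suggestion above: the caller opens the reader once and reuses it.
    private static Document getDocumentData(IndexReader reader, String id) throws Exception {
        IndexSearcher searcher = new IndexSearcher(reader); // cheap wrapper over an open reader
        // Assumes each document stores its database id in an untokenized "id" field.
        TopDocs top = searcher.search(new TermQuery(new Term("id", id)), null, 1);
        if (top.totalHits == 0) {
            return null;
        }
        return searcher.doc(top.scoreDocs[0].doc);
    }

    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(FSDirectory.getDirectory("/path/to/index"));
        String[] idsFromDatabase = { "1", "2", "3" }; // stand-in for ids fetched from the database
        for (int i = 0; i < idsFromDatabase.length; i++) {
            Document doc = getDocumentData(reader, idsFromDatabase[i]);
            System.out.println(idsFromDatabase[i] + " -> " + (doc == null ? "not found" : "found"));
        }
        reader.close();
    }
}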

Re: memory leak getting docs

2008-11-05 Thread Erick Erickson
That's also why your app runs so slowly: opening an IndexReader is a very expensive operation, and doing it for every doc is exceedingly bad... Best Erick On Wed, Nov 5, 2008 at 3:21 PM, bruno da silva <[EMAIL PROTECTED]> wrote: > Hello Marc > I'd suggest you create the IndexSearcher outside of your

Re: Performance of never optimizing

2008-11-05 Thread Paul Smith
I don't believe our large users have enough memory for Lucene indexes to fit in RAM. (Especially given we use quite a bit of RAM for other stuff.) I think we also close readers pretty frequently (whenever any user updates a JIRA issue, which I am assuming happens nearly constantly

Re: Can lucene search from multi-index directory like using FK in database?

2008-11-05 Thread Ulrich Vachon
Hello, Erik is right, Lucene is not an RDBMS. If you want to mix both worlds, you can use dedicated tools like Hibernate Search or Compass. Ulrich VACHON Original message From: Clay Zhong [mailto:[EMAIL PROTECTED] Date: Tue 04/11/2008 13:31 To: java-user@lucene.apache.org Subject: Ca

Re: How to use Sort class in Lucene 2.4.0?

2008-11-05 Thread 장용석
Thank you very much. It's really useful sample code for me. :) Thanks. Jang. On Nov 6, 2008, Todd Benge <[EMAIL PROTECTED]> wrote: > > Yup - goofed that up. > > Thanks, > > Todd > > 2008/11/5 Grant Ingersoll <[EMAIL PROTECTED]>: > > > > On Nov 5, 2008, at 12:04 PM, Todd Benge wrote: > > > >> I think

Re: Benchmarking my indexer

2008-11-05 Thread Rafael Cunha de Almeida
On Sun, 2 Nov 2008 21:06:56 -0200 Rafael Cunha de Almeida <[EMAIL PROTECTED]> wrote: > On Sun, 2 Nov 2008 07:11:20 -0500 > Grant Ingersoll <[EMAIL PROTECTED]> wrote: > > > > > On Nov 1, 2008, at 1:39 AM, Rafael Cunha de Almeida wrote: > > > > > Hello, > > > > > > I did an indexer that parses so

Re: Can lucene search from multi-index directory like using FK in database?

2008-11-05 Thread Chris Lu
These are common problems. In general, mapping database tables into Lucene Documents is not always good for performance. You may need to flatten objects to fit Lucene's model. Here are my answers, respectively. 1. You have two alternatives here: 1) create an index to contain both User and Fi
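A minimal sketch of the flattening idea: one Lucene Document per child row with the joined parent columns copied in, so a single query can filter on both sides without a join. The field names and values are hypothetical, since the actual schema is truncated in the message above.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class FlattenedDocSketch {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory("/path/to/index"),
                new StandardAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);

        // Hypothetical field names: the "parent" (user) columns are repeated on every
        // "child" document instead of being looked up through a foreign key.
        Document doc = new Document();
        doc.add(new Field("user_id", "42", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("user_name", "alice", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("item_title", "quarterly report", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);

        writer.close();
    }
}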