Re: fadvise/madvise during segment-merges....

2014-05-21 Thread Michael McCandless
On Wed, May 21, 2014 at 10:50 AM, Ravikumar Govindarajan wrote:
>> But does that mean SEQUENTIAL will evict the
>> page once we're done reading it?
>
> Yes, looks like it does evict the pages once read completes...

That's great news, we need to re-test it.

> Let me elaborate a bit more

Re: Multi-thread indexing, should the commit be called from each thread?

2014-05-21 Thread Jack Krupansky
(Was this supposed to be a java-user/Lucene question or a Solr question?!)

-- Jack Krupansky

-----Original Message-----
From: Erick Erickson
Sent: Wednesday, May 21, 2014 10:58 AM
To: java-user
Subject: Re: Multi-thread indexing, should the commit be called from each thread?

I'll be more em

Re: Multi-thread indexing, should the commit be called from each thread?

2014-05-21 Thread Erick Erickson
I'll be more emphatic than Shai; you should _definitely_ not commit from each thread, especially if you are doing a hard commit with openSearcher=true or a soft commit. In either case you open a new searcher which fires all your autowarming queries which.. IOW they're expensive operations. More t

Re: fadvise/madvise during segment-merges....

2014-05-21 Thread Ravikumar Govindarajan
> But does that mean SEQUENTIAL will evict the
> page once we're done reading it?

Yes, looks like it does evict the pages once read completes...

Well, that option is too late? Like, say I read in the N 1 GB files to merge, then I call DONTNEED once the merge is done, but by then the pages f

Re: fadvise/madvise during segment-merges....

2014-05-21 Thread Michael McCandless
On Wed, May 21, 2014 at 8:20 AM, Ravikumar Govindarajan wrote:
> Great blog and lucid explanation
>
> I think things have changed in recent kernel versions. I am no expert, but
> could see some code related to this here
> http://lxr.free-electrons.com/source/mm/fadvise.c?v=3.14

That looks promisi

Re: Multi-thread indexing, should the commit be called from each thread?

2014-05-21 Thread Shai Erera
You don't need to commit from each thread; you can definitely commit when all threads are done. In general, you should commit only when you want to ensure the data is "safe" on disk.

Shai

On Wed, May 21, 2014 at 2:58 PM, andi rexha wrote:
> Hi!
> I have a question about multi-thread indexing.
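The advice above (index from many threads, commit once when they are all done) can be sketched roughly as follows. This is only an illustration in Python, not Lucene or Solr code: StubIndexWriter, add_document, commit and index_all are hypothetical stand-ins invented for the example, and the stub assumes a writer that is safe to share between threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StubIndexWriter:
    """Hypothetical stand-in for a shared, thread-safe index writer.

    This is NOT a Lucene or Solr API; it only records what was added
    and how many times commit() was called.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.pending = []      # documents added but not yet committed
        self.committed = []    # documents made durable by commit()
        self.commit_count = 0

    def add_document(self, doc):
        with self._lock:
            self.pending.append(doc)

    def commit(self):
        # Stand-in for the expensive step (flush to disk, reopen searchers).
        with self._lock:
            self.committed.extend(self.pending)
            self.pending.clear()
            self.commit_count += 1


def index_all(writer, batches, workers=4):
    """Add every batch from a worker thread, then commit exactly once."""

    def index_batch(batch):
        for doc in batch:
            writer.add_document(doc)  # worker threads never call commit()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every worker and re-raises any worker exception.
        list(pool.map(index_batch, batches))

    writer.commit()  # single commit after all threads are done


if __name__ == "__main__":
    w = StubIndexWriter()
    index_all(w, [["a", "b"], ["c"], ["d", "e", "f"]])
    print(w.commit_count, len(w.committed))
```

The point of the shape is that the worker function only adds documents; the one place that pays the commit cost is the coordinating code after the pool has drained.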

Re: fadvise/madvise during segment-merges....

2014-05-21 Thread Ravikumar Govindarajan
Great blog and lucid explanation. I think things have changed in recent kernel versions. I am no expert, but I could see some code related to this here: http://lxr.free-electrons.com/source/mm/fadvise.c?v=3.14 O_DIRECT will be a terrible drag, no? Will a battery-backed disk cache help here? We are usin

Multi-thread indexing, should the commit be called from each thread?

2014-05-21 Thread andi rexha
Hi! I have a question about multi-thread indexing. When I index from multiple threads, should I commit from each thread that adds documents, or should the commit be done only once all the threads have finished their indexing tasks? Thank you!

RE: search time & number of segments

2014-05-21 Thread De Simone, Alessandro
> (3) Makes it a lot faster to update the index. I find this to be the main
> selling point myself.

Yes of course :-) We want to update the index more often. That's why it's not really an option to maintain an optimized index.

> Do you have some typical response times from the optimized index a

Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-05-21 Thread Michael McCandless
On Wed, May 21, 2014 at 3:17 AM, Clemens Wyss DEV wrote:
>> Can you just decrease IW's ramBufferSizeMB to relieve the memory pressure?
> +1
> Is there something alike for IndexReaders?

No, although you can take steps during indexing to reduce the RAM required during searching, e.g. limit how many

Re: fadvise/madvise during segment-merges....

2014-05-21 Thread Michael McCandless
You're right, segment merges can be catastrophic to ongoing searches. I explored the problem here: http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html but a lot has changed since then... SEQUENTIAL is probably best (if the OS implements it; I think the Linux kernel has improved h

Re: Question about multi-valued fields

2014-05-21 Thread Chris Bamford
Hi Tim, Nice surprise! For the text search question, though, you could use analysis and then run a SpanQuery against your document. You'd get the token offsets and then you could re-analyze to figure out which field index is your hit. I have some helper code that I built as part of LUCEN

fadvise/madvise during segment-merges....

2014-05-21 Thread Ravikumar Govindarajan
Is it a good idea to use the FADVISE_DONTNEED/MADVISE_DONTNEED flags during segment-merge reads? The buffer cache contains critical data belonging to searches. A segment merge has the potential to disturb the cache, no? -- Ravi
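For readers unfamiliar with the flags in the question, here is a minimal sketch of what "DONTNEED during the read" could look like at the system-call level. It uses Python's os.posix_fadvise rather than anything in Lucene, and read_and_drop and _advise are hypothetical helper names. The advice is only a hint: this sketch does not show that the kernel actually evicts the pages, nor that doing so helps search latency.

```python
import os


def _advise(fd, offset, length, advice_name):
    """Best-effort posix_fadvise; it is only a hint, so failures are ignored."""
    advice = getattr(os, advice_name, None)
    if advice is None or not hasattr(os, "posix_fadvise"):
        return False  # Python does not expose posix_fadvise on this platform
    try:
        os.posix_fadvise(fd, offset, length, advice)
        return True
    except OSError:
        return False


def read_and_drop(path, chunk_size=1 << 20):
    """Read a file front to back, advising the kernel as we go.

    Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        # length 0 means "from offset to the end of the file"
        _advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # A real merge would consume the chunk here.
            # Hint that the range just read will not be needed again.
            _advise(fd, total, len(chunk), "POSIX_FADV_DONTNEED")
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Advising per chunk, rather than once after the whole file has been read, is a design choice made for this sketch: the hint is issued before many merge pages have accumulated in the cache.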

AW: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-05-21 Thread Clemens Wyss DEV
> Can you just decrease IW's ramBufferSizeMB to relieve the memory pressure?

+1
Is there something alike for IndexReaders?

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Monday, 19 May 2014 12:19
To: Lucene Users
Subject: Re: [lucene 4.6] N