Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Dawid Weiss
Thanks for chipping in, Toke. A ~1TB index is impressive. Back of the envelope says reading & writing 900GB in 8 hours is 2*900GB/(8*60*60s) = 64MB/s. I don't remember the interface for our SSD machine, but even with SATA II this is only ~1/5th of the possible fairly sequential IO throughput. So
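The back-of-the-envelope figure above can be reproduced with a small sketch. It assumes 1GB is counted as 1024MB (which gives exactly the quoted 64MB/s) and takes roughly 300MB/s as the nominal SATA II payload rate. Both constants and all names are illustrative assumptions, not values from the thread.

```java
public class MergeThroughput {
    // Assumption: 1 GB = 1024 MB, which reproduces the 64MB/s figure in the message.
    static final double MB_PER_GB = 1024.0;
    // Assumption: ~300 MB/s nominal SATA II payload rate (3 Gbit/s link, 8b/10b encoding).
    static final double SATA2_MB_PER_SEC = 300.0;

    // The merge reads the index once and writes it once, hence the factor of 2.
    static double requiredMBPerSec(double indexGB, double hours) {
        double totalMB = 2 * indexGB * MB_PER_GB;
        double seconds = hours * 60 * 60;
        return totalMB / seconds;
    }

    // Share of a link's nominal throughput that the given rate would use.
    static double fractionOfLink(double mbPerSec, double linkMBPerSec) {
        return mbPerSec / linkMBPerSec;
    }

    public static void main(String[] args) {
        double rate = requiredMBPerSec(900, 8);
        System.out.println("required MB/s: " + rate);
        System.out.println("fraction of SATA II: " + fractionOfLink(rate, SATA2_MB_PER_SEC));
    }
}
```

Under these assumptions the ratio comes out a little above one fifth, in line with the "~1/5th" estimate in the message.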

Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Jerven Tjalling Bolleman
On 2018-11-02 20:52, Dawid Weiss wrote: int processors = Runtime.getRuntime().availableProcessors(); ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler(); cms.setMaxMergesAndThreads(processors,processors); See the number of threads in the CMS only matters if you have concurrent

Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Toke Eskildsen
Dawid Weiss wrote: > Merging segments as large as this one requires not just CPU, but also > serious I/O throughput efficiency. I assume you have fast NVMe drives > on that machine, otherwise it'll be slow, no matter what. It's just a > lot of bytes going back and forth. We have quite a lot of

Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Dawid Weiss
> int processors = Runtime.getRuntime().availableProcessors(); > ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler(); > cms.setMaxMergesAndThreads(processors,processors); See the number of threads in the CMS only matters if you have concurrent merges of independent segments. What

Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Jerven Tjalling Bolleman
Hi Dawid, Erick, Thanks for the reply. We are using pure Lucene, and currently this is what I am doing: int processors = Runtime.getRuntime().availableProcessors(); ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler(); cms.setMaxMergesAndThreads(processors,processors);

Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Dawid Weiss
We are faced with a similar situation. Yes, the merge process can take a long time and is mostly single-threaded (if you're merging from N segments into a single segment, only one thread does the job). As Erick pointed out, the merge process takes a backseat compared to indexing and searches (in

access to joined documents

2018-11-02 Thread Michael Sokolov
Hi List, I have a question about query-time joins as provided by JoinUtil in the join package. As I understand it, the main documents returned by the query will be those having a value in the to-field that matches the value in the from-field of some documents returned by the fromQuery. My

Re: Static index, fastest way to do forceMerge

2018-11-02 Thread Erick Erickson
The merge process is rather tricky, and there's nothing that I know of that will use all resources available. In fact the merge code is written to _not_ use up all the possible resources on the theory that there should be some left over to handle queries etc. Yeah, the situation you describe is

Re: Lucene stops working

2018-11-02 Thread Erick Erickson
Is this custom code? What method? Can you show us a sample? There's not enough information here to say much. On Fri, Nov 2, 2018 at 7:38 AM egorlex wrote: > > Hi, I am new to Lucene and I have a strange problem. Lucene stops working > without any errors after some time. It works fine for 1 day or

Lucene stops working

2018-11-02 Thread egorlex
Hi, I am new to Lucene and I have a strange problem. Lucene stops working without any errors after some time. It works fine for 1 day or several hours. I did some investigation and found that 5 IndexReaders are opened in the search method, but they are not closed when the method exits. Can it be a
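Whether unclosed readers explain this particular report is not established in the thread, but the pattern described (readers opened inside a search method and never closed) can be illustrated with a minimal sketch. `FakeReader` is a hypothetical stand-in, not a Lucene class. It only counts open instances so the difference between the two methods is visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderLeakDemo {
    // Number of stand-in readers currently open.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for a reader that holds resources until closed.
    static final class FakeReader implements AutoCloseable {
        FakeReader() { OPEN.incrementAndGet(); }
        int search(String query) { return query.length(); } // placeholder result
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    // Opens a reader and never closes it: one more open reader per call.
    static int leakySearch(String query) {
        FakeReader reader = new FakeReader();
        return reader.search(query);
    }

    // try-with-resources closes the reader on every exit path.
    static int safeSearch(String query) {
        try (FakeReader reader = new FakeReader()) {
            return reader.search(query);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) leakySearch("q");
        System.out.println("open after leaky calls: " + OPEN.get());
        for (int i = 0; i < 5; i++) safeSearch("q");
        System.out.println("open after safe calls: " + OPEN.get());
    }
}
```

Lucene's `IndexReader` implements `Closeable`, so the same try-with-resources form applies to a real reader. How the readers in the reported code are opened is not shown in the message, so that part remains an assumption.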

Static index, fastest way to do forceMerge

2018-11-02 Thread Jerven Bolleman
Dear Lucene Devs and Users, First of all, thank you for this wonderful library and API. forceMerges are normally not recommended, but we fall into one of the few use cases where they make sense. In our use case we have a large index (3 actually) and we don't update them ever after indexing.