On 2018-11-02 20:52, Dawid Weiss wrote:
int processors = Runtime.getRuntime().availableProcessors();
ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
cms.setMaxMergesAndThreads(processors,processors);

See, the number of threads in the CMS only matters if you have
concurrent merges of independent segments. What you're doing
effectively forces an eventual X -> 1 merge, which is done by a single
thread (regardless of the max processors above).
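
To illustrate the point about a single X -> 1 merge, here is a minimal sketch. It is a toy model and not Lucene code: a plain java.util.concurrent thread pool stands in for the merge scheduler, and the class and method names (MergeThreadDemo, threadsUsed) are made up for this example. It only shows that one indivisible job keeps one thread busy, however large the pool is.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only: a plain thread pool stands in for the merge scheduler.
final class MergeThreadDemo {

    // Runs `jobs` independent jobs on a pool of `poolSize` threads and returns
    // how many distinct threads executed at least one job.
    static int threadsUsed(int poolSize, int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> seen = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> seen.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        // A single "X -> 1" job: the extra pool threads have nothing to do.
        System.out.println("one X -> 1 job on " + processors + " threads used "
                + threadsUsed(processors, 1) + " thread(s)");
    }
}
```

With several independent jobs the pool can spread the work over more threads, which is the case where the CMS thread count starts to matter.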

   38G _583u.fdt
   25M _583u.fdx
   13K _583u.fnm
   47G _583u_Lucene50_0.doc
   54G _583u_Lucene50_0.pos
   30G _583u_Lucene50_0.tim
  413M _583u_Lucene50_0.tip
  2.1G _583u_Lucene70_0.dvd
   213 _583u_Lucene70_0.dvm

Merging segments as large as this one requires not just CPU, but also
serious I/O throughput efficiency. I assume you have fast NVMe drives
on that machine, otherwise it'll be slow, no matter what. It's just a
lot of bytes going back and forth.
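
To put a rough number on "a lot of bytes", the sketch below sums the sizes from the listing above. It assumes the listing uses binary units (K/M/G as in `du -h`), which the thread does not state, and the 1 GiB/s rate in main is a placeholder, not a measurement from this machine. SegmentSize and its methods are names invented for the example.

```java
// Rough size arithmetic for the listing above; binary units are an assumption.
final class SegmentSize {

    // Parses "38G", "413M", "13K" or a bare byte count such as "213".
    static long parse(String size) {
        char unit = size.charAt(size.length() - 1);
        long factor;
        switch (unit) {
            case 'K': factor = 1L << 10; break;
            case 'M': factor = 1L << 20; break;
            case 'G': factor = 1L << 30; break;
            default: return Long.parseLong(size); // no suffix: plain bytes
        }
        String number = size.substring(0, size.length() - 1);
        return Math.round(Double.parseDouble(number) * factor);
    }

    // Sums a list of human-readable sizes into bytes.
    static long total(String... sizes) {
        long sum = 0;
        for (String s : sizes) {
            sum += parse(s);
        }
        return sum;
    }

    // Seconds needed to read `bytes` and write `bytes` again at a sustained rate,
    // counting both directions against the same throughput budget.
    static double readPlusWriteSeconds(long bytes, double bytesPerSecond) {
        return 2.0 * bytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        long bytes = total("38G", "25M", "13K", "47G", "54G", "30G", "413M", "2.1G", "213");
        double hypotheticalRate = 1L << 30; // 1 GiB/s, an assumed figure
        System.out.printf("segment size: %.1f GiB%n", bytes / (double) (1L << 30));
        System.out.printf("read + write at 1 GiB/s: %.0f s%n",
                readPlusWriteSeconds(bytes, hypotheticalRate));
    }
}
```

The listed sizes are rounded, so the total (about 171.5 GiB under these assumptions) is only an estimate of what one merge pass has to read and write.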
Yup, it's now in the cloud, so optimizing for quick indexing followed by a merge down to one segment has become financially interesting. Right now too much CPU and RAM sits idle, and we're not even maxing out the disk I/O (about 25% of the max rate).

If we wrote such max-resource merge code, would there be interest in having it merged?

I think so. Try to experiment locally first, though, and see what you
can find out. Hacking that code I pointed at shouldn't be too
difficult. See what happens.
Yeah, before I left I started with an experiment to have one running without the
merge scheduler being involved at all.

Will try a few more experiments next week.

Or should we maybe do something like this, assuming 64 CPUs:

writer.forceMerge(64, true);
writer.forceMerge(32, true);
writer.forceMerge(16, true);
writer.forceMerge(8, true);
writer.forceMerge(4, true);
writer.forceMerge(2, true);
writer.forceMerge(1, true);

No, this doesn't make much sense. If your goal is 1 segment then you
want to read from as many of them at once as possible and merge into a
single segment. Doing what you did above would only bump I/O traffic a
lot.
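
A back-of-the-envelope model of why the cascade costs more I/O, as a sketch only. It assumes the worst case, in which every forceMerge call in the halving cascade rewrites the whole index; the merge policy may leave some segments untouched, so real traffic can be lower. CascadeIo and its methods are hypothetical names for this illustration.

```java
// Worst-case I/O model for the halving cascade 64 -> 32 -> ... -> 1.
final class CascadeIo {

    // Number of forceMerge calls when halving the target from `start` down to 1.
    static int passes(int start) {
        int count = 0;
        for (int target = start; target >= 1; target /= 2) {
            count++;
        }
        return count;
    }

    // Upper bound on bytes written if each pass rewrites the entire index.
    static long cascadeBytesWritten(long indexBytes, int start) {
        return (long) passes(start) * indexBytes;
    }

    public static void main(String[] args) {
        long indexBytes = 100; // arbitrary unit; only the ratio matters here
        long cascade = cascadeBytesWritten(indexBytes, 64);
        long direct = indexBytes; // a single forceMerge(1) writes the index once
        System.out.println("cascade writes up to " + (cascade / direct)
                + "x the bytes of a direct merge");
    }
}
```

Under this model the seven calls above write up to seven times the index size, while a single forceMerge(1) writes it once.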
Thanks, I always thought so but wasn't sure anymore.

Have a nice weekend everyone!

Dawid

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

--
Jerven Tjalling Bolleman
SIB | Swiss Institute of Bioinformatics
CMU - 1, rue Michel Servet - 1211 Geneva 4
t: +41 22 379 58 85 - f: +41 22 379 58 58
Jerven.Bolleman@sib.swiss - http://www.sib.swiss

