> +1 to make a single merge concurrent! It is horribly frustrating to watch
> that last merge running on a single core :) I have lost many hours of my
> life to this frustration.
>
> Yeah... it is, isn't it? Especially on new machines where you have super-fast
> SSDs, countless cores, etc... That last merge consumes so few resources that
> the computer feels practically idle... it's hard to explain to people using
> our software (who invested in hardware) why we're basically doing nothing...
> :)

> I do think we need to explore concurrency within terms/postings across fields
> in one segment to really see gains in the common case where merge time is
> dominated by postings.

Yeah, probably.

> If you want to experiment with something like that, you can hackishly
> simulate it today to quickly see the overhead, correct? It's a small hack to
> PerFieldPostingsFormat to force it to emit "files-per-field" and then CFS
> will combine it all together.

Good idea, Robert. I'll try this.

> By default merging stored fields is super fast because Lucene can copy
> compressed data directly, but if there are deletes or index sorting is
> enabled this optimization is not applicable anymore and I wouldn't be
> surprised if stored fields started taking non-negligible time.

In this case these segments are essentially made from scratch but with lots
and lots of term vectors and postings... But the more parallel stages we can
introduce, the better.

I have some other stuff on my plate before I can dive deep into this, but I
eventually will. Thanks for the pointers, everyone - helpful!

D.