[ https://issues.apache.org/jira/browse/LUCENE-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215507#comment-16215507 ]
Shawn Heisey commented on LUCENE-8004:
--------------------------------------
Related, but definitely a tangent:
I have noted in the past that forceMerge (optimize in Solr) runs far slower
than the transfer rates my disk array can reach. On a system where the RAID10
array can easily go WELL beyond 100 megabytes per second for both reads and
writes, the merge down to one segment produces disk writes at only 20 to 30
megabytes per second. I would like to see this go faster, but when I think
about the data manipulation required to combine postings from multiple
segments, I can accept it if Lucene experts tell me that it is already going
as fast as it can.
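To put those rates in perspective, here is a rough back-of-the-envelope
sketch. The 50 GB index size is a made-up figure chosen for illustration, not
something measured in this thread, and the class and method names are
hypothetical. It only divides size by sustained write rate and ignores
everything else a merge does.

```java
public class MergeTimeEstimate {
    /** Seconds needed to write sizeMb megabytes at a sustained rate of rateMbPerSec. */
    public static double secondsToWrite(double sizeMb, double rateMbPerSec) {
        if (rateMbPerSec <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return sizeMb / rateMbPerSec;
    }

    public static void main(String[] args) {
        // Assumption: a hypothetical 50 GB index (50,000 MB), not a number from this thread.
        double indexSizeMb = 50_000;
        // 20 and 30 MB/s are the merge write rates quoted above; 100 MB/s is the
        // lower bound quoted for what the RAID10 array can sustain.
        double[] ratesMbPerSec = {20, 30, 100};
        for (double rate : ratesMbPerSec) {
            double minutes = secondsToWrite(indexSizeMb, rate) / 60.0;
            System.out.printf("%.0f MB/s -> %.1f minutes%n", rate, minutes);
        }
    }
}
```

Under that assumed size, the merge would take roughly 42 minutes at 20 MB/s
against a little over 8 minutes if it could write at 100 MB/s.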
But I do wonder whether a segment rewrite as Erick has suggested here (which
would not be an actual merge of multiple segments) could be implemented so that
the postings are simply converted to the new format, rather than being rebuilt
entirely, as I believe a merge requires. Since I am not very familiar with the
actual format or with exactly what a merge does, I am not currently able to
examine the code and answer my own question.
> IndexUpgraderTool should rewrite segments rather than forceMerge
> ----------------------------------------------------------------
>
> Key: LUCENE-8004
> URL: https://issues.apache.org/jira/browse/LUCENE-8004
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Erick Erickson
>
> Spinoff from LUCENE-7976. We help users get themselves into a corner by using
> forceMerge on an index to rewrite all segments in the current Lucene format.
> We should rewrite each individual segment instead. This would also help with
> upgrading X-2->X-1, then X-1->X.
> Of course the preferred method is to re-index from scratch.
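
The structural difference between the two upgrade strategies in the issue
description can be sketched with a toy model. This is plain Java, not Lucene
code: segments are represented only by hypothetical sizes in MB, the class and
method names are invented for illustration, and deleted documents (which a real
merge would drop) are ignored. Reading the "corner" as the single oversized
segment left behind by forceMerge is also an assumption on my part.

```java
public class SegmentRewriteSketch {
    /** forceMerge-to-one style upgrade: all segments collapse into a single segment. */
    public static long[] forceMergeToOne(long[] segmentSizesMb) {
        long total = 0;
        for (long size : segmentSizesMb) {
            total += size;
        }
        return new long[] {total};
    }

    /** Per-segment rewrite: each segment is rewritten on its own, so the layout is unchanged. */
    public static long[] rewriteEach(long[] segmentSizesMb) {
        // In this toy model a rewrite changes only the on-disk format, not the size.
        return segmentSizesMb.clone();
    }

    public static void main(String[] args) {
        long[] before = {5000, 1200, 300}; // hypothetical segment sizes in MB
        System.out.println("forceMerge: " + forceMergeToOne(before).length + " segment(s)");
        System.out.println("rewrite:    " + rewriteEach(before).length + " segment(s)");
    }
}
```

In this model the forceMerge path always ends with one segment holding all the
data, while the rewrite path keeps the original segment count and sizes.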