[ https://issues.apache.org/jira/browse/SOLR-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17274150#comment-17274150 ]
David Smiley commented on SOLR-15119:
-------------------------------------

I think AB's comments on the issue originating LINK, SOLR-12509, are excellent. Please look there for further information, including a benchmark comparing both.

There is no free lunch with REWRITE vs LINK -- pay up front or pay later, amortized. As a core increases in size, queries to it will slow down, and thus we want to split it for parallelization. A split using REWRITE will consume lots of I/O and CPU that will make queries perform even worse for quite a time. On the other hand, LINK will quickly do the essential, after which queries become parallelized. Then the split children can be replicated to other nodes where further indexing will naturally trigger merges that reduce the dead weight. If you want it to happen faster, issue an expungeDeletes and/or _partial_ forceMerge/optimize. You could choose to do that before replicating (reduce network I/O) or after (reduce load on originating machine).

LINK works so fast, it even affords Solr the potential to dispense with a BUFFERING updateLog mode, if we want to employ that. Just block. I think [~megancarey] may be working on that.

I'm deliberately commenting here for the benefit of the whole community, not the subset of those with access to the developer Slack.

> Make LINK splitMethod the default for SplitShardCmd
> ---------------------------------------------------
>
> Key: SOLR-15119
> URL: https://issues.apache.org/jira/browse/SOLR-15119
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: master (9.0)
> Reporter: Megan Carey
> Priority: Major
> Labels: easy-fix
> Time Spent: 1h
> Remaining Estimate: 0h
>
> REWRITE splitMethod is still the default in SplitShardCmd [1], despite LINK
> being much faster. IndexSizeTrigger in branch_8x already uses LINK by default
> [2], and we have found LINK to be reliable and performant at scale. This work
> will just update the default in SplitShardCmd to make LINK the default
> overall.
>
> [1][https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java#L88]
>
> [2][https://github.com/apache/lucene-solr/blob/branch_8x/solr/core/src/java/org/apache/solr/cloud/autoscaling/IndexSizeTrigger.java#L186]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
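
For readers who want to try the workflow described in the comment (split with LINK, then optionally reclaim the dead weight with expungeDeletes or a partial forceMerge/optimize), here is a minimal Python sketch that only builds the request URLs. It relies on the Collections API `SPLITSHARD` action with its `splitMethod` parameter and on the update handler's `expungeDeletes`, `optimize`, and `maxSegments` parameters. The base URL, the collection and core names, and the helper function names are assumptions for illustration and do not come from the issue.

```python
from urllib.parse import urlencode

# Assumed address of a local Solr node; adjust for a real cluster.
SOLR_BASE = "http://localhost:8983/solr"


def split_shard_url(collection, shard, split_method="link", base=SOLR_BASE):
    """Build a Collections API SPLITSHARD request with an explicit splitMethod."""
    if split_method not in ("link", "rewrite"):
        raise ValueError("splitMethod must be 'link' or 'rewrite'")
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "splitMethod": split_method,
    }
    return f"{base}/admin/collections?{urlencode(params)}"


def expunge_deletes_url(core, base=SOLR_BASE):
    """Build a commit request that asks for segments with deletions to be merged."""
    params = {"commit": "true", "expungeDeletes": "true"}
    return f"{base}/{core}/update?{urlencode(params)}"


def partial_optimize_url(core, max_segments, base=SOLR_BASE):
    """Build a partial forceMerge/optimize request that stops at max_segments."""
    params = {"optimize": "true", "maxSegments": str(max_segments)}
    return f"{base}/{core}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical collection and core names, printed rather than sent.
    print(split_shard_url("mycollection", "shard1"))
    print(expunge_deletes_url("mycollection_shard1_0_replica_n1"))
    print(partial_optimize_url("mycollection_shard1_0_replica_n1", 4))
```

Whether to send the last two requests before or after replicating the split children is the trade-off mentioned above: before reduces network I/O, after reduces load on the originating machine.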