[ 
https://issues.apache.org/jira/browse/SOLR-17725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048422#comment-18048422
 ] 

Rahul Goswami edited comment on SOLR-17725 at 12/30/25 6:36 PM:
----------------------------------------------------------------

[~ichattopadhyaya] This is waiting for 
[#3903|https://github.com/apache/solr/pull/3903] to be merged.

This JIRA is split across 2 PRs:
https://github.com/apache/solr/pull/3883 => A new 
LatestVersionMergePolicyFactory to block older segments from participating in 
merges. (This is merged)
https://github.com/apache/solr/pull/3903 => Expose a 
/admin/cores?action=UPGRADECOREINDEX endpoint to handle the in-place upgrade

However, even with just #3883, users should be able to configure the merge 
policy in solrconfig and simply reindex the data. That would be enough to 
enable them to upgrade to Solr 11 in the future without recreating the core. As 
discussed in the thread, I was able to get the associated Lucene PRs into main 
and Lucene 10, so we are good there.

I am almost done with testing #3903, pending one integration issue while 
calling the REST endpoint in async mode (passing async=request_id param). I 
expect to open it up for review by tonight. But that should not hold the 10x 
release, since #3883 still provides a pathway to upgrade, albeit with a few more 
manual steps and in a slightly less optimized way than what the 
UPGRADECOREINDEX Core Admin API does (in #3903).
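
For illustration only, here is a minimal sketch of how a client might build the request URLs for the endpoint in async mode. The action name and the async=request_id parameter come from the comment above; the "core" parameter, the base URL, and the helper names are assumptions, and the final parameter set is whatever #3903 ends up defining. REQUESTSTATUS with requestid is the existing Core Admin call for polling an async request.

```python
from urllib.parse import urlencode


def upgrade_core_index_url(base_url, core, request_id=None):
    """Build a Core Admin URL for the UPGRADECOREINDEX action.

    The "core" parameter name is an assumption borrowed from other
    Core Admin actions; it is not confirmed for this endpoint.
    """
    params = {"action": "UPGRADECOREINDEX", "core": core}
    if request_id is not None:
        # Async mode: the caller supplies its own request id.
        params["async"] = request_id
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


def request_status_url(base_url, request_id):
    """Build the Core Admin URL that polls the status of an async request."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id}
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical Solr base URL
    print(upgrade_core_index_url(base, "mycore", "upgrade-1"))
    print(request_status_url(base, "upgrade-1"))
```

The helpers only assemble URLs, so they can be passed to any HTTP client.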




> Automatically upgrade Solr indexes without needing to reindex from source
> -------------------------------------------------------------------------
>
>                 Key: SOLR-17725
>                 URL: https://issues.apache.org/jira/browse/SOLR-17725
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Rahul Goswami
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.0, 9.11
>
>         Attachments: High Level Design.png
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Today upgrading from Solr version X to X+2 requires complete reingestion of 
> data from source. This comes from Lucene's constraint which only guarantees 
> index compatibility between the version the index was created in and the 
> immediate next version. 
> This reindexing usually comes with added downtime and/or cost. Especially in 
> case of deployments which are in customer environments and not completely in 
> control of the vendor, this proposition of having to completely reindex the 
> data can become a hard sell.
> I, on behalf of my employer, Commvault, have developed a way which achieves 
> this reindexing in-place on the same index. Also, the process automatically 
> keeps "upgrading" the indexes over multiple subsequent Solr upgrades without 
> needing manual intervention. 
> It comes with the following limitations:
> i) All _source_ fields need to be either stored=true or docValues=true. Any 
> copyField destination fields can be stored=false of course, just that the 
> source fields (or more precisely, the source fields you care about 
> preserving) should be either stored or docValues true. 
> ii) The datatype of an existing field in schema.xml shouldn't change upon 
> Solr upgrade. Introducing new fields is fine. 
> For indexes where these limitations are not a problem (they weren't for us!), the 
> tool can reindex in-place on the same core with zero downtime and 
> legitimately "upgrade" the index. This can remove a lot of operational 
> headaches, especially in environments with hundreds/thousands of very large 
> indexes.
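
As a rough illustration of limitation (i) in the description above, the following sketch flags source fields that are neither stored nor docValues. It is not part of the proposed tool. It assumes the field definitions are available as dictionaries with explicit "stored" and "docValues" values (for example, as the Schema API returns them when defaults are shown) and treats a missing key as false; the function and parameter names are hypothetical.

```python
def fields_blocking_upgrade(fields, copy_dest_names=()):
    """Return names of source fields that are neither stored nor docValues.

    fields: iterable of dicts with "name" and optional "stored"/"docValues".
    copy_dest_names: copyField destination names, which are exempt.
    Missing "stored"/"docValues" keys are treated as False (an assumption;
    in a real schema the field type supplies the defaults).
    """
    dests = set(copy_dest_names)
    return [
        f["name"]
        for f in fields
        if f["name"] not in dests
        and not f.get("stored", False)
        and not f.get("docValues", False)
    ]


if __name__ == "__main__":
    schema_fields = [
        {"name": "id", "stored": True},
        {"name": "body", "stored": False, "docValues": False},
        {"name": "price", "docValues": True},
        {"name": "text_all", "stored": False},  # copyField destination
    ]
    print(fields_blocking_upgrade(schema_fields, ["text_all"]))
```

An empty result would suggest the index meets limitation (i); limitation (ii), unchanged field datatypes, has to be checked separately.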



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
