[
https://issues.apache.org/jira/browse/SOLR-16703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845135#comment-17845135
]
Rahul Goswami commented on SOLR-16703:
--------------------------------------
I have done some work in this area and am happy to take this up. I am tied up
for the next month, but will get to this by end of June/early July 2024.
> Clearing all documents of an index should delete traces of a previous Lucene
> version
> ------------------------------------------------------------------------------------
>
> Key: SOLR-16703
> URL: https://issues.apache.org/jira/browse/SOLR-16703
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 7.6, 8.11.2, 9.1.1
> Reporter: Gaël Jourdan
> Priority: Major
>
> _This is a ticket following a discussion on Slack with_ [~elyograg] _and_
> [~wunder] _especially._
> h1. High level scenario
> Assume you're starting from a current Solr server in version 7.x and want to
> upgrade to 8.x then 9.x.
> Upgrading from 7.x to 8.x works fine. Indexes of 7.x can still be read with
> Solr 8.x.
> On a regular basis, you clear* the index to start fresh, assuming this will
> recreate the index in version 8.x.
> This runs nicely for some time. Then you want to upgrade to 9.x. When
> starting, you get an error saying that the index is still 7.x and cannot be
> read by 9.x.
>
> *This is surprising because you'd expect that starting from a fresh index in
> 8.x would have removed any trace of 7.x.*
>
> _* : when I say "clear", I mean "delete by query {{*:*}} (all docs)" and
> then commit + optionally optimize._
> h1. What I'd like to see
> Clearing an index when running Solr version N should delete any trace of
> Lucene version N-1.
> Otherwise this forces users to delete an index (core / collection) and
> recreate it rather than just clearing it.
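
The delete-and-recreate workaround mentioned above could look roughly like the sketch below for the standalone setup used in this ticket. It only prints the two CoreAdmin requests (UNLOAD with deleteIndex=true, then CREATE) instead of running them. The function name recreate_core_plan is hypothetical, and the base URL and core name are taken from the reproduction steps. SolrCloud deployments would go through the Collections API instead, which this sketch does not cover.

```shell
# Print (do not execute) the CoreAdmin requests that drop a core's index
# and register the core again from its existing instance directory.
# Assumption: standalone Solr, instanceDir equal to the core name.
recreate_core_plan() {
  solr_url=$1
  core=$2
  printf 'curl "%s/admin/cores?action=UNLOAD&core=%s&deleteIndex=true"\n' "$solr_url" "$core"
  printf 'curl "%s/admin/cores?action=CREATE&name=%s&instanceDir=%s"\n' "$solr_url" "$core" "$core"
}

recreate_core_plan "http://localhost:8983/solr" gettingstarted
```

Printing the commands first leaves room to review them, since UNLOAD with deleteIndex=true removes the index files on disk.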
> h1. Detailed scenario to reproduce
> The following steps reproduce the issue with a standalone Solr instance
> running in Docker but I experienced the issue in SolrCloud mode running on
> VMs and/or bare-metal.
>
> Also note that for personal troubleshooting I used the tool "luceneupgrader"
> available at [https://github.com/hakanai/luceneupgrader] but it's not
> necessary to reproduce the issue.
>
> 1. Create a directory for data
> {code:java}
> $ mkdir solrdata
> $ chmod -R a+rwx solrdata {code}
>
> 2. Start a Solr 7.x server, create a core and push some docs
> {code:java}
> $ docker run -d -v "$PWD/solrdata:/opt/solr/server/solr/mycores:rw" -p 8983:8983 --name my_solr_7 solr:7.6.0 solr-precreate gettingstarted
> $ docker exec -it my_solr_7 post -c gettingstarted example/exampledocs/manufacturers.xml
> $ curl -s 'http://localhost:8983/solr/gettingstarted/select?q=*:*' | jq .response.numFound
> 11{code}
>
> 3. Look at the index files and check version
> {code:java}
> $ ll solrdata/gettingstarted/data/index
>
> total 40K
> -rw-r--r--. 1 8983 8983 718 16 mars 17:37 _0.fdt
> -rw-r--r--. 1 8983 8983 84 16 mars 17:37 _0.fdx
> -rw-r--r--. 1 8983 8983 656 16 mars 17:37 _0.fnm
> -rw-r--r--. 1 8983 8983 112 16 mars 17:37 _0_Lucene50_0.doc
> -rw-r--r--. 1 8983 8983 1,1K 16 mars 17:37 _0_Lucene50_0.tim
> -rw-r--r--. 1 8983 8983 145 16 mars 17:37 _0_Lucene50_0.tip
> -rw-r--r--. 1 8983 8983 767 16 mars 17:37 _0_Lucene70_0.dvd
> -rw-r--r--. 1 8983 8983 730 16 mars 17:37 _0_Lucene70_0.dvm
> -rw-r--r--. 1 8983 8983 478 16 mars 17:37 _0.si
> -rw-r--r--. 1 8983 8983 203 16 mars 17:37 segments_2
> -rw-r--r--. 1 8983 8983 0 16 mars 17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7
> {code}
>
> 4. Stop Solr 7, update solrconfig.xml for Solr 8 and start a Solr 8 server
> {code:java}
> $ docker stop my_solr_7
> $ vim solrdata/gettingstarted/conf/solrconfig.xml
> $ cat solrdata/gettingstarted/conf/solrconfig.xml | grep luceneMatchVersion
> <luceneMatchVersion>8.11.2</luceneMatchVersion>
> $ docker run -d -v "$PWD/solrdata:/var/solr/data:rw" -p 8983:8983 --name my_solr_8 solr:8.11.2{code}
>
> 5. Check index is loaded ok and docs are still there
> {code:java}
> $ curl -s 'http://localhost:8983/solr/gettingstarted/select?q=*:*' | jq .response.numFound
> 11 {code}
>
> 6. Clear the index and check index files / version
> {code:java}
> $ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/gettingstarted/update?commit=true' -d '{ "delete": {"query":"*:*"} }'
> $ ll solrdata/gettingstarted/data/index
> total 4,0K
> -rw-r--r--. 1 8983 8983 135 16 mars 17:45 segments_5
> -rw-r--r--. 1 8983 8983 0 16 mars 17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7
> $ curl 'http://localhost:8983/solr/gettingstarted/update?optimize=true'
> $ ll solrdata/gettingstarted/data/index
>
> total 4,0K
> -rw-r--r--. 1 8983 8983 135 16 mars 17:45 segments_5
> -rw-r--r--. 1 8983 8983 0 16 mars 17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7 {code}
> There are no more docs in the index, but it is still considered version 7.x.
>
> 7. Add docs in the index again
> {code:java}
> $ docker exec -it my_solr_8 post -c gettingstarted example/exampledocs/manufacturers.xml
> $ curl -s 'http://localhost:8983/solr/gettingstarted/select?q=*:*' | jq .response.numFound
> 11
> $ ll solrdata/gettingstarted/data/index
>
> total 48K
> -rw-r--r--. 1 8983 8983 158 16 mars 17:47 _2.fdm
> -rw-r--r--. 1 8983 8983 832 16 mars 17:47 _2.fdt
> -rw-r--r--. 1 8983 8983 64 16 mars 17:47 _2.fdx
> -rw-r--r--. 1 8983 8983 748 16 mars 17:47 _2.fnm
> -rw-r--r--. 1 8983 8983 767 16 mars 17:47 _2_Lucene80_0.dvd
> -rw-r--r--. 1 8983 8983 750 16 mars 17:47 _2_Lucene80_0.dvm
> -rw-r--r--. 1 8983 8983 80 16 mars 17:47 _2_Lucene84_0.doc
> -rw-r--r--. 1 8983 8983 883 16 mars 17:47 _2_Lucene84_0.tim
> -rw-r--r--. 1 8983 8983 75 16 mars 17:47 _2_Lucene84_0.tip
> -rw-r--r--. 1 8983 8983 395 16 mars 17:47 _2_Lucene84_0.tmd
> -rw-r--r--. 1 8983 8983 505 16 mars 17:47 _2.si
> -rw-r--r--. 1 8983 8983 220 16 mars 17:47 segments_6
> -rw-r--r--. 1 8983 8983 0 16 mars 17:36 write.lock
> $ java -jar luceneupgrader-0.6.0.jar info solrdata/gettingstarted/data/index
> Lucene index version: 7 {code}
> An emptied index to which we add new docs through Solr 8.x is still
> considered a 7.x index.
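
One way to see why this is surprising is to compare the codec tags embedded in the segment file names before and after re-indexing. The helper below is a minimal sketch, and the name codec_tags is hypothetical. It looks only at file names, never at file contents, so it cannot reveal the creation version that luceneupgrader reports.

```shell
# List the distinct codec tags (for example Lucene50 or Lucene84) that appear
# in index file names such as _2_Lucene84_0.tim. Names only; files are not opened.
codec_tags() {
  for path in "$1"/*; do
    printf '%s\n' "${path##*/}"
  done | sed -n 's/^_[^_]*_\(Lucene[0-9]*\)_.*$/\1/p' | sort -u
}

codec_tags solrdata/gettingstarted/data/index
```

On the step 7 listing this prints Lucene80 and Lucene84, with no 7.x-era codec left, even though the index is still reported as version 7. The file names alone are therefore not a reliable indicator of whether Solr 9.x will accept the index.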
>
> 8. Stop Solr 8.x, update solrconfig.xml for 9.x, start Solr 9.x
> {code:java}
> $ docker stop my_solr_8
> $ vim solrdata/gettingstarted/conf/solrconfig.xml
> $ cat solrdata/gettingstarted/conf/solrconfig.xml | grep luceneMatchVersion
>
> <luceneMatchVersion>9.1.1</luceneMatchVersion>
> $ # also remove xslt response writer
> $ docker run -d -v "$PWD/solrdata:/var/solr/data:rw" -p 8983:8983 --name my_solr_9 solr:9.1.1 {code}
>
> 9. Check the Solr logs: it cannot start/load the core:
> {code:java}
> $ docker logs my_solr_9
> 2023-03-16 16:53:37.046 ERROR (coreContainerWorkExecutor-2-thread-1) []
> o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup =>
> java.util.concurrent.ExecutionException:
> org.apache.solr.common.SolrException: Unable to create core [gettingstarted]
> ...
> Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format version
> is not supported (resource
> BufferedChecksumIndexInput(MMapIndexInput(path="/var/solr/data/gettingstarted/data/index/segments_6"))):
> This index was initially created with Lucene 7.x while the current version
> is 9.3.0 and Lucene only supports reading the current and previous major
> versions. This version of Lucene only supports indexes created with release
> 8.0 and later by default. {code}
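
The rule stated in the error message (only the current and previous major versions are readable) can be checked against a log line with a small script. This is a minimal sketch: the function names created_major and is_readable are hypothetical, and the parsing assumes the exact wording "initially created with Lucene N.x" shown in the log above.

```shell
# Extract N from "initially created with Lucene N.x" in text read on stdin.
created_major() {
  sed -n 's/.*initially created with Lucene \([0-9][0-9]*\)\.x.*/\1/p' | head -n 1
}

# Succeed when an index created by major version $1 is readable by major
# version $2, following the "current and previous major versions" rule.
is_readable() {
  [ "$1" -ge $(( $2 - 1 )) ] && [ "$1" -le "$2" ]
}

msg='This index was initially created with Lucene 7.x while the current version is 9.3.0'
major=$(printf '%s\n' "$msg" | created_major)
if is_readable "$major" 9; then
  echo "index created with Lucene $major.x can be opened by Lucene 9"
else
  echo "index created with Lucene $major.x is too old for Lucene 9"
fi
```

For the message above, the script reports that the index is too old, which matches the failure in step 9.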