Shawn,
On 8/24/23 16:38, Shawn Heisey wrote:
On 8/24/23 14:02, Christopher Schultz wrote:
I'm preparing to upgrade a standalone Solr server from 7.7.3 -> 8.11.2
and I wanted to confirm a few things.
First, I'm assuming that the new server will be willing to open and
modify the existing index after the upgrade, because I haven't seen
anything to suggest that wouldn't be the case, but please let me know
if I'm wrong about that.
If the index was EVER touched by a version before 7.0, then 8.x will not
read it. If the index was built from scratch by 7.7.3, then it will work.
I believe we have only used Solr 7 and above. Is there a way to confirm
without actually /trying/ it?
Second, I understand that a complete re-index is appropriate, and that
if I were to eventually upgrade to Solr 9 (which is also in the plan),
I would have to ensure that the index has been "properly" upgraded to
an 8.x-style index in order for Solr 9 to open it and re-index, etc.
Version 9 behaves the same as 8. If the index was ever touched by a
version before 8.0, version 9 will not read the index.
I would strongly recommend that, even if your existing index will work,
you perform a complete re-index from scratch as soon as possible.
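For reference, the rule described above can be written down as a tiny check. This is only a sketch: the class and method names are hypothetical, and the "previous major version or newer" condition is a generalization of the two cases stated in this thread (7.x indexes under 8.x, 8.x indexes under 9.x). The version that matters is the one that first created the index, not the one that last wrote to it.

```java
// Hypothetical helper illustrating the compatibility rule from this thread.
final class IndexCompat {

    /**
     * Returns true if an index first created by major version createdMajor
     * is expected to be readable by major version targetMajor.
     */
    static boolean canOpen(int createdMajor, int targetMajor) {
        // Assumption: an index is readable only if it was created by the
        // target major version or the one immediately before it.
        return createdMajor <= targetMajor && createdMajor >= targetMajor - 1;
    }

    public static void main(String[] args) {
        System.out.println("created by 7, opened by 8: " + canOpen(7, 8));
        System.out.println("created by 6, opened by 8: " + canOpen(6, 8));
        System.out.println("created by 7, opened by 9: " + canOpen(7, 9));
    }
}
```

Under this rule, an index created by 7.7.3 and merely upgraded in place by 8.11.2 still counts as "created by 7", which is why a later move to 9 needs a rebuilt index.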
I'm using a Java-based process via SolrJ to perform the re-index. My
re-index process looks something like this:
1. (Optional) solr.deleteByQuery("*:*")
2. for each document:
   2a. SolrInputDocument doc = new SolrInputDocument();
       doc.addField("id", primary-identifier);
       ...
   2b. solr.add(doc)
Is the above enough to generate a "new" 8.x-style index that will be
acceptable in the future? Is it necessary to perform step (1) and
delete all documents first, or if I freshen ALL documents will I end
up with a completely new index?
Before running that, you will want to either build a brand new core or
completely delete all index* directories in the core then reload that
core. If you do that, then there will be no possibility of
"contamination" from an earlier version.
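The "delete all index* directories" step could be scripted roughly as follows. This is a minimal sketch, not a vetted tool: the class name is hypothetical, the path passed in is assumed to be the core's data directory, and it assumes Solr is stopped (or the core unloaded) while it runs. It only removes directories whose names start with "index" and leaves plain files and other directories alone.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; run only while nothing has the index open.
final class IndexDirCleaner {

    /** Deletes every directory named index* directly under dataDir and returns what was removed. */
    static List<Path> deleteIndexDirs(Path dataDir) throws IOException {
        // Collect matching directories first so we do not delete while listing.
        List<Path> targets = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dataDir, "index*")) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    targets.add(child);
                }
            }
        }
        for (Path dir : targets) {
            List<Path> contents;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before their parent directory.
                contents = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : contents) {
                Files.delete(p);
            }
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the core's data directory (an assumption about layout).
        System.out.println("Removed: " + deleteIndexDirs(Path.of(args[0])));
    }
}
```

After this, reloading the core (or restarting Solr) should produce an empty index written entirely by the running version, ready for the re-index.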
Okay, so it's really not even enough to delete all the documents and
re-index. I really do have to nuke everything? That's a bummer.
Is there a way I can inspect the index afterward to ensure I've been
successful? Some kind of magic-number in file headers or an API call I
can make to ask about the version-ness of the index?
The Lucene tool called "Luke" can probably give you that information for
each segment. I do not know enough about low-level Lucene file formats
to give you a definitive answer.
I tried Luke, and it says "Lucene 7.4.0 or later" in my development
environment. It's a GUI utility, so I can't meaningfully use it on my
production servers to check the files. My guess is it's the same in
production.
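One non-GUI option that might work on a production box is to ask Solr itself for per-segment information over HTTP. The sketch below assumes the core exposes the implicit /admin/segments request handler and that Java 11+ is available for java.net.http; the class name, base URL, and core name are placeholders, and I have not verified exactly which version fields the response contains on 7.7.3. It simply prints the raw JSON so the version values can be inspected by eye. Lucene's command-line CheckIndex tool is, I believe, another way to get per-segment version details without a GUI.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: fetches segment details from a running Solr core.
final class SegmentVersionCheck {

    /** Builds the segments-handler URL for a core, e.g. base "http://localhost:8983/solr". */
    static URI segmentsUri(String solrBaseUrl, String core) {
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        // Assumes the core name needs no URL encoding.
        return URI.create(base + "/" + core + "/admin/segments?wt=json");
    }

    /** Performs a plain GET and returns the response body as text. */
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: Solr base URL, args[1]: core name (both placeholders).
        System.out.println(fetch(segmentsUri(args[0], args[1])));
    }
}
```

Note that per-segment versions only show which release wrote each segment; they do not by themselves prove which major version originally created the index.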
-chris