Scanned this thread, apologies if I missed something, but here are a few
thoughts:

To get better advice, make it clear whether you are running Solr in Cloud
mode (a.k.a. self managed) or Legacy (a.k.a. user managed). Some quick ways
to tell which:

   1. Is there an associated Zookeeper cluster? If yes, then you are in
   cloud mode; if not, then *probably* legacy (there's a way to run zookeeper
   embedded, but that's not the normal setup).
   2. In the admin UI do you see the word 'Cloud' in the left navigation
   bar? If yes, cloud; if no, legacy.
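
If you'd rather check from a script than from the UI, here is a minimal sketch in Python. It assumes the node answers at `<base_url>/admin/info/system` without authentication and that the JSON response carries a top-level `mode` field (`solrcloud` vs `std`) and, in cloud mode, a `zkHost` field; verify both against your own version. The function names are mine, not part of Solr.

```python
import json
from urllib.request import urlopen


def detect_mode(system_info: dict) -> str:
    """Classify a parsed admin/info/system response as 'cloud' or 'legacy'."""
    # Assumption: "mode" is "solrcloud" in cloud mode and "std" otherwise,
    # and "zkHost" is only reported when the node is connected to Zookeeper.
    mode = str(system_info.get("mode", "")).lower()
    if mode == "solrcloud" or "zkHost" in system_info:
        return "cloud"
    return "legacy"


def fetch_system_info(base_url: str) -> dict:
    """Fetch system info from a running node, e.g. http://localhost:8983/solr."""
    with urlopen(f"{base_url}/admin/info/system?wt=json", timeout=10) as resp:
        return json.load(resp)
```

Usage would be something like `detect_mode(fetch_system_info("http://localhost:8983/solr"))`, with the host and port replaced by your own.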

*Key concept: Solr is (normally) just a server providing access to an index
of your data. It allows you to find a link or id for a "document" but does
not (normally) serve as a repository for your data.*

This has some implications:

   1. Solr is typically paired with one or more data repositories
   (database, file system, SharePoint, etc.).
   2. Solr normally cannot reindex data all by itself. Re-indexing is the
   process of re-reading the repository, and creating a fresh index.
   3. Solr is just an index, and does not manage the process of reading the
   data from sources (exceptions such as the Data Import Handler [DIH] and
   streaming expressions exist, but DIH was removed from Solr in 9.0 and
   these are exceptions, not the rule).
   4. Typically *something* outside of Solr sends documents to Solr.
   Re-indexing is normally the process of re-triggering that something to
   send the documents again.
   5. This is unlike a database which contains both the data (the table)
   and an index (PK/FK/index) of the data.
   6. Compared with a database, Solr's benefit is that it indexes the
   individual *words* in the text of a document rather than entire string
   values.

Thus (exceptional cases excluded) things you do to or in Solr don't
"trigger reindexing".
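
To make the "something outside of Solr" idea concrete, here is a rough Python sketch of what such a sender can look like. It assumes your repository can be read as an iterable of dicts (one per document) and that the collection accepts JSON documents POSTed to its `/update` handler; `reindex`, `batches`, `build_update_request` and `send_to_solr` are illustrative names of my own, and a real indexer will have its own field mapping, error handling and authentication.

```python
import json
from typing import Callable, Iterable, Iterator
from urllib.request import Request, urlopen


def batches(docs: Iterable[dict], size: int) -> Iterator[list]:
    """Group documents into lists of at most `size` items."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_update_request(base_url: str, collection: str, docs: list) -> Request:
    """Build a POST of JSON documents to the collection's /update handler."""
    return Request(
        f"{base_url}/{collection}/update",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_solr(request: Request) -> None:
    """Send one batch; urlopen raises on HTTP error responses."""
    with urlopen(request, timeout=60) as resp:
        resp.read()


def reindex(rows: Iterable[dict], base_url: str, collection: str,
            send: Callable[[Request], None] = send_to_solr,
            batch_size: int = 500) -> int:
    """Re-read the source rows and send every one of them to Solr again."""
    sent = 0
    for batch in batches(rows, batch_size):
        send(build_update_request(base_url, collection, batch))
        sent += len(batch)
    return sent
```

Note that nothing in this sketch issues a commit, so when the documents become searchable depends on your autoCommit settings or a separate commit request. The point is only that the loop over the repository lives outside Solr, which is why nothing you change inside Solr can rerun it for you.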

I have implied that sometimes Solr can be the store for your data, which is
technically true. Unfortunately, this is tricky to get right, may
negatively impact performance, and can result in long-term data loss if done
wrong, so it's rarely recommended. I hope you haven't inherited this type
of problem!

Upgrading Solr across a single minor version is often simple, but
occasionally requires work. Always read the release notes and test the result
before going live. Upgrading across major versions is always work. Lucene
(and therefore Solr) requires that you reindex data with each major
version. There are stopgap tools that allow an upgrade of an existing index,
but that is a temporary measure that only works from N to N+1, and you are
expected to re-index before moving to N+2.
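
The N / N+1 / N+2 rule can be written down as a tiny Python check. This is only a restatement of the rule of thumb above, assuming the gap is measured from the major version that originally created the index; `upgrade_plan` is a made-up helper, not a Solr or Lucene API.

```python
def upgrade_plan(index_created_major: int, target_major: int) -> str:
    """Apply the N to N+1 rule of thumb to a planned major-version upgrade."""
    if target_major < index_created_major:
        raise ValueError("target version is older than the index")
    gap = target_major - index_created_major
    if gap == 0:
        return "same major version: no re-index forced by the upgrade"
    if gap == 1:
        return "stopgap possible: re-index before the next major upgrade"
    return "re-index required"
```

So an index first built on 7.x and headed for 9.x falls into the last case: it needs a full re-index from the source repository, even if it was carried through 8.x along the way.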

- Gus

-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
