1 - If we add or remove fields to be indexed, how will this affect
our current indexes? Will we need to completely recreate millions of
indexes (or is it indices)?
It depends on what you are trying to do. If you are just adding or
removing fields, the existing index should remain usable. For adding,
nothing changes except that 'old' documents don't have the new field.
If you remove a field, it will still be in the index for 'old' docs,
but you can't add it to new docs.
If you *change* a field definition (say 'date' -> 'slong') you will
need to reindex; otherwise you will get lots of odd exceptions.
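To make the last case concrete, here is a rough sketch of a definition
change that would require a reindex. The field name "timestamp" is
hypothetical, and I'm assuming 'date' and 'slong' are declared as field
types in your schema.xml the way they are in the example schema:

```xml
<!-- Before: <field name="timestamp" type="date" indexed="true" stored="true"/> -->
<!-- After: same field name, different type. Values already in the index
     were written in the old type's format, so reindex after this change. -->
<field name="timestamp" type="slong" indexed="true" stored="true"/>
```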
Scenario 1a :: We've been injecting "field1", but it was not indexed.
Now we just want to index it. Once the xml is changed, how do we sanely
reindex?
Probably easiest to re-index from your source.
If everything you need is a stored field, you could use something like
SOLR-139 to load the stored fields and reindex.
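Re-indexing from your source just means re-posting each document as a
normal update message. A minimal sketch, assuming your schema declares
"id" as the uniqueKey (the field values here are made up):

```xml
<!-- Posting a document whose uniqueKey matches an existing one replaces
     the old copy, so "field1" gets indexed under the new definition. -->
<add>
  <doc>
    <field name="id">doc-1</field>
    <field name="field1">value loaded from the source database</field>
  </doc>
</add>
```

Send a <commit/> afterwards so the changes become visible to searches.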
Scenario 1b :: We add "field2" (with indexed="true"), which was not
previously used as a field at all. Do our indexes need to be completely
recreated, or is there a way to update these indexes individually? I
still have the original data in a DB and can do that if necessary.
That's fine. New docs have the field; old ones don't.
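For example, the schema.xml addition could look something like this.
The "string" type and stored="true" are assumptions on my part; use
whatever type fits your data:

```xml
<!-- New field: only documents added after this change will carry it.
     Existing documents stay searchable on their other fields. -->
<field name="field2" type="string" indexed="true" stored="true"/>
```

If you want the old docs to be searchable on "field2" as well, re-post
them from your DB.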
Scenario 1c :: we remove a few of the fields in the schema.xml (but add
nothing). Reindex required?
Removing them from the schema keeps them in the index for existing
docs. If that's OK, you don't need to reindex.
2 - Question about the structure of the injected xml file... does it
need to exactly match the fields defined in Solr? I know it makes sense
that we're only injecting the fields that Solr needs and not excluding
fields that it needs... but how fussy is Solr when it comes to matching
the xml in injection?
By design it is fussy: a document containing a field that is not
defined in the schema is rejected.
I think there is a way to define a non-indexed, non-stored dynamic
field that will simply ignore unknown fields.
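Something along these lines in schema.xml should do it. This is a
sketch modeled on the catch-all pattern in the example schema; the
type name "ignored" is just a label:

```xml
<!-- A type that neither indexes nor stores its values. -->
<fieldType name="ignored" class="solr.StrField" indexed="false" stored="false"/>

<!-- Catch-all: any field name not matched by an explicit <field> or a more
     specific dynamicField falls through to here and is silently dropped. -->
<dynamicField name="*" type="ignored" multiValued="true"/>
```

The trade-off is that a misspelled field name in your injected xml will
be dropped quietly instead of raising an error.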