Hi,

I am trying to figure out a schema for indexing documents that I will
download from another document management system. In the original
source, documents are stored inside collections. Collections act like
folders with extra metadata (they cannot be nested though).

Each document must be part of a collection, so there are no dangling
documents. Occasionally a document can be in multiple collections; in
that case I am thinking of indexing it once for each collection it
belongs to, to keep things simple.
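
To make the duplication idea concrete, here is a minimal sketch of how I
imagine expanding one source document into one index record per
collection. The field names (doc_id, collection_id) and the composite id
format are my own assumptions, not anything Solr requires.

```python
def expand_per_collection(doc, collection_ids):
    """Return one index record per (document, collection) pair.

    `doc` is a dict of document fields with a hypothetical "doc_id" key.
    """
    records = []
    for cid in collection_ids:
        record = dict(doc)  # shallow copy so the source dict is untouched
        # Hypothetical composite key so each copy has a unique id.
        record["id"] = f"{cid}/{doc['doc_id']}"
        record["collection_id"] = cid
        records.append(record)
    return records
```

A document in two collections would then produce two records that differ
only in id and collection_id.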

My two primary concerns are that I should be able to modify collection
metadata and I should be able to add more documents without reindexing
all the documents in the collection.

I will only be searching the text and metadata (creation time, author
name, etc.) of individual documents, but collection metadata should
also be returned with the results.

My second plan is to index documents and collections separately and
join them at query time (the documentation seems to indicate that
query-time joins are possible).

I am very new to Solr, so I don't want to start with a schema that I
will regret later. Any advice is appreciated.

Best Regards,
Yaşar
