Re: Schema design for parent child field

Jack Krupansky Sat, 29 Jun 2013 06:55:41 -0700

Both dynamic fields and multivalued fields are powerful Solr features that can be used to great effect, but only if used in moderation - a relatively small number of discrete values (e.g., a few dozen strings). Anything more complex and you are asking for trouble and creating a pseudo-schema that will be difficult to maintain or for anybody else to comprehend.

So, the simple answer to your question: Flatten, in the most straightforward manner - each instance of a "record type" should be a discrete Solr document, and each "record" should get its own "id" to serve as the Solr document key/ID. Solr can support multiple document types in the same collection, or you can store each record type in a separate collection.

The simplest, cleanest structure is to store each record type in a separate collection and then use multiple Solr queries to emulate SQL join operations as needed.
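To make the "multiple queries" idea concrete, here is a rough sketch in plain Python rather than real Solr calls. The search() helper is a hypothetical stand-in for one query against one collection, and the collection contents, field names (parent_id, level1, level2) and titles are invented sample data, not anything Solr prescribes.

```python
def search(collection, field, values):
    """Hypothetical stand-in for one Solr query: docs whose field is in values."""
    wanted = set(values)
    return [doc for doc in collection if doc.get(field) in wanted]

# Two separate "collections", one per record type (invented sample data).
parents = [
    {"id": "p1", "title": "Supply agreement"},
    {"id": "p2", "title": "Licence dispute"},
]
subjects = [
    {"id": "s1", "parent_id": "p1", "level1": "contract", "level2": "claims"},
    {"id": "s2", "parent_id": "p2", "level1": "patent", "level2": "counter claims"},
]

# Query 1: find the subject records that match the search.
matching = search(subjects, "level1", ["contract"])

# Query 2: fetch the parents those records point at (the "join" step).
parent_docs = search(parents, "id", [doc["parent_id"] for doc in matching])
```

Against a real Solr instance each search() call would be one request, with the IDs from the first response used to build the second query.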

But if you would prefer to "mash" multiple record types into the same Solr collection/schema, you can do that too. Make the schema be the union of the schemas for each record type - Solr/Lucene has no significant overhead for fields which do not have values present for a given document.

Each document would have a unique ID field. In addition, each document would have a parent field for each record type, so you can quickly search for all children of a given parent. You can have one common parent ID if you assign unique IDs to all children across all record types, but it can sometimes be cleaner for the child ID to reset to zero/one for each new parent. It's merely a question of whether you want to have a single key value or a tuple of key values to identify a specific child.
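As an illustration only, the example from the original question could be flattened along these lines. This is plain Python, not Solr API code; the field names (parent_id, child_seq, subject_level1, subject_level2) and the "doc42" ID are assumptions made up for the sketch. The facet function mimics in memory what filtering on level 1 and faceting on level 2 would return.

```python
from collections import Counter

def flatten(doc_id, subjects):
    """Turn one parent record into one flat record per (level 1, level 2) pair."""
    records = []
    for seq, (level1, level2) in enumerate(subjects, start=1):
        records.append({
            "id": f"{doc_id}-{seq}",   # single unique key for the child
            "parent_id": doc_id,        # lets you find all children of a parent
            "child_seq": seq,           # restarts at 1 for each parent
            "subject_level1": level1,
            "subject_level2": level2,
        })
    return records

def facet_level2(records, level1):
    """Count level 2 values among records that match the given level 1 value."""
    return Counter(
        r["subject_level2"] for r in records if r["subject_level1"] == level1
    )

records = flatten("doc42", [("contract", "claims"), ("patent", "counter claims")])
counts = facet_level2(records, "contract")
```

Because each subject pair lives in its own record, the level 2 counts for "contract" include "claims" but not "counter claims". Here the child can be identified either by the single "id" value or by the (parent_id, child_seq) tuple.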

You can duplicate a subset of the parent fields in each child to simulate the effect of a simple join in a single clean query. Alternatively, you can do a separate query to get parent record details.
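A minimal sketch of the duplication step, again in plain Python with invented field names. The "parent_" prefix is just one possible naming convention, assumed here so that copied fields cannot collide with the child's own fields.

```python
def denormalize(parent, children, copied_fields):
    """Return copies of the children with selected parent fields added."""
    extra = {f"parent_{name}": parent[name] for name in copied_fields}
    # Build new dicts so the original child records are left untouched.
    return [{**child, **extra} for child in children]

parent = {"id": "p1", "title": "Supply agreement", "year": 2013}
children = [
    {"id": "p1-1", "parent_id": "p1", "subject_level1": "contract"},
]

indexed = denormalize(parent, children, ["title"])
```

Only the fields you actually need in search results should be copied; anything else can stay on the parent and be fetched with a second query.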


-- Jack Krupansky

-----Original Message-----
From: Sperrink

Sent: Saturday, June 29, 2013 5:08 AM
To: solr-user@lucene.apache.org
Subject: Schema design for parent child field

Good day,
I'm seeking some guidance on how best to represent the following data within
a solr schema.
I have a list of subjects which are detailed to n levels.
Each document can contain many of these subject entities.
As I see it if this had been just 1 subject per document, dynamic fields
would have been a good resolution.
Any suggestions on how best to create this structure in a denormalised
fashion while maintaining the data integrity.
For example a document could have:
Subject level 1: contract
Subject level 2: claims
Subject level 1: patent
Subject level 2: counter claims

If I were to search for level 1 contract, I would only want the facet count
for level 2 to contain claims and not counter claims.

Any assistance in this would be much appreciated.

--

View this message in context: http://lucene.472066.n3.nabble.com/Schema-design-for-parent-child-field-tp4074084.html
Sent from the Solr - User mailing list archive at Nabble.com.
