Re: Schema design for parent child field

2013-07-01 Thread Mikhail Khludnev
In my experience, deeply nested scopes are almost exclusively a job for SOLR-3076.
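
For context, SOLR-3076 is the block join issue. A minimal sketch of what such a query could look like, assuming the block join query parser syntax that shipped in later Solr releases and assuming hypothetical field names (doc_type, level1) that are not from this thread. It also assumes the subject records were indexed as children in the same block as their parent document:

```python
from urllib.parse import urlencode

def block_join_params(parent_filter, child_query):
    """Build request parameters for a parent block join query.

    parent_filter must match ALL parent documents; child_query selects
    the child documents whose parents should be returned.
    """
    return {"q": f'{{!parent which="{parent_filter}"}}{child_query}'}

# Hypothetical field names, for illustration only.
params = block_join_params("doc_type:document", "level1:contract")
query_string = urlencode(params)  # ready to append to a /select URL
```

The function only builds the request string; whether it fits depends on the Solr version in use and on how the documents were indexed.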


On Sat, Jun 29, 2013 at 1:08 PM, Sperrink
kevin.sperr...@lexisnexis.co.za wrote:

 Good day,
 I'm seeking some guidance on how best to represent the following data
 within a Solr schema.
 I have a list of subjects which are detailed to n levels.
 Each document can contain many of these subject entities.
 As I see it, if there had been just one subject per document, dynamic fields
 would have been a good solution.
 Any suggestions on how best to create this structure in a denormalised
 fashion while maintaining data integrity?
 For example a document could have:
 Subject level 1: contract
 Subject level 2: claims
 Subject level 1: patent
 Subject level 2: counter claims

 If I were to search for level 1 contract, I would only want the facet count
 for level 2 to contain claims and not counter claims.

 Any assistance in this would be much appreciated.




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Schema-design-for-parent-child-field-tp4074084.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Schema design for parent child field

2013-06-29 Thread Sperrink
Good day,
I'm seeking some guidance on how best to represent the following data within
a Solr schema.
I have a list of subjects which are detailed to n levels.
Each document can contain many of these subject entities.
As I see it, if there had been just one subject per document, dynamic fields
would have been a good solution.
Any suggestions on how best to create this structure in a denormalised
fashion while maintaining data integrity?
For example a document could have:
Subject level 1: contract
Subject level 2: claims
Subject level 1: patent
Subject level 2: counter claims

If I were to search for level 1 contract, I would only want the facet count
for level 2 to contain claims and not counter claims.

Any assistance in this would be much appreciated.






Re: Schema design for parent child field

2013-06-29 Thread Jack Krupansky
Both dynamic fields and multivalued fields are powerful Solr features that 
can be used to great effect, but only if used in moderation - a relatively 
small number of discrete values (e.g., a few dozen strings). Anything 
more complex and you are asking for trouble, creating a pseudo-schema 
that will be difficult to maintain or for anybody else to comprehend.


So, the simple answer to your question: flatten, in the most straightforward 
manner. Each instance of a record type should be a discrete Solr document, 
and each record should have its own ID to serve as the Solr document key. 
Solr can support multiple document types in the same collection, or you can 
store each record type in a separate collection.
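
As a rough sketch of that flattening, assuming hypothetical field names (id, doc_type, parent_id, level1, level2) that are not from this thread, the example from the question could be split into one parent record plus one record per subject entity:

```python
# Hypothetical nested record, shaped like the example in the question.
source = {
    "id": "doc1",
    "subjects": [
        {"level1": "contract", "level2": "claims"},
        {"level1": "patent", "level2": "counter claims"},
    ],
}

def flatten(record):
    """Return one parent document plus one document per subject entity."""
    docs = [{"id": record["id"], "doc_type": "document"}]
    for n, subject in enumerate(record["subjects"], start=1):
        docs.append({
            "id": f"{record['id']}-s{n}",  # every record gets its own key
            "doc_type": "subject",
            "parent_id": record["id"],     # link back to the parent
            **subject,
        })
    return docs

flat_docs = flatten(source)
```

Because each subject entity keeps its own level1/level2 pair in one record, "claims" stays tied to "contract" and "counter claims" stays tied to "patent".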


The simplest, cleanest structure is to store each record type in a separate 
collection and then use multiple Solr queries to emulate SQL join operations 
as needed.
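
The two-query approach might look roughly like the following. This is only a sketch: the two lists stand in for separate Solr collections, the search helper is a hypothetical placeholder for a real Solr request, and all field names are assumptions.

```python
# In-memory stand-ins for two Solr collections (assumption: a real setup
# would send these lookups to Solr over HTTP instead).
documents = [
    {"id": "doc1", "title": "Example A"},
    {"id": "doc2", "title": "Example B"},
]
subjects = [
    {"id": "s1", "parent_id": "doc1", "level1": "contract", "level2": "claims"},
    {"id": "s2", "parent_id": "doc1", "level1": "patent", "level2": "counter claims"},
    {"id": "s3", "parent_id": "doc2", "level1": "patent", "level2": "claims"},
]

def search(collection, field, values):
    """Return the records whose `field` equals one of `values`."""
    wanted = set(values)
    return [rec for rec in collection if rec.get(field) in wanted]

# Query 1: find the matching child records.
matching_subjects = search(subjects, "level1", ["contract"])

# Query 2: fetch the parents those children point to.
parent_ids = {s["parent_id"] for s in matching_subjects}
joined_parents = search(documents, "id", parent_ids)
```

The join logic lives in the client, so each collection keeps a small, single-purpose schema.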


But if you would prefer to mash multiple record types into the same Solr 
collection/schema, you can do that too. Make the schema the union of the 
schemas for each record type - Solr/Lucene has no significant overhead for 
fields which do not have values present for a given document.
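
To illustrate the union idea, here is a small sketch with made-up field lists (none of these names or types come from this thread). It merges the per-type field definitions and shows that each record simply omits the fields belonging to the other type:

```python
# Hypothetical per-record-type field definitions (field name -> field type).
document_fields = {"id": "string", "doc_type": "string", "title": "text"}
subject_fields = {
    "id": "string",
    "doc_type": "string",
    "parent_id": "string",
    "level1": "string",
    "level2": "string",
}

def union_schema(*schemas):
    """Merge field definitions, rejecting fields declared with two types."""
    merged = {}
    for schema in schemas:
        for name, ftype in schema.items():
            if merged.setdefault(name, ftype) != ftype:
                raise ValueError(f"conflicting types for field {name!r}")
    return merged

combined = union_schema(document_fields, subject_fields)

# A record of one type leaves the other type's fields unset.
parent_doc = {"id": "doc1", "doc_type": "document", "title": "Example"}
child_doc = {
    "id": "doc1-s1",
    "doc_type": "subject",
    "parent_id": "doc1",
    "level1": "contract",
    "level2": "claims",
}
```

A discriminator such as the assumed doc_type field lets queries restrict themselves to one record type.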


Each document would have a unique ID field. In addition, each document would 
have a parent field for each record type, so you can quickly search for all 
children of a given parent. You can have one common parent ID if you assign 
unique IDs to all children across all record types, but it can sometimes be 
cleaner for the child ID to restart at zero or one for each new parent. It's 
merely a question of whether you want a single key value or a tuple 
of key values to identify a specific child.
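
The two keying options can be sketched side by side. The field names (parent_id, child_seq) and the way the single key is built are assumptions for illustration only:

```python
# Hypothetical child records; child_seq restarts at 1 under every parent.
children = [
    {"parent_id": "doc1", "child_seq": 1, "level1": "contract"},
    {"parent_id": "doc1", "child_seq": 2, "level1": "patent"},
    {"parent_id": "doc2", "child_seq": 1, "level1": "patent"},
]

# Option A: a tuple of key values identifies a specific child.
by_tuple = {(c["parent_id"], c["child_seq"]): c for c in children}

# Option B: a single key value, here built by folding the tuple into
# one globally unique string.
by_single = {f"{c['parent_id']}-{c['child_seq']}": c for c in children}

def children_of(parent_id):
    """Find all children of a given parent via the parent field."""
    return [c for c in children if c["parent_id"] == parent_id]
```

Either way, the parent field is what makes the "all children of this parent" lookup cheap.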


You can duplicate a subset of the parent fields in each child to simulate 
the effect of a simple join in a single clean query. Alternatively, you can 
run a separate query to get the parent record details.
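
Here is a rough sketch of that duplication, applied to the facet question from the original post. The data, the field names, and the facet helper are all made up for illustration; the helper only mimics what a filter plus a field facet would return.

```python
from collections import Counter

# Hypothetical parent and child records.
parents = {"doc1": {"title": "Example A"}, "doc2": {"title": "Example B"}}
children = [
    {"id": "s1", "parent_id": "doc1", "level1": "contract", "level2": "claims"},
    {"id": "s2", "parent_id": "doc1", "level1": "patent", "level2": "counter claims"},
    {"id": "s3", "parent_id": "doc2", "level1": "patent", "level2": "claims"},
]

def denormalize(child):
    """Copy a subset of the parent fields into the child record."""
    parent = parents[child["parent_id"]]
    return {**child, "parent_title": parent["title"]}

index = [denormalize(c) for c in children]

def facet(docs, filter_field, filter_value, facet_field):
    """Count facet_field values among docs matching the filter."""
    hits = [d for d in docs if d[filter_field] == filter_value]
    return Counter(d[facet_field] for d in hits)

# Searching level 1 "contract" counts only the level 2 values that sit
# in the same subject record.
level2_counts = facet(index, "level1", "contract", "level2")
```

Since each child carries its own level 1 and level 2 values plus the copied parent fields, a single query can filter, facet, and display parent details without a second lookup.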


-- Jack Krupansky
