Hi, I need to index hierarchical data, but as far as I have seen, Nutch/Solr have no concept of hierarchy; the index seems to be flat.
I have a problem that I would normally solve with some sort of hierarchy, and I would like to know how you would approach it. Let's assume I index a set of pages that contain information about persons, with several persons per page. Each person has some properties that I can parse in my plugin, because the information has a known structure. My index therefore contains fields like firstname, lastname, email, ..., each of them multiValued because there are many persons on a page. For example, each person has one or more email addresses associated with them.

Now I would like to formulate queries like "return all fields of all persons that have lastname XXX" or "return all email addresses of persons XXX". Since the fields are multiValued, how can I do this? I see no way to associate an entry in firstname with the corresponding entry in lastname, as both fields are multiValued. Note that I have no unique id or anything similar that I could use.

Or would the trick be to treat the persons as separate documents? That is, when parsing, I would split them up so that each person gets a separate entry in the index. Is there any source code or plugin I could have a look at?

--
View this message in context: http://lucene.472066.n3.nabble.com/indexing-hierarchical-data-schema-design-tp3052894p3052894.html
Sent from the Nutch - User mailing list archive at Nabble.com.
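
To make the two options concrete, here is a minimal sketch in plain Python. It is not Nutch or Solr code and does not use their APIs; it only models documents as dictionaries. The page URL, the sample persons, and the function names (flatten_page, split_persons, emails_for_lastname) are all made up for illustration. Since I have no unique id, the sketch assumes one could be synthesized from the page URL plus the person's position on the page, which may or may not be acceptable in practice.

```python
# Hypothetical parsed output for one page: a list of person records.
page_url = "http://example.com/staff"
persons = [
    {"firstname": "Anna", "lastname": "Smith",
     "email": ["anna@example.com", "a.smith@example.com"]},
    {"firstname": "Bob", "lastname": "Jones",
     "email": ["bob@example.com"]},
]


def flatten_page(url, persons):
    """One document per page: every person field becomes multiValued.

    The values of different fields are no longer linked to each other,
    so it is impossible to tell which email belongs to which lastname.
    """
    doc = {"url": url, "firstname": [], "lastname": [], "email": []}
    for p in persons:
        doc["firstname"].append(p["firstname"])
        doc["lastname"].append(p["lastname"])
        doc["email"].extend(p["email"])
    return doc


def split_persons(url, persons):
    """One document per person.

    Assumption: the id is synthesized from the page URL and the person's
    position on the page, because the source data has no unique id.
    """
    return [
        {
            "id": f"{url}#person-{i}",
            "page_url": url,
            "firstname": p["firstname"],
            "lastname": p["lastname"],
            "email": list(p["email"]),  # still multiValued, but per person
        }
        for i, p in enumerate(persons)
    ]


def emails_for_lastname(docs, lastname):
    """Answer 'all email addresses of persons XXX' over per-person documents."""
    return [e for d in docs if d["lastname"] == lastname for e in d["email"]]


flat = flatten_page(page_url, persons)
docs = split_persons(page_url, persons)
print(flat["lastname"], flat["email"])
print(emails_for_lastname(docs, "Smith"))
```

In the flattened document, lastname holds two values and email holds three, with nothing to say which email goes with which name. With one document per person, the second query becomes a simple filter on lastname, and the page_url field keeps the link back to the page the person was found on.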

