Hi Jack Thanks for your answer. Do I understand that correctly that I must create a "merge-entity" that contains all the different validFrom/validUntil dates as fields (and of course the other search-related fields).
This would mean that the number of index entries is equal to the number of all possible combinations of from/until date-ranges in a "record-chain" (all records with all their individual versions connected by foreign keys) since every combination creates a new record in a query across all tables. That also means that I will have a lot of entries with the same values in the other search-related fields - the only difference will be most of the time one of the from/until-ranges. Perhaps the query can be optimized so that irrelevant combinations can be avoided (for example if two date-ranges do not overlap). Then, when I have built that index I can query it with the reference date as argument to compare it with every from/until range in the chain. And so I get only the relevant entries where the reference date is between all from/until ranges. Is this correct? Thanks and regards Stephan On Wed, Aug 15, 2012 at 2:32 PM, Jack Krupansky <j...@basetechnology.com> wrote: > The date checking can be implemented using range query as a filter query, > such as > > &fq=startDate:[* TO NOW] AND endDate:[NOW TO *] > > (You can also use an "frange" query.) > > Then you will have to flatten the database tables. Your Solr schema would > have a single "merged" record type. You will have to decide whether the > different record types (tables) will have common fields versus static > qualification by adding a prefix or suffix, e.g., "name" vs. "employee_name" > and "employer_name". The latter has the advantage that you do not have to > separately specify a table "type" field since the fields would be empty for > records of other types. > > -- Jack Krupansky > > -----Original Message----- From: Stefan Burkard > Sent: Wednesday, August 15, 2012 8:12 AM > To: solr-user@lucene.apache.org > Subject: How to design index for related versioned database records > > > Hi solr-users > > I have a case where I need to build an index from a database. > > ***Data structure*** > The data is spread across multiple tables and in each table the > records are versioned - this means that one "real" record can exist > multiple times in a table, each with different validFrom/validUntil > dates. Therefore it is possible to query the valid version of a record > for a given point in time. > > The relations of the data are something like this: > Employee <-> LinkTable (=Employment) <-> Employer <-> LinkTable > (=offered services) <-> Service > > That means I have data across 5 relations, each of them with versioned > records. > > ***Search needs*** > Now I need to be able to search for employees and employers based on > the services they offer for a given point in time. > > Therefore I have built an index of all employees and employers with > their services as subentity. So I have one index entry for every > version of every employee/employer and each version collects the > offered services for the given timeframe of the employee/employer > version. > > Problem: The offered services of an employee/employer can change > during its validity period. That means I do not only need to take the > version timespan of the employee/employer into account but also the > version timespans of services and the link-tables. > > ***Question*** > I think I could continue with my strategy to have an index entry of an > employee/employer with its services for any given point in time. But > there are much more entries than now since every involved > validfrom/validuntil period (if they overlap) produces more entries. > But I am not sure if this is a good strategy, or if it would be better > to try to index the whole datastructure in an other way. > > Are there any recommendations how to handle such a case? > > Thanks for any help > Stephan