I've found multiple questions asked in various places online (including StackOverflow <http://stackoverflow.com/questions/8491779/what-would-be-the-best-way-to-index-and-search-my-data-using-lucene>) along the lines of "How can I index and then search relational data in Lucene?". Quite rightly, these questions are met with the standard response that Lucene is not designed to model data like this. This quote I found sums it up...

"A Lucene Index is a Document Store. In a Document Store, a single document represents a single concept with all necessary data stored to represent that concept (compared to that same concept being spread across multiple tables in an RDBMS requiring several joins to re-create)."

So I will not ask that question and instead provide my high-level requirements and see if any Lucene gurus out there can help me.

* We have data on People (Name, Gender, DOB, Nationality, etc).
* And data on Companies (Name, Country, City, etc).
* We also have data about how these two types of entity relate to each other where a person worked at the company (Person, Company, Role, Date Started, Date Ended, etc).

We have two entities - Person and Company - that have their own properties, and then further properties exist for the many-to-many link between them.

Some example searches could be as follows...

* Find all Companies in Australia
* Find all People born between two dates
* Find all People who have worked as a .Net Developer
* Find all males who have worked as a .Net Developer in London
* Find all People who have worked as a .Net Developer between 2008 and 2010

The criteria span all three sets of data.

Our requirement is to provide a Faceted Search <http://en.wikipedia.org/wiki/Faceted_search> over the data that accepts any combination of the various properties, of which I have given some examples. I am aware of the idea that the Index should be constructed with the search in mind, but I can't seem to come up with a sensible index that would meet all the combinations of search criteria.

My questions are:

* What classes native to Lucene, or what extension points, can we make use of?
* Are there any established techniques for doing this kind of thing?
* Are there any third-party open source contributions that I have missed that will help us here?

For now I won't describe the scenarios we have considered because I don't want to bloat out this question and make it too intimidating.

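To make the requirement concrete, here is a minimal sketch of the three data sets and of one of the example searches, "all males who have worked as a .Net Developer in London". It is plain Python over in-memory data, not Lucene code, and every class name, field name and sample record in it is hypothetical and made up purely for illustration. It also shows one reason (as I understand it, so treat this as an assumption) why simply flattening everything into one document per person is awkward: the Role and the City have to match on the same employment record.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# All names and records below are hypothetical sample data.

@dataclass(frozen=True)
class Person:
    id: int
    name: str
    gender: str
    dob: date
    nationality: str

@dataclass(frozen=True)
class Company:
    id: int
    name: str
    country: str
    city: str

@dataclass(frozen=True)
class Employment:
    """The many-to-many link between Person and Company."""
    person_id: int
    company_id: int
    role: str
    started: date
    ended: Optional[date]  # None means the person is still in the role

people = {
    1: Person(1, "Alice", "F", date(1980, 1, 1), "British"),
    2: Person(2, "Bob", "M", date(1975, 6, 15), "Australian"),
    3: Person(3, "Carl", "M", date(1985, 3, 9), "British"),
}
companies = {
    1: Company(1, "Acme Ltd", "UK", "London"),
    2: Company(2, "Koala Pty", "Australia", "Sydney"),
}
employments = [
    Employment(1, 1, ".Net Developer", date(2008, 1, 1), date(2010, 1, 1)),
    Employment(2, 2, ".Net Developer", date(2008, 1, 1), date(2010, 1, 1)),
    Employment(2, 1, "Project Manager", date(2010, 2, 1), None),
    Employment(3, 1, ".Net Developer", date(2009, 5, 1), None),
]

def males_dotnet_in_london():
    """Relational answer: all criteria are checked per employment record."""
    return sorted({
        people[e.person_id].name
        for e in employments
        if e.role == ".Net Developer"
        and people[e.person_id].gender == "M"
        and companies[e.company_id].city == "London"
    })

def naive_flattened_match():
    """One flattened record per person: roles and cities become independent
    multi-valued fields, so the link between them is lost."""
    matches = []
    for p in people.values():
        jobs = [e for e in employments if e.person_id == p.id]
        roles = {e.role for e in jobs}
        cities = {companies[e.company_id].city for e in jobs}
        if p.gender == "M" and ".Net Developer" in roles and "London" in cities:
            matches.append(p.name)
    return sorted(matches)

if __name__ == "__main__":
    print(males_dotnet_in_london())
    print(naive_flattened_match())
```

In this sample data Bob was a .Net Developer in Sydney and a Project Manager in London, so the flattened version wrongly includes him while the per-employment version does not.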
Please ask me to elaborate where necessary.

Many thanks in advance,
Andy
