Hi I am complete newbie to Lucene. In fact I'm not even a search guy. I looked up terms such as stemming just yesterday. So this is going to be so much fun ;)
Here's the problem I am trying to solve: I work in the B2B space at Staples (an office supplies company in the US). We sell office products to companies with whom we have a contract. The contract defines what we can sell (or not sell) to a company's employees. One contract may be shared by numerous small and medium sized companies and one company may also have more than one contract based on their ship-to location. Currently, we have a very bad system of doing this. We do blocking at the SKU level. In other words, the sales people go in and mark individual SKUs as blocked or available for sale. Given that we sell upwards of 90000 SKUs, this is a nightmare and unwanted SKUs do become available for sale. A new project called 'Assortment View' is funded to address this problem. The sales people will define blocking/unblocking rules in the catalog at the category, subcategory and even individual SKU level to create a customer's product assortment. With blocking at a much higher level than individual SKUs, we hope the problem will be alleviated. I am evaluating Lucene for this purpose. I realize I am attempting to use primarily a search engine as an inclusion/exclusion index solution, where data about contracts, customers and blocking rules is in the index, and Lucene provides the class of products available/forbidden for sale. Questions: 1. Is this the right use of Lucene? 2. Has anybody done something like this before? 3. When there are two high level entities viz. catalog and contract, what is the best index structure? a. Two separate indices which are 'joined' (my relational database background shows) at runtime to provide the customer's product assortment. b. One index with two tables (db shows again), if there is such a concept in Lucene c. One flattened index containing catalog and contract information as tokenized fields in the same document (now I'm using Lucene terms ;) 4. I need to ask two kinds of questions from the index: a. Given a list of customer's contracts, give me the customer's product assortment. In other words, give me counts of products within a category, subcategory available for sale. b. Given these list of SKUs, tell me which ones are blocked and which are not by looking at all the blocking rules defined at the category or individual SKU level. Is Lucene's Query API flexible enough to support such different queries? Will it scale for query b, where the list of SKUs may be large (thousands)? Fun Fun!! Thanks in advance Sid --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]