Hello, Kart.

I still don't fully understand the problem, but implementing a rule engine usually requires https://lucene.apache.org/core/7_3_1/sandbox/org/apache/lucene/search/CoveringQuery.html, which checks the number of matched clauses against the number of rule clauses stored in a dedicated field.
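To make the counting idea concrete, here is a rough plain-Java sketch with no Lucene calls in it. It only illustrates the logic: each rule keeps its number of WITH conditions (the analogue of the dedicated field), and a rule matches a file when the number of satisfied conditions equals that count and no NOT WITH word is present. The names RuleMatcher, Rule, matchingRules, words and exampleRules are made up for this example, and treating each condition as one countable unit is my assumption about how the rules would have to be modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for "rules indexed as documents". */
final class RuleMatcher {

    /** One rule: every WITH condition needs at least one hit; no NOT WITH word may hit. */
    static final class Rule {
        final String name;
        final List<Set<String>> withConditions;
        final Set<String> notWithWords;

        Rule(String name, List<Set<String>> withConditions, Set<String> notWithWords) {
            this.name = name;
            this.withConditions = withConditions;
            this.notWithWords = notWithWords;
        }

        /** Analogue of the per-rule "number of clauses" field. */
        int requiredMatches() {
            return withConditions.size();
        }
    }

    /** Returns the names of all rules that the given file words satisfy. */
    static List<String> matchingRules(Set<String> fileWords, List<Rule> rules) {
        List<String> matched = new ArrayList<>();
        for (Rule rule : rules) {
            // Count conditions that have at least one word present in the file.
            int satisfied = 0;
            for (Set<String> condition : rule.withConditions) {
                for (String word : condition) {
                    if (fileWords.contains(word)) {
                        satisfied++;
                        break; // one hit per condition is enough
                    }
                }
            }
            // A single NOT WITH word in the file disqualifies the rule.
            boolean excluded = false;
            for (String word : rule.notWithWords) {
                if (fileWords.contains(word)) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded && satisfied == rule.requiredMatches()) {
                matched.add(rule.name);
            }
        }
        return matched;
    }

    /** Small helper to build a word set. */
    static Set<String> words(String... ws) {
        return new HashSet<>(Arrays.asList(ws));
    }

    /** Rule 1 and Rule 2 as written in the quoted message. */
    static List<Rule> exampleRules() {
        Rule rule1 = new Rule("Rule 1",
                Arrays.asList(words("0001AAA", "0001AAC"), words("PSBP06", "PSBP07"), words("MFBP05")),
                words());
        Rule rule2 = new Rule("Rule 2",
                Arrays.asList(words("0001AAN", "0001AAC"), words("PSBP06"), words("PSBP08")),
                words("MFBP05"));
        return Arrays.asList(rule1, rule2);
    }

    public static void main(String[] args) {
        // Only the words shown for text file 1 in the quoted message.
        Set<String> file1 = words("0001AAA", "0001AAB", "0001AAC", "0061000", "PSBP06", "MFBP05");
        System.out.println(matchingRules(file1, exampleRules()));
    }
}
```

With millions of rules a linear scan like this would not scale. The point is only to show what the comparison against the per-rule count has to compute. Note that counting individual matching words rather than conditions would overcount when a file contains two words from the same OR group.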

On Thu, Jan 23, 2020 at 12:12 AM Karthick Sundaram <karthic...@trigent.com.invalid> wrote:

> Gentlemen:
>
> I am using Lucene as search engine for the below requirement:
>
> Millions of documents (text files) are there.
> Each text file has thousands of words (plain Strings with space separated).
>
> Example content of a text file 1 (just showing few words): 0001AAA 0001AAB 0001AAC 0061000 PSBP06 MFBP05 ...
> Example content of a text file 2 (just showing few words): 0001AAX 0001AAB 0001AAN 0061002 PSBP07 MFBP06 ...
>
> Then there are millions of rules captured in the database. For easy understanding, I specify couple of rules below:
>
> Rule 1:
> CONDITION 1: WITH: 0001AAA OR 0001AAC
> CONDITION 2: WITH: PSBP06 OR PSBP07
> CONDITION 3: WITH: MFBP05
>
> Rule 2:
> CONDITION 1: WITH: 0001AAN OR 0001AAC
> CONDITION 2: WITH: PSBP06
> CONDITION 3: WITH: PSBP08
> CONDITION 4: NOT WITH: MFBP05
>
> Requirement is, for a given rule, find the text files matching at least one word in each condition of the rule.
>
> I indexed the contents of each text file as a Lucene document with a Field "FileContents" and another field to just store the file name.
>
> So, for the Rule 1, I constructed query as (0001AAA OR 0001AAC) AND (PSBP06 OR PSBP07) AND (MFBP05)
> And for Rule 2, the query is (0001AAN OR 0001AAC) AND (PSBP06) AND (PSBP08) AND NOT (MFBP05).
>
> Queries are working and able to find the appropriate text files.
>
> Now, I have another requirement which is reverse of above requirement.
> i.e., For the given text file, I need to find the list of Rules that can match.
>
> Example: For the text file 1, the "Rule 1" should match, because the text file 1 has 0001AAA which satisfies condition 1, PSBP06 satisfies condition 2, and MFBP05 satisfies condition 3.
> Rule 1 has 3 conditions and at least one word in each condition matches for text file 1. So Rule 1 is good for text file 1.
>
> Rule 2 should not match for text file 1 because PSBP08 is not there in it.
>
> I don't know whether I can index the "Rule" information in Lucene. A rule can have 1 or more conditions, so I can't use fixed number of Fields to query on. Even if there are fixed number of fields, the query has to check for each field to match at least one word.
>
> Is it possible to handle this requirement using Lucene? Or should I go for other options?
>
> I am new to Lucene, any help would be appreciated.
>
> Thanks,
> Kart

--
Sincerely yours
Mikhail Khludnev