I am trying to get a particular search to work and it is proving
problematic. The actual source data is quite complex but can be
summarised by the following example:
I have articles that are indexed so that they can be searched. Each
article also has multiple properties associated with it which are also
indexed and searchable. When users search, they can get hits in either
the main article or the associated properties. Regardless of where a
hit is achieved, the article is returned as a search hit (i.e. the
properties are never a hit in their own right).
Now for the complexity:
Each property has security on it, which means that for any given user,
they may or may not be able to see the property. If a user cannot see
a property (based on its value for each article), they obviously do
not get a search hit in it. This security check is proprietary and
cannot be done using the typical mechanism of storing a role in the
index alongside the other fields in the document.
I currently have an index that contains the articles and properties
indexed separately (i.e. an article is indexed as a document, and each
property has its own document). When a search happens, a hit in
article A or a hit in any of the properties of article A should be
classed as a hit for article A alone, with the scores combined.
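
To make the intended roll-up concrete, here is a minimal sketch in plain
Java with no Lucene calls at all. RawHit, ArticleRollup and the canSee
predicate are hypothetical names used only for illustration, and summing
is just one assumed way of combining scores. The real security check is
proprietary and is represented here by the predicate.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of one raw index hit; this is not a Lucene type.
final class RawHit {
    final String articleId;   // article that this indexed document belongs to
    final String propertyId;  // null when the hit is in the article document itself
    final float score;

    RawHit(String articleId, String propertyId, float score) {
        this.articleId = articleId;
        this.propertyId = propertyId;
        this.score = score;
    }
}

final class ArticleRollup {
    // Collapse article and property hits into one entry per article.
    // Scores are summed (an assumption); property hits that the user
    // is not allowed to see contribute nothing.
    static Map<String, Float> rollUp(List<RawHit> hits, Predicate<RawHit> canSee) {
        Map<String, Float> byArticle = new LinkedHashMap<>();
        for (RawHit hit : hits) {
            boolean isProperty = hit.propertyId != null;
            if (isProperty && !canSee.test(hit)) {
                continue; // stand-in for the proprietary security check
            }
            byArticle.merge(hit.articleId, hit.score, Float::sum);
        }
        return byArticle;
    }
}
```

Calling ArticleRollup.rollUp(hits, securityCheck) would give a map from
article ID to combined score, with no property ever appearing as a hit
in its own right.
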
To achieve this originally, Lucene v1.3 was modified: BooleanQuery was
given a custom Scorer that applies the security check and treats two
hits in different documents as a hit in a single document. I am trying
to upgrade to the latest version (v2.3.2 - I am using Lucene.Net), but
ideally without having to modify Lucene in any way.
An additional problem occurs if I do an AND search. If an article
contains the word foo and one of its properties contains the word bar,
then searching for "foo AND bar" should return the article as a hit,
even though no single indexed document contains both words. My
current code deals with this inside the custom Scorer.
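
The AND behaviour I am after can be sketched in plain Java as follows.
This is not Lucene code. CrossDocumentAnd and intersect are hypothetical
names, and the sketch assumes each clause ("foo", "bar") has already
been evaluated on its own, security-filtered, and rolled up to a map of
article ID to score. Adding the clause scores together is also an
assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CrossDocumentAnd {
    // Each map holds one clause's results at article level:
    // article ID -> combined score for that clause.
    // An article survives only if every clause matched it somewhere,
    // whether in the article document or in one of its properties.
    static Map<String, Float> intersect(List<Map<String, Float>> clauseResults) {
        Map<String, Float> combined = new HashMap<>();
        if (clauseResults.isEmpty()) {
            return combined;
        }
        // Copy the first clause so the caller's maps are left untouched.
        combined.putAll(clauseResults.get(0));
        for (Map<String, Float> clause : clauseResults.subList(1, clauseResults.size())) {
            // Drop articles that this clause did not match.
            combined.keySet().retainAll(clause.keySet());
            // Add this clause's score for the articles that remain.
            for (Map.Entry<String, Float> entry : combined.entrySet()) {
                entry.setValue(entry.getValue() + clause.get(entry.getKey()));
            }
        }
        return combined;
    }
}
```

The point of the sketch is that the intersection happens on article IDs
rather than on index documents, which is the opposite of what a stock
conjunction over documents does.
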
Any ideas how/if this can be done?
I am thinking along the lines of using a custom HitCollector and
passing that into the search, but when doing the boolean search "foo
AND bar", execution never reaches my HitCollector as the
ConjunctionScorer filters out all of the results from the sub-queries
before getting there.
Thanks,
Adrian