On Mon, Jan 09, 2012 at 03:25:09PM -0500, Jason McIntosh wrote:
> My client requires the ability to search for blank fields. In other words,
> they'd like to construct search queries that parse as e.g. "Show me all
> results with the title 'MyTitle' but no description", and this would return
> as hits all items whose "title" field contains "MyTitle", and whose
> "description" field is completely blank.

Lucy doesn't support empty term queries.  You will have to achieve this using
an actual string, either by storing a sentinel value like "EMPTY" in the
'title' field itself, or by e.g. storing a "1" or "0" value in an auxiliary
field such as 'has_title'.
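
As a rough illustration of the two options, here is a minimal sketch in plain
Python rather than Lucy's API. The helper names with_sentinel and with_flag are
hypothetical, and the example applies them to the 'description' field from the
original question. The idea is to rewrite each document just before it is
handed to the indexer.

```python
SENTINEL = "EMPTY"  # assumed placeholder; must never occur as real content


def with_sentinel(doc, field):
    """Option 1: replace a blank field with a sentinel string."""
    out = dict(doc)
    if not (out.get(field) or "").strip():
        out[field] = SENTINEL
    return out


def with_flag(doc, field):
    """Option 2: keep the field as-is and record presence in 'has_<field>'."""
    out = dict(doc)
    out["has_" + field] = "1" if (out.get(field) or "").strip() else "0"
    return out


doc = {"title": "MyTitle", "description": ""}
print(with_sentinel(doc, "description"))
print(with_flag(doc, "description"))
```

Either way, "no description" becomes an ordinary term to search for. The flag
variant has the advantage that the sentinel cannot collide with real content.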
 
> These sort of null-searches don't appear to be an out-of-the-box feature of
> Lucy's default query parser; a naive query like "title:MyTitle AND
> description:" doesn't work as I hope.

It's not just the QueryParser -- it's also the fundamental Lucy inverted index
structure.  Each term which is present in an index is associated with a list
of doc ids, so you look up a term to see what documents match.  There is no
list of doc ids available for "has no value", so you have to create such a
list yourself artificially.
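
To make that concrete, here is a toy inverted index in plain Python. It is only
a sketch of the general idea and not how Lucy is implemented; build_index, the
whitespace tokenization, and the "EMPTY" token are all assumptions made for
illustration.

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each whitespace-separated term in `field` to a sorted list of doc ids."""
    postings = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for term in doc.get(field, "").split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = [
    {"title": "MyTitle", "description": "some text"},
    {"title": "MyTitle", "description": ""},
]

index = build_index(docs, "description")
# Doc 1 contributed no terms, so no posting list mentions it.
print(index)  # {'some': [0], 'text': [0]}

# Index an artificial term for blank fields and a posting list now exists.
patched = [dict(d, description=d["description"] or "EMPTY") for d in docs]
print(build_index(patched, "description").get("EMPTY"))  # [1]
```

The blank document is invisible in the first index because lookups start from
a term, and it never supplied one.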
    
(As an aside: If we were to implement such a feature, adding new fields on the
fly would become problematic, because we would have to go backfill values for
all existing docs.)

> So it looks like I will have to knit some custom magic here, or perhaps use
> an extension module. I write this list in the hope for some introductory
> pointers; I'm a little unsure where to start, or whether this is a solved
> problem and I'm not finding it.

You will certainly benefit from the last chapter in the tutorial,
Lucy::Docs::Tutorial::QueryObjects.

Cheers,

Marvin Humphrey
