Hi Erik,

I'm not sure exactly how much context you need here, so I'll try to keep it short and expand as needed.

The column I am faceting on contains a comma-delimited set of vectors. Each vector is made up of {Make,Year,Model}, e.g. _ford_1996_focus,mercedes_1996_clk,ford_2000_focus

I have a custom request handler: if I want to find all the cars from 1996, I pass in a facet query for the Year (1996), which is transformed into a wildcard facet query:

_*_1996_*

In other words, it'll match any record whose vector column contains, somewhere in its string, a car from 1996.

Why not put the Make, Year and Model in separate columns and do a facet query over multiple columns? Because once we've selected 1996, we should (in the above example) then be offering only "ford" and "mercedes" as further facet choices, and nothing more. If the parts were in their own columns, there would be no way to tie the Makes and Models to specific years.

At any rate, the wildcard search returns the entire matching value (_ford_1996_focus,mercedes_1996_clk,ford_2000_focus). I then have to run another RegExp over it to extract only the two parts that were from 1996 (the first ford and the mercedes). This isn't using SOLR's cache very effectively.
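
To make that second pass concrete, here is a rough Python sketch of the kind of extraction I mean. It is only an illustration: the function name is made up, it is not the code in my request handler, and it assumes the year only ever appears between two underscores within an entry. The [^,]* character class is what keeps each match inside a single comma-separated entry.

```python
import re

def extract_year_entries(column_value, year):
    """Return the comma-separated entries of column_value that contain _<year>_.

    Hypothetical helper for illustration only; assumes entries are laid out
    as make_year_model and never contain a comma themselves.
    """
    # [^,]* matches any run of characters except a comma, so a match
    # can never spill over into the neighbouring entry.
    pattern = re.compile(r"[^,]*_" + re.escape(str(year)) + r"_[^,]*")
    return pattern.findall(column_value)

value = "ford_1996_focus,mercedes_1996_clk,ford_2000_focus"
matches = extract_year_entries(value, 1996)
```

With the example value above, only the two 1996 entries come back and the 2000 entry is dropped.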

It would be excellent if SOLR could break up that comma-separated list into its three parts and run the RegExp over each, returning only those which match. Is that what you're implying with Analysis? If that were the case, I wouldn't need to worry about character exclusion.
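
As a sketch of the behaviour I'm hoping for (written in plain Python, outside SOLR, with a made-up function name), each entry would be treated as its own token, and only the tokens from the selected year would contribute further facet choices. The make_year_model layout and the stripping of stray leading underscores are assumptions on my part.

```python
from collections import Counter

def facet_makes_for_year(rows, year):
    """Count the makes of all entries from `year`, one token per entry.

    Hypothetical illustration; each row is one comma-separated column value
    and each entry is assumed to be laid out as make_year_model.
    """
    counts = Counter()
    for row in rows:
        for entry in row.split(","):
            # Split into at most three parts so a model name may itself
            # contain underscores.
            parts = entry.strip("_ ").split("_", 2)
            if len(parts) == 3 and parts[1] == str(year):
                counts[parts[0]] += 1
    return counts

rows = ["_ford_1996_focus,mercedes_1996_clk,ford_2000_focus"]
choices = facet_makes_for_year(rows, 1996)
```

Because every entry is matched on its own, there is no need for a character class that excludes commas.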

Sorry if that's a bit fuzzy... it's hard trying to explain enough to be useful, but not too much that it turns into an essay!!!

Thanks,
Ben

The solution I'm using is to form a vector

Erik Hatcher wrote:
Ben,

Could you post an example of the type of data you're dealing with and how you want it handled? I suspect there is a way to accomplish what you want using an analyzed field, or by preprocessing the data you're indexing.

    Erik

On Jun 29, 2009, at 9:29 AM, Ben wrote:

Hello,

I've been using SOLR for a while now, but am stuck for information on two issues:

1) Is it possible to exclude characters in a SOLR facet wildcard query?
e.g.
[^,]* to match any character except a ","?

2) Can one set up the facet wildcard query to return the exact substrings it matched in the queried facet, rather than the whole string?

I hope somebody can help :)

Thanks,

Ben

