On 13/06/14 00:55, Andy Seaborne wrote:

void setAllowLeadingWildcard(boolean allowLeadingWildcard)
Set to true to allow leading wildcard characters.
When set, * or ? are allowed as the first character of a PrefixQuery and
WildcardQuery. Note that this can produce very slow queries on big
indexes.

Default: false.

I've just added that to jena-text in svn.

That's great. It might be useful for me as well, though I'll have to test the performance. In my application, leading wildcards are currently handled with regex (or simpler string functions, which are slightly faster), but that tends to be slow.
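
To make the regex vs. string function comparison concrete, here is a minimal plain-Java sketch. It is not the application's actual code: the class name, method names and sample labels are made up for illustration. Both variants have to look at every value, which is why this kind of suffix filtering tends to be slow on large data regardless of which one is used.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration: two ways to keep only the labels ending in a given suffix.
public class SuffixFilterSketch {

    // Regex route: quote the suffix and anchor it at the very end of the input (\z).
    static List<String> byRegex(List<String> labels, String suffix) {
        Pattern p = Pattern.compile(Pattern.quote(suffix) + "\\z");
        return labels.stream()
                .filter(s -> p.matcher(s).find())
                .collect(Collectors.toList());
    }

    // Simple string function route: no pattern compilation or regex matching involved.
    static List<String> byEndsWith(List<String> labels, String suffix) {
        return labels.stream()
                .filter(s -> s.endsWith(suffix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> labels = List.of("indexing", "wildcard", "parsing", "regex");
        System.out.println(byRegex(labels, "ing"));    // prints [indexing, parsing]
        System.out.println(byEndsWith(labels, "ing")); // prints [indexing, parsing]
    }
}
```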

I can't say. I also would not try to, in this case. I expect performance
would be better using the filter.

A regex may well be better depending on the size and composition of the
lucene index.

If you need fast suffix queries, I think it's possible with jena-text + Solr, but I haven't tried it. In Solr, you can configure (in schema.xml) a ReversedWildcardFilterFactory that will store the terms reversed in the index (this will double the index size) and use that for fast suffix searches. See e.g. here:

http://docs.lucidworks.com/display/lweug/Wildcard+Queries
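
The idea behind storing reversed terms can be sketched in plain Java. This is only an illustration of the principle, not Solr's implementation: the class and method names are made up, and a sorted set stands in for the term index. A suffix query such as *ing becomes a prefix lookup for "gni" over the reversed terms, so only the matching range of the sorted terms is visited instead of every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration of the reversed-term trick; a TreeSet stands in for the index.
public class ReversedTermSketch {

    // Terms are stored reversed and kept in sorted order.
    private final TreeSet<String> reversedTerms = new TreeSet<>();

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void add(String term) {
        reversedTerms.add(reverse(term));
    }

    // "*suffix" is answered as a prefix scan for reverse(suffix).
    List<String> endingWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> hits = new ArrayList<>();
        // All entries sharing the prefix are contiguous, starting at the prefix itself.
        for (String r : reversedTerms.tailSet(prefix, true)) {
            if (!r.startsWith(prefix)) {
                break; // left the matching range, nothing further can match
            }
            hits.add(reverse(r)); // un-reverse to recover the original term
        }
        return hits;
    }

    public static void main(String[] args) {
        ReversedTermSketch index = new ReversedTermSketch();
        for (String t : List.of("indexing", "wildcard", "parsing", "regex")) {
            index.add(t);
        }
        // Hits come back in the sort order of the reversed terms.
        System.out.println(index.endingWith("ing")); // prints [parsing, indexing]
    }
}
```

The sketch keeps only the reversed form. An index that also has to answer ordinary prefix and term queries needs the original terms too, which is where the extra index size comes from.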

-Osma

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Teollisuuskatu 23)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi
