prefix matching
Hi all,

I'm trying to use prefixes to match similar strings to a query string. I have the following field type:

<fieldtype name="prefix" stored="true" indexed="true" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="10"/>
  </analyzer>
</fieldtype>

field:

<field name="wordPrefix" type="prefix" indexed="true" stored="true"/>

copyField:

<copyField source="word" dest="wordPrefix"/>

If I apply this to an indexed string, "ipod shuffle", and a query string, "shufle" (missing f), the analysis page shows matching terms for sh, shu and shuf:

Index Analyzer
  ipod shuffle
  ipod shuffle
  ipod shuffle
  ip ipo ipod sh shu shuf shuff shuffl shuffle

Query Analyzer
  shufle
  shufle
  shufle
  sh shu shuf shufl shufle

However, when I query with shufle I get no results:

http://localhost:8983/solr/select?q=wordPrefix%3Ashufle&fl=wordPrefix&qt=standard&debugQuery=on

<lst name="debug">
  <str name="rawquerystring">wordPrefix:shufle</str>
  <str name="querystring">wordPrefix:shufle</str>
  <str name="parsedquery">PhraseQuery(wordPrefix:"sh hu uf fl le shu huf ufl fle shuf hufl ufle shufl hufle shufle")</str>
  <str name="parsedquery_toString">wordPrefix:"sh hu uf fl le shu huf ufl fle shuf hufl ufle shufl hufle shufle"</str>

This post suggests that I need to set the position increment for my token filter, but I'm not sure how to do that or if it's possible:

http://www.lucidimagination.com/search/document/bc643c39f0b6e423/queryparser_and_ngrams#629b39ea39aa9cd4

Thoughts?

Thanks...Tom
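To make the analysis in this post concrete, here is a minimal Python sketch that mimics the edge n-gram step outside Solr. The `edge_ngrams` helper is hypothetical (it is not Solr code) and only approximates EdgeNGramFilterFactory with minGramSize=2 and maxGramSize=10; tokenization is simplified to lowercasing and splitting on whitespace. It shows which query grams exist in the index and why a query that requires every gram, as the PhraseQuery in the debug output does, would come back empty.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the leading-edge n-grams of a token, shortest first.

    Hypothetical stand-in for EdgeNGramFilterFactory; not Solr code.
    """
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# Index side: assume lowercasing plus whitespace splitting is close
# enough to the tokenizer and lowercase filter for this example.
indexed = set()
for token in "ipod shuffle".lower().split():
    indexed.update(edge_ngrams(token))

# Query side: the misspelled term goes through the same gram step.
query_grams = edge_ngrams("shufle")

shared = [g for g in query_grams if g in indexed]
missing = [g for g in query_grams if g not in indexed]

print("shared: ", shared)
print("missing:", missing)

# A query that accepts any gram could match; one that needs all grams cannot.
any_gram_matches = bool(shared)
all_grams_match = not missing
```

Under these assumptions the query shares sh, shu and shuf with the indexed text, while shufl and shufle never appear in the index, so any query form that demands all of the grams cannot match.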
number of matching documents incorrect during postOptimize
Hi all,

I'm trying to check that an import using the dataImportHandler was clean before I take a snapshot of the index to be pulled via snappuller to query nodes. One of the checks I do is to verify that a certain minimum number of documents are returned for a query. I do this in a script that I'm calling via the postOptimize hook. However, after a full import, the numFound results from the query are not accurate until after the postOptimize code completes, so my checks are failing.

Glancing at the code, this looks non-trivial to fix, as the hook call is pretty deep in the call stack. org.apache.solr.handler.dataimport.DataImporter.doFullImport execute eventually calls org.apache.solr.update.UpdateHandler.callPostOptimizeCallbacks

One option would be to spawn and background a new job to check the status, with an initial sleep to wait for the postOptimize that spawned it to finish. This is pretty ugly and could lead to some race conditions, but will probably work.

Any better recommendations on how to achieve this functionality?

Thanks...Tom
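The background-job idea described in this post could be sketched roughly as below, polling instead of relying on a single fixed sleep. Everything here is an assumption for illustration: the function names are hypothetical, the `/select` URL layout and `wt=json` are assumed to be available on the install in question, and the retry counts are arbitrary. It does not remove the race the post mentions; it only narrows it by retrying until the count becomes visible.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_num_found(base_url, query):
    """Ask Solr for the hit count only (rows=0) and return numFound.

    Assumes the JSON response writer is enabled (wt=json).
    """
    params = urllib.parse.urlencode({"q": query, "rows": 0, "wt": "json"})
    with urllib.request.urlopen(f"{base_url}/select?{params}") as resp:
        return json.load(resp)["response"]["numFound"]


def wait_for_min_docs(count_docs, minimum, attempts=10, delay=5.0, sleep=time.sleep):
    """Poll count_docs() until it reports at least `minimum` documents.

    Returns True as soon as the threshold is met, and False if every
    attempt came up short. `sleep` is injectable so the loop can be
    exercised without real waiting.
    """
    for attempt in range(attempts):
        if count_docs() >= minimum:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A backgrounded check might then call `wait_for_min_docs(lambda: fetch_num_found("http://localhost:8983/solr", "*:*"), minimum=1000)` and only take the snapshot when it returns True; the base URL, query and threshold are placeholders.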
1.3 DisMax and MoreLikeThis
Hi,

I wanted to use the new dismax support for more like this described in SOLR-295 (https://issues.apache.org/jira/browse/SOLR-295) but can't even get the new syntax for dismax to work (described in SOLR-281, https://issues.apache.org/jira/browse/SOLR-281). Any ideas if this functionality works?

Here's the relevant part of my solr config:

<requestHandler name="/genre" class="solr.StandardRequestHandler" defType="dismax">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">relatedExact^2 genre^0.5</str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>

Example query:

http://localhost:13280/solr/genre?indent=on&version=2.2&q=terrence+howard&start=0&rows=10&fl=*%2Cscore&wt=standard&debugQuery=on&explainOther=&hl.fl=

Debug output (I would expect to see dismax scoring):

<str name="Contributor8843">
11.151003 = (MATCH) sum of:
  6.925395 = (MATCH) weight(name:terrence in 63941), product of:
    0.7880709 = queryWeight(name:terrence), product of:
      10.0431795 = idf(docFreq=234, numDocs=1988249)
      0.07846827 = queryNorm
    8.787782 = (MATCH) fieldWeight(name:terrence in 63941), product of:
      1.0 = tf(termFreq(name:terrence)=1)
      10.0431795 = idf(docFreq=234, numDocs=1988249)
      0.875 = fieldNorm(field=name, doc=63941)
  4.2256074 = (MATCH) weight(name:howard in 63941), product of:
    0.6155844 = queryWeight(name:howard), product of:
      7.84501 = idf(docFreq=2116, numDocs=1988249)
      0.07846827 = queryNorm
    6.8643837 = (MATCH) fieldWeight(name:howard in 63941), product of:
      1.0 = tf(termFreq(name:howard)=1)
      7.84501 = idf(docFreq=2116, numDocs=1988249)
      0.875 = fieldNorm(field=name, doc=63941)

Here's my build info:

Solr Specification Version: 1.2.2008.06.02.15.21.48
Solr Implementation Version: 1.3-dev 662524M - tsmorton - 2008-06-02 15:21:48

Is this feature now broken or does it look like my config is wrong?

Thanks...Tom
Re: 1.3 DisMax and MoreLikeThis
Hi,

Thanks Yonik. That fixed that. It would be useful to change one of the existing dismax query types in the default solrconfig.xml to use this new syntax (especially since DisMaxRequestHandler is being deprecated).

Thanks again...Tom

On Wed, Jun 4, 2008 at 11:19 AM, Yonik Seeley wrote:

On Wed, Jun 4, 2008 at 11:11 AM, Tom Morton wrote:

I wanted to use the new dismax support for more like this described in SOLR-295 (https://issues.apache.org/jira/browse/SOLR-295) but can't even get the new syntax for dismax to work (described in SOLR-281, https://issues.apache.org/jira/browse/SOLR-281). Any ideas if this functionality works? Here's the relevant part of my solr config:

<requestHandler name="/genre" class="solr.StandardRequestHandler" defType="dismax">

defType is just another parameter and should appear in the defaults section below.

-Yonik

  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">relatedExact^2 genre^0.5</str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
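Since Yonik's reply describes defType as just another parameter, it can presumably also be supplied per request rather than only in the handler defaults. The following Python sketch builds such a request URL; the `dismax_url` helper is hypothetical, the host, port and handler name are taken from the example query in the thread, and it assumes the handler's defaults still provide qf and the other dismax settings.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def dismax_url(base_url, handler, user_query, **extra):
    """Build a request URL that passes defType=dismax as a plain parameter.

    Hypothetical helper for illustration; extra keyword arguments are
    added to the query string unchanged.
    """
    params = {"q": user_query, "defType": "dismax", "debugQuery": "on"}
    params.update(extra)
    return f"{base_url}/{handler}?{urlencode(params)}"


url = dismax_url(
    "http://localhost:13280/solr",  # host and port from the thread's example
    "genre",
    "terrence howard",
    rows=10,
    fl="*,score",
)
print(url)

# Decode the query string again to confirm what the server would receive.
sent = parse_qs(urlsplit(url).query)
print(sent["defType"])
```

With debugQuery=on, the parsedquery entry in the response should make it easy to see whether the dismax parser was actually used for the request.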
Boost support for MoreLikeThis fields
Hi,

SOLR-295 (https://issues.apache.org/jira/browse/SOLR-295) mentions boost support for MoreLikeThis and then seems to have been subsumed by SOLR-281 (https://issues.apache.org/jira/browse/SOLR-281). To be clear, I'm talking about boosts for the mlt.fl fields and how they are ranked, rather than for the seeding query. Has this feature gotten any attention?

Thanks...Tom