Rob,

You don't need to change the input of your lib-search call. You could instead
change lib-parser-custom.xqy so that it translates each word-query into the
or-query you wrote below. But that would affect every word-query you perform
with lib-search, unless you can distinguish the specific ones that need it.
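
A minimal sketch of that translation, in case it helps. The helper name
local:exact-or-stemmed is hypothetical and not part of lib-search, and I am
assuming the parser hands you the search term as a plain string, so please
check that against your own copy of lib-parser-custom.xqy:

```xquery
xquery version "1.0-ml";

(: Hypothetical helper, not part of lib-search: builds the or-query
   from Rob's mail for a single search term, so a term matches either
   as an exact token or through stemming. :)
declare function local:exact-or-stemmed($text as xs:string) as cts:query
{
  cts:or-query((
    cts:word-query($text, "exact"),
    cts:word-query($text, "stemmed")
  ))
};

(: Same search as in Rob's example, going through the helper. :)
cts:search(doc(), local:exact-or-stemmed("search"))
```

Wherever the parser currently constructs a cts:word-query, it could call such
a helper instead, so the change stays in one place rather than being spread
through the lib-search code.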

Perhaps someone from Mark Logic can reply to the performance part; it might
not be as bad as you think.

Kind regards,
Geert


Drs. G.P.H. Josten
Consultant


http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984
The information sent in or with this email message originates from
Daidalos BV and is intended exclusively for the addressee. If you have
received this message unintentionally, we request that you delete it. No
rights can be derived from this message.


> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Whitby, Rob, CMG
> Sent: Monday 30 March 2009 16:17
> To: General Mark Logic Developer Discussion
> Subject: RE: [MarkLogic Dev General] stemmed searches
>
> Yes that explains the problem well.
>
> Best solution so far is to rewrite all queries like this:
>
> cts:search(doc(),
>   cts:or-query((
>     cts:word-query('search', 'exact'),
>     cts:word-query('search', 'stemmed')
>   ))
> )
>
> We're using lib-search to generate complex queries on a lot
> of fields and facets, so making this change everywhere in the
> lib-search code won't be trivial. And then there's presumably
> a large performance impact, as it effectively doubles the
> number of queries.
>
> So I'm still planning on removing all the xml:lang attributes...
>
> Rob
>
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> David Sewell
> Sent: 30 March 2009 14:56
> To: General Mark Logic Developer Discussion
> Subject: RE: [MarkLogic Dev General] stemmed searches
>
> On Mon, 30 Mar 2009, Geert Josten wrote:
>
> >> I don't like this solution but can't think of anything else.
> >> Personally I think this is a poor feature of MarkLogic.
> >> Turning stemming on/off should not affect the content base
> >> searched. Everything should be searched, with content in the
> >> configured language gaining the benefits of stemming.
> >
> > Are you sure that stemming is affecting which documents are being
> > searched? It does of course affect how many results are found, but
> > since stemming won't work on Old English, you will need to enter
> > exactly matching tokens to find results in Old English texts.
> > Stemming should only increase the hit ratio, not decrease it.
>
> We have the same issue. It's more a problem of coding
> verbosity than anything else. We have stemmed searching set
> on our main document database. So given data like this
>
> <p xml:lang="eng">In an earlier stage of the Common law it was death.
> <foreign xml:lang="lat">si quis in aula regia pugnet, vel
> arma sua extrahat et capiatur...</foreign></p>
>
> because our default language is English, the following search
> returns null results:
>
>    cts:search(//p, "extrahat")
>
> as it is stemmed, and stemmed search works only on text in
> elements with @xml:lang = English. So the search must be rewritten as
>
>    cts:search(//p,
>       cts:word-query("extrahat", "exact")
>    )
>
> But then you lose the stemmed search, which you might want if
> the search term was "stage" for example. So either you have
> to "or" an exact and a stemmed query together in all your
> searches, or choose between one kind of search or the other.
>
>
> David
>
>
> --
> David Sewell, Editorial and Technical Manager ROTUNDA, The
> University of Virginia Press PO Box 801079, Charlottesville,
> VA 22904-4318 USA
> Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
> Email: [email protected]   Tel: +1 434 924 9973
> Web: http://rotunda.upress.virginia.edu/
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
>

