Hi Mano,

MarkLogic Server does not really have a concept of stop words, per se. A term 
is a term, and all the terms in a query are used to calculate relevance. The 
relevance is calculated based on the term frequency and the number of fragments 
in the database, so words that are typically thought of as "stop words" will 
not add much to the score of your search results.
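
To make that concrete, here is a rough sketch of a generic log(tf) * idf style 
weight. It is only an illustration in Python, not MarkLogic's actual scoring 
code, and the function name and fragment counts are made up for the example. 
The point is just that a term appearing in nearly every fragment ends up with 
a weight close to zero.

```python
import math


def term_weight(term_frequency, fragments_with_term, total_fragments):
    """Generic log(tf) * idf weight.

    An illustration only; this is NOT MarkLogic's actual scoring formula.
    """
    if term_frequency == 0 or fragments_with_term == 0:
        return 0.0
    # Inverse document frequency: near zero when almost every fragment has the term.
    idf = math.log(total_fragments / fragments_with_term)
    # Dampened term frequency, so repeated occurrences add less and less.
    return (1.0 + math.log(term_frequency)) * idf


# Made-up numbers: 100,000 fragments in the database.
total = 100_000
common = term_weight(5, 95_000, total)  # a word like "the"
rare = term_weight(5, 120, total)       # a much rarer word

print(f"common term: {common:.3f}  rare term: {rare:.3f}")
```

With numbers like these, the common word contributes a small fraction of what 
the rare word does, which is why stripping it changes the ranking very little.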

That being said, it is quite easy to have your application parse the query text 
before generating a cts:query.  For example, if your application gets its text 
from users via a text box in a browser, you can grab the text from the request 
and do an appropriate fn:replace on the string, removing some list of stop 
words.  I suspect for many stop word lists, the performance of this would be 
fine, assuming the list is not that large.  Depending on how your application 
is written, another approach might be to parse the query after you construct 
the cts:query, removing unwanted terms.  Each approach has advantages and 
disadvantages.
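
Here is a minimal sketch of the first approach, cleaning up the query text 
before any cts:query is built. I've written it in Python just to show the 
logic; in a MarkLogic application the same steps would be done in XQuery. The 
stop word list and the function name are hypothetical, and it splits the text 
into words rather than doing a regex replace, which avoids clobbering 
substrings of longer words.

```python
import string

# Hypothetical stop word list; a real application would supply its own.
STOP_WORDS = frozenset({"a", "an", "and", "of", "or", "the", "to"})


def strip_stop_words(query_text, stop_words=STOP_WORDS):
    """Return query_text with stop words removed, keeping word order."""
    kept = []
    for word in query_text.split():
        # Compare a lowercased, punctuation-trimmed copy of the word,
        # but keep the word exactly as the user typed it.
        normalized = word.strip(string.punctuation).lower()
        # Tokens that are pure punctuation normalize to "" and are dropped.
        if normalized and normalized not in stop_words:
            kept.append(word)
    # If every word was a stop word, fall back to the original text
    # rather than running an empty search.
    return " ".join(kept) if kept else query_text.strip()


print(strip_stop_words("The history of the Roman Empire"))
```

The fallback at the end is a design choice: a query made up entirely of stop 
words is passed through unchanged, since all terms still count toward 
relevance anyway.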

Another question to ask yourself is this: do you really need to remove the stop 
words?  The main reason to remove them (it seems to me) is to give more 
relevant answers, and I don't think it will end up making much difference for 
that.  You might find better ways of improving your relevance such as weighting 
some elements higher than others.

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of mano m
Sent: Tuesday, September 01, 2009 6:59 AM
To: [email protected]
Subject: [MarkLogic Dev General] "Stop words" using Marklogic

Hi,

We need to implement "stop words" in a search application using MarkLogic. 
Does MarkLogic support this through any API, or do we need to implement our 
own logic to achieve this?
Please share your ideas.
Thanks,
Mano



