On 1/8/09 7:47 AM, Uwe Baumbach wrote:
> Hi,
>
> is there a comprehensive, reliable, more profound description of the
> logical steps the internal search engine (or parser before the engine)
> undertakes to define:
> - what is recognized as a single word in an entered search string
> (blanks - OK, but what about slash, back slash, hyphen, period?) ?

Check MySQL's full-text search documentation; also try diving through
SearchMySQL.php to see how it breaks up the input when rendering its
output. Also check Language.php for the horrid search tweaking code.
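
To give a rough feel for the kind of splitting involved, here is a minimal Python sketch. It is not MediaWiki's or MySQL's actual code. It assumes the rule described in MySQL's full-text documentation (letters, digits and underscore are word characters, a single apostrophe may sit inside a word, everything else separates words) and the MyISAM default minimum word length of 4. The names split_words, WORD_RE and MIN_WORD_LEN are made up for illustration, and the sketch ignores stopwords, boolean-mode operators and anything Language.php does.

```python
import re

# Assumption: a "word" is a run of letters, digits or underscores, optionally
# joined by single apostrophes. This approximates the rule in MySQL's
# full-text documentation; it is not taken from SearchMySQL.php.
WORD_RE = re.compile(r"\w+(?:'\w+)*")

# Assumption: mirrors MySQL's default ft_min_word_len for MyISAM tables.
MIN_WORD_LEN = 4


def split_words(query, min_len=MIN_WORD_LEN):
    """Split a search string into candidate words, dropping short ones."""
    words = WORD_RE.findall(query)
    return [w for w in words if len(w) >= min_len]


if __name__ == "__main__":
    # Slash, backslash, hyphen and period all act as separators here.
    print(split_words("client/server back\\slash well-known index.php"))
```

Under these assumptions, "well-known" comes out as two words, and "php" is dropped for being shorter than the minimum length. A real installation may behave differently depending on server settings and MediaWiki version.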

> - what are "similar words" (closeness of words) ?

No such metric exists afaik.

> Different sources (www.mediawiki.org, xy.wikipedia.org/wiki/Help:Search, ...) 
> tell more or less and then different things too.

Note that Wikimedia's sites use a different search engine (MWSearch 
extension plus our Lucene-based backend), so descriptions of their 
behavior would not necessarily be what you want if you're looking for 
descriptions of the default MySQL backend. Note also that the PostgreSQL 
backend is different.

-- brion

_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
