#915: WebSearch: use index-time word breaking information during search time as
well
-------------------------+-----------------
 Reporter:  simko        |      Owner:
     Type:  enhancement  |     Status:  new
 Priority:  major        |  Milestone:
Component:  WebSearch    |    Version:
 Keywords:               |
-------------------------+-----------------
 On the demo site, when searching for "spectrum.", one gets a warning:

 {{{
 No exact match found for spectrum., using spectrum instead...
 }}}

 followed by two hits.

 Considering that the dot is stripped from terms at index time (see
 CFG_BIBINDEX_CHARS_ALPHANUMERIC_SEPARATORS,
 CFG_BIBINDEX_CHARS_PUNCTUATION and friends), it should not be necessary
 for the search engine to look for the dotted version at search time.

 The purpose of this ticket is to take advantage of
 CFG_BIBINDEX_CHARS_PUNCTUATION and friends at search time as well.  I.e.
 if a character is stripped away at index time, then strip it away at
 search time too, when looking for words.  (Not for phrases or regexps.)
 We can amend search_unit_in_bibwords to this effect, so that incoming
 search terms are washed in the same way as during the indexing process.
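
 A minimal sketch of the idea, not the actual WebSearch code.  The
 punctuation value below is only an assumed illustrative setting, the
 helper names wash_word_for_search and wash_search_unit are hypothetical,
 and the match type letters ('w' word, 'a' phrase, 'r' regexp) are an
 assumption.  Whether a stripped character should be deleted or treated as
 a word separator has to follow what the indexer really does; plain
 deletion is assumed here.

```python
import re

# Assumed illustrative value; the real one would come from the Invenio
# configuration (CFG_BIBINDEX_CHARS_PUNCTUATION).
CFG_BIBINDEX_CHARS_PUNCTUATION = r'[.,:;?!"]'

_re_punctuation = re.compile(CFG_BIBINDEX_CHARS_PUNCTUATION)


def wash_word_for_search(word):
    """Hypothetical helper: remove from a search word the punctuation
    characters that are assumed to be stripped at index time."""
    return _re_punctuation.sub('', word)


def wash_search_unit(pattern, match_type):
    """Hypothetical wrapper: wash word queries only.

    match_type is assumed to be 'w' for words, 'a' for phrases and
    'r' for regexps; phrases and regexps are returned untouched.
    """
    if match_type == 'w':
        return wash_word_for_search(pattern)
    return pattern
```

 With such washing applied before the word lookup, a word query for
 "spectrum." would be looked up as "spectrum" directly, while the same
 string searched as a phrase or regexp would be left unchanged.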

 Note that this may concern stemming and stopwords and such, but we have
 another ticket to take care of centralising indexing configurations, so
 further improvements could be dealt with there.  See ticket:852.

-- 
Ticket URL: <http://invenio-software.org/ticket/915>
Invenio <http://invenio-software.org>
