On Monday 17 November 2003 08:39, Chong, Herb wrote:
> the core of the search engine has to have certain capabilities, however,
> because they are next to impossible to add as a layer on top with any
> efficiency. detecting sentence boundaries outside the core search engine is
> really hard to do without building another search engine index. if i have
> to do that, there is no point in using Lucene.

It's also good to know what exactly constitutes the core; I would assume that 
analyzer implementations are not part of it per se, as long as the core knows how
to use analyzers. And as long as the index structure has some way to store the 
information needed (perhaps by using the existing notion of distances between 
tokens, which already allows both overlapping tokens and gaps, as someone 
suggested?), the core need not know the specifics of how analyzers determine 
structural boundaries (sentences and so on).
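
For illustration, here's a rough sketch of how an analyzer-side filter could 
encode sentence gaps purely through position increments, so the core only ever 
sees ordinary token positions. This uses the attribute-based TokenFilter API 
from much later Lucene versions than the one discussed here, and the class 
name, gap size, and the naive punctuation test are just placeholders I made up:

  import java.io.IOException;
  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
  import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

  /**
   * Hypothetical filter: bumps the position increment of the first token
   * after a sentence boundary, leaving an artificial gap that phrase and
   * span queries with a small slop will not cross.
   */
  public final class SentenceGapFilter extends TokenFilter {
      private static final int SENTENCE_GAP = 100; // assumed gap size

      private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
      private final PositionIncrementAttribute posIncAtt =
          addAttribute(PositionIncrementAttribute.class);
      private boolean pendingGap = false;

      public SentenceGapFilter(TokenStream input) {
          super(input);
      }

      @Override
      public boolean incrementToken() throws IOException {
          if (!input.incrementToken()) {
              return false;
          }
          if (pendingGap) {
              // widen the positional distance to the previous sentence
              posIncAtt.setPositionIncrement(posIncAtt.getPositionIncrement() + SENTENCE_GAP);
              pendingGap = false;
          }
          // naive boundary test: assumes a tokenizer that keeps trailing
          // punctuation (e.g. whitespace tokenization), purely for illustration
          String term = termAtt.toString();
          if (term.endsWith(".") || term.endsWith("!") || term.endsWith("?")) {
              pendingGap = true;
          }
          return true;
      }

      @Override
      public void reset() throws IOException {
          super.reset();
          pendingGap = false;
      }
  }

With something like this in the analysis chain, a phrase or span query whose 
slop is smaller than the gap simply cannot match across a sentence boundary, 
and the index format itself stays untouched.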

To me this seems like one of many issues where it's possible to retain the 
distinction between the Lucene kernel (the lean, mean core) and more specialized 
functionality; highlighting was another example.

-+ Tatu +-

