On Thu, Jan 17, 2013 at 5:48 AM, Nick Wellnhofer <[email protected]> wrote:
> It can be done, but it's not trivial and probably not very performant.
> First, you have to write your own Analyzer class in Perl. See the following
> threads for some guidance:
>
> http://mail-archives.apache.org/mod_mbox/lucy-user/201111.mbox/%[email protected]%3E
> http://mail-archives.apache.org/mod_mbox/lucy-user/201207.mbox/%[email protected]%3E
>
> We really need a cookbook entry describing how to write custom analyzers.

Putting up a cookbook entry on wiki.apache.org/lucy would be great.  I'd have
some misgivings about adding an entry to the official docs, though, because
subclassing Analyzer isn't officially supported.

(Background: Attempts to increase the speed of the current array-based
Analyzer system using memory pools to allocate Tokens fell short of
expectations; we need to see whether a stream-based implementation would be
superior, but that would require a different subclassing API.)

Marvin Humphrey
