I just published a compiled extension for Lucy on GitHub:
https://github.com/nwellnhof/LucyX-Analysis-WhitespaceTokenizer
It's a simple whitespace tokenizer that isn't meant to be used in
production; it serves as a sample extension for development. Here are
some notes on what's still to do:
Currently, we use the last component of the module name as the parcel
name. This results in very long symbol names in the case of
WhitespaceTokenizer. We should add a "parcel" build parameter to
Clownfish::CFC::Perl::Build, so we can use something shorter like
"WSToker".
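To give a feel for the difference in symbol length, here is a rough
sketch in C. It assumes that a symbol is built as the lowercased parcel
name, an underscore, the class name, another underscore, and the
function name, modeled on the "lucy_" prefix. The real scheme used by
the compiler may differ, and make_symbol() and the function name
"transform" are made up for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Build a symbol of the form "<prefix>_<Class>_<function>", where the
 * prefix is the lowercased parcel name.  This naming scheme is an
 * assumption made for illustration, modeled on the "lucy_" prefix.
 * Returns the length of the symbol, or -1 if buf is too small. */
static int
make_symbol(char *buf, size_t size, const char *parcel,
            const char *klass, const char *func) {
    assert(buf != NULL && parcel != NULL && klass != NULL && func != NULL);

    /* Copy the parcel name in lowercase to form the prefix. */
    size_t plen = strlen(parcel);
    if (plen >= size) { return -1; }
    for (size_t i = 0; i < plen; i++) {
        buf[i] = (char)tolower((unsigned char)parcel[i]);
    }

    /* Append "_<Class>_<function>" and check for truncation. */
    int rest = snprintf(buf + plen, size - plen, "_%s_%s", klass, func);
    if (rest < 0 || (size_t)rest >= size - plen) { return -1; }
    return (int)plen + rest;
}
```

Under these assumptions, the parcel "WhitespaceTokenizer" yields
"whitespacetokenizer_WhitespaceTokenizer_transform" (49 characters),
while "WSToker" yields "wstoker_WhitespaceTokenizer_transform" (37
characters).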
In WhitespaceTokenizer.cfh I had to add a __C__ block that includes
Lucy/Analysis/Inversion.h because the generated XS needs the
LUCY_INVERSION VTable. That's not ideal.
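For reference, the workaround amounts to an embedded C block along these
lines. This is only a sketch of the fragment in question, not the full
file; the exact position of the block within WhitespaceTokenizer.cfh is
an assumption.

```c
__C__
/* Needed so that the generated XS can see the LUCY_INVERSION VTable. */
#include "Lucy/Analysis/Inversion.h"
__END_C__
```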
As previously mentioned, all Lucy types used in WhitespaceTokenizer.cfh
have to be prefixed with "lucy_".
There's a subtle problem with XSLoader that only shows up when
running the tests. See the comment in WhitespaceTokenizer.pm.
It's very instructive to look at the code that's generated in autogen
when building the extension, especially autogen/source/parcel.c.
Nick