Thanks, Marvin. This should be enough for me to experiment with. On a related note, Lucy relies on Snowball for language support (normalization, stemming, stopwords), but Snowball supports only a limited set of languages. What would be the best way to add support for new languages? Creating a new module, in the same way that Snowball seems to be a module?
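To make the question concrete, here is a minimal sketch of the per-language pieces I have in mind (normalization and stopword removal), written as plain C with no Lucy or Clownfish types. The names normalize_token, is_stopword and filter_tokens and the stopword entries are placeholders of mine, not anything from Lucy or Snowball, and stemming is left out entirely.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Placeholder stopword list (assumption: a real module would ship one
 * list per language, most likely with non-ASCII entries as well). */
static const char *const STOPWORDS[] = { "a", "de", "do", "e", "em", "o", "que" };
static const size_t NUM_STOPWORDS = sizeof(STOPWORDS) / sizeof(STOPWORDS[0]);

/* Normalization step: lowercase ASCII letters in place.
 * Assumption: ASCII only; real input would need Unicode case folding. */
void
normalize_token(char *token) {
    assert(token != NULL);
    for (char *p = token; *p != '\0'; p++) {
        *p = (char)tolower((unsigned char)*p);
    }
}

/* Return 1 if the already-normalized token is in the stopword list, else 0. */
int
is_stopword(const char *token) {
    assert(token != NULL);
    for (size_t i = 0; i < NUM_STOPWORDS; i++) {
        if (strcmp(token, STOPWORDS[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Normalize every token, drop stopwords, and compact the survivors to the
 * front of the array. Returns the number of tokens kept. */
size_t
filter_tokens(char **tokens, size_t num_tokens) {
    assert(tokens != NULL || num_tokens == 0);
    size_t kept = 0;
    for (size_t i = 0; i < num_tokens; i++) {
        normalize_token(tokens[i]);
        if (!is_stopword(tokens[i])) {
            tokens[kept++] = tokens[i];
        }
    }
    return kept;
}
```

What I am unsure about is where logic like this should live: inside a custom Analyzer's Transform(), or in a separate module alongside Snowball.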

On Mon, Mar 30, 2015 at 22:59, Marvin Humphrey <[email protected]> wrote:

> On Mon, Mar 30, 2015 at 1:28 PM, Bruno Albuquerque <[email protected]> wrote:
> > This is mostly out of curiosity than anything else. If I have the
> > Lucy/Clownfish C headers and respective libraries, is it possible to roll
> > my own Analyzer and use it with Lucy? Is there any documentation about
> > doing something like that? It would be best if it did not involve having to
> > run cfc to achieve that, but I am ok with that if it can not be avoided.
>
> Subtyping from C is not officially supported because we have not worked out
> a stable API for it. Subclassing Analyzer is not officially supported but for
> a different reason: Lucy's array-based model for token processing appears to
> be inferior to a stream-based model and the API was redacted in order to give
> us the option of changing it.
>
> Nevertheless, I can provide you with undocumented hacks which achieve your
> ends. The interface is not yet elegant, but conversations like this will lead
> to improving it.
>
>     typedef struct MyAnalyzer MyAnalyzer;
>
>     // Transform() is the central Analyzer method. Check out the
>     // documentation in Analyzer.cfh and various implementations; that
>     // should give you enough to cargo cult your own version.
>     static Inversion*
>     S_MyAnalyzer_Transform_IMP(MyAnalyzer *self, Inversion *inversion) {
>         return (Inversion*)INCREF(inversion);
>     }
>
>     // Create a subclass at runtime.
>     static Class*
>     S_class_var(void) {
>         StackString *class_name = SSTR_WRAP_UTF8("MyAnalyzer", 10);
>         Class *klass = Class_fetch_class((String*)class_name);
>         if (!klass) {
>             klass = Class_singleton((String*)class_name, ANALYZER);
>             Class_Override(klass,
>                            (cfish_method_t)S_MyAnalyzer_Transform_IMP,
>                            LUCY_Analyzer_Transform_OFFSET);
>         }
>         return klass;
>     }
>
>     // Constructor.
>     MyAnalyzer*
>     MyAnalyzer_new(void) {
>         Class *klass = S_class_var();
>         return (MyAnalyzer*)Class_Make_Obj(klass);
>     }
>
> Marvin Humphrey
