On Thu, Feb 6, 2014 at 2:18 PM, Richard Eckart de Castilho <r...@apache.org> wrote:

> Yep - sounds right. I'd probably do the naming a bit differently though.
> What is input and output changes from component to component. So I'd use
> other names on the pipeline level, e.g.
>
> RAW_DATA
> EXTRACTED_TEXT
> TRANSLATED_TEXT
>
> I'd hardcode simple names like "INPUT" and "OUTPUT" in the components
> and then map these to the pipeline-level names:
>
> builder.add(TextExtractor,
>   "INPUT", "RAW_DATA",
>   "OUTPUT", "EXTRACTED_TEXT");
> builder.add(Parser,
>   CAS.NAME_DEFAULT_SOFA, "EXTRACTED_TEXT");
> builder.add(Translator,
>   "INPUT", "EXTRACTED_TEXT",
>   "OUTPUT", "TRANSLATED_TEXT");

That sounds like a good idea indeed.
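
To make the naming idea concrete, here is a minimal sketch in plain Java of how component-local names such as "INPUT" and "OUTPUT" could resolve to pipeline-level names. It does not use the UIMA or uimaFIT API: the class `ViewNameMapping` and its methods `mapping` and `resolve` are hypothetical, and letting unmapped names pass through unchanged is an assumption of this sketch, not a statement about how the framework behaves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of UIMA or uimaFIT.
public class ViewNameMapping {

    // Turns alternating (componentName, pipelineName) pairs into a lookup table.
    static Map<String, String> mapping(String... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("names must come in pairs");
        }
        Map<String, String> table = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            table.put(pairs[i], pairs[i + 1]);
        }
        return table;
    }

    // Resolves a component-local name to its pipeline-level name.
    // Assumption of this sketch: unmapped names pass through unchanged.
    static String resolve(Map<String, String> table, String localName) {
        return table.getOrDefault(localName, localName);
    }

    public static void main(String[] args) {
        Map<String, String> extractor =
            mapping("INPUT", "RAW_DATA", "OUTPUT", "EXTRACTED_TEXT");
        Map<String, String> translator =
            mapping("INPUT", "EXTRACTED_TEXT", "OUTPUT", "TRANSLATED_TEXT");

        // The extractor's OUTPUT and the translator's INPUT resolve to the same name.
        System.out.println(resolve(extractor, "OUTPUT"));
        System.out.println(resolve(translator, "INPUT"));
    }
}
```

The point of the design is that each component only ever refers to its own fixed names, while the pipeline decides which shared name each of them points to.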
> > I have to say this feature is quite interesting but in fact the type
> > systems are the components generating the real pain… :)
>
> What's the pain with the type systems?

In my experience over these past weeks, the most difficult parts to overcome when integrating a new annotator are the parameters and the type systems. Understanding the type systems and how they are used is not always straightforward. For an experienced developer this is not such a big problem, and uimaFIT helps quite a lot. ;-)

Cheers and thanks again!
--
Luca Foppiano
Software Engineer
+31615253280
l...@foppiano.org
www.foppiano.org