2009/5/19 Manuel Fiorelli <[email protected]>
> I would like to see a well-established way to analyze semi-structured
> documents, such as (X)HTML pages. UIMA shouldn't provide its own
> parser, but at least a type system (like uima.cas) to represent a DOM
> Document within a CAS instance (the simplest solution is to represent
> element nodes as feature structures and text nodes as annotations on
> the plain text, but I suspect there are more convenient solutions).
>
I do agree with this.

Tommaso
