Hi Nosiert,

2012/5/15 Nosiert Batiste (JIRA) <[email protected]>:
> I will try to work around this problem by simply converting everything to
> plain text.

Yes that's the best way to solve this for the moment. Apache Stanbol currently has no (good) support for annotating HTML sources. Maybe you would like to implement an enhancement engine that converts your HTML into plain text. This engine could run before the entity extraction engines come into play.

Best,
- Fabian

--
Fabian
http://twitter.com/fctwitt
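
For illustration, the HTML-to-plain-text step could look roughly like the sketch below. It is not Stanbol code and does not use any Stanbol API: it is a standalone Python sketch built only on the standard library's html.parser, and the names html_to_text and _TextExtractor, as well as the choice of which tags count as block-level, are assumptions made for this example. A real enhancement engine would wrap equivalent logic in Stanbol's own engine interface.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text from HTML (hypothetical helper, not a Stanbol class)."""

    # Content of these tags is never shown to a reader, so it is dropped.
    SKIP = {"script", "style"}
    # Assumed set of block-level tags; each one forces a line break.
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside <script> or <style>.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>Paris &amp; <b>Berlin</b></p><script>x=1</script>"
    print(html_to_text(sample))
```

The plain text produced this way would then be what the entity extraction engines see, instead of the raw markup.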
