Dear Stanbol community,

I have a use case where I need to extract metadata from structured documents (mostly HTML or XML) in which the position of an element determines its meaning. Since entity recognition is part of the task (the extracted strings should be resolved against vocabularies), I am considering Stanbol for this job.

My question is where a scraper (like scraperwiki.com) would fit in such an architecture. Should I implement a wrapper for the scraper as an enhancement engine? In that case, if an engine in the chain adds an annotation to the document, would the next engine in the chain be able to do entity recognition on that annotation? Or would you recommend a different approach?
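
To make the use case concrete, here is a minimal Python sketch of the kind of position-based extraction I mean. The sample markup, the field names and the extract_records helper are all made up for illustration and are not part of Stanbol or any scraper; the strings it produces are what I would then like to have resolved against vocabularies.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the markup carries no field names, only positions.
SAMPLE = """<table>
  <tr><td>Paris</td><td>Victor Hugo</td><td>1862</td></tr>
  <tr><td>London</td><td>Charles Dickens</td><td>1859</td></tr>
</table>"""

# Assumed mapping from column position to meaning.
FIELDS = ("place", "author", "year")


def extract_records(xml_text):
    """Return one dict per row, naming each cell by its column position."""
    root = ET.fromstring(xml_text)
    records = []
    for row in root.iter("tr"):
        # Missing cell text is treated as an empty string.
        cells = [(cell.text or "").strip() for cell in row.findall("td")]
        records.append(dict(zip(FIELDS, cells)))
    return records


records = extract_records(SAMPLE)
```

Each record is a plain dict of strings keyed by the assumed field names, so values such as the place or the author could be passed on for entity lookup.
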
Thanks, Barna