Hi Barnabas,
Could you provide more details about how your documents are structured
and what the format of the strings extracted by the scraper would be?
Regards,
Rafa
On 25/11/13 10:37, Barnabas Szasz wrote:
Dear Stanbol community,
I have a use case where I need to extract metadata from structures (mostly HTML
or XML) where the position determines the meaning. Since entity recognition is
part of the task (the extracted strings should be resolved against
vocabularies), I am considering Stanbol for this job.
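To make "the position determines the meaning" concrete, here is a minimal
Python sketch of position-based extraction. It assumes a hypothetical HTML
table whose first, second, and third columns hold a title, an author, and a
place. The sample markup, the column mapping, and all names are illustrative
assumptions, not part of any real document set or of Stanbol.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # list of rows, each a list of cell strings
        self._in_cell = False   # True while the parser is inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


# Hypothetical mapping: column index -> meaning of the value found there.
COLUMN_MEANING = {0: "title", 1: "author", 2: "place"}


def extract_records(html):
    """Return one dict per table row, keyed by the meaning of each column."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [
        {
            COLUMN_MEANING[i]: cell.strip()
            for i, cell in enumerate(row)
            if i in COLUMN_MEANING
        }
        for row in parser.rows
        if row
    ]


# Made-up sample input, for illustration only.
SAMPLE = """
<table>
  <tr><td>Dracula</td><td>Bram Stoker</td><td>London</td></tr>
  <tr><td>Ulysses</td><td>James Joyce</td><td>Paris</td></tr>
</table>
"""

records = extract_records(SAMPLE)
```

Each resulting record pairs a string with the meaning implied by its
position; these strings are the values that would then need to be resolved
against vocabularies.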
Now the question is where a scraper (like scraperwiki.com) would fit in such an
architecture. Should I implement a wrapper for the scraper as an enhancement
engine? In that case, if an engine in the chain adds an annotation to the
document, would the next engine in the chain be able to do entity recognition
on that annotation?
Or would you recommend a different approach?
Thanks,
Barna