That's it. Thank you!
I have already configured KeywordLinkingEngine when I used my own ontology.
I think I'm familiar with that and I will try that option too.
In the meantime I found another interesting problem. I tried to annotate
a document and a web page. For the web page, I tried
IOUtils.write(byte[], out), so I had to convert the URL content to byte[]:
public static byte[] GetBytesFromURL(String _url) throws IOException
{
    GetMethod get = new GetMethod(_url);
    InputStream t_is = get.getResponseBodyAsStream();
    byte[] buffer = new byte[1024];
    int count = -1;
    Reader t_url_reader = new BufferedReader(new InputStreamReader(t_is));
    byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
    return t_bytes;
}
But the problem is that I'm getting null for the InputStream.
Any ideas?
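One possible explanation, offered as a sketch rather than a verified fix: in
Commons HttpClient 3.x a GetMethod only has a response body after it has been
executed (for example via HttpClient.executeMethod(get)); before that,
getResponseBodyAsStream() returns null. The snippet below sidesteps HttpClient
and uses only JDK classes. The class name UrlBytes and the helper readAll are
hypothetical names chosen for illustration. It copies raw bytes instead of
going through a Reader, so the page's original encoding is left untouched.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

// Hypothetical helper class; not part of Stanbol or Commons IO.
class UrlBytes {

    // Copies everything from the stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int count;
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }

    // Opens the URL with the JDK's built-in handler and returns the raw bytes.
    static byte[] getBytesFromUrl(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return readAll(in);
        }
    }
}
```

The returned array could then be passed to IOUtils.write(byte[], out) as before.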
Best,
Srecko
-----Original Message-----
From: Rupert Westenthaler [mailto:[email protected]]
Sent: Wednesday, January 11, 2012 22:08
To: Srecko Joksimovic
Cc: [email protected]
Subject: Re: Annotating using DBPedia ontology
On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
> Hi Rupert,
>
> When I load localhost:8080/engines it says this:
>
> There are currently 5 active engines.
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>
> Maybe this could tell you something?
>
These are exactly the 5 engines that are expected to run with the default
configuration.
Based on this, the Stanbol Enhancer should work fine.
After looking at the text you enhanced, however, I noticed that it does not
mention any named entities such as Persons, Organizations and Places. So I
checked it with my local Stanbol version and also did not get any detected
entities.
So to check whether Stanbol works as expected, you should try another text
that mentions some Named Entities, such as
"John Smith works for the Apple Inc. in Cupertino, California."
If you also want to search for entities like "Bank", "Blog", "Consumer",
"Telephone", you also need to configure a KeywordLinkingEngine for dbpedia.
Part B of [3] provides more information on how to do that.
But let me mention that the KeywordLinkingEngine is more useful when used in
combination with your own domain-specific thesaurus rather than a global data
set like dbpedia. When used with dbpedia you will also get a lot of false
positives.
best
Rupert
[3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html