Thank you Olivier,
we'll go on to test OpenCalais / Zemanta / SalsaDev separately for the time being.

On 16-03-2011 11:20, Olivier Grisel wrote:
2011/3/14 Alex Lopez<[email protected]>:
Hi Stanbol devs,

we are working on a semantic app that should do content categorization
and other stuff.

So I have some code that takes detected entities as input (in form of
dbpedia urls) and looks up in dbpedia both YAGO categories and wikipedia
categories (the ones that used to be skos:subject and are now dct:subject).
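
For illustration only, the Wikipedia-category part of such a lookup could be expressed as a SPARQL query built from the entity URL. This is a minimal sketch and not the code mentioned above: the class name CategoryQuery and the method forResource are hypothetical, the YAGO lookup is left out, and sending the query to an endpoint is not shown.

```java
/** Hypothetical helper that builds a SPARQL query for the dct:subject categories of one resource. */
class CategoryQuery {

    /**
     * Returns a SELECT query listing every dct:subject value of the given resource.
     * Assumption: resourceUri is a well-formed absolute URI (no escaping is done here).
     */
    static String forResource(String resourceUri) {
        return "PREFIX dct: <http://purl.org/dc/terms/>\n"
             + "SELECT DISTINCT ?category WHERE {\n"
             + "  <" + resourceUri + "> dct:subject ?category .\n"
             + "}";
    }
}
```

The resulting string would then be sent to whichever SPARQL endpoint is in use; older data would need skos:subject in place of dct:subject.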

But of course this is tied to particular namespaces/services. I would like to
expand it and abstract away the particulars while retaining the functionality,
something like a wrapper:

input: resource/resources/text (raw content)
output: categories/topics

Keep in mind I'm not looking for categories of the sort
Person/Organisation... more of the sort Science/Jazz Musicians etc

So I've been following Stanbol's devs list for some time, and I'm excited
about the possibilities; maybe I can use some of it for this particular
requirement. Right now, I see it as a collection of services. For example, I
can see Zemanta doing what I want with DMOZ topics, included as an engine.
(Are there other engines doing something similar?)

But I wonder, is there a "central" place with methods/services doing this
for all implementations? or perhaps I misunderstood what the project is
about...

kind of a getTopicsForResource(){
    getDMOZ();
    getWikipediaCategories();
    getFreebaseCategories();
    ...
}
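
Spelled out a little further, the wrapper idea above might look like the following. This is only a sketch under assumptions: TopicProvider and TopicAggregator are hypothetical names, not part of Stanbol, and each source (DMOZ, Wikipedia categories, Freebase) would need its own implementation of the interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical interface: one implementation per topic source. */
interface TopicProvider {
    /** Returns the topics this source knows for the given resource URI. */
    Set<String> getTopics(String resourceUri);
}

/** Hypothetical aggregator that merges the topics of all registered providers. */
class TopicAggregator {
    private final List<TopicProvider> providers = new ArrayList<>();

    /** Adds a provider and returns this aggregator so calls can be chained. */
    TopicAggregator register(TopicProvider provider) {
        providers.add(provider);
        return this;
    }

    /** Queries every provider in registration order and drops duplicate topics. */
    Set<String> getTopicsForResource(String resourceUri) {
        Set<String> topics = new LinkedHashSet<>();
        for (TopicProvider provider : providers) {
            topics.addAll(provider.getTopics(resourceUri));
        }
        return topics;
    }
}
```

With this shape the caller only sees getTopicsForResource, and a new source can be added by registering another provider without changing the calling code.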

If not, what are good places to look for this kind of functionality so I can
include it in my method?

Now I understand this is an incubating project, so perhaps this is a planned
feature. Do you have a roadmap? Is there an expected date for a "first release"
of Stanbol?

Yes this is a planned feature. The existing
RelatedTopicEnhancementEngine is to be reimplemented to use the entity
hub index and to build predefined topic indexes out of the dbpedia
skos hierarchy and the fulltext of the related articles (to be able to
perform similarity queries using the MoreLikeThis feature of Solr).

We also need to extend the Stanbol vocabulary to handle topics that
are not entities.

In the meantime you can use OpenCalais / Zemanta / SalsaDev directly.
