Hi Ivo,

> So here are some questions:
>
> 1. What could be added to EntityHub from the knowledge listed above and
> what is the best (stable enough) way to add entities to Entity Hub: is it
> via REST or manually as described here:
> http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html (of
> course REST would be more interesting, but if it is not ready yet a manual
> approach would be good too)

I have already started to implement STANBOL-673 [1], which will bring a
new type of "Site" to the Entityhub that can be fully managed via the
RESTful API. This is tailored exactly to use cases like the one you
describe. In a second iteration I also plan to make all the
functionality of the Entityhub Indexing tool (as described in [2])
available for ManagedSites.

In the meantime you could use the RESTful services of the Entityhub
(http://{stanbol-server}/entityhub/entity).
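
As a rough sketch of what such a call might look like from Python
(standard library only): the helper name build_create_request, the
server URL, the 'id' query parameter and the sample RDF are assumptions
for illustration, so please check the parameters against the
documentation served by your own Stanbol instance. The sketch only
builds the request; sending it is left commented out.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with your own {stanbol-server}.
STANBOL = "http://localhost:8080"


def build_create_request(rdf_xml, entity_uri=None, base=STANBOL):
    """Build (but do not send) a POST request to the Entityhub entity endpoint."""
    url = base + "/entityhub/entity"
    if entity_uri is not None:
        # Assumption: the endpoint accepts the entity URI as an 'id' query parameter.
        url += "?" + urllib.parse.urlencode({"id": entity_uri})
    return urllib.request.Request(
        url,
        data=rdf_xml.encode("utf-8"),
        headers={"Content-Type": "application/rdf+xml"},
        method="POST",
    )


# Made-up example entity, serialized as RDF/XML.
SAMPLE_RDF = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/cms/object/42">
    <rdfs:label>Example Person</rdfs:label>
  </rdf:Description>
</rdf:RDF>
"""

req = build_create_request(SAMPLE_RDF, "http://example.org/cms/object/42")
# with urllib.request.urlopen(req) as resp:   # uncomment to actually send
#     print(resp.status)
```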

> 2. Would we need to tackle Onthology Manager in any way to organize the
> entities or this can be skipped? (this is the most vague thing I have
> encountered)

It really depends on what you want to do. If you just want to use the
Stanbol Enhancer (and the Entityhub for managing the Entities to
extract from parsed Content), then you will not need to use the
Ontology Manager, Reasoners and Rules components.

> 3. How to connect enhancer to use our entities? Is the Keywordlinking
> Engine way to go:
> http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.html
>  ?

I would expect so. The use cases [2] and [3] should provide all the
information necessary to help you with that decision. If not, feedback
on how to improve those is very welcome.


> 4. Would it be faster to use the same entity types as dbpedia (Person,
> Company,...) or introducing new ones should be straightforward?

Personally I would suggest using

* schema.org [4] for entity specific information - because you might be
able to use the schema.org mapping also for SEO
* SKOS [5] for describing the hierarchy and relations between concepts

Some additional comments/suggestions:

Note: I copied lines multiple times and also re-ordered them to better
fit to my recommendations.

> - objects are instantiated from a class (there is a collection of default
> classes but it is possible to add more)
> - class has collection of attributes
> - objects can have relations with other objects
> object1 is_a_class_of class1 (e.g. Person, Blog post, Folder, ... )

Content Objects that use classes representing Entities (e.g. Person in
the above list) would be very useful for adding to the Entityhub (or a
ManagedSite as soon as they are available). As mentioned above, I would
try to map those classes and attributes to schema.org.
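
To illustrate what such a mapping could look like, here is a minimal
sketch that turns one content object into N-Triples. The CMS class and
attribute identifiers, the two lookup tables and the function names are
hypothetical; only the schema.org terms themselves are real. Which
classes and attributes you map is entirely up to your data.

```python
# Hypothetical mapping from CMS class identifiers to schema.org types.
CLASS_TO_TYPE = {
    "person": "http://schema.org/Person",
    "company": "http://schema.org/Organization",
    "blog_post": "http://schema.org/BlogPosting",
}

# Hypothetical mapping from CMS attribute identifiers to schema.org properties.
ATTR_TO_PROPERTY = {
    "name": "http://schema.org/name",
    "email": "http://schema.org/email",
    "url": "http://schema.org/url",
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(uri, cms_class, attributes):
    """Serialize one content object; unmapped classes/attributes are skipped."""
    lines = []
    rdf_type = CLASS_TO_TYPE.get(cms_class)
    if rdf_type:
        lines.append(f"<{uri}> <{RDF_TYPE}> <{rdf_type}> .")
    for name, value in attributes.items():
        prop = ATTR_TO_PROPERTY.get(name)
        if prop is None:
            continue
        lines.append(f'<{uri}> <{prop}> "{escape_literal(value)}" .')
    return "\n".join(lines)


print(to_ntriples(
    "http://example.org/cms/object/42",
    "person",
    {"name": "Example Person", "internal_id": "42"},
))
```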

> - objects can be tagged with tags
> - tags are organized separately and can be hierarchical too
> object1 is_tagged with tag1
> tag1 is_a_parent_of tag2

Having a tag hierarchy AND tagged documents is ideal for
training/using the TopicEngine. See [6] for how to train/use this
engine. So this data can be used to provide an auto-tagging feature
for your CMS.
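
The quoted tag hierarchy (tag1 is_a_parent_of tag2) could be written
down in SKOS roughly as follows. This is only a sketch: the base URI,
the tag identifiers and the function name are made up, and it says
nothing about how the data is then fed to the TopicEngine (see [6] for
that).

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def tags_to_skos(base_uri, labels, parent_of):
    """Return N-Triples lines describing tags as SKOS concepts.

    labels:    tag id -> human readable label
    parent_of: child tag id -> parent tag id
    Labels are assumed not to contain quotes or backslashes.
    """
    triples = []
    for tag_id, label in labels.items():
        subject = f"<{base_uri}{tag_id}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .")
        triples.append(f'{subject} <{SKOS}prefLabel> "{label}" .')
    for child, parent in parent_of.items():
        # skos:broader points from the child tag to its parent tag.
        triples.append(
            f"<{base_uri}{child}> <{SKOS}broader> <{base_uri}{parent}> ."
        )
    return triples


# Hypothetical tags: tag1 is the parent of tag2.
triples = tags_to_skos(
    "http://example.org/cms/tag/",
    {"tag1": "Sports", "tag2": "Football"},
    {"tag2": "tag1"},
)
print("\n".join(triples))
```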

> - objects are stored hierarchically as nodes
> object1 is_a_parent_of object2
> object1 is_related_to object3
> object1 is_a_class_of class1 (e.g. Person, Blog post, Folder, ... )

best
Rupert

[1] https://issues.apache.org/jira/browse/STANBOL-673
[2] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
[3] http://incubator.apache.org/stanbol/docs/trunk/multilingual.html
[4] http://schema.org/docs/full.html
[5] http://www.w3.org/2009/08/skos-reference/skos.html
[6] http://dl.dropbox.com/u/5743203/IKS/ReviewMeeting2012/Topic-Classification.pdf


On Wed, Jul 4, 2012 at 10:54 AM, Ivo Lukač <[email protected]> wrote:
> Dear Stanbol Community,
>
> I work at Netgen, a small web agency mostly using eZ Publish CMS and we are
> also an eZ Publish  business partners for a long time. Together with
> another independent eZ consultant Paul Borgermans we are interested to add
> some semantic possibilities to it so we were in Salzburg last month to try
> to figure out what exactly could we do. My colleague Petar followed up the
> discussion with Mr. Suat Gonul after the event.
>
> After seeing some solutions already been made by early adapters and knowing
> the eZ Publish CMS architecture really well we have a potential idea on
> what we could do as a proof of concept. But we need your help to evaluate
> the idea and give some feedback.
>
> The idea is simple: to map the eZ taxonomy to Stanbol and use the
> enhancements to help the editor annotate the content.
>

> Of course, we could add more specific knowledge (depends on the specific
> project) but we would like to keep it as general as can be. What we could
> do easily is to generate an RDF with the data.
>
> So our main goal would be to push part of this knowledge to Stanbol and use
> it to enhance newly created objects not using dbpedia but rather internal
> data.
>
> 4. Would it be faster to use the same entity types as dbpedia (Person,
> Company,...) or introducing new ones should be straightforward?
>
> Any kind of feedback would be welcome and would be helpful in our
> evaluation....
>
> Best regards
>
>
> --
> Ivo Lukač
>
> Netgen d.o.o. - A.M.Tripala 3/I, 10000 Zagreb, Croatia
> web: http://www.netgen.hr, tel: +385 (0)1 3879722, mob: +385 (0)91 5251566
> ---------------------------------------------------------
> everyday tweets: http://twitter.com/ilukac
> company blog: http://www.netgen.hr/eng/blog
> professional profile: http://www.linkedin.com/in/ivolukac
> personal blog: http://ilukac.com/
> member of the CISEx board: http://www.exportboomers.com/
> presenting: http://ezsummercamp.com



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
