Hi Rupert,

Thanks for your feedback. My responses are inline below.

On Thu, Jul 5, 2012 at 6:40 AM, Rupert Westenthaler <[email protected]> wrote:

> Hi Ivo,
>
> > So here are some questions:
> >
> > 1. What could be added to EntityHub from the knowledge listed above and
> > what is the best (stable enough) way to add entities to EntityHub: is it
> > via REST or manually as described here:
> > http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
> > (of course REST would be more interesting, but if it is not ready yet a
> > manual approach would be good too)
>
> I have already started to implement STANBOL-673 [1] that will bring a
> new type of "Site" to the Entityhub that can be fully managed by the
> RESTful API. This is exactly tailored to use cases as described here.
> In a 2nd iteration of this I do also plan to make all the
> functionality of the Entityhub Indexing tool (as described in [2])
> available for ManagedSites.

Ok, as we are evaluating the project right now, we need something that
works now :)

> In the meantime you could use the RESTful services of the Entityhub
> (http://{stanbol-server}/entityhub/entity)

Ok, I found this link, http://dev.iks-project.eu:8081/entityhub, as a
reference. Are there any other docs about this?

> > 2. Would we need to tackle Ontology Manager in any way to organize the
> > entities or this can be skipped? (this is the most vague thing I have
> > encountered)
>
> It really depends what you want to do. If you just want to use the
> Stanbol Enhancer (and the Entityhub for managing the Entities to
> extract from parsed Content) then you will not need to use the
> Ontology Manager, Reasonings and Rules component.

Ok, that simplifies things as we need to tackle one module less. We don't
have much time to do it anyway.

> > 3. How to connect enhancer to use our entities? Is the Keywordlinking
> > Engine way to go:
> > http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.html
> > ?
>
> I would expect so.
> The use cases [2] and [3] should provide all
> necessary information to help you with that decision. If not, feedback
> on how to improve those is very welcome.

Ok.

> > 4. Would it be faster to use the same entity types as dbpedia (Person,
> > Company,...) or introducing new ones should be straightforward?
>
> Personally I would suggest to use
>
> * schema.org [4] for entity specific information - because you might be
>   able to use the schema.org mapping also for SEO
> * SKOS [5] for describing the hierarchy and relations between concepts

Ok, so we would be adding SKOS in RDF format to the Entityhub, and
optionally schema.org for entity-specific info.

> Some additional comments/suggestions:
>
> Note: I copied lines multiple times and also re-ordered them to better
> fit to my recommendations.
>
> > - objects are instantiated from a class (there is a collection of default
> > classes but it is possible to add more)
> > - class has collection of attributes
> > - objects can have relations with other objects
> > object1 is_a_class_of class1 (e.g. Person, Blog post, Folder, ... )
>
> Content Objects that use classes representing Entities (Person in the
> above list) would be very useful for adding to the Entityhub (or a
> ManagedSite as soon as they are available). As mentioned above I would
> try to map those classes and attributes to schema.org
>
> > - objects can be tagged with tags
> > - tags are organized separately and can be hierarchical too
> > object1 is_tagged with tag1
> > tag1 is_a_parent_of tag2
>
> Having a tag hierarchy AND tagged documents is ideal for
> training/using the TopicEngine. See [6] for how to train/use this
> engine. So this data can be used to provide an auto-tagging feature
> for your CMS.

Ok, that is interesting, we will check this out. Btw, is the Topic Engine
using a Solr/cluster feature internally?

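To make the SKOS point above a bit more concrete, here is a rough sketch of how a tag hierarchy ("tag1 is_a_parent_of tag2") might be written out as SKOS in Turtle. Everything in it is an assumption for illustration: the base URI, the function names and the input shape are hypothetical, not part of eZ Publish or Stanbol. It only produces the RDF text and does not talk to the Entityhub.

```python
"""Sketch: serialize a CMS tag hierarchy as SKOS in Turtle (names are hypothetical)."""

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."

# Hypothetical base URI for tags exported from the CMS.
BASE_URI = "http://example.org/tags/"


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def tags_to_skos(tags, lang="en"):
    """Render tags as skos:Concept resources.

    `tags` maps a tag id to a (label, parent_id or None) pair. A parent
    link is expressed with skos:broader on the child concept.
    """
    lines = [SKOS_PREFIX, ""]
    for tag_id, (label, parent_id) in sorted(tags.items()):
        props = [
            "a skos:Concept",
            f'skos:prefLabel "{escape_literal(label)}"@{lang}',
        ]
        if parent_id is not None:
            props.append(f"skos:broader <{BASE_URI}{parent_id}>")
        # One subject per tag, properties separated by ";" and closed by ".".
        lines.append(f"<{BASE_URI}{tag_id}> " + " ;\n    ".join(props) + " .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {"tag1": ("Sports", None), "tag2": ("Football", "tag1")}
    print(tags_to_skos(example))
```

The resulting text could then be sent to the Entityhub REST service mentioned earlier in the thread; the exact request format is not covered here and would need to be checked against the Entityhub documentation.
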
To wrap it up, we would need to:

- do the Stanbol installation and configuration
- define what entities we are going to push to Stanbol
- make a SKOS exporter for the CMS
- make a service that syncs the SKOS info to the Entityhub REST interface
- configure the Keyword Linking Engine to use our entities
- use the Enhancer REST API to get the enhancements via the Keyword Linking Engine
- optionally explore the Topic Engine

Are we missing a step, or could we define a project with this and make a
proof of concept?

Cheers

> > - objects are stored hierarchically as nodes
> > object1 is_a_parent_of object2
> > object1 is_related_to object3
> > object1 is_a_class_of class1 (e.g. Person, Blog post, Folder, ... )
>
> best
> Rupert
>
> [1] https://issues.apache.org/jira/browse/STANBOL-673
> [2] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
> [3] http://incubator.apache.org/stanbol/docs/trunk/multilingual.html
> [4] http://schema.org/docs/full.html
> [5] http://www.w3.org/2009/08/skos-reference/skos.html
> [6] http://dl.dropbox.com/u/5743203/IKS/ReviewMeeting2012/Topic-Classification.pdf
>
> On Wed, Jul 4, 2012 at 10:54 AM, Ivo Lukač <[email protected]> wrote:
> > Dear Stanbol Community,
> >
> > I work at Netgen, a small web agency mostly using eZ Publish CMS, and we
> > have also been an eZ Publish business partner for a long time. Together
> > with another independent eZ consultant, Paul Borgermans, we are interested
> > in adding some semantic possibilities to it, so we were in Salzburg last
> > month to try to figure out what exactly we could do. My colleague Petar
> > followed up the discussion with Mr. Suat Gonul after the event.
> >
> > After seeing some solutions already made by early adopters and knowing
> > the eZ Publish CMS architecture really well, we have a potential idea of
> > what we could do as a proof of concept. But we need your help to evaluate
> > the idea and give some feedback.
> >
> > The idea is simple: to map the eZ taxonomy to Stanbol and use the
> > enhancements to help the editor annotate the content.
> >
> > Of course, we could add more specific knowledge (depends on the specific
> > project) but we would like to keep it as general as can be. What we could
> > do easily is to generate an RDF with the data.
> >
> > So our main goal would be to push part of this knowledge to Stanbol and
> > use it to enhance newly created objects not using dbpedia but rather
> > internal data.
> >
> > 4. Would it be faster to use the same entity types as dbpedia (Person,
> > Company,...) or introducing new ones should be straightforward?
> >
> > Any kind of feedback would be welcome and would be helpful in our
> > evaluation....
> >
> > Best regards
> >
> > --
> > Ivo Lukač
> >
> > Netgen d.o.o. - A.M.Tripala 3/I, 10000 Zagreb, Croatia
> > web: http://www.netgen.hr, tel: +385 (0)1 3879722, mob: +385 (0)91 5251566
> > ---------------------------------------------------------
> > everyday tweets: http://twitter.com/ilukac
> > company blog: http://www.netgen.hr/eng/blog
> > professional profile: http://www.linkedin.com/in/ivolukac
> > personal blog: http://ilukac.com/
> > member of the CISEx board: http://www.exportboomers.com/
> > presenting: http://ezsummercamp.com
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen

--
Ivo Lukač
