Hi again,

Sorry to bug you, but we need the info to evaluate the project.
Could the project fit into these phases?

- do the Stanbol installation and configuration
- define which entities we are going to push to Stanbol
- make a SKOS exporter for the CMS
- make a service that syncs the SKOS info to the Entityhub REST interface
- configure the Keyword Linking Engine to use our entities
- use the Enhancer REST API to get the enhancements via the Keyword Linking Engine
- optionally explore the Topic Engine

Did we miss something?

On Thu, Jul 5, 2012 at 6:05 PM, Ivo Lukač <[email protected]> wrote:

> Hi Rupert,
>
> Thanks for your feedback. My response below.
>
> On Thu, Jul 5, 2012 at 6:40 AM, Rupert Westenthaler <[email protected]> wrote:
>
>> Hi Ivo,
>>
>> > So here are some questions:
>> >
>> > 1. What could be added to EntityHub from the knowledge listed above and
>> > what is the best (stable enough) way to add entities to Entity Hub: is it
>> > via REST or manually as described here:
>> > http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html (of
>> > course REST would be more interesting, but if it is not ready yet a manual
>> > approach would be good too)
>>
>> I have already started to implement STANBOL-673 [1] that will bring a
>> new type of "Site" to the Entityhub that can be fully managed by the
>> RESTful API. This is exactly tailored to use cases as described here.
>> In a 2nd iteration of this I do also plan to make all the
>> functionality of the Entityhub Indexing tool (as described in [2])
>> available for ManagedSites.
>
> Ok, as we are evaluating the project right now we need something that
> works now :)
>
>> In the meantime you could use the RESTful services of the Entityhub
>> (http://{stanbol-server}/entityhub/entity)
>
> ok, I found this link http://dev.iks-project.eu:8081/entityhub as a
> reference, are there any other docs about this?
>
>> > 2. Would we need to tackle Ontology Manager in any way to organize the
>> > entities or this can be skipped?
>> > (this is the most vague thing I have encountered)
>>
>> It really depends what you want to do. If you just want to use the
>> Stanbol Enhancer (and the Entityhub for managing the Entities to
>> extract from parsed Content) then you will not need to use the
>> Ontology Manager, Reasonings and Rules component.
>
> Ok, that simplifies things as we need to tackle one module less, we don't
> have much time to do it anyway.
>
>> > 3. How to connect enhancer to use our entities? Is the Keywordlinking
>> > Engine way to go:
>> > http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.html
>> > ?
>>
>> I would expect so. The use cases [2] and [3] should provide all
>> necessary information to help you with that decision. If not, feedback
>> on how to improve those is very welcome.
>
> Ok.
>
>> > 4. Would it be faster to use the same entity types as dbpedia (Person,
>> > Company,...) or introducing new ones should be straightforward?
>>
>> Personally I would suggest to use
>>
>> * schema.org [4] for entity specific information - because you might be
>>   able to use the schema.org mapping also for SEO
>> * SKOS [5] for describing the hierarchy and relations between concepts
>
> Ok. so we would be adding SKOS in RDF format to entityhub, and
> schema.org optionally for specific info.
>
>> Some additional comments/suggestions:
>>
>> Note: I copied lines multiple times and also re-ordered them to better
>> fit to my recommendations.
>>
>> > - objects are instantiated from a class (there is a collection of default
>> > classes but it is possible to add more)
>> > - class has collection of attributes
>> > - objects can have relations with other objects
>> > object1 is_a_class_of class1 (e.g. Person, Blog post, Folder, ... )
>>
>> Content Objects that use classes representing Entities (Person in the
>> above list) would be very useful for adding to the Entityhub (or a
>> ManagedSite as soon as they are available).
>> As mentioned above I would
>> try to map those classes and attributes to schema.org
>>
>> > - objects can be tagged with tags
>> > - tags are organized separately and can be hierarchical too
>> > object1 is_tagged with tag1
>> > tag1 is_a_parent_of tag2
>>
>> Having a tag hierarchy AND tagged documents is ideal for
>> training/using the TopicEngine. See [6] for how to train/use this
>> engine. So this data can be used to provide an auto-tagging feature
>> for your CMS.
>
> ok, that is interesting, we will check this out.
> Btw, is this Topic Engine using solr/cluster feature internally?
>
> To wrap it up, we would need to:
> - do the stanbol installation and configuration
> - define what entities we are going to push to stanbol
> - make a SKOS exporter for CMS
> - make a service that syncs the SKOS info to Entityhub REST interface
> - configure the keyword linking engine to use our entities
> - use the enhancer rest api to get the enhancements via keyword linking
>   engine
> - optionally explore the Topicengine
>
> Are we missing some step or could we define a project with this and make a
> proof of concept?
>
> Cheers
>
>> > - objects are stored hierarchically as nodes
>> > object1 is_a_parent_of object2
>> > object1 is_related_to object3
>> > object1 is_a_class_of class1 (e.g. Person, Blog post, Folder, ... )
>>
>> best
>> Rupert
>>
>> [1] https://issues.apache.org/jira/browse/STANBOL-673
>> [2] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>> [3] http://incubator.apache.org/stanbol/docs/trunk/multilingual.html
>> [4] http://schema.org/docs/full.html
>> [5] http://www.w3.org/2009/08/skos-reference/skos.html
>> [6] http://dl.dropbox.com/u/5743203/IKS/ReviewMeeting2012/Topic-Classification.pdf
>>
>> On Wed, Jul 4, 2012 at 10:54 AM, Ivo Lukač <[email protected]> wrote:
>> > Dear Stanbol Community,
>> >
>> > I work at Netgen, a small web agency mostly using eZ Publish CMS, and we
>> > have also been an eZ Publish business partner for a long time. Together with
>> > another independent eZ consultant Paul Borgermans we are interested to add
>> > some semantic possibilities to it so we were in Salzburg last month to try
>> > to figure out what exactly we could do. My colleague Petar followed up the
>> > discussion with Mr. Suat Gonul after the event.
>> >
>> > After seeing some solutions already made by early adopters and knowing
>> > the eZ Publish CMS architecture really well we have a potential idea on
>> > what we could do as a proof of concept. But we need your help to evaluate
>> > the idea and give some feedback.
>> >
>> > The idea is simple: to map the eZ taxonomy to Stanbol and use the
>> > enhancements to help the editor annotate the content.
>> >
>> > Of course, we could add more specific knowledge (depends on the specific
>> > project) but we would like to keep it as general as can be. What we could
>> > do easily is to generate an RDF with the data.
>> >
>> > So our main goal would be to push part of this knowledge to Stanbol and use
>> > it to enhance newly created objects not using dbpedia but rather internal
>> > data.
>> >
>> > 4. Would it be faster to use the same entity types as dbpedia (Person,
>> > Company,...) or introducing new ones should be straightforward?
>> >
>> > Any kind of feedback would be welcome and would be helpful in our
>> > evaluation....
>> >
>> > Best regards
>> >
>> > --
>> > Ivo Lukač
>> >
>> > Netgen d.o.o. - A.M.Tripala 3/I, 10000 Zagreb, Croatia
>> > web: http://www.netgen.hr, tel: +385 (0)1 3879722, mob: +385 (0)91 5251566
>> > ---------------------------------------------------------
>> > everyday tweets: http://twitter.com/ilukac
>> > company blog: http://www.netgen.hr/eng/blog
>> > professional profile: http://www.linkedin.com/in/ivolukac
>> > personal blog: http://ilukac.com/
>> > member of the CISEx board: http://www.exportboomers.com/
>> > presenting: http://ezsummercamp.com
>>
>> --
>> | Rupert Westenthaler [email protected]
>> | Bodenlehenstraße 11 ++43-699-11108907
>> | A-5500 Bischofshofen

--
Ivo Lukač

Netgen d.o.o. - A.M.Tripala 3/I, 10000 Zagreb, Croatia
web: http://www.netgen.hr, tel: +385 (0)1 3879722, mob: +385 (0)91 5251566
---------------------------------------------------------
everyday tweets: http://twitter.com/ilukac
company blog: http://www.netgen.hr/eng/blog
professional profile: http://www.linkedin.com/in/ivolukac
personal blog: http://ilukac.com/
member of the CISEx board: http://www.exportboomers.com/
presenting: http://ezsummercamp.com
