[
https://issues.apache.org/jira/browse/STANBOL-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fabian Christ updated STANBOL-51:
---------------------------------
Priority: Minor (was: Major)
> Get enhancements as RDFa
> ------------------------
>
> Key: STANBOL-51
> URL: https://issues.apache.org/jira/browse/STANBOL-51
> Project: Stanbol
> Issue Type: Improvement
> Components: FISE
> Reporter: Fabian Christ
> Priority: Minor
>
> Reported by henri.bergius, Jul 27, 2010
> If original content has been submitted to FISE as HTML5 (see #41), then FISE
> could provide the enhancements back as RDFa inside the original content.
> Comment 1 by project member rupert.westenthaler, Jul 27, 2010
> This would be only possible if the content type of the content is HTML.
> We would need to define how enhancements are represented as RDFa.
> e.g.
> <p> The meeting takes place in <span about="/TextAnnotation1">Helsinki
> <span property="dc:creator"
> content="eu.iksproject.fise.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine"></span>
> <span property="dc:type"
> content="http://dbpedia.org/ontology/Place"></span>
> <span ... add all the other properties
> </span>.</p>
> Another possibility would be to define URIs for enhancements (e.g.
> http://localhost:8080/store/<contentID>/<enhancementID>) and add only such
> URLs as RDFa annotations to the content.
> e.g.
> <p> The meeting takes place in <span
> about="http://localhost:8080/store/myContentItem/TextAnnotation1">Helsinki</span>.</p>
> This would have the advantage that the HTML does not get overloaded with
> RDFa annotations, because only the link to the enhancement is added. However,
> this is not very useful if FISE runs on localhost.
> A third possibility would be to add the results of the enhancement
> process directly as RDFa annotations,
> e.g. assuming that FISE has found the entity
> "http://www.dbpedia.org/page/Helsinki", that would result in the following
> RDFa annotation:
> <p> The meeting takes place in <span
> about="http://www.dbpedia.org/page/Helsinki">Helsinki
> <span property="dc:title" content="Helsinki"></span>
> <span property="rdf:type"
> content="http://dbpedia.org/ontology/Place"></span>
> </span>.</p>
> This produces RDFa annotations that could easily be used by the client;
> however, all the information about how such enhancements were created is
> lost. Maybe we could add an additional span with a property pointing to the
> ID of the enhancement.
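To illustrate how a client might consume annotations of the third kind, here is a minimal Python sketch. It is not a full RDFa processor: the names RdfaPropertyCollector and extract_triples are hypothetical, prefixes such as dc: are not resolved, and void elements such as <br> are not handled.

```python
from html.parser import HTMLParser


class RdfaPropertyCollector(HTMLParser):
    """Collect (subject, property, content) tuples from about/property/content attributes."""

    def __init__(self):
        super().__init__()
        self._subjects = []  # subject in effect for each currently open element
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        inherited = self._subjects[-1] if self._subjects else None
        # An "about" attribute sets the subject for the element and its descendants.
        subject = attr.get("about", inherited)
        self._subjects.append(subject)
        if "property" in attr and subject is not None:
            self.triples.append((subject, attr["property"], attr.get("content")))

    def handle_endtag(self, tag):
        # Simplification: assumes well-nested markup without void elements.
        if self._subjects:
            self._subjects.pop()


def extract_triples(html):
    """Return the list of triples found in an HTML fragment."""
    collector = RdfaPropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.triples


if __name__ == "__main__":
    sample = (
        '<p>The meeting takes place in '
        '<span about="http://www.dbpedia.org/page/Helsinki">Helsinki'
        '<span property="dc:title" content="Helsinki"></span>'
        '<span property="rdf:type" '
        'content="http://dbpedia.org/ontology/Place"></span>'
        '</span>.</p>'
    )
    for triple in extract_triples(sample):
        print(triple)
```

A real client would more likely use an existing RDFa parser; the sketch only shows that the entity URI and its properties can be recovered from the markup alone.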
> Comment 2 by project member rupert.westenthaler, Jul 27, 2010
> Here is an example for the third option that includes links to the
> enhancements:
> <p> The meeting takes place in <span
> about="http://www.dbpedia.org/page/Helsinki">Helsinki
> <span property="dc:title" content="Helsinki"></span>
> <span property="rdf:type"
> content="http://dbpedia.org/ontology/Place"></span>
> <!-- tells that this RDFa annotation was created by FISE (one may use a URI instead) -->
> <span property="dc:creator" content="FISE"></span>
> <!-- tells that this RDFa annotation is based on this enhancement (one may use a URI instead) -->
> <span property="dc:source" content="TextEnhancement1"></span>
> </span>.</p>
> This example assumes that the entity with the highest confidence was added
> for Helsinki, but there might also be other entity suggestions. Therefore it
> would make sense to implement a service that can be used to get additional
> information based on the content of the "dc:source" property.
> best
> Rupert
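The markup from Comment 2 could be produced along the following lines. This is a hedged Python sketch, not FISE code: the function annotate_entity and its parameters are hypothetical, and it assumes the entity with the highest confidence has already been selected.

```python
from html import escape


def annotate_entity(label, entity_uri, type_uri, enhancement_id, creator="FISE"):
    """Wrap a text label in an RDFa span describing the entity and its source enhancement."""

    def prop(name, value):
        # One empty span per property, with the value in the content attribute.
        return '<span property="{}" content="{}"></span>'.format(
            name, escape(value, quote=True)
        )

    return (
        '<span about="{}">{}'.format(escape(entity_uri, quote=True), escape(label))
        + prop("dc:title", label)
        + prop("rdf:type", type_uri)
        + prop("dc:creator", creator)         # who created the annotation
        + prop("dc:source", enhancement_id)   # link back to the enhancement
        + "</span>"
    )


if __name__ == "__main__":
    span = annotate_entity(
        "Helsinki",
        "http://www.dbpedia.org/page/Helsinki",
        "http://dbpedia.org/ontology/Place",
        "TextEnhancement1",
    )
    print("<p>The meeting takes place in " + span + ".</p>")
```

Escaping the attribute values keeps the output well-formed even when a label or URI contains quotes or ampersands.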
> Comment 3 by project member rupert.westenthaler, Jul 29, 2010
> Changed to type-Enhancement and status to accepted
> Status: Accepted
> Labels: -Type-Defect Type-Enhancement
> Comment 4 by project member christ.fabian, Sep 14, 2010
> This issue addresses [FR-220303].
> "IKS services shall annotate the content items with meta-data."
> Delete comment
> Comment 5 by [email protected], Oct 21, 2010
> Will it be possible to mark HTML5 sections which should not be enhanced?
> What about submitting HTML5 fragments?
> Comment 6 by project member rupert.westenthaler, Oct 22, 2010
> Adding support for additional media types and extracting metadata of
> submitted content is definitely on the road map.
> see also http://wiki.iks-project.eu/index.php/Workshops/EntityLinkingWorkshop
> Regarding sections that should not be enhanced: As far as I can remember,
> such a feature has not yet been discussed.
> Can you please provide some additional information, such as:
> What would be the actual use case?
> Are you thinking of annotations within the HTML document, or of additional
> information that describes which sections should be processed by FISE?
> Comment 7 by [email protected], Nov 03, 2010
> > Regarding sections that should not be enhanced ...
> > What would be the actual use case?
> This would enable CMS systems to (automatically) add enhancements at the very
> end of the HTML-generation process, after templates etc. have already been
> applied. There are likely to be sections in complete HTML pages where
> enhancements do not make sense or are unwanted, such as navigation sections or
> (perhaps) advertisements.
> Ideally FISE would even be able to determine those areas automatically,
> without requiring that they are marked beforehand (maybe by applying semantic
> technology ;-).
> > Do you think of annotations within the HTML Document
> > or additional information that describe what section
> > should be processed by FISE?
> Thinking about that... The markup to denote the irrelevant (or the relevant?)
> sections would be semantic. So RDFa would be ok if that is easy to add in
> templates.
> Comment 8 by project member rupert.westenthaler, Nov 03, 2010
> During the last project meeting in Istanbul the FISE team discussed necessary
> changes to provide better support for different content types as well as
> existing metadata (such as RDFa embedded in HTML or EXIF metadata in photos).
> See http://wiki.iks-project.eu/index.php/Workshops/EntityLinkingWorkshop for
> more details.
> With those additions in place it should be easy to support using RDFa to
> denote the irrelevant and/or the relevant sections of an HTML document.
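As a rough illustration of how marked sections might be skipped before enhancement, the following Python sketch collects only the text outside excluded elements. The thread does not define any markup for this, so the marker typeof="fise:NoEnhancement" and the names EnhanceableTextCollector and enhanceable_text are purely hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical marker; no such vocabulary term is defined in this issue.
EXCLUDE_TYPE = "fise:NoEnhancement"


class EnhanceableTextCollector(HTMLParser):
    """Collect text that lies outside elements marked as not to be enhanced."""

    def __init__(self):
        super().__init__()
        self._excluded = []  # one flag per open element: True inside an excluded section
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        inherited = self._excluded[-1] if self._excluded else False
        types = (dict(attrs).get("typeof") or "").split()
        self._excluded.append(inherited or EXCLUDE_TYPE in types)

    def handle_endtag(self, tag):
        # Simplification: assumes well-nested markup without void elements.
        if self._excluded:
            self._excluded.pop()

    def handle_data(self, data):
        if not (self._excluded and self._excluded[-1]):
            self.chunks.append(data)


def enhanceable_text(html):
    """Return the text of an HTML document with excluded sections removed."""
    collector = EnhanceableTextCollector()
    collector.feed(html)
    collector.close()
    return "".join(collector.chunks)


if __name__ == "__main__":
    page = (
        '<body><nav typeof="fise:NoEnhancement"><a href="/">Home</a></nav>'
        '<p>The meeting takes place in Helsinki.</p></body>'
    )
    print(enhanceable_text(page))
```

Marking the relevant sections instead of the irrelevant ones would only require inverting the flag logic.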
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.