Hi Rupert and Rafa,

Thank you both for the ideas. The workflow required to extract currency data and enhance the content is now clear.
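To make the detection step concrete, here is a minimal sketch of how I picture it. It is plain Python with a regular expression, not Stanbol code: a real engine would be written in Java against the Stanbol EnhancementEngine API, which this sketch does not touch. The function name, the lookup table and the regular expression are my own assumptions for illustration; the dictionary keys simply mirror the property names from Rupert's example quoted below.

```python
import re

# Hypothetical lookup table: surface form -> (currency code, DBpedia resource).
# Only the Euro appears in the thread; extending the table is left open.
CURRENCIES = {
    "EUR": ("EUR", "http://dbpedia.org/resource/Euro"),
    "EURO": ("EUR", "http://dbpedia.org/resource/Euro"),
    "EUROS": ("EUR", "http://dbpedia.org/resource/Euro"),
}

# A plain decimal number followed by a known currency word.
# The lookbehind skips numbers that are the tail of a longer token
# such as "1,000", so those are ignored rather than mis-extracted.
_UNITS = "|".join(sorted(CURRENCIES, key=len, reverse=True))
PATTERN = re.compile(
    r"(?<![\w.,])(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>" + _UNITS + r")\b",
    re.IGNORECASE,
)


def find_currency_mentions(text):
    """Return one dict per currency mention, keyed like the annotation properties."""
    mentions = []
    for match in PATTERN.finditer(text):
        code, resource = CURRENCIES[match.group("unit").upper()]
        mentions.append({
            "fise:selected-text": match.group(0),
            "fise:start": match.start(),   # character offset, inclusive
            "fise:end": match.end(),       # character offset, exclusive
            "fise:currency": resource,
            "fise:currency-code": code,
            "fise:currency-value": float(match.group("value")),
        })
    return mentions


if __name__ == "__main__":
    sample = "I had to spend additional 100 EURO just to arrive in time"
    for mention in find_currency_mentions(sample):
        print(mention)
```

In an actual engine each of these dictionaries would become one fise:TextAnnotation in the ContentItem's metadata graph, and a confidence value would have to come from whatever detection method is finally used.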

On Fri, Feb 14, 2014 at 1:57 PM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:

> Hi Tharindu
>
> As Rafa suggested you should generate fise:TextAnnotation [1]. You can
> use "http://dbpedia.org/ontology/Currency" as "dc:type" for the
> fise:TextAnnotation.
>
> If your engine can detect the currency and the value you should add
> that information by using additional properties on the annotation.
> There are already some ontologies that define such properties.
>
> I would suggest using the GoodRelations [2] 'gr:hasCurrency' and
> 'gr:hasCurrencyValue' properties. But there are also other
> possibilities, such as schema.org PriceSpecification [3], which defines
> the 'schema:price' and 'schema:priceCurrency' properties. It would also be
> possible to define Stanbol specific 'fise:currency' and
> 'fise:currency-value' properties.
>
> Finally it could make sense that the Engine does not use the String
> codes for referring to the currency but the dbpedia resource instead
> (e.g. <http://dbpedia.org/resource/Euro> instead of "EUR"). In that
> case the fise:TextAnnotation could have properties for both the resource
> and the code, e.g.
>
>     <urn:example:textannotation.1>
>         rdf:type fise:Enhancement, fise:TextAnnotation
>         dc:type dbo:Currency
>         fise:selected-text "100 EURO"@en
>         fise:start "150"^^xsd:int
>         fise:end "154"^^xsd:int
>         fise:selection-context "I had to spend additional 100 EURO just to arrive in time"@en
>         fise:confidence "0.92"^^xsd:double
>         fise:currency <http://dbpedia.org/resource/Euro>
>         fise:currency-code "EUR"
>         fise:currency-value "100.0"^^xsd:double
>

Thank you Rupert for the well elaborated example.

> BTW we should start a JIRA issue [4] for such an EnhancementEngine
> and use it to collect/document the discussions.
>

JIRA issue opened at [1]

> best
> Rupert

[1] https://issues.apache.org/jira/browse/STANBOL-1281

Thanks
-Tharindu

>
> [1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.html#fisetextannotation
> [2] http://semanticweb.org/wiki/GoodRelations
> [3] http://schema.org/PriceSpecification
> [4] https://issues.apache.org/jira/browse/STANBOL
>
> On Thu, Feb 13, 2014 at 5:23 PM, Rafa Haro <rh...@apache.org> wrote:
> > Hi Tharindu,
> >
> > Probably you would need to use a custom type for Currencies' Text
> > Annotations. I think that a good idea is to check the code of another
> > engine, for example the OpenNLP NER Extraction Engine
> > (https://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpcustomner)
> > and check how that engine creates TextAnnotations and how it assigns
> > the named entities' types. Following this example, the type mappings
> > could also be configurable for the engine.
> >
> > Hope that helps Thar,
> >
> > Cheers,
> > Rafa
> >
> > On 13/02/14 13:42, Tharindu Rusira wrote:
> >
> >> Hi Antonio, thanks for your reply.
> >>
> >> I have been studying the enhancement architecture for a while now. So in
> >> summary, what we will be doing is adding more data to the ContentItem's
> >> MGraph indicating where we found currency related information in the
> >> content. Am I right?
> >>
> >> On Thu, Feb 13, 2014 at 1:34 PM, Antonio David Perez Morales
> >> <ape...@zaizi.com> wrote:
> >>
> >>> Hi Tharindu.
> >>>
> >>> What do you mean?
> >>> The idea is that you can extract the currencies contained in the text
> >>> and they are added to the Enhancements graph as text annotations.
> >>> This way, the client sending a text can obtain the currencies in the
> >>> text and their positions inside the text.
> >>>
> >>> If you need some info about how to write a new enhancement engine I can
> >>> give you some hints.
> >>>
> >>> Regards
> >>>
> >>> On Thu, Feb 13, 2014 at 3:47 AM, Tharindu Rusira
> >>> <tharindurus...@gmail.com> wrote:
> >>>
> >>>> Hi everyone,
> >>>> I'm quite new to Stanbol and interested in Stanbol enhancement engines.
> >>>> In a previous mail [1] in the mailing list, I saw there is a requirement
> >>>> for a currency related data extraction engine. I am planning to proceed
> >>>> with an implementation of this idea.
> >>>> Can anybody please give me a simple example of how this enhancement
> >>>> engine would run on a sample text?
> >>>>
> >>>> [1] http://mail-archives.apache.org/mod_mbox/stanbol-dev/201311.mbox/%3CCAA7LAO0ekW8bvBVd-=r879g-upcgqhbzny9j5uvvglocj94...@mail.gmail.com%3E
> >>>>
> >>>> Thanks,
> >>>> -Tharindu
> >>>> --
> >>>> M.P. Tharindu Rusira Kumara
> >>>>
> >>>> Department of Computer Science and Engineering,
> >>>> University of Moratuwa,
> >>>> Sri Lanka.
> >>>> +94757033733
> >>>> www.tharindu-rusira.blogspot.com
> >>>>
> >>> --
> >>>
> >>> ------------------------------
> >>> This message should be regarded as confidential. If you have received
> >>> this email in error please notify the sender and destroy it immediately.
> >>> Statements of intent shall only become binding when confirmed in hard
> >>> copy by an authorised signatory.
> >>>
> >>> Zaizi Ltd is registered in England and Wales with the registration
> >>> number 6440931. The Registered Office is Brook House, 229 Shepherds Bush
> >>> Road, London W6 7AN.
> >>>
>
> --
> | Rupert Westenthaler             rupert.westentha...@gmail.com
> | Bodenlehenstraße 11             ++43-699-11108907
> | A-5500 Bischofshofen

--
M.P. Tharindu Rusira Kumara

Department of Computer Science and Engineering,
University of Moratuwa,
Sri Lanka.
+94757033733
www.tharindu-rusira.blogspot.com