Hi Tharindu

As Rafa suggested, you should generate a fise:TextAnnotation [1]. You can
use "http://dbpedia.org/ontology/Currency" as the "dc:type" of the
fise:TextAnnotation.

If your engine can detect the currency and the value, you should add
that information to the annotation using additional properties.
There are already some ontologies that define such properties.

I would suggest using the GoodRelations [2] 'gr:hasCurrency' and
'gr:hasCurrencyValue' properties. But there are other possibilities:
the schema.org PriceSpecification [3] defines 'schema:price' and
'schema:priceCurrency' properties. It would also be possible to define
Stanbol-specific 'fise:currency' and 'fise:currency-value' properties.
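
To compare the three options side by side, here is a minimal Python
sketch (not Stanbol code) that writes the same currency code and value
as N-Triples under each vocabulary. The GoodRelations and schema.org
namespaces are the published ones; 'fise:currency' and
'fise:currency-value' are only the proposal above and do not exist in
Stanbol yet. The names VOCABULARIES and currency_triples are made up
for this illustration.

```python
# Illustrative only: property URIs for each candidate vocabulary.
GR = "http://purl.org/goodrelations/v1#"
SCHEMA = "http://schema.org/"
FISE = "http://fise.iks-project.eu/ontology/"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"

# vocabulary name -> (property for the currency code, property for the value)
VOCABULARIES = {
    "goodrelations": (GR + "hasCurrency", GR + "hasCurrencyValue"),
    "schema.org": (SCHEMA + "priceCurrency", SCHEMA + "price"),
    # Proposed properties, not (yet) defined by Stanbol.
    "fise": (FISE + "currency", FISE + "currency-value"),
}


def currency_triples(annotation_uri, code, value, vocabulary="goodrelations"):
    """Return two N-Triples lines describing the currency code and value."""
    currency_prop, value_prop = VOCABULARIES[vocabulary]
    subject = "<%s>" % annotation_uri
    return [
        '%s <%s> "%s" .' % (subject, currency_prop, code),
        '%s <%s> "%s"^^<%s> .' % (subject, value_prop, float(value), XSD_DOUBLE),
    ]


if __name__ == "__main__":
    for name in VOCABULARIES:
        for line in currency_triples("urn:example:textannotation.1", "EUR", 100, name):
            print(line)
```

Whichever vocabulary is chosen, only the two property URIs change; the
rest of the annotation stays the same.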

Finally, it could make sense for the engine to refer to the currency
not by its string code but by the DBpedia resource
(e.g. <http://dbpedia.org/resource/Euro> instead of "EUR"). In that
case the fise:TextAnnotation could have properties for both the
resource and the code, e.g.

    <urn:example:textannotation.1>
        rdf:type fise:Enhancement, fise:TextAnnotation
        dc:type dbo:Currency
        fise:selected-text "100€"@en
        fise:start "150"^^xsd:int
        fise:end "154"^^xsd:int
        fise:selection-context "I had to spend additional 100€ just to arrive in time"@en
        fise:confidence "0.92"^^xsd:double
        fise:currency <http://dbpedia.org/resource/Euro>
        fise:currency-code "EUR"
        fise:currency-value "100.0"^^xsd:double
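
As a rough idea of how an engine could compute these values from plain
text, here is a small Python sketch. It is not the Stanbol API (a real
EnhancementEngine is written in Java and writes to the ContentItem's
metadata graph); the regular expression, the CURRENCIES table and the
function find_currency_mentions are assumptions made for illustration.
It only handles an amount directly followed by a symbol, as in "100€".

```python
import re

# Assumed symbol table: symbol -> (currency code, DBpedia resource).
# A real engine would need a much larger, configurable mapping.
CURRENCIES = {
    "€": ("EUR", "http://dbpedia.org/resource/Euro"),
    "$": ("USD", "http://dbpedia.org/resource/United_States_dollar"),
}

# An amount (optionally with decimals) followed by a known currency symbol.
AMOUNT = re.compile(r"(\d+(?:\.\d+)?)\s?([€$])")


def find_currency_mentions(text, context_chars=30):
    """Return one dict per mention, keyed by the annotation property names."""
    mentions = []
    for match in AMOUNT.finditer(text):
        code, resource = CURRENCIES[match.group(2)]
        context_start = max(0, match.start() - context_chars)
        context_end = match.end() + context_chars
        mentions.append({
            "fise:selected-text": match.group(0),
            "fise:start": match.start(),
            "fise:end": match.end(),
            "fise:selection-context": text[context_start:context_end],
            "fise:currency": resource,
            "fise:currency-code": code,
            "fise:currency-value": float(match.group(1)),
        })
    return mentions


if __name__ == "__main__":
    sample = "I had to spend additional 100€ just to arrive in time"
    for mention in find_currency_mentions(sample):
        print(mention)
```

The offsets are character positions in the input string, which is how
fise:start and fise:end are meant to be read in the example above.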

BTW we should open a JIRA issue [4] for such an EnhancementEngine
and use it to collect/document the discussions.

best
Rupert

[1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.html#fisetextannotation
[2] http://semanticweb.org/wiki/GoodRelations
[3] http://schema.org/PriceSpecification
[4] https://issues.apache.org/jira/browse/STANBOL

On Thu, Feb 13, 2014 at 5:23 PM, Rafa Haro <rh...@apache.org> wrote:
> Hi Tharindu,
>
> Probably you would need to use a custom type for Currencies' Text
> Annotations. I think a good idea is to check the code of another
> engine, for example the OpenNLP NER Extraction Engine
> (https://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpcustomner),
> and see how that engine creates TextAnnotations and how it assigns the named
> entities' types. Following this example, the type mappings could also be
> configurable for the engine.
>
> Hope that helps Thar,
>
> Cheers,
> Rafa
>
> On 13/02/14 13:42, Tharindu Rusira wrote:
>
>> Hi Antonio, thanks for your reply.
>>
>> I have been studying the enhancement architecture for a while now. So in
>> summary, what we will be doing is adding more data to the ContentItem's
>> MGraph indicating where we found currency related information in the
>> content. Am I right?
>>
>>
>>
>>
>> On Thu, Feb 13, 2014 at 1:34 PM, Antonio David Perez Morales <
>> ape...@zaizi.com> wrote:
>>
>>> Hi Tharindu.
>>>
>>> What do you mean?
>>> The idea is that you extract the currencies contained in the text and
>>> add them to the Enhancements graph as text annotations.
>>> This way, the client sending a text can obtain the currencies in the text
>>> and their positions inside the text.
>>>
>>> If you need some info about how to write a new enhancement engine I can
>>> give you some hints.
>>>
>>> Regards
>>>
>>>
>>> On Thu, Feb 13, 2014 at 3:47 AM, Tharindu Rusira
>>> <tharindurus...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>> I'm quite new to Stanbol and interested in Stanbol enhancement engines. In
>>>> a previous mail[1] in the mailing list, I saw there is a requirement for a
>>>> currency related data extraction engine. I am planning to proceed with an
>>>> implementation of this idea.
>>>> Can anybody please give me a simple example how this enhancement engine
>>>> would run on a sample text ?
>>>>
>>>> [1]
>>>> http://mail-archives.apache.org/mod_mbox/stanbol-dev/201311.mbox/%3CCAA7LAO0ekW8bvBVd-=r879g-upcgqhbzny9j5uvvglocj94...@mail.gmail.com%3E
>>>>
>>>> Thanks,
>>>> -Tharindu
>>>> --
>>>> M.P. Tharindu Rusira Kumara
>>>>
>>>> Department of Computer Science and Engineering,
>>>> University of Moratuwa,
>>>> Sri Lanka.
>>>> +94757033733
>>>> www.tharindu-rusira.blogspot.com
>>>>
>>
>>
>



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
