Hi Anuj

On Thu, Nov 28, 2013 at 1:51 PM, Anuj Kumar <anujs...@gmail.com> wrote:
> I second that. Regex will work better w.r.t. the default trained model of
> OpenNLP.

Both projects look interesting:

> Also, take a look at this extractor- https://code.google.com/p/heideltime/ and

As HeidelTime is licensed under GPLv3, you cannot use it directly to
implement an EnhancementEngine that is part of the Stanbol codebase.
Integrating it via a RESTful service would be an option.

> Stanford's tagger- http://nlp.stanford.edu/downloads/sutime.shtml#!

The same is true for SuTime, as all Stanford NLP components are under the GPL.

If we want to integrate those projects, I suggest extending the Stanbol
RESTful NLP protocol [1] and service [2] so that they can represent
date/time points and ranges. SuTime support could be added to the
already existing Stanbol-Stanford integration [3]. For HeidelTime one
would need to implement a similar component.


But before integrating those I would prefer to have a baseline engine
that is directly integrated in Stanbol. It looks like a regex-based
approach could be sufficient for that. WDYT Jayani?
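
To make the regex idea a bit more concrete, here is a minimal sketch of
what the core of such a baseline could look like. It is only an
illustration under assumptions: the class RegexDateSketch, its
DateMention holder and the two patterns are made up for this mail and
are not part of Stanbol or OpenNLP, it assumes Java 8+ for java.time,
and it only covers two date formats (ISO style and English
"Nov 28, 2013" style). It normalizes matches to an ISO 8601 date (the
lexical form of xsd:date); time-of-day and a full xsd:dateTime would
need further patterns.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch (not Stanbol API): regex-based date detection and normalization. */
class RegexDateSketch {

    /** A detected date: character offsets in the text plus the normalized ISO 8601 date. */
    static final class DateMention {
        final int start;
        final int end;
        final String isoDate;

        DateMention(int start, int end, String isoDate) {
            this.start = start;
            this.end = end;
            this.isoDate = isoDate;
        }
    }

    // ISO style, e.g. 2013-11-28
    private static final Pattern ISO_DATE =
            Pattern.compile("\\b(\\d{4})-(\\d{2})-(\\d{2})\\b");

    // English style, e.g. "Nov 28, 2013" or "November 28, 2013"
    private static final Pattern EN_DATE = Pattern.compile(
            "\\b(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?"
            + "|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?"
            + "|Dec(?:ember)?)\\.?\\s+(\\d{1,2}),\\s*(\\d{4})\\b");

    // Three-letter month abbreviations; (index / 3) + 1 gives the month number.
    private static final String MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec";

    /** Returns all valid dates found in the text, ordered by start offset. */
    static List<DateMention> extract(String text) {
        List<DateMention> found = new ArrayList<>();
        Matcher iso = ISO_DATE.matcher(text);
        while (iso.find()) {
            add(found, iso, Integer.parseInt(iso.group(1)),
                    Integer.parseInt(iso.group(2)), Integer.parseInt(iso.group(3)));
        }
        Matcher en = EN_DATE.matcher(text);
        while (en.find()) {
            int month = MONTHS.indexOf(en.group(1).substring(0, 3)) / 3 + 1;
            add(found, en, Integer.parseInt(en.group(3)), month,
                    Integer.parseInt(en.group(2)));
        }
        found.sort((a, b) -> Integer.compare(a.start, b.start));
        return found;
    }

    // Validates the calendar date; impossible dates such as 2013-02-30 are skipped.
    private static void add(List<DateMention> found, Matcher m,
                            int year, int month, int day) {
        try {
            String iso = LocalDate.of(year, month, day).toString();
            found.add(new DateMention(m.start(), m.end(), iso));
        } catch (DateTimeException e) {
            // The regex matched, but the values do not form a real date.
        }
    }

    public static void main(String[] args) {
        String text = "Sent Nov 28, 2013; due 2013-12-05, not 2013-02-30.";
        for (DateMention d : extract(text)) {
            System.out.println(d.start + "-" + d.end + " " + d.isoDate);
        }
    }
}
```

For the sample sentence in main() this yields two mentions, normalized
to 2013-11-28 and 2013-12-05; the impossible 2013-02-30 matches the
pattern but is dropped by the calendar check. A real engine would
still need locale-specific patterns and a mapping of the offsets to
enhancements.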

best
Rupert

[1] https://issues.apache.org/jira/browse/STANBOL-878
[2] https://issues.apache.org/jira/browse/STANBOL-892
[3] https://github.com/westei/stanbol-stanfordnlp

>
> It will be useful to have a similar temporal expression enhancement
> engine in Stanbol.
>
> Regards,
> Anuj
>
>
> On Thu, Nov 28, 2013 at 11:05 AM, Rupert Westenthaler <
> rupert.westentha...@gmail.com> wrote:
>
>> Hi Jayani,
>>
>> I was not even aware that there exists a Time model for OpenNLP.
>> Documentation shows that this uses a purely statistical model so I am
>> wondering about the quality. Note also that OpenNLP only provides a
>> prebuilt model for English [1].
>>
>> AFAIK OpenNLP will only provide you with the information that some
>> tokens represent a date. It will not provide you with the parsed
>> xsd:dateTime. So if you use this engine you will still need to
>> implement this part on your own. Most likely you will end up using
>> regex patterns to parse the actual time from the tokens marked by
>> OpenNLP as time.
>>
>> So I am wondering if it is not better to start with regex from the
>> beginning. If you search for "Regex Date Time extraction" you can
>> find a huge set of examples you could start from.
>>
>> best
>> Rupert
>>
>>
>> [1] http://opennlp.sourceforge.net/models-1.5/
>>
>>
>>
>> On Thu, Nov 28, 2013 at 5:15 AM, Jayani Withanawasam
>> <jayaniwithanawa...@gmail.com> wrote:
>> > Hi Dileepa,
>> >
>> > Thank you so much for your valuable feedback. I'm working on this.
>> >
>> >
>> > On Mon, Nov 25, 2013 at 9:00 PM, Dileepa Jayakody
>> > <dileepajayak...@gmail.com> wrote:
>> >
>> >> Hi Jayani,
>> >>
>> >> There are several enhancement engines in Stanbol that are based on
>> >> OpenNLP (opennlp-ner, opennlp-sentence, opennlp-pos... see [1]). Each
>> >> of these engines focuses on a particular enhancement aspect using
>> >> OpenNLP. Therefore I think it's better to write a new engine for
>> >> temporal extraction rather than extending the OpenNLP-NER engine.
>> >>
>> >> Thanks,
>> >> Dileepa
>> >>
>> >> [1] https://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/opennlp
>> >>
>> >>
>> >> On Mon, Nov 25, 2013 at 4:30 PM, Jayani Withanawasam <
>> >> jayaniwithanawa...@gmail.com> wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > I'm researching how to add a new enhancement engine for extracting
>> >> > date and time (temporal extraction) to Stanbol, as suggested by Rupert.
>> >> >
>> >> > I found that OpenNLP has an entity extraction unit for date and time.
>> >> > I also noticed that OpenNLP is already integrated into Stanbol in the
>> >> > NER engine.
>> >> >
>> >> > So, as I understand it, there are two options for extracting date and
>> >> > time.
>> >> >
>> >> > One is to have a separate enhancement engine for date and time
>> >> > information extraction. The other is to add date and time extraction
>> >> > as a code enhancement to the existing OpenNLP NER engine.
>> >> >
>> >> > What is your opinion on this? Is there any other approach that you
>> >> > think would be better?
>> >> >
>> >> > Thank you
>> >> > Jayani
>> >> >
>> >>
>>
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westentha...@gmail.com
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen