If you want to go rule-based, you should take a look at Apache UIMA Ruta.

It is possible to

a) use DKPro Core components inside the Ruta IDE, and
b) build a uimaFIT pipeline that contains DKPro Core components as well as
Ruta.

An example of b) can be found here:
https://dkpro-tutorials.googlecode.com/svn/GSCL2013/trunk/gscl2013-ruta/

The full UIMA / Ruta / DKPro Core tutorial from GSCL 2013 can be found here:

https://uima.apache.org/gscl13.html#gscl.workshop
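
To make the rule-based idea concrete for the labelled fields in the sample
email, here is a minimal sketch in plain Python. It is not Ruta and does not
use UIMA or DKPro Core; it only illustrates the kind of "label, colon, value"
rule one would then express in a Ruta script. The names FIELDS, FIELD_RULE,
extract_fields and SAMPLE are made up for this illustration, and the field
labels are simply copied from the sample email.

```python
import re

# Assumption: the labels below are copied from the sample email in this
# thread; a real deployment would need its own list.
FIELDS = [
    "From", "Sent", "To", "Subject", "Names",
    "Booking Reference", "Ticketing Airline", "Ticketnumbers",
]

# One rule: a known label at the start of a line, a colon, then the value
# running to the end of that line.
FIELD_RULE = re.compile(
    r"^(?P<label>" + "|".join(re.escape(f) for f in FIELDS) + r")"
    r":[ \t]*(?P<value>.*)$",
    re.MULTILINE,
)


def extract_fields(email_text):
    """Return {label: value} for each labelled line; the first match wins."""
    result = {}
    for match in FIELD_RULE.finditer(email_text):
        # Collapse runs of whitespace inside the value.
        value = " ".join(match.group("value").split())
        result.setdefault(match.group("label"), value)
    return result


# Excerpt of the sample email (the From line is left out here).
SAMPLE = """Sent: Monday, February 17, 2014 3:51 PM
To: Crump, Jenelle
Subject: myIDTravel Leisure Booking/Listing Rebooking

Greetings,

Names: DABLING, KALE LINCOLN  MR
Booking Reference:   UNRKCP
Ticketing Airline:   JetBlue
Ticketnumbers:       279-2107038822
"""

for label, value in extract_fields(SAMPLE).items():
    print(label, "=>", value)
```

A line-based rule like this only covers the structured part of the message;
the free-text part is where token-level rules over annotations would come in.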

Cheers,

-- Richard

On 01.12.2014, at 03:03, Srinivas Yerram <[email protected]> wrote:

> Hi,
>  
> I have been working with Apache UIMA and the DKPro SDK to parse email data. I 
> would like to extract the required info from given input email data (which 
> is in string format).
>  
> Ex : email data contains structured and unstructured data.
>  
> Sample email data is below, from which I would like to retrieve the From, Sent, 
> To, Subject, Name, Booking Reference, Ticketing Airline, Ticketnumbers, etc.
> ======================================================================
> From: [email protected] [mailto:[email protected]]
> Sent: Monday, February 17, 2014 3:51 PM
> To: Crump, Jenelle
> Subject: myIDTravel Leisure Booking/Listing Rebooking
>  
> Greetings,
>  
> Your leisure Travel rebooking was successful. Below you will find a new
> copy of your itinerary.
>  
> Names: DABLING, KALE LINCOLN  MR
> Booking Reference:   UNRKCP
> Ticketing Airline:   JetBlue
> Ticketnumbers:       279-2107038822
>  
>  
> I know this is possible with a rule file. But how do I define rules to apply to 
> tokens? Do we have any sample rule file format? I would like to apply the 
> rules to Stanford tokens.
>  
> Please provide a sample rule file format to extract the tokens from email data. 
> Thank you.
>  
> Regards,
> Srinivas Yerram
