Looks good. If possible, we should do this with OpenNLP, as it has an Apache
licence. However, I could not find an NLP regex implementation there. Please
look at it in detail.

--Srinath


On Tue, Aug 19, 2014 at 9:52 PM, Malithi Edirisinghe <[email protected]>
wrote:

>
> Hi All,
>
> We are working on an NLP Toolbox improvement in CEP. The main idea of this
> improvement is to use an NLP library and let users perform some NLP
> operations as Siddhi extensions.
>
> So in our implementation we have decided to support the following NLP
> operations.
>
> *1. findNameEntityType(sentence, entityType)*
>
> *Description:*
>
> This operation takes a sentence and a predefined entity type as its
> inputs. It will return the noun(s) in the sentence that match the defined
> entity type, as event(s).
>
> *inputs:*
>
> sentence   : sentence to be processed
> entityType : predefined entity type
>              ORGANIZATION
>              NAME
>              LOCATION
>
> *output:*
>
> matching noun(s) as event(s)
>
> *example:*
>
> inputs:
> sentence   : Alice works at WSO2
> entityType : NAME
>
> output: Alice
>
> *2. findNLRegexPattern(sentence, regex)*
>
> *Description:*
>
> This operation takes a sentence and a regular expression as its inputs.
> It will return each match in the sentence as an event.
>
> *inputs:*
>
> sentence : sentence to be processed
> regex    : regular expression to be matched
>
> *output:*
>
> matching phrase(s) as event(s)
>
> *example:*
>
> inputs:
> sentence : WSO2 was founded in 2005
> regex    : \\d{4}
>
> output: 2005
>
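To make the expected behaviour of operation 2 concrete, here is a minimal sketch using plain character-level java.util.regex. This is only an assumption about the semantics, not the planned implementation: a token-level "NLP regex" would match over tokens and their annotations instead of raw characters. The class and method names are hypothetical and not part of Siddhi or any NLP library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Siddhi extension.
class NlRegexSketch {

    // Returns every substring of the sentence that matches the regex,
    // in order of appearance. Each element would become one output event.
    static List<String> findMatches(String sentence, String regex) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // "\\d{4}" in Java source is the regex \d{4} (four consecutive digits).
        System.out.println(findMatches("WSO2 was founded in 2005", "\\d{4}"));
    }
}
```

With the example inputs from the proposal, the only four-digit run in the sentence is 2005, so a single event would be emitted.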
> *3. findRelationship(sentence, regex)*
>
> *Description:*
>
> This operation takes a sentence and a regular expression as its inputs.
> For each relationship extracted by the regular expression, the operation
> will return a triplet (subject, object, relationship) as an event.
>
> *inputs:*
>
> sentence : sentence to be processed
> regex    : regular expression to extract the relationship
>
> *output:*
>
> triplet(s) of (subject, object, relationship) as event(s)
>
> *example:*
>
> inputs:
> sentence : Bob works for WSO2
> regex    : works for
>
>  output: (Bob, WSO2, works for)
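As a rough illustration of the triplet shape for operation 3, the sketch below takes a deliberately naive approach: it treats the text before the regex match as the subject and the text after it as the object. That is an assumption made only to reproduce the example above; a real implementation would presumably rely on the NLP library's parse or dependency information. All names here are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Siddhi extension.
class RelationshipSketch {

    // Holds (subject, object, relationship) in the order used in the proposal.
    static final class Triplet {
        final String subject;
        final String object;
        final String relationship;

        Triplet(String subject, String object, String relationship) {
            this.subject = subject;
            this.object = object;
            this.relationship = relationship;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + object + ", " + relationship + ")";
        }
    }

    // Naive assumption: everything before the first match is the subject,
    // everything after it is the object. Returns empty if there is no match
    // or if either side of the match is blank.
    static Optional<Triplet> findRelationship(String sentence, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        if (!matcher.find()) {
            return Optional.empty();
        }
        String subject = sentence.substring(0, matcher.start()).trim();
        String object = sentence.substring(matcher.end()).trim();
        if (subject.isEmpty() || object.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new Triplet(subject, object, matcher.group()));
    }

    public static void main(String[] args) {
        findRelationship("Bob works for WSO2", "works for")
                .ifPresent(System.out::println);
    }
}
```

This handles only one relationship per sentence and breaks down on longer sentences, which is exactly where proper NLP support would be needed.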
>
> *4. findNameEntityTypeViaDictionary(sentence, dictionary, entityType)*
>
> *Description:*
>
> This operation takes a sentence, a dictionary file and a predefined entity
> type as its inputs. It will return the noun(s) in the sentence of the defined
> entity type that also exist in the dictionary, as event(s).
>
> *inputs:*
>
> sentence   : sentence to be processed
> dictionary : dictionary of entities of the defined entity type
> entityType : predefined entity type
>              ORGANIZATION
>              NAME
>              LOCATION
>
> *output:*
>
> matching noun(s) as event(s)
>
> *example:*
>
> inputs:
> sentence   : Bob works at WSO2
> dictionary : (WSO2,ORACLE,IBM)
> entityType : ORGANIZATION
>
> output: WSO2
>
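The dictionary lookup in operation 4 can be sketched without any NLP library, under two assumptions: the dictionary passed in already contains only entries of the requested entity type (so entityType is not a parameter here), and entries are single tokens. The tokenization is a crude split on non-alphanumeric characters, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not a Siddhi extension.
class DictionaryLookupSketch {

    // Returns the tokens of the sentence that appear in the dictionary,
    // in sentence order. Matching is exact and case-sensitive.
    static List<String> findInDictionary(String sentence, Set<String> dictionary) {
        List<String> found = new ArrayList<>();
        // Crude tokenizer: split on runs of characters that are neither
        // letters nor digits. A real tokenizer would handle more cases.
        for (String token : sentence.split("[^\\p{L}\\p{N}]+")) {
            if (dictionary.contains(token)) {
                found.add(token);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Set<String> organizations = new HashSet<>(Arrays.asList("WSO2", "ORACLE", "IBM"));
        System.out.println(findInDictionary("Bob works at WSO2", organizations));
    }
}
```

Multi-word entries such as company names with spaces would need phrase matching over the token sequence, which this sketch does not attempt.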
> Each NLP operation defined here will be implemented as a transformer
> extension to Siddhi.
> --
>
> *Malithi Edirisinghe*
> Senior Software Engineer
> WSO2 Inc.
>
> Mobile : +94 (0) 718176807
>  [email protected]
>



-- 
============================
Director, Research, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Member, Apache Software Foundation
Research Scientist, Lanka Software Foundation
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture