Looks good. If possible, we should do this with OpenNLP, as it is under the Apache license. However, I could not find an NLP regex implementation there. Please look into it in detail.

--Srinath

On Tue, Aug 19, 2014 at 9:52 PM, Malithi Edirisinghe <[email protected]> wrote:

> Hi All,
>
> We are working on an NLP Toolbox improvement in CEP. The main idea of this
> improvement is to use an NLP library and let users perform some NLP
> operations as Siddhi extensions.
>
> So in our implementation we have decided to support the following NLP
> operations.
>
> *1. findNameEntityType(sentence, entityType)*
>
> *Description:*
>
> This operation takes a sentence and a predefined entity type as its
> inputs. It will return the noun(s) in the sentence that match the defined
> entity type, as event(s).
>
> *inputs:*
>
> sentence : sentence to be processed
> entityType : predefined entity type
>     ORGANIZATION
>     NAME
>     LOCATION
>
> *output:*
>
> matching noun(s) as event(s)
>
> *example:*
>
> inputs:
>     sentence : Alice works at WSO2
>     entityType : NAME
>
> output: Alice
>
> *2. findNLRegexPattern(sentence, regex)*
>
> *Description:*
>
> This operation takes a sentence and a regular expression as its inputs.
> It will return each match in the sentence as an event.
>
> *inputs:*
>
> sentence : sentence to be processed
> regex : regular expression to be matched
>
> *output:*
>
> matching phrase(s) as event(s)
>
> *example:*
>
> inputs:
>     sentence : WSO2 was found in 2005
>     regex : \\d{4}
>
> output: 2005
>
> *3. findRelationship(sentence, regex)*
>
> *Description:*
>
> This operation takes a sentence and a regular expression as its inputs.
> For each relationship extracted by the regular expression, the operation
> will return a triplet (subject, object, relationship) as an event.
>
> *inputs:*
>
> sentence : sentence to be processed
> regex : regular expression to extract the relationship
>
> *output:*
>
> triplet(s) of (subject, object, relationship) as event(s)
>
> *example:*
>
> inputs:
>     sentence : Bob works for WSO2
>     regex : works for
>
> output: (Bob, WSO2, works for)
>
> *4. findNameEntityTypeViaDictionary(sentence, dictionary, entityType)*
>
> *Description:*
>
> This operation takes a sentence, a dictionary file and a predefined entity
> type as its inputs. It will return the noun(s) in the sentence of the
> defined entity type that also exist in the dictionary, as event(s).
>
> *inputs:*
>
> sentence : sentence to be processed
> dictionary : dictionary of entities of the defined entity type
> entityType : predefined entity type
>     ORGANIZATION
>     NAME
>     LOCATION
>
> *output:*
>
> matching noun(s) as event(s)
>
> *example:*
>
> inputs:
>     sentence : Bob works at WSO2
>     dictionary : (WSO2,ORACLE,IBM)
>     entityType : ORGANIZATION
>
> output: WSO2
>
> Each NLP operation defined here will be implemented as a transformer
> extension to Siddhi.
>
> --
>
> *Malithi Edirisinghe*
> Senior Software Engineer
> WSO2 Inc.
>
> Mobile : +94 (0) 718176807
> [email protected]
>

--
============================
Director, Research, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Member, Apache Software Foundation
Research Scientist, Lanka Software Foundation
Blog: http://srinathsview.blogspot.com
twitter: @srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
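
As a rough illustration of the input/output behaviour described for operations 2 and 4 in the quoted proposal, here is a minimal Python sketch using only the standard library. It is not Siddhi extension code and uses no NLP library: the function names `find_regex_matches` and `find_dictionary_entities` are hypothetical, the regex is a plain character-level one from `re` (the proposal may intend a token-level NLP regex), and the dictionary lookup skips the entity-type check that a real named-entity recognizer would perform.

```python
import re
from typing import Iterable, List


def find_regex_matches(sentence: str, pattern: str) -> List[str]:
    """Return every non-overlapping match of `pattern` in `sentence`, left to right.

    Stand-in for the proposed findNLRegexPattern: each returned string
    corresponds to one output event.
    """
    return [m.group(0) for m in re.finditer(pattern, sentence)]


def find_dictionary_entities(sentence: str, dictionary: Iterable[str]) -> List[str]:
    """Return the whitespace-separated tokens of `sentence` found in `dictionary`.

    Simplified stand-in for findNameEntityTypeViaDictionary: it only checks
    dictionary membership and does NOT verify the entity type (assumption).
    """
    known = set(dictionary)
    # Strip surrounding punctuation so that "WSO2." still matches "WSO2".
    tokens = [tok.strip(".,;:!?()\"'") for tok in sentence.split()]
    return [tok for tok in tokens if tok in known]


if __name__ == "__main__":
    # Examples taken from the proposal text.
    print(find_regex_matches("WSO2 was found in 2005", r"\d{4}"))
    print(find_dictionary_entities("Bob works at WSO2", ["WSO2", "ORACLE", "IBM"]))
```

Note that the proposal writes the pattern as `\\d{4}` (escaped as in a Java string literal); in a Python raw string the same pattern is `r"\d{4}"`.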
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
