Hi all,

I often need to analyse arbitrary unstructured text (email bodies, for 
instance) to discover structured information, and to extract:
- email addresses
- telephone numbers
- addresses
- events
- person names (according to a list of known persons)
- etc…

Apple does this in email, for instance (strangely, it is not generalized).


So my questions are:
- do we have something equivalent in Smalltalk/Pharo? (I didn’t find anything)
- if not, what strategy would you use?
=> For now I do really stupid text analysis (substrings, finding @, …, parsing 
according to the text structure when there is one… a kind of Soup parsing…)
=> I feel this is a job for PetitParser? It would also be a nice fit for the new 
GToolkit.
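
To make the naive strategy above concrete, here is a minimal sketch. It is 
written in Python purely for illustration (it is not Pharo code and not an 
existing library). The two patterns are deliberately simplified assumptions, 
and the names extract, EMAIL_RE, PHONE_RE and the sample persons are 
hypothetical:

```python
import re

# Simplified patterns (assumptions): they do not cover every valid
# email address or every national phone number format.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d .-]{7,}\d")


def extract(text, known_persons=()):
    """Return a dict of the structured items found in free text."""
    found = {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
        "persons": [],
    }
    for name in known_persons:
        # Whole-word, case-insensitive lookup against the list of known persons.
        if re.search(r"\b" + re.escape(name) + r"\b", text, re.IGNORECASE):
            found["persons"].append(name)
    return found


sample = "Call Alice Martin on +33 1 23 45 67 89 or write to alice@example.org."
result = extract(sample, known_persons=["Alice Martin", "Bob Durand"])
print(result)
```

Each kind of item is an independent recognizer run over the same text, which 
is roughly the shape a set of small PetitParser parsers could take as well. 
Addresses and events would probably need a real grammar rather than a single 
regular expression.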

All ideas or suggestions are welcome ;-)


TIA,

Cédrick 


