Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
> You say that named entity recognition is not generalised beyond Mail,
> but the support library is there for anyone to use. See for example
> https://developer.apple.com/documentation/foundation/nslinguistictagger/identifying_people_places_and_organizations

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
Hi Hernan,

Really nice. I’ll try it today. It might be what I need. I’ll come back if I have installation problems.

Cheers,
Cédrick

> On 8 March 2019 at 03:34, Hernán Morales Durand wrote:
>
> Hi Cédrick,
>
> I wrote some years ago an interface to a named-entity recognizer:

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
> Couldn't find anything in Smalltalk but that should give you ideas and
> inspire you or get you started...
>
> https://github.com/search?q=contact+scraping=Repositories
>
> I guess we have all that's needed in Pharo: parsers (HTML, XML,
> PetitParser), Soup & regex!

Yes for markup, I

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Richard O'Keefe
You say that named entity recognition is not generalised beyond Mail, but the support library is there for anyone to use. See for example https://developer.apple.com/documentation/foundation/nslinguistictagger/identifying_people_places_and_organizations In Python, you can use NLTK to do roughly

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Hernán Morales Durand
Hi Cédrick, Some years ago I wrote an interface to a named-entity recognizer: https://80738163270632.blogspot.com/2015/02/stner-interface-to-stanford-named.html I think that was Pharo 5, so you may want to check if there are load problems in current Pharo. The blogger post didn't parse

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
on about pharo is welcome > Subject: Re: [Pharo-users] Parsing text to discover general data of interest > (phone, email, address, ...) > > > > >> >> When you say "unstructured material ... is now a standard pattern in >> PetitParser », how co

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread PBKResearch
From: Pharo-users On Behalf Of Cédrick Béler Sent: 07 March 2019 11:20 To: Any question about pharo is welcome Subject: Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...) When you say "unstructured material ... is now a sta

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
> When you say "unstructured material ... is now a standard pattern in
> PetitParser", how could I begin exploring that? Any tutorials?

I’ll load it and play around. https://github.com/kursjan/petitparser2/blob/master/README.md

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
>> This also assumes that the items of interest are really structured; there
>> could be many ways of writing phone numbers, for instance.
>
> Phone numbers are actually not easy… I see them as a limited sequence of
> digits (if not well structured), possibly preceded by a +country code.
> I’d like
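The description of a phone number quoted above (a limited sequence of digits, possibly preceded by a +country code) can be sketched as a regular expression. This is only an illustration in Python, not something posted in the thread: the name `find_phone_numbers`, the limit of 7 to 15 digits, and the accepted separators (space, dot, hyphen) are all assumptions.

```python
import re

# Assumption: a phone number is an optional "+" country code (1 to 3 digits)
# followed by 7 to 15 further digits, with optional single separators
# (space, dot or hyphen) between digits.
PHONE_RE = re.compile(
    r"(?<![\w+])"            # not glued to a preceding word character or "+"
    r"(\+\d{1,3}[ .-]?)?"    # optional country code such as "+33 "
    r"\d(?:[ .-]?\d){6,14}"  # 7 to 15 digits, each optionally preceded by a separator
    r"(?!\w)"                # not glued to a following word character
)


def find_phone_numbers(text):
    """Return candidate phone numbers, normalised to digits with an optional leading '+'."""
    results = []
    for match in PHONE_RE.finditer(text):
        raw = match.group(0)
        digits = re.sub(r"\D", "", raw)  # drop separators and the "+"
        prefix = "+" if raw.startswith("+") else ""
        results.append(prefix + digits)
    return results


if __name__ == "__main__":
    sample = "Call me on +33 6 12 34 56 78 or 05.61.00.00.00 tomorrow."
    print(find_phone_numbers(sample))
```

A pattern this loose also accepts other digit runs, such as an ISO date like 2019-03-07, which is one reason phone numbers are "not easy": the candidates still need filtering by context or by per-country rules.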

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
etitParser », how could I begin exploring that ? Any tutorials ? Thanks Peter, Cédrick > > HTH > > Peter Kenny > > -Original Message- > From: Pharo-users On Behalf Of Cédrick > Béler > Sent: 07 March 2019 09:52 > To: Any question about pharo is welcome

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread PBKResearch
be many ways of writing phone numbers, for instance. HTH Peter Kenny -Original Message- From: Pharo-users On Behalf Of Cédrick Béler Sent: 07 March 2019 09:52 To: Any question about pharo is welcome Cc: Tudor Girba Subject: [Pharo-users] Parsing text to discover general data of interest

Re: [Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Benoit St-Jean via Pharo-users
Couldn't find anything in Smalltalk but that should give you ideas and inspire you or get you started... https://github.com/search?q=contact+scraping=Repositories I guess we have all that's needed in Pharo: parsers (HTML, XML, PetitParser), Soup & regex! On

[Pharo-users] Parsing text to discover general data of interest (phone, email, address, ...)

2019-03-07 Thread Cédrick Béler
Hi all,

I often need to analyse random unstructured text (email, for instance) to discover structured information and extract:
- email addresses
- telephone numbers
- addresses
- events
- person names (according to a list of known persons)
- etc.

Apple do it in email for instance
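As a rough illustration of two items on that list (email addresses, and person names checked against a list of known persons), here is a minimal Python sketch. It is not taken from the thread: the function name `extract_items`, the simplified email pattern, and the case-insensitive substring match for names are assumptions chosen for brevity.

```python
import re

# Assumption: a deliberately simplified email pattern, not a full
# RFC 5322 validator. It accepts common address shapes only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_items(text, known_persons):
    """Return the email addresses and known person names found in text."""
    emails = EMAIL_RE.findall(text)
    lowered = text.casefold()
    # Case-insensitive substring match against the caller's list of names.
    persons = [name for name in known_persons if name.casefold() in lowered]
    return {"emails": emails, "persons": persons}


if __name__ == "__main__":
    message = "Please write to Alice Martin (alice.martin@example.org) before Friday."
    print(extract_items(message, ["Alice Martin", "Carol Smith"]))
```

The substring match is the weakest part: a short name would also match inside a longer word, so a real tool would want word boundaries or a proper named-entity recognizer for names, addresses and events.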