Thanks Martin! I will go with Regex!

2015-08-21 21:11 GMT+02:00 Martin Wunderlich <martin...@gmx.net>:
> Hi Damiano,
>
> If you can do it with regexes, I would say go with that. It will be much
> easier and faster to implement than preparing the training data
> required for building a machine-learning model.
>
> You might also want to have a look at CLDR, the Common Locale Data
> Repository, which provides locale-specific support for parsing things like
> numbers: http://cldr.unicode.org/
>
> Finally, if you want to check the zip codes for validity, I am sure you
> can find a web service that provides this, depending on what country you’re
> in.
>
> Cheers,
>
> Martin
>
> > On 21.08.2015 at 20:17, Damiano Porta <damianopo...@gmail.com> wrote:
> >
> > Hello,
> > I am thinking about the best method to find zip codes and telephone
> > numbers inside my text.
> >
> > Zip codes must have 5 digits, and I also have a Dictionary with a list of
> > real zip codes for my country. So the first question is:
> >
> > Do I have to train a NER model, or use something like RegexNameFinder or
> > DictionaryNameFinder?
> >
> > Same question for telephone numbers: they have specific patterns, so
> > extraction is pretty easy with regex. But is this correct? Is a NER
> > model better here?
> >
> > Thank you!
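
For reference, the regex-plus-dictionary approach discussed in this thread could be sketched roughly as follows. This is only a minimal illustration using plain java.util.regex rather than the OpenNLP name finders. The class name ZipCodeFinder, the method findZipCodes, and the sample zip codes are all made up for the example, and the pattern assumes that a zip code is exactly five ASCII digits not adjacent to other digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP: finds 5-digit candidates with a
// regex and keeps only those present in a set of known zip codes.
public class ZipCodeFinder {

    // Exactly five digits, not preceded or followed by another digit,
    // so longer numbers such as "123456" are not matched in part.
    private static final Pattern FIVE_DIGITS =
            Pattern.compile("(?<!\\d)\\d{5}(?!\\d)");

    public static List<String> findZipCodes(String text, Set<String> knownZipCodes) {
        List<String> found = new ArrayList<>();
        Matcher matcher = FIVE_DIGITS.matcher(text);
        while (matcher.find()) {
            String candidate = matcher.group();
            // Dictionary check: discard 5-digit numbers that are not real zip codes.
            if (knownZipCodes.contains(candidate)) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample values only; a real list would be loaded from the dictionary file.
        Set<String> known = Set.of("12345", "54321");
        String text = "Ship to 12345 Sampletown, order no. 123456, old code 99999.";
        System.out.println(findZipCodes(text, known)); // prints [12345]
    }
}
```

Telephone numbers could be handled the same way with a country-specific pattern, which is not shown here because the thread does not state the format.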