Hi Damiano,

If you can do it with regexes, I would say go with that. It will be much easier and faster to implement than preparing the training data required to build a machine-learning model.

You might also want to have a look at CLDR, the Common Locale Data Repository, which provides locale-specific support for parsing things like numbers: http://cldr.unicode.org/

Finally, if you want to check the zip codes for validity, I am sure you can find a web service that provides this, depending on what country you’re in.
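To make the regex route concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the zipcode set is a made-up sample standing in for your real dictionary, the phone pattern is a guess at a plausible format (optional "+", then 9 to 16 digits, spaces or hyphens), and the function names are hypothetical. You would adapt the patterns to your country's actual conventions.

```python
import re

# Hypothetical sample; in practice, load your full dictionary of real zipcodes.
KNOWN_ZIPCODES = {"00184", "20121"}

# Exactly five digits, not embedded in a longer run of digits.
ZIP_RE = re.compile(r"(?<!\d)\d{5}(?!\d)")

# Assumed phone format: optional "+", a digit, then 7 to 14 digits,
# spaces or hyphens, ending in a digit.
PHONE_RE = re.compile(r"(?<![\d+])\+?\d[\d -]{7,14}\d(?!\d)")


def find_zipcodes(text):
    """Return 5-digit tokens that also appear in the zipcode dictionary."""
    return [m.group() for m in ZIP_RE.finditer(text)
            if m.group() in KNOWN_ZIPCODES]


def find_phones(text):
    """Return substrings that match the assumed phone pattern."""
    return [m.group() for m in PHONE_RE.finditer(text)]


if __name__ == "__main__":
    sample = "Call 06-1234-5678 or write to 00184 Roma; 99999 is not a real code."
    print(find_zipcodes(sample))
    print(find_phones(sample))
```

Filtering the regex hits against the dictionary is what rejects five-digit numbers that are not real zipcodes, which is roughly what combining a regex finder with a dictionary lookup would give you.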
Cheers,
Martin

> On 21.08.2015 at 20:17, Damiano Porta <damianopo...@gmail.com> wrote:
>
> Hello,
> I am thinking about the best method to find zipcodes and telephone numbers
> inside my text.
>
> Zipcodes must have 5 digits, and I also have a dictionary with a list of
> real zipcodes of my country. So the first question is:
>
> Do I have to train a NER model or use something like RegexNameFinder or
> DictionaryNameFinder?
>
> Same question for telephone numbers: they have specific patterns, so the
> extraction is pretty easy with regex, but is this correct? Is a NER
> model better here?
>
> Thank you!