On Fri, Mar 5, 2010 at 12:16 PM, Hassan Schroeder <[email protected]> wrote:
> On Thu, Mar 4, 2010 at 10:53 PM, Allan Last <[email protected]> wrote:
>
>> I run into problems where the name is close to the beginning of the
>> sentence:
>> Having John Smith over for dinner. --- This will look at "Having John"
>> Getting Jane Smith ready for school. --- This will look at "Getting
>> Jane"
>>
>> Do you know how to do a RegEx where it will ignore the first word
>> whenever three capitalized words are next to each other? Thanks!
>
> You know this is not something you're going to solve with regular
> expressions, though, right? :-)
>
> "San Francisco's Jane Smith, quoted in Broder's Washington Post
> article, said ..."
>
> You need a lot more heuristics than a simple RegEx to reliably find
> names in a block of text.
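
For what it's worth, the literal request (drop the first word whenever three capitalized words are adjacent) can be sketched in Ruby roughly as below. This is only an illustration of the naive approach under that one assumption; CAP_WORD, CAP_RUN and naive_names are made-up names for this sketch, not anything from Rails or from the earlier messages.

```ruby
# Hypothetical pattern: one capitalized word, e.g. "John".
CAP_WORD = /\b[A-Z][a-z]+\b/
# Hypothetical pattern: two or more capitalized words separated by whitespace.
CAP_RUN = /#{CAP_WORD}(?:\s+#{CAP_WORD})+/

# Hypothetical helper: returns naive "name" guesses found in text.
def naive_names(text)
  text.scan(CAP_RUN).map do |run|
    words = run.split
    # With three or more adjacent capitalized words, assume the first one
    # is only capitalized because it starts the sentence, and drop it.
    words.shift if words.length >= 3
    words.join(" ")
  end
end

naive_names("Having John Smith over for dinner.")
# => ["John Smith"]
naive_names("Getting Jane Smith ready for school.")
# => ["Jane Smith"]
naive_names("San Francisco's Jane Smith, quoted in Broder's Washington Post article, said ...")
# => ["San Francisco", "Jane Smith", "Washington Post"]
```

It copes with the two sentences from the original question, but on Hassan's sentence it happily reports "San Francisco" and "Washington Post" as names, which is exactly the problem with relying on capitalization alone.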

Some other cases to consider:

John Philip Sousa (or, if you're a kid at heart, John Jacob Jingleheimer Schmidt), not to mention Spanish names, which can have MANY parts.

Robert De Niro
Jesus Mary and Joseph

Surnames with origins in some languages don't start with a capital:

Michael Henry de Young - Dutch
Wernher von Braun - German

--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

