Or in our system we have entries like "Dean Suarez Smith".
In actuality, the customer states that "Suarez Smith" is a double last name, not a middle name and a last name.
And Dean is their title... or no, it's their first name...
Actually, Doctor can be a first name as well. It's a mess.

-----Original Message-----
From: Mecki Foerthmann <mec...@gmx.net>
To: u2-users <u2-users@listserver.u2ug.org>
Sent: Wed, Dec 14, 2011 5:09 am
Subject: Re: [U2] Extract first and last name from free-form name

Just face it - it can't be done!
So what if Dean has 2 first names and is a plumber?

On 14/12/2011 09:57, Symeon Breen wrote:
> You need to do a proper lexical analysis in order to work these out.
>
> For example:
>
> Input: Dean Foster
> Lex:   title word
>
> Input: Dean Reginald McGraw
> Lex:   title word word
>
> Then set rules to say a lex of "title word" is probably worked out as
> "forename surname" and a lex of "title word word" is probably "title
> forename surname". You can assign probabilities against these and build
> some self-learning in.
>
> It is a whole massive topic.
>
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Mecki Foerthmann
> Sent: 14 December 2011 08:22
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] Extract first and last name from free-form name
>
> And the list goes on and on and...
> That's why free form names are an absolute pain and should be avoided.
> It's so much easier to have Title, First Name(s), Last Name(s) fields in the
> input screen and keep them as separate attributes.
> You never get it 100% right.
> In your list take Dean or Prince for instance - they could be first names
> and not titles at all.
> A colleague of mine tried a last name upper to lower case conversion
> including Irish and Scottish names and out of Machine Co it made MacHine Co.
> And don't even ask what happened to last names starting with O. :-)
>
>
> On 14/12/2011 01:02, Charlie Noah wrote:
>> Great start, but here is a longer list, although still nowhere near
>> complete:
>>
>> Prefixes
>>
>> Code      Description
>> 1st Lt    First Lieutenant
>> Adm       Admiral
>> Atty      Attorney
>> Brother   Brother (religious)
>> Capt      Captain
>> Chief     Chief
>> Cmdr      Commander
>> Col       Colonel
>> Dean      University Dean (includes Assistant and Associate)
>> Dr        Doctor (Medical or Educator)
>> Elder     Elder (religious)
>> Father    Father (religious)
>> Gen       General
>> Gov       Governor
>> Hon       Honorable (Cabinet Officer, Commissioner, Congressman, Judge,
>>           etc.)
>> Lt Col    Lieutenant Colonel
>> Maj       Major
>> MSgt      Major/Master Sergeant
>> Mr        Mister
>> Mrs       Married Woman
>> Ms        Single or Married Woman
>> Prince    Prince
>> Prof      Professor (includes Assistant and Associate)
>> Rabbi     Rabbi (religious)
>> Rev       Reverend (religious)
>> Sister    Sister (religious)
>>
>> Suffixes
>>
>> Code      Description
>> II        The Second
>> III       The Third
>> IV        The Fourth
>> V         The Fifth
>> CPA       Certified Public Accountant
>> DDS       Doctor of Dental Surgery
>> Esq       Esquire
>> JD        Juris Doctor
>> Jr        Junior
>> Jnr       Junior (British)
>> LLD       Doctor of Laws
>> MD        Doctor of Medicine
>> PhD       Doctorate
>> Ret       Retired from Armed Forces
>> RN        Registered Nurse
>> RPh       Registered Pharmacist
>> Sr        Senior
>> Snr       Senior (British)
>> DO        Doctor of Osteopathy
>>
>> Perhaps others can add more to the list.
>>
>> Regards,
>> Charlie Noah
>>
>> Tiny Bear's Wild Bird Store
>> "Everything For The Backyard Bird Enthusiast, Except For The Birds"
>> Info, Forum: http://www.TinyBearMarketing.com
>> Store: http://Stores.TinyBearMarketing.com
>>
>>
>> On 12-13-2011 5:12 PM, Wjhonson wrote:
>>> * Last name: the last word, unless that word is a known suffix
>>> SUFFIXES = ",JR,SR,MD,III,"
>>> S.NAME = DCOUNT(UM.NAME,' ')
>>> LAST.WORD.IN.NAME = FIELD(UM.NAME,' ',S.NAME)
>>> IF INDEX(SUFFIXES,",":LAST.WORD.IN.NAME:",",1) THEN
>>>    LAST.NAME = FIELD(UM.NAME,' ',S.NAME-1)
>>> END ELSE
>>>    LAST.NAME = LAST.WORD.IN.NAME
>>> END
>>> * First name: the first word, unless that word is a known prefix
>>> PREFIXES = ',DR,MR,MS,MISS,MRS,'
>>> FIRST.WORD.IN.NAME = FIELD(UM.NAME,' ',1)
>>> IF INDEX(PREFIXES,",":FIRST.WORD.IN.NAME:",",1) THEN
>>>    FIRST.NAME = FIELD(UM.NAME,' ',2)
>>> END ELSE
>>>    FIRST.NAME = FIRST.WORD.IN.NAME
>>> END
>>> _______________________________________________
>>> U2-Users mailing list
>>> U2-Users@listserver.u2ug.org
>>> http://listserver.u2ug.org/mailman/listinfo/u2-users
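
For anyone who wants to try the lexical-analysis idea Symeon describes, here is a rough sketch of it. It is written in Python rather than UniBasic purely so it can be run anywhere. The names `TITLES`, `SUFFIXES`, `RULES`, `lex` and `parse` are made up for illustration, the token lists are a small subset of Charlie's tables, and the confidence numbers are placeholders, not measured figures. Treat it as a starting point under those assumptions, not a working name parser.

```python
# Token lists abbreviated from the prefix/suffix tables quoted in the thread.
TITLES = {"DR", "MR", "MRS", "MS", "MISS", "DEAN", "PRINCE", "PROF", "REV"}
SUFFIXES = {"JR", "SR", "II", "III", "IV", "MD", "PHD", "ESQ"}

# Hypothetical rule table: lex pattern -> (probable field layout, confidence).
# The confidence values are illustrative placeholders only.
RULES = {
    ("word", "word"): (("forename", "surname"), 0.9),
    ("title", "word"): (("forename", "surname"), 0.6),
    ("title", "word", "word"): (("title", "forename", "surname"), 0.8),
    ("word", "word", "suffix"): (("forename", "surname", "suffix"), 0.9),
    ("title", "word", "word", "suffix"): (
        ("title", "forename", "surname", "suffix"),
        0.8,
    ),
}


def lex(name):
    """Split a free-form name and classify each token as title, suffix or word."""
    tokens = name.replace(",", " ").split()
    classes = []
    for token in tokens:
        key = token.rstrip(".").upper()  # "Dr." and "dr" both match "DR"
        if key in TITLES:
            classes.append("title")
        elif key in SUFFIXES:
            classes.append("suffix")
        else:
            classes.append("word")
    return tokens, tuple(classes)


def parse(name):
    """Return (fields, confidence) for a known lex pattern, otherwise None."""
    tokens, classes = lex(name)
    rule = RULES.get(classes)
    if rule is None:
        return None  # unrecognised pattern: leave it for manual review
    layout, confidence = rule
    return dict(zip(layout, tokens)), confidence
```

With these rules, `parse("Dean Foster")` treats Dean as a forename and `parse("Dean Reginald McGraw")` treats Dean as a title, matching Symeon's two examples. Note that `parse("Dean Suarez Smith")` would also come back as title, forename, surname, which is exactly the case the customer says is wrong, so the result is only ever a probable reading and anything unmatched returns `None` for a human to look at.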