You need to do a proper lexical analysis in order to work these out. For example:

Input: Dean Foster
Lex:   title word

Input: Dean Reginald McGraw
Lex:   title word word

Then set rules to say that a lex of "title word" is probably best worked out as "forename surname", and a lex of "title word word" is probably "title forename surname". You can assign probabilities against these and build some self-learning in.

It is a whole massive topic.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Mecki Foerthmann
Sent: 14 December 2011 08:22
To: [email protected]
Subject: Re: [U2] Extract first and last name from free-form name

And the list goes on and on and...
That's why free form names are an absolute pain and should be avoided. It's so much easier to have Title, First Name(s), Last Name(s) fields in the input screen and keep them as separate attributes.
You never get it 100% right.
In your list take Dean or Prince for instance - they could be first names and not titles at all.
A colleague of mine tried a last name upper to lower case conversion including Irish and Scottish names and out of Machine Co it made MacHine Co.
And don't even ask what happened to last names starting with O. :-)

On 14/12/2011 01:02, Charlie Noah wrote:
> Great start, but here is a longer list, although still nowhere near
> complete:
>
> Prefixes
>
> Code     Description
> 1st Lt   First Lieutenant
> Adm      Admiral
> Atty     Attorney
> Brother  Brother (religious)
> Capt     Captain
> Chief    Chief
> Cmdr     Commander
> Col      Colonel
> Dean     University Dean (includes Assistant and Associate)
> Dr       Doctor (Medical or Educator)
> Elder    Elder (religious)
> Father   Father (religious)
> Gen      General
> Gov      Governor
> Hon      Honorable (Cabinet Officer, Commissioner, Congressman, Judge,
>          etc.)
> Lt Col   Lieutenant Colonel
> Maj      Major
> MSgt     Major/Master Sergeant
> Mr       Mister
> Mrs      Married Woman
> Ms       Single or Married Woman
> Prince   Prince
> Prof     Professor (includes Assistant and Associate)
> Rabbi    Rabbi (religious)
> Rev      Reverend (religious)
> Sister   Sister (religious)
>
> Suffixes
>
> Code     Description
> II       The Second
> III      The Third
> IV       The Fourth
> V        The Fifth
> CPA      Certified Public Accountant
> DDS      Doctor of Dental Surgery
> Esq      Esquire
> JD       Juris Doctor
> Jr       Junior
> Jnr      Junior (British)
> LLD      Doctor of Laws
> MD       Doctor of Medicine
> PhD      Doctorate
> Ret      Retired from Armed Forces
> RN       Registered Nurse
> RPh      Registered Pharmacist
> Sr       Senior
> Snr      Senior (British)
> DO       Doctor of Osteopathy
>
> Perhaps others can add more to the list.
>
> Regards,
> Charlie Noah
>
> Tiny Bear's Wild Bird Store
> "Everything For The Backyard Bird Enthusiast, Except For The Birds"
> Info, Forum: http://www.TinyBearMarketing.com
> Store: http://Stores.TinyBearMarketing.com
>
>
> On 12-13-2011 5:12 PM, Wjhonson wrote:
>> * Matching is case-sensitive, so UM.NAME is assumed to be upper case.
>> SUFFIXES = ",JR,SR,MD,III,"
>> S.NAME = DCOUNT(UM.NAME,' ')
>> LAST.WORD.IN.NAME = FIELD(UM.NAME,' ',S.NAME)
>> * A trailing suffix means the surname is the word before it.
>> IF INDEX(SUFFIXES,",":LAST.WORD.IN.NAME:",",1) THEN
>>    LAST.NAME = FIELD(UM.NAME,' ',S.NAME-1)
>> END ELSE
>>    LAST.NAME = LAST.WORD.IN.NAME
>> END
>> PREFIXES = ',DR,MR,MS,MISS,MRS,'
>> FIRST.WORD.IN.NAME = FIELD(UM.NAME,' ',1)
>> * A leading title means the forename is the second word.
>> IF INDEX(PREFIXES,",":FIRST.WORD.IN.NAME:",",1) THEN
>>    FIRST.NAME = FIELD(UM.NAME,' ',2)
>> END ELSE
>>    FIRST.NAME = FIRST.WORD.IN.NAME
>> END
>> _______________________________________________
>> U2-Users mailing list
>> [email protected]
>> http://listserver.u2ug.org/mailman/listinfo/u2-users
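
For anyone who wants to experiment with the "lex, then apply rules" idea from the top of this message, here is a minimal sketch. It is written in Python rather than UniBasic purely for illustration. The title list, the rule table, the probabilities, and the function names lex and parse are all made up for the example and are not taken from any existing routine; a real title list would be far longer.

```python
# Hypothetical title list; the prefix list quoted in this thread is much longer.
TITLES = {"DR", "MR", "MRS", "MS", "MISS", "DEAN", "PRINCE", "PROF", "REV"}

# Hypothetical rules: lex pattern -> candidate labellings with assumed probabilities.
RULES = {
    ("word", "word"): [(("forename", "surname"), 1.0)],
    ("title", "word"): [
        (("forename", "surname"), 0.6),
        (("title", "surname"), 0.4),
    ],
    ("title", "word", "word"): [
        (("title", "forename", "surname"), 0.9),
        (("forename", "middle", "surname"), 0.1),
    ],
}


def lex(name):
    """Split on whitespace and classify each word as 'title' or 'word'."""
    words = name.split()
    tokens = tuple(
        "title" if w.upper().rstrip(".") in TITLES else "word" for w in words
    )
    return words, tokens


def parse(name):
    """Return the most probable labelling of the words, or None if no rule matches."""
    words, tokens = lex(name)
    candidates = RULES.get(tokens)
    if not candidates:
        return None
    labels, _probability = max(candidates, key=lambda c: c[1])
    return dict(zip(labels, words))


print(parse("Dean Foster"))
print(parse("Dean Reginald McGraw"))
```

With these made-up rules, "Dean Foster" comes out as forename Dean, surname Foster, and "Dean Reginald McGraw" as title Dean, forename Reginald, surname McGraw. Any pattern without a rule returns None so it can be sent for manual review. The self-learning part would amount to adjusting the probabilities as people correct the results.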
