Or in our system we have entries like:
Dean Suarez Smith

In actuality the customer states that "Suarez Smith" is a double last name, not
a middle name and last name.
And Dean is their title... or no, it's their first name...

Actually, Doctor can be a first name as well.
It's a mess.

-----Original Message-----
From: Mecki Foerthmann <mec...@gmx.net>
To: u2-users <u2-users@listserver.u2ug.org>
Sent: Wed, Dec 14, 2011 5:09 am
Subject: Re: [U2] Extract first and last name from free-form name


Just face it - it can't be done!
So what if Dean has 2 first names and is a plumber?

On 14/12/2011 09:57, Symeon Breen wrote:
> You need to do a proper lexical analysis in order to work these out
>
> For example
>
> Input : Dean Foster
> Lex: title word
>
> Input: Dean Reginald McGraw
> Lex: title word word
>
>
>
> Then set rules to say a lex of "title word" is probably worked out as
> "forename surname" and a lex of "title word word" is probably "title
> forename surname". You can assign probabilities against these and build
> some self-learning in.
>
> It is a whole massive topic.
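
The lex-then-rules idea above could be sketched in UniBasic roughly like this. It is only an illustration under stated assumptions: the TITLES and SUFFIXES lists, the rule table, and every variable name are made up for the example, and a real version would need much longer lists plus the probabilities mentioned above.

```basic
* Sketch only: tag each word as title, suffix or word, then map the
* resulting pattern to its most likely reading.
* All lists, rules and names here are illustrative assumptions.
TITLES = ",DR,MR,MRS,MS,MISS,DEAN,PRINCE,"
SUFFIXES = ",JR,SR,MD,III,"
NAME = TRIM("Dean Reginald McGraw")
LEX = ""
N.WORDS = DCOUNT(NAME," ")
FOR I = 1 TO N.WORDS
   W = UPCASE(FIELD(NAME," ",I))
   BEGIN CASE
      CASE INDEX(TITLES,",":W:",",1) > 0
         TOK = "title"
      CASE INDEX(SUFFIXES,",":W:",",1) > 0
         TOK = "suffix"
      CASE 1
         TOK = "word"
   END CASE
   IF LEX = "" THEN LEX = TOK ELSE LEX = LEX:" ":TOK
NEXT I
* Rule table: lex pattern -> probable reading
BEGIN CASE
   CASE LEX = "title word"
      READING = "forename surname"
   CASE LEX = "title word word"
      READING = "title forename surname"
   CASE LEX = "word word"
      READING = "forename surname"
   CASE 1
      READING = "unknown"
END CASE
CRT NAME:" -> ":LEX:" -> ":READING
```

With the sample input this should print "Dean Reginald McGraw -> title word word -> title forename surname". Note that the rules only give the probable reading, so a plumber whose first name is Dean still gets it wrong.
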
>
>
>
>
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Mecki Foerthmann
> Sent: 14 December 2011 08:22
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] Extract first and last name from free-form name
>
> And the list goes on and on and...
> That's why free form names are an absolute pain and should be avoided.
> It's so much easier to have Title, First Name(s), Last Name(s) fields in the
> input screen and keep them as separate attributes.
> You never get it 100% right.
> In your list take Dean or Prince for instance - they could be first names
> and not titles at all.
> A colleague of mine tried a last name upper to lower case conversion
> including Irish and Scottish names and out of Machine Co it made MacHine Co.
> And don't even ask what happened to last names starting with O.:-)
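
For comparison, keeping the parts as separate attributes might look like the sketch below. The attribute positions and the sample data are assumptions for illustration, not an actual file layout from anyone's system.

```basic
* Sketch only: one record per person, one attribute per name part.
* Attribute positions and sample data are illustrative assumptions.
REC = ""
REC<1> = ""                ;* title: none given
REC<2> = "Dean"            ;* first name(s)
REC<3> = "Suarez Smith"    ;* last name(s): a double last name stays whole
FULL.NAME = TRIM(REC<1>:" ":REC<2>:" ":REC<3>)
CRT "First: ":REC<2>
CRT "Last:  ":REC<3>
CRT "Full:  ":FULL.NAME
```

Because the person keying the data decides which part is which, nothing has to be guessed afterwards, and the display name can still be rebuilt when needed.
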
>
>
> On 14/12/2011 01:02, Charlie Noah wrote:
>> Great start, but here is a longer list, although still nowhere near
>> complete:
>>
>> Prefixes
>>
>> Code     Description
>> 1st Lt   First Lieutenant
>> Adm      Admiral
>> Atty     Attorney
>> Brother  Brother (religious)
>> Capt     Captain
>> Chief    Chief
>> Cmdr     Commander
>> Col      Colonel
>> Dean     University Dean (includes Assistant and Associate)
>> Dr       Doctor (Medical or Educator)
>> Elder    Elder (religious)
>> Father   Father (religious)
>> Gen      General
>> Gov      Governor
>> Hon      Honorable (Cabinet Officer, Commissioner, Congressman, Judge, etc.)
>> Lt Col   Lieutenant Colonel
>> Maj      Major
>> MSgt     Master Sergeant
>> Mr       Mister
>> Mrs      Married Woman
>> Ms       Single or Married Woman
>> Prince   Prince
>> Prof     Professor (includes Assistant and Associate)
>> Rabbi    Rabbi (religious)
>> Rev      Reverend (religious)
>> Sister   Sister (religious)
>>
>> Suffixes
>>
>> Code     Description
>> II       The Second
>> III      The Third
>> IV       The Fourth
>> V        The Fifth
>> CPA      Certified Public Accountant
>> DDS      Doctor of Dental Surgery
>> Esq      Esquire
>> JD       Juris Doctor
>> Jr       Junior
>> Jnr      Junior (British)
>> LLD      Doctor of Laws
>> MD       Doctor of Medicine
>> PhD      Doctor of Philosophy
>> Ret      Retired from Armed Forces
>> RN       Registered Nurse
>> RPh      Registered Pharmacist
>> Sr       Senior
>> Snr      Senior (British)
>> DO       Doctor of Osteopathy
>>
>> Perhaps others can add more to the list.
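
One wrinkle with a list like this: codes such as "Lt Col" and "1st Lt" are two words, so a test on the first word alone will never match them. A rough sketch of a longest-match-first check follows. The shortened PREFIXES list and all variable names are assumptions for the example.

```basic
* Sketch only: try a two-word prefix before a one-word prefix.
* List contents and variable names are illustrative assumptions.
PREFIXES = ",1ST LT,LT COL,DR,MR,MRS,MS,COL,REV,"
NAME = TRIM("Lt. Col. John Smith")
U.NAME = UPCASE(NAME)
CONVERT "." TO "" IN U.NAME        ;* so "Lt." and "Lt" compare equal
N.WORDS = DCOUNT(U.NAME," ")
PREFIX.WORDS = 0
FOR N.TRY = 2 TO 1 STEP -1
   * Only test while nothing has matched and a name would be left over
   IF PREFIX.WORDS = 0 AND N.WORDS > N.TRY THEN
      CANDIDATE = FIELD(U.NAME," ",1,N.TRY)
      IF INDEX(PREFIXES,",":CANDIDATE:",",1) > 0 THEN PREFIX.WORDS = N.TRY
   END
NEXT N.TRY
FIRST.NAME = FIELD(NAME," ",PREFIX.WORDS+1)
CRT "Prefix words: ":PREFIX.WORDS:", first name: ":FIRST.NAME
```

For the sample input this should report 2 prefix words and a first name of John. The same approach would work from the other end of the string for suffixes.
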
>>
>> Regards,
>> Charlie Noah
>>
>> Tiny Bear's Wild Bird Store
>> "Everything For The Backyard Bird Enthusiast, Except For The Birds"
>> Info, Forum:  http://www.TinyBearMarketing.com
>> Store:            http://Stores.TinyBearMarketing.com
>>
>>
>> On 12-13-2011 5:12 PM, Wjhonson wrote:
>>> * Last name: take the final word unless it is a known suffix.
>>>          SUFFIXES = ",JR,SR,MD,III,"
>>>          S.NAME = DCOUNT(UM.NAME,' ')
>>>          LAST.WORD.IN.NAME = FIELD(UM.NAME,' ',S.NAME)
>>> * Compare in upper case so that "Jr" also matches "JR".
>>>          IF INDEX(SUFFIXES,",":UPCASE(LAST.WORD.IN.NAME):",",1) THEN
>>>             LAST.NAME = FIELD(UM.NAME,' ',S.NAME-1)
>>>          END ELSE
>>>             LAST.NAME = LAST.WORD.IN.NAME
>>>          END
>>> * First name: take the first word unless it is a known prefix.
>>>          PREFIXES = ',DR,MR,MS,MISS,MRS,'
>>>          FIRST.WORD.IN.NAME = FIELD(UM.NAME,' ',1)
>>>          IF INDEX(PREFIXES,",":UPCASE(FIRST.WORD.IN.NAME):",",1) THEN
>>>             FIRST.NAME = FIELD(UM.NAME,' ',2)
>>>          END ELSE
>>>             FIRST.NAME = FIRST.WORD.IN.NAME
>>>          END
>>> _______________________________________________
>>> U2-Users mailing list
>>> U2-Users@listserver.u2ug.org
>>> http://listserver.u2ug.org/mailman/listinfo/u2-users
>>>