I would be quite happy to add some sort of frequency metric
to given and family names in the ENAMDICT file. The trouble
is I have no time spare to go digging out the data. If someone
else were prepared to compile it, I'd be glad to add it.
Jim Breen
2011/8/11 Osamu Aoki os...@debian.org:
Hi,
On Thu, Aug 11, 2011 at 06:00:55PM +1000, Jim Breen wrote:
I would be quite happy to add some sort of frequency metric
to given and family names in the ENAMDICT file. The trouble
is I have no time spare to go digging out the data.
I have found some data, as below, in CSV format for family names.
Good evening,
2011/8/11 Osamu Aoki os...@debian.org:
I have found some data, as below, in CSV format for family names.
Anyway, the raw data has a bit over 100,600 names.
Given names are a bit more difficult.
Yes, but family names are a great start.
It looks like
sei,rank,number
佐藤,1位,481980
鈴木,2位,426804
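
For anyone wanting to turn that CSV into something usable as a frequency metric, here is a minimal sketch in Python. It assumes the file really does have the `sei,rank,number` header shown above and that every rank carries the 位 suffix; the names `SAMPLE` and `parse_family_names` are made up for illustration and are not part of ENAMDICT or any existing tool.

```python
import csv
import io

# Two rows copied from the sample above; the real file reportedly has
# a bit over 100,600 names.
SAMPLE = """sei,rank,number
佐藤,1位,481980
鈴木,2位,426804
"""


def parse_family_names(text):
    """Parse `sei,rank,number` rows into a list of dicts (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        rows.append({
            "sei": row["sei"],
            # Assumption: the rank is always written as digits followed by 位.
            "rank": int(row["rank"].rstrip("位")),
            "number": int(row["number"]),
        })
    return rows


names = parse_family_names(SAMPLE)
for entry in names:
    print(entry["sei"], entry["rank"], entry["number"])
```

Either the numeric rank or the raw count could then serve as the frequency value; which one suits ENAMDICT better is left open here.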
Hi,
This is about: http://bugs.debian.org/271397
Mr. Tashiro is quite obvious. (Figures are: % of population using the name, popularity rank.)
田代 (0.061%, #287) - I would pick this without a second thought.
田城 (0.001%, #6981) - the mozc Japanese input method listed this too.
These are not very popular names, but they are more popular than 田代