Edit report at https://bugs.php.net/bug.php?id=64740&edit=1
ID:                 64740
User updated by:    RQuadling at GMail dot com
Reported by:        RQuadling at GMail dot com
Summary:            Gender ignores country for some names.
Status:             Wont fix
Type:               Bug
Package:            PECL
Operating System:   Centos
PHP Version:        Irrelevant
Assigned To:        ab
Block user comment: N
Private report:     N

New Comment:

The issue for me is that even if the name is female in one country, it is a single country that I'm asking for.

Dan, in England, _IS_ a male name. And the data does show this.

As things stand, supplying the country is redundant.

Previous Comments:
------------------------------------------------------------------------
[2013-05-18 17:22:05] [email protected]

To be more clear, the way I see it, it should be solved like this:

$g = new Gender\Gender;
if ($g->isNick($name)) {
    $name = $g->getNameForNick($name); // to implement
}
$gender = $g->get($name);

That could work as long as there is an unambiguous correlation between the name and the nick. What would you say?

------------------------------------------------------------------------
[2013-05-18 17:03:27] [email protected]

Richard,

after some research I have come to the conclusion that this is not a bug. Strictly speaking, both Dan and Ben aren't names but nicknames, for Daniel and Benjamin respectively. Looking into the data, there are two corresponding lines:

= Ben Benjamin 1 1
= Dan Daniel 111 1

That means in both cases it is male in Britain. However, because the exact input was the nickname, not the real name, the library looks up and evaluates the input literally as given and compares the frequencies in all the other countries.

I think a change in this area could break the data integrity for normal operations. While Ben can evaluate to Benjamin, it could also evaluate to Benedict. I am thinking more about adding a new method like Gender::getRealName($nickname) or adding some options to the existing get method. Changing that behaviour globally might break other, more complicated cases.

Any ideas?
Thanks )

------------------------------------------------------------------------
[2013-04-30 08:37:50] RQuadling at GMail dot com

Description:
------------
The Gender extension ignores the country when the requested name is male or female (trying not to mention the word uni- es ee ex as I think this is causing SPAM alert when posting bugs) in a country where the name is known, even though it is male in the requested country.

Also had to post this as a PECL bug as there is no PECL/Gender entry to choose.

Test script:
---------------
<?php
$o_Gender = new Gender\Gender;
$o_Gender->trace();
var_dump($o_Gender->get('Ben', Gender\Gender::BRITAIN));
// var_dump($o_Gender->get('Dan', Gender\Gender::BRITAIN));
// var_dump($o_Gender->get('Richard', Gender\Gender::BRITAIN));

Expected result:
----------------
Searching for name 'Ben' (country = Great Britain)
Range = line 1 - 48891, guess = 24446 ('Kyung+Ju')
Range = line 1 - 24445, guess = 12223 ('Esben')
Range = line 1 - 12222, guess = 6111 ('Brendon')
Range = line 1 - 6110, guess = 3055 ('Aranita')
Range = line 3056 - 6110, guess = 4583 ('Barak')
Range = line 4584 - 6110, guess = 5347 ('Beybala')
Range = line 4584 - 5346, guess = 4965 ('Benet')
Range = line 4584 - 4964, guess = 4774 ('Bavani')
Range = line 4775 - 4964, guess = 4869 ('Behudin')
Range = line 4870 - 4964, guess = 4917 ('Belk<i>s')
Range = line 4918 - 4964, guess = 4941 ('Bendina')
Range = line 4918 - 4940, guess = 4929 ('Belu<sch>e')
Range = line 4930 - 4940, guess = 4935 ('Benan')
Range = line 4930 - 4934, guess = 4932 ('Belva')
Range = line 4933 - 4934, guess = 4933 ('Ben')
Result: name 'Ben' found
evaluating name 'Ben': 'is male' (country = Great Britain[3] or Ireland[1] or U.S.A.[3] or Belgium[4] or the Netherlands[7])
evaluating name 'Ben': 'is uni*** name' (country = China[3])
result for 'Ben': 'is male name'
int(77)

Searching for name 'Dan' (country = Great Britain)
Range = line 1 - 48891, guess = 24446 ('Kyung+Ju')
Range = line 1 - 24445, guess = 12223 ('Esben')
Range = line 1 - 12222, guess = 6111 ('Brendon')
Range = line 6112 - 12222, guess = 9167 ('Delfa')
Range = line 6112 - 9166, guess = 7639 ('Chrysostomia')
Range = line 7640 - 9166, guess = 8403 ('Curzio')
Range = line 8404 - 9166, guess = 8785 ('Danu?e')
Range = line 8404 - 8784, guess = 8594 ('Dalbir')
Range = line 8595 - 8784, guess = 8689 ('Dan Daniel')
Range = line 8595 - 8688, guess = 8641 ('Dalva')
Range = line 8642 - 8688, guess = 8665 ('Damion')
Range = line 8666 - 8688, guess = 8677 ('Dan')
Range = line 8666 - 8677, guess = 8671 ('Damnjan')
Range = line 8672 - 8677, guess = 8674 ('Damyan')
Range = line 8675 - 8677, guess = 8676 ('Dan')
Range = line 8675 - 8676, guess = 8675 ('Damyanti')
Range = line 8676 - 8676, guess = 8676 ('Dan')
Result: name 'Dan' found
evaluating name 'Dan': 'is male' (country = Great Britain[2] or Ireland[3] or U.S.A.[4] or Belgium[1] or Luxembourg[4] or the Netherlands[1] or Swiss[1] or Denmark[6] or Norway[2] or Sweden[6] or Finland[3] or Romania[8] or Moldova[6] or Israel[7])
evaluating name 'Dan': 'is mostly male' (country = Vietnam[6])
evaluating name 'Dan': 'is uni*** name' (country = China[7])
result for 'Dan': 'is male name'
int(77)

Actual result:
--------------
Searching for name 'Ben' (country = Great Britain)
Range = line 1 - 48891, guess = 24446 ('Kyung+Ju')
Range = line 1 - 24445, guess = 12223 ('Esben')
Range = line 1 - 12222, guess = 6111 ('Brendon')
Range = line 1 - 6110, guess = 3055 ('Aranita')
Range = line 3056 - 6110, guess = 4583 ('Barak')
Range = line 4584 - 6110, guess = 5347 ('Beybala')
Range = line 4584 - 5346, guess = 4965 ('Benet')
Range = line 4584 - 4964, guess = 4774 ('Bavani')
Range = line 4775 - 4964, guess = 4869 ('Behudin')
Range = line 4870 - 4964, guess = 4917 ('Belk<i>s')
Range = line 4918 - 4964, guess = 4941 ('Bendina')
Range = line 4918 - 4940, guess = 4929 ('Belu<sch>e')
Range = line 4930 - 4940, guess = 4935 ('Benan')
Range = line 4930 - 4934, guess = 4932 ('Belva')
Range = line 4933 - 4934, guess = 4933 ('Ben')
Result: name 'Ben' found
evaluating name 'Ben': 'is male' (country = Great Britain[3] or Ireland[1] or U.S.A.[3] or Belgium[4] or the Netherlands[7])
evaluating name 'Ben': 'is uni*** name' (country = China[3])
result for 'Ben': 'is uni*** name'
int(63)

Searching for name 'Dan' (country = Great Britain)
Range = line 1 - 48891, guess = 24446 ('Kyung+Ju')
Range = line 1 - 24445, guess = 12223 ('Esben')
Range = line 1 - 12222, guess = 6111 ('Brendon')
Range = line 6112 - 12222, guess = 9167 ('Delfa')
Range = line 6112 - 9166, guess = 7639 ('Chrysostomia')
Range = line 7640 - 9166, guess = 8403 ('Curzio')
Range = line 8404 - 9166, guess = 8785 ('Danu?e')
Range = line 8404 - 8784, guess = 8594 ('Dalbir')
Range = line 8595 - 8784, guess = 8689 ('Dan Daniel')
Range = line 8595 - 8688, guess = 8641 ('Dalva')
Range = line 8642 - 8688, guess = 8665 ('Damion')
Range = line 8666 - 8688, guess = 8677 ('Dan')
Range = line 8666 - 8677, guess = 8671 ('Damnjan')
Range = line 8672 - 8677, guess = 8674 ('Damyan')
Range = line 8675 - 8677, guess = 8676 ('Dan')
Range = line 8675 - 8676, guess = 8675 ('Damyanti')
Range = line 8676 - 8676, guess = 8676 ('Dan')
Result: name 'Dan' found
evaluating name 'Dan': 'is male' (country = Great Britain[2] or Ireland[3] or U.S.A.[4] or Belgium[1] or Luxembourg[4] or the Netherlands[1] or Swiss[1] or Denmark[6] or Norway[2] or Sweden[6] or Finland[3] or Romania[8] or Moldova[6] or Israel[7])
evaluating name 'Dan': 'is mostly male' (country = Vietnam[6])
evaluating name 'Dan': 'is uni*** name' (country = China[7])
result for 'Dan': 'is uni*** name'
int(63)

------------------------------------------------------------------------
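
For illustration only, the behaviour the reporter expects can be sketched in plain PHP without the extension. Everything here is an assumption made for the sketch: resolveForCountry() and the two constants are hypothetical names, not part of the Gender API, and the fallback for a country missing from the data is a guess rather than the library's actual rule. The codes 77 and 63 are the values var_dump() prints in the traces above, and the sample data for 'Ben' is copied from its "evaluating name" lines.

```php
<?php
// Hypothetical constants mirroring the integers seen in the traces.
const IS_MALE = 77;       // printed for 'is male name'
const IS_AMBIGUOUS = 63;  // printed for 'is uni*** name'

/**
 * Hypothetical helper: pick a verdict for one country.
 * $evaluations maps a country label to the verdict recorded for it.
 */
function resolveForCountry(array $evaluations, string $country): int
{
    // If the data has a verdict for the requested country, use only that.
    if (isset($evaluations[$country])) {
        return $evaluations[$country];
    }

    // Assumed fallback: one shared verdict across all countries wins,
    // anything else (including no data) is reported as ambiguous.
    $distinct = array_unique(array_values($evaluations));
    return count($distinct) === 1 ? reset($distinct) : IS_AMBIGUOUS;
}

// Verdicts for 'Ben' as listed in the trace.
$ben = [
    'Great Britain'   => IS_MALE,
    'Ireland'         => IS_MALE,
    'U.S.A.'          => IS_MALE,
    'Belgium'         => IS_MALE,
    'the Netherlands' => IS_MALE,
    'China'           => IS_AMBIGUOUS,
];

var_dump(resolveForCountry($ben, 'Great Britain'));
```

With this data the call for 'Great Britain' yields the male code, which matches the expected result in the report, while a country with no entry falls through to the combined verdict that the actual result shows.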
