+1 (Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name)

2022-07-21 Thread Holger Levsen
On Wed, Jul 20, 2022 at 10:16:19AM -0600, Sam Hartman wrote:
> Yes, it's one of the ways people learn about software that is being
> packaged and they might like to become involved in.
> I find reading ITPs
> 
> 1) increases my interest in Debian because I see cool stuff people are
> doing
> 
> 2) Is one of the ways I learn about software I might find useful
> 
> 3) Potentially could point me in the direction of software to contribute
> to.
> 
> All those are valuable to me even when an ITP is filed immediately
> before uploading.
> 
> I agree that we could find other ways of getting that information to
> people who are interested.
> But today, in my work flow, ITPs are useful to me as an ITP consumer
> even if immediately before upload.
 
+1 to everything quoted. Thanks, Sam!


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

"A developed country is not a place where the poor have cars. It's where the
rich use public transportation." (quote attributed to several people)




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-21 Thread Hakan Bayındır
This is exactly my point of view on ITPs as well. While I'm not as
involved in Debian as most of the people here, it's a nice and proper
gateway to see what's happening and what people are working on.


Also, I have taken note of at least a couple of pieces of software which I
could use in developing mine.


Cheers,

H.

On 20.07.2022 19:16, Sam Hartman wrote:
> "Andrey" == Andrey Rahmatullin  writes:
>
>  Andrey> More sensible than not filing it?  This defeats both
>  Andrey> purposes of an ITP: getting it discussed and working as a
>  Andrey> mutex for people who are thinking about packaging the same
>  Andrey> software. Are there other purposes?
>
> Yes, it's one of the ways people learn about software that is being
> packaged and they might like to become involved in.
> I find reading ITPs
>
> 1) increases my interest in Debian because I see cool stuff people are
> doing
>
> 2) Is one of the ways I learn about software I might find useful
>
> 3) Potentially could point me in the direction of software to contribute
> to.
>
> All those are valuable to me even when an ITP is filed immediately
> before uploading.
>
> I agree that we could find other ways of getting that information to
> people who are interested.
> But today, in my work flow, ITPs are useful to me as an ITP consumer
> even if immediately before upload.




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-20 Thread Sam Hartman
> "Andrey" == Andrey Rahmatullin  writes:
Andrey> More sensible than not filing it?  This defeats both
Andrey> purposes of an ITP: getting it discussed and working as a
Andrey> mutex for people who are thinking about packaging the same
Andrey> software. Are there other purposes?

Yes, it's one of the ways people learn about software that is being
packaged and they might like to become involved in.
I find reading ITPs

1) increases my interest in Debian because I see cool stuff people are
doing

2) Is one of the ways I learn about software I might find useful

3) Potentially could point me in the direction of software to contribute
to.

All those are valuable to me even when an ITP is filed immediately
before uploading.

I agree that we could find other ways of getting that information to
people who are interested.
But today, in my work flow, ITPs are useful to me as an ITP consumer
even if immediately before upload.



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-20 Thread Stefano Rivera
Hi Philip (2022.07.19_20:51:36_+)
> > In other words, if you don't pick a culture, the _global_ dataset (ie, the
> > default one) must not assume either.  It takes a lot of work to prepare
> > such a database, that's why this package is good to have.
> 
> What exactly is it supposed to be good for?

Seems perfect for statistics from large data sets. The data in those
kinds of things is always messy as hell. Full of mistakes and
misalignment to whatever lines you are trying to draw through it.
But, on aggregate, it does give you results, if you can control for the
problems.

A lot of research is done like this, "data science" they call it.

> From what I can tell, at best it produces harmless nonsense, and at
> worst it will cause people to be misidentified in ways that will vary
> between mildly humorous to significantly hurtful.

On an individual level, but on a large scale, you'd expect things to
average out. Unless it has a systemic bias towards any particular
output.

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-20 Thread Tzafrir Cohen
Hi,

On Tue, Jul 19, 2022 at 06:08:16PM +0200, Andrej Shadura wrote:
> Hi,
> 
> On Tue, 19 Jul 2022, at 16:57, Adam Borowski wrote:
> > On Tue, Jul 19, 2022 at 01:48:17PM +0200, Andrej Shadura wrote:
> >> Take Misha/Miša/Миша or Petya/Peťa/Петя.  In Russian tradition, these are
> >> very likely masculine names, from Mikhail and Petr.
> 
> > If only this piece of software had a distinction between "almost always
> > male", "leaning male", "neutral", "leaning female", "almost always
> > female"...  Oh wait, it does!
> > Precisely for the reason you mention.
> 
> No, it does not and cannot, since some names are almost always male in one 
> culture but almost always female in another one.
> 
> >> And we haven’t yet touched the topic of people who were given 
> >> non-traditional names.
> 
> > In which case it says "unknown".
> 
> No, it cannot know about cases when a person is given a name traditionally 
> given to another gender in another culture. Pretty common in the US, for 
> example. Sure, there probably aren’t many cases of women named Michael, but 
> there are many other names where you wouldn’t be easily able to tell.
> 
> https://en.wikipedia.org/wiki/Category:English_unisex_given_names

Both of you make arguments based on wrong data.
If you had just tried the software (clone it from its source; no need
to install anything) you would have noticed that.

>>> import gender_guesser.detector as gender
>>> d = gender.Detector()
>>> print(d.get_gender(u"Andrea"))
female
>>> print(d.get_gender(u"Misha"))
male
>>> print(d.get_gender(u"Miša"))
andy
>>> print(d.get_gender(u"Миша"))
unknown
>>> print(d.get_gender(u"Petya"))
male
>>> print(d.get_gender(u"Peťa"))
unknown
>>> print(d.get_gender(u"Петя"))
unknown

So: the software can give an output "andy", that is androgynous, it also
has the output "mostly_male" and "mostly_female".

Furthermore, reading the README, I noticed I can give it more context:

>>> print(d.get_gender(u"Andrea", "italy"))
male
>>> print(d.get_gender(u"Misha", "slovakia"))
andy
>>> print(d.get_gender(u"Petya", "slovakia"))
andy

However, "unknown" means "Not in my database". It does not mean "Neither
male nor female". So it seems Adam also did not check the results before
posting.
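
To make the difference between "unknown" and "andy" concrete, here is a
minimal sketch. It does not import gender_guesser: toy_get_gender and
split_results are hypothetical names, and the two lookup tables contain only
the answers shown in the session above, so treat it as an illustration of the
distinction rather than of the library's actual behaviour.

```python
from collections import Counter

# Toy stand-in for Detector.get_gender(), filled ONLY with the results
# quoted in this mail. It is not the real gender_guesser database.
_GLOBAL = {"Andrea": "female", "Misha": "male", "Miša": "andy", "Petya": "male"}
_BY_COUNTRY = {
    ("Andrea", "italy"): "male",
    ("Misha", "slovakia"): "andy",
    ("Petya", "slovakia"): "andy",
}


def toy_get_gender(name, country=None):
    """Return the quoted answer for name, or "unknown" if it is not listed."""
    if country is not None:
        return _BY_COUNTRY.get((name, country), "unknown")
    return _GLOBAL.get(name, "unknown")


def split_results(names, guess=toy_get_gender):
    """Separate names the lookup has an answer for from names it never saw."""
    tally = Counter(guess(name) for name in names)
    # "unknown" means "not in the database", so it is reported on its own
    # instead of being counted as a kind of answer.
    not_in_database = tally.pop("unknown", 0)
    return dict(tally), not_in_database
```

Calling split_results() on the seven spellings tried above keeps the three
names that are simply missing apart from the one that is listed as "andy".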

Maybe gender-guesser belongs in Debian and maybe it doesn't. But please
try to at least look at the software before judging it.

-- 
mail / xmpp / matrix: tzaf...@cohens.org.il



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-19 Thread Yao Wei (魏銘廷)

>> * Package name: gender-guesser
>> Version : 0.4.0
>> Upstream Author : Israel Saeta Pérez 
>> * URL : https://github.com/lead-ratings/gender-guesser
>> * License : GPL-3 & GFDL-1.2+
>> Programming Lang: Python
>> Description : Guess the gender from first name

Hi,

I'd like to ask a practical question: do we have anything either in WNPP or in
the archive that depends on or uses this package?

Although I guess this library might violate DFSG 5 by itself, I would like to 
see where it's actually used and why we need the library.

Yao Wei

(This email is sent from a phone; sorry for HTML email if it happens.)

Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-19 Thread Philip Hands
Adam Borowski  writes:

> In other words, if you don't pick a culture, the _global_ dataset (ie, the
> default one) must not assume either.  It takes a lot of work to prepare
> such a database, that's why this package is good to have.

What exactly is it supposed to be good for?

From what I can tell, at best it produces harmless nonsense, and at
worst it will cause people to be misidentified in ways that will vary
between mildly humorous to significantly hurtful.

It seems to me about as useful as a hammer with a loose head, which would
allow you to drive nails just well enough that you'll keep on using it
until it breaks your toe.

Cheers, Phil.
-- 
|)|  Philip Hands  [+44 (0)20 8530 9560]  HANDS.COM Ltd.
|-|  http://www.hands.com/    http://ftp.uk.debian.org/
|(|  Hugo-Klemm-Strasse 34,   21075 Hamburg,GERMANY




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-19 Thread Adam Borowski
On Tue, Jul 19, 2022 at 06:08:16PM +0200, Andrej Shadura wrote:
> On Tue, 19 Jul 2022, at 16:57, Adam Borowski wrote:
> > On Tue, Jul 19, 2022 at 01:48:17PM +0200, Andrej Shadura wrote:
> >> Take Misha/Miša/Миша or Petya/Peťa/Петя.  In Russian tradition, these are
> >> very likely masculine names, from Mikhail and Petr.
> 
> > If only this piece of software had a distinction between "almost always
> > male", "leaning male", "neutral", "leaning female", "almost always
> > female"...  Oh wait, it does!
> > Precisely for the reason you mention.
> 
> No, it does not and cannot, since some names are almost always male in one 
> culture but almost always female in another one.

In other words, if you don't pick a culture, the _global_ dataset (ie, the
default one) must not assume either.  It takes a lot of work to prepare
such a database, that's why this package is good to have.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ At least spammers get it right: "Hello beautiful!".
⠈⠳⣄



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-19 Thread Adam Borowski
On Tue, Jul 19, 2022 at 01:48:17PM +0200, Andrej Shadura wrote:
> No need to go that far.

> Andrea in Germany is traditionally a woman’s name, Andrea in Italy is a
> masculine name.  How can we tell if a certain specific Andrea is named
> according to the German (Czech, Slovak etc) tradition (and hence likely a
> woman) or the Italian (and hence probably a man)?  There’s no way to
> generally tell this based on the name alone.
> 
> Take Misha/Miša/Миша or Petya/Peťa/Петя.  In Russian tradition, these are
> very likely masculine names, from Mikhail and Petr.

If only this piece of software had a distinction between "almost always
male", "leaning male", "neutral", "leaning female", "almost always
female"...  Oh wait, it does!
Precisely for the reason you mention.

> And we haven’t yet touched the topic of people who were given non-traditional 
> names.

In which case it says "unknown".


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ We domesticated dogs 36000 years ago; together we chased
⣾⠁⢠⠒⠀⣿⡁ animals, hung out and licked or scratched our private parts.
⢿⡄⠘⠷⠚⠋⠀ Cats domesticated us 9500 years ago, and immediately we got
⠈⠳⣄ agriculture, towns then cities. -- whitroth on /.



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Andrew M.A. Cater
On Sat, Jul 16, 2022 at 10:54:10PM +0200, Didier 'OdyX' Raboud wrote:
> Le samedi, 16 juillet 2022, 19.36:17 h CEST Adrian Bunk a écrit :
> > What tools did you use to generate this data?
> > 
> > The irony is that your "fight" requires exactly the tools you want to
> > condemn, and data Debian should better not collect at all.
> 
> It does not. The whole argument is "gender-guessing is prone to errors; if
> you want to know what gender a person identifies as, ask them".

Further to this, Edward has already suggested that he would be prepared
to withdraw the ITP.

Can we bring this thread to a close? It seems sensible to ask people
how they wish to be identified - in name, pronouns, gender - not to 
stress ourselves if someone replies "Would prefer not to say", and to leave
any automated attempt to guess to some level of obscurity and academic
interest. That approach would resolve many of the difficulties already
outlined.

Steve McIntyre has already raised the "Things people believe about names"
post which has other companions. Could I respectfully ask that the ITP
be withdrawn at this point?

With every good wish, as ever,

Andrew Cater



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Didier 'OdyX' Raboud
Le samedi, 16 juillet 2022, 19.36:17 h CEST Adrian Bunk a écrit :
> What tools did you use to generate this data?
> 
> The irony is that your "fight" requires exactly the tools you want to
> condemn, and data Debian should better not collect at all.

It does not. The whole argument is "gender-guessing is prone to errors; if you
want to know what gender a person identifies as, ask them".




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Adrian Bunk
On Fri, Jul 15, 2022 at 07:10:55PM +, Andrew M.A. Cater wrote:
> On Fri, Jul 15, 2022 at 07:05:09PM +0300, Adrian Bunk wrote:
>...
> > Debian is not a project that fights for trans people or fights for
> > denazification or fights for whatever other non-technical topics
> > individual contributors might consider worth fighting for elsewhere.
> 
> It does fight for under-represented / disadvantaged groups within Debian in a
> Debian context.

What data do you have to prove or disprove whether a group is actually 
under-represented or disadvantaged within Debian?

What tools did you use to generate this data?

The irony is that your "fight" requires exactly the tools you want to 
condemn, and data Debian should better not collect at all.

>...
> > The exact opposite of diversity is to call everything one dislikes or 
> > disagrees with "harassment" or *phobic.
> 
> I wonder how it would be if you wanted to use a similar script to test
> familiarity with English in our developers / a test for neurodiversity
> and high functioning autism / a test for colour vision or dexterity to
> single out anybody who's visually impaired or blind or a guess for
> background religion/beliefs/no belief - I don't think any of these
> (hypothetical, straw man) scripts would be useful or constructive or
> contribute well to our Debian community.
>...

Most software can be used for many purposes good or bad, looking at 
the vast amount of packages maintained by the Debian Med team I am 
quite astonished that you consider it not constructive contributions to 
Debian when people are packaging software that can be used to diagnose 
diseases.

I would rather wonder for how many of your "hypothetical" examples we 
already ship software.

I wouldn't be surprised if we already ship software that can tell the 
familiarity with English of a person based on a few emails.

Steve highlighted the problems of trying to guess gender based on names;
determining the biological gender based on voice can be far more
reliable than using the name. Debian does publish videos with audio that
can be used for the mentioned use case of determining the gender of
Debconf speakers. I would expect that speech recognition tools for deaf 
people either already or in the future will be able to output gender and 
accent of the speaker in an audio recording.

In the Debian Med or Deep Learning teams we might some day have software 
that can test for high functioning autism of the speakers in the videos 
of Debconf talks.

Trying to restrict tools is not a new idea.
An EU directive from 2013 makes it mandatory that production or
distribution of tools primarily for the purpose of committing hacking
offences must have a maximum sentence of at least 2 years in prison
in all EU countries.
Debian ships many such tools, in practice prosecution faces the
technical reality that the same tools are used for testing the
security of systems against attacks.

Prosecuting people caught using these tools for offences works.

What can realistically work for your examples is not restricting tools,
but restricting what can be done with data.

One thing we can and should do to protect members of our Debian
community is a robust legal response of prosecutions under civil and
criminal law if people are guilty of privacy abuse through policies or
practices when handling personal data that are not compliant with
applicable legislation like the GDPR.

Even in cases where such prosecution is not happening, it should be 
clear that privacy abusers are not welcome in our Debian community.

What is the defined maximum retention time for sensitive personal data
like sexual orientation, race, ethnicity, religion or political beliefs
in the Debian Community Team?
If there is none or if it is too long, how to fix this swiftly?
If it is not fixed swiftly, how should Debian act against the abusers?

>...
> Andy Cater 

cu
Adrian



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Holger Levsen
On Sat, Jul 16, 2022 at 10:05:59AM +, Stefano Rivera wrote:
> I think it's our business, as a community, and as conference organisers,
> to try to increase the diversity at our events. To me, that means
> increasing speaker diversity, primarily. Attendee diversity won't change
> unless the speaker diversity changes.

I agree, I just don't see how collecting statistics can be useful here. OTOH
I know about several cases where harmless and unneeded data collection has 
become harmful later.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Imagine god created trillions of galaxies but freaks out because some dude
kisses another.




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Stefano Rivera
Hi debian-devel (2022.07.16_09:12:16_+)
> I guess we should expose this in our conference statistics. We care
> about it.

And in the future, we will:
https://salsa.debian.org/debconf-team/public/websites/wafer-debconf/-/merge_requests/150

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Devops PK Carlisle LLC
First, by all means, if you want to develop, package and maintain this 
app in keeping with the quality standards a Linux distro like Debian 
demands, go to it.


Having said that, it sounds like one of the goofiest things I have heard 
of recently, and I can not see ever having a use for it. But that does 
not remotely mean that it should be disallowed. I have done my share of 
coding little functions which I adore, but which most people have not 
heard of, and would have little interest in. It doesn't ruin my day. 
I'll share, but I am Customer Number One.


With gender-guesser, if I ever saw it on a list of available modules, 
I'd say "Huh, sounds like trouble waiting to happen. No, thanks." and 
that would be all. It may be well-intentioned, but most people aren't 
triggered if a computer gets their gender wrong, and those that are 
triggered by that sort of thing tend to go full Karen if the guess is 
wrong. Therefore you may be setting yourself up for primarily intense 
and negative feedback.


On 7/15/22 19:01, Marvin Renich wrote:
> * Jeremy Bicha  [220714 10:06]:
> > On Thu, Jul 14, 2022 at 2:41 PM Roberto C. Sánchez  wrote:
> > > On Thu, Jul 14, 2022 at 11:14:43AM +0100, Steve McIntyre wrote:
> > > > edw...@4angle.com wrote:
> > > > > Package: wnpp
> > > > > Severity: wishlist
> > > > > Owner: Edward Betts
> > > > > X-Debbugs-Cc: debian-devel@lists.debian.org, debian-pyt...@lists.debian.org
> > > > >
> > > > > * Package name: gender-guesser
> > > > >   Version : 0.4.0
> > > > >   Upstream Author : Israel Saeta Pérez
> > > > > * URL : https://github.com/lead-ratings/gender-guesser
> > > > > * License : GPL-3 & GFDL-1.2+
> > > > >   Programming Lang: Python
> > > > >   Description : Guess the gender from first name
> > > >
> > > > Oh, not *another* package that tries to guess things from names.
> > > >
> > > > Do you have a real use for this package?
> > >
> > > Why in the world is that even a relevant question?  There are plenty of
> > > packages in the archive which are useful to essentially nobody apart
> > > from the maintainer and there are even packages which are maintained
> > > without being useful to the maintainer at all (but rather useful to
> > > others).
> > >
> > > > There are a *lot* of issues
> > > > in this area, and mis-gendering people is not something to risk
> > > > lightly...
> > >
> > > "There are a *lot* of issues in this area" seems rather nebulous.  In
> > > which area?  Given the fact that we have clear and rather unambiguous
> > > guidelines for what constitutes software which is appropriate for
> > > inclusion in the archive, and given that on its face this software does
> > > not seem to be in conflict with any of those guidelines, what then is
> > > the problem?  BTW, I'm not interested in any sort of "well I don't like
> > > ..." or "such and such could offend so and so ..." sort of arguments.
> >
> > Debian has a Diversity Statement [1] which says that Debian welcomes
> > people regardless of how they identify themselves. Trans people and
> > non-binary people face a lot of discrimination, harassment and
> > bullying around the world. That bad treatment of these people is
> > against Debian's core values. Therefore, the Debian Project wouldn't
> > want to distribute software that appears to facilitate that kind of
> > harassment, regardless of the software license it is released under.
> > We might not want to distribute such software even if it also has
> > non-harmful uses. We don't have to distribute *everything* ourselves.
>
> People within the Debian community have a right to expect that others in
> the community will not bully, harass, or denigrate them.  They do _not_
> have any right to expect that others will not offend them by discussing
> or making contributions that espouse values that are different and
> incompatible with their own.  Such an expectation assumes that one set
> of values is correct and the other is wrong.  In order for such an
> expectation to be met, only one of the two sets of values could exist
> within Debian.
>
> Saying that gender-guesser should not be packaged within Debian (using
> the excuse given early in this thread) is excluding a contribution based
> on the values to which that package adheres and possibly the contributor
> and the users who would like to use it.  This is contrary to being
> inclusive.
>
> Being offended by someone else's civil expression of their values
> (including the packaging of a particular piece of software) is not the
> same as being bullied or denigrated.  Please stop trying to use the
> excuse "it might offend someone" to block participation or inclusion of
> software.  Instead, be inclusive and acknowledge that others' values may
> be different from and incompatible with yours, and accept that Debian is
> a collection of software from diverse sources and some of it may not
> adhere to your values.
>
> This is the difference between true inclusiveness and the false
> "political correctness" that seems to be permeating our society today.
>
> When we can all say, "I disagree with your values, but I accept you as a
> Debian contributor," then we will be truly inclusive.
>
> ...Marvin





Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Steve McIntyre
Roberto C. Sánchez wrote:
>On Thu, Jul 14, 2022 at 11:14:43AM +0100, Steve McIntyre wrote:
>> 
>> Do you have a real use for this package? 
>
>Why in the world is that even a relevant question?  There are plenty of
>packages in the archive which are useful to essentially nobody apart
>from the maintainer and there are even packages which are maintained
>without being useful to the maintainer at all (but rather useful to
>others).

I think it's a valid question to ask.

>> There are a *lot* of issues
>> in this area, and mis-gendering people is not something to risk
>> lightly...
>
>"There are a *lot* of issues in this area" seems rather nebulous.  In
>which area?  Given the fact that we have clear and rather unambiguous
>guidelines for what constitutes software which is appropriate for
>inclusion in the archive, and given that on its face this software does
>not seem to be in conflict with any of those guidelines, what then is
>the problem?

I'll link to

  https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

again, as a start. Assuming *anything* about names is iffy at best,
and trying to derive other information (whether that's gender, age,
nationality, whatever) from names is unreliable as all hell. When
there is the potential for *also* causing offense from that unreliable
information then I'd hope that people would know better.

>BTW, I'm not interested in any sort of "well I don't like ..." or
>"such and such could offend so and so ..." sort of arguments.

Thanks, I'm well aware that you don't care. Maybe you could try?

-- 
Steve McIntyre, Cambridge, UK.                                st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Stefano Rivera
Hi Holger (2022.07.16_09:42:56_+)
> but why? how is gender relevant for participating in DebConf as
> a whole? (i can see how it could be relevant for some events, but
> not for the whole conference.)

It's not. It's for statistics, as we say when we collect it.

I think it's our business, as a community, and as conference organisers,
to try to increase the diversity at our events. To me, that means
increasing speaker diversity, primarily. Attendee diversity won't change
unless the speaker diversity changes.

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Steve McIntyre
Edward wrote:
>Steve McIntyre  wrote:
>> 
>> Oh, not *another* package that tries to guess things from names.
>> 
>> Do you have a real use for this package? There are a *lot* of issues
>> in this area, and mis-gendering people is not something to risk
>> lightly...
>
>I can add a warning to the package about problems guessing things from names,
>or I'm happy to retract the ITP.

At the very least I think a warning would be useful, yes.

-- 
Steve McIntyre, Cambridge, UK.                                st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Steve McIntyre
Adrian Bunk wrote:
>On Thu, Jul 14, 2022 at 04:05:35PM +0200, Jeremy Bicha wrote:
>>...
>> Debian has a Diversity Statement [1] which says that Debian welcomes
>> people regardless of how they identify themselves. Trans people and
>> non-binary people face a lot of discrimination, harassment and
>> bullying around the world.
>
>Our Diversity Statement says that Debian "welcomes and encourages 
>participation by everyone".
>
>People who express how they identify themselves by having a swastika 
>tattoo on their forehead also face a lot of discrimination, harassment 
>and bullying around the world. Our Diversity Statement makes it clear 
>that we are welcoming and encouraging their participation and are not 
>ourselves discriminating against them.

And, to add the bit that a lot of people conveniently ignore when
trying to make strawman arguments:

"We welcome contributions from everyone as long as they interact
constructively with our community."

-- 
Steve McIntyre, Cambridge, UK.                                st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Christoph Biedl
Steve McIntyre wrote...

> IMHO there are 2 points to an ITP:
>
>  * to save effort in case two people might be working on the same
>    package

And having such a lock is a good thing.

>  * to invite discussion on debian-devel / elsewhere

Which might include reactions like:

* There is already a package that does the same thing. Do we really need
  a duplicate?
* It's already packaged, possibly under an obscure name or within
  another package.
* There are issues with the license, the package description and
  similar things.

> If people post an ITP and upload immediately, then I don't think that
> helps on either count.

Indeed. And if it's about "Getting through NEW takes so much time", then
giving it a few days more instead of increasing the risk of a REJECT is
even the better way around.

> How do others feel?

To me, uploading immediately after ITP signals "I am 100% sure this
package and my packages will certainly not create objections of any
kind". Which I find somewhere between highly optimistic and plain
arrogant.

So, in my opinion, as a rule of thumb, leave three days
between these two actions.

Christoph




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Holger Levsen
On Sat, Jul 16, 2022 at 09:12:16AM +, Stefano Rivera wrote:
> If you're asking about DebConf 22, we have that information:
[...] 
> I guess we should expose this in our conference statistics. We care
> about it.
 
but why? how is gender relevant for participating in DebConf as
a whole? (i can see how it could be relevant for some events, but
not for the whole conference.)

society should be *less* about gender, sex, race, etc, not more.

that's at least for me the reason why I usually select "decline
to state". my gender is none of your business for running DebConf.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

There are no jobs on a dead planet. (Also many other things but people mostly
seem to care about jobs.)




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Jonas Smedegaard
Quoting Enrico Zini (2022-07-16 10:17:11)
> On Thu, Jul 14, 2022 at 12:43:16PM +0100, Edward Betts wrote:
> 
> > I've been writing some code to work out the gender balance of speakers at a
> > conference. It parses the pentabarf XML of the schedule and feeds the speaker
> > names to this module.
> > 
> > Here's the results for Debconf 22.
> > 
> > 72 speakers
> > 
> > male           48   66.7%
> > unknown        16   22.2%
> > female          4    5.6%
> > mostly_male     2    2.8%
> > andy            1    1.4%
> > mostly_female   1    1.4%
> 
> If the library works as the author intended, it will identify "Enrico"
> as male, which is a gender *I* don't identify with.
> 
> This kind of extends to anything related to a person's identity: any
> software trying to determine an aspect of a person's identity is bound
> to eventually conflict with how a person lives their own identity.
> 
> That conflict can be quite painful, so it's not surprising you get
> strong reactions when intending to package something that pretends to
> tell people what a person is, without asking them first.
> 
> This external determination of identity will then extend to the library
> to any software or research using it. I totally understand the good
> intentions, but the result honestly amplifies the pain.
> 
> I think the right way to get the statistics you're looking for would be
> to ask speakers to state their own identity on pentabarf, so that
> statistics are based on self-determination, rather than external
> overrides of it.

I am not aware that the author or packager or any user of gender-guesser
would *override* explicitly stated identity annotations.

What makes sense to me is to apply a tool like gender-guesser *after*
asking for explicit annotation and then apply guessing only when the
speaker (or whoever was involved) answered "Don't care" (which I would
find sensible to place as default answer).
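
A minimal sketch of that ordering, assuming a hypothetical `guess` callable
standing in for whatever name-based guesser is used (the function name, the
"Don't care" label and the return shape are illustrative only, not part of
gender-guesser or pentabarf):

```python
from typing import Callable, Tuple

DONT_CARE = "Don't care"  # hypothetical label for the default answer


def resolve_gender(stated: str, first_name: str,
                   guess: Callable[[str], str]) -> Tuple[str, str]:
    """Return (value, source); guess only when the answer was "Don't care"."""
    if stated != DONT_CARE:
        # Any explicit answer, including a refusal to answer, is kept as-is.
        return stated, "stated"
    return guess(first_name), "guessed"
```

Keeping the source next to the value lets later statistics report guessed
and stated answers separately.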


My point being that I see a use-case for this library that is respectful
- am I missing something and the existence of such a tool is *always*
  painful for some?

(sure, it *can* be painful if used wrongly or sloppily, as is the case
with any tool - so I think the more relevant question is if it *always*
is painful for some).


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Stefano Rivera
Hi Enrico (2022.07.16_08:17:11_+)
> I think the right way to get the statistics you're looking for would be
> to ask speakers to state their own identity on pentabarf, so that
> statistics are based on self-determination, rather than external
> overrides of it.

If you're asking about DebConf 22, we have that information:

 count | gender
-------+------------------
    21 | Decline to State
    56 | Male
     2 | Non-Binary
     9 | Female
(4 rows)

I guess we should expose this in our conference statistics. We care
about it.

If you want similar data for DebConfs since 16, we can get that, too.

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Enrico Zini
On Thu, Jul 14, 2022 at 12:43:16PM +0100, Edward Betts wrote:

> I've been writing some code to work out the gender balance of speakers at a
> conference. It parses the pentabarf XML of the schedule and feeds the speaker
> names to this module.
> 
> Here's the results for Debconf 22.
> 
> 72 speakers
> 
> male           48   66.7%
> unknown        16   22.2%
> female          4    5.6%
> mostly_male     2    2.8%
> andy            1    1.4%
> mostly_female   1    1.4%

If the library works as the author intended, it will identify "Enrico"
as male, which is a gender *I* don't identify with.

This kind of extends to anything related to a person's identity: any
software trying to determine an aspect of a person's identity is bound
to eventually conflict with how a person lives their own identity.

That conflict can be quite painful, so it's not surprising you get
strong reactions when intending to package something that pretends to
tell people what a person is, without asking them first.

This external determination of identity will then extend to the library
to any software or research using it. I totally understand the good
intentions, but the result honestly amplifies the pain.

I think the right way to get the statistics you're looking for would be
to ask speakers to state their own identity on pentabarf, so that
statistics are based on self-determination, rather than external
overrides of it.


Enrico

-- 
GPG key: 4096R/634F4BD1E7AD5568 2009-05-08 Enrico Zini 




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Ansgar
On Sat, 2022-07-16 at 08:51 +0100, Steve McIntyre wrote:
> Are you actually somehow claiming that Debian's core values include
> bad treatment of Jews and those other groups? Seriously, WTF?

We had project members supporting antisemitic groups on project lists
and DebConf events, for example in discussions about a DebConf taking
place in a certain location. Not much happened as a result.

(FWIW, it was said project members even went so far to try to get
support for having sponsors not sponsor that DebConf, i.e., directly
working against the project. Also seems to be fine.)

Ansgar



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-16 Thread Steve McIntyre
Adam Borowski wrote:
>On Thu, Jul 14, 2022 at 04:05:35PM +0200, Jeremy Bicha wrote:
>> > > >* Package name: gender-guesser
>
>> Debian has a Diversity Statement [1] which says that Debian welcomes
>> people regardless of how they identify themselves. Trans people and
>> non-binary people face a lot of discrimination, harassment and
>> bullying around the world. That bad treatment of these people is
>> against Debian's core values.
>
>Unless they're Jewish, believe that a woman should be allowed to abort a
>Down syndrome fetus, believe that there's more than just a name to the
>gender, or have a kind of transsexualism that matches their life's
>experiences and is detectable by brain imaging but the loud group says
>doesn't exist.
>
>The inconsistency here is astounding.

I genuinely have no clue what you're trying to say here.

Are you actually somehow claiming that Debian's core values include
bad treatment of Jews and those other groups? Seriously, WTF?

-- 
Steve McIntyre, Cambridge, UK.st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-15 Thread Marvin Renich
* Jeremy Bicha  [220714 10:06]:
> On Thu, Jul 14, 2022 at 2:41 PM Roberto C. Sánchez  wrote:
> >
> > On Thu, Jul 14, 2022 at 11:14:43AM +0100, Steve McIntyre wrote:
> > > edw...@4angle.com wrote:
> > >
> > > >Package: wnpp
> > > >Severity: wishlist
> > > >Owner: Edward Betts 
> > > >X-Debbugs-Cc: debian-devel@lists.debian.org, debian-pyt...@lists.debian.org
> > > >
> > > >* Package name: gender-guesser
> > > >  Version : 0.4.0
> > > >  Upstream Author : Israel Saeta Pérez 
> > > >* URL : https://github.com/lead-ratings/gender-guesser
> > > >* License : GPL-3 & GFDL-1.2+
> > > >  Programming Lang: Python
> > > >  Description : Guess the gender from first name
> > >
> > > Oh, not *another* package that tries to guess things from names.
> > >
> > > Do you have a real use for this package?
> >
> > Why in the world is that even a relevant question?  There are plenty of
> > packages in the archive which are useful to essentially nobody apart
> > from the maintainer and there are even packages which are maintained
> > without being useful to the maintainer at all (but rather useful to
> > others).
> >
> > > There are a *lot* of issues
> > > in this area, and mis-gendering people is not something to risk
> > > lightly...
> > >
> >
> > "There are a *lot* of issues in this area" seems rather nebulous.  In
> > which area?  Given the fact that we have clear and rather unambiguous
> > guidelines for what constitutes software which is appropriate for
> > inclusion in the archive, and given that on its face this software does
> > not seem to be in conflict with any of those guidelines, what then is
> > the problem?  BTW, I'm not interested in any sort of "well I don't like
> > ..." or "such and such could offend so and so ..." sort of arguments.
> 
> Debian has a Diversity Statement [1] which says that Debian welcomes
> people regardless of how they identify themselves. Trans people and
> non-binary people face a lot of discrimination, harassment and
> bullying around the world. That bad treatment of these people is
> against Debian's core values. Therefore, the Debian Project wouldn't
> want to distribute software that appears to facilitate that kind of
> harassment, regardless of the software license it is released under.
> We might not want to distribute such software even if it also has
> non-harmful uses. We don't have to distribute *everything* ourselves.

People within the Debian community have a right to expect that others in
the community will not bully, harass, or denigrate them.  They do _not_
have any right to expect that others will not offend them by discussing
or making contributions that espouse values that are different and
incompatible with their own.  Such an expectation assumes that one set
of values is correct and the other is wrong.  In order for such an
expectation to be met, only one of the two sets of values could exist
within Debian.

Saying that gender-guesser should not be packaged within Debian (using
the excuse given early in this thread) is excluding a contribution based
on the values to which that package adheres and possibly the contributor
and the users who would like to use it.  This is contrary to being
inclusive.

Being offended by someone else's civil expression of their values
(including the packaging of a particular piece of software) is not the
same as being bullied or denigrated.  Please stop trying to use the
excuse "it might offend someone" to block participation or inclusion of
software.  Instead, be inclusive and acknowledge that others' values may
be different from and incompatible with yours, and accept that Debian is
a collection of software from diverse sources and some of it may not
adhere to your values.

This is the difference between true inclusiveness and the false
"political correctness" that seems to be permeating our society today.

When we can all say, "I disagree with your values, but I accept you as a
Debian contributor," then we will be truly inclusive.

...Marvin



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-15 Thread Tzafrir Cohen
On Fri, Jul 15, 2022 at 07:10:55PM +, Andrew M.A. Cater wrote:

> Debconf is on in Kosovo right now. If I had to work out Albanian
> gender mappings from names, I'd have no clue.

I decided to take a random name: the President of Kosovo. In Wikipedia I
see it is Vjosa Osmani, a name completely unfamiliar to me.

>>> import gender_guesser.detector as gender
>>> d = gender.Detector()
>>> print(d.get_gender(u"Vjosa"))
female

This time the guess happened to be correct.

> 
> Then S. Indian - Malayalam character sets?? and names from a number of
> Indian languages  then Israel and Hebrew/Arabic
> Taiwan had Chinese character sets and names

I tried various Hebrew names: names in Hebrew letters are always
unknown. My name (a rare one) is unknown even in Latin letters. However
some common Hebrew names in Latin letters are detected correctly.

I have not tried to do any proper sampling or experiment. Just a random
data point.

-- 
mail / xmpp / matrix: tzaf...@cohens.org.il



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-15 Thread Andrew M.A. Cater
On Fri, Jul 15, 2022 at 07:05:09PM +0300, Adrian Bunk wrote:
> On Thu, Jul 14, 2022 at 04:05:35PM +0200, Jeremy Bicha wrote:
> >...
> > Debian has a Diversity Statement [1] which says that Debian welcomes
> > people regardless of how they identify themselves. Trans people and
> > non-binary people face a lot of discrimination, harassment and
> > bullying around the world.
> 
> Our Diversity Statement says that Debian "welcomes and encourages 
> participation by everyone".
> 

Correct. That implies useful, constructive participation in accordance with 
our community values and not being divisive for the sake of it.

gender-guesser may have a negative effect on some of our community of
contributors and users. It probably doesn't help people who are gender-fluid,
non-binary or trans who feel as if they are being categorised by their names.

The categorisation isn't aware of diversity in language - as it stands, it
appears biased to Western languages and only a few of them. The world has
more cultural groups and linguistic categories to take account of.

Consider - 

Debconf is on in Kosovo right now. If I had to work out Albanian
gender mappings from names, I'd have no clue.

Then S. Indian - Malayalam character sets?? and names from a number of
Indian languages  then Israel and Hebrew/Arabic
Taiwan had Chinese character sets and names

At what point is this useful for a very small subset of the world's population?

> People who express how they identify themselves by having a swastika 
> tattoo on their forehead also face a lot of discrimination, harassment 
> and bullying around the world. Our Diversity Statement makes it clear 
> that we are welcoming and encouraging their participation and are not 
> ourselves discriminating against them.
> 

A swastika tattoo on the forehead would be mostly invisible unless meeting
someone in person. I couldn't guarantee how people would react on first
meeting such a person - I'm assuming this is a straw man for argument
purposes. 

> > That bad treatment of these people is
> > against Debian's core values.
> >...
> 
> Our Diversity Statement says that we "welcome contributions from 
> everyone as long as they interact constructively with our community".
> 
> Debian does not have core values regarding how people are treated 
> outside Debian.
> 

No, but it has very clear core values regarding how they are treated by
and within Debian. The adverse effects on some of our people and our
users probably outweigh the usefulness of such a script, even if it
is accurate and useful to all.

> Debian is not a project that fights for trans people or fights for
> denazification or fights for whatever other non-technical topics
> individual contributors might consider worth fighting for elsewhere.
> 

It does fight for under-represented / disadvantaged groups within Debian in a
Debian context.

> Diversity means that in any kinds of conflicts people on all sides
> are encouraged to contribute to Debian as long as they interact 
> constructively with our community.
> 

I would ask you (Adrian) to consider whether your mailing list message is a
fully constructive interaction with our community and its values here or is
just seeking to stir up opinions and arguments for the sake of it.
The Code of Conduct calls for a constructive and considered approach to debate
on our lists.

> > Therefore, the Debian Project wouldn't
> > want to distribute software that appears to facilitate that kind of
> > harassment, regardless of the software license it is released under.
> > We might not want to distribute such software even if it also has 
> > non-harmful uses.
> >...
> 
> The exact opposite of diversity is to call everything one dislikes or 
> disagrees with "harassment" or *phobic.
> 

I wonder how it would be if you wanted to use a similar script to test
familiarity with English in our developers / a test for neurodiversity
and high functioning autism / a test for colour vision or dexterity to
single out anybody who's visually impaired or blind or a guess for
background religion/beliefs/no belief - I don't think any of these
(hypothetical, straw man) scripts would be useful or constructive or
contribute well to our Debian community.

> > Thank you,
> > Jeremy Bicha
> 
> cu
> Adrian
>

With every good wish, as ever,

Andy Cater 



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-15 Thread Adrian Bunk
On Thu, Jul 14, 2022 at 04:05:35PM +0200, Jeremy Bicha wrote:
>...
> Debian has a Diversity Statement [1] which says that Debian welcomes
> people regardless of how they identify themselves. Trans people and
> non-binary people face a lot of discrimination, harassment and
> bullying around the world.

Our Diversity Statement says that Debian "welcomes and encourages 
participation by everyone".

People who express how they identify themselves by having a swastika 
tattoo on their forehead also face a lot of discrimination, harassment 
and bullying around the world. Our Diversity Statement makes it clear 
that we are welcoming and encouraging their participation and are not 
ourselves discriminating against them.

> That bad treatment of these people is
> against Debian's core values.
>...

Our Diversity Statement says that we "welcome contributions from 
everyone as long as they interact constructively with our community".

Debian does not have core values regarding how people are treated 
outside Debian.

Debian is not a project that fights for trans people or fights for
denazification or fights for whatever other non-technical topics
individual contributors might consider worth fighting for elsewhere.

Diversity means that in any kinds of conflicts people on all sides
are encouraged to contribute to Debian as long as they interact 
constructively with our community.

> Therefore, the Debian Project wouldn't
> want to distribute software that appears to facilitate that kind of
> harassment, regardless of the software license it is released under.
> We might not want to distribute such software even if it also has 
> non-harmful uses.
>...

The exact opposite of diversity is to call everything one dislikes or 
disagrees with "harassment" or *phobic.

> Thank you,
> Jeremy Bicha

cu
Adrian



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Wookey
On 2022-07-14 13:52 +0100, Steve McIntyre wrote:
> If the only reason for the ITP is to make lintian quiet then I think
> that's a total waste of time - it's following a guideline blindly
> without understanding the reason for it.
> 
> How do others feel?

I mostly file ITPs because Lintian moans at me if I don't. I agree
that ITPs just before uploading, mostly to keep lintian quiet, are
largely makework.

I do check them for other people working on stuff. I do file them for
things that I expect to take a long time. Essentially I find ITPs
useful for non-trivial packages, but not very useful for trivial ones.

Wookey
-- 
Principal hats:  Debian, Wookware, ARM
http://wookware.org/




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Adam Borowski
On Thu, Jul 14, 2022 at 08:14:13AM -0700, Russ Allbery wrote:
> Edward Betts  writes:
> 
> > I've been writing some code to work out the gender balance of speakers
> > at a conference. It parses the pentabarf XML of the schedule and feeds
> > the speaker names to this module.
> 
> > Here's the results for Debconf 22.
> 
> > 72 speakers
> 
> > male           48   66.7%
> > unknown        16   22.2%
> > female          4    5.6%
> > mostly_male     2    2.8%
> > andy            1    1.4%
> > mostly_female   1    1.4%
> 
> I fear this may be an example of statistics that look meaningful but
> probably aren't because the error bar is much higher than the typical
> consumer of the statistic intuitively thinks it is.  Although maybe that's
> not a worry in this case since the program itself says that it totally
> failed to make a guess about a quarter of the time.

So instead of making a knowingly-bad guess it says it doesn't know?
That's an upside in my book.

> I don't really have any objections to the package being in the archive;
> this is certainly something that a lot of people seem to want to do and
> thus seem to find some utility in doing.  But unless one has a
> higher-quality source of data than just names (preferred pronouns, direct
> self-identification, etc.)

Real people who want to switch their visible gender (ie, how others view
them) do pick a name that matches the gender they want to present to the
world.


As for actually using first names for statistics:
Several years ago, I did stats on who does uploads in Debian.
My methodology was:
1. limit packages to "key packages" (RT meaning, ie popcon/d-i/{b-,}deps)
2. take the last changed-by of every package (this avoids maintainers
   who haven't been seen in 20 years, etc)
3. for every unique name, manually:
   a. do I recognize that person?  If so, use gender I know.
   b. is the first name gender-specific? (I know western and slavic names)
   c. ~60 seconds of web search using DDG (I seemed to extend suspected
  females to >15 minutes somehow...)
   d. if none of the above gave an answer, say '?'
4. weight every name by the # of packages from 2. (ie, give count of
   packages)
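
Steps 2 to 4 above could be sketched roughly as follows. This is an
illustration only, not the script that was actually used: the function name
and data shapes are assumptions, and the manual lookups of step 3 are
represented by a hand-curated `known` mapping.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def weighted_gender_stats(last_uploads: Iterable[Tuple[str, str]],
                          known: Dict[str, str]) -> Counter:
    """Count packages per gender label.

    last_uploads: (package, last changed-by name) pairs, one per key package.
    known: manually curated name -> label mapping (steps 3a-3c); names
           missing from it fall through to '?' (step 3d).
    """
    # Step 2: one entry per package, so this counts packages per person.
    packages_per_name = Counter(name for _pkg, name in last_uploads)
    totals: Counter = Counter()
    for name, n_packages in packages_per_name.items():
        # Step 4: weight each person by the number of packages they uploaded.
        totals[known.get(name, "?")] += n_packages
    return totals
```

Because people are keyed by their full name, two distinct people sharing a
name are merged here too, which is the inaccuracy described below.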

Obviously every step introduces inaccuracies; eg. I used first-[mid...]-last
name combinations, merging distinct spellings only when I spotted them by
hand.  I seem to recall there are two DDs with the same name (I don't
remember who though), they'd be unified by this methodology.  Of course
there'll be no error if they're of the same gender but that's not the case
for other uses of the input data.

Thus, my stats are _not perfect_.  But, as long as I divulge my methodology,
it is sound science.

A famous example is one of the first phone surveys, that worked by randomly
selecting phone numbers.  The results turned out to be totally wrong -- with
individual-owned phones being still a quite new thing, phone owners tended
to be affluent and tech-friendly people, and their responses were not
representative of the population at large.

Thus, to be valid science, any use of statistics should disclose the
methodology used.  But, that doesn't make the results any less valid,
it merely attaches a caveat.  Barring some other error (eg. bogus random
generator, ignoring people who hang up, etc), that survey still provided
accurate info on the population of phone owners.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ Ash nazg durbatulûk,
⣾⠁⢠⠒⠀⣿⡁   ash nazg gimbatul,
⢿⡄⠘⠷⠚⠋⠀ ash nazg thrakatulûk
⠈⠳⣄   agh burzum-ishi krimpatul.



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Russ Allbery
Vincent Bernat  writes:
> On 2022-07-14 17:14, Russ Allbery wrote:

>> (Also, due to the limitations and history of naming conventions, the
>> software is inherently trying to map into a gender binary, which if one
>> is attempting to capture self-identification is likely to be unhelpful
>> for many populations, such as ones with lots of people under 30, due to
>> not having a way to represent nonbinary people.)

> This one does not. It maps a first name to male, female, androgynous,
> mostly male, mostly female, or unknown.

Oh, is that what "andy" in the output meant?  I thought that was some
other quirk of the software, but in retrospect I should have figured that
out.

Thanks for the correction.  Androgynous and nonbinary are not really the
same thing, but at least the software is trying to incorporate that, to
the extent that people's names reflect their gender at all.

-- 
Russ Allbery (r...@debian.org)  



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Adam Borowski
On Thu, Jul 14, 2022 at 04:05:35PM +0200, Jeremy Bicha wrote:
> > > >* Package name: gender-guesser

> Debian has a Diversity Statement [1] which says that Debian welcomes
> people regardless of how they identify themselves. Trans people and
> non-binary people face a lot of discrimination, harrassment and
> bullying around the world. That bad treatment of these people is
> against Debian's core values.

Unless they're Jewish, believe that a woman should be allowed to abort a
Down syndrome fetus, believe that there's more than just a name to the
gender, or have a kind of transsexualism that matches their life's
experiences and is detectable by brain imaging but the loud group says
doesn't exist.

The inconsistency here is astounding.

> Therefore, the Debian Project wouldn't
> want to distribute software that appears to facilitate that kind of
> harassment, regardless of the software license it is released under.
> We might not want to distribute such software even if it also has
> non-harmful uses.

While it is not 100% accurate and thus shouldn't be used to determine the
gender of an _individual_, it's a very useful tool for analyzing larger
datasets.

And where it comes to diversity, we so much need data rather than
assumptions.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ I was born a dumb, ugly and work-loving kid, then I got swapped on
⢿⡄⠘⠷⠚⠋⠀ the maternity ward.
⠈⠳⣄



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Jonathan Carter (highvoltage)

On 2022/07/14 14:52, Steve McIntyre wrote:

IMHO there are 2 points to an ITP:

  * to save effort in case two people might be working on the same
package
  * to invite discussion on debian-devel / elsewhere

If people post an ITP and upload immediately, then I don't think that
helps on either count.


I believe an ITP is even helpful in that case. It's happened on many
packages before that the package had an issue and was not accepted by the FTP
team, and then eventually the ITP got renamed to an RFP. Also, if I want
to package something, I (and tools like reportbug) check whether there 
are ITPs/RFPs filed. I /don't/ check whether the package is in NEW.


So, in short, I think that filing ITPs is still a good practice and the 
times where it should be left out are really some edge / special cases.


-Jonathan



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Vincent Bernat

On 2022-07-14 17:14, Russ Allbery wrote:

(Also, due to the limitations and history of naming conventions, the
software is inherently trying to map into a gender binary, which if one is
attempting to capture self-identification is likely to be unhelpful for
many populations, such as ones with lots of people under 30, due to not
having a way to represent nonbinary people.)


This one does not. It maps a first name to male, female, androgynous, 
mostly male, mostly female, or unknown.




Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Russ Allbery
Edward Betts  writes:

> I've been writing some code to work out the gender balance of speakers
> at a conference. It parses the pentabarf XML of the schedule and feeds
> the speaker names to this module.

> Here's the results for Debconf 22.

> 72 speakers

> male           48   66.7%
> unknown        16   22.2%
> female          4    5.6%
> mostly_male     2    2.8%
> andy            1    1.4%
> mostly_female   1    1.4%

I fear this may be an example of statistics that look meaningful but
probably aren't because the error bar is much higher than the typical
consumer of the statistic intuitively thinks it is.  Although maybe that's
not a worry in this case since the program itself says that it totally
failed to make a guess about a quarter of the time.

I don't really have any objections to the package being in the archive;
this is certainly something that a lot of people seem to want to do and
thus seem to find some utility in doing.  But unless one has a
higher-quality source of data than just names (preferred pronouns, direct
self-identification, etc.), I personally would be worried about attaching
the appearance of scientific accuracy (three significant figures!) to data
that, depending on the nationalities involved and the strength of naming
conventions and other factors, may be only rough guesswork.
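
To put a rough number on that, here is a sketch using the standard Wilson
score interval for a proportion. It assumes the 72 speakers can be treated
as a sample, and it covers sampling noise only; wrong guesses from the names
themselves would widen the real uncertainty further. The function name is
illustrative.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


low, high = wilson_interval(4, 72)  # the "female" row: 4 of 72 speakers
print(f"4/72 = {4 / 72:.1%}, interval roughly {low:.1%} to {high:.1%}")
```

For that row the interval runs from roughly 2% to 13%, so the third
significant figure in "5.6%" carries no real information.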

I know someone who keeps similar statistics as an aid to balancing the
range of authors of books he chooses to review, and I see why someone
would want to do that.  But he tries to use higher-quality data sources
than guessing based on names, and that feels like a best practice for that
kind of thing to me.

(Also, due to the limitations and history of naming conventions, the
software is inherently trying to map into a gender binary, which if one is
attempting to capture self-identification is likely to be unhelpful for
many populations, such as ones with lots of people under 30, due to not
having a way to represent nonbinary people.)

Anyway, that's just all my personal opinion and I don't think any of that
says that the package shouldn't be in the archive.  We package all sorts
of not-very-useful software and that's totally fine.  But I've worked in
identity management fields for long enough to have a professional
knee-jerk reaction to anyone doing computer analysis of names or trying to
record gender in any way other than simply asking people.  :)

-- 
Russ Allbery (r...@debian.org)  



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Sean Whitton
Hello,

On Thu 14 Jul 2022 at 01:52PM +01, Steve McIntyre wrote:

> IMHO there are 2 points to an ITP:
>
>  * to save effort in case two people might be working on the same
>package
>  * to invite discussion on debian-devel / elsewhere
>
> If people post an ITP and upload immediately, then I don't think that
> helps on either count.

Regarding the first point, in previous discussions others have said that
they don't look at NEW but do look at ITPs, so they still want it posted
for that reason.

> If the only reason for the ITP is to make lintian quiet then I think
> that's a total waste of time - it's following a guideline blindly
> without understanding the reason for it.

Definitely.

-- 
Sean Whitton



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Sean Whitton
Hello,

On Thu 14 Jul 2022 at 02:23PM +02, Johannes Schauer Marin Rodrigues wrote:

> Quoting Steve McIntyre (2022-07-14 13:54:52)
>> And I see you uploaded ~immediately - why even bother with an ITP?
>
> I did that quite a few times in the past as well. Is there a rule of how long I
> have to wait with my upload to NEW after filing the ITP?

No, in fact they're not even required.  The Haskell team doesn't post
them, for example.

-- 
Sean Whitton



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Russ Allbery
Steve McIntyre  writes:

> IMHO there are 2 points to an ITP:

>  * to save effort in case two people might be working on the same
>package
>  * to invite discussion on debian-devel / elsewhere

> If people post an ITP and upload immediately, then I don't think that
> helps on either count.

> If the only reason for the ITP is to make lintian quiet then I think
> that's a total waste of time - it's following a guideline blindly
> without understanding the reason for it.

> How do others feel?

I feel the same way, although I have for a long time been dubious of the
benefit of debian-devel review for ITPs (and, to be honest, the benefit of
WNPP in general apart from orphaning, although sometimes ITP and RFP bugs
are a convenient central place to document all the reasons why packaging
some specific piece of software is really hard), so the whole system feels
kind of creaky to me.

ITPs do occasionally catch things that really shouldn't be packaged, and
we don't have another good mechanism for doing it.  But the whole process
as we currently follow it feels oddly dated and manual and sometimes like
a box-ticking exercise.  (It also adds a lot of noise to debian-devel from
the perspective of, I suspect, most participants.  But we've talked about
that aspect of it before, and there was some moderate desire to see the
new packages flow by.)

Given that new packages as uploaded (a) include nearly all of the
information in an ITP in a more structured form, and (b) have to flow
through NEW anyway, I do sort of wonder if it would make sense to notify
some mailing list of every new source package, extracting similar fields
and the top entry of the changelog (which hopefully has some explanation
for why the package is being packaged for Debian, and we could encourage
people to do that), and then use the time the package sits waiting for NEW
review as the window for people to raise concerns.
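
As a rough sketch of the extraction step only: the newest entry of a
debian/changelog begins with a "package (version) distribution; ..." header
line and ends at the " -- maintainer" trailer line. The function name and the
returned dictionary shape below are assumptions for illustration; nothing
like this exists in the NEW processing tools as far as this sketch is
concerned.

```python
import re
from typing import Dict, List, Union

# First line of an entry: "package (version) distribution(s); metadata"
HEADER = re.compile(r"^(?P<source>\S+) \((?P<version>[^)]+)\) (?P<dist>[^;]+);")


def top_changelog_entry(changelog: str) -> Dict[str, Union[str, List[str]]]:
    """Return source, version, distribution and change lines of the newest entry."""
    lines = changelog.splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("changelog does not start with an entry header")
    changes: List[str] = []
    for line in lines[1:]:
        if line.startswith(" -- "):  # trailer line ends the newest entry
            break
        if line.strip():
            changes.append(line.strip())
    entry: Dict[str, Union[str, List[str]]] = dict(match.groupdict())
    entry["changes"] = changes
    return entry
```

A notification mail could then be built from these fields plus the package
description, without asking the uploader to type anything twice.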

That doesn't address the locking purpose of ITP (avoiding duplicate work).
I'm not sure how frequently ITPs are effective at doing that.  It feels
like the percentage of the total software ecosystem that Debian is
packaging is smaller than it used to be (we've grown but free software has
grown way faster) and most of the places where I'd expect contention to
happen are handled by language packaging teams that probably have their
own processes.

-- 
Russ Allbery (r...@debian.org)  



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Nilesh Patra
On Thu, Jul 14, 2022 at 02:31:22PM +0200, julien.pu...@gmail.com wrote:
> Hi,
> 
> Le jeudi 14 juillet 2022 à 14:16 +0200, Marc Haber a écrit :
> > On Thu, 14 Jul 2022 12:54:52 +0100, Steve McIntyre 
> > wrote:
> > > And I see you uploaded ~immediately - why even bother with an ITP?
> > 
> > Is it proper procedure to upload without an ITP?
> > 
> 
> No ; I have to admit a large percentage of the new packages I upload
> get their ITP minutes before the package is ready.
> 
> Basically: I wait for the bug number before pushing to salsa & NEW.

I do exactly this and have never had a problem. I maintain a number of
packages (as Julien does too) and push a bunch of stuff to NEW from time to
time.

Filing an ITP, waiting for it, having a discussion and then uploading is just
beyond the time I have these days -- sorry.
And needless to say, there is always the possibility of rejecting a package if
it is deemed inappropriate.

-- 
Best,
Nilesh



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Jeremy Bicha
On Thu, Jul 14, 2022 at 2:41 PM Roberto C. Sánchez  wrote:
>
> On Thu, Jul 14, 2022 at 11:14:43AM +0100, Steve McIntyre wrote:
> > edw...@4angle.com wrote:
> >
> > >Package: wnpp
> > >Severity: wishlist
> > >Owner: Edward Betts 
> > >X-Debbugs-Cc: debian-devel@lists.debian.org, debian-pyt...@lists.debian.org
> > >
> > >* Package name: gender-guesser
> > >  Version : 0.4.0
> > >  Upstream Author : Israel Saeta Pérez 
> > >* URL : https://github.com/lead-ratings/gender-guesser
> > >* License : GPL-3 & GFDL-1.2+
> > >  Programming Lang: Python
> > >  Description : Guess the gender from first name
> >
> > Oh, not *another* package that tries to guess things from names.
> >
> > Do you have a real use for this package?
>
> Why in the world is that even a relevant question?  There are plenty of
> packages in the archive which are useful to essentially nobody apart
> from the maintainer and there are even packages which are maintained
> without being useful to the maintainer at all (but rather useful to
> others).
>
> > There are a *lot* of issues
> > in this area, and mis-gendering people is not something to risk
> > lightly...
> >
>
> "There are a *lot* of issues in this area" seems rather nebulous.  In
> which area?  Given the fact that we have clear and rather unambiguous
> guidelines for what constitutes software which is appropriate for
> inclusion in the archive, and given that on its face this software does
> not seem to be in conflict with any of those guidelines, what then is
> the problem?  BTW, I'm not interested in any sort of "well I don't like
> ..." or "such and such could offend so and so ..." sort of arguments.

Debian has a Diversity Statement [1] which says that Debian welcomes
people regardless of how they identify themselves. Trans people and
non-binary people face a lot of discrimination, harassment and
bullying around the world. That bad treatment of these people is
against Debian's core values. Therefore, the Debian Project wouldn't
want to distribute software that appears to facilitate that kind of
harassment, regardless of the software license it is released under.
We might not want to distribute such software even if it also has
non-harmful uses. We don't have to distribute *everything* ourselves.

[1] https://www.debian.org/intro/diversity

Thank you,
Jeremy Bicha



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Jeremy Bicha
On Thu, Jul 14, 2022 at 3:12 PM Roberto C. Sánchez  wrote:
>
> On Thu, Jul 14, 2022 at 05:48:56PM +0500, Andrey Rahmatullin wrote:
> > On Thu, Jul 14, 2022 at 08:45:24AM -0400, Roberto C. Sánchez wrote:
> > >
> > > Filing the ITP then immediately uploading seems really sensible,
> > More sensible than not filing it?
> > This defeats both purposes of an ITP: getting it discussed and working as
> > a mutex for people who are thinking about packaging the same software. Are
> > there other purposes?
> >
> Filing the ITP and then uploading immediately seems like it still fully
> allows for both things you describe.

I also file ITP bugs and try to CC debian-devel as an announcement
that the package is coming soon to Debian.

I generally wait to file the ITP until my packaging is nearly ready in
Salsa and provide the Salsa link in my ITP bug which makes it easy for
someone to review it if they want.

Thank you,
Jeremy Bicha



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Roberto C . Sánchez
On Thu, Jul 14, 2022 at 05:48:56PM +0500, Andrey Rahmatullin wrote:
> On Thu, Jul 14, 2022 at 08:45:24AM -0400, Roberto C. Sánchez wrote:
> > 
> > Filing the ITP then immediately uploading seems really sensible,
> More sensible than not filing it?
> This defeats both purposes of an ITP: getting it discussed and working as
> a mutex for people who are thinking about packaging the same software. Are
> there other purposes?
> 
Filing the ITP and then uploading immediately seems like it still fully
allows for both things you describe.

The discussion can take place as the package waits in NEW (which can be
highly variable, from days to weeks, even to months).  Revisions can be
uploaded (if called for based on the discussion) without losing the
place in NEW.

As for the mutex aspect, suppose I have some software that I want to
package.  I experiment and create a package before filing an ITP, for
reasons, and then decide, "yes, I do want to upload this".  First I
search existing ITPs and see if someone has expressed an interest in
working on this.  If so, I communicate and coordinate with that person.
If not, I file a new ITP.  At that point, I am faced with a question,
"how long to wait before uploading?"  We can make the argument that
whatever delay is chosen is likely to be insufficient for any of a
number of reasons.  So, then what's the difference with just uploading
as soon as the ITP is filed?  If someone comes along during the period
where the package is in NEW and has an interest, then a simple "hey I'm
also interested in packaging this, can we join forces?" seems like the
thing to do.
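
The check-then-file part of that workflow can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, the list of WNPP bug titles is supplied by the caller (how it is fetched is deliberately left out), and the only thing taken from this thread is the "ITP: name -- description" title format.

```python
import re

# WNPP bug titles look like "ITP: name -- short description".
TITLE_RE = re.compile(r"^(?P<tag>[A-Z]+):\s+(?P<name>\S+)\s+--\s+(?P<desc>.*)$")


def existing_itps(package, titles):
    """Return the titles that are ITPs for exactly this package name."""
    hits = []
    for title in titles:
        match = TITLE_RE.match(title.strip())
        if match and match["tag"] == "ITP" and match["name"] == package:
            hits.append(title.strip())
    return hits


def next_step(package, titles):
    """Coordinate if someone already holds the 'lock', otherwise file."""
    if existing_itps(package, titles):
        return "coordinate with the existing ITP owner"
    return "file a new ITP"


# Sample titles: the first is from this thread, the second is made up.
SAMPLE_TITLES = [
    "ITP: gender-guesser -- Guess the gender from first name",
    "RFP: some-other-tool -- hypothetical example entry",
]

if __name__ == "__main__":
    print(next_step("gender-guesser", SAMPLE_TITLES))
    print(next_step("some-other-tool", SAMPLE_TITLES))
```

Note that an RFP does not count as a claim in this sketch; only an ITP for the same name does.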

Perhaps then it might be that ITP should not be mandatory.  If we
substitute "search NEW and search open ITPs" for "search open ITPs" then
the main reason to have ITPs would be for the instance where someone has
the intention of packaging something but not until some time in the
future.  This might be because the person lacks sufficient time in the
present, because upstream has not yet made a first release suitable for
upload, or any of a number of other reasons.  In any event, this seems
like something that each maintainer can reasonably judge based on the
circumstances.

Regards,

-Roberto

-- 
Roberto C. Sánchez



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Marco d'Itri
On Jul 14, Marc Haber  wrote:

> >And I see you uploaded ~immediately - why even bother with an ITP?
> Is it proper procedure to upload without an ITP?
Is there any point in an ITP if you are already ready to upload the 
package? No.

-- 
ciao,
Marco



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Steve McIntyre
Marc Haber wrote:
>On Thu, 14 Jul 2022 12:54:52 +0100, Steve McIntyre 
>wrote:
>>And I see you uploaded ~immediately - why even bother with an ITP?
>
>Is it proper procedure to upload without an ITP?

IMHO there are 2 points to an ITP:

 * to save effort in case two people might be working on the same
   package
 * to invite discussion on debian-devel / elsewhere

If people post an ITP and upload immediately, then I don't think that
helps on either count.

If the only reason for the ITP is to make lintian quiet then I think
that's a total waste of time - it's following a guideline blindly
without understanding the reason for it.

How do others feel?

-- 
Steve McIntyre, Cambridge, UK.  st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Andrey Rahmatullin
On Thu, Jul 14, 2022 at 08:45:24AM -0400, Roberto C. Sánchez wrote:
> > > > And I see you uploaded ~immediately - why even bother with an ITP?
> > > 
> > > Is it proper procedure to upload without an ITP?
> > > 
> > 
> > No ; I have to admit a large percentage of the new packages I upload
> > get their ITP minutes before the package is ready.
> > 
> > Basically: I wait for the bug number before pushing to salsa & NEW.
> > 
> It's been a while since I've packaged something entirely new, but this
> is also how I have approached it.  Especially during periods when it
> might take months to make it through NEW, waiting an additional week or
> two for discussion around an ITP (especially when the vast majority of
> ITPs actually never elicit any sort of response from anyone), seems
> rather pointless.
> 
> Filing the ITP then immediately uploading seems really sensible,
More sensible than not filing it?
This defeats both purposes of an ITP: getting it discussed and working as
a mutex for people who are thinking about packaging the same software. Are
there other purposes?

-- 
WBR, wRAR



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Roberto C . Sánchez
On Thu, Jul 14, 2022 at 02:31:22PM +0200, julien.pu...@gmail.com wrote:
> Hi,
> 
> Le jeudi 14 juillet 2022 à 14:16 +0200, Marc Haber a écrit :
> > On Thu, 14 Jul 2022 12:54:52 +0100, Steve McIntyre 
> > wrote:
> > > And I see you uploaded ~immediately - why even bother with an ITP?
> > 
> > Is it proper procedure to upload without an ITP?
> > 
> 
> No ; I have to admit a large percentage of the new packages I upload
> get their ITP minutes before the package is ready.
> 
> Basically: I wait for the bug number before pushing to salsa & NEW.
> 
It's been a while since I've packaged something entirely new, but this
is also how I have approached it.  Especially during periods when it
might take months to make it through NEW, waiting an additional week or
two for discussion around an ITP (especially when the vast majority of
ITPs actually never elicit any sort of response from anyone), seems
rather pointless.

Filing the ITP then immediately uploading seems really sensible,
especially since in the event of a mistake it is trivial to email
ftp-master requesting a REJECT, which IME is usually something they do
right away.

Regards,

-Roberto

-- 
Roberto C. Sánchez



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Roberto C . Sánchez
On Thu, Jul 14, 2022 at 11:14:43AM +0100, Steve McIntyre wrote:
> edw...@4angle.com wrote:
> 
> >Package: wnpp
> >Severity: wishlist
> >Owner: Edward Betts 
> >X-Debbugs-Cc: debian-devel@lists.debian.org, debian-pyt...@lists.debian.org
> >
> >* Package name: gender-guesser
> >  Version : 0.4.0
> >  Upstream Author : Israel Saeta Pérez 
> >* URL : https://github.com/lead-ratings/gender-guesser
> >* License : GPL-3 & GFDL-1.2+
> >  Programming Lang: Python
> >  Description : Guess the gender from first name
> 
> Oh, not *another* package that tries to guess things from names.
> 
> Do you have a real use for this package? 

Why in the world is that even a relevant question?  There are plenty of
packages in the archive which are useful to essentially nobody apart
from the maintainer and there are even packages which are maintained
without being useful to the maintainer at all (but rather useful to
others).

> There are a *lot* of issues
> in this area, and mis-gendering people is not something to risk
> lightly...
> 

"There are a *lot* of issues in this area" seems rather nebulous.  In
which area?  Given the fact that we have clear and rather unambiguous
guidelines for what constitutes software which is appropriate for
inclusion in the archive, and given that on its face this software does
not seem to be in conflict with any of those guidelines, what then is
the problem?  BTW, I'm not interested in any sort of "well I don't like
..." or "such and such could offend so and so ..." sort of arguments.

Please provide an objective and technically-based reason for why this
particular package should not be in the archive rather than hand-wavy
arguments without any actual substance.  Otherwise, it will appear as
though you are simply attempting to conform everyone else to your own
personal view on things.  I think we can all agree that "there are a
*lot* of issues" with such an approach.

Regards,

-Roberto

-- 
Roberto C. Sánchez



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread julien . puydt
Hi,

Le jeudi 14 juillet 2022 à 14:16 +0200, Marc Haber a écrit :
> On Thu, 14 Jul 2022 12:54:52 +0100, Steve McIntyre 
> wrote:
> > And I see you uploaded ~immediately - why even bother with an ITP?
> 
> Is it proper procedure to upload without an ITP?
> 

No ; I have to admit a large percentage of the new packages I upload
get their ITP minutes before the package is ready.

Basically: I wait for the bug number before pushing to salsa & NEW.

Cheers,

J.Puydt



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Johannes Schauer Marin Rodrigues
Quoting Steve McIntyre (2022-07-14 13:54:52)
> And I see you uploaded ~immediately - why even bother with an ITP?

I did that quite a few times in the past as well. Is there a rule about how
long I have to wait after filing the ITP before uploading to NEW?



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Marc Haber
On Thu, 14 Jul 2022 12:54:52 +0100, Steve McIntyre 
wrote:
>And I see you uploaded ~immediately - why even bother with an ITP?

Is it proper procedure to upload without an ITP?

-- 
-- !! No courtesy copies, please !! -
Marc Haber |   " Questions are the | Mailadresse im Header
Mannheim, Germany  | Beginning of Wisdom " | 
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fon: *49 621 72739834



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Steve McIntyre
Steve McIntyre wrote:
>edw...@4angle.com wrote:
>
>>Package: wnpp
>>Severity: wishlist
>>Owner: Edward Betts 
>>X-Debbugs-Cc: debian-devel@lists.debian.org, debian-pyt...@lists.debian.org
>>
>>* Package name: gender-guesser
>>  Version : 0.4.0
>>  Upstream Author : Israel Saeta Pérez 
>>* URL : https://github.com/lead-ratings/gender-guesser
>>* License : GPL-3 & GFDL-1.2+
>>  Programming Lang: Python
>>  Description : Guess the gender from first name
>
>Oh, not *another* package that tries to guess things from names.
>
>Do you have a real use for this package? There are a *lot* of issues
>in this area, and mis-gendering people is not something to risk
>lightly...

And I see you uploaded ~immediately - why even bother with an ITP?

-- 
Steve McIntyre, Cambridge, UK.  st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews



Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name

2022-07-14 Thread Steve McIntyre
edw...@4angle.com wrote:

>Package: wnpp
>Severity: wishlist
>Owner: Edward Betts 
>X-Debbugs-Cc: debian-devel@lists.debian.org, debian-pyt...@lists.debian.org
>
>* Package name: gender-guesser
>  Version : 0.4.0
>  Upstream Author : Israel Saeta Pérez 
>* URL : https://github.com/lead-ratings/gender-guesser
>* License : GPL-3 & GFDL-1.2+
>  Programming Lang: Python
>  Description : Guess the gender from first name

Oh, not *another* package that tries to guess things from names.

Do you have a real use for this package? There are a *lot* of issues
in this area, and mis-gendering people is not something to risk
lightly...

-- 
Steve McIntyre, Cambridge, UK.  st...@einval.com
"We're the technical experts.  We were hired so that management could
 ignore our recommendations and tell us how to do our jobs."  -- Mike Andrews