Re: [Python-ideas] Objectively Quantifying Readability

2018-05-02 Thread Greg Ewing

Tim Peters wrote:

def objective_readability_score(text):
    "Return the readability of `text`, a float in 0.0 .. 1.0"
    return 2.0 * text.count(":=") / len(text)


A useful-looking piece of code, but it could be more readable.
It only gives itself a readability score of 0.0136986301369863.
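
One way to reproduce that self-score - a minimal sketch, assuming the
score is computed by feeding the function its own source (the exact
figure depends on the whitespace of the string you feed in):

import inspect

def objective_readability_score(text):
    "Return the readability of `text`, a float in 0.0 .. 1.0"
    return 2.0 * text.count(":=") / len(text)

# Hand the function its own source code; for the 146-character version
# quoted above this comes out around 0.0137.
source = inspect.getsource(objective_readability_score)
print(objective_readability_score(source))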

--
Greg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Dan Sommers
On Wed, 02 May 2018 05:08:41 +1000, Steven D'Aprano wrote:

> The difference was that when Windows users used the mouse, even though
> they were *objectively* faster to complete the task compared to using
> the arrow keys, subjectively they swore that they were slower, and
> were *very confident* about their subjective experience.

Another driving analogy:  when I get stuck at a stoplight, sometimes I
take advantage of turn on red or a protected turn, even though I know
that it's going to take longer to get where I'm going.  But I feel
better because I'm not just sitting there at the stoplight.  Call it
cognitive dissonance, I guess.

Some of my coworkers claim that using vi is objectively faster or
requires fewer keystrokes than using emacs.  I counter that I've been
using emacs since before they were born, and that I now do so with the
reptilian part of my brain, which means that I can keep thinking about
the problem at hand rather than about editing the source code.

Who remembers the One True Brace Style holy wars?  If we agreed on
anything, it was to conform to existing code rather than to write new
code in a different style.  Reading a mixture of styles was harder, no
matter which particular style you thought was better or why you thought
it was better.

> Athletes are not great judges of what training works for themselves.

Wax on, wax off?  ;-)

Dan

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Mikhail V
On Wed, May 2, 2018 at 4:03 AM, Matt Arcidy  wrote:
> On Tue, May 1, 2018 at 5:35 PM, Mikhail V  wrote:
>
>> to be pedantic - ReallyLongDescriptiveIdentifierNames
>> has also an issue with "I" which might confuse because it
>> looks same as little L. Just to illustrate that choice of
>> comparison samples is very sensitive thing.
>> In such a way an experienced guy can even scam
>> the experimental subjects by making samples which
>> will show what he wants in result.
>
> I love this discussion, but I think anything that isn't included in a
> .py file would have to be outside the scope, at least of the alpha
> version :).  I am really interested in these factors in general,
> however.  Now I'm surprised no one asks which font each other are
> using when determining readability.
>
> "serif?  are you mad?  no wonder!"
> "+1 on PEP conditional on mandatory yellow (#FFEF00) keyword syntax
> highlighting in vim"
>

Well, I am asking.
Looking at the online PEPs, I am under the impression that everyone
should use huge-sized Consolas and no syntax highlighting at all.

Just as with "=" and "==": making samples without highlighting
will show a similarity issue, while making them with a different
highlighting/font style will show that there is no issue.

Or, ":=" looks ok with Times New Roman, but with Consolas -
it looks like Dr. Zoidberg's face.



Mikhail
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread CFK
Matt, you took the words right out of my mouth!  The fonts that are being
used will make a big difference in readability, as will font size,
foreground and background coloring, etc.  It would be interesting to see if
anyone has done a serious study of this type though, especially if they
studied it over the course of several hours (I'm getting older, and I've
noticed that after about 8-10 hours of coding it doesn't matter what I'm
looking at, I can't focus enough to read it, but I don't know when I start
to degrade, nor do I know if different fonts would help me degrade more
slowly).

Thanks,
Cem Karan


On Tue, May 1, 2018 at 9:03 PM, Matt Arcidy  wrote:

> On Tue, May 1, 2018 at 5:35 PM, Mikhail V  wrote:
>
> > to be pedantic - ReallyLongDescriptiveIdentifierNames
> > has also an issue with "I" which might confuse because it
> > looks same as little L. Just to illustrate that choice of
> > comparison samples is very sensitive thing.
> > In such a way an experienced guy can even scam
> > the experimental subjects by making samples which
> > will show what he wants in result.
>
> I love this discussion, but I think anything that isn't included in a
> .py file would have to be outside the scope, at least of the alpha
> version :).  I am really interested in these factors in general,
> however.  Now I'm surprised no one asks which font each other are
> using when determining readability.
>
> "serif?  are you mad?  no wonder!"
> "+1 on PEP conditional on mandatory yellow (#FFEF00) keyword syntax
> highlighting in vim"
>
> -Matt
>
> >
> >
> > Mikhail
> > ___
> > Python-ideas mailing list
> > Python-ideas@python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Matt Arcidy
On Tue, May 1, 2018 at 5:35 PM, Mikhail V  wrote:

> to be pedantic - ReallyLongDescriptiveIdentifierNames
> has also an issue with "I" which might confuse because it
> looks same as little L. Just to illustrate that choice of
> comparison samples is very sensitive thing.
> In such a way an experienced guy can even scam
> the experimental subjects by making samples which
> will show what he wants in result.

I love this discussion, but I think anything that isn't included in a
.py file would have to be outside the scope, at least of the alpha
version :).  I am really interested in these factors in general,
however.  Now I'm surprised no one asks which font each other are
using when determining readability.

"serif?  are you mad?  no wonder!"
"+1 on PEP conditional on mandatory yellow (#FFEF00) keyword syntax
highlighting in vim"

-Matt

>
>
> Mikhail
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Mikhail V
On Tue, May 1, 2018 at 6:04 PM, Jacco van Dorp  wrote:
> 2018-05-01 14:54 GMT+02:00 Greg Ewing :
>> Rhodri James wrote:
>>>
>>> I'd be interested to know if there is a readability difference between
>>> really_long_descriptive_identifier_name and
>>> ReallyLongDescriptiveIdentifierNames.
>>
>>
>> As one data point on that, jerking my eyes quickly across
>> that line I found it much easier to pick out the component
>> words in the one with underscores.
>>
>> --
>> Greg
>
>
> Which is funny, because I had the exact opposite.
>
> Might it be that we've had different conditioning ?
>
> Jacco

The one with underscores reads rather better, though.
Might it be that it simply does read better?
Ok, that's not scientific enough. But the score is 2:1 so far ;-)

to be pedantic - ReallyLongDescriptiveIdentifierNames
also has an issue with "I", which might confuse because it
looks the same as a lowercase L. Just to illustrate that the choice
of comparison samples is a very sensitive thing.
In this way an experienced person could even game
the experiment by crafting samples that
show whatever result he wants.


Mikhail
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Mikhail V
On Tue, May 1, 2018 at 3:42 AM, Steven D'Aprano  wrote:
> On Mon, Apr 30, 2018 at 11:28:17AM -0700, Matt Arcidy wrote:
>
> - people are not good judges of readability;

People are the only judges of readability.
Just need the right people.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Steven D'Aprano
On Tue, May 01, 2018 at 03:02:27PM +, Dan Sommers wrote:

> >> I happen to be an excellent judge of whether a given block of code is
> >> readable to me.
> 
> > In the same way that 93% of people say that they are an above-average
> > driver, I'm sure that most people think that they are an excellent
> > judge of readability. Including myself in *both* of those.
> 
> Are you claiming that I'm not an excellent judge of whether a given
> block of code is readable to me?

Of course not. I don't know you. I wouldn't dream of making *specific* 
claims about you. I speak only in broad generalities which apply to 
people in general.

I'm reminded that in the 1990s during the UI wars between Apple and 
Microsoft, people had *really strong* opinions about the usability of 
the two OSes' GUIs. Macs required the user to use the mouse to navigate 
menus, while Windows also allowed the user to navigate them using the Alt 
key and arrow keys.

Not surprisingly, *both* Mac users and Windows users were absolutely 
convinced that they were much more efficient using the method they were 
familiar with, and could justify their judgement. For example, Windows 
users typically said that having to move their hand from the keyboard to 
grab the mouse was slow and inefficient, and using the Alt key and 
arrows was much faster.

But when researchers observed users in action, and timed how long it 
took them to perform simple tasks requiring navigating the menus, they 
found that using the mouse was significantly faster for *both* groups of 
users, both Windows and Mac users.

The difference was that when Windows users used the mouse, even though 
they were *objectively* faster to complete the task compared to using 
the arrow keys, subjectively they swore that they were slower, and were 
*very confident* about their subjective experience.

This is a good example of the overconfidence effect:

https://en.wikipedia.org/wiki/Overconfidence_effect

This shouldn't be read as a tale about Mac users being superior. One of 
the two methods had to be faster, and it happened to be Macs. My point 
is not about Macs versus Windows, but that people in general are not 
good at this sort of self-reflection.

Another example of this is the way that the best professional athletes 
no longer rely on their own self-judgement about the best training 
methods to use, because the training techniques that athletes think are 
effective, and those which actually are effective, are not strongly 
correlated.

Athletes are not great judges of what training works for themselves.

The psychological processes that lead to these cognitive biases apply to 
us all, to some degree or another.

Aside from you and me, of course.



-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Nathaniel Smith
On Tue, May 1, 2018, 02:55 Matt Arcidy  wrote:

>
> I am not inferring causality when creating a measure.


No, but when you assume that you can use that measure to *make* code more
readable, then you're assuming causality.

> Measuring the
> temperature of a steak doesn't infer why people like it medium rare.
> It just quantifies it.
>

Imagine aliens who have no idea how cooking works decide to do a study of
steak rareness. They go to lots of restaurants, order steak, ask people to
judge how rare it was, and then look for features that could predict these
judgements.

They publish a paper with an interesting finding: it turns out that
restaurant decor is highly correlated with steak rareness. Places with
expensive leather seats and chandeliers tend to serve steak rare, while
cheap diners with sticky table tops tend to serve it well done.

(I haven't done this study, but I bet if you did then you would find this
correlation is actually true in real life!)

Now, should we conclude based on this that if we want to get rare steak,
the key is to *redecorate the dining room*? Of course not, because we
happen to know that the key thing that changes the rareness of steak is how
it's exposed to heat.

But for code readability, we don't have this background knowledge; we're
like the aliens. Maybe the readability metric in this study is like
quantifying temperature; maybe it's like quantifying how expensive the
decor is. We don't know.

(This stuff is extremely non-obvious; that's why we force
scientists-in-training to take graduate courses on statistics and
experimental design, and it still doesn't always take.)


> > And yeah, it doesn't help that they're only looking at 3 line blocks
> > of code and asking random students to judge readability – hard to say
> > how that generalizes to real code being read by working developers.
>
> Respectfully, this is practical application and not a PhD defense,  so
> it will be generated by practical coding.
>

Well, that's the problem. In a PhD defense, you can get away with this kind
of stuff; but in a practical application it has to actually work :-). And
generalizability is a huge issue.

People without statistical training tend to look at studies and worry about
how big the sample size is, but that's usually not the biggest concern; we
have ways to quantify how big your sample needs to be. The bigger problem
is whether your sample is *representative*. If you're trying to guess who
will become governor of California, then if you had some way to pick voters
totally uniformly at random, you'd only need to ask 50 or 100 of them how
they're voting to get an actually pretty good idea of what all the millions
of real votes will do. But if you only talk to Republicans, it doesn't
matter how many you talk to, you'll get a totally useless answer. Same if
you only talk to people of the same age, or who all live in the same town,
or who all have land-line phones, or... This is what makes political
polling difficult: getting a representative sample.
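
A toy numeric sketch of that point (made-up numbers, nothing from a
real poll):

import random

random.seed(0)

# Hypothetical electorate: 55% for candidate A, 45% for candidate B.
population = [1] * 55_000 + [0] * 45_000

# A uniform random sample of just 100 voters already lands close to 55%...
uniform_sample = random.sample(population, 100)
print(sum(uniform_sample) / len(uniform_sample))

# ...while a far larger but unrepresentative sample (polling only one
# bloc) stays useless no matter how big it gets.
biased_sample = population[:10_000]
print(sum(biased_sample) / len(biased_sample))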

Similarly, if we only look at out-of-context Java read by students, that
may or may not "vote the same way" as in-context Python read by the average
user. Science is hard :-(.

-n

>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Eric Fahlgren
On Tue, May 1, 2018 at 8:04 AM, Jacco van Dorp  wrote:

> 2018-05-01 14:54 GMT+02:00 Greg Ewing :
> > Rhodri James wrote:
> >>
> >> I'd be interested to know if there is a readability difference between
> >> really_long_descriptive_identifier_name and
> >> ReallyLongDescriptiveIdentifierNames.
> >
> >
> > As one data point on that, jerking my eyes quickly across
> > that line I found it much easier to pick out the component
> > words in the one with underscores.
>
> Which is funny, because I had the exact opposite.
>
> Might it be that we've had different conditioning ?


Almost certainly.  I started using CamelCase in the mid-'80s and it seems
very natural to me, since we still use it for (as you mention) GUI packages
derived from C extension modules with that convention.  On the other hand,
I've also written a lot of snake_form identifiers in non-GUI Python, so
that seems fairly natural to me, too.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Jacco van Dorp
2018-05-01 14:54 GMT+02:00 Greg Ewing :
> Rhodri James wrote:
>>
>> I'd be interested to know if there is a readability difference between
>> really_long_descriptive_identifier_name and
>> ReallyLongDescriptiveIdentifierNames.
>
>
> As one data point on that, jerking my eyes quickly across
> that line I found it much easier to pick out the component
> words in the one with underscores.
>
> --
> Greg


Which is funny, because I had the exact opposite.

Might it be that we've had different conditioning ?

Jacco
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Dan Sommers
On Tue, 01 May 2018 22:37:11 +1000, Steven D'Aprano wrote:

> On Tue, May 01, 2018 at 04:50:05AM +, Dan Sommers wrote:
>> On Tue, 01 May 2018 10:42:53 +1000, Steven D'Aprano wrote:
>> 
>> > - people are not good judges of readability;

>> I happen to be an excellent judge of whether a given block of code is
>> readable to me.

> In the same way that 93% of people say that they are an above-average
> driver, I'm sure that most people think that they are an excellent
> judge of readability. Including myself in *both* of those.

Are you claiming that I'm not an excellent judge of whether a given
block of code is readable to me?

> 2. Anecdotally, we all know that many programmers are just awful.

No argument here!  :-)

> https://thedailywtf.com/

Readability is only one criterion by which to judge code.  Most code on
thedailywtf is bad due to varieties of bugs, inefficiencies,
misunderstandings, or unnecessary complexities, regardless of how
readable it is.

> And presumably most of them think they are writing readable code ...

Why would you presume that?  I have worked with plenty of programmers who
didn't consider readability.  If it passes the tests, then it's good
code.  If I have difficulty reading it, then that's my problem.

Also, when I write code, I put down my text editor to review it myself
before I submit it for formal review.  My authoring criteria for good
code is different from my reviewing criteria for good code; the latter
includes more readability than the former.

> ... I have known many people who swear black and blue that they work
> best with their editor configured to show code in a tiny, 8pt font,
> and I've watched them peering closely at the screen struggling to read
> the text and making typo after typo which they failed to notice.

> In other words, people are often not even a great judge of what is 
> readable to *themselves*.

On that level, aesthetics definitely count (typographers have known this
for centuries), but in an entirely different way.  Is their code better
when their editor shows it at 14.4pt?  At 17.3pt?

When we first started with color displays and syntax coloring editors,
it was popular to make a language's keywords really stand out.  But the
keywords usually aren't the important part of the code (have you ever
programmed in lisp?), and I find it easier to read algol-like code when
the keywords are lowlighted rather than highlighted.  In Python, for
example, the word "import" is far less important than the name of the
module being imported, especially when all the imports are grouped
together near the top of the source file.

Dan

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Greg Ewing

Rhodri James wrote:
I'd be interested to know if there is a readability difference between 
really_long_descriptive_identifier_name and 
ReallyLongDescriptiveIdentifierNames.


As one data point on that, jerking my eyes quickly across
that line I found it much easier to pick out the component
words in the one with underscores.

--
Greg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Matt Arcidy
On Tue, May 1, 2018 at 1:29 AM, Nathaniel Smith  wrote:
> On Mon, Apr 30, 2018 at 8:46 PM, Matt Arcidy  wrote:
>> On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano  wrote:
>>> (If we know that, let's say, really_long_descriptive_identifier_names
>>> hurt readability, how does that help us judge whether adding a new kind
>>> of expression will hurt or help readability?)
>>
>> A new feature can remove symbols or add them.  It can increase density
>> on a line, or remove it.  It can be a policy of variable naming, or it
>> can specifically note that variable naming has no bearing on a new
>> feature.  This is not limited in application.  It's just scoring.
>> When anyone complains about readability, break out the scoring
>> criteria and assess how good the _comparative_ readability claim is:
>> 2 vs 10?  4 vs 5?  The arguments will no longer be singularly about
>> "readability," nor will the be about the question of single score for
>> a specific statement.  The comparative scores of applying the same
>> function over two inputs gives a relative difference.  This is what
>> measures do in the mathematical sense.
>
> Unfortunately, they kind of study they did here can't support this
> kind of argument at all; it's the wrong kind of design. (I'm totally
> in favor of being more evidence-based decisions about language design,
> but interpreting evidence is tricky!) Technically speaking, the issue
> is that this is an observational/correlational study, so you can't use
> it to infer causality. Or put another way: just because they found
> that unreadable code tended to have a high max variable length,
> doesn't mean that taking those variables and making them shorter would
> make the code more readable.
>
I think you are right about the study, but that is tangential to what I am
trying to say.

I am not inferring causality when creating a measure.  In the most
tangible example, there is no inference that the euclidean measure
_creates_ a distance, or that _anything_ creates a distance at all, it
merely generates a number based on coordinates in space.  That
generation has specific properties which make it a measure, or a
metric, what have you.

The average/mean is another such object: a measure of central tendency
or location.  It does not infer causality, it is merely an algorithm
by which things can be compared.  Even misapplied, it provides a
consistent ranking of one mean higher than another in an objective
sense.

Even if not a single person agrees that line length is a correct
measure for an application, it is a measure.  I can feed two lines
into "len" and get consistent results out.  This result will be the
same value for all strings of length n, and for a string with length
m > n the measure will always report a higher measured value for the
string of length m than for the string of length n.  This is straight
out of measure theory: the results are a distance between the two
objects, not a reason why.

The same goes for unique symbols.  I can count the unique symbols in
two lines, and state which is higher.  This does not infer a
causality, nor do _which_ symbols matter in this example, only that I
can count them, and that if count_1 == count_2, the ranks are equal
aka no distance between them, and if count_1 > count_2, count 1 is
ranked higher.

The cause of complexity can be a number of things, but stating a bunch
of criteria to measure is not about inference.  Measuring the
temperature of a steak doesn't infer why people like it medium rare.
It just quantifies it.
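
To make the scoring idea concrete, a minimal sketch (the particular
metrics are illustrative only, not a proposal):

def simple_metrics(line):
    # Two consistently applied measures, as described above: total
    # length, and the number of distinct non-whitespace symbols.
    return {
        "length": len(line),
        "unique_symbols": len(set(line) - set(" \t")),
    }

short_line = "x = f(y)"
long_line = "really_long_descriptive_identifier_name = some_function(argument)"

# The same function applied to two inputs gives a relative difference:
# equal counts rank equally, and a higher count always ranks higher.
print(simple_metrics(short_line))
print(simple_metrics(long_line))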

> This sounds like a finicky technical complaint, but it's actually a
> *huge* issue in this kind of study. Maybe the reason long variable
> length was correlated with unreadability was that there was one
> project in their sample that had terrible style *and* super long
> variable names, so the two were correlated even though they might not
> otherwise be related. Maybe if you looked at Perl, then the worst
> coders would tend to be the ones who never ever used long variables
> names. Maybe long lines on their own are actually fine, but in this
> sample, the only people who used long lines were ones who didn't read
> the style guide, so their code is also less readable in other ways.
> (In fact they note that their features are highly correlated, so they
> can't tell which ones are driving the effect.) We just don't know.
>

Your points here are dead on.  It's not like a single metric will be
the deciding factor.  Nor will a single rank end all disagreements.
It's a tool.  Consider line length 79, that's an explicit statement
about readability, it's "hard coded" in the language.  Disagreement
with the value 79 or even the metric line-length doesn't mean it's not
a measure.  Length is the euclidean measure in one dimension.

The measure will be a set of filters and metrics that combine to a
value or set of values in a reliable way.  It's not about any sense of
correctness or even being 

Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Antoine Pitrou

Yes, it seems that this study has many limitations which don't make its
results very interesting for our community.  I think the original point
was that readability *can* be studied rationally and scientifically,
though.

Regards

Antoine.


On Tue, 1 May 2018 09:00:44 +0200
Jacco van Dorp 
wrote:
> I must say my gut agrees that
> really_long_identifier_names_with_a_full_description don't look
> readable to me. Perhaps it's my exposure to (py)Qt, but I really like
> my classes like ThisName and my methods like thisOne. I also tend to
> keep them to three words max (real code from yesterday:
> getActiveOutputs(), or at most setAllDigitalOutputs()). I also really
> dislike more than 3 or 4 arguments.
> 
> A question for another type of science would be, do I agree with this
> study because it agrees with me ?
> 
> It should be noted that the snippets used were short and small. This
> might cause a bias towards short identifiers - after all, if you only
> got 3 to keep track of it, they're more likely to be distinct enough
> compared to when you have 20. I couldn't give a source, but IIRC
> people can hold up to around 5 to 7 concepts in their head at one time
> - which means that if you got less identifiers than that, you don't
> remember the names, but their concepts.(further reading shows this is
> supported with their strongest negative correlation - # of identifiers
> strongly decreases readability.). Compare it to RAM - it's only big
> enough for 5 to 7 identifiers, and after that you have to switch them
> out to the harddisk. *nobody* wants to code that does this switching,
> and our brains don't like running it either. I think this is one of
> the main reasons list/generator comprehensions increase readability so
> much. You can get rid of 1 or 2 variable names.
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
> 



___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Nathaniel Smith
On Mon, Apr 30, 2018 at 8:46 PM, Matt Arcidy  wrote:
> On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano  wrote:
>> (If we know that, let's say, really_long_descriptive_identifier_names
>> hurt readability, how does that help us judge whether adding a new kind
>> of expression will hurt or help readability?)
>
> A new feature can remove symbols or add them.  It can increase density
> on a line, or remove it.  It can be a policy of variable naming, or it
> can specifically note that variable naming has no bearing on a new
> feature.  This is not limited in application.  It's just scoring.
> When anyone complains about readability, break out the scoring
> criteria and assess how good the _comparative_ readability claim is:
> 2 vs 10?  4 vs 5?  The arguments will no longer be singularly about
> "readability," nor will the be about the question of single score for
> a specific statement.  The comparative scores of applying the same
> function over two inputs gives a relative difference.  This is what
> measures do in the mathematical sense.

Unfortunately, the kind of study they did here can't support this
kind of argument at all; it's the wrong kind of design. (I'm totally
in favor of making more evidence-based decisions about language design,
but interpreting evidence is tricky!) Technically speaking, the issue
is that this is an observational/correlational study, so you can't use
it to infer causality. Or put another way: just because they found
that unreadable code tended to have a high max variable length,
doesn't mean that taking those variables and making them shorter would
make the code more readable.

This sounds like a finicky technical complaint, but it's actually a
*huge* issue in this kind of study. Maybe the reason long variable
length was correlated with unreadability was that there was one
project in their sample that had terrible style *and* super long
variable names, so the two were correlated even though they might not
otherwise be related. Maybe if you looked at Perl, then the worst
coders would tend to be the ones who never ever used long variables
names. Maybe long lines on their own are actually fine, but in this
sample, the only people who used long lines were ones who didn't read
the style guide, so their code is also less readable in other ways.
(In fact they note that their features are highly correlated, so they
can't tell which ones are driving the effect.) We just don't know.

And yeah, it doesn't help that they're only looking at 3 line blocks
of code and asking random students to judge readability – hard to say
how that generalizes to real code being read by working developers.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-05-01 Thread Jacco van Dorp
I must say my gut agrees that
really_long_identifier_names_with_a_full_description don't look
readable to me. Perhaps it's my exposure to (py)Qt, but I really like
my classes like ThisName and my methods like thisOne. I also tend to
keep them to three words max (real code from yesterday:
getActiveOutputs(), or at most setAllDigitalOutputs()). I also really
dislike more than 3 or 4 arguments.

A question for another type of science would be, do I agree with this
study because it agrees with me ?

It should be noted that the snippets used were short and small. This
might cause a bias towards short identifiers - after all, if you only
have 3 to keep track of, they're more likely to be distinct enough
compared to when you have 20. I couldn't give a source, but IIRC
people can hold up to around 5 to 7 concepts in their head at one time
- which means that if you have fewer identifiers than that, you don't
remember the names, but their concepts. (Further reading shows this is
supported by their strongest negative correlation - # of identifiers
strongly decreases readability.) Compare it to RAM - it's only big
enough for 5 to 7 identifiers, and after that you have to switch them
out to the hard disk. *Nobody* wants code that does this switching,
and our brains don't like running it either. I think this is one of
the main reasons list/generator comprehensions increase readability so
much. You can get rid of 1 or 2 variable names.
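
A tiny illustration of that last point (hypothetical example, not from
the study):

# Explicit loop: the reader has to keep track of both `squares` and `n`.
squares = []
for n in range(10):
    if n % 2 == 0:
        squares.append(n * n)

# Comprehension: the accumulator name disappears and `n` is confined to
# a single expression, so there are fewer identifiers to hold in mind.
squares = [n * n for n in range(10) if n % 2 == 0]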
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-04-30 Thread Dan Sommers
On Tue, 01 May 2018 10:42:53 +1000, Steven D'Aprano wrote:

> - people are not good judges of readability;

WTF?  By definition, people are the *only* judge of readability.¹

I happen to be an excellent judge of whether a given block of code is
readable to me.

OTOH, if you mean is that I'm a bad judge of what makes code readable to
you, and that you're a bad judge of what makes code readable to me, then
I agree.  :-)

Dan

¹ Well, okay, compilers will tell you that your code is unreadable, but
they're known to be fairly pedantic.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-04-30 Thread Matt Arcidy
On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano  wrote:
> On Mon, Apr 30, 2018 at 11:28:17AM -0700, Matt Arcidy wrote:
>
>> A study has been done regarding readability in code which may serve as
>> insight into this issue. Please see page 8, fig 9 for a nice chart of
>> the results, note the negative/positive coloring of the correlations,
>> grey/black respectively.
>
> Indeed. It seems that nearly nothing is positively correlated to
> increased readability, aside from comments, blank lines, and (very
> weakly) arithmetic operators. Everything else hurts readability.
>
> The conclusion here is that if you want readable source code, you should
> remove the source code. *wink*
>
>
>> https://web.eecs.umich.edu/~weimerw/p/weimer-tse2010-readability-preprint.pdf
>>
>> The criteria in the paper can be applied to assess an increase or
>> decrease in readability between current and proposed changes.  Perhaps
>> even an automated tool could be implemented based on agreed upon
>> criteria.
>
>
> That's a really nice study, and thank you for posting it. There are some
> interested observations here, e.g.:
>
> - line length is negatively correlated with readability;
>
>   (a point against those who insist that 79 character line
>   limits are irrelevant since we have wide screens now)
>
> - conventional measures of complexity do not correlate well
>   with readability;
>
> - length of identifiers was strongly negatively correlated
>   with readability: long, descriptive identifier names hurt
>   readability while short variable names appeared to make
>   no difference;
>
>   (going against the common wisdom that one character names
>   hurt readability -- maybe mathematicians got it right
>   after all?)
>
> - people are not good judges of readability;
>
> but I think the practical relevance here is very slim. Aside from
> questions about the validity of the study (it is only one study, can the
> results be replicated, do they generalise beyond the narrowly self-
> selected set of university students they tested?) I don't think that it
> gives us much guidance here. For example:

I don't propose to replicate correlations.  I don't see these
"standard" terminal conclusions as forgone when looking at the idea as
a whole, as opposed to the paper itself, which they may be.  The
authors crafted a method and used that method to do a study, I like
the method.  I think I can agree with your point about the study
without validating or invalidating the method.

>
> 1. The study is based on Java, not Python.

An objective measure can be created, based or not on the paper's
parameters, but it clearly would need to be adjusted to a specific
language, good point.

Here "objective" does not mean "with absolute correctness" but
"applied the same way such that a 5 is always a 5, and a 5 is always
greater than 4."  I think I unfortunately presented the paper as "The
Answer" in my initial email, but I didn't intend to say "each detail
must be implemented as is" but more like "this is a thing which can be
done."  Poor job on my part.

>
> 2. It looks at a set of pre-existing source code features.
>
> 3. It gives us little or no help in deciding whether new syntax will or
> won't affect readability: the problem of *extrapolation* remains.
>
> (If we know that, let's say, really_long_descriptive_identifier_names
> hurt readability, how does that help us judge whether adding a new kind
> of expression will hurt or help readability?)

A new feature can remove symbols or add them.  It can increase density
on a line, or remove it.  It can be a policy of variable naming, or it
can specifically note that variable naming has no bearing on a new
feature.  This is not limited in application.  It's just scoring.
When anyone complains about readability, break out the scoring
criteria and assess how good the _comparative_ readability claim is:
2 vs 10?  4 vs 5?  The arguments will no longer be singularly about
"readability," nor will the be about the question of single score for
a specific statement.  The comparative scores of applying the same
function over two inputs gives a relative difference.  This is what
measures do in the mathematical sense.

Maybe the "readability" debate then shifts to arguing criteria: "79?
Too long in your opinion!"  A measure will at least break
"readability" up and give some structure to that argument.  Right now
"readability" comes up and starts a semi-polite flame war.  Creating
_any_ criteria will help narrow the scope of the argument.

Even when someone writes perfectly logical statements about it, the
statements can always be dismantled because it's based in opinion.
By creating a measure, objectivity is forced.  While each criterion is
less or more subjective, the measure will be applied objectively to
each instance, the same way, to get a score.

>
> 4. The authors themselves warn that it is descriptive, not prescriptive,
> for example replacing long identifier names with randomly selected 

Re: [Python-ideas] Objectively Quantifying Readability

2018-04-30 Thread Chris Angelico
On Tue, May 1, 2018 at 10:42 AM, Steven D'Aprano  wrote:
> The conclusion here is that if you want readable source code, you should
> remove the source code. *wink*

That's more true than your winky implies. Which is more readable: a
Python function, or the disassembly of its corresponding byte-code?
Which is more readable: a "for item in items:" loop, or one that
iterates up to the length of the list and subscripts it each time? The
less code it takes to express the same concept, the easier it is to
read - and to debug.

So yes, if you want readable source code, you should have less source code.
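
A minimal sketch of the comparisons above (not from the original
message):

import dis

items = ["spam", "ham", "eggs"]

# Idiomatic: iterate directly over the items.
for item in items:
    print(item)

# Less readable: iterate over the indices and subscript each time.
for i in range(len(items)):
    print(items[i])

# And, for contrast, the byte-code disassembly of the first loop is far
# harder to read than either version:
dis.dis(compile("for item in items:\n    print(item)", "<example>", "exec"))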

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Objectively Quantifying Readability

2018-04-30 Thread Steven D'Aprano
On Mon, Apr 30, 2018 at 11:28:17AM -0700, Matt Arcidy wrote:

> A study has been done regarding readability in code which may serve as
> insight into this issue. Please see page 8, fig 9 for a nice chart of
> the results, note the negative/positive coloring of the correlations,
> grey/black respectively.

Indeed. It seems that nearly nothing is positively correlated to 
increased readability, aside from comments, blank lines, and (very 
weakly) arithmetic operators. Everything else hurts readability.

The conclusion here is that if you want readable source code, you should 
remove the source code. *wink*

 
> https://web.eecs.umich.edu/~weimerw/p/weimer-tse2010-readability-preprint.pdf
> 
> The criteria in the paper can be applied to assess an increase or
> decrease in readability between current and proposed changes.  Perhaps
> even an automated tool could be implemented based on agreed upon
> criteria.


That's a really nice study, and thank you for posting it. There are some 
interesting observations here, e.g.:

- line length is negatively correlated with readability;

  (a point against those who insist that 79 character line 
  limits are irrelevant since we have wide screens now)

- conventional measures of complexity do not correlate well
  with readability;

- length of identifiers was strongly negatively correlated
  with readability: long, descriptive identifier names hurt
  readability while short variable names appeared to make
  no difference;

  (going against the common wisdom that one character names
  hurt readability -- maybe mathematicians got it right 
  after all?)

- people are not good judges of readability;

but I think the practical relevance here is very slim. Aside from 
questions about the validity of the study (it is only one study, can the 
results be replicated, do they generalise beyond the narrowly self- 
selected set of university students they tested?) I don't think that it 
gives us much guidance here. For example:

1. The study is based on Java, not Python.

2. It looks at a set of pre-existing source code features.

3. It gives us little or no help in deciding whether new syntax will or 
won't affect readability: the problem of *extrapolation* remains.

(If we know that, let's say, really_long_descriptive_identifier_names 
hurt readability, how does that help us judge whether adding a new kind 
of expression will hurt or help readability?)

4. The authors themselves warn that it is descriptive, not prescriptive, 
for example replacing long identifier names with randomly selected two 
character names is unlikely to be helpful.

5. The unfamiliarity affect: any unfamiliar syntax is going to be less 
readable than a corresponding familiar syntax.


It's a great start to the scientific study of readability, but I don't 
think it gives us any guidance with respect to adding new features.


> Opinions about readability can be shifted from:
>  - "Is it more or less readable?"
> to
>  - "This change exceeds a tolerance for levels of readability given
> the scope of the change."

One unreplicated(?) study for readability of Java snippets does not give 
us a metric for predicting the readability of new Python syntax. While 
it would certainly be useful to study the possible impact of adding new 
features to a language, the authors themselves state that this study is 
just "a framework for conducting such experiments".

Despite the limitations of the study, it was an interesting read, thank 
you for posting it.



-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/