On Tue, May 1, 2018 at 1:29 AM, Nathaniel Smith <n...@pobox.com> wrote:
> On Mon, Apr 30, 2018 at 8:46 PM, Matt Arcidy <marc...@gmail.com> wrote:
>> On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano <st...@pearwood.info> wrote:
>>> (If we know that, let's say, really_long_descriptive_identifier_names
>>> hurt readability, how does that help us judge whether adding a new kind
>>> of expression will hurt or help readability?)
>>
>> A new feature can remove symbols or add them. It can increase density
>> on a line, or remove it. It can be a policy of variable naming, or it
>> can specifically note that variable naming has no bearing on a new
>> feature. This is not limited in application. It's just scoring.
>> When anyone complains about readability, break out the scoring
>> criteria and assess how good the _comparative_ readability claim is:
>> 2 vs 10? 4 vs 5? The arguments will no longer be singularly about
>> "readability," nor will they be about the question of a single score for
>> a specific statement. The comparative scores of applying the same
>> function over two inputs give a relative difference. This is what
>> measures do in the mathematical sense.
>
> Unfortunately, the kind of study they did here can't support this
> kind of argument at all; it's the wrong kind of design. (I'm totally
> in favor of making more evidence-based decisions about language design,
> but interpreting evidence is tricky!) Technically speaking, the issue
> is that this is an observational/correlational study, so you can't use
> it to infer causality. Or put another way: just because they found
> that unreadable code tended to have a high max variable length,
> doesn't mean that taking those variables and making them shorter would
> make the code more readable.

I think you are right about the study, but that's tangential to what I am
trying to say.
I am not inferring causality when creating a measure. In the most tangible
example, there is no inference that the Euclidean measure _creates_ a
distance, or that _anything_ creates a distance at all; it merely generates a
number based on coordinates in space. That generation has specific properties
which make it a measure, or a metric, what have you. The average/mean is
another such object: a measure of central tendency or location. It does not
imply causality; it is merely an algorithm by which things can be compared.
Even misapplied, it provides a consistent ranking of one mean higher than
another in an objective sense.

Even if not a single person agrees that line length is a correct measure for
an application, it is a measure. I can feed two lines into "len" and get
consistent results out. The result will be the same value for all strings of
length n, and for a string of length m > n, the measure will always report a
higher value for the string of length m than for the string of length n. This
is straight out of measure theory: the result is a distance between the two
objects, not a reason why.

The same goes for unique symbols. I can count the unique symbols in two lines
and state which count is higher. This does not imply causality, nor does it
matter _which_ symbols are counted in this example; all that matters is that I
can count them, that if count_1 == count_2 the ranks are equal (i.e. no
distance between them), and that if count_1 > count_2, count_1 is ranked
higher. The cause of complexity can be a number of things, but stating a bunch
of criteria to measure is not about inference. Measuring the temperature of a
steak doesn't explain why people like it medium rare. It just quantifies it.

> This sounds like a finicky technical complaint, but it's actually a
> *huge* issue in this kind of study. Maybe the reason long variable
> names were correlated with unreadability was that there was one
> project in their sample that had terrible style *and* super long
> variable names, so the two were correlated even though they might not
> otherwise be related. Maybe if you looked at Perl, then the worst
> coders would tend to be the ones who never ever used long variable
> names. Maybe long lines on their own are actually fine, but in this
> sample, the only people who used long lines were ones who didn't read
> the style guide, so their code is also less readable in other ways.
> (In fact they note that their features are highly correlated, so they
> can't tell which ones are driving the effect.) We just don't know.

Your points here are dead on. It's not like a single metric will be the
deciding factor, nor will a single rank end all disagreements. It's a tool.

Consider the 79-character line length: that's an explicit statement about
readability, "hard coded" into the style guide. Disagreement with the value
79, or even with line length as a metric, doesn't mean it's not a measure.
Length is the Euclidean measure in one dimension. The measure will be a set of
filters and metrics that combine into a value or set of values in a reliable
way. It's not about any sense of correctness or even of being better; that is,
at a minimum, an interpretation.

> And yeah, it doesn't help that they're only looking at 3-line blocks
> of code and asking random students to judge readability – hard to say
> how that generalizes to real code being read by working developers.

Respectfully, this is a practical application and not a PhD defense, so the
criteria will be generated by practical coding.
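To make the scoring idea concrete, here is a minimal sketch in Python (the
particular criteria, function names, and example snippets are placeholders of
my own, not a proposal):

    def line_length_score(line):
        # Longer lines score higher; equal lengths always tie.
        return len(line)

    def unique_symbol_score(line):
        # Count distinct punctuation/operator characters on the line.
        return len({ch for ch in line if not ch.isalnum() and not ch.isspace()})

    def readability_scores(snippet):
        # Apply each criterion per line and keep the worst (max) case,
        # so two snippets can be compared criterion by criterion.
        lines = snippet.splitlines()
        return {
            "max_line_length": max(map(line_length_score, lines), default=0),
            "max_unique_symbols": max(map(unique_symbol_score, lines), default=0),
        }

    old = "result = [f(x) for x in data if f(x) > 0]"
    new = "result = [y for x in data if (y := f(x)) > 0]"
    print(readability_scores(old))  # per-criterion scores for the old spelling
    print(readability_scores(new))  # per-criterion scores for the new spelling

Whatever the inputs are, these functions rank them the same way every time;
which criteria belong in the set is the part left open.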
People can argue about the chosen metrics, but it is a more informative debate
than just the label "readability". If 10 people state that a change badly
violates one criterion, perhaps that can be easily addressed. If many people
make multiple claims based on many criteria, there is a real readability
problem (assuming the metric survived SOME vetting, of course).