Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-04 Thread Ondrej Bilka
On Thu, Jan 03, 2013 at 11:33:53AM -0700, Marcus G. Daniels wrote:
 On 1/2/13 1:49 AM, Ondřej Bílka wrote:
 A better example is that you have C code where at several places
 there is code for inserting an element into a sorted array and
 using that array. What should you do? A CS course taught us to use
 a red-black tree there. Right? Well, not exactly. When one looks at
 how this code is used, one discovers that the first occurrence is a
 shortest-path problem, so a radix heap is appropriate. The second
 does not use ordering, so a hash table is more appropriate. The
 third periodically generates a webpage from sorted data, so keeping
 new data in a separate buffer and calling sort from the generating
 routine looks best. Finally the fourth, the original, contains a
 comment: /* Used to find closest value to given value. Profiling
 showed this accounted for 13% of time. As updates are rare (not
 changed in previous year) using more sophisticated structures than
 binary search is counterproductive. */
 I imagine a process like this:
 
 First create a generic container type, say a Map, for the array and
 a lookup and traversal routine.
 
 If performance matters, profiling will reveal that certain uses are
 more conspicuous than others.
Profiling will reveal that you spend 5% of the time in insert and 3% in 
remove. You then spend two weeks optimizing your tree and memory 
allocator for it. 
 
 Inspection of those uses will suggest refinement of the Map to drop
 the traversal and/or lookup routines, and thus diversification of
 the types to an ordered or unordered sets.  Distinguishing between
 integer and more expensive comparisons might motivate introduction
 of Patricia trees (for integer sets).

And that's exactly the problem I am talking about. A person who writes 
structures will not notice that what he writes is not needed, and will 
introduce many more problems by adding Patricia trees etc.
A person who uses that structure will not notice that he can do it more
effectively without these structures, because the performance details 
were abstracted away.
Note that in my examples each use relies on some special property that 
is not worth abstracting, because it's quite a narrow use case. 
 
 Maybe I'm losing sight of the question at hand.  I'm just saying
 that going maximally generic at first will motivate appropriate
 specialization of the abstractions.   The opposite approach, of
 specialized inline implementations built from scratch in different
 apps or even different modules of the same app, has a higher risk of
 creating frozen accidents.
 
 Marcus
 
 
 
 
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-04 Thread Ondrej Bilka
On Thu, Jan 03, 2013 at 01:07:55AM +0100, Loup Vaillant-David wrote:
 On Tue, Jan 01, 2013 at 11:18:29PM +0100, Ondřej Bílka wrote:
  On Tue, Jan 01, 2013 at 09:12:07PM +0100, Loup Vaillant-David wrote:
   
  void latin1_to_utf8(std::string &s);
   
  Let me guess. They do it to save cycles caused by allocation of new
  string.
   instead of
   
 std::string utf8_of_latin1(std::string s)
   or
  std::string utf8_of_latin1(const std::string &s)
 
 You may have guessed right.  But then, *they* guessed wrong.
I often see how people blindly apply performance-related suggestions. Here 
it was that passing a structure by reference is faster than by value 
(which is nowadays sometimes false).
 
 First, the program in which I saw this conversion routine is dead slow
 anyway.  If they really cared about the performance of a few encoding
 conversions, they should have started by unifying string handling to
 begin with (there are 6 string types in the program, all actively
 used, and sometimes converted back and forth).
 
 Second, every time the conversion does actually do anything, the utf8
 string will be longer than the original one, and require a realloc()
 anyway (unless they wrote some very clever code, but the overall
 quality of their monstrosity makes it unlikely).
 
 Finally, I often needed to write this:
 
   std::string temp = compute_text();
   latin1_to_utf8(temp);
   call_function(temp);
 
 Which does not reduce allocations in the slightest, compared to
 
   call_function(utf8_of_latin1(compute_text()));
 
 My version may even be a bit more amenable to optimisation by the
 compiler. (In addition to being more readable, I dare say.)
 
 So, they *may* have made this move because they cared about
 performance.  A more likely explanation though, is that they simply
 thought "oh, I need to convert some strings to utf8", and
 transliterated that into C++.  They could have thought "oh, I need
 utf8 versions of some strings" instead, but that would be functional
 thinking.
 
 Loup.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-04 Thread Marcus G. Daniels

On 1/4/13 4:04 AM, Ondrej Bilka wrote:

Profiling will reveal that you spend 5% time in insert and 3% time in
remove. You spend two weeks optimizing your tree and memory allocator
for it.
The call tree, accumulating these costs in different parent contexts, 
when correlated with the fact that (presumably) the containers being 
acted upon are different containers, can suggest a change to a 
specialized container type depending on the access pattern...


Marcus
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-04 Thread Alan Moore
I have been looking at Datomic/Datalog which abstracts away the container
entirely and focuses simply on entity, attribute, value + time deltas. In
theory, the underlying container(s) could adjust to optimize for typical,
and possibly changing, usage patterns. Worth a look and some consideration
:-)
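A toy sketch of that entity-attribute-value + time idea (names and layout invented for illustration; this is nothing like Datomic's actual API):

```cpp
#include <string>
#include <vector>

// A Datomic-style fact: entity, attribute, value, plus a transaction time.
// The log of facts *is* the database; any indexes or containers behind it
// are an implementation detail that can be swapped to match usage patterns.
struct Fact {
    int entity;
    std::string attribute;
    std::string value;
    long tx;  // transaction time
};

// Latest value of `attr` for `entity`, as of transaction time `tx`.
// A linear scan here; a real store would answer this from an index.
const std::string* value_at(const std::vector<Fact>& log, int entity,
                            const std::string& attr, long tx) {
    const std::string* result = nullptr;
    long best = -1;
    for (const auto& f : log)
        if (f.entity == entity && f.attribute == attr &&
            f.tx <= tx && f.tx > best) {
            best = f.tx;
            result = &f.value;
        }
    return result;
}
```

Since callers only go through queries like `value_at`, the physical layout (a flat vector here) is free to change underneath them.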

Alan M.



On Fri, Jan 4, 2013 at 7:48 AM, Marcus G. Daniels mar...@snoutfarm.com wrote:

 On 1/4/13 4:04 AM, Ondrej Bilka wrote:

 Profiling will reveal that you spend 5% time in insert and 3% time in
 remove. You spend two weeks optimizing your tree and memory allocator
 for it.

 The call tree, accumulating these costs in different parent contexts, when
 correlated to the fact that (presumably) the containers being acted upon
 are different containers, can suggest a change to a specialized container
 type depending on the access pattern...


 Marcus




-- 
*Whatever you can do, or dream you can do, begin it. Boldness has genius,
power, and magic in it. Begin it now.* - *Goethe*
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-02 Thread Ondřej Bílka
On Tue, Jan 01, 2013 at 04:57:05PM -0700, Marcus G. Daniels wrote:
 On 1/1/13 3:18 PM, Ondřej Bílka wrote:
 On the opposite end of the spectrum you have a piece of Haskell code
 where everything is abstracted and each abstraction is wrong in some
 way or another. The main reason for the latter is functional
 fixedness. A Haskell programmer will see a structure as a monad but
 then does not see more appropriate abstractions. This is mainly
 problematic when there are portions of code that are very similar
 but only by chance, and each requires different treatment. You merge
 them into one function and after some time this function ends up
 with ten parameters.

A better example is that you have C code where at several places there is
code for inserting an element into a sorted array and using that array. 
What should you do? 
A CS course taught us to use a red-black tree there. Right?

Well, not exactly. When one looks at how this code is used, one discovers
that the first occurrence is a shortest-path problem, so a radix heap is
appropriate.
The second does not use ordering, so a hash table is more appropriate.
The third periodically generates a webpage from sorted data, so keeping
new data in a separate buffer and calling sort from the generating
routine looks best.
Finally the fourth, the original, contains a comment:

/* Used to find closest value to given value. Profiling showed this
accounted for 13% of time. As updates are rare (not changed in previous
year) using more sophisticated structures than binary search is
counterproductive. */
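For that fourth case, the plain binary search the comment defends might look like this (a sketch with made-up names, not the original code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Closest value to `target` in a sorted vector, via binary search.
// O(log n) per lookup; updates stay a simple sorted insert, which is
// fine when, as the comment says, updates are rare.
int closest(const std::vector<int>& sorted, int target) {
    assert(!sorted.empty());
    auto it = std::lower_bound(sorted.begin(), sorted.end(), target);
    if (it == sorted.begin()) return *it;        // target below all elements
    if (it == sorted.end())   return *(it - 1);  // target above all elements
    // `it` is the first element >= target; compare with its predecessor.
    int above = *it, below = *(it - 1);
    return (std::abs(above - target) < std::abs(target - below)) ? above
                                                                 : below;
}
```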

 Hmm, yeah.  I've done that, but I've also recognized it and undone
 it, or added parameters to types and/or reworked the combinators to
 deal with both treatments.  That one is even looking for combinators
 seems to me to show that the Haskell programmer is inclined to
 resist functional fixedness...
That depends. Combinators also make you think in terms of combinators. 
You will get the same problem.

In general the important point here is to have as a basis a set of
primitives that are quite unlikely to get split into parts. But then you
might as well reinvent Forth.
 
 Marcus

-- 

Just type 'mv * /dev/null'.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-02 Thread Loup Vaillant-David
On Tue, Jan 01, 2013 at 11:18:29PM +0100, Ondřej Bílka wrote:
 On Tue, Jan 01, 2013 at 09:12:07PM +0100, Loup Vaillant-David wrote:
  
  void latin1_to_utf8(std::string &s);
  
 Let me guess. They do it to save cycles caused by allocation of new
 string.
  instead of
  
std::string utf8_of_latin1(std::string s)
  or
  std::string utf8_of_latin1(const std::string &s)

You may have guessed right.  But then, *they* guessed wrong.

First, the program in which I saw this conversion routine is dead slow
anyway.  If they really cared about the performance of a few encoding
conversions, they should have started by unifying string handling to
begin with (there are 6 string types in the program, all actively
used, and sometimes converted back and forth).

Second, every time the conversion does actually do anything, the utf8
string will be longer than the original one, and require a realloc()
anyway (unless they wrote some very clever code, but the overall
quality of their monstrosity makes it unlikely).

Finally, I often needed to write this:

  std::string temp = compute_text();
  latin1_to_utf8(temp);
  call_function(temp);

Which does not reduce allocations in the slightest, compared to

  call_function(utf8_of_latin1(compute_text()));

My version may even be a bit more amenable to optimisation by the
compiler. (In addition to being more readable, I dare say.)

So, they *may* have made this move because they cared about
performance.  A more likely explanation though, is that they simply
thought "oh, I need to convert some strings to utf8", and
transliterated that into C++.  They could have thought "oh, I need utf8
versions of some strings" instead, but that would be functional
thinking.
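For what it's worth, the value-returning version is also short to write. A minimal sketch (mine, not the program's code), relying on the fact that every Latin-1 byte is the Unicode code point of the same value, so bytes at or above 0x80 expand to two UTF-8 bytes:

```cpp
#include <string>

// Convert a Latin-1 (ISO-8859-1) string to UTF-8, returning a new string.
std::string utf8_of_latin1(const std::string& s) {
    std::string out;
    out.reserve(s.size());  // lower bound; growth happens rarely
    for (unsigned char c : s) {
        if (c < 0x80) {
            out += static_cast<char>(c);             // ASCII: copy through
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));    // leading byte
            out += static_cast<char>(0x80 | (c & 0x3F));  // continuation byte
        }
    }
    return out;
}
```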

Loup.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread Loup Vaillant-David
On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:
 On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
 2. The programmer has a belief or preference that the code is easier
 to work with if it isn't abstracted. […]

I have evidence for this poisonous belief.  Here is some production
C++ code I saw:

  if (condition1)
  {
if (condition2)
{
  // some code
}
  }

instead of

  if (condition1 &&
      condition2)
  {
// some code
  }

-

  void latin1_to_utf8(std::string &s);

instead of

  std::string utf8_of_latin1(std::string s)
or
  std::string utf8_of_latin1(const std::string &s)

-

(this one is more controversial)

  Foo foo;
  if (condition)
foo = bar;
  else
foo = baz;

instead of

  Foo foo = condition
  ? bar
  : baz;

I think the root cause of those three examples can be called "step by
step thinking".  Some people just can't deal with abstractions at all,
not even functions.  They can only make procedures, which do their
thing step by step, and rely on global state.  (Yes, global state,
though they do have the courtesy to fool themselves by putting it in a
long-lived object instead of the toplevel.)  The result is effectively
a monster of mostly linear code, which is cut at obvious boundaries
whenever `main()` becomes too long ("too long" generally being a
couple hundred lines).  Each line of such code _is_ highly legible,
I'll give them that.  The whole however would frighten even Cthulhu.

Loup.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread BGB

On 1/1/2013 2:12 PM, Loup Vaillant-David wrote:

On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:

On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
2. The programmer has a belief or preference that the code is easier
to work with if it isn't abstracted. […]

I have evidence for this poisonous belief.  Here is some production
C++ code I saw:

   if (condition1)
   {
 if (condition2)
 {
   // some code
 }
   }

instead of

   if (condition1 
   condition2)
   {
 // some code
   }

-

   void latin1_to_utf8(std::string  s);

instead of

   std::string utf8_of_latin1(std::string s)
or
   std::string utf8_of_latin1(const std::string  s)

-

(this one is more controversial)

   Foo foo;
   if (condition)
 foo = bar;
   else
 foo = baz;

instead of

   Foo foo = condition
   ? bar
   : baz;

I think the root cause of those three examples can be called "step by
step thinking".  Some people just can't deal with abstractions at all,
not even functions.  They can only make procedures, which do their
thing step by step, and rely on global state.  (Yes, global state,
though they do have the courtesy to fool themselves by putting it in a
long-lived object instead of the toplevel.)  The result is effectively
a monster of mostly linear code, which is cut at obvious boundaries
whenever `main()` becomes too long ("too long" generally being a
couple hundred lines).  Each line of such code _is_ highly legible,
I'll give them that.  The whole however would frighten even Cthulhu.


part of the issue may be a tradeoff:
does the programmer think in terms of abstractions and high-level 
overviews?
or does the programmer mostly think in terms of step-by-step operations 
and make use of their ability to keep large chunks of information in memory?


it is a question maybe of whether the programmer sees the forest or the 
trees.


these sorts of things may well have an impact on the types of code a 
person writes, and what sorts of things the programmer finds more readable.



like, for a person who can mentally more easily deal with step-by-step 
thinking, but can keep much of the code in their mind at once, and 
quickly walk around and explore the various possibilities and scenarios, 
this kind of bulky low-abstraction code may be preferable, since when 
they walk the graph in their mind, they don't really have to stop and 
think too much about what sorts of items they encounter along the way.


in their mind's eye, it may well look like a debugger stepping at a rate 
of roughly 5-10 statements per second or so. they may or may not 
be fully aware how their mind does it, but they can vaguely see the 
traces along the call-stack, ghosts of intermediate values, and the 
sudden jump of attention to somewhere where a crash has occurred or an 
exception has been thrown.


actually, I had before compared it to ants:
it is like one's mind has ants in it, which walk along trails, either 
stepping code, or trying out various possibilities, ...
once something interesting comes up, it starts attracting more of 
these mental ants, until it has a whole swarm, and then a clearer 
image of the scenario or idea may emerge in one's mind.


but, abstractions and difficult concepts are like oil to these ants, 
where if ants encounter something they don't like (like oil) they will 
back up and try to walk around it (and individual ants aren't 
particularly smart).



and, probably, other people use other methods of reasoning about code...


___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread Loup Vaillant-David
On Tue, Jan 01, 2013 at 03:02:09PM -0600, BGB wrote:
 On 1/1/2013 2:12 PM, Loup Vaillant-David wrote:
 On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:
 On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
 2. The programmer has a belief or preference that the code is easier
 to work with if it isn't abstracted. […]
 I have evidence for this poisonous belief.  Here is some production
 C++ code I saw:
 
  [code snips]
 
 I think the root cause of those three examples can be called step by
 step thinking.  […]
 
 part of the issue may be a tradeoff:
 does the programmer think in terms of abstractions and using
 high-level overviews?
 or, does the programmer mostly think in terms of step-by-step
 operations and make use of their ability to keep large chunks of
 information in memory?
 
 it is a question maybe of whether the programmer sees the forest or
 the trees.
 
 these sorts of things may well have an impact on the types of code a
 person writes, and what sorts of things the programmer finds more
 readable.

Well, that could be tested.  Let's write some code in a procedural
way, and in a functional way.  Show it to a bunch of programmers, and
see if they understand it, spot the bugs, can extend it etc.  I'm not
sure what to expect from such tests.  One could think most people
would deal more easily with the procedural program, but on the other
hand, I expect the procedural version will be significantly more
complex, especially if it abides by "step by step" aesthetics.
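A toy version of such a test pair (the task and names are invented for illustration): the same computation written step by step with mutable state, and as a single expression.

```cpp
#include <numeric>
#include <vector>

// Procedural version: explicit index, state mutated step by step.
int sum_of_squares_proc(const std::vector<int>& xs) {
    int total = 0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        int sq = xs[i] * xs[i];
        total = total + sq;
    }
    return total;
}

// Functional-style version: one fold, no visible mutation.
int sum_of_squares_fn(const std::vector<int>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0,
                           [](int acc, int x) { return acc + x * x; });
}
```

One could hand each version to different groups of programmers and compare comprehension, bug-spotting, and extension times.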

Loup.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread Ondřej Bílka
On Tue, Jan 01, 2013 at 09:12:07PM +0100, Loup Vaillant-David wrote:
 On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:
  On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
  2. The programmer has a belief or preference that the code is easier
  to work with if it isn't abstracted. […]
This depends a lot on context. On one end you have a pile of copy-pasted
Visual Basic code that could easily be refactored into a tenth of its
size.
On the opposite end of the spectrum you have a piece of Haskell code
where everything is abstracted and each abstraction is wrong in some way
or another. 

The main reason for the latter is functional fixedness. A Haskell
programmer will see a structure as a monad but then does not see more
appropriate abstractions.

This is mainly problematic when there are portions of code that are very
similar but only by chance, and each requires different treatment. You
merge them into one function and after some time this function ends up
with ten parameters.

 
 I have evidence for this poisonous belief.  Here is some production
 C++ code I saw:
 
   if (condition1)
   {
 if (condition2)
 {
   // some code
 }
   }
 
 instead of
 
   if (condition1 
   condition2)
   {
 // some code
   }
 
 -
 
   void latin1_to_utf8(std::string  s);
 
Let me guess. They do it to save cycles caused by allocation of new
string.
 instead of
 
   std::string utf8_of_latin1(std::string s)
 or
   std::string utf8_of_latin1(const std::string  s)
 
 -
 
 (this one is more controversial)
 
   Foo foo;
   if (condition)
 foo = bar;
   else
 foo = baz;
 
 instead of
 
   Foo foo = condition
   ? bar
   : baz;
 
 I think the root cause of those three examples can be called "step by
 step thinking".  Some people just can't deal with abstractions at all,
 not even functions.  They can only make procedures, which do their
 thing step by step, and rely on global state.  (Yes, global state,
 though they do have the courtesy to fool themselves by putting it in a
 long-lived object instead of the toplevel.)  The result is effectively
 a monster of mostly linear code, which is cut at obvious boundaries
 whenever `main()` becomes too long ("too long" generally being a
 couple hundred lines).  Each line of such code _is_ highly legible,
 I'll give them that.  The whole however would frighten even Cthulhu.
 
 Loup.

-- 

The electricity substation in the car park blew up.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality (eye tracking)

2013-01-01 Thread Paul D. Fernhout

On 1/1/13 4:29 PM, Loup Vaillant-David wrote:

On Tue, Jan 01, 2013 at 03:02:09PM -0600, BGB wrote:

it is a question maybe of whether the programmer sees the forest or
the trees.

these sorts of things may well have an impact on the types of code a
person writes, and what sorts of things the programmer finds more
readable.


Well, that could be tested.  Let's write some code in a procedural
way, and in a functional way.  Show it to a bunch of programmers, and
see if they understand it, spot the bugs, can extend it etc.  I'm not
sure what to expect from such tests.  One could think most people
would deal more easily with the procedural program, but on the other
hand, I expect the procedural version will be significantly more
complex, especially if it abides step by step aesthetics.


This sounds like a great idea, and there are probably some PhDs to be 
had doing that (if it has not been done a lot already?). At least such 
research is starting though. Here is a related article about research 
using eye tracking software to find differences between how experts and 
novices look at code, with links to videos of eye movements:

http://developers.slashdot.org/story/12/12/19/1711225/how-experienced-and-novice-programmers-see-code

Here is a direct link to Michael Hansen's blog, who is a PhD student 
doing related research:

  http://synesthesiam.com/?p=218
As my fellow Ph.D. student Eric Holk talked about recently in his blog, 
I’ve been running eye-tracking experiments with programmers of different 
experience levels. In the experiment, a programmer is tasked with 
predicting the output of 10 short Python programs. A Tobii TX300 eye 
tracker keeps track of their eyes at 300 Hz, allowing me to see where 
they’re spending their time.


I imagine the same approach could be useful to look into this issue, as 
a new way to quantify what different programmers are doing in different 
situations. Here is a link to a question on eye tracking code, and from 
that I see that a search on "eye tracking software opencv" turns up a 
bunch of stuff, where the basic theory is that the pupils change shape 
depending on where the eyes are looking:

http://stackoverflow.com/questions/8959423/opencv-eye-tracking-on-android

In theory, the more real data we have on how people actually use 
software, the better we can make designs for new computing. Eye tracking 
is one way to collect a lot of that fairly quickly, and it can do that 
in a way that is much better than just recording what a user clicks on.


I'm getting a Samsung Galaxy Note 10.1 tablet as a next step towards a 
Dynabook. I chose that one mostly because it comes with a pressure 
sensitive pen as an input device. Apparently, that tablet still is not 
as good as a fifteen year old Apple Newton in some ways, and so is yet 
another example of technology regressing for reasons of strong copyright 
and generational turn-over. Related:

http://myapplenewton.blogspot.com/2012/10/apple-newton-still-beats-samsung-galaxy.html
http://myapplenewton.blogspot.com/2012/12/apple-newton-replacement-candidate.html

But I mention that tablet because one other feature is that the 
Samsung tablet supposedly uses the built-in front-facing camera to look 
to see whether the user's pupils are visible. If the tablet can't see 
the users pupils, then it goes into power saving mode. Of course, that 
is probably one of the first features I'll turn off in case it got 
hacked. :-) Still, there is a lot of possible promise there perhaps 
where the tablet could provide information or functionality more 
effectively somehow using that information about where I was looking. 
I've also heard you can make a fairly cheap eye tracker with a pair of 
glasses and some infrared LEDs and receivers, which sounds less 
intrusive than using cameras.


Anyway, I feel it is quite possible what would be found from that sort 
of eye tracking research on programmers is that, as in other domains of 
life, people have various characteristics, preferences, habits, skills, 
and so on at some particular time in their life, and those can be 
strengths or weaknesses depending on the context. A big part of whether 
a programmer is productive probably has to do with whether they are in 
flow, which in turn depends on how their current abilities relate to 
the current task (which is why many game developers make levels of their 
games progressively harder as players get better at them). So, even if 
we find that some programmers look at code differently than others based 
on experience or aptitude, that still does not mean that there is likely 
to be one type of programmer who is going to solve all the world's 
programming problems, nor would that mean that there is one kind of 
IDE that would satisfy all programmers at all stages of their careers. 
(Again, DrScheme/PLTScheme/Racket's language levels are a step towards 
this.)


Many of those programmers best at abstraction and with years of 
experience might probably just 

[fonc] Incentives and Metrics for Infrastructure vs. Functionality (was Re: Linus Chews Up Kernel Maintainer...)

2012-12-31 Thread Paul D. Fernhout

On 12/31/12 1:39 PM, Marcus G. Daniels wrote:

Of course, there is rarely the time or incentive structure to do any of
this.  Productive programmers are the ones that get results and are fast
at fixing (and creating) bugs.  In critical systems, at least, that's
the wrong incentive structure.  In these situations, it's more important
to reward people that create tests, create internal proofs, and refactor
and simplify code.  Having very dense code that requires investment to
change is a good thing in these situations.


Programming for the broadcasting industry right now (where a few seconds 
downtime might cost millions of dollars), I especially liked your point, 
Marcus. I live within this tension every day, as I imagine so do to an 
even higher degree aircraft software designers, medical system 
designers, automotive software designers, and so on where many lives are 
at risk from a bug. Certainly the more unit tests that code has, the 
more dense the code might feel, as the more resistant to casual change 
it can become, even as one may be ever more assured that the code is 
probably doing what you expect most of the time. And the argument goes 
that such denseness in terms of unit tests may actually give you more 
confidence in refactoring. But I can't say I started out feeling or 
programming that way.


The movie The Seven Samurai begins with the villagers having a big 
conceptual problem. How do the agriculturalists know how to hire 
competent Samurai, not being Samurai themselves? The villagers would 
most likely be able to know the difference in a short time between an 
effective and ineffective farm hand they might hire (based on their 
agricultural domain knowledge) -- but what do farmers know about 
evaluating swordsmanship or military planning? Likewise, an end user may 
know lots about their problem domain, but how can users tell the 
difference between effective and ineffective coding in the short term? 
How can users distinguish between software that just barely works at 
handling current needs and, by contrast, software that could handle a 
broad variety of input data, which could be easily expandable, and which 
would detect through unit tests unintended consequences of coding 
changes? That is meant mostly rhetorically -- although maybe a more 
on-topic question for this list would be how do we create software 
systems that somehow help people more easily appreciate or understand or 
visualize that difference?


Unless you know what to look for (and even sometimes if you do), it is 
hard to tell whether a programmer spending a month or two refactoring or 
writing tests is making the system better, or making the system worse, 
or maybe just is not doing much at all. Even worse from a bean counter 
perspective, what about the programmer who claims to be spending time 
(weeks or months) just trying to *understand* what is going on? And 
then, what if after apparently doing nothing for weeks, the programmer 
then removes lots of code? How does one measure that level of apparent 
non-productivity or even negative-productivity? A related bit of history:

  http://c2.com/cgi/wiki?NegativeLinesOfCode
A division of AppleComputer started having developers report 
LinesOfCode written as a ProductivityMetric?. The guru, BillAtkinson, 
happened to be refactoring and tuning a graphics library at the time, 
and ended up with a six-fold speedup and a much smaller library. When 
asked to fill in the form, he wrote in NegativeLinesOfCode. Management 
got the point and stopped using those forms soon afterwards.


If there is a systematic answer, part of it might be in having lots of 
different sorts of metrics for code, like in the direction of projects 
like Sonar. I don't see Sonar mentioned on this list, at least in the 
past six or so years. Here is a link:

  http://www.sonarsource.org/
Sonar is an open platform to manage code quality. As such, it covers 
the 7 axes of code quality: Architecture  Design, Comments, Coding 
rules, Potential bugs, Complexity, Duplications, and Unit tests


We tend to get what we measure. So, are these the sorts of things new 
computing efforts should be measuring?


Obviously, users can generally see the value of new functionality, 
especially if they asked for it. And thus there is this tension between 
infrastructure and functionality. This tension is especially strong in 
the context of black swan situations where chances are some rare thing 
will never happen, and if it does, someone else will be maintaining the 
code by then. How does one create incentives (and supporting metrics) 
related to that? In practice, this tension may sometimes get resolved by 
spending some time on refactoring and tests that the user will not 
appreciate directly and some time of obvious enhancements users will 
appreciate. Of course, this will make the enhancements seem to take 
longer than they might otherwise in the short term. And it can be an 
organizational and personal challenge 

Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality (was Re: Linus Chews Up Kernel Maintainer...)

2012-12-31 Thread Marcus G. Daniels

On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
Unless you know what to look for (and even sometimes if you do), it is 
hard to tell whether a programmer spending a month or two refactoring 
or writing tests is making the system better, or making the system 
worse, or maybe just is not doing much at all.
Sometimes I see code bases that have data structure traversal procedures 
cut and pasted into several modules without any apparent effort to 
abstract them into a map/operation higher-order form or a parameterized 
macro expansion.
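That map/operation higher-order form might look like this sketch (the tree type and names are invented for illustration):

```cpp
#include <memory>

// A toy binary tree; stands in for whatever structure the modules traverse.
struct Node {
    int value = 0;
    std::unique_ptr<Node> left, right;
};

// The traversal logic lives in exactly one place.  Each former
// copy-paste site now just supplies the operation to perform per node.
template <typename Op>
void for_each_node(const Node* n, Op op) {
    if (!n) return;
    for_each_node(n->left.get(), op);
    op(*n);  // in-order visit
    for_each_node(n->right.get(), op);
}
```

A former inline traversal then collapses to a call like `for_each_node(root.get(), [&](const Node& n) { sum += n.value; });`.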


I think for reasons like:

1. The logic to do the traversal is intermingled with the operation to 
be done at each site.  The programmer knows the outcome of the whole 
operation from the context they cribbed the code from, but they don't 
want to think about the details.  And why should they 'reduce their 
productivity' (e.g. lines of code per unit time) by refactoring the 
cribbed code if the original author couldn't be bothered?


2. The programmer has a belief or preference that the code is easier to 
work with if it isn't abstracted.
It's all right in front of them in the context they want it. Perhaps 
they are copying the code from foreign modules they don't maintain and 
don't want to.  They don't think in terms of global redundancy and the 
cost for the project but just their personal ease of maintenance after 
they assimilate the code.  If they have to study any indirect 
mechanisms, they become agitated or lose their momentum.


3. The programmer thinks they are smarter than the compiler and that the 
code will be faster if they inline everything and avoid function calls.


In any case, a manager confronted with this situation can choose to 
reward individuals who slow this type of proliferation.  An objective 
justification would be an increase in the percentage of lines touched by 
coverage analysis profiles across test suites, unit tests, etc.


Marcus
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality effective abstraction

2012-12-31 Thread Paul D. Fernhout

On 12/31/12 6:36 PM, Marcus G. Daniels wrote:

2. The programmer has a belief or preference that the code is easier to
work with if it isn't abstracted.
It's all right in front of them in the context they want it. Perhaps
they are copying the code from foreign modules they don't maintain and
don't want to.  They don't think in terms of global redundancy and the
cost for the project but just their personal ease of maintenance after
they assimilate the code.  If they have to study any indirect
mechanisms, they become agitated or lose their momentum.


A lot of programmers don't seem to get abstraction (I say this having 
taught programming at the college level), and that is probably a major 
reason code has more duplication and less conciseness than it should. 
Nonetheless, a lot of programmers who don't really get abstraction 
(including ideas like recursion) can still get a lot of useful-to-them 
stuff done by writing programs within their own comfort zone. My guess 
is that 80% to 90% or more of people who actually make their living 
programming these days fall into this category of not really being able 
to deal with abstraction much beyond writing a class to represent some 
problem domain object. So, such 
developers would not be very comfortable thinking about writing the code 
to compile those classes etc. -- even if many of them might understand a 
problem domain very well. These are the people who love Visual Basic. 
:-) And probably SQL. :-) And they love both of them precisely for the 
reasons that many people on this list probably feel very constrained 
and unhappy when using either of those two -- stuff like abstraction 
(or recursion, in the case of SQL) is hard to do in them. So, 
code in such languages in practice tends to stay within the abstraction 
limits of what such programmers can understand without a lot of effort.


Still, those who do get abstraction eventually realize that every layer 
of abstraction imposes its own cost -- both on the original developer 
and any subsequent maintainer. One can hope that new computer languages 
(whatever?) and new programming tools (like for refactoring) and new 
cultural habits (unit tests) could reduce those costs, but it does not 
seem like we are quite there yet -- although there are some signs of 
progress. Often invention is about tradeoffs: you can have some of this 
(reduced redundancy) if you give up some of that (the support code being 
straightforward). True breakthroughs come when someone figures out how 
to have lots of both (reduced redundancy and the supporting code being 
straightforward).


Related:
  http://en.wikipedia.org/wiki/Indirection
A famous aphorism of David Wheeler goes: "All problems in computer 
science can be solved by another level of indirection";[1] this is often 
deliberately misquoted with "abstraction layer" substituted for "level 
of indirection". Kevlin Henney's corollary to this is, "...except for 
the problem of too many layers of indirection."


My programming these days is a lot less abstract (clever?) than it used 
to be, and I don't think that is mostly because I am less sharp having 
programmed for about thirty years (although it is true that I am indeed 
somewhat less sharp than I used to be). I'm willing to tolerate somewhat 
more duplication than I used to because I know the price that needs to 
be paid for introducing abstraction. Abstractions need to be maintained. 
They may be incomplete. They may leak. They are often harder to debug. 
They can be implemented badly because a couple examples of something are 
often not enough to generalize well from. And so on (much of this known 
from personal experience). So, these days, I tend more towards the 
programming language equivalent of the Principle of least privilege, 
where I would like a computer language where, somewhat like DrScheme, I 
could specify a language level for every section of code to reduce the 
amount of cleverness allowed in that code. :-) That way, the Visual 
Basic like parts might be 90% of the code, and when I got into sections 
that did a lot of recursion or implemented parsers or whatever, it would 
be clear I was in code working at a higher level of abstraction. 
Considering how you can write VMs in Fortran, maybe that idea would not 
work in practice, but it is at least a thing to think about. Related:

http://en.wikipedia.org/wiki/Principle_of_least_privilege
http://c2.com/cgi/wiki?PrincipleOfLeastPrivilege

And on top of that, it is quite likely that future maintainers will 
have more problems with abstractions. So, the code in practice is likely to 
be less easily maintained by a random programmer. So, I now try to apply 
cleverness in other directions. For example, I might make tools that may 
stand outside the system and help with maintaining or understanding it 
(where even if the tool was not maintained, it would have served its 
purpose and given a return on time invested). Or I