Re: [fonc] Linus Chews Up Kernel Maintainer For Introducing Userspace Bug - Slashdot

2013-01-01 Thread BGB

On 12/31/2012 10:47 PM, Marcus G. Daniels wrote:

On 12/31/12 8:30 PM, Paul D. Fernhout wrote:
So, I guess another meta-level bug in the Linux Kernel is that it is 
written in C, which does not support certain complexity management 
features, and there is no clear upgrade path from that because C++ 
has always had serious linking problems.
But the ABIs aren't specified in terms of language interfaces; they 
are architecture-specific.  POSIX kernel interfaces don't need C++ 
link-level compatibility, or even extern "C" compatibility 
interfaces.  Similarly on the device side, that's packing command 
blocks and such, byte by byte.  Until a few years ago, GCC was the 
only compiler ever used (or able) to compile the Linux kernel.  It is 
a feature that it all can be compiled with one open source toolchain.  
Every aspect can be improved.




granted.

typically, the actual call into kernel-land is a target-specific glob of 
ASM code, which may then be wrapped up to make all the various system calls.
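
(for illustration, a rough sketch of what such a wrapper can look like on 
x86-64 Linux, assuming the usual convention: syscall number in rax, 
arguments in rdi/rsi/rdx, result returned in rax; the function name here 
is made up)

  // minimal sketch: invoking write(2) directly via the syscall instruction
  // on x86-64 Linux; the kernel clobbers rcx and r11, and a negative return
  // value in rax encodes -errno.
  #include <cstddef>

  static long raw_write(int fd, const void *buf, std::size_t len) {
      long ret;
      __asm__ __volatile__("syscall"
                           : "=a"(ret)
                           : "a"(1 /* __NR_write on x86-64 */),
                             "D"(fd), "S"(buf), "d"(len)
                           : "rcx", "r11", "memory");
      return ret;
  }

  int main() {
      raw_write(1, "hello\n", 6);
      return 0;
  }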



as for ABIs a few things could help:
if the C++ ABI was defined *along with* the C ABI for a given target;
if the C++ compilers would use said ABI, rather than each rolling their own;
if the ABI were sufficiently general to be more useful to multiple 
languages (besides just C and C++);

...

in this case, the C ABI could be considered a formal subset of the C++ ABI.


admittedly, if I could have my say, I would make some changes to the way 
struct/class passing and returning is handled in the SysV AMD64 ABI: namely, 
make it less complicated/evil, such that a struct is either passed in 
a single register or passed by reference (no decomposing it and passing 
it via multiple registers).


more-so, probably also provide spill space for arguments passed in 
registers (more like Win64 does).
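
(as a rough illustration of the difference, not a substitute for the actual 
ABI documents: under SysV AMD64 a small struct may be decomposed across 
registers, while Win64 passes anything larger than 8 bytes by hidden 
reference and reserves 32 bytes of spill/shadow space for the register 
arguments)

  // illustration only; the struct and function names are made up
  struct Vec2 { double x, y; };              // 16 bytes

  double sum(Vec2 v) { return v.x + v.y; }

  // SysV AMD64: v is decomposed; x arrives in xmm0 and y in xmm1
  //             (each 8-byte chunk is classified as SSE here).
  // Win64:      v is larger than 8 bytes, so the caller makes a copy and
  //             passes a pointer to it in rcx, and 32 bytes of shadow space
  //             are reserved on the stack so the callee can spill its
  //             register arguments if it wants to.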



granted, this itself may illustrate part of the problem:
with many of these ABIs, not everyone is happy, so there is a lot of 
temptation for compiler vendors to go their own way (making it unsafe 
to mix and match code compiled by different compilers, or sometimes 
with different compiler options...).


such code may usually work, but sometimes fail, due to minor ABI 
differences.



From that thread I read that those in the Linus camp are fine with 
abstraction, but it has to be their abstraction on their terms. And 
later in the thread, Theodore Ts'o gave an example of opacity in the 
programming model:

  a = b + "/share/" + c + serial_num;

arguing that you can have absolutely no idea how many memory 
allocations are done, due to type coercions, overloaded operators, etc.

Well, I'd say just write the code in concise notation.  If there are 
memory allocations they'll show up in valgrind runs, for example. Then 
disassemble that function and understand what the memory allocations 
actually are.  If there is a better way to do it, then either change 
abstractions, or improve the compiler to do it more efficiently.   
Yes, there can be an investment in a lot of stuff. But just defining 
any programming model with a non-obvious performance model as a bad 
programming model is shortsighted advice, especially for developers 
outside of the world of operating systems.   That something is 
non-obvious is not necessarily a bad thing.   It just means a bit more 
depth-first investigation.   At least one can _learn_ something from 
the diversion.




yep.

some of this is also a bit of a problem for many VM-based languages, 
which may, behind the scenes, chew through memory while giving the 
programmer little control over any of it.


in my case, I have been left fighting performance in many areas with my 
own language, admittedly because its present core VM design isn't 
particularly high performance in some areas.



though, one can still be left looking at a sort of ugly wall:
the wall separating static and dynamic types.

dynamic typing is a land of relative ease, but not particularly good 
performance.
static typing is a land of pain and implementation complexity, but also 
better performance.


well, there is also the fixnum issue, where a fixnum may be just 
slightly smaller than an analogous native type (it is the curse of the 
28-30 bit fixnum, or the 60-62 bit long-fixnum...).


this issue is annoying because it gets in the way of having an 
efficient fixnum type that also maps to a sensible native type (like 
int) while keeping intact the usual definition that int is exactly 
32 bits and/or that long is exactly 64 bits.
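
(a minimal sketch of why the bits go missing, assuming a low-bit tag 
scheme; the exact tag layout varies from VM to VM)

  // hypothetical 2-bit low tag on a 32-bit word: 00 = fixnum, other values
  // mark heap references. shifting the payload left by 2 leaves only 30
  // bits for the integer itself, so a fixnum cannot cover the full range
  // of a native 32-bit int.
  #include <cstdint>

  static const unsigned TAG_BITS = 2;
  static const std::uint32_t TAG_FIXNUM = 0;

  static std::uint32_t box_fixnum(std::int32_t v) {
      return ((std::uint32_t)v << TAG_BITS) | TAG_FIXNUM;
  }
  static std::int32_t unbox_fixnum(std::uint32_t r) {
      return (std::int32_t)r >> TAG_BITS;   // arithmetic shift restores the sign
  }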


but, as a recent attempt at trying to switch to untagged value types 
revealed, even with an interpreter core that is mostly statically 
typed, making this switch may still open a big can of worms in some 
other cases (because there are still holes in the static type-system).



I have been left considering the possibility of instead making a compromise:
int, float, and double can be represented directly;
long, however, would (still) be handled as a boxed-value.

this 

[fonc] Current topics

2013-01-01 Thread Alan Kay
The most recent discussions get at a number of important issues whose 
pernicious snares need to be handled better.

In an analogy to sending messages most of the time successfully through noisy 
channels -- where the noise also affects whatever we add to the messages to 
help (and we may have imperfect models of the noise) -- we have to ask: what 
kinds and rates of error would be acceptable?

We humans are a noisy species. And on both ends of the transmissions. So a 
message that can be proved perfectly received as sent can still be 
interpreted poorly by a human directly, or by software written by humans.


A wonderful specification language that produces runnable code good enough to 
make a prototype, is still going to require debugging because it is hard to get 
the spec-specs right (even with a machine version of human level AI to help 
with larger goals comprehension).

As humans, we are used to being sloppy about message creation and sending, and 
rely on negotiation and good will after the fact to deal with errors. 

We've not done a good job of dealing with these tendencies within programming 
-- we are still sloppy, and we tend not to create negotiation processes to deal 
with various kinds of errors. 

However, we do see something that is actual engineering -- with both care in 
message sending *and* negotiation -- where eventual failure is not tolerated: 
mostly in hardware, and in a few vital low-level systems which have to scale 
pretty much finally-essentially error-free such as the Ethernet and Internet.

My prejudices have always liked dynamic approaches to problems with error 
detection and improvements (if possible). Dan Ingalls was (and is) a master at 
getting a whole system going in such a way that it has enough integrity to 
exhibit its failures and allow many of them to be addressed in the context of 
what is actually going on, even with very low level failures. It is interesting 
to note the contributions from what you can say statically (the higher the 
level the language the better) -- what can be done with meta (the more 
dynamic and deep the integrity, the more powerful and safe meta becomes) -- 
and the tradeoffs of modularization (hard to sum up, but as humans we don't 
give all modules the same care and love when designing and building them).

Mix in real human beings and a world-wide system, and what should be done? (I 
don't know, this is a question to the group.)

There are two systems I look at all the time. The first is lawyers contrasted 
with engineers. The second is human systems contrasted with biological systems.

There are about 1.2 million lawyers in the US, and about 1.5 million engineers 
(some of them in computing). The current estimates of programmers in the US 
are about 1.3 million (US Dept of Labor counting programmers and developers). 
Also, the Internet and multinational corporations, etc., internationalizes the 
impact of programming, so we need an estimate of the programmers world-wide, 
probably another million or two? Add in the ad hoc programmers, etc? The 
populations are similar in size enough to make the contrasts in methods and 
results quite striking.

Looking for analogies, to my eye what is happening with programming is more 
similar to what has happened with law than with classical engineering. Everyone 
will have an opinion on this, but I think it is partly because nature is a 
tougher critic on human built structures than humans are on each other's 
opinions, and part of the impact of this is amplified by the simpler shorter 
term liabilities of imperfect structures on human safety than on imperfect laws 
(one could argue that the latter are much more of a disaster in the long run).

And, in trying to tease useful analogies from Biology, one I get is that the 
largest gap in complexity of atomic structures is the one from polymers to the 
simplest living cells. (One of my two favorite organisms is Pelagibacter 
ubique, which is the smallest non-parasitic standalone organism. Discovered 
just 10 years ago, it is the most numerous known bacterium in the world, and 
accounts for 25% of all of the plankton in the oceans. Still it has about 1300+ 
genes, etc.) 

What's interesting (to me) about cell biology is just how much stuff is 
organized to make integrity of life. Craig Venter thinks that a minimal 
hand-crafted genome for a cell would still require about 300 genes (and a 
tiniest whole organism still winds up with a lot of components).

Analogies should be suspect -- both the one to the law, and the one here should 
be scrutinized -- but this one harmonizes with one of Butler Lampson's 
conclusions/prejudices: that you are much better off making -- with great care 
-- a few kinds of relatively big modules as basic building blocks than to have 
zillions of different modules being constructed by vanilla programmers. One of 
my favorite examples of this was the Beings master's thesis by Doug Lenat at 
Stanford in the 70s. And this 

Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread Loup Vaillant-David
On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:
 On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
 2. The programmer has a belief or preference that the code is easier
 to work with if it isn't abstracted. […]

I have evidence for this poisonous belief.  Here is some production
C++ code I saw:

  if (condition1)
  {
if (condition2)
{
  // some code
}
  }

instead of

  if (condition1 &&
      condition2)
  {
// some code
  }

-

  void latin1_to_utf8(std::string &s);

instead of

  std::string utf8_of_latin1(std::string s)
or
  std::string utf8_of_latin1(const std::string &s)

-

(this one is more controversial)

  Foo foo;
  if (condition)
foo = bar;
  else
foo = baz;

instead of

  Foo foo = condition
  ? bar
  : baz;

I think the root cause of those three examples can be called step-by-step
thinking.  Some people just can't deal with abstractions at all,
not even functions.  They can only make procedures, which do their
thing step by step, and rely on global state.  (Yes, global state,
though they do have the courtesy to fool themselves by putting it in a
long-lived object instead of the toplevel.)  The result is effectively
a monster of mostly linear code, which is cut at obvious boundaries
whenever `main()` becomes too long (too long generally being a
couple hundred lines).  Each line of such code _is_ highly legible,
I'll give them that.  The whole, however, would frighten even Cthulhu.

Loup.


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread BGB

On 1/1/2013 2:12 PM, Loup Vaillant-David wrote:

On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:

On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
2. The programmer has a belief or preference that the code is easier
to work with if it isn't abstracted. […]

I have evidence for this poisonous belief.  Here is some production
C++ code I saw:

   if (condition1)
   {
 if (condition2)
 {
   // some code
 }
   }

instead of

   if (condition1 &&
       condition2)
   {
 // some code
   }

-

   void latin1_to_utf8(std::string &s);

instead of

   std::string utf8_of_latin1(std::string s)
or
   std::string utf8_of_latin1(const std::string &s)

-

(this one is more controversial)

   Foo foo;
   if (condition)
 foo = bar;
   else
 foo = baz;

instead of

   Foo foo = condition
   ? bar
   : baz;

I think the root cause of those three examples can be called step by
step thinking.  Some people just can't deal with abstractions at all,
not even functions.  They can only make procedures, which do their
thing step by step, and rely on global state.  (Yes, global state,
though they do have the courtesy to fool themselves by putting it in a
long lived object instead of the toplevel.)  The result is effectively
a monster of mostly linear code, which is cut at obvious boundaries
whenever `main()` becomes too long (too long generally being a
couple hundred lines.  Each line of such code _is_ highly legible,
I'll give them that.  The whole however would frighten even Cthulhu.


part of the issue may be a tradeoff:
does the programmer think in terms of abstractions and using high-level 
overviews?
or, does the programmer mostly think in terms of step-by-step operations 
and make use of their ability to keep large chunks of information in memory?


it is a question maybe of whether the programmer sees the forest or the 
trees.


these sorts of things may well have an impact on the types of code a 
person writes, and what sorts of things the programmer finds more readable.



like, for a person who can mentally more easily deal with step-by-step 
thinking, but can keep much of the code in their mind at once and 
quickly walk around and explore the various possibilities and scenarios, 
this kind of bulky low-abstraction code may be preferable: when 
they walk the graph in their mind, they don't really have to stop and 
think too much about what sorts of items they encounter along the way.


in their mind's eye, it may well look like a debugger stepping at a rate 
of roughly 5-10 statements per second or so. they may or may not 
be fully aware of how their mind does it, but they can vaguely see the 
traces along the call-stack, ghosts of intermediate values, and the 
sudden jump of attention to wherever a crash has occurred or an 
exception has been thrown.


actually, I had previously compared it to ants:
it is like one's mind has ants in it, which walk along trails, either 
stepping through code, or trying out various possibilities, ...
once something interesting comes up, it starts attracting more of 
these mental ants, until it has a whole swarm, and then a clearer 
image of the scenario or idea may emerge in one's mind.


but, abstractions and difficult concepts are like oil to these ants, 
where if ants encounter something they don't like (like oil) they will 
back up and try to walk around it (and individual ants aren't 
particularly smart).



and, probably, other people use other methods of reasoning about code...




Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread Loup Vaillant-David
On Tue, Jan 01, 2013 at 03:02:09PM -0600, BGB wrote:
 On 1/1/2013 2:12 PM, Loup Vaillant-David wrote:
 On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:
 On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
 2. The programmer has a belief or preference that the code is easier
 to work with if it isn't abstracted. […]
 I have evidence for this poisonous belief.  Here is some production
 C++ code I saw:
 
  [code snips]
 
 I think the root cause of those three examples can be called step by
 step thinking.  […]
 
 part of the issue may be a tradeoff:
 does the programmer think in terms of abstractions and using
 high-level overviews?
 or, does the programmer mostly think in terms of step-by-step
 operations and make use of their ability to keep large chunks of
 information in memory?
 
 it is a question maybe of whether the programmer sees the forest or
 the trees.
 
 these sorts of things may well have an impact on the types of code a
 person writes, and what sorts of things the programmer finds more
 readable.

Well, that could be tested.  Let's write some code in a procedural
way, and in a functional way.  Show it to a bunch of programmers, and
see if they understand it, spot the bugs, can extend it, etc.  I'm not
sure what to expect from such tests.  One could think most people
would deal more easily with the procedural program, but on the other
hand, I expect the procedural version will be significantly more
complex, especially if it abides by step-by-step aesthetics.
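
(for instance, the kind of pair one might show the test subjects; just a 
sketch, summing the even numbers of a vector in C++)

  #include <cstddef>
  #include <numeric>
  #include <vector>

  // procedural, step-by-step version
  int sum_evens_proc(const std::vector<int>& xs) {
      int total = 0;
      for (std::size_t i = 0; i < xs.size(); ++i)
          if (xs[i] % 2 == 0)
              total += xs[i];
      return total;
  }

  // more "functional" version: a single accumulate expression
  int sum_evens_func(const std::vector<int>& xs) {
      return std::accumulate(xs.begin(), xs.end(), 0,
          [](int acc, int x) { return x % 2 == 0 ? acc + x : acc; });
  }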

Loup.


Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality

2013-01-01 Thread Ondřej Bílka
On Tue, Jan 01, 2013 at 09:12:07PM +0100, Loup Vaillant-David wrote:
 On Mon, Dec 31, 2012 at 04:36:09PM -0700, Marcus G. Daniels wrote:
  On 12/31/12 2:58 PM, Paul D. Fernhout wrote:
  2. The programmer has a belief or preference that the code is easier
  to work with if it isn't abstracted. […]
This depends a lot on context. On one end you have a pile of copy-pasted 
Visual Basic code that could easily be refactored into a tenth of its size.
On the opposite end of the spectrum you have a piece of Haskell code where
everything is abstracted and each abstraction is wrong in some way or
another. 

The main reason for the latter is functional fixedness. A Haskell programmer 
will see a structure as a monad but then does not see more appropriate 
abstractions.

This is mainly problematic when there are portions of code that are very
similar, but only by chance, and each requires different treatment. You
merge them into one function, and after some time this function ends up 
with ten parameters.

 
 I have evidence for this poisonous belief.  Here is some production
 C++ code I saw:
 
   if (condition1)
   {
 if (condition2)
 {
   // some code
 }
   }
 
 instead of
 
   if (condition1 &&
       condition2)
   {
 // some code
   }
 
 -
 
   void latin1_to_utf8(std::string &s);
 
Let me guess: they do it to save the cycles spent allocating a new
string.
 instead of
 
   std::string utf8_of_latin1(std::string s)
 or
   std::string utf8_of_latin1(const std::string &s)
 
 -
 
 (this one is more controversial)
 
   Foo foo;
   if (condition)
 foo = bar;
   else
 foo = baz;
 
 instead of
 
   Foo foo = condition
   ? bar
   : baz;
 
 I think the root cause of those three examples can be called step by
 step thinking.  Some people just can't deal with abstractions at all,
 not even functions.  They can only make procedures, which do their
 thing step by step, and rely on global state.  (Yes, global state,
 though they do have the courtesy to fool themselves by putting it in a
 long lived object instead of the toplevel.)  The result is effectively
 a monster of mostly linear code, which is cut at obvious boundaries
 whenever `main()` becomes too long (too long generally being a
 couple hundred lines.  Each line of such code _is_ highly legible,
 I'll give them that.  The whole however would frighten even Cthulhu.
 
 Loup.

-- 

The electricity substation in the car park blew up.


[fonc] SubScript website gone live: programming with Process Algebra

2013-01-01 Thread Andre van Delft
Please allow me to blurb the following, which is related to several 
discussions at FONC:

Our web site http://subscript-lang.org went officially live last Saturday. 
SubScript is a way to extend common programming languages, aimed to ease event 
handling and concurrency. Typical application areas are GUI controllers, text 
processing applications and discrete event simulations. SubScript is based on a 
mathematical concurrency theory named Algebra of Communicating Processes (ACP).

ACP is a 30-year-old branch of mathematics, as solid as numeric algebra and 
boolean algebra. In fact, you can regard ACP as an extension of boolean algebra 
with 'things that can happen'. These items are glued together with operations 
such as alternative, sequential and parallel composition. This way ACP combines 
the essence of compiler-compilers and notions of parallelism.

Adding ACP to a common programming language yields a lightweight alternative 
to threads for concurrency. It also brings the 50-year-old but still magical 
expressiveness of the languages for parser generators and compiler-compilers, so 
that SubScript suits language processing. The nondeterministic style combined 
with concurrency support happens to be very useful for programming GUI 
controllers. Surprisingly, ACP with a few extras even enables data-flow style 
programming, like you have with pipes in the Unix shell.

For instance, to program a GUI controller for a simple search application takes 
about 15 lines of code in Java or Scala, if you do threading well. In SubScript 
it is only 5 lines; see 
http://subscript-lang.org/examples/a-simple-gui-application/

At the moment SubScript is being implemented as an extension to the programming 
language Scala; other languages, such as C, C++, C#, Java and JavaScript, would 
be possible too. The current state of the implementation is mature enough for 
experimentation by language researchers, but not yet for real application 
development. If you have the Eclipse environment with the Scala plugin 
installed, it is easy to get SubScript running with the example applications 
from our Google Code project.

We hope this announcement will raise interest from programming language 
researchers, and that some developers will get aboard on the project.

In the second half of February 2013 we will very probably give a presentation 
and a hands-on workshop at EPFL in Lausanne, the place where Scala is 
developed. We hope to have a SubScript compiler ready by then, branched from the 
Scala compiler scalac. A more detailed announcement will follow by the end of 
January on our site.




[fonc] Wrapping object references in NaN IEEE floats for performance (was Re: Linus...)

2013-01-01 Thread Paul D. Fernhout

On 1/1/13 3:43 AM, BGB wrote:

here is mostly that this still allows for type-tags in the
references, but would likely involve a partial switch to the use of
64-bit tagged references within some core parts of the VM (as a partial
switch away from magic pointers). I am currently leaning towards
putting the tag in the high-order bits (to help reduce 64-bit arithmetic
ops on x86).


One idea I heard somewhere (probably on some Squeak-related list several 
years ago) is to have all objects stored as floating point NaN instances 
(NaN == Not a Number). The biggest bottleneck in practice for many 
applications that need computer power these days (like graphical 
simulations) usually seems to be floating point math, especially with 
arrays of floating point numbers. Generally when you do most other 
things, you're already paying some other overhead somewhere. But 
multiplying arrays of floats efficiently is what makes or breaks many 
interesting applications. So, by wrapping all other objects as instances 
of floating point numbers using the NaN approach, you are optimizing for 
the typically most CPU-intensive case of many user applications. 
Granted, there are going to be tradeoffs, like integer math and looping 
then probably being a bit slower? Perhaps there is some research 
paper already out there about the tradeoffs for this sort of approach?


For more background, see:
  http://en.wikipedia.org/wiki/NaN
For example, a bit-wise layout of an IEEE floating-point standard 
single precision (32-bit) NaN would be: s111 1111 1axx xxxx xxxx xxxx xxxx xxxx
  where s is the sign (most often ignored in applications), a 
determines the type of NaN, and x is an extra payload (most often 
ignored in applications)


So, information about other types of objects would start in that extra 
payload part. There may be some inconsistency in how hardware 
interprets some of these bits, so you'd have to think about if that 
could be worked around if you want to be platform-independent.


See also:
  http://en.wikipedia.org/wiki/IEEE_floating_point

You might want to just go with 64 bit floats, which would support 
wrapping 32 bit integers (including as pointers to an object table if 
you wanted, even up to probably around 52 bit integer pointers); see:

  IEEE 754 double-precision binary floating-point format: binary64
  http://en.wikipedia.org/wiki/Binary64
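
A minimal sketch of the encoding side of that idea (hypothetical layout: 
the top 16 bits mark a reserved quiet-NaN range, the low 48 bits carry a 
pointer or other payload; real schemes differ in the details):

  #include <cstdint>
  #include <cstring>

  // a Value is an ordinary double unless its top bits fall in a reserved
  // quiet-NaN range; then the low 48 bits carry a pointer payload.
  static const std::uint64_t NANBOX_TAG   = 0xFFF8000000000000ULL; // sign + all-ones exponent + quiet bit
  static const std::uint64_t PAYLOAD_MASK = 0x0000FFFFFFFFFFFFULL;

  struct Value { std::uint64_t bits; };

  static Value  box_double(double d)  { Value v; std::memcpy(&v.bits, &d, 8); return v; }
  static double as_double(Value v)    { double d; std::memcpy(&d, &v.bits, 8); return d; }
  static Value  box_pointer(void *p) {
      Value v;
      v.bits = NANBOX_TAG | ((std::uint64_t)(std::uintptr_t)p & PAYLOAD_MASK);
      return v;
  }
  static bool  is_pointer(Value v) { return (v.bits & ~PAYLOAD_MASK) == NANBOX_TAG; }
  static void *as_pointer(Value v) { return (void *)(std::uintptr_t)(v.bits & PAYLOAD_MASK); }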


does sometimes seem like I am going in circles at times though...


I know that feeling myself, as I've been working on semantic-related 
generally-triple-based stuff for going on 30 years, and I still feel 
like the basics could be improved. :-)


Meanwhile I'm going to think about Alan Kay's latest comments...

--Paul Fernhout
http://www.pdfernhout.net/

The biggest challenge of the 21st century is the irony of technologies 
of abundance in the hands of those thinking in terms of scarcity.



Re: [fonc] Incentives and Metrics for Infrastructure vs. Functionality (eye tracking)

2013-01-01 Thread Paul D. Fernhout

On 1/1/13 4:29 PM, Loup Vaillant-David wrote:

On Tue, Jan 01, 2013 at 03:02:09PM -0600, BGB wrote:

it is a question maybe of whether the programmer sees the forest or
the trees.

these sorts of things may well have an impact on the types of code a
person writes, and what sorts of things the programmer finds more
readable.


Well, that could be tested.  Let's write some code in a procedural
way, and in a functional way.  Show it to a bunch of programmers, and
see if they understand it, spot the bugs, can extend it etc.  I'm not
sure what to expect from such tests.  One could think most people
would deal more easily with the procedural program, but on the other
hand, I expect the procedural version will be significantly more
complex, especially if it abides step by step aesthetics.


This sounds like a great idea, and there are probably some PhDs to be 
had doing that (if it has not been done a lot already?). At least such 
research is starting though. Here is a related article about research 
using eye tracking software to find differences between how experts and 
novices look at code, with links to videos of eye movements:

http://developers.slashdot.org/story/12/12/19/1711225/how-experienced-and-novice-programmers-see-code

Here is a direct link to Michael Hansen's blog, who is a PhD student 
doing related research:

  http://synesthesiam.com/?p=218
As my fellow Ph.D. student Eric Holk talked about recently in his blog, 
I’ve been running eye-tracking experiments with programmers of different 
experience levels. In the experiment, a programmer is tasked with 
predicting the output of 10 short Python programs. A Tobii TX300 eye 
tracker keeps track of their eyes at 300 Hz, allowing me to see where 
they’re spending their time.


I imagine the same approach could be useful to look into this issue, as 
a new way to quantify what different programmers are doing in different 
situations. Here is a link to a question on eye-tracking code, and from 
that I see that a search on "eye tracking software opencv" turns up a 
bunch of stuff, where the basic theory is that the pupils change shape 
depending on where the eyes are looking:

http://stackoverflow.com/questions/8959423/opencv-eye-tracking-on-android

In theory, the more real data we have on how people actually use 
software, the better we can make designs for new computing. Eye tracking 
is one way to collect a lot of that fairly quickly, and it can do that 
in a way that is much better than just recording what a user clicks on.


I'm getting a Samsung Galaxy Note 10.1 tablet as a next step towards a 
Dynabook. I chose that one mostly because it comes with a pressure 
sensitive pen as an input device. Apparently, that tablet still is not 
as good as a fifteen year old Apple Newton in some ways, and so is yet 
another example of technology regressing for reasons of strong copyright 
and generational turn-over. Related:

http://myapplenewton.blogspot.com/2012/10/apple-newton-still-beats-samsung-galaxy.html
http://myapplenewton.blogspot.com/2012/12/apple-newton-replacement-candidate.html

But I mention that tablet because one other feature is that the 
Samsung tablet supposedly uses the built-in front-facing camera to look 
to see whether the user's pupils are visible. If the tablet can't see 
the user's pupils, then it goes into power saving mode. Of course, that 
is probably one of the first features I'll turn off in case it got 
hacked. :-) Still, there is a lot of possible promise there perhaps 
where the tablet could provide information or functionality more 
effectively somehow using that information about where I was looking. 
I've also heard you can make a fairly cheap eye tracker with a pair of 
glasses and some infrared LEDs and receivers, which sounds less 
intrusive than using cameras.


Anyway, I feel it is quite possible what would be found from that sort 
of eye tracking research on programmers is that, as in other domains of 
life, people have various characteristics, preferences, habits, skills, 
and so on at some particular time in their life, and those can be 
strengths or weaknesses depending on the context. A big part of whether 
a programmer is productive probably has to do with whether they are in 
flow, which in turn depends on how their current abilities relate to 
the current task (which is why many game developers make levels of their 
games progressively harder as players get better at them). So, even if 
we find that some programmers look at code differently than others based 
on experience or aptitude, that still does not mean that there is likely 
to be one type of programmer who is going to solve all the world's 
programming problems, and nor would that mean that there is one kind of 
IDE that would satisfy all programmers at all stages of their careers. 
(Again, DrScheme/PLTScheme/Racket's language levels are a step towards 
this.)


Many of those programmers best at abstraction and with years of 
experience might probably just 

Re: [fonc] Current topics

2013-01-01 Thread Paul Homer
My thinking has been going the other way for some time now. I see the problem 
as the need to build bigger systems than any individual can currently imagine. 
The real value from computers isn't just collecting the input from a single 
person, but rather 'combining' the inputs from huge groups of people. 
It's that ability to unify and harmonize our collective knowledge that 
gives us a leg up on being able to rationalize our rather over-complicated 
world. 

The problem I see with components, particularly a small set of large ones, is 
that as the size of a formal system increases, the possible variations explode. 
That is, if we consider a nearly trivial small set of primitives, there are 
several different possible decompositions. As the size of the system grows, the 
number of decompositions grows, probably exponentially or better. Thus as we 
walk up the levels of abstraction to something higher, there becomes a much 
larger set of possibilities. If what we desire is beyond any individual's 
comprehension, and there is a huge variance in the pieces that will get 
created, then we'll run into considerable problems when we try to bring all 
of these pieces together. That, I think, is essentially where we are currently.

My sense of the problem is to go the other way: to make the pieces so trivial 
that they can be combined easily. It may sound labour-intensive to bring it all 
together, but then we do have the ability of computers themselves to spend 
endless hours doing mundane chores for us. The trick then would be to engage as 
many people as possible in constructing these little pieces, then bring them 
all together. In a design sense, this is not substantially different than the 
Internet, or Wikipedia. These both grew organically out of relatively small 
pieces with minimal organization, yet somehow converged on an end product that 
is considerably larger than any individual's single effort.

Paul.


Re: [fonc] Current topics

2013-01-01 Thread Casey Ransberger
Read this guy!

On Tue, Jan 1, 2013 at 7:53 AM, Alan Kay alan.n...@yahoo.com wrote:

 The most recent discussions get at a number of important issues whose
 pernicious snares need to be handled better.

 In an analogy to sending messages most of the time successfully through
 noisy channels -- where the noise also affects whatever we add to the
 messages to help (and we may have imperfect models of the noise) -- we have
 to ask: what kinds and rates of error would be acceptable?

 We humans are a noisy species. And on both ends of the transmissions. So a
 message that can be proved perfectly received as sent can still be
 interpreted poorly by a human directly, or by software written by humans.

 A wonderful specification language that produces runnable code good
 enough to make a prototype, is still going to require debugging because it
 is hard to get the spec-specs right (even with a machine version of human
 level AI to help with larger goals comprehension).

 As humans, we are used to being sloppy about message creation and sending,
 and rely on negotiation and good will after the fact to deal with errors.

 We've not done a good job of dealing with these tendencies within
 programming -- we are still sloppy, and we tend not to create negotiation
 processes to deal with various kinds of errors.

 However, we do see something that is actual engineering -- with both
 care in message sending *and* negotiation -- where eventual failure is
 not tolerated: mostly in hardware, and in a few vital low-level systems
 which have to scale pretty much finally-essentially error-free such as
 the Ethernet and Internet.

 My prejudices have always liked dynamic approaches to problems with error
 detection and improvements (if possible). Dan Ingalls was (and is) a master
 at getting a whole system going in such a way that it has enough integrity
 to exhibit its failures and allow many of them to be addressed in the
 context of what is actually going on, even with very low level failures. It
 is interesting to note the contributions from what you can say statically
 (the higher the level the language the better) -- what can be done with
 meta (the more dynamic and deep the integrity, the more powerful and safe
 meta becomes) -- and the tradeoffs of modularization (hard to sum up, but
 as humans we don't give all modules the same care and love when designing
 and building them).

 Mix in real human beings and a world-wide system, and what should be done?
 (I don't know, this is a question to the group.)

 There are two systems I look at all the time. The first is lawyers
 contrasted with engineers. The second is human systems contrasted with
 biological systems.

 There are about 1.2 million lawyers in the US, and about 1.5 million
 engineers (some of them in computing). The current estimates of
 programmers in the US are about 1.3 million (US Dept of Labor counting
 programmers and developers). Also, the Internet and multinational
 corporations, etc., internationalizes the impact of programming, so we need
 an estimate of the programmers world-wide, probably another million or
 two? Add in the *ad hoc* programmers, etc? The populations are similar in
 size enough to make the contrasts in methods and results quite striking.

 Looking for analogies, to my eye what is happening with programming is
 more similar to what has happened with law than with classical engineering.
 Everyone will have an opinion on this, but I think it is partly because
 nature is a tougher critic on human built structures than humans are on
 each other's opinions, and part of the impact of this is amplified by the
 simpler shorter term liabilities of imperfect structures on human safety
 than on imperfect laws (one could argue that the latter are much more of a
 disaster in the long run).

 And, in trying to tease useful analogies from Biology, one I get is that
 the largest gap in complexity of atomic structures is the one from polymers
 to the simplest living cells. (One of my two favorite organisms is 
 *Pelagibacter
 ubique*, which is the smallest non-parasitic standalone organism.
 Discovered just 10 years ago, it is the most numerous known bacterium in
 the world, and accounts for 25% of all of the plankton in the oceans. Still
 it has about 1300+ genes, etc.)

 What's interesting (to me) about cell biology is just how much stuff is
 organized to make integrity of life. Craig Venter thinks that a minimal
 hand-crafted genome for a cell would still require about 300 genes (and a
 tiniest whole organism still winds up with a lot of components).

 Analogies should be suspect -- both the one to the law, and the one here
 should be scrutinized -- but this one harmonizes with one of Butler
 Lampson's conclusions/prejudices: that you are much better off making --
 with great care -- a few kinds of relatively big modules as basic building
 blocks than to have zillions of different modules being constructed by
 vanilla programmers. One of my 

Re: [fonc] Wrapping object references in NaN IEEE floats for performance (was Re: Linus...)

2013-01-01 Thread BGB

On 1/1/2013 6:36 PM, Paul D. Fernhout wrote:

On 1/1/13 3:43 AM, BGB wrote:

here is mostly that this still allows for type-tags in the
references, but would likely involve a partial switch to the use of
64-bit tagged references within some core parts of the VM (as a partial
switch away from magic pointers). I am currently leaning towards
putting the tag in the high-order bits (to help reduce 64-bit arithmetic
ops on x86).


One idea I heard somewhere (probably on some Squeak-related list 
several years ago) is to have all objects stored as floating point NaN 
instances (NaN == Not a Number). The biggest bottleneck in practice 
for many applications that need computer power these days (like 
graphical simulations) usually seems to be floating point math, 
especially with arrays of floating point numbers. Generally when you 
do most other things, you're already paying some other overhead 
somewhere already. But multiplying arrays of floats efficiently is 
what makes or breaks many interesting applications. So, by wrapping 
all other objects as instances of floating point numbers using the NaN 
approach, you are optimizing for the typically most CPU intensive case 
of many user applications. Granted, there is going to be tradeoffs 
like integer math and so looping might then probably be a bit slower? 
Perhaps there is some research paper already out there about the 
tradeoffs for this sort of approach?




I actually tried this already...

I had originally borrowed the idea from Lua (a paper I was reading 
mentioned it as having been used in Lua).



the problems were, primarily on 64-bit targets:
my other code assumed value-ranges which didn't fit nicely in the 52-bit 
mantissa;

being a NaN obscured the pointers from the GC;
it added a fair bit of cost to pointer and integer operations;
...

granted, you only really need 48 bits for current pointers on x86-64; 
the problem was that other code had already been assuming a 56-bit 
tagged space when using pointers (spaces), leaving a bit of a 
problem of 56 > 52.


so, everything was crammed into the mantissa somewhat inelegantly, and 
the costs regarding integer and pointer operations made it not really an 
attractive option.


all this was less of an issue with 32-bit x86, as I could essentially 
just shove the whole pointer into the mantissa (spaces and all), and 
the GC wouldn't be confused by the value.



basically, what spaces is: a part of the address space is divided up 
into a number of regions for various dynamically typed values (the 
larger ones being for fixnum and flonum).


on 32-bit targets, spaces is 30 bits, located between the 3GB and 
4GB address marks (which the OS generally reserves for itself). on 
x86-64, it is currently a 56-bit space located at 0x7F00_.
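
(roughly, the appeal of a tag in the high-order bits is that the tag test 
is a single shift or compare on the top byte; a sketch with a made-up 
8-bit tag in bits 56..63, not the actual layout, which is on the wiki 
page linked below)

  #include <cstdint>

  // hypothetical layout: 8-bit type tag in the top byte, 56-bit payload below.
  static const unsigned TAG_SHIFT = 56;
  static const std::uint64_t PAYLOAD_MASK56 = (1ULL << TAG_SHIFT) - 1;

  static std::uint64_t mk_ref(unsigned tag, std::uint64_t payload) {
      return ((std::uint64_t)tag << TAG_SHIFT) | (payload & PAYLOAD_MASK56);
  }
  static unsigned      ref_tag(std::uint64_t r)     { return (unsigned)(r >> TAG_SHIFT); }
  static std::uint64_t ref_payload(std::uint64_t r) { return r & PAYLOAD_MASK56; }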




For more background, see:
  http://en.wikipedia.org/wiki/NaN
For example, a bit-wise layout of an IEEE floating-point standard 
single precision (32-bit) NaN would be: s111 1111 1axx xxxx xxxx xxxx xxxx xxxx
  where s is the sign (most often ignored in applications), a 
determines the type of NaN, and x is an extra payload (most often 
ignored in applications)


So, information about other types of objects would start in that 
extra payload part. There may be some inconsistency in how hardware 
interprets some of these bits, so you'd have to think about if that 
could be worked around if you want to be platform-independent.


See also:
  http://en.wikipedia.org/wiki/IEEE_floating_point

You might want to just go with 64 bit floats, which would support 
wrapping 32 bit integers (including as pointers to an object table if 
you wanted, even up to probably around 52 bit integer pointers); see:

  IEEE 754 double-precision binary floating-point format: binary64
  http://en.wikipedia.org/wiki/Binary64



yep...

my current tagging scheme partly incorporates parts of double, mostly in 
the sense that some tags were chosen such that a certain range of 
doubles can be passed through unmodified and with full precision.


the drawback is that 0 is special, and I haven't yet thought up a good 
way around this issue.


admittedly I am not entirely happy with the handling of fixnums either 
(more arithmetic and conditionals than I would like).



here is what I currently have:
http://cr88192.dyndns.org:8080/wiki/index.php/Tagged_references



does sometimes seem like I am going in circles at times though...


I know that feeling myself, as I've been working on semantic-related 
generally-triple-based stuff for going on 30 years, and I still feel 
like the basics could be improved. :-)




yes.

well, in this case, it is that I have bounced back and forth between 
tagged-references and magic pointers multiple times over the years.


granted, this would be the first time I am doing so using fixed 64-bit 
tagged references.



granted, on x86-64, I will probably end up later merging a lot of this 
back into the 

Re: [fonc] Current topics

2013-01-01 Thread Casey Ransberger
Inline.

On Tue, Jan 1, 2013 at 7:53 AM, Alan Kay alan.n...@yahoo.com wrote:

 The most recent discussions get at a number of important issues whose
 pernicious snares need to be handled better.

 In an analogy to sending messages most of the time successfully through
 noisy channels -- where the noise also affects whatever we add to the
 messages to help (and we may have imperfect models of the noise) -- we have
 to ask: what kinds and rates of error would be acceptable?


Depends on the context, I'm sure. When I'm solving a Project Euler problem
in Squeak, my heart isn't broken if I manage to bork the image, because I'm
writing the code to throw it away, and nothing depends on it but my own
flights of fancy. A missile guidance system, eh, well now that's something
I'd like a bit more well tested. Etc.


 We humans are a noisy species. And on both ends of the transmissions. So a
 message that can be proved perfectly received as sent can still be
 interpreted poorly by a human directly, or by software written by humans.

 A wonderful specification language that produces runnable code good
 enough to make a prototype, is still going to require debugging because it
 is hard to get the spec-specs right (even with a machine version of human
 level AI to help with larger goals comprehension).


Makes me think of debugging grammars. The grammar is quite specified, but
the specification is deceptively complex, recursive.  Hard to hold in the
lobes all at once.


 As humans, we are used to being sloppy about message creation and sending,
 and rely on negotiation and good will after the fact to deal with errors.

 We've not done a good job of dealing with these tendencies within
 programming -- we are still sloppy, and we tend not to create negotiation
 processes to deal with various kinds of errors.


Contracts. I think I might grok how we arrived upon the law metaphor.


 However, we do see something that is actual engineering -- with both
 care in message sending *and* negotiation -- where eventual failure is
 not tolerated: mostly in hardware, and in a few vital low-level systems
 which have to scale pretty much finally-essentially error-free such as
 the Ethernet and Internet.


I had a manager once who said, "The reason what we do isn't engineering is
people aren't dying from it often enough." Bridge collapses with people on
it, career over. Kernel panic? Tell them to reboot, and ship a hot fix as
soon as possible.

My prejudices have always liked dynamic approaches to problems with error
 detection and improvements (if possible). Dan Ingalls was (and is) a master
 at getting a whole system going in such a way that it has enough integrity
 to exhibit its failures and allow many of them to be addressed in the
 context of what is actually going on, even with very low level failures. It
 is interesting to note the contributions from what you can say statically
 (the higher the level the language the better) -- what can be done with
 meta (the more dynamic and deep the integrity, the more powerful and safe
 meta becomes) -- and the tradeoffs of modularization (hard to sum up, but
 as humans we don't give all modules the same care and love when designing
 and building them).


Right. Again, the missile guidance system (ironically?) gets more love than
my solutions to Project Euler problems.


 Mix in real human beings and a world-wide system, and what should be done?
 (I don't know, this is a question to the group.)


Don't panic:)


 There are two systems I look at all the time. The first is lawyers
 contrasted with engineers. The second is human systems contrasted with
 biological systems.

 There are about 1.2 million lawyers in the US, and about 1.5 million
 engineers (some of them in computing). The current estimates of
 programmers in the US are about 1.3 million (US Dept of Labor counting
 programmers and developers). Also, the Internet and multinational
 corporations, etc., internationalizes the impact of programming, so we need
 an estimate of the programmers world-wide, probably another million or
 two? Add in the *ad hoc* programmers, etc? The populations are similar in
 size enough to make the contrasts in methods and results quite striking.

 Looking for analogies, to my eye what is happening with programming is
 more similar to what has happened with law than with classical engineering.
 Everyone will have an opinion on this, but I think it is partly because
 nature is a tougher critic on human built structures than humans are on
 each other's opinions, and part of the impact of this is amplified by the
 simpler shorter term liabilities of imperfect structures on human safety
 than on imperfect laws (one could argue that the latter are much more of a
 disaster in the long run).


Yeah, the short term liabilities, and the yelling executives interfering
with the process. Also, being able to retroactively fix DOA systems
remotely produces weird effects that are hard to think about naturally,
e.g., working