OK, I'm sorry if this whole post is kinda disjointed.  I just had
this idea on the bus ride back home yesterday from not-work[1].  Also,
this is fairly divergent from the normal list traffic, so if you are more
into concrete discussions and linux shop talk, skip this.  :^D

[1] I was there but I wasn't officially there.

        I can't even truly quantify this as one single idea per se, but
rather as connections I began to see among various areas I had not
previously seen as related.  See how many of the following things you can
tie together and to the closed vs open source thing: probability,
thermodynamics (specifically entropy), economics (specifically the
behavior of companies in non-perfect competitive situations, esp.
monopolistic behavior), sociology (patterns of collective behavior),
software design (like _The Mythical Man-Month_, which I just started reading a
few days ago :D, esp. the communications overhead of communal processes
vs. the direct work requirements and the design of large systems),
evolutionary biology (esp. continuous vs. punctuated equilibrium), and
information theory.  Now, I freely admit that I'm not an expert in any of
those fields (of course, that does depend on your definition of expert,
which probably doesn't entail a finite expertise limit anyway ;^D), but
the bits and pieces I have encountered seem to support, in a way more
sophisticated than my own gut feeling (no doubt shared on this list),
the sense that open source is just fundamentally better than proprietary
software, on the practical as well as the philosophical level[2].

[2]...in most cases (money is a good motivator for getting it done RIGHT
NOW and DEAD BANG PERFECT (like embedded software in medical devices, as
was on /. a while ago), and money typically implies closed source,
for now at least)

        So, let me lay out some of the things I see.  I'm sorry if all
this seems tangled or pretentious.  I just want to type it out to work it
out more in my head and maybe get feedback and maybe throw something
different into the list for a change of pace...  

        The entropy of closed systems increases[3], presuming that some
change is taking place.  If you define the system as being a particular
piece of software, then the entropy would be the increasing disorder of
the code as the code base grows larger (the change) or is modified in some
other way (disorder leading to flaws in the overall logical structure of
the program, aka that most hated insect, the bug :D).

[3]See pp. 99-105 or so in P.W. Atkins's _Physical Chemistry_, 6th ed.  His
wording is a little different, but it can be summed up by saying that the
total change in entropy for the total (system + surroundings) must be > 0
for any spontaneous process (spontaneous = will occur without outside
interference, like gas expanding or hot things cooling).  In closed
systems, because no outside interference is possible, only spontaneous
processes are 'allowed,' so deltaS must be > 0. (sub side note, if you
define the system to be the universe as a whole, by definition it is
closed (an infinite closed system...) (unless you're heavy into cosmology,
which I am not), so the net change in entropy for ANYTHING is > 0 (like,
you made an extension to Taylor Hall out of random bits of wood and pipe
and concrete, but in so doing lots of hydrocarbons were converted to
smaller organic compounds (by both man and machine), and lots of heat was
released into the surroundings (heat causes disorder, see Atkins)), which
leads to a pretty solid form of nihilism in that no matter how hard you
try to introduce order into your world all you do is increase the level of
disorder leading to the eventual heat death of the universe)

        The economics bit I'm less up on.  Hey, I'm a chem major[4], not
an eco major.  :^D  (The eco 304k book here is pretty good, principles of
microeconomics, and covers all of this in much more detail and far more
lucidly than I can.)  Essentially, in a perfectly competitive market,
every firm sells an essentially identical product, there are no artificial
barriers to entry or exit (to or from the market), every firm has perfect
information about the conditions of the market, and all consumers are
equally informed.  Firms become price-takers in that they must sell their
product at whatever the market dictates as being the fair price in order
to sell any (or less than that if they wish to make less per unit, but
then all their competitors find out and in short order do the same thing
in order to retain business).   However, the real world is obviously not
composed of perfect markets (although some might come close, cf. the
efficient market hypothesis relating to securities trading).  I won't get
into how firms behave in monopolistic markets, as that would take pages
upon pages, but briefly: the monopolistic firm tends to underproduce
relative to what would be efficient _for society_ and overprice.
Perfectly competitive markets tend to produce the optimally efficient mix
of products _for society_.  ("Should we import steel or butter, or some
combination thereof?  Well, steel makes us strong, whereas butter only
makes us fat." paraphrase of Mao iirc)  Closed software is an inherent
attempt to monopolize ideas and the implementation of ideas (for profit).

[4]Well, actually I'm a BSCS major now but I'm going to finish my BSChem
too... :)  And I'm doing something called the business foundation program.
I think I'll have in the vicinity of 230 credit hours when I'm done.  How
masochistic is that?
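
The underproduce-and-overprice point above can be sketched with a
back-of-the-envelope example.  All numbers here are invented, and I'm
assuming the standard textbook setup of linear demand and constant
marginal cost:

```python
# Toy comparison of a perfectly competitive market vs. a monopolist.
# Assumed (made-up) linear demand P = a - b*Q, constant marginal cost c.
a, b, c = 100.0, 1.0, 20.0  # demand intercept, demand slope, marginal cost

# Perfect competition: price gets driven down to marginal cost.
q_comp = (a - c) / b          # quantity produced
p_comp = c                    # price charged

# Monopoly: produce where marginal revenue (a - 2*b*Q) equals marginal cost.
q_mono = (a - c) / (2 * b)
p_mono = a - b * q_mono

print(q_comp, p_comp)  # 80.0 20.0
print(q_mono, p_mono)  # 40.0 60.0
```

In this toy setup the monopolist produces half the competitive quantity
and charges triple the price, which is the pattern in miniature.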
        
        I've never had a formal class on the patterns of collective
behavior, but from my minimal exposure to the study of sociology, the
salient observation seems to be that most cultures value cooperative solutions.
The scale of the cooperative unit varies though.  "We'll build this well
together for our village but not to help those bastards over the hill."
Beyond that size, we'd rather band together as a group to profit from
everyone not in our group.  Will we ever expand the size of a cooperative
group to include everyone?  Maybe, but probably not anytime soon.  Of
course, the increased globalization of society (internet, fast travel,
etc.) seems to be expanding the size of our cooperative groups more
everyday...

        Probability (and by extension statistics), as a field of study in
its own right, has struck me as deathly boring (raise your hand if
you've ever been excited about a Gaussian distribution for its own
sake)...but its applications in other fields can be downright zippy
(apply that same Gaussian (a.k.a. normal) distribution to
things like evolutionary genetics, data analysis, societal progress in
terms of ideas per quantum of time (probability of a major advance, blah
blah), or a million other things).  Assume that a given as-yet unwritten
piece of software has a certain finite amount of programmer effort
required for it to be written in a debugged, useful form (similar argument
for a feature in a piece of larger software), and that any given
programmer's contribution to that effort is described by a given
probability distribution (one day they might be really inspired, the next
day they might be really sick or have been hit by a bus).  The more
programmers you throw at a given problem, within certain limits (see the
paragraph below about software engineering and the MMM), the greater the
amount of work is done on the project per unit time on average (and this
average becomes considerably more stable over time as more programmers are
added to the project[5]).

[5]see Moore's _Basic Practice of Statistics_ or any other good stat book
you choose, or a science text that has sections on the effect on
experimental precision of repeated experimental iterations (refinement of
the average for increasing n observations, basically)
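
The averaging argument above is just the law of large numbers, and a
quick simulation shows it.  The per-day output distribution here is a
made-up assumption (mostly ordinary days, occasional inspiration,
occasional bus accidents); only the trend matters:

```python
import random

random.seed(42)  # reproducible run

def daily_output():
    """One programmer's contribution on one day (invented distribution)."""
    r = random.random()
    if r < 0.1:
        return 0.0   # sick, or hit by the proverbial bus
    if r < 0.2:
        return 3.0   # really inspired
    return 1.0       # ordinary day

def team_average(n_programmers, n_days=1000):
    """Mean daily team output and its relative spread over n_days."""
    totals = [sum(daily_output() for _ in range(n_programmers))
              for _ in range(n_days)]
    mean = sum(totals) / n_days
    sd = (sum((t - mean) ** 2 for t in totals) / n_days) ** 0.5
    return mean, sd / mean  # relative spread shrinks roughly as 1/sqrt(n)

for n in (1, 10, 100):
    mean, rel_spread = team_average(n)
    print(n, round(mean, 1), round(rel_spread, 3))
```

More programmers means more average work per day, and a day-to-day
average that wobbles proportionally less.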

In open systems, there is _no effective barrier to adding programmers to
the project_.  Anyone with knowledge of the project and skills that they
feel sufficient can contribute.  Bugs are fixed faster, as more
programmers have a higher probability of finding them by simple chance.
New features are thought up faster, for a given probability of inspiration
for a given feature.  Etc.  Closed systems, however, suffer from the fact
that only those permitted to look at the source can contribute[6], and
typically only those employed by the source owner can see the code.  Even
for very large corporations, their pool of programmers is going to be
smaller than the number of programmers extant in totality.  (For the open
vs. closed barrier of entry see the economics paragraph...)

[6] (leaving aside things like development kits etc., but then how
effective can a contribution be if you have to make it through a black box
into another room whose contents you cannot see?)
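
The "more programmers have a higher probability of finding bugs by simple
chance" argument above has a simple closed form.  A minimal sketch,
assuming each reviewer independently has some fixed chance p of spotting
a given bug (the value of p is invented):

```python
# If each of n independent reviewers spots a given bug with probability p,
# the chance that at least one of them finds it is 1 - (1 - p)**n.
p = 0.05  # assumed per-reviewer detection chance (made-up number)

def p_found(n, p=p):
    return 1 - (1 - p) ** n

for n in (1, 10, 100, 1000):
    print(n, round(p_found(n), 4))
# 1 -> 0.05, 10 -> 0.4013, 100 -> 0.9941, 1000 -> ~1.0
```

Even with a low per-person chance, the probability of detection climbs
toward certainty quickly as the pool of eyeballs grows.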

        If anyone reading this happens to be going to school in Kansas,
please skip this paragraph.  :^D  Evolutionary biology is something I had
some formal classwork in, albeit not at a high level of sophistication,
and because I love tormenting street preachers I've maintained an interest
in it.  If you regard a given piece of software as being analogous to a
biological organism (and I would argue that it is: it has a code that
determines its form and function, which might work perfectly or might not;
it functions well in its environment or it dies out; different programs
written to perform similar tasks in a similar environment end up sharing
very many common features and overall appearance, etc.), then over time
the changes that occur in the software can be considered part of an
evolutionary process.  There is considerable debate over the rate at which
evolution occurs (beyond zero to silly hairless apes in ~3 billion years),
last I heard.  One camp favors continuous change occurring at a relatively
constant rate, and the other favors something called punctuated
equilibrium, in which things exist in a fairly constant form until
something occurs (or a pattern of somethings occurring over time) that
causes a radical period of evolutionary change leading to a new
equilibrium.  Personally, I think the latter seems the more attractive
theory, but that's just me [7].  Assume that the average rate of change is
proportional to the average rate of work being put into the project (the
probability thing from a few paragraphs back), for the former model, or
that a certain (possibly variable) amount of work is necessary to trigger
the next equilibrium shift, for the latter.  Tie this in with the
probability thing and you arrive at the hypothesis that open source
software evolves faster than closed source software, and is thus better
suited to its particular environment by virtue of faster adaptation, all
other factors remaining equal.  As supporting evidence compare the rate at
which updates to Windows are released or new stuff appears on the Windows
Update thingy to the rate at which new kernels are released for linux or
things appear on freshmeat.net...

[7] again, I'm just a poor dumb ol' chem major so any biologists in the
audience be merciful... :^D
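
The punctuated-equilibrium reading above can be made concrete with a toy
threshold model.  The work rates and the threshold are invented numbers;
only the comparison between the two cases matters:

```python
# Toy punctuated-equilibrium model: work accumulates until it crosses an
# (assumed, fixed) threshold, which triggers a shift to a new equilibrium.

def shifts(work_per_day, days=365, threshold=50.0):
    accumulated, count = 0.0, 0
    for _ in range(days):
        accumulated += work_per_day
        if accumulated >= threshold:
            count += 1          # radical change: a new equilibrium reached
            accumulated = 0.0   # settle into the new stable form
    return count

closed = shifts(work_per_day=1.0)  # smaller pool of programmers
open_ = shifts(work_per_day=4.0)   # no barrier to adding contributors
print(closed, open_)  # 7 28
```

A higher sustained rate of work crosses the threshold proportionally more
often, i.e. the project passes through more equilibrium shifts per year.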

        There is, however, a potential fly in the ointment.
Communications overhead.  As the number of people involved in a project
increases, more and more effort goes into communication and coordination
amongst the members of the team.  Beyond a certain point, adding new
members actually slows the rate at which work occurs because of the amount
of communication needed[8].  So the quandary of what is the optimal number
of hands/eyeballs/minds to throw at a project isn't necessarily resolved
by "Add more people!".  

[8] Definitely see Brooks's _The Mythical Man-Month_.  Lynn at Desertbooks
can get it for you...certainly worth $20 for those of you interested in
software engineering or SE management.
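
Brooks's overhead argument has a tidy combinatorial core: n people have
n*(n-1)/2 pairwise communication channels.  A minimal sketch, with the
per-person output and per-channel cost being invented numbers:

```python
# Effective team output under a simple (assumed) linear-cost model:
# each member adds work, but each pairwise channel subtracts overhead.

def effective_output(n, work_per_person=1.0, cost_per_channel=0.01):
    channels = n * (n - 1) // 2   # Brooks's n*(n-1)/2 pairwise channels
    return n * work_per_person - channels * cost_per_channel

# Find the team size where output peaks under these assumed costs.
best = max(range(1, 201), key=effective_output)
print(best, round(effective_output(best), 2))
```

The exact optimum depends entirely on the assumed per-channel cost; the
point is only that the curve turns over, so "Add more people!" eventually
backfires.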

However, the amount of overhead depends on the communications model used.
(HTTP is quite chatty, for example.)  In the commercial market, from what I
understand, programming teams typically have quite a lot of comm. overhead:
meetings, you have to obtain permission to modify the source code, more
meetings, you have to coordinate your source changes with the other
components of the programming team, more meetings, etc.  The open source
model seems a bit different, though:  any interested party can get the
source code from some (typically at one point or another centralized)
source (web site, by email from the 'leader', by ftp, etc).  The
programmer then commences to work on the source at his or her own pace,
contributing things when they seem significant, possibly coordinating in
small teams with other open source minded programmers, perhaps not.
Someone or a team of someones on the other end decides whether or not to
include the changes.  Repeat as many times as desired...  The
communications overhead is still there, but it is much smaller compared to
what it would be for a traditional programming team of the same size (can
you imagine trying to corral all several million linux developers into
_one team_? ouch...).  Further, teams form around the support roles
(mailing list maintainers, /. moderators and founders :^), web site
maintainers, etc. etc.) that allow for centralized sources of information
to exist while not bogging down active programmers with their maintenance.

        Information theory relates to this in that it helps formally
describe the communication process.  I'll admit to a near-total lack of
knowledge on this topic on a formal level.  I have read a _little bit_ of
a pretty neat Dover[9] book called _An Introduction to Information Theory_
by Fazlollah M. Reza.  

[9] Dover publishes nice paperback reprints of technical books in a huge
array of fields that are considered classical or defining works in those
fields.  Good info cheaply.  (Albeit, sometimes not as refined in
presentation or content as subsequent books, but that's life.)

        So, after several KB of jaw flapping from me, what is the bottom
line?  I think that, for most situations[10], the open source software
development model is fundamentally superior to closed source development.
Beyond my gut feeling that this is so, I think that by drawing
concepts from formal fields of study and by making appropriate analogies,
it can be shown on a more quantitative level that this is the case.  (I'll
leave all the experimentation and research paper writing up to those of
you actually interested in doing so (CS grad students, most likely)...
:^D).

[10]  Typically, open source software depends on folks wanting to
contribute to it.  I don't think anyone really _wants_ to work on boring
stuff like ERP software, or at least those who find ERP software
non-boring are rare...  So you have to _make_ people want to do it.  How
do you do that?  Money, lots of it, preferably accompanied by lots of nice
secondary benefits.  If you are a company and you have paid through the
nose to develop something and your livelihood depends on it, your tendency
to release the source code is likely quite low...

        So anyway, this is just some stuff that's been bouncing around in
my head over the past day or so.  I hope you all found it to be
interesting reading, or at least a way to kill half an hour or so.  Pardon
any typos or grammar mistakes, please; it is getting quite late as I type
this...

*****************Michael Orion Jackson******************
***********TAMS Class of 96/UT Class of 200?************
*********************Random Quote:**********************
*"Why me?" "Because you are in Natural Sciences,silly."*
********************************************************
