(This is 2100 words, or almost 10 minutes of reading, and could probably become 800 words productively. I might do that later.)
Support for Disagreement
------------------------

In "Ontology is Overrated"
(http://shirky.com/writings/ontology_overrated.html), Clay Shirky
describes one advantage of tagging systems like del.icio.us as
follows:

* Market Logic - ... we're moving towards market logic, where you
  deal with individual motivation, but group value. As Schachter says
  of del.icio.us, "Each individual categorization scheme is worth
  less than a professional categorization scheme. But there are many,
  many more of them." ... The other essential value of market logic
  is that individual differences don't have to be homogenized. ...
  with tagging, anyone is free to use the words he or she thinks are
  appropriate, without having to agree with anyone else about how
  something "should" be tagged.

Market logic allows many distinct points of view to co-exist, because
it allows individuals to preserve their point of view, even in the
face of general disagreement. So support for individual points of
view amidst general disagreement is one of the benefits of
del.icio.us over dmoz or Yahoo, and it's built into the architecture
of the system --- it's not just a social practice. Could Wikipedia's
architecture change to support divergent points of view better?

In some cases, I think a technical advance would help solve a social
problem; for example, I think crazy old Tom Lord may have been
correct that better source-code version-tracking systems could allow
more people to collaborate productively on software, by allowing any
user to maintain their own stream of development into which they
merge patches as easily as the official maintainer does, a feature
CVS does not support.

There are some people who respond to any such suggestion with the
aphorism that technical solutions cannot solve social problems. I
think this aphorism contains seeds of truth and seeds of falsehood.
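Shirky's "individual motivation, group value" point can be made
concrete in a few lines of Python. This is my own illustration, not
del.icio.us's actual data model; the users and URL are made up. Each
user keeps their own tags, nobody has to agree with anybody, and a
group view emerges purely by aggregation:

```python
from collections import Counter

# Each user tags the same URL however they individually see fit;
# no scheme is homogenized away.  (Hypothetical users and URL.)
tags = {
    "alice": {"http://example.com/cat.jpg": ["cats", "cute"]},
    "bob":   {"http://example.com/cat.jpg": ["felines", "cute"]},
    "carol": {"http://example.com/cat.jpg": ["cats"]},
}

def group_view(url):
    """Aggregate the individual categorization schemes into a group
    ranking, while every individual scheme survives intact."""
    counts = Counter()
    for user_tags in tags.values():
        counts.update(user_tags.get(url, []))
    return counts.most_common()

print(group_view("http://example.com/cat.jpg"))
# [('cats', 2), ('cute', 2), ('felines', 1)]
```

The disagreement between "cats" and "felines" costs nothing: both
views persist, and the aggregate still ranks tags usefully.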
It is partly true, in that existing institutions and patterns of
interaction often contain internal problems that do not depend on any
technical infrastructure. It is also partly false, however, for three
reasons.

When Technological Artifacts Can Solve Social Problems
------------------------------------------------------

First, technological artifacts --- computers, source-control systems,
whatever --- are not purely value-neutral. They embed ways of
thinking and expectations about social interactions that reflect the
environments in which they evolved. For example, Unix provides no
mandatory access controls, because it evolved in an environment whose
users weren't trying to prevent other users from sharing information
with each other, and similarly for the internet; and now the DSL and
cable-modem parts of today's internet often provide no permanent IP
address users can use to publish information, because they were
developed for consumers, not participants. As a third example, Unix
was developed in an environment of extreme literacy, and consequently
many things about it are unduly difficult for people of limited
literacy.

When an institution adopts a technological artifact, it invariably
changes somewhat to accommodate the cultural expectations embedded in
the new artifact, if only by working around them. Often this changes
the structure of the institution. However, these changes do not
merely turn the institution into a copy of the social environment
that developed the artifact, and in some cases they may make the two
less alike. Problems inherent in the existing relationships of the
institution usually survive the adoption of new artifacts. For
example, many companies using Unix simply prohibit "ordinary users"
from logging into a Unix server, in order to prevent them from
sharing information with one another; the consequence is sometimes to
increase effective controls on information-sharing. These changes are
hard to predict.
William Gibson summed this up in his line, "The Street finds its own
uses for things --- uses the manufacturers never imagined."

Second, some social problems are simply the result of technical
problems. For example, nuclear power plants promote centralization of
energy-generating capacity, and consequently concentrate the wealth
generated by energy production in the hands of the small number of
people who own the plants. This social problem of centralization,
however, is in part the result of technical problems with reactor
safety and nuclear proliferation.

Third, new technical artifacts can support the existence of new
institutions, and those institutions may have different structures
from the existing ones --- and they may crowd them out. For example,
the telephone made possible single companies that operated many
factories, by providing a higher-bandwidth non-market means of
coordination, and email mailing lists support private topical
discussion groups among geographically distributed people with a
common niche interest.

So technical artifacts can create social problems, and by the same
token, solve them. They can't solve all social problems, and the way
existing institutions make use of the artifacts depends closely on
the details of each institution, so it is very difficult to predict
what will happen.

Source Control as an Example
----------------------------

The particular kind of interaction that decentralized version-control
tools, such as Tom Lord's "arch", Codeville, monotone, and darcs (as
well as a proprietary system), aim to support is something like this.
Many people have versions of a piece of software; the software is
broadly similar from person to person. Many of the people are making
changes to their local version, or "tree"; all of them select changes
from other people's versions to add to their own.
It's possible to move changes from one "tree" to another to the
extent that they share common structure, but the structure is of
course a function of these changes over time. This differs from the
CVS approach, in which there's a single "trunk" stream of development
to which the maintainer or maintainers add new versions, and other
contributors (if they exist at all) contribute by emailing patches to
a maintainer, who can then commit them to the trunk. The other
contributors may have their own local CVS repositories, but their
ability to merge in changes from new "official" versions is limited.

If an organization tries to adopt the "arch" model of the world while
using CVS, it has several choices:

* Treat each person's work area as a separate stream of development,
  and the CVS repository as yet another stream. This means that only
  one of the streams has version tracking. Worse, if people try to
  exchange patches directly, they waste a lot of time resolving
  spurious merge conflicts when they both try to commit the same set
  of changes to the CVS repository.

* Create a separate CVS repository for each person. This preserves
  version-tracking for each person, but makes getting code changes
  back and forth considerably more difficult, and doesn't eliminate
  the spurious merge conflicts.

* Use a separate CVS branch for each person. This makes CVS very
  slow, but retains version-tracking for each person, and makes
  getting code back and forth a little easier (though still
  dramatically more difficult than ordinary CVS use); it still
  doesn't eliminate the merge conflicts.

The nearly universal choice, with CVS, is to adopt a more centralized
model in which the CVS repository trunk is the Official Tree, even in
situations where that is costly --- for example, where it isn't yet
clear which design choice is better, or where you have to support
customers with an old version or a very strange hardware setup.
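The spurious-conflict problem can be sketched in a few lines of
Python. This toy model is my own illustration, not how any of these
tools actually store history: a repository is just a map from patch
IDs to patches, and pulling merges in only the patches the receiver
lacks. Because a patch carries its identity with it, pulling the same
change twice merges trivially, instead of colliding the way two
textually identical commits do on a CVS trunk:

```python
class Repo:
    """Toy decentralized repository: a bag of uniquely-named patches.
    (Illustrative only; real tools also order patches, store file
    contents, and detect genuine conflicts.)"""

    def __init__(self):
        self.patches = {}          # patch_id -> (description, edit)

    def record(self, patch_id, description, edit):
        self.patches[patch_id] = (description, edit)

    def pull(self, other):
        """Merge in every patch from `other` that we lack.  A patch
        we already have (same ID) is skipped --- no spurious
        conflict."""
        new = {pid: p for pid, p in other.patches.items()
               if pid not in self.patches}
        self.patches.update(new)
        return sorted(new)         # IDs actually transferred

# Alice and Bob each keep a first-class tree of their own.
alice, bob = Repo(), Repo()
alice.record("p1", "fix typo", "s/teh/the/")
bob.pull(alice)                    # Bob now has p1 too
bob.record("p2", "add docs", "create README")
alice.record("p3", "refactor", "move foo()")

print(alice.pull(bob))             # ['p2']  -- p1 is not re-merged
print(bob.pull(alice))             # ['p3']
```

After the two pulls, both trees contain p1, p2, and p3, and neither
side ever had to resolve a conflict over the shared patch p1 --- the
situation CVS forces you into when two people commit the same diff.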
So here we see a technical problem --- CVS's limitations, stemming
from the limited social dynamics it was built to support ---
reflecting itself in social problems. "arch", darcs, monotone,
Codeville, git, and other decentralized version-tracking systems aim
to support a wider array of development models; in particular, they
aim to allow each person's tree to stand alone as a first-class
citizen, easily sharing its changes with other similar trees.

Imagine Wikipedia Decentralized
-------------------------------

Imagine that we applied one of these systems to Wikipedia. We would
get several benefits: tolerance of controversy, disconnected
operation, higher availability, and potentially organizational
decentralization.

We could tolerate controversy better because Holocaust deniers would
have their own version of Wikipedia, which they could modify to their
heart's content. This would reduce their desire to modify the
Wikipedia that everyone else reads, though it would not eliminate it.
More importantly, allowing everyone to modify their own copy of
Wikipedia conveniently, while sharing the changes with anyone who
wanted them, would reduce the need to support changes by anonymous
people on the main Wikipedia site. Perhaps a historian, or several
historians, would undertake the task of merging together
history-related changes from many contributors, and the main
Wikipedia site would accept their changes automatically on
history-related articles --- but not history-related changes from
other people.

In Linux, there's one fellow who decides what goes into the official
kernel, another fellow who tries stuff out for a while before the
first guy accepts it, and a small number of "lieutenants" who act as
gathering points for patches on particular topics, like memory
management or IDE support, so that the generalist fellows don't have
to spend as much time looking at those things --- they can just
accept them en masse.
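The historian arrangement amounts to a trust policy, which is simple
enough to sketch. The names and topics below are hypothetical, and
this is my own invention, not anything MediaWiki implements: the main
site auto-merges a change to an article only when it comes from one
of that topic's trusted maintainers, and forwards everything else to
them for review.

```python
# Hypothetical trust table: topic -> maintainers whose merged changes
# the main site accepts automatically.  (Invented names.)
TOPIC_MAINTAINERS = {
    "history": {"alice_the_historian", "dave_the_historian"},
    "physics": {"bob_the_physicist"},
}

def handle_change(article_topic, submitter):
    """Auto-accept the change if the submitter maintains this topic;
    otherwise route it to the topic maintainers for vetting."""
    if submitter in TOPIC_MAINTAINERS.get(article_topic, set()):
        return "merged into main site"
    return "forwarded to %s maintainers" % article_topic

print(handle_change("history", "alice_the_historian"))
# merged into main site
print(handle_change("history", "anonymous_contributor"))
# forwarded to history maintainers
```

The point is that anonymous contributions don't disappear --- they
flow to whoever has earned the trust to merge them, rather than
landing directly on the page everyone reads.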
And the lieutenants have subsystem maintainers who perform the same
function for them. Also, there are a dozen or so distributions that
take the official kernel and apply their own sets of patches,
gathered from different sources. All of this is mediated by public
discussion on mailing lists where people publish their patches. Of
the process of selecting these people, Linus Torvalds says (in
http://www.linuxworld.com/story/46051.htm):

    "It's not me or any other leader who picks them. The programmers
    are very good at selecting leaders. There's no process for making
    somebody a lieutenant. But somebody who gets things done, shows
    good taste, and has good qualities -- people just start sending
    them suggestions and patches. I didn't design it this way. This
    happens because this is the way people work. It's very natural."

This sort of structure can make it considerably easier to incorporate
desirable changes into the Official Tree without including
undesirable ones. In Linux, perhaps the majority of changes are
undesirable, but that's probably not the case for Wikipedia; so
Wikipedia would probably benefit from a much more streamlined user
interface for submitting and accepting patches. Presumably Holocaust
deniers would have a hard time getting their changes accepted by
anyone, and so they would have a hard time getting other people to
contribute on their server.

Right now, Wikimedia's servers are a single point of failure for all
the millions of people who benefit from the whole Wikipedia project,
and the thousands who contribute. Perhaps I would pull new changes
from the wikipedia.org server onto my laptop each night so I could
consult Wikipedia when I was offline. I might make changes to my
local copy of Wikipedia, and the version-tracking system (Codeville
or whatever) would keep track of those changes. When I got back
online, I could submit them to wikipedia.org, or the relevant
subsystem maintainers, or whatever.
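The lieutenant structure described above can be sketched the same
way. This is a toy model with invented names, not real kernel
tooling: contributors send patches to a topic lieutenant, public
review happens on that lieutenant's queue, and the top maintainer
then accepts a whole queue en masse instead of reviewing every patch
individually.

```python
from collections import defaultdict

queues = defaultdict(list)    # lieutenant -> patches gathered so far
official_tree = []            # what the top maintainer has accepted

def send_patch(lieutenant, patch):
    """A contributor mails a patch to the relevant lieutenant, where
    it sits in public view until it is pulled."""
    queues[lieutenant].append(patch)

def pull_from(lieutenant):
    """The top maintainer trusts the lieutenant's taste and merges
    their entire gathered queue at once --- en masse."""
    official_tree.extend(queues.pop(lieutenant, []))

send_patch("mm-lieutenant", "fix page-cache leak")
send_patch("mm-lieutenant", "tune swappiness default")
send_patch("ide-lieutenant", "support weird IDE chipset")

pull_from("mm-lieutenant")
print(official_tree)
# ['fix page-cache leak', 'tune swappiness default']
```

The IDE patch still sits with its lieutenant, waiting for the next
pull; nothing about the structure requires the top maintainer to see
it until then.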
If the particular server I wanted to submit my changes to was down
for a few days, it wouldn't be a big deal; I'd still have my local
copy and my local changes, and I could still share them with other
people.

Presumably other people would also run their own public mirrors of
Wikipedia. In fact, there are already such mirrors, mostly set up by
SEO spamming scum, but right now they don't let you edit the articles
--- what would they do with the edits? Overwrite them with the next
version they copy from Wikipedia? Consequently, these mirrors are
mostly not very good. Imagine that they were actually making positive
contributions to the Wikipedia community, though: with their own
communities of contributors vetting changes, many good changes would
get made and reviewed without ever having to hit the main Wikipedia
site.

Wikipedia now has a budget, a team of system administrators, a
bandwidth bill, fund-raising drives, and banned-IP lists --- the
inevitable consequence, for now, of the operational centralization of
a service useful to all the people of the world. This operational
centralization might happen even if the underlying software supported
a more decentralized structure, but it wouldn't need to. It should be
obvious that I think this centralization is a necessary evil, and I
hope this approach would make it an unnecessary one.

But kernel.org --- where people download the Linux kernel ---
currently has two ProLiant DL585 quad-processor Opteron boxes with 24
GB of RAM attached to two one-gigabit network links and a 10-terabyte
disk array. That's maybe US$100 000 of machinery (each server is
about $40k) to serve a relatively small part of the world's
population. So clearly the fact that copies of the kernel are all
over the net doesn't dissuade people from using the canonical site.

Credits
-------

I greatly appreciate the help of Brett C.
Smith and Rohit Khare in discussing these ideas; I also drew on the writings and speeches of Clay Shirky, Greg Kroah-Hartman, Linus Torvalds, Joel Spolsky, Tom Lord, and William Gibson.

