Last night I spent a fair amount of time going over the technical hurdles of creating a distributed system. They were larger than I had expected, but by no means unworkable. Building the media system as a distributed network would undoubtedly require more engineering effort up front, but the undesirable implications of a centralized solution vastly outweigh the effort it would take to design and build this thing right.

The issue: A fundamental design decision in how to build the media module / media network functionality. The choice is between a centralized submission / viewing media site, or a distributed media network that is aggregated up to a centrally searchable site.

Part 1: The problems of a Distributed Media Network

The biggest technical concerns Ping has about a decentralized solution are listed here:
http://www.hack4dean.org/phpwiki/index.php?CentralizedMediaDatabase

Here is my take on these issues:

* Solving the taxonomy problem is fairly easy. Either have the media nodes feed down the taxonomy from a central source, or give stiff warnings to admins who might decide to come up with their own vocabs: if they do, the aggregator may not understand or make available their local media.

* Using Drupal's taxonomy system to categorize media on the aggregator poses a problem if you want to have a GUID on media such as [EMAIL PROTECTED] (which a decentralized system would require), because the UIDs are stored as ints in the table. The easiest solution is to map the GUID strings to numbers, hashing each one to an int that can go in the table. This is not that hard to solve, but it takes some time.

* Ping wants to implement a linking feature that allows media in the database to point to other media that it is a derivative work of, or that are derivative works of it. For example: Jane posts a song - Bob downloads it and uses it in a flash movie - Bob links his movie to Jane's song when he puts it up on the DB - then anyone who downloads either will see that the two items are linked. This is a very cool idea, but it would be somewhat tricky to implement on a distributed network because of aggregation race conditions. This is a solvable problem, but not a trivial one.

Before I go into detail on what I think are the failings of the centralized solution - what are your takes on these technical hurdles? Do you think these issues are that hard to overcome?
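
To make the GUID point above a bit more concrete, here is a minimal sketch in Python (not Drupal code) of one way a string-to-int mapping might work. The GuidMap class, its method names, and the "local-id@node-host" GUID strings are all made up for illustration, and a real version would presumably live in a database table on the aggregator rather than in memory.

```python
class GuidMap:
    """Hands out a stable local integer ID for each GUID string (hypothetical helper)."""

    def __init__(self):
        self._by_guid = {}   # GUID string -> local int ID
        self._by_id = {}     # local int ID -> GUID string
        self._next_id = 1

    def id_for(self, guid):
        """Return the int ID for a GUID, allocating one the first time it is seen."""
        if guid not in self._by_guid:
            self._by_guid[guid] = self._next_id
            self._by_id[self._next_id] = guid
            self._next_id += 1
        return self._by_guid[guid]

    def guid_for(self, local_id):
        """Return the GUID string that a local int ID stands for."""
        return self._by_id[local_id]


# Two nodes may reuse the same local number, but the GUIDs differ,
# so the aggregator ends up with distinct int IDs for its tables.
guids = GuidMap()
song_id = guids.id_for("42@node-a.example.org")    # assumed GUID format
movie_id = guids.id_for("42@node-b.example.org")   # assumed GUID format
```

Since an ID is allocated the first time a GUID is seen, a derivative-work link could in principle be recorded before the linked item itself has been aggregated. That might be one way around the race condition, though I have not thought it through fully.
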
Do you have ideas / shortcuts for how to do it? Would you be willing to help solve them? (I am more than willing.)

OK, Part 2: The problems with a Centralized Media Network

The problems with a centralized solution are not technical or implementation problems, they are design problems. I am convinced that in terms of man-hours, number of headaches, and potential for disaster, creating a centralized system will be vastly more problematic than creating a distributed media network.

* The biggest problem I see with a centralized media site is that hosting will get out of hand. For a while it would be possible to host the site with in-kind donations, but past the primaries it will become impossible. The only solution at that point (due to FEC laws) would be to hand the keys over to DFA. This would require vetting of all media, and is not much of an option. The central solution requires one site on one host to be responsible for all submissions, all filtering, and all searching, while being pinged relentlessly by nodes checking for updates. This would quickly get way out of hand. A decentralized system would greatly reduce the strain on one central server by offloading the work to the nodes on the network. Also, if the cost of hosting the site still gets out of hand even when the site is pared down to categorizing / aggregating / searching only, it would not be a disaster: DFA could pick up the tab on the aggregator. If the site is just aggregating, then perhaps they wouldn't even have to vet. And if they did, this would not interfere with the media on the nodes, just with the media accessible on the central aggregator.

* Nodes will not have any say in how the media system works, and if the DMT site goes down the entire media system is completely sunk - it would be a disaster. Many issues will plague a centrally hosted site: messed-up hosting, FEC laws, and the fact that this entire effort is volunteered.
There will absolutely be downtime for the site, and every time the server goes down nobody gets any new media. This is a huge problem. A decentralized network is vastly more robust: if the central site goes down, all local / state media functionality will still work. Also, because the central system is far simpler (the work is spread over the network), there is far less to go wrong on the one site. Having all the working parts on one server is asking for it.

* Having one central site means that the job of administering / vetting is bottlenecked at that one site. We would need someone watching the posted media items 24/7, or we would have to require admins to vet every item before it is posted. In terms of man-hours this is a HUGE issue. With a decentralized solution this task is offloaded to, or at least shared with, the node admins.

* A centralized system flies in the face of the most fundamental and important network design principles. See the end-to-end principle:
http://www.wikipedia.org/wiki/End-to-end_arguments_in_system_design
A lot of very smart people (smarter than us) with a lot of experience in designing networks have tackled similar issues before, and unequivocally the answer is: distributed is better than centralized.

What does everyone else think?

-Zack
