Last night I spent a fair amount of time going over the technical
hurdles of creating a distributed system. They were larger than I had
previously expected, but by no means unworkable.  Undoubtedly, building
the media system as a distributed network would require more
engineering effort in the beginning, but the undesirable implications of
a centralized solution vastly outweigh the amount of
effort it would take to design and build this thing right.

The issue: A fundamental design decision in how to build the media
module / media network functionality.  The choice is between a
centralized submission / viewing media site - or a distributed media
network that is aggregated up to a centrally searchable site.  

Part 1: The problems of a Distributed Media Network

The biggest technical concerns Ping has about a decentralized solution
are listed here:
http://www.hack4dean.org/phpwiki/index.php?CentralizedMediaDatabase

Here is my take on these issues:

 * Solving the taxonomy problem is fairly easy.  Either have the media
nodes pull the taxonomy down from a central source, or give stiff
warnings to admins who might decide to come up with their own vocabs
that, if they do, the aggregator may not understand or make available
their local media.

 * Using Drupal's taxonomy system to categorize media on the aggregator
poses a problem if you want to have a GUID on media such as
[EMAIL PROTECTED] - which would be required on a decentralized system -
because the UIDs are stored as ints in the table.  The easiest solution
to this problem is to map the string GUIDs to numbers, storing a hash of
each one as the int in the table.  This is not that hard to solve, but
it takes some time.
 
 * Ping wants to implement a linking feature that allows media in the
database to point to other media that it is a derivative work of, or
that are derivative works of it.  For example: Jane posts a song - Bob
downloads it and uses it in a flash movie - Bob links his movie to
Jane's song when he puts it up on the DB - then anyone who downloads
either will see that the two items are linked.  This is a very cool
idea, but would be somewhat tricky to implement with a distributed
network because of aggregation race conditions (a link can reach the
aggregator before the item it points to).  This is a solvable problem,
but not trivial.
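
To make the GUID and linking points above a bit more concrete, here is a
rough Python sketch (not Drupal code).  Everything in it is an
assumption for illustration: the Aggregator class, the guid_to_int
helper, the 31-bit truncated SHA-1 hash, and the example GUIDs are all
made up.  It hashes string GUIDs to ints that fit an int column, and it
parks a derivative-work link whose target has not been aggregated yet
until that target shows up, which is one simple way around the
aggregation ordering problem.

```python
import hashlib


def guid_to_int(guid):
    """Hash a GUID string to a non-negative int that fits a signed 32-bit column."""
    digest = hashlib.sha1(guid.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF


class Aggregator:
    """Toy in-memory aggregator (hypothetical); a real one would use DB tables."""

    def __init__(self):
        self.guid_by_id = {}  # int id -> GUID string
        self.links = set()    # resolved (derivative id, original id) pairs
        self.pending = {}     # missing original GUID -> ids of waiting derivatives

    def add_media(self, guid, derived_from=()):
        media_id = guid_to_int(guid)
        # A truncated hash can collide, so refuse to overwrite a different GUID.
        known = self.guid_by_id.setdefault(media_id, guid)
        if known != guid:
            raise ValueError("hash collision: %r vs %r" % (known, guid))
        for original in derived_from:
            original_id = guid_to_int(original)
            if self.guid_by_id.get(original_id) == original:
                self.links.add((media_id, original_id))
            else:
                # Target not aggregated yet: park the link instead of failing.
                self.pending.setdefault(original, set()).add(media_id)
        # Resolve any links that were waiting for this item to arrive.
        for waiting_id in self.pending.pop(guid, set()):
            self.links.add((waiting_id, media_id))
        return media_id

    def linked_to(self, guid):
        """Return GUIDs linked to this item in either direction."""
        media_id = guid_to_int(guid)
        related = set()
        for derivative, original in self.links:
            if derivative == media_id:
                related.add(self.guid_by_id[original])
            elif original == media_id:
                related.add(self.guid_by_id[derivative])
        return related


# Bob's movie reaches the aggregator before Jane's song does.
agg = Aggregator()
agg.add_media("movie-1@bob.example", derived_from=["song-1@jane.example"])
agg.add_media("song-1@jane.example")
print(agg.linked_to("song-1@jane.example"))
```

A lookup table that hands out sequential ints would avoid collisions
entirely; the hash is used here only because it lets every node compute
the same int without asking the aggregator.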

Before I go into detail on what I think are the weaknesses of the
centralized solution - what are your takes on these technical hurdles?
Do you think these issues are that hard to overcome? Do you have ideas /
shortcuts for how to do it? Would you be willing to help solve them? (I
am more than willing).

Part 2: The problems with a Centralized Media Network

The problems with a centralized solution are not technical or
implementation problems; they are design problems. I am convinced that
in terms of man-hours / # of headaches / potential for disaster,
creating a centralized system will be vastly more problematic than
creating a distributed media network.

 * The biggest problem I see with creating a centralized media site is
that hosting will get out of hand.  For a while it would be possible to host
the site with in-kind donations, but past the primaries it will become
impossible.  The only solution at that point (due to FEC laws) would be
to hand the keys over to DFA. This would require vetting of all media,
and is not much of an option. The central solution requires one site on
one host being responsible for all submissions, all filtering, all
searching, and being pinged relentlessly by nodes checking for updates.
This would quickly get way out of hand.  A decentralized system would
greatly reduce the strain on one central server (by offloading the task
to the nodes on the network). Also - if the costs of hosting the site
still get out of hand even when the site is pared down to categorizing /
aggregating / searching only - it would not be a disaster.  DFA could
pick up the tab on the aggregator.  If the site is just aggregating,
then perhaps they wouldn't even have to vet.  And if they did, this would
not interfere with the media on the nodes, just the media accessible on
the central aggregator.

 * Nodes will not have any say in how the media system works, and if the
DMT site goes down the entire media system is completely sunk - it would
be a disaster.  Many issues will plague a centrally hosted site:
messed-up hosting, FEC laws, and the fact that this entire effort is run
by volunteers.  There will absolutely be downtime for the
site, and every time the server goes down nobody gets any new media.
This is a huge problem.  A decentralized network is vastly more robust.
If the central site goes down, all local / state media functionality
will still work.  Also, because the central site in a distributed
design is far simpler (the work is spread over the network), there is
far less to go wrong on the one site.  Having all the working parts on
one server is asking for it.

 * Having one central site means that the job of admin'ing / vetting
is bottlenecked at that one site.  This means that we would have to have
someone watching the media items posted 24/7, or require all items to be
vetted by admins before being posted - in terms of man-hours this is a
HUGE issue.  With a decentralized solution this task is offloaded to, or
at least shared by, the node admins.

 * A centralized system flies in the face of the most fundamental and
important network design principles.

End-to-end principle:
http://www.wikipedia.org/wiki/End-to-end_arguments_in_system_design

A lot of very smart people (smarter than us) with a lot of experience in
designing networks have tackled similar issues before, and unequivocally
the answer is: distributed is better than centralized.

What does everyone else think?

-Zack
