Here is an update to my original post about DistribNet with a good deal
of information added.
I really like the general idea behind Freenet; however, I believe
Freenet is overly concerned about anonymity. Therefore, unless someone
talks me out of it, I am strongly considering starting my own
project called DistribNet which will be similar to Freenet but
different in a number of key areas. It will also try to avoid some of
the problems Freenet has been having.
*** Comparison to Freenet
*) Focus more on speed and scalability than anonymity. The goal of
DistribNet is to be as fast as or faster than the Web for any page
of reasonable popularity.
*) No fancy datastore. Use the file system for storing keys. No
attempt to disguise what is in one's own datastore. Nothing is
encrypted by default.
Since data is stored in a straightforward manner there is little
chance the "datastore" will get corrupted and have to be reset.
Also, since data is no longer in a fixed-size file, the size of the
datastore can be controlled via both soft and hard quotas.
Finally, support will be added for shrinking the datastore so that
there is no reason anyone cannot donate almost all of
their unused space to DistribNet.
*) The protocol will be well defined and kept as simple as possible.
Transferring of data from one node to another will likely use the
HTTP protocol for simplicity.
*) By default no attempt will be made to prevent other nodes from
knowing what is in another node's datastore.
*) Data will not have to be routed through other nodes. Instead most
data will be sent directly from one node to another.
Please note the "by default" part. The eventual goal is to support the
same level of anonymity that Freenet offers, but that is not
DistribNet's primary focus.
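
A minimal Python sketch of the file-system datastore with soft and hard
quotas described above. The class name, the one-file-per-key layout, and
the exact quota semantics (soft quota as a signal to start evicting, hard
quota as a refusal to store) are assumptions made for illustration, not
part of the proposal. Keys are assumed to be strings that are safe to use
as file names.

```python
import os
import tempfile


class FileDatastore:
    """Hypothetical datastore: one plain, unencrypted file per key."""

    def __init__(self, root, soft_quota, hard_quota):
        self.root = root
        self.soft_quota = soft_quota  # bytes; above this, eviction should start
        self.hard_quota = hard_quota  # bytes; never exceeded
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, key)

    def has(self, key):
        return os.path.exists(self._path(key))

    def size(self):
        """Total number of bytes currently stored."""
        return sum(os.path.getsize(self._path(name))
                   for name in os.listdir(self.root))

    def store(self, key, data):
        """Store data under key; refuse if the hard quota would be exceeded."""
        old = os.path.getsize(self._path(key)) if self.has(key) else 0
        if self.size() - old + len(data) > self.hard_quota:
            return False
        with open(self._path(key), "wb") as f:
            f.write(data)
        return True

    def load(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

    def over_soft_quota(self):
        return self.size() > self.soft_quota


with tempfile.TemporaryDirectory() as tmp:
    store = FileDatastore(tmp, soft_quota=10, hard_quota=20)
    stored_small = store.store("aa11", b"12345678")      # 8 bytes stored
    stored_more = store.store("bb22", b"12345678")       # 16 bytes stored
    needs_eviction = store.over_soft_quota()             # 16 > soft quota
    stored_too_big = store.store("cc33", b"1234567890")  # would reach 26 bytes
    round_trip = store.load("aa11")
```

Because every key is an ordinary file, shrinking the datastore amounts to
lowering the quotas and deleting files until the total fits again.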
However, DistribNet will be the same as Freenet in several key areas.
*) Will allow anyone to anonymously post content to the network
*) Completely decentralized
*) Content will be stored in a fashion similar to how data is stored in
Freenet.
In addition DistribNet will differ from Freenet as it is now with:
*) The ability for one to share content that is on one's hard drive
or be able to fetch content from the Web or other networks when it
is more efficient to do so.
*) Searching and support for "updateable" keys will be built into the
protocol from the beginning. The searching facility will be
designed in such a way to make message boards trivial to implement.
*) Will try very hard to keep all but the most unpopular content from
falling off the network. I have not worked all the details out yet
but basically before a node deletes content it will check to see
that other nodes have the content in their datastores. If not
enough nodes have the content, it won't delete it unless it is
truly unpopular content that has been around for a while, or it can
find another node to accept the content it wants to get rid of.
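
The deletion rule in the last bullet can be sketched as a small Python
predicate. The function name, its parameters, and the idea of a fixed
replica threshold are hypothetical; the post leaves these details open.

```python
def may_delete(replica_count, min_replicas, is_old_and_unpopular, found_taker):
    """Decide whether a node may drop a key from its datastore.

    replica_count        -- how many other nodes reported holding the key
    min_replicas         -- hypothetical threshold for "enough nodes"
    is_old_and_unpopular -- key is truly unpopular and has been around a while
    found_taker          -- another node agreed to accept the content
    """
    if replica_count >= min_replicas:
        return True  # enough other copies exist
    # Too few copies: only delete unpopular content or content handed off.
    return is_old_and_unpopular or found_taker


well_replicated = may_delete(5, 3, False, False)
rare_but_wanted = may_delete(1, 3, False, False)
rare_and_stale = may_delete(1, 3, True, False)
rare_handed_off = may_delete(1, 3, False, True)
```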
By providing support for updateable keys from the start and using a
simple datastore which will be very hard to corrupt and should never
have to be reset, I hope to eliminate most queries for nonexistent
data, which I have a feeling account for a good deal of the network
traffic in Freenet.
DistribNet may also be able to participate in other networks like
Freenet itself and Gnutella. However, due to the differences in the
way DistribNet and the other networks operate this may not be
possible.
*** Philosophy behind DistribNet
For most types of content the level of anonymity that Freenet offers is
simply not needed. Even for copyrighted and censored material there
is, in general, little risk in actually viewing the information
because it is simply impractical to go after every single person who
accesses forbidden information. Almost all of the time the lawsuits and
such are after the original distributors of the information and not the
viewers. Therefore DistribNet will offer the same level of anonymity
that Freenet offers for distributing information, but not for actually
viewing it. However, since there *is* some information where even
viewing it is extremely risky, DistribNet will eventually be able to
provide the same level of anonymity that Freenet offers, but it will
be completely optional.
I also believe that knowing what is in one's own datastore and being
able to block certain types of material from one's own node is not that
big of a deal. Unless almost everyone blocks a certain type of
information the availability of blocked information will not be
harmed. This is because even if 90% of the nodes block say, kiddie
porn, the information will still be available on the other 10% of the
nodes which, if the network is designed correctly, should be more than
enough for anyone to get at blocked information. Furthermore, since
the source code for DistribNet will be protected under the GPL or
similar license, it will be completely impractical for others to force
a significant number of nodes to block information.
*** DistribNet Architecture
I have not worked out all the details of how DistribNet will work, but
here is what I have so far:
There will essentially be two types of keys: map keys and data keys.
Map keys will be uniquely identified in a manner similar to Freenet's
SSK keys. Data keys will be identified in a manner similar to Freenet's
CHK keys.
Map keys will contain the following information:
* Short Description
* Public Namespace Key
* Timestamped Index pointers
* Timestamped Data pointers
_At any given point in time_ each map key will only be associated with
one index pointer and one data pointer. Map keys can be updated by
appending a new index or data pointer to the existing list. By
default, when a map key is queried only the most recent pointer will
be returned. However, older pointers are still there and may be
retrieved by specifying a specific date. Thus, map keys may be
updated, but information is never lost or overwritten.
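
A minimal Python sketch of the append-only, timestamped pointer list
described above. The class and method names are hypothetical, timestamps
are plain numbers, and only data pointers are modelled; index pointers
would work the same way.

```python
import bisect


class MapKey:
    """Hypothetical map key holding an append-only list of data pointers."""

    def __init__(self, description, namespace_key):
        self.description = description
        self.namespace_key = namespace_key
        self._stamps = []    # timestamps, kept in increasing order
        self._pointers = []  # _pointers[i] became current at _stamps[i]

    def update(self, timestamp, pointer):
        """Append a new pointer; older pointers are never overwritten."""
        if self._stamps and timestamp <= self._stamps[-1]:
            raise ValueError("updates must be appended in time order")
        self._stamps.append(timestamp)
        self._pointers.append(pointer)

    def lookup(self, at=None):
        """Return the newest pointer, or the one that was current at `at`."""
        if not self._stamps:
            return None
        if at is None:
            return self._pointers[-1]
        i = bisect.bisect_right(self._stamps, at)
        return self._pointers[i - 1] if i > 0 else None


page = MapKey("example page", "hypothetical-public-key")
page.update(100, "data-key-v1")
page.update(200, "data-key-v2")
latest = page.lookup()
older = page.lookup(at=150)
before_any = page.lookup(at=50)
```

Here `latest` is the second pointer, `older` is the first, and
`before_any` is `None` because no pointer existed at that time.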
Data keys will be very much like Freenet's CHK keys except that they will
not be encrypted. Since they are not encrypted, delta compression may
be used to save space.
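
As a rough illustration, a data key can be derived from a hash of the
unencrypted content, and a new version can be compressed against an older
one. Both choices below are assumptions: the post does not name a hash
function (SHA-256 is arbitrary here), and zlib's preset dictionary is only
a simple stand-in for a real delta compressor (it only helps within zlib's
32 KB window).

```python
import hashlib
import zlib


def data_key(data):
    """Content-hash key: the same bytes always give the same key."""
    return hashlib.sha256(data).hexdigest()


def verify(key, data):
    """Check that downloaded data really belongs to the requested key."""
    return data_key(data) == key


def delta_compress(base, new):
    """Compress `new`, using `base` as a preset dictionary."""
    compressor = zlib.compressobj(zdict=base)
    return compressor.compress(new) + compressor.flush()


def delta_expand(base, delta):
    """Recover the new version from `base` and the compressed delta."""
    decompressor = zlib.decompressobj(zdict=base)
    return decompressor.decompress(delta) + decompressor.flush()


old_version = b"DistribNet stores plain data keys on the file system.\n" * 4
new_version = old_version + b"One more line was appended.\n"
key = data_key(new_version)
delta = delta_compress(old_version, new_version)
restored = delta_expand(old_version, delta)
```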
There will not be anything like Freenet's KSK keys as those proved to
be completely insecure. Instead, map keys may be requested without a
signature. If there is more than one map key by that name then a list
of keys is presented, sorted by popularity. To make such a list
meaningful every public key in DistribNet will have a descriptive
string associated with it.
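
A short Python sketch of an unsigned lookup by name. The record layout and
the numeric `popularity` field are hypothetical; how popularity would be
measured is not specified in the post.

```python
from collections import namedtuple

# Hypothetical record for one map key known to a node.
MapKeyEntry = namedtuple("MapKeyEntry",
                         "name namespace_key description popularity")


def lookup_by_name(entries, name):
    """Return every map key with this name, most popular first."""
    matches = [entry for entry in entries if entry.name == name]
    return sorted(matches, key=lambda entry: entry.popularity, reverse=True)


known = [
    MapKeyEntry("news", "pubkey-a", "Alice's news page", 12),
    MapKeyEntry("news", "pubkey-b", "Bob's news page", 40),
    MapKeyEntry("music", "pubkey-c", "Carol's music list", 7),
]
news = lookup_by_name(known, "news")
```

The descriptive string is what lets a user tell the two "news" keys apart
when both are presented.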
Queries for keys will be handled similarly to Freenet, but instead of
returning the actual data, a pointer to a node which can easily
provide the data is returned. The data can then be directly
transferred from one node to another. Once transferred the data will be
cached in the local node. If a particular node notices a large number
of queries for a key that it does not have it may choose to store a copy
in its own cache, thereby providing performance benefits similar to those
Freenet's routing provides.
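
The "cache what is often asked for" rule could look roughly like this in
Python. The class name and the fixed threshold are assumptions; the post
only says "a large number" of queries.

```python
from collections import Counter


class QueryTracker:
    """Hypothetical per-node counter of queries for keys it does not hold."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed cut-off for "a large number"
        self.counts = Counter()
        self.local_keys = set()

    def on_query(self, key):
        """Record a query; return True if the node should fetch a copy."""
        if key in self.local_keys:
            return False  # already cached, nothing to decide
        self.counts[key] += 1
        return self.counts[key] >= self.threshold


tracker = QueryTracker(threshold=3)
decisions = [tracker.on_query("popular-key") for _ in range(3)]
tracker.local_keys.add("popular-key")  # the node downloaded and cached it
after_caching = tracker.on_query("popular-key")
```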
*** Pseudo Code and Implementation Notes
Here is the beginning of how I think the network should function. I
only deal with data keys here and very little is done in terms of
routing or protecting the network against attacks. Also, even though
the code presented here is serial, when actually implemented a good
deal of the network stuff will be done in parallel, both by using
threads and non-blocking IO. Threads will be kept to a minimum.
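
To illustrate what querying several nodes in parallel without one thread
per node might look like, here is a small Python sketch using `asyncio`.
This is only a stand-in for non-blocking IO in general; `query_node` is a
hypothetical placeholder that does no real networking.

```python
import asyncio


async def query_node(node_id, key):
    """Placeholder for a non-blocking network query to one node."""
    await asyncio.sleep(0)  # yields control, as real IO would
    return (node_id, key)


async def query_all(node_ids, key):
    """Start all queries at once and wait for every answer."""
    tasks = (query_node(node_id, key) for node_id in node_ids)
    return await asyncio.gather(*tasks)


results = asyncio.run(query_all(["n1", "n2", "n3"], "some-key"))
```

`asyncio.gather` returns the answers in the same order as the queries were
given, regardless of which one finishes first.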
I have not decided what language I will use but I most likely will be
doing this in C and C++ using POSIX system calls. It should also work
under Win32 using GCC and the Cygwin library; however, I will rely on
someone else to test and debug the Win32 port as I personally hate
Windows and only use it when I have to.
# Note: Functions in mixed case LikeThis will likely involve
# contacting another node over the network. Functions in lower case
# are local functions
# Global data structures
Node:
  other nodes
  key index
  local keys
OtherNode:
  id
  query response time
  average download time
  network distance
  reliability
KeyIndex:
  key
  last query:
    try
    date
  query log
  node list (where the data can be downloaded from)
LocalKey:
  key
  query log
  data
QUERY_BRANCH_FACTOR = 3;
UPLOAD_BRANCH_FACTOR = 5;
# DataQuery returns a list of nodes which can easily make the key
# available for download or AlreadyQueried if this node has already
# been queried for a given handle
DataQuery (key, try, handle)
  if already_responded(handle)
    return AlreadyQueried
  key_info = key_index[key];
  decrement try
  # a cached answer from an equally deep or deeper search is reused
  if try = 0 OR
     (not expired key_info.last_query AND key_info.last_query.try >= try)
    return key_info.node_list
  candidate_nodes = select_candidate_nodes(key, FOR_QUERY)
  nodes_queried = 0;
  need_to_query = min(try, QUERY_BRANCH_FACTOR)
  while (nodes_queried < need_to_query AND candidate_nodes not empty)
    node = candidate_nodes.pop;
    result = node.DataQuery(key,
                            random_round(try / QUERY_BRANCH_FACTOR),
                            handle)
    if result != AlreadyQueried
      push result onto key_info.node_list
      increment nodes_queried
  loop
  key_info.last_query.try = try
                     .date = NOW
  return key_info.node_list
End
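
The pseudocode does not define `random_round`. One plausible reading,
sketched below in Python, is probabilistic rounding: the fractional part
of `try / QUERY_BRANCH_FACTOR` becomes the chance of rounding up, so the
search budget handed to each neighbour is correct on average. This
interpretation is an assumption.

```python
import math
import random


def random_round(x, rng=random):
    """Round non-negative x down or up, rounding up with probability frac(x)."""
    whole = math.floor(x)
    frac = x - whole
    return whole + (1 if rng.random() < frac else 0)


QUERY_BRANCH_FACTOR = 3
rng = random.Random(1)  # fixed seed so the example is repeatable
samples = [random_round(5 / QUERY_BRANCH_FACTOR, rng) for _ in range(10000)]
average = sum(samples) / len(samples)  # close to 5/3
exact = random_round(6 / QUERY_BRANCH_FACTOR)  # no fractional part
```

With plain truncation a `try` of 5 would pass only 1 to each neighbour and
the remainder of the budget would be lost at every hop.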
# RetrieveData attempts to fetch a key from the network. It returns
# the actual data or an error
RetrieveData (key)
  if have_key
    return local data
  try = 1
  key_info = nil
  until acceptable_node(key_info) or try > MAX_TRY
    key_info = DataQuery(key, try, create_handle)
    try = next_try_level(try);
  loop
  if key_info = {}
    return DataNotFound
  node_to_use = best_node(key_info)
  return node_to_use.Download(key)
End
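
`next_try_level` is left undefined in the pseudocode. A simple possibility
is to double the search budget after every failed round, which keeps the
number of rounds small while starting with cheap, shallow searches. The
doubling factor and the `MAX_TRY` value below are assumptions.

```python
def try_levels(max_try, start=1, factor=2):
    """Yield successive try levels until max_try is exceeded."""
    level = start
    while level <= max_try:
        yield level
        level *= factor


MAX_TRY = 20  # hypothetical limit
levels = list(try_levels(MAX_TRY))
```

With these values a retrieval would issue at most five queries, with
budgets 1, 2, 4, 8 and 16.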
# Download the key from this node
Download (key)
  unless have_key
    return DontHaveKey
  return Data
End
discover
  foreach other_node in other_nodes
    ExchangeInfo(this_node, other_node)
  loop
End
# Upload a key to the network. Returns a list of nodes the data was
# uploaded to.
Upload (key, data, try, handle)
  if (will_accept_key(key))
    store_key(key, data)
  decrement try
  candidate_nodes = select_candidate_nodes(key, FOR_UPLOAD)
  nodes_queried = 0;
  need_to_query = min(try, UPLOAD_BRANCH_FACTOR)
  nodes_with_data = {}
  while (nodes_queried < need_to_query AND candidate_nodes not empty)
    node = candidate_nodes.pop;
    result = node.Upload(key, data,
                         random_round(try / UPLOAD_BRANCH_FACTOR),
                         handle)
    next if result = AlreadyUploaded
    # verify that it made it
    node = select_random_node(result)
    new_data = node.Download(key)
    next if new_data != data # i.e. the download failed
    push result onto nodes_with_data
    increment nodes_queried
  loop
  # verify that the data is now properly indexed
  result = DataQuery(key, MAX_TRY, create_handle);
  result = nodes_with_data also_in result
  if (|nodes_with_data| - |result| > ERROR_THRESHOLD)
    ... what to do?
  return result
End
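
The final indexing check in Upload is a set intersection followed by a
count comparison. A Python sketch, with a hypothetical function name and
threshold value, and with node identifiers as plain strings:

```python
def check_indexing(nodes_with_data, query_result, error_threshold):
    """Return (nodes both uploaded to and indexed, whether too many are missing)."""
    uploaded = set(nodes_with_data)
    indexed = uploaded & set(query_result)  # "also_in" in the pseudocode
    missing = len(uploaded) - len(indexed)
    return indexed, missing > error_threshold


uploaded_to = ["n1", "n2", "n3", "n4"]
found_by_query = ["n2", "n4", "n9"]
indexed, too_many_missing = check_indexing(uploaded_to, found_by_query,
                                           error_threshold=1)
```

In this example two of the four uploads are not yet visible to a query,
which exceeds the threshold of one. What the node should do in that case
is the open question marked in the pseudocode.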
*** Conclusion
I really think I can make this work and I strongly believe such a
network has great potential. My eventual hope is that it will replace
networks like Gnutella and Morpheus and will also eliminate the need
for personal home pages on pop-up-city web sites.
--
http://kevin.atkinson.dhs.org