Hi Ximin,
Thanks for your answer. I will rephrase what you wrote (adding my own view) to
check if I understand.
I understand that each node constructs a SkeletonBTreeMap (a huge index) that
in the long term will contain all the successful searches initiated by that
node or that passed through that node. Using the SkeletonBTreeMap, each node
has a partial view of the documents stored in the Freenet system, limited to
the documents that passed through that node.
The paragraph starting with "For Library's B-Tree, this is not feasible" is
hard to understand.
Since the datastore uses LRU-like cache replacement, old files are deleted when
a node's datastore size is exceeded. This should be reflected in the huge index
maintained by every remote node that has links to the recently deleted file.
But it is not possible to reflect it: nodes don't know when the links/items in
their index are no longer valid (i.e., when a remote node has deleted the file
as part of its LRU replacement policy).
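To make the staleness problem concrete, here is a minimal Python sketch. It is
not Freenet or Library code: the class LruDatastore, the dict n1_index, and the
capacity of two files are all assumptions made up for illustration. It only
shows that an LRU eviction on one node leaves another node's private index
entry pointing at a file that is gone.

```python
from collections import OrderedDict


class LruDatastore:
    """Toy stand-in for a node's datastore with LRU eviction (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # key -> data, least recently used first

    def put(self, key, data):
        self.files[key] = data
        self.files.move_to_end(key)
        # Evict the least recently used entries once capacity is exceeded.
        while len(self.files) > self.capacity:
            self.files.popitem(last=False)

    def get(self, key):
        if key not in self.files:
            return None
        self.files.move_to_end(key)  # a hit refreshes the entry
        return self.files[key]


# n4 holds at most two files; n1 remembers where it last saw key 200.
n4_store = LruDatastore(capacity=2)
n4_store.put(200, b"file-200")
n1_index = {200: "n4"}  # n1's private note: "key 200 was reachable via n4"

# Two newer files push key 200 out of n4's datastore.
n4_store.put(300, b"file-300")
n4_store.put(400, b"file-400")

# n1 was never told about the eviction, so its index entry is now stale.
stale = n1_index.get(200) == "n4" and n4_store.get(200) is None
print(stale)
```

Nothing in this sketch notifies n1, which is exactly the gap described above.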
If I understood correctly, we can continue discussing what is written below. If
not, you can ignore the part below and give me more help to follow your
previous e-mail.
In that case, maybe the solution is an announcement policy in which a node
broadcasts to its neighbors that a given file (key) is no longer valid. Such
messages would be harmless and not too promiscuous, wouldn't they? However,
neighbors who didn't know that the file was available through that node will
learn it through that kind of announcement. Is that a problem?
For instance, take the following scenario:
(assumptions: for simplicity 1 identity per node, index items are simplified to
{key,location})
Neighbors of Node1(n1): {n2, n3, n4, n5, n7}
Neighbors of Node4(n4): {n1, n30, n7}
Scenario:
1) Request of n5 to n1: "Give me file with key=200"
2) n1 index contains {key=201,n4}
3) n1 decides to forward request to n4: "Give me file with key=200"
4) n4 answers to n1 with file
5) n1 forwards file to n5
6) n1 stores a copy of the file in its cache datastore
7) n1 stores {key=200,n4} in its index
8) n1 stores {key=200,n5} in its index (n1 does not know n5 is the final
destination).
9) n1 receives another request for key 200; it doesn't need to forward the
request because the file is stored in its cache
10) (time passes) file is deleted from n1's LRU cache
11) (time passes) n1 receives another request for key 200 from n3.
12) n1 needs to decide whether to forward the request to n4 or n5. Not sure
what the criterion is here, maybe it uses the node's reputation (WoT).
13) Some time later, file with key=200 is deleted from n4's datastore because
of LRU policy.
14) n4 broadcasts to its neighbors n1, n30, n7 that key=200 is eliminated
However, the fact that key=200 could be found through n4 may imply that node n4
knows how to get key=200. There is another reason that makes me think this
assumption is valid: nodes with similar keys are clustered together, aren't
they? Then, which routing decision is better for key=200: n5, or still n4? *At
this point I am bewildered.*
I am not sure whether n1 should delete {key=200,n4} from its index when it
receives the n4 broadcast.
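The scenario can be replayed with a small Python sketch. Everything in it is an
assumption made for illustration and is not how Freenet actually routes: the
Node class, the "closest indexed key" routing rule, and in particular the
choice that n1 drops {key=200,n4} when it receives the broadcast, which is just
one of the options questioned above.

```python
class Node:
    """Toy node: a datastore plus an index of {key: [neighbor names]}."""

    def __init__(self, name):
        self.name = name
        self.neighbors = {}  # name -> Node
        self.store = {}      # key -> data (eviction is triggered by hand)
        self.index = {}      # key -> list of neighbor names

    def connect(self, other):
        self.neighbors[other.name] = other
        other.neighbors[self.name] = self

    def remember(self, key, name):
        names = self.index.setdefault(key, [])
        if name not in names:
            names.append(name)

    def route(self, key, exclude):
        # Assumed rule: use the index entry whose key is closest to `key`.
        candidates = [
            (abs(k - key), name)
            for k, names in self.index.items()
            for name in names
            if name != exclude
        ]
        return min(candidates)[1] if candidates else None

    def request(self, key, sender=None):
        if key in self.store:                     # step 9: served locally
            return self.store[key]
        target = self.route(key, exclude=sender)  # steps 2-3 and 12
        if target is None:
            return None
        data = self.neighbors[target].request(key, sender=self.name)
        if data is not None:
            self.store[key] = data                # step 6
            self.remember(key, target)            # step 7
            if sender is not None:
                self.remember(key, sender)        # step 8
        return data

    def evict(self, key, announce=False):
        self.store.pop(key, None)
        if announce:                              # step 14
            for neighbor in self.neighbors.values():
                neighbor.on_invalidate(key, self.name)

    def on_invalidate(self, key, name):
        # Assumed reaction: forget that `name` can serve `key`.
        names = self.index.get(key, [])
        if name in names:
            names.remove(name)


nodes = {n: Node(n) for n in ("n1", "n2", "n3", "n4", "n5", "n7", "n30")}
for n in ("n2", "n3", "n4", "n5", "n7"):
    nodes["n1"].connect(nodes[n])
for n in ("n30", "n7"):
    nodes["n4"].connect(nodes[n])
n1, n4 = nodes["n1"], nodes["n4"]

n4.store[200] = b"file-200"
n1.remember(201, "n4")            # step 2
n1.request(200, sender="n5")      # steps 1-8
n1.evict(200)                     # step 10
n1.request(200, sender="n3")      # steps 11-12
n4.evict(200, announce=True)      # steps 13-14
print(n1.index)
```

With this rule n1 still keeps {key=201,n4}, so n4 remains a candidate for
nearby keys even after the broadcast; whether that is desirable is the open
question.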
Some comments: I am not sure if it is possible to store a key with multiple
locations as in steps 7 and 8; I guess it is. I am still confused about
location swapping and what its consequences are for the node's index.
Maybe a silly question, but what do you mean by "top-level data structure"?
What is the top-level data structure of Freenet?
Regarding your security note[1]: I am not sure what you mean. I suppose that
you refer to the fact that a node's datastore cannot be accessed remotely.
Users only send requests to other nodes asking for a file, and the remote node
checks its datastore and answers the request. Thus, the storage is accessed
only locally.
Regarding what I am trying to achieve: I am looking to somehow speed up search
and to share bookmarks (specialized in some terms) among a group of people,
probably by using PSK, maybe including some friends of friends. I also want to
improve the Library code.
Thanks a lot again and forgive my dummy assumptions,
leuchtkaefer
________________________________
From: Ximin Luo <[email protected]>
To: leuchtkaefer <[email protected]>; Discussion of development issues
<[email protected]>
Sent: Tuesday, May 21, 2013 12:06 PM
Subject: Re: [freenet-dev] questions about Library for my GSoC project
On 20/05/13 22:36, leuchtkaefer wrote:
>
>
> Hi infinity0,
>
> My proposal to GSoC13 is highly related to your code (Library).
>
> First, do you have any extra documentation on the code that you think could
> be useful for me to understand the most important parts, such as the
> SkeletonBTreeMap?
>
Hello,
I did Library for GSoC 2009 and back then I was inexperienced with building and
engineering large software codebases (such as freenet and its plugin
ecosystem). There are many aspects of Library that I would do differently today
if I was re-doing that project.
A large part of Library focuses on serialisation of massively-large(1) data
structures, implemented *on top of* freenet's decentralised(2) storage. (1) and
(2) together is what makes the problem hard.[1] For my GSoC 2009 project, I
tried to solve this problem by implementing a load-on-demand local data
structure (SkeletonBTreeMap) that represents the *overall* data structure (a
B-tree) as it exists on freenet storage.
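As a rough illustration of the load-on-demand idea, here is a minimal Python
sketch. It is not Library's code or API: CountingStore, LazyNode, the blob
layout, and the two-level tree are all hypothetical names and simplifications.
The local object mirrors the shape of the whole tree, but a node's contents are
only fetched from storage when a lookup actually descends into it.

```python
import bisect


class CountingStore:
    """Stand-in for remote storage: a dict that counts fetches (hypothetical)."""

    def __init__(self, blobs):
        self.blobs = blobs
        self.fetches = 0

    def fetch(self, ref):
        self.fetches += 1
        return self.blobs[ref]


class LazyNode:
    """Tree node that stays a bare reference until a lookup needs it."""

    def __init__(self, store, ref):
        self.store = store
        self.ref = ref
        self.blob = None      # None means "not fetched yet"
        self.children = None

    def load(self):
        if self.blob is None:
            self.blob = self.store.fetch(self.ref)
            # Children start as unloaded placeholders, not as fetched data.
            self.children = [
                LazyNode(self.store, r) for r in self.blob.get("children", [])
            ]

    def get(self, term):
        self.load()
        if not self.children:  # leaf: entries are stored inline
            return self.blob["entries"].get(term)
        # Internal node: separators[i] is the smallest term of children[i + 1].
        i = bisect.bisect_right(self.blob["separators"], term)
        return self.children[i].get(term)


blobs = {
    "root": {"separators": ["m"], "children": ["leaf-a", "leaf-m"]},
    "leaf-a": {"entries": {"freenet": ["doc1"], "index": ["doc2"]}},
    "leaf-m": {"entries": {"search": ["doc3"]}},
}
store = CountingStore(blobs)
root = LazyNode(store, "root")
print(root.get("search"), store.fetches)
```

Only the root and one leaf are fetched for this lookup; the other leaf remains
an unloaded placeholder until some later lookup needs it.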
By contrast, massively scalable distributed systems such as Bigtable, and even
the underlying freenet DHT storage system, never expose the *overall* data
structure to the clients of those systems - instead they allow piece-by-piece
access, e.g. by row, or by key, and the client never sees the top-level data
structure.
For Library's B-Tree, this is not feasible, because (due to the design of
freenet) we cannot offload computation (i.e. data structure book-keeping) onto
other nodes.[2] It was also not feasible to use freenet's decentralised storage
more directly, because it has certain properties (such as LRU cache) that are
not acceptable for a search index.
So. That was an overview of the abstract algorithmic issues surrounding the
design of Library. Please let me know if any part of what I just said is not
understandable. Every sentence makes an important theoretical point. If you do
not fully understand *any part*, ask me to clarify, otherwise I fear that you
may repeat the same mistakes that I made. This is not exactly a problem since
GSoC is partly about learning - but it would be sub-optimal for the project's
progress.
I'll hold off on answering the rest of your questions to give you a chance to
digest my previous answers. Understanding those will make it easier for you to
understand my answers to the next section - and you may even be able to figure
those answers out for yourself without me explaining it explicitly.
Also, if you give me some context on what you're trying to achieve, I can give
more specific advice.
Let me know how you get along, and good luck!
Ximin
[1] We are lucky that we don't have to further worry about security because the
underlying freenet storage allows us to restrict access to one single user.
[2] Perhaps one day, a system that supports fully homomorphic encryption will
allow this to happen.
> Second, I have some questions:
>
> 1. You disabled the "boolean internal_entries" inside the class
> SkeletonBTreeMap and use option 2. I don't understand what you mean by a
> dummy serialiser that copies task.data to task.meta. What does task.data
> contain?
>
> 2. What does it mean to deflate/inflate the node?
>
> 3. What is a GhostNode? I understood it to be an undesirable structure used to
> contain some metadata or something related to the serialiser, and that it
> needs to be removed.
>
> If you can elaborate more about Library, besides the documentation already
> published in the wiki, it will be of great help.
>
> Thanks in advance,
>
> leuchtkaefer
>
>
>
> _______________________________________________
> Devl mailing list
> [email protected]
> https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
--
GPG: 4096R/5FBBDBCE
https://github.com/infinity0
https://bitbucket.org/infinity0
https://launchpad.net/~infinity0
_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl