Hi Ximin,
Thanks for your answer. I will rephrase what you wrote (adding my own view) to 
check if I understand.


I understand that each node constructs a SkeletonBTreeMap that, in the long 
term, will contain a huge index of all the successful searches initiated by 
that node or that passed through that node. Using the SkeletonBTreeMap, each 
node has a partial view of the documents stored in the Freenet system, limited 
to the documents that passed through that node.

The paragraph starting with "For Library's B-Tree, this is not feasible" is 
hard to understand.

Since the datastore uses an LRU-like cache replacement policy, old files are 
deleted when a node's datastore size is exceeded. This should be reflected in 
the huge index maintained by every remote node that has links to the recently 
deleted file. But it is not possible to reflect it: nodes don't know when the 
links/items in their index are no longer valid (i.e., when a remote node has 
deleted the file as part of its LRU replacement policy).
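
To make the staleness problem concrete, here is a minimal sketch. It assumes a 
toy LRU store and a plain dictionary as the index; neither is Freenet's actual 
datastore nor Library's index format, and all names (LRUDatastore, n4_store, 
n1_index) are hypothetical.

```python
from collections import OrderedDict


class LRUDatastore:
    """Toy fixed-capacity store that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def put(self, key, data):
        # Insert or refresh the key, then evict from the "old" end if needed.
        self.items[key] = data
        self.items.move_to_end(key)
        evicted = []
        while len(self.items) > self.capacity:
            old_key, _ = self.items.popitem(last=False)
            evicted.append(old_key)
        return evicted

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a hit makes the key "recently used"
        return self.items[key]


# Assumption: n4 can hold only two files; n1 records where it saw each key.
n4_store = LRUDatastore(capacity=2)
n1_index = {}

for key in (200, 201, 202):
    n4_store.put(key, b"file-%d" % key)
    n1_index[key] = "n4"  # n1 is never told about later evictions

# Entries in n1's index that no longer match what n4 actually holds.
stale = [k for k in n1_index if n4_store.get(k) is None]
```

In this sketch n1 still lists n4 for key 200 after n4 evicted it, which is the 
inconsistency described above.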

If I understood correctly, we can continue discussing what is written below. 
If not, you can disregard the part below and give me more help to follow your 
previous e-mail.

In that case, maybe the solution is an announcement policy that broadcasts to 
neighbors that a given file (key) is no longer valid. Such messages would be 
harmless and not too promiscuous, wouldn't they? However, neighbors who didn't 
know that the file was available through that node will learn it through these 
announcements. Is that a problem?

For instance, take the following scenario.
Assumptions: for simplicity, 1 identity per node; index items are simplified 
to {key,location}.
Neighbors of Node1 (n1): {n2, n3, n4, n5, n7}
Neighbors of Node4 (n4): {n1, n30, n7}

Scenario:
1) Request of n5 to n1: "Give me file with key=200"
2) n1's index contains {key=201,n4}
3) n1 decides to forward request to n4:  "Give me file with key=200"
4) n4 answers to n1 with file
5) n1 forwards file to n5
6) n1 stores a copy of the file in its cache datastore
7) n1 stores {key=200,n4} in its index
8) n1 stores {key=200,n5} in its index (n1 does not know that n5 is the final 
destination).
9) n1 receives another request for key 200; it does not need to forward the 
request because the file is stored in its cache
10) (time passes) file is deleted from n1's LRU cache
11) (time passes) n1 receives another request for key 200 from n3.
12) n1 needs to decide whether to forward the request to n4 or n5. I am not 
sure what the criterion is here; maybe it uses the node's reputation (WoT).
13) Some time later, the file with key=200 is deleted from n4's datastore 
because of the LRU policy.
14) n4 broadcasts to its neighbors n1, n30, n7 that key=200 has been removed
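
The steps above can be sketched roughly as follows. This is only an 
illustration under the same simplifying assumptions ({key,location} index 
items, one identity per node). The Node class and its methods are 
hypothetical, not Freenet or Library code, and the handler picks just one 
possible answer to the open question: on a removal announcement, n1 drops only 
the entry that points to the announcing neighbor.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # Node objects this node announces to
        self.store = {}      # key -> file data held locally
        self.index = {}      # key -> set of neighbor names seen with that key

    def remember(self, key, neighbor_name):
        """Record that a key was seen via a neighbor (steps 7 and 8)."""
        self.index.setdefault(key, set()).add(neighbor_name)

    def evict(self, key):
        """Drop a key locally and announce it to all neighbors (13 and 14)."""
        self.store.pop(key, None)
        for neighbor in self.neighbors:
            neighbor.on_removed(key, self.name)

    def on_removed(self, key, sender_name):
        """Forget only the index entry that points to the announcing node."""
        holders = self.index.get(key)
        if holders is None:
            return
        holders.discard(sender_name)
        if not holders:
            del self.index[key]


n1, n4, n5 = Node("n1"), Node("n4"), Node("n5")
n4.neighbors = [n1]      # n30 and n7 are omitted to keep the sketch short
n4.store[200] = b"file"

# Steps 3-8: n1 fetches key 200 from n4 for n5, caches it, indexes both peers.
n1.store[200] = n4.store[200]
n1.remember(200, "n4")
n1.remember(200, "n5")

n1.store.pop(200)        # step 10: the file drops out of n1's own cache
n4.evict(200)            # steps 13-14: n4 evicts the file and announces it

remaining = n1.index[200]
```

After the announcement, n1's index for key 200 contains only n5. Whether that 
is the better routing choice is exactly the question raised below.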

However, the fact that key=200 can be found through n4 may imply that node n4 
knows how to get key=200. There is another reason that makes me think this 
assumption is valid: nodes with similar keys are clustered together, aren't 
they? Then, which routing decision is better for key=200: n5, or still n4? 
*at this point I am bewildered*
I am not sure whether n1 should delete {key=200,n4} from its index when it 
receives the n4 broadcast.

Some comments: I am not sure whether it is possible to store a key with 
multiple locations, as in steps 7 and 8; I guess it is. I am still confused 
about location swapping and its consequences for the node's index.

Maybe a silly question, but what do you mean by "top-level data structure"? 
What is the top-level data structure of Freenet?

Regarding your security note[1]: I am not sure what you mean. I suppose that 
you refer to the fact that a node's datastore cannot be accessed remotely. 
Users only send requests to other nodes asking for a file, and the remote node 
checks its datastore and answers the request. Thus, the storage is accessed 
only locally.

Regarding what I am trying to achieve: I am looking to speed up search and to 
share bookmarks (specialized in some terms) among a group of people, probably 
by using PSK, maybe with some friends of friends. I also want to improve the 
Library code.

Thanks a lot again, and forgive my naive assumptions,

leuchtkaefer



________________________________
 From: Ximin Luo <[email protected]>
To: leuchtkaefer <[email protected]>; Discussion of development issues 
<[email protected]> 
Sent: Tuesday, May 21, 2013 12:06 PM
Subject: Re: [freenet-dev] questions about Library for my GSoC project
 

On 20/05/13 22:36, leuchtkaefer wrote:
> 
> 
> Hi infinity0,
> 
> My proposal to GSoC13 is highly related to your code (Library). 
> 
> First, do you have any extra documentation on the code that you think it 
> could be useful for me to understand the most important parts, such the 
> SkeletonBTreeMap?
> 

Hello,

I did Library for GSoC 2009 and back then I was inexperienced with building and
engineering large software codebases (such as freenet and its plugin
ecosystem). There are many aspects of Library that I would do differently today
if I was re-doing that project.

A large part of Library focuses on serialisation of massively-large(1) data
structures, implemented *on top of* freenet's decentralised(2) storage. (1) and
(2) together is what makes the problem hard.[1] For my GSoC 2009 project, I
tried to solve this problem by implementing a load-on-demand local data
structure (SkeletonBTreeMap) that represents the *overall* data structure (a
B-tree) as it exists on freenet storage.
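
The load-on-demand idea can be illustrated with a minimal sketch. It is not 
the SkeletonBTreeMap API: CountingStorage, LazyNode and the dictionary-based 
node layout are hypothetical stand-ins, and the remote storage is simulated 
with an in-memory dictionary.

```python
class CountingStorage:
    """Stand-in for remote storage; counts how many nodes were fetched."""

    def __init__(self, blobs):
        self.blobs = blobs
        self.fetches = 0

    def fetch(self, ref):
        self.fetches += 1
        return self.blobs[ref]


class LazyNode:
    """A tree node whose contents are fetched only when first needed."""

    def __init__(self, storage, ref):
        self.storage = storage
        self.ref = ref
        self.data = None     # None means "skeleton": not loaded yet
        self.children = {}   # child ref -> LazyNode, created on demand

    def load(self):
        if self.data is None:
            self.data = self.storage.fetch(self.ref)
        return self.data

    def get(self, key):
        data = self.load()
        if "entries" in data:                      # leaf node
            return data["entries"].get(key)
        for upper_bound, child_ref in data["children"]:
            if key <= upper_bound:                 # descend into one child
                child = self.children.get(child_ref)
                if child is None:
                    child = LazyNode(self.storage, child_ref)
                    self.children[child_ref] = child
                return child.get(key)
        return None


blobs = {
    "root": {"children": [(100, "left"), (float("inf"), "right")]},
    "left": {"entries": {42: "doc-a"}},
    "right": {"entries": {200: "doc-b"}},
}
storage = CountingStorage(blobs)
root = LazyNode(storage, "root")
result = root.get(200)   # fetches "root" and "right"; "left" stays unloaded
```

The local object stands for the whole tree, but only the nodes on the lookup 
path are ever pulled from storage, and a repeated lookup reuses them.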

By contrast, massively scalable distributed systems such as Bigtable, and even
the underlying freenet DHT storage system, never expose the *overall* data
structure to the clients of those systems - instead they allow piece-by-piece
access, e.g. by row, or by key, and the client never sees the top-level data
structure.

For Library's B-Tree, this is not feasible, because (due to the design of
freenet) we cannot offload computation (i.e. data structure book-keeping) onto
other nodes.[2] It was also not feasible to use freenet's decentralised storage
more directly, because it has certain properties (such as LRU cache) that are
not acceptable for a search index.

So. That was an overview of the abstract algorithmic issues surrounding the
design of Library. Please let me know if any part of what I just said is not
understandable. Every sentence makes an important theoretical point. If you do
not fully understand *any part*, ask me to clarify, otherwise I fear that you
may repeat the same mistakes that I made. This is not exactly a problem since
GSoC is partly about learning - but it would be sub-optimal for the project's
progress.

I'll hold off on answering the rest of your questions to give you a chance to
digest my previous answers. Understanding those will make it easier for you to
understand my answers to the next section - and you may even be able to figure
those answers out for yourself without me explaining it explicitly.

Also, if you give me some context on what you're trying to achieve, I can give
more specific advice.

Let me know how you get along, and good luck!
Ximin

[1] We are lucky that we don't have to further worry about security because the
underlying freenet storage allows us to restrict access to one single user.
[2] Perhaps one day, a system that supports fully homomorphic encryption will
allow this to happen.

> Second, I have some questions:
> 
> 1. You disabled the "boolean internal_entries" inside the 
> classSkeletonBTreeMap and use option 2. I don't understand what do you mean 
> about a dummy serialiser that copies task.data to task.meta. What contains 
> task.data?  
> 
> 2. What means deflate/inflate the node?
> 
> 3. What is a GhostNode? I understood is a not desirable structure used to 
> contain some metadata or sth related with the serializer and needs to be 
> removed.
> 
> If you can elaborate more about Library, besides the documentation already 
> published in the wiki, it will be of great help.
> 
> Thanks in advance,
> 
> leuchtkaefer
> 
> 
> 
> _______________________________________________
> Devl mailing list
> [email protected]
> https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


-- 
GPG: 4096R/5FBBDBCE
https://github.com/infinity0
https://bitbucket.org/infinity0
https://launchpad.net/~infinity0

_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl