Interesting. I wonder about using a DHT, perhaps, though I'm not sure about the likely size...
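
To make the DHT idea a little more concrete, here is a minimal sketch of the key-placement part only: consistent hashing of an account identifier onto a ring of look-up nodes. Everything named here is an assumption for illustration (the `HashRing` class, the `replicas` parameter, the example.org-style URLs); it is not how Webizen or cimba.co work, and a real DHT such as Chord or Kademlia adds routing between peers on top of this.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring (SHA-1 based)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Hypothetical consistent-hash ring: account URI -> look-up node."""

    def __init__(self, nodes, replicas=64):
        # Each node gets several virtual points so keys spread more evenly.
        ring = []
        for node in nodes:
            for i in range(replicas):
                ring.append((_hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point after hash(key)."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    # Placeholder node and account URLs, not real services.
    ring = HashRing(["https://lookup-a.example", "https://lookup-b.example"])
    print(ring.node_for("https://alice.example/profile/card#me"))
```

The point of the ring is that adding or removing a look-up node only moves the keys that node owned, so the index can grow without knowing the final size up front.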

Webizen is a service [0] / repo [1]. Assuming RWW clustering of accounts (i.e. provider / subdomains, et al.), perhaps the base install uses a look-up service that it is pointed at, like a time server? There is no point decentralising at the account level.

Equally, one might consider that the server would index its own records, and perhaps a relationship graph out to a var. int.

Melvin's been dealing with decentralised block-chain storage. I imagine this is a similar challenge.

[0] http://webizen.org/
[1] https://github.com/linkeddata/webizen

Tim.H.

On 19 January 2015 at 04:18, Brent Shambaugh <[email protected]> wrote:

> Andrei (and others in the reply all?),
>
> Last year you gave a talk about cimba.co at MIT. During the Q&A there was
> some discussion about what sort of index or triple retrieval mechanism
> there would be. Sandro Hawke put up the talk, which I linked to here [0]. I
> was wondering if you came up with something.
>
> Thanks for your time.
>
> My thoughts:
>
> From what I have read, it is difficult to index everything. The best you
> can do is index triples that are "important" and that will eventually lead
> you to less important triples that you might want.
>
> Perhaps this is accomplished by some form of semantic clustering? Perhaps
> this clustering is accomplished by some sort of distributed RDF store, such
> as Swarm Linda [1]. Or perhaps this clustering is accomplished by only
> indexing the names of linked data containers with some sort of description
> about what they are about. Or perhaps, collections, which seem to have less
> structure defined about what they are about and can exist (iirc) at
> multiple Network nodes with different ownership, are described in some way
> and cleaned up to be more queryable using swarm intelligence provided by
> Swarm Linda, or something similar like building a Folksonomy with Twitter
> tags [2].
> I might need to compare these more, but it seems you are looking
> at semantic and syntactic similarities where the semantic similarities need
> some sort of global reference to make things more manageable/possible.
> For the index you either need some sort of centralized index or
> decentralized index. If being a purist in decentralization is desired even
> YaCy won't do since there are 4 nodes that are not decentralized [3]. Not
> knowing much, there may be times when you want a centralized index. Perhaps
> P2P would introduce too much latency and use too much bandwidth in the
> network. Perhaps sometimes you want P2P because you are constructing a Mesh
> Network where you might even want local versions of some ontologies because
> you are closed off for some reason.
>
> [0]
> http://adistributedeconomy.blogspot.com/2014/12/links-to-building-social-applications.html?m=1
> [1]
> http://www.mi.fu-berlin.de/inf/publications/techreports/tr2009/B-09-04/TR-B-09-04.pdf?1346662692
> [2]
> http://people.kmi.open.ac.uk/motta/papers/SpeciaMotta_ESWC-2007_Final.pdf
> [3] https://fedcsis.org/proceedings/2011/pliks/237.pdf
