Re: [opencog-dev] Distributed Atomspace

Linas Vepstas Thu, 23 Jul 2020 15:54:12 -0700

On Thu, Jul 23, 2020 at 11:34 AM Ben Goertzel <[email protected]> wrote:


> > What I am fishing for, is either some example pseudocode, or the name of
> some algorithm or some wikipedia article that describes that algorithm,
> which can compute chunks.  Ideally, an algo that can run in fewer than a
> few thousand CPU cycles, because, after that, performance becomes a
> problem.  If none of this, then some brainstorming for how to find a
> reasonable algorithm.
> >
>
> Linas, just to be sure we're in sync -- how large of a chunk are you
> thinking this algorithm would typically find?
>

Arbitrary. If you look at what happened with opencog-ipfs or opencog-dht,
there are several key operations. One is, of course, "who's got this atom?"
but that's easy: each atom has a 64-bit hash (or 80-bit on opendht by
default, but that's settable). Next, "what's the incoming set of this
atom?" Whoops, can't compute the hash of that, because we don't know what
it is. So you can ask, and get back a list of N other atoms (or hashes)
that are in the incoming set. Where are they? Well, each different atom
gets a totally different hash, so they spread all over the planet (because
that's how Kademlia works), when in fact, what we really wanted to say was
"gee golly, the incoming set of an atom is 'close to' the atom itself, get
me the ball of close-by stuffs".  But I can't figure out how to "say
that".

Anyway, that is what I am trying to define as a chunk: an atom and
everything "nearby", with a variable conception of "nearby".

atomspace-ipfs had multiple major stumbling blocks. One is that the IPFS
documents are immutable, so for each new atomspace, you have to publish a
brand new document -- which has a completely different hash, so whoops, how
do findout out the hash of that?. Well, IPFS has a DNS-like naming system,
but it was horridly slow, totally unusuable (multi-secnod lookups with
60-second timeouts). The second problem is that its "centralized" -- you
have to jam the *entire* atomspace into the document. So its klunky. Won't
scale for large atomspaces. Some notion of chunks alleviates that. But
maybe something less klunky than IPFS would be better.

So that suggests a lower-level building block - e.g. opendht. and that is
how atomspace-dht was born. But that now seems to be maybe "too low". It
suffered from the chunking problem.

Here's one, somewhere in the middle: "earthstar" --
https://github.com/cinnamon-bun/earthstar is a decentralized document
store. Cross out "document" and write "atomspace" instead. Or rather cross
out "atomspace" and write "chunk" instead. Or something like that. Quite
unclear.

The reason atomspace-cog got created is it seems best to have "seeders",
same idea as in bittorent, so at least one server that is the source of
truth for a given atomspace, even if all the other servers are
down/offline.  The current ipfs and dht backends do not use seeders, but
I've got extremely vague plans to change that.

--linas


-- 
Verbogeny is one of the pleasurettes of a creatific thinkerizer.
        --Peter da Silva

-- 
You received this message because you are subscribed to the Google Groups 
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/opencog/CAHrUA34VPBdCw-hoXH-Y8RWa-eQRAjvUGQ%2BbRNSdU8wu95dbgA%40mail.gmail.com.

Re: [opencog-dev] Distributed Atomspace

Reply via email to