On Thu, Jul 23, 2020 at 11:34 AM Ben Goertzel <[email protected]> wrote:
> > What I am fishing for, is either some example pseudocode, or the name of
> > some algorithm or some wikipedia article that describes that algorithm,
> > which can compute chunks. Ideally, an algo that can run in fewer than a
> > few thousand CPU cycles, because, after that, performance becomes a
> > problem. If none of this, then some brainstorming for how to find a
> > reasonable algorithm.
>
> Linas, just to be sure we're in sync -- how large of a chunk are you
> thinking this algorithm would typically find?

Arbitrary. If you look at what happened with opencog-ipfs or opencog-dht,
there are several key operations. One is, of course, "who's got this atom?"
but that's easy: each atom has a 64-bit hash (or 80-bit on opendht by
default, but that's settable).

Next, "what's the incoming set of this atom?" Whoops, can't compute the hash
of that, because we don't know what it is. So you can ask, and get back a
list of N other atoms (or hashes) that are in the incoming set. Where are
they? Well, each different atom gets a totally different hash, so they
spread all over the planet (because that's how Kademlia works), when in
fact, what we really wanted to say was "gee golly, the incoming set of an
atom is 'close to' the atom itself, get me the ball of close-by stuffs".
But I can't figure out how to "say that".

Anyway, that is what I am trying to define as a chunk: an atom and
everything "nearby", with a variable conception of "nearby".

atomspace-ipfs had multiple major stumbling blocks. One is that the IPFS
documents are immutable, so for each new atomspace, you have to publish a
brand new document -- which has a completely different hash, so whoops, how
do you find out the hash of that? Well, IPFS has a DNS-like naming system,
but it was horridly slow, totally unusable (multi-second lookups with
60-second timeouts). The second problem is that it's "centralized" -- you
have to jam the *entire* atomspace into the document. So it's klunky. Won't
scale for large atomspaces.
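
To pin down what "an atom and everything nearby" could mean, here is a
minimal sketch of one possible reading: a breadth-first walk over incoming
sets, stopping after a fixed number of hops. It is not an algorithm proposed
in this thread. The names `chunk`, `incoming` and `radius` are hypothetical,
and the plain dict stands in for whatever backend answers "what's the
incoming set of this atom?".

```python
from collections import deque

def chunk(root, incoming, radius):
    """Return the atoms reachable from `root` in at most `radius`
    incoming-set hops (the root itself is always included)."""
    seen = {root}
    frontier = deque([(root, 0)])
    while frontier:
        atom, depth = frontier.popleft()
        if depth == radius:
            continue  # "nearby" ends here; do not expand further
        for holder in incoming.get(atom, ()):
            if holder not in seen:
                seen.add(holder)
                frontier.append((holder, depth + 1))
    return seen

# Hypothetical toy data: strings stand in for atom hashes.
# incoming[x] lists the atoms that contain x in their outgoing set.
incoming = {
    "atom-A": ["link-1", "link-2"],
    "link-1": ["link-3"],
    "link-2": [],
    "link-3": [],
}

print(sorted(chunk("atom-A", incoming, 1)))
print(sorted(chunk("atom-A", incoming, 2)))
```

Here `radius` is the "variable conception of nearby". Note that this only
defines the chunk; on a Kademlia-style DHT each hop would still be a
separate network lookup, so it does nothing for the locality problem or the
cycle budget mentioned above.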

Some notion of chunks would alleviate that scaling problem. But maybe
something less klunky than IPFS would be better. So that suggests a
lower-level building block, e.g. opendht, and that is how atomspace-dht was
born. But that now seems to be maybe "too low": it suffered from the
chunking problem.

Here's one, somewhere in the middle: "earthstar" --
https://github.com/cinnamon-bun/earthstar is a decentralized document
store. Cross out "document" and write "atomspace" instead. Or rather cross
out "atomspace" and write "chunk" instead. Or something like that. Quite
unclear.

The reason atomspace-cog got created is that it seems best to have
"seeders", same idea as in BitTorrent, so that there is at least one server
that is the source of truth for a given atomspace, even if all the other
servers are down/offline. The current ipfs and dht backends do not use
seeders, but I've got extremely vague plans to change that.

--linas

--
Verbogeny is one of the pleasurettes of a creatific thinkerizer.
        --Peter da Silva
