> On Jul 29, 2020, at 10:39 AM, Ben Goertzel <[email protected]> wrote:
>
> On Wed, Jul 29, 2020 at 6:35 AM Abdulrahman Semrie <[email protected]> wrote:
>>
>>> I think it's a mistake to try to think of a distributed atomspace as one
>>> super-giant, universe-filling, uniform, undifferentiated blob of storage.
>>
>> It is not clear to me why this is a mistake.
>
> It's a mistake because making a call from machine A to machine B is
> just sooooooo much slower than making a call from machine A to machine
> A ...
>
> So if you try to ignore the underlying distributed nature of a
> knowledge store, and treat it as if it were a single knowledge blob
> living in one location, you will wind up making a system that is very,
> very, very slow...
>
> My Webmind colleagues and I were naive enough to try this in the late
> 1990s using Java 1.1 ;-)
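To make the cost asymmetry above concrete, here is a minimal sketch (all names hypothetical, not actual OpenCog code) that counts simulated network round trips: the "uniform blob" view pays a remote call on every lookup, while a locality-aware lookup pays only on a local miss.

```python
# Hypothetical sketch (not OpenCog code): count simulated network round
# trips to show why treating a distributed store as one uniform blob is
# slow, while checking the local shard first usually is not.

class TwoMachineStore:
    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.round_trips = 0  # each remote access costs orders of magnitude more than a local one

    def get_naive(self, key):
        # "Uniform blob" view: every lookup consults the remote machine.
        self.round_trips += 1  # remote call made unconditionally
        return self.local.get(key, self.remote.get(key))

    def get_locality_aware(self, key):
        # Check local RAM first; only pay for the network on a miss.
        if key in self.local:
            return self.local[key]
        self.round_trips += 1
        return self.remote.get(key)

store = TwoMachineStore(local={"a": 1, "b": 2}, remote={"c": 3})
for k in ["a", "b", "a"]:
    store.get_naive(k)
print(store.round_trips)   # 3: every lookup crossed the network

store.round_trips = 0
for k in ["a", "b", "a"]:
    store.get_locality_aware(k)
print(store.round_trips)   # 0: all three hits were local
```

The sketch is deliberately tiny; the real asymmetry is that an in-RAM lookup takes nanoseconds while a network round trip takes at least hundreds of microseconds.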
Ah yes, I recall those days and the (in)famous Java 1 with the original broken Java Memory Model, not fixed until 2004 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.7914&rep=rep1&type=pdf; see also http://www.ibm.com/developerworks/library/j-jtp02244.html).

> One challenge though is: from a language and algorithm design
> perspective, it is of course necessary to abstract away many of the
> details of the distributed infrastructure, while still respecting the
> difference between a localized and a distributed knowledge store.
>
> E.g. an AI algorithm may need to be aware that pieces of knowledge can
> have three different statuses: Local, Remote (in RAM on some other
> machine in the distributed Atomspace), or BackedUp (on disk). So when
> it issues a query it may need to specify whether its search for an
> answer should be Local only, should include Remote machines, or should
> also include BackedUp data... because having an AI algorithm issue all
> its queries across a distributed Atomspace + disk backup will just be
> too slow. So in this case the existence of a distributed/persistent
> infrastructure requires the AI algorithm to prioritize its queries
> with at least 3 levels of priority.
>
>> I suggest you look into the design docs of the Nebula graph DB, which
>> is a strongly typed distributed graph DB. I believe they address the
>> issues you mentioned above, and it should be possible to implement
>> something similar for the first version of the distributed Atomspace.
>> Here are the links:
>>
>> [Overview] -
>> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/1.design-and-architecture/
>>
>> [Storage Design] -
>> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/2.storage-design/
>> - part of this is currently implemented through the Postgres backend,
>> as demonstrated in this example
>>
>> [Query Engine] -
>> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/3.query-engine/
>> - esp. interesting how they implement access control through
>> sessions, which partly relates to #1855
>>
>> They implement sharding somewhat similarly to what you described
>> above, using edge-cut: a destination vertex and all its incoming
>> edges are stored in the same partition, and a source vertex and its
>> outgoing edges in the same partition. They use multiple Raft groups
>> ("Multi-Raft only means we manage multiple Raft consensus groups on
>> one node") to achieve consistency across partitions for multiple
>> databases. This is contrary to what you suggested, in that each node
>> doesn't broadcast its changes; only the elected leader broadcasts
>> changes (i.e. sends log requests), and the rest of the nodes update
>> their partitions accordingly. Of course, a new leader can be elected
>> if the current leader fails or its term ends. The above design also
>> solves what you noted as the "unsolved part" in #2138
>
> There are some interesting things in Nebula, and maybe some stuff for
> us to learn there.
>
> However, their assumption of complete consistency across the
> distributed KB does not match our requirements for OpenCog. We need
> complete consistency only regarding certain sorts of knowledge items
> -- for other cases it's OK for us if different versions of an Atom in
> different parts of a distributed system drift apart a little and are
> then reconciled a little later.
>
> The assumption of complete consistency is built into the RocksDB
> infrastructure that they use, btw
>
> ben
>
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/CACYTDBc5wPvNj-k-%3Dwhx-yntHkZrh%3DEfWi%2BcqjJUMksJ-5LKhA%40mail.gmail.com.
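The three-level query scoping Ben describes (Local, Remote, BackedUp) could be sketched as follows. All names here (QueryScope, DistributedAtomspace, the tier dicts) are illustrative assumptions, not the actual OpenCog or Atomspace API: the point is only that a query carries an explicit bound on how far it is willing to reach, and cheaper tiers are searched first.

```python
# Hypothetical sketch of three-level query scoping for a distributed
# Atomspace. An AI algorithm tags each query with how far its search
# may reach; cheaper tiers are always consulted first.
from enum import IntEnum

class QueryScope(IntEnum):
    LOCAL = 0      # RAM on this machine only
    REMOTE = 1     # also RAM on other machines in the distributed Atomspace
    BACKED_UP = 2  # also the on-disk backup

class DistributedAtomspace:
    def __init__(self, local, remote, backed_up):
        # Tiers ordered from cheapest to most expensive to reach.
        self.tiers = [local, remote, backed_up]

    def query(self, name, scope=QueryScope.LOCAL):
        # Search tiers in cost order; stop at the requested scope.
        for tier in self.tiers[: scope + 1]:
            if name in tier:
                return tier[name]
        return None

space = DistributedAtomspace(
    local={"cat": "ConceptNode cat"},
    remote={"dog": "ConceptNode dog"},
    backed_up={"ant": "ConceptNode ant"},
)
print(space.query("dog"))                     # None: Local-only search misses
print(space.query("dog", QueryScope.REMOTE))  # ConceptNode dog: found on a remote machine
```

The design choice this illustrates: the distributed infrastructure is not hidden from the algorithm; the algorithm itself decides, per query, whether an answer is worth a network hop or a disk read.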

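The edge-cut sharding scheme described for Nebula above can be sketched as a toy in a few lines. Everything here (the partition count, the crc32-based placement function, the record layout) is an illustrative assumption, not Nebula's actual storage format: the idea being shown is only that each edge is stored twice, as an out-edge in the source vertex's partition and as an in-edge in the destination vertex's partition, so either endpoint can enumerate its edges with a single-partition read.

```python
# Toy sketch of edge-cut sharding: an edge (src -> dst) is written to
# src's partition as an out-edge and to dst's partition as an in-edge.
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4

def partition_of(vertex_id: str) -> int:
    # Stable hash so a vertex always maps to the same partition
    # (Python's built-in hash() is salted per process, so avoid it here).
    return zlib.crc32(vertex_id.encode()) % NUM_PARTITIONS

# partitions[p] holds ("out", src, dst) and ("in", dst, src) records.
partitions = defaultdict(list)

def store_edge(src: str, dst: str) -> None:
    partitions[partition_of(src)].append(("out", src, dst))
    partitions[partition_of(dst)].append(("in", dst, src))

def incoming(dst: str):
    # All in-edges of dst live in one partition: a single-shard read,
    # which is the payoff of the duplicated edge storage.
    return [other for tag, v, other in partitions[partition_of(dst)]
            if tag == "in" and v == dst]

store_edge("cat", "animal")
store_edge("dog", "animal")
print(sorted(incoming("animal")))  # ['cat', 'dog']
```

In a real system each partition would then be a Raft group, with only the group's elected leader accepting writes and replicating its log to followers, as the thread notes.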