On Wed, Jul 29, 2020 at 6:35 AM Abdulrahman Semrie <[email protected]> wrote:
>
> > I think it's a mistake to try to think of a distributed atomspace as one
> > super-giant, universe-filling uniform, undifferentiated blob of storage.
>
> It is not clear to me why this is a mistake.

It's a mistake because making a call from machine A to machine B is just sooooooo much slower than making a call from machine A to machine A ... So if you try to ignore the underlying distributed nature of a knowledge store, and treat it as if it were a single knowledge blob living in one location, you will wind up making a system that is very, very, very slow... My Webmind colleagues and I were naive enough to try this in the late 1990s using Java 1.1 ;-)

One challenge though is: from a language and algorithm design perspective, it is of course necessary to abstract away many of the details of distributed infrastructure, while still respecting the difference between a localized and a distributed knowledge store.

E.g. an AI algorithm may need to be aware that pieces of knowledge can have three different statuses: Local, Remote (in RAM on some other machine in the Distributed Atomspace) or BackedUp (disk). So when it issues a query it may need to specify whether its search for an answer should be Local only, should include Remote machines, or should also include BackedUp data... Because having an AI algorithm issue all its queries across a distributed Atomspace + disk backup will just be too slow. So in this case the existence of a distributed/persistent infrastructure requires the AI algorithm to prioritize its queries with at least 3 levels of priority.

> I suggest you to look into the design docs of Nebula graph DB, which is a
> strongly typed distributed graph db. I believe they address the above issues
> you mentioned and it is possible to implement something similar for the
> first version of the distributed Atomspace.
> Here are the links
>
> [Overview] -
> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/1.design-and-architecture/
>
> [Storage Design] -
> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/2.storage-design/
> - part of this currently implemented through the Postgres backend as
> demonstrated in this example
>
> [Query Engine] -
> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/3.query-engine/
> - esp. interesting how they implement access control through sessions,
> which partly relates to #1855
>
> They implement sharding somewhat similar to what you described above using
> Edge-Cut - storing a destination vertex and all its incoming edges in the
> same partition, a source vertex and its outgoing edges in the same partition.
> They use Multi-raft groups ("Multi-Raft only means we manage multiple Raft
> consensus groups on one node") to achieve consistency across partitions for
> multiple databases. This is contrary to what you suggested in that each node
> doesn't broadcast its changes, only the elected leader will broadcast changes
> (i.e send log requests) and the rest of the nodes will update their
> partitions accordingly. Of course, a new leader can be elected if the current
> leader fails or it term ends. The above design also solves what you noted as
> the "unsolved part" in #2138

There are some interesting things in Nebula and maybe some stuff for us to learn there. However, their assumption of complete consistency across the distributed KB does not match our requirements for OpenCog. We need complete consistency only regarding certain sorts of knowledge items -- for other cases it's OK for us if different versions of an Atom in different parts of a distributed system drift apart a little and are then reconciled a little later.

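To make the "drift apart a little, reconcile later" idea concrete, here is a rough Python sketch. Everything in it is an assumption made for illustration: the `Version` and `ReplicatedAtom` names, the `strict` flag, and especially the merge rule (keep the copy with the highest confidence). Nothing above says how OpenCog would actually reconcile diverged Atoms, and this is not Atomspace API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Version:
    """Hypothetical per-replica value of an Atom: a strength plus a confidence."""
    strength: float
    confidence: float


class ReplicatedAtom:
    """Toy model of one Atom held on several machines (not real Atomspace code)."""

    def __init__(self, n_replicas: int, initial: Version, strict: bool) -> None:
        self.replicas: List[Version] = [initial] * n_replicas
        self.strict = strict  # True: this kind of item needs complete consistency

    def write(self, replica_id: int, value: Version) -> None:
        if self.strict:
            # Complete consistency: every copy changes before the write returns.
            self.replicas = [value] * len(self.replicas)
        else:
            # Relaxed: only the local copy changes, so copies may drift apart.
            self.replicas[replica_id] = value

    def reconcile(self) -> Version:
        # Assumed merge rule: the copy with the highest confidence wins.
        winner = max(self.replicas, key=lambda v: v.confidence)
        self.replicas = [winner] * len(self.replicas)
        return winner


relaxed = ReplicatedAtom(3, Version(0.5, 0.1), strict=False)
relaxed.write(0, Version(0.8, 0.4))  # replicas 1 and 2 still hold the old value
relaxed.reconcile()                  # later on, all replicas converge
```

The point of the `strict` flag is only that the two regimes can coexist in one store: the expensive synchronous path is paid for the few item types that need it, and everything else takes the cheap local write.
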
The assumption of complete consistency is built into the RocksDB infrastructure that they use, btw

ben
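
For what it's worth, the three query levels mentioned earlier in this message (Local, Remote, BackedUp) could look roughly like the following in Python. This is only a sketch under stated assumptions: `Scope`, `TieredStore` and `query` are made-up names for illustration, plain dicts stand in for RAM on this machine, RAM on other machines, and disk, and none of this is existing Atomspace API.

```python
from enum import IntEnum
from typing import Dict, Optional, Tuple


class Scope(IntEnum):
    """How far a query may reach; a larger value means a slower lookup."""
    LOCAL = 1      # RAM on this machine
    REMOTE = 2     # RAM on some other machine in the distributed Atomspace
    BACKED_UP = 3  # disk backup


class TieredStore:
    """Toy stand-in for a distributed knowledge store (hypothetical)."""

    def __init__(self, local: Dict[str, str], remote: Dict[str, str],
                 backed_up: Dict[str, str]) -> None:
        self.tiers = {
            Scope.LOCAL: local,
            Scope.REMOTE: remote,
            Scope.BACKED_UP: backed_up,
        }

    def query(self, key: str,
              max_scope: Scope = Scope.LOCAL) -> Optional[Tuple[Scope, str]]:
        """Search tiers from cheapest to most expensive, stopping at max_scope."""
        for scope in Scope:  # iterates LOCAL, REMOTE, BACKED_UP in that order
            if scope > max_scope:
                break
            if key in self.tiers[scope]:
                return scope, self.tiers[scope][key]
        return None  # not found within the allowed scope


store = TieredStore({"cat": "animal"}, {"dog": "animal"}, {"dodo": "extinct"})
local_only = store.query("dog")                   # Local only: misses
with_remote = store.query("dog", Scope.REMOTE)    # widened to Remote: hits
everything = store.query("dodo", Scope.BACKED_UP) # widened to disk: hits
```

Defaulting `max_scope` to Local is the design choice that matters here: the caller has to ask explicitly for the slow paths, so an algorithm cannot pay cross-machine or disk latency by accident.
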
