Re: [opencog-dev] Distributed Atomspace

Ben Goertzel Wed, 29 Jul 2020 09:40:33 -0700

On Wed, Jul 29, 2020 at 6:35 AM Abdulrahman Semrie <[email protected]> wrote:
>
>  > I think it's a mistake to try to think of a distributed atomspace as one 
> super-giant, universe-filling uniform, undifferentiated blob of storage.
>
> It is not clear to me why this is a mistake.


It's a mistake because making a call from machine A to machine B is
just sooooooo much slower than making a call from machine A to machine
A ...

So if you try to ignore the underlying distributed nature of a
knowledge store, and treat it as if it were a single knowledge blob
living in one location, you will wind up making a system that is very,
very, very slow...

My Webmind colleagues and I were naive enough to try this in the late
1990s using Java 1.1   ;-)

One challenge though is: From a language and algorithm design
perspective, it is of course necessary to abstract away many of the
details of distributed infrastructure, while still respecting the
difference between a localized and a distributed knowledge store.

E.g. an AI algorithm may need to be aware that pieces of knowledge can
have three different statuses: Local, Remote (in RAM on some other
machine in Distributed Atomspace) or BackedUp (disk).   So then when
it issues a query it may need to specify whether its search for an answer
should be Local only, should include Remote machines, or should also
include BackedUp data...   Because having an AI algorithm issue all
its queries across a distributed Atomspace + disk backup will just be
too slow.   So in this case the existence of a distributed/persistent
infrastructure requires the AI algorithm to prioritize its queries with
at least 3 levels of priority.
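
To make the three-level idea concrete, here is a minimal sketch in Python.
All names in it (Tier, TieredStore, query, max_tier) are hypothetical and are
not part of any existing Atomspace API; plain dicts stand in for the local
RAM, the remote machines and the disk backup. The caller says how far a query
is allowed to go, and the store searches the cheapest tier first.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical knowledge statuses, ordered from cheapest to slowest."""
    LOCAL = 1      # in RAM on this machine
    REMOTE = 2     # in RAM on some other machine in the distributed Atomspace
    BACKED_UP = 3  # on disk


class TieredStore:
    """Toy stand-in for a knowledge store; one dict per tier."""

    def __init__(self, local, remote, backed_up):
        self._tiers = {
            Tier.LOCAL: local,
            Tier.REMOTE: remote,
            Tier.BACKED_UP: backed_up,
        }

    def query(self, key, max_tier=Tier.LOCAL):
        """Search tiers cheapest-first and never go past max_tier.

        Returns (tier, value) for the first hit, or None if nothing is
        found within the allowed scope.
        """
        for tier in Tier:  # iterates in definition order
            if tier > max_tier:
                break
            if key in self._tiers[tier]:
                return tier, self._tiers[tier][key]
        return None


store = TieredStore(
    local={"cat": 0.9},
    remote={"dog": 0.8},
    backed_up={"emu": 0.5},
)

local_only = store.query("dog")                      # stays on this machine
with_remote = store.query("dog", Tier.REMOTE)        # may ask other machines
with_backup = store.query("emu", Tier.BACKED_UP)     # may also hit disk
```

The default scope is Local on purpose: an algorithm has to opt in explicitly
before a query is allowed to pay the cost of a remote call or a disk read.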

> I suggest you look into the design docs of Nebula graph DB, which is a 
> strongly typed distributed graph db. I believe they address the above issues 
> you mentioned and  it is possible to implement something similar for the 
> first version of the distributed Atomspace.   Here are the links
>
> [Overview] - 
> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/1.design-and-architecture/
>
> [Storage Design] - 
> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/2.storage-design/
>  - part of this is currently implemented through the Postgres backend as 
> demonstrated in this example
>
> [Query Engine] - 
> https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/3.query-engine/
>  - esp.  interesting how they implement access control through sessions, 
> which partly relates to #1855
>
> They implement sharding somewhat similar to what you described above using 
> Edge-Cut - storing a destination vertex and all its incoming edges in the 
> same partition, a source vertex and its outgoing edges in the same partition. 
> They use Multi-raft groups (" Multi-Raft only means we manage multiple Raft 
> consensus groups on one node") to achieve consistency across partitions for 
> multiple databases. This is contrary to what you suggested in that each node 
> doesn't broadcast its changes, only the elected leader will broadcast changes 
> (i.e send log requests) and the rest of the nodes will update their 
> partitions accordingly. Of course, a new leader can be elected if the current 
> leader fails or its term ends. The above design also solves what you noted as 
> the "unsolved part" in #2138



There are some interesting things in Nebula and maybe some stuff for
us to learn there.

However, their assumption of complete consistency across the
distributed KB does not match our requirements for OpenCog.  We need
complete consistency only regarding certain sorts of knowledge items
-- for other cases it's OK for us if different versions of an Atom in
different parts of a distributed system drift apart a little and are
then reconciled a little later.
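
A rough sketch of what I mean, in Python. Everything here is an assumption
made for illustration: the names (Version, Replica, write, reconcile), the
"strict" flag for the items that do need full consistency, and the
last-writer-wins rule used to reconcile. A real system might merge values in
some smarter way.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: float
    stamp: int  # logical clock of the last update (hypothetical)


class Replica:
    """One machine's copy of (part of) the knowledge base."""

    def __init__(self):
        self.atoms = {}

    def put(self, key, value, stamp):
        self.atoms[key] = Version(value, stamp)


def write(replicas, origin, key, value, stamp, strict=False):
    """Strict items are updated on every replica at once;
    everything else is written only where the change originated."""
    targets = replicas if strict else [origin]
    for replica in targets:
        replica.put(key, value, stamp)


def reconcile(replicas, key):
    """Later clean-up pass: copy the newest version to every replica
    (last-writer-wins, chosen here only because it is simple)."""
    versions = [r.atoms[key] for r in replicas if key in r.atoms]
    if not versions:
        return
    newest = max(versions, key=lambda v: v.stamp)
    for replica in replicas:
        replica.put(key, newest.value, newest.stamp)


a, b = Replica(), Replica()
cluster = [a, b]

# Both replicas start out in agreement.
write(cluster, a, "cat-likes-fish", 0.70, stamp=1, strict=True)

# A non-strict update on b only: the two copies drift apart a little.
write(cluster, b, "cat-likes-fish", 0.75, stamp=2)
drifted = a.atoms["cat-likes-fish"].value != b.atoms["cat-likes-fish"].value

# A little later the copies are reconciled.
reconcile(cluster, "cat-likes-fish")
```

The point is only that the slow, everybody-agrees path is reserved for the
strict items, while the rest tolerate a window in which copies differ.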

The assumption of complete consistency is built into the RocksDB
infrastructure that they use, btw

ben
