Hello everyone, I just recently started following the development in more detail, so excuse me if some of what I write below is somewhat off for a system like Opencog, or if something along the same lines is already being done and I haven't had a chance to read about it yet. David Hart had similar ideas in the past; constraints might be different this time.
I like Linas's minimalistic approach. But I'm wondering: what if we go one layer further down, and neither use a key/value database library nor do serialization/deserialization for core system operation? I'll admit some additional speculative aspects are interleaved in this proposal.

We could use memory-mapped files on SSD. Let's say the memory-mapped size is 8x the amount of physical RAM of the compute instance (this should be tuned; for example, a node is 16 CPUs / 64 GB RAM / 512 GB SSD mmap). The memory-mapped region is always at the same virtual memory address. This will require some cross-instance testing to find which virtual address is the best choice on a 64-bit system; we can also patch Linux if necessary. The number of CPUs in the cluster is proportional to RAM size; I'm not sure what the best ratio is, that will need experimentation. *Let Linux do all the caching/loading/saving of data.*

We focus on Opencog subsystem algorithm development, with careful selection of data structures so they are safe from corruption, since this memory-mapped region will be shared by multiple processes (in the spirit of the existing blackboard architecture). Concurrency could be handled optimistically: even if some process degrades part of the state, the effect should not be catastrophic; it will be corrected in the next incremental stochastic state-update iteration. (I'm implying here that Hyperon will feature a stochastic runtime, which might not be the case.) Each subsystem is a function F from [past] GenericAtomSpace -> [future] GenericAtomSpace; internally there will be functions which are approximately the inverse of F. Subsystem categories are mapped to GenericAtomSpace in their own way, in separate processes.

Opencog will occasionally do fsync if we want to make a filesystem snapshot, or will periodically clone the whole system state (for example, every day) to external storage for backup. Opencog compute instances only run Opencog processes, forming a cluster of instances with exactly the same configuration.
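A rough sketch of the "let Linux do the caching" idea is below. Python's mmap module does not expose fixed-address mappings (in C you would pass the chosen address to mmap() with MAP_SHARED | MAP_FIXED_NOREPLACE), so this only demonstrates the file-backed shared mapping and the msync point used for snapshots; the file and sizes are made up for illustration.

```python
import mmap
import os
import tempfile

def demo_mapped_region():
    # Hypothetical stand-in for the SSD-backed region; in the real design the
    # file would be ~8x RAM and mapped at a fixed virtual address in C with
    # mmap(addr, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED_NOREPLACE, fd, 0).
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4096)                # sparse file; Linux pages it in lazily
        region = mmap.mmap(fd, 4096, access=mmap.ACCESS_WRITE)
        region[0:5] = b"atoms"                # processes just write to memory...
        region.flush()                        # ...and msync marks a snapshot point
        region.close()
        with open(path, "rb") as f:
            persisted = f.read(5)             # the kernel wrote it back for us
    finally:
        os.close(fd)
        os.unlink(path)
    return persisted
```

The point is that the Opencog processes never call an explicit save/load path; the page cache does the I/O, and flush()/fsync only matter at snapshot boundaries.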
If Opencog has to use other software packages, these should be executed on additional instances, orchestrated by Opencog, to maintain the stability of the cluster. There should be a cgroup resource limit on memory usage by Opencog processes, set to slightly below RAM size. This prevents system maintenance processes such as sshd from becoming unavailable in case something freaky happens with Opencog; I don't really expect this to happen, though, because of our fixed memory usage.

Distribution is tricky, and from the discussion so far I understand there are issues with network speed and latency which define the activated-atom chunking/batching/merging strategy. Maybe the solution is to log each network transfer with useful metadata, so that knowledge/procedures for efficient distribution could be machine-learned/inferred by PLN & MOSES Hyperon analogues. What also simplifies distribution is the fact that in the cluster all memory-mapped regions are mapped at exactly the same virtual memory address. Therefore we could just copy memory from one instance to another, without any serialization/deserialization overhead, as the activation workload migrates. This could also be a valuable characteristic for efficient InfiniBand transfers. Okay, there might be some merging operation after the transfer to support the possible superposition semantics of the new Atomese: collapsing to more persistent, closer-to-root atoms as a form of compression/abstraction/embedding/entanglement. If we find a correct translation from Atomese to fixed-size data structures (i.e. N-dimensional arrays and corresponding highly parallel simple operations), that might simplify memory resource management and runtime design further. Adding nodes to the cluster while keeping the same working-set size increases the system's reasoning fidelity (processing more possible worlds).
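A minimal sketch of the zero-serialization migration: if records in the region have fixed-size layouts and use region-relative offsets (equivalent to absolute pointers once every instance maps the region at the same base address), then moving a workload is a plain byte copy with no pointer fixup. The atom record layout here is invented purely for illustration.

```python
import struct

# Hypothetical fixed-size atom record: (type_id, link_offset), both 64-bit.
ATOM = struct.Struct("<qq")

def write_atom(region, off, type_id, link_off):
    ATOM.pack_into(region, off, type_id, link_off)

def read_atom(region, off):
    return ATOM.unpack_from(region, off)

# Instance A's mapped region (a bytearray standing in for the mmap).
region_a = bytearray(4096)
write_atom(region_a, 0, 7, ATOM.size)    # atom at offset 0 links to the next record
write_atom(region_a, ATOM.size, 3, 0)

# Migration to instance B is a raw copy (memcpy over InfiniBand in the real
# design): no serialization, no relocation, because offsets (or pointers at a
# fixed base address) mean the same thing on every instance.
region_b = bytearray(region_a)
```

This is exactly where per-transfer logging could hook in: the copy is a single opaque range, so the metadata (which atoms, how hot, where to) is cheap to record alongside it.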
And lastly, if Linux can't manage atomspace caching efficiently, then we should forgo SSD backing of the working set and maybe add more RAM, as Linas suggested. Really huge, diverse datastores, which are much bigger than the memory-mapped region anyway, would be handled similarly; specialized data-access strategies for these should be automatically inferred by Opencog. My hope is that the Opencog system should be more than capable of reading and understanding API specifications and software manuals in order to carry out intelligent data-integration procedures. If this is not true, we are doing something wrong.

A final note regarding GPUs: I think newer NVIDIA CUDA GPUs support memory sharing with the host system, so we could see whether smaller parts of our memory-mapped region can be shared with the GPU in this way too. Smaller shared parts would go through expansion/decompression/instancing into GPU memory by GPU kernels (generated from inferred Atomese that matches the CUDA kernel type system, etc.). The bulk of GPU compute should use on-board RAM. The reverse process would then happen for the collapsing operation mentioned above, or a similar reduction operation. Maybe the use of GPUs/TPUs is a distraction right now, as the Opencog system should eventually learn how to use those types of specialized hardware accelerators efficiently on its own. I'm still trying to understand how much should be built in and how much learned in a system like Opencog. Even if the system is learning these things, it's always easier to do that with help from good teachers in the community.

best regards,
Pedja

--
You received this message because you are subscribed to the Google Groups "opencog" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/0C602DE8-0CFA-435F-8798-0D17376A55DD%40crackleware.nu.
