Hello everyone,

I just recently started following the development in more detail, so excuse me 
if some of what I write below is somewhat off-base for a system like OpenCog, 
or if something similar is already being done and I haven't had a chance to 
read about it yet. David Hart had similar ideas in the past; constraints might 
be different this time.

I like Linas's minimalistic approach. But I'm wondering: what if we go one 
layer further down — no key/value database library and no 
serialization/deserialization for core system operation? I'll admit some 
speculative aspects are interleaved in this proposal.

We could use memory-mapped files on SSD. Let's say the memory-mapped size is 8x 
the amount of physical RAM of the compute instance (this should be tuned; for 
example, a node with 16 CPUs / 64 GB RAM / 512 GB SSD mmap). The memory-mapped 
region is always at the same virtual memory address. This will require some 
cross-instance testing to find which virtual address is the best choice on a 
64-bit system; we can also patch Linux if necessary. The number of CPUs in the 
cluster is proportional to RAM size. I'm not sure what the best ratio is; that 
will need experimentation. *Let Linux do all the caching/loading/saving of 
data.* We focus on OpenCog subsystem algorithm development, with careful 
selection of data structures to be safe from corruption, since this 
memory-mapped region will be shared by multiple processes (in the spirit of the 
existing blackboard architecture). Concurrency could be handled optimistically: 
even if some process degrades part of the state, the effect should not be 
catastrophic, and it will be corrected in the next incremental stochastic 
state-update iteration. I'm assuming here that Hyperon will feature a 
stochastic runtime, which might not be the case.

Each subsystem is a function F from [past] GenericAtomSpace -> [future] 
GenericAtomSpace. Internally there will be functions which are approximately 
inverses of F. Subsystem categories are mapped to GenericAtomSpace in their own 
way, in separate processes. OpenCog will occasionally do an fsync if we want to 
make a filesystem snapshot, or to periodically clone the whole system state 
(for example, every day) to external storage for backup. OpenCog compute 
instances run only OpenCog processes, forming a cluster of identically 
configured instances. If OpenCog has to use other software packages, these 
should be executed on additional instances, orchestrated by OpenCog, to 
maintain the stability of the cluster. There should be a cgroup resource limit 
on memory usage by OpenCog processes, set slightly below RAM size. This 
prevents system maintenance processes such as sshd from becoming unavailable in 
case something freaky happens with OpenCog. But I really don't expect that to 
happen, because of our fixed memory usage.

Distribution is tricky, and from the discussion so far I understand there are 
issues with network speed and latency which define the activated-atom 
chunking/batching/merging strategy. Maybe the solution is to log each network 
transfer with useful metadata, so that knowledge/procedures for efficient 
distribution could be machine-learned/inferred by the PLN and MOSES Hyperon 
analogues. What also simplifies distribution is the fact that all 
memory-mapped regions in the cluster are mapped at exactly the same virtual 
memory address. Therefore we could just copy memory from one instance to 
another without any serialization/deserialization overhead as the activation 
workload migrates. This could also be a valuable characteristic for efficient 
InfiniBand transfers. Okay, there might be some merging operation after the 
transfer to support the possible superposition semantics of the new Atomese: 
collapsing to more persistent, closer-to-root atoms as a form of 
compression/abstraction/embedding/entanglement. If we find a correct 
translation from Atomese to fixed-size data structures (i.e. N-dimensional 
arrays and corresponding highly parallel simple operations), that might 
simplify memory resource management and runtime design further. Adding nodes 
to the cluster while keeping the same working-set size increases system 
reasoning fidelity (processing more possible worlds).

And lastly, if Linux can't manage AtomSpace caching efficiently, then we 
should forgo SSD backing of the working set and maybe add more RAM, as Linas 
suggested. Really huge, diverse datastores — which are much bigger than the 
memory-mapped region anyway — should be handled similarly; specialized 
data-access strategies for these should be automatically inferred by OpenCog. 
My hope is that the OpenCog system should be more than capable of reading and 
understanding API specifications and software manuals in order to do 
intelligent data integration. If this is not true, we are doing something 
wrong.

A final note regarding GPUs: I think newer NVIDIA CUDA GPUs support memory 
sharing with the host system, so we could see whether smaller parts of our 
memory-mapped region can be shared with the GPU in this way too. Smaller 
shared parts would go through expansion/decompression/instancing into GPU 
memory by GPU kernels (generated from inferred Atomese that matches the CUDA 
kernel type system, etc.). The bulk of GPU compute should use on-board RAM. 
The reverse process would then happen for the collapsing operation mentioned 
above, or a similar reduction operation. Maybe the use of GPUs/TPUs is a 
distraction right now, as the OpenCog system should eventually learn how to 
use these types of specialized hardware accelerators efficiently on its own. 
I'm still trying to understand how much should be built-in and how much 
learned in a system like OpenCog. Even if the system learns these things, it's 
always easier to do so with help from good teachers in the community.

best regards,
Pedja
