Hello everyone, I just recently started following the development in more detail, so excuse me if some of what I write below is somewhat off for a system like Opencog, or if something along the same lines is already being done and I haven't had a chance to read about it yet. David Hart had similar ideas in the past; constraints might be different this time.
I like Linas's minimalistic approach. But I'm wondering: what if we go one layer further down, and neither use a key/value database library nor do serialization/deserialization for core system operation? I'll admit some additional speculative aspects are interleaved in this proposal.

We could use memory-mapped files on SSD. Let's say the memory-mapped size is 8x the amount of physical RAM of the compute instance (this should be tuned; for example, a node is 16 CPUs / 64 GB RAM / 512 GB SSD mmap). The memory-mapped region is always at the same virtual memory address. This will require some cross-instance testing to find which virtual address is the best choice on a 64-bit system; we can also patch Linux if necessary. The number of CPUs in the cluster is proportional to RAM size; I'm not sure what the best ratio is, that will need experimentation. *Let Linux do all the caching/loading/saving of data.*

We focus on Opencog subsystem algorithm development, with careful selection of data structures so they are safe from corruption, since this memory-mapped region will be shared by multiple processes (in the spirit of the existing blackboard architecture). Concurrency could be handled optimistically: even if some process degrades part of the state, the effect should not be catastrophic; it will be corrected in the next incremental stochastic state-update iteration. (I'm implying here that Hyperon will feature a stochastic runtime, which might not be the case.) Each subsystem is a function F from [past] GenericAtomSpace -> [future] GenericAtomSpace; internally there will be functions which are approximately the inverse of F. Subsystem categories are mapped to GenericAtomSpace in their own way, in separate processes.

Opencog will occasionally do fsync if we want to make a filesystem snapshot, or will periodically clone the whole system state (for example, every day) to external storage for backup. Opencog compute instances only run Opencog processes, forming a cluster of instances with exactly the same configuration.
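A rough sketch of the "let Linux do the caching" idea is below. Python's mmap module does not expose fixed-address mappings (in C you would pass the chosen address to mmap() with MAP_SHARED | MAP_FIXED_NOREPLACE), so this only demonstrates the file-backed shared mapping and the msync point used for snapshots; the file and sizes are made up for illustration.

```python
import mmap
import os
import tempfile

def demo_mapped_region():
    # Hypothetical stand-in for the SSD-backed region; in the real design the
    # file would be ~8x RAM and mapped at a fixed virtual address in C with
    # mmap(addr, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED_NOREPLACE, fd, 0).
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4096)                # sparse file; Linux pages it in lazily
        region = mmap.mmap(fd, 4096, access=mmap.ACCESS_WRITE)
        region[0:5] = b"atoms"                # processes just write to memory...
        region.flush()                        # ...and msync marks a snapshot point
        region.close()
        with open(path, "rb") as f:
            persisted = f.read(5)             # the kernel wrote it back for us
    finally:
        os.close(fd)
        os.unlink(path)
    return persisted
```

The point is that the Opencog processes never call an explicit save/load path; the page cache does the I/O, and flush()/fsync only matter at snapshot boundaries.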
If Opencog has to use other software packages, these should be executed on additional instances, orchestrated by Opencog, to maintain the stability of the cluster. There should be a cgroup resource limit on memory usage by Opencog processes, set to slightly below RAM size. This prevents system maintenance processes such as sshd from becoming unavailable in case something freaky happens with Opencog; I don't really expect this to happen, though, because of our fixed memory usage.

Distribution is tricky, and from the discussion so far I understand there are issues with network speed and latency which define the activated-atom chunking/batching/merging strategy. Maybe the solution is to log each network transfer with useful metadata, so that knowledge/procedures for efficient distribution could be machine-learned/inferred by PLN & MOSES Hyperon analogues. What also simplifies distribution is the fact that in the cluster all memory-mapped regions are mapped at exactly the same virtual memory address. Therefore we could just copy memory from one instance to another, without any serialization/deserialization overhead, as the activation workload migrates. This could also be a valuable characteristic for efficient InfiniBand transfers. Okay, there might be some merging operation after the transfer to support the possible superposition semantics of the new Atomese: collapsing to more persistent, closer-to-root atoms as a form of compression/abstraction/embedding/entanglement. If we find a correct translation from Atomese to fixed-size data structures (i.e. N-dimensional arrays and corresponding highly parallel simple operations), that might simplify memory resource management and runtime design further. Adding nodes to the cluster while keeping the same working-set size increases the system's reasoning fidelity (processing more possible worlds).
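A minimal sketch of the zero-serialization migration: if records in the region have fixed-size layouts and use region-relative offsets (equivalent to absolute pointers once every instance maps the region at the same base address), then moving a workload is a plain byte copy with no pointer fixup. The atom record layout here is invented purely for illustration.

```python
import struct

# Hypothetical fixed-size atom record: (type_id, link_offset), both 64-bit.
ATOM = struct.Struct("<qq")

def write_atom(region, off, type_id, link_off):
    ATOM.pack_into(region, off, type_id, link_off)

def read_atom(region, off):
    return ATOM.unpack_from(region, off)

# Instance A's mapped region (a bytearray standing in for the mmap).
region_a = bytearray(4096)
write_atom(region_a, 0, 7, ATOM.size)    # atom at offset 0 links to the next record
write_atom(region_a, ATOM.size, 3, 0)

# Migration to instance B is a raw copy (memcpy over InfiniBand in the real
# design): no serialization, no relocation, because offsets (or pointers at a
# fixed base address) mean the same thing on every instance.
region_b = bytearray(region_a)
```

This is exactly where per-transfer logging could hook in: the copy is a single opaque range, so the metadata (which atoms, how hot, where to) is cheap to record alongside it.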
And lastly, if Linux can't manage atomspace caching efficiently, then we should forgo SSD backing of the working set and maybe add more RAM, as Linas suggested. Really huge, diverse datastores, which are much bigger than the memory-mapped region anyway, would be handled similarly; specialized data-access strategies for these should be automatically inferred by Opencog. My hope is that the Opencog system should be more than capable of reading and understanding API specifications and software manuals in order to carry out intelligent data-integration procedures. If this is not true, we are doing something wrong.

A final note regarding GPUs: I think newer NVIDIA CUDA GPUs support memory sharing with the host system, so we could see whether smaller parts of our memory-mapped region can be shared with the GPU in this way too. Smaller shared parts would go through expansion/decompression/instancing into GPU memory by GPU kernels (generated from inferred Atomese that matches the CUDA kernel type system, etc.). The bulk of GPU compute should use on-board RAM. The reverse process would then happen for the collapsing operation mentioned above, or a similar reduction operation. Maybe the use of GPUs/TPUs is a distraction right now, as the Opencog system should eventually learn how to use those types of specialized hardware accelerators efficiently on its own. I'm still trying to understand how much should be built in and how much learned in a system like Opencog. Even if the system is learning these things, it's always easier to do that with help from good teachers in the community.

best regards,
Pedja

--
You received this message because you are subscribed to the Google Groups "opencog" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/0C602DE8-0CFA-435F-8798-0D17376A55DD%40crackleware.nu.
