On Thu, Jul 30, 2020 at 11:20 AM Ben Goertzel <[email protected]> wrote:
>
> In this case the "chunk" of information that we want to grab from the
> backing-store is "the sub-metagraph giving the most intensely relevant
> information about gene H" ....
>
Agreed with all you say, but let me flip this over on its head, and look at
it from a comp-sci fundamentals point of view. There are two ways to "grab
something". One way is to provide a pointer to it -- e.g. in C/C++,
literally a pointer. Some number, some unique ID, a name for the thing
that you want, a handle.
The other way to grab something is to describe it: "I want all X's that
have property Y". These are the only two options available to the system
programmer. Everything else has to be built out of these two operations.
And this is generically true for all programming languages and all operating
systems ... I suppose maybe it is even some mathematical theorem, but I
don't know its name.
For the current C++ Atomspace, the first corresponds to having Handle to
some Atom, and the second corresponds to running a query. (Which might
include looking at the attention-bank, or using the space/time-server, or
other indexing subsystems that are currently not integrated into the
AtomSpace). So, to get "the sub-metagraph giving the most intensely
relevant information about gene H", you have to specify that in some
declarative or algorithmic or recursive fashion that either results in a
pointer, or a query to an existing subsystem.
Since we are talking about "persistence", I have a very easy answer: don't
turn off the electricity! Ta dah! Persistent! Doesn't fit in RAM? Buy more
RAM! Oh, still have problems? Well, let me think about that. Flash-file
storage works with 64K blocks that can be flashed in 5 usecs, which have a
pending write buffer of 128 MBytes on the other side of a PCIe 1x link
which can run at about 8 Gbps. That means that if I create index blocks
that are 64K in size and only flash them when ... hmm. Interesting. But
what if I... wait, hang on ... one could always ... let's see, uhh ... let
me get back to you on that.
These are your choices. Yes, in theory it is possible to create a query
system that runs fast from rotating storage or SSD. Database vendors like
Oracle have pumped many billions of dollars of R&D into exactly this -- for
the last half-century.
The meta question facing the current Atomspace is: what can be done using
limited manpower, limited time, limited money? And I'm saying: buy more
RAM, and use a dirt-simple, super-tiny, ultra-fast "database" to access
disk. I put "database" in quotes because it needs hardly any of the
bells-n-whistles that databases have. It only needs some kind of fast
disk-access algorithm to figure out which block to fetch next (taking into
account PCIe, FFS, and other bandwidth limitations). Once you've gotten
those blocks into RAM, just run the existing query engine to get the rest
of the job done.
Now, if you have the time, the money, and the interest in low-level stuff
like speed-of-PCIe, and command-queueing in SCSI protocols (which is what
those SSD drives actually use), then, sure, do some research into how to
rapidly perform pattern-matching queries subject to these hardware
constraints. It's fun; people do this all the time.
In the world of open-source, you might be able to find some existing
library -- in C++, Java, whatever -- that already does a superb job of
disk-access, and has at least some query support in it. And then you can
assign 5 systems programmers to add more pattern-query type things into it
... how much time/money do you want to spend on this low-level stuff, vs.
how much on high-level stuff?
-- Linas
--
Verbogeny is one of the pleasurettes of a creatific thinkerizer.
--Peter da Silva