> While I was writing the last message to Matt, I realized that having any DB 
> at all is very nearly pointless. That having a DB-backed storage is very 
> nearly an anti-pattern.
>
> The only useful function that a DB seems to provide is to be able to say "get 
> me this particular atom X from the disk".  But how often do you need that?  
> Far more typical is that you want to load zillions of atoms, and do 
> something with them.  If zillions of atoms are too big to fit in RAM, then we 
> are back to the chunking problem that started this conversation, and the 
> chunking problem has nothing to do with databases.
>
> The only other advantage of databases is incremental backup -- for multi-day, 
> multi-week-long calculations, you want to save partial results, one atom at a 
> time.
>

This is almost right, Linas, but just a little too extreme...

Let's think e.g. about the genomics use-case.

Consider the following situation.   An OpenCog system is thinking hard
about say 100 human genes at a time, building new links connecting
them to various concepts and predicates etc.   Then it saves its
conclusions to a backing-store DB -- and moves on to the next batch of
human genes.

But while thinking about gene G, OpenCog may relate it to gene H, and
may then want to grab information from the backing-store about gene H
...

In this case the "chunk" of information that we want to grab from the
backing-store is "the sub-metagraph giving the most intensely relevant
information about gene H" ....

Note that the chunk related to gene H, desired on a certain occasion,
may overlap with the chunk related to gene H1 ... or the chunk related
to GO category GO7 ...  desired on other occasions...

So I think it's a correct point that

-- the quantity of Atom-stuff to be sucked out of the BackingStore
into the Atomspace will almost always be a "chunk" of  Atoms rather
than an individual Atom

However, I think these chunks are not always going to be extremely
huge (they could be 100s of Atoms sometimes, or 1000s sometimes, not
always hundreds of thousands or millions...)... and also the chunks
needed are going to overlap w/ each other in ways that can't be
foreseen in advance

Thus I believe that we need some fairly powerful static pattern
matching operating against the BackingStore, and that a primary
operation to focus on is:

-- send a Pattern Matcher query to BackingStore
-- send the Atom-chunk resulting from the query to Atomspace
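To make the two steps concrete, here is a toy sketch -- not the real AtomSpace or BackingStore API, everything here is hypothetical for illustration -- where a query() function plays the role of the Pattern Matcher running against the BackingStore, and the resulting chunk is merged into an in-RAM "atomspace" by set union, which also shows why overlapping chunks are harmless:

```python
# Hypothetical sketch only: atoms are modeled as plain tuples, the
# BackingStore as a set, and the Pattern Matcher as a substring-style
# membership test. None of this is the actual OpenCog API.
BACKING_STORE = {
    ("EvaluationLink", "interacts", "G", "H"),
    ("EvaluationLink", "interacts", "H", "H1"),
    ("MemberLink", "H", "GO7"),
    ("MemberLink", "H1", "GO7"),
}

def query(store, mentions):
    """Step 1: run a 'pattern matcher query' against the backing
    store, returning the chunk of atoms mentioning a given gene
    or GO category."""
    return {atom for atom in store if mentions in atom}

atomspace = set()                       # the in-RAM AtomSpace (toy)
atomspace |= query(BACKING_STORE, "H")  # step 2: merge chunk into RAM

# Chunks overlap in unforeseeable ways: the GO7 chunk re-fetches one
# of the H atoms, but set-union merging absorbs the overlap cleanly.
atomspace |= query(BACKING_STORE, "GO7")
print(len(atomspace))  # 4 distinct atoms, despite overlapping chunks
```

The point of the union-merge is that the caller never needs to know in advance how the chunk for gene H overlaps the chunk for GO7.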

This is pretty clearly what is needed in the genomics use-case.  But I
could come up with similar stories for other use-cases, e.g. if an
OpenCog-controlled robot meets a person "Piotr Walarz" for the first
time, it may wish to fish into the BackingStore to pull in a whole
bunch of nodes and links comprising previously ingested or inferred
knowledge about "Piotr Walarz" ....   This will be a sizeable chunk,
but its size will vary:

-- If the AI's knowledge about "Piotr Walarz" comes from online
profiles etc., this could be a 100s to 1000s to 10000s of Atoms chunk
...

-- if the robot or other robots sharing the same KB have had a lot of
direct interaction with "Piotr Walarz", then it could be a much larger
chunk ... which  may need to get fished into RAM only partially and in
multiple stages....


-- Ben



-- 
You received this message because you are subscribed to the Google Groups 
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/opencog/CACYTDBcKw2L2gxj2LTC8p4SsBv3yfUwpNmY7V3S0bUVAPZs%3DUg%40mail.gmail.com.