Re: [opencog-dev] OpenCog Rework [Re: s-expression database]

Linas Vepstas Tue, 09 Nov 2021 09:07:40 -0800

On Tue, Nov 9, 2021 at 3:44 AM Amirouche Boubekki
<[email protected]> wrote:
>
> > > > > Rocksdb is a serverless, single-system key-value database optimized 
> > > > > for SSD disks. It has C bindings, and it's fast. For me, simple and 
> > > > > fast are highly desirable properties.
> > > > > > > So is wiredtiger https://source.wiredtiger.com/ but GPLv3
> > > > > Is wiredtiger actually serverless?
> > > It has no network component. It has a notion of table, but that is an 
> > > optimization. You can use it like rocksdb with a single table, with 
> > > single key-column and a single value-column.
> > > The only drawback is that it is difficult to work with if the dataset 
> > > does NOT fit in RAM.
>
> Linas replied:
>
> > The AtomSpace wants RAM. Because the AtomSpace is an in-RAM database. So 
> > using something else that *also* wants RAM is stupid: it just means that I 
> > have to buy a computer with twice as much RAM. I've walked down all of 
> > these roads before.
>
> I am not sure I understand: will the new symbol expression database
> work alongside the current AtomSpace?


What "new symbol expression database"?  I don't know what you are
referring to here.

> In any case, whether it is RocksDB, wiredtiger, SQLite, Foundationdb,
> or whatever.

I was trying to say something very simple; I'm sorry if I did not say
it clearly. Let me try again:

The AtomSpace is an in-RAM database. When I read about WiredTiger, it
seems to say that it also is an in-RAM database. That means that if I
store something in the AtomSpace, and then use WiredTiger as the
backing store, then, for everything I push out to the backing store,
even more RAM is used up.

Which means I have less RAM available for the AtomSpace. Which is
counter-productive.

I do NOT want to have TWO RAM-hungry systems running, both of them
competing for the same resource.

By contrast, if the backing store to the AtomSpace is a disk drive,
then when I go to save something, I DON'T use more RAM. I just use
some disk. Disk is cheap. That means that most of the RAM in the
system is available to the AtomSpace. Yayy!

> The decision must not be settled lightly,

Why make a decision at all? The AtomSpace currently has five different
storage nodes: file, postgres, rocks, and two cogserver variants.
(Seven, actually, if you count two prototypes that work but were
abandoned.) You can use all five at the same time. You can copy atoms
from one to another to your heart's content. You can even do queries
across four of them (the file backend does not support queries).

Add one more storage node for WiredTiger, if that makes you happy.
It's not hard. It's under 2KLOC of code. Closer to 1KLOC, if you don't
count GPL copyright boilerplate and verbose documentation. You could
code this up, test it and debug it in about 1-3 days. (Well, I could.
YMMV.) This is simply not that hard, and not worth hand-wringing or
shedding tears over.
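
To illustrate why adding one more backend is a small job, here is a
minimal sketch in Python. It assumes a backend only has to store atoms
and hand them back. The names StorageNode, MemoryNode, FileNode and
copy_atoms are hypothetical, made up for this example; this is not the
real AtomSpace StorageNode API, and atoms are stood in for by nested
tuples of strings.

```python
import json
import os
from abc import ABC, abstractmethod


class StorageNode(ABC):
    """Hypothetical minimal backend interface (not the AtomSpace API)."""

    @abstractmethod
    def store_atom(self, atom):
        """Persist one atom (here: a nested tuple of strings)."""

    @abstractmethod
    def load_all(self):
        """Iterate over every atom held by this backend."""


class MemoryNode(StorageNode):
    """Keeps atoms in RAM, standing in for an in-RAM store."""

    def __init__(self):
        self._atoms = set()

    def store_atom(self, atom):
        self._atoms.add(atom)

    def load_all(self):
        # Iterate over a copy so callers may store while iterating.
        return iter(set(self._atoms))


def _to_tuple(x):
    # JSON has no tuples; convert decoded lists back, recursively.
    return tuple(_to_tuple(e) for e in x) if isinstance(x, list) else x


class FileNode(StorageNode):
    """Appends one JSON-encoded atom per line to a file on disk."""

    def __init__(self, path):
        self.path = path

    def store_atom(self, atom):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(atom) + "\n")

    def load_all(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield _to_tuple(json.loads(line))


def copy_atoms(src, dst):
    """Copy every atom from one backend to another; return the count."""
    count = 0
    for atom in src.load_all():
        dst.store_atom(atom)
        count += 1
    return count
```

In this sketch, supporting another key-value library would mean writing
one more subclass with the same two methods. copy_atoms works between
any pair of backends because it only relies on the shared interface.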

> I was under the
> impression there were clear goals and use-cases for opencog,

Sadly, there are not.

> my
> recommendation is to extract from those hardware and software
> specifications, and quality metrics, then benchmark https://okvs.dev
> and dbms.

The link https://okvs.dev just takes me to a Wikipedia article. I'm
sure Rocks is an okvs db. It kind of says so.

The benchmarks are at https://github.com/opencog/benchmark These
include a copy of the gene-ontology data from agi-bio, and a copy of
some natural language data.

> > Maybe something like (query ?x (foo bar ?x baz)) to find all lists with 
> > four items, three of which as shown and the fourth unknown... would this be 
> > an OK API? Have you seen anything like this? any suggestions?
>
> It looks like the symbolic expression database you built

Heh. I didn't build it. Andre Senna and a bunch of Brazilians did,
twenty years ago, under the supervision of mad scientist Ben Goertzel.
All that I did was to work it, make it, do it, harder, faster, better,
stronger, more than before.

Sometimes, it's worth celebrating anniversaries. The AtomSpace is
twenty years old, I believe.

> looks much
> like what I call a 'General Tuple Store' where every object of the
> database is a tuple of arbitrary length made of "atoms".

General s-expression store. Because each element can be another tuple.

General JSON store, because each element can have a name.

General abstract syntax tree store. Because each element stores trees.

General python store. Because python is just syntax trees.

General programming-language-X store, because all programming
languages have syntax trees.

It's general. That's the point.
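
As an illustration of the query form quoted earlier,
(query ?x (foo bar ?x baz)), here is a minimal sketch in Python. It
assumes expressions are nested tuples of strings and that any string
starting with "?" is a variable. The functions is_var, match and query
are hypothetical names for this example; this is not the AtomSpace
pattern matcher, just a linear scan over a list.

```python
def is_var(term):
    """A term is a variable if it is a string starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, expr, bindings=None):
    """Return a dict of variable bindings if expr matches, else None."""
    bindings = dict(bindings or {})
    if is_var(pattern):
        if pattern in bindings:
            # A repeated variable must bind to the same value each time.
            return bindings if bindings[pattern] == expr else None
        bindings[pattern] = expr
        return bindings
    if isinstance(pattern, tuple):
        # Tuples match only tuples of the same length, element by element.
        if not isinstance(expr, tuple) or len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    # Constants must be equal.
    return bindings if pattern == expr else None


def query(var, pattern, store):
    """Collect the value bound to var in every expression that matches."""
    results = []
    for expr in store:
        found = match(pattern, expr)
        if found is not None and var in found:
            results.append(found[var])
    return results
```

With a store holding ("foo", "bar", "one", "baz") and
("foo", "bar", ("nested", "two"), "baz"), calling
query("?x", ("foo", "bar", "?x", "baz"), store) returns both "one" and
the nested tuple, because a variable can bind to a whole subtree. That
is what makes this an s-expression store rather than a flat tuple
store.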

-- linas
