Yesterday, I put the finishing touches on the FileStorageNode.  It uses
the StorageNode API to read/write Atomese s-expressions to a flat file.
It's fast, and it's compact.  It's 10x faster than using plain scheme (guile) to
dump Atoms: this is thanks to code originally written by Alexey Potapov and
Anatoly Belikov -- I wrote a wrapper around it to use the StorageNode API.

Some stats: I tested two datasets: a MOZI biology dataset, and a natural
language dataset, of 7 million and 20 million Atoms, respectively. When
these are loaded into the AtomSpace (in RAM), they take up 632 and 775
bytes/Atom of RSS (operating system resident set size). This is very
typical for Atoms in the AtomSpace. (I put these two datasets up at
https://linas.org/datasets/ for Amirouche.)

Dumped to a file, this becomes 55 and 154 bytes/Atom, for plain,
uncompressed Atomese s-expressions. When compressed with bzip2, it shrinks
to 4 and 6 bytes/Atom!  Tiny!  Clearly, the searchable indexes that the
AtomSpace keeps in RAM cost a huge amount of memory.  The actual data
content in typical Atoms is ... tiny.
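
To put those per-Atom numbers in perspective, here is a rough
back-of-the-envelope sketch in Python. It only multiplies out the figures
quoted above; the Atom counts are the rounded "7 million" and "20 million",
so the totals are approximate, and the names (DATASETS, totals, ratio) are
made up for this illustration and are not part of any AtomSpace API.

```python
# Per-Atom figures as quoted in the post. Atom counts are rounded
# ("7 million", "20 million"), so every total below is approximate.
DATASETS = {
    "MOZI biology":     {"atoms": 7_000_000,  "ram": 632, "file": 55,  "bzip2": 4},
    "natural language": {"atoms": 20_000_000, "ram": 775, "file": 154, "bzip2": 6},
}

FORMS = ("ram", "file", "bzip2")


def totals(name):
    """Total bytes for each representation: Atom count times bytes/Atom."""
    d = DATASETS[name]
    return {form: d["atoms"] * d[form] for form in FORMS}


def ratio(name, bigger, smaller):
    """How many times larger one representation is than another."""
    d = DATASETS[name]
    return d[bigger] / d[smaller]


if __name__ == "__main__":
    for name in DATASETS:
        t = totals(name)
        print(name)
        for form in FORMS:
            # Decimal megabytes (10**6 bytes), to keep the arithmetic obvious.
            print(f"  {form:6s} {t[form] / 1e6:10.1f} MB")
        print(f"  RAM vs. flat file: {ratio(name, 'ram', 'file'):.1f}x")
        print(f"  RAM vs. bzip2:     {ratio(name, 'ram', 'bzip2'):.1f}x")
```

Multiplied out, the MOZI dataset is roughly 4.4 GB in RAM, 385 MB as a flat
file, and 28 MB after bzip2; the language dataset is roughly 15.5 GB, 3.1 GB
and 120 MB. That puts the in-RAM form at well over 100x the size of the
compressed file in both cases.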

See https://wiki.opencog.org/w/FileStorageNode and the demo in
https://github.com/opencog/atomspace/blob/master/examples/atomspace/persist-store.scm

-- Linas

-- 
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.
