Sorry Matt, I was very snippy. Making things work takes a lot of time and effort. It's frustrating when that effort isn't recognized.
Ideas are cheap -- there's an unbounded supply of ideas, and one can pump them out rapidly -- a new idea every 5 minutes. Converting an idea into reality takes 1000x or 10000x longer. This is a stereotype in software: the programmer who says "that's easy, I can do that in no time" and weeks later they're still working on it. But it's also true of reality in general: the ideas behind #BLM are not particularly sophisticated or complex, but it will take 100 million man-years to turn them into reality. (And that's a lower bound.)

--linas

On Thu, Aug 6, 2020 at 11:30 AM Matt Chapman wrote:

> You misunderstood my first comment; I was agreeing with you that
> Cassandra-backed storage & distribution won't be faster than what I thought
> you were suggesting: a client-server model where one Rocks-backed atom
> server is used by many clients who retrieve, manipulate, and return atoms
> to the central server.
>
> Maybe you're suggesting something very different, or I'm just very
> confused, because I've been hearing people talk about the need for
> distributed atomspace on and off for 8+ years, and I've never seen an
> answer along the lines of "you can already have a cluster, here's the
> documentation on how to set it up." If that was the answer, and people
> rejected it because of the lack of disk-backed persistence, then I'm
> agreeing with you that RocksDB may solve much of the problem.
>
> Maybe the unsolved part of the problem is consistency/consensus? I tend to
> agree with your sentiment that consistency is overrated for Atomspace use
> cases, often not needed or not desirable, but it seems like maybe Ben and
> others are seeking something like Tunable Consistency. Maybe this is the
> big chocolate sprinkle?
>
> >Who is "we"?
>
> Practitioners of the computing arts & sciences, in general.
>
> > We've had the ability to run distributed AtomSpaces that far exceed
> > installed RAM, running on a cluster, for more than a decade.
>
> Does it meet the 7 business requirements in Ben's document:
> https://docs.google.com/document/d/1n0xM5d3C_Va4ti9A6sgqK_RV6zXi_xFqZ2ppQ5koqco/edit
> ?
>
> Points 2 & 3 are about performance improvements; Do you believe such
> improvements are impossible, or would require more effort than the likely
> benefits would justify?
>
> Of the other 5, which already exist and which are what you call "chocolate
> sprinkles"?
>
> All the Best,
>
> Matt
>
> --
> Please interpret brevity as me valuing your time, and not as any negative
> intention.
>
>
> On Wed, Aug 5, 2020 at 11:32 PM Linas Vepstas wrote:
>
>> On Wed, Aug 5, 2020 at 11:59 PM Matt Chapman wrote:
>>
>>> > I'll bet you a bottle of good wine or other neuroactive substance that
>>> > the existing atomspace client-server infrastructure is faster than
>>> > Cassandra.
>>>
>>> No, it won't be faster,
>>
>> wrong.
>>
>>> but you'll never be able to store an atomspace bigger than what you can
>>> fit in memory
>>
>> That is also wrong.
>>
>>> on that single atomserver, and you'll never be able to perform more
>>> operations (on the canonical atomspace) in parallel than what that one
>>> atom server can support.
>>
>> That has been wrong for 12+ years.
>>
>>> Obviously distributed systems have a performance penalty.
>>
>> We've had a distributed atomspace for 12+ years.
>>
>>> We don't build them because we need to go faster (at the level of a
>>> single process), we build them because we need to go bigger (in terms of
>>> storage space or parallel processes).
>>
>> Who is "we"?
>>
>> We've had the ability to run distributed AtomSpaces that far exceed
>> installed RAM, running on a cluster, for more than a decade. People talk
>> about this as if it doesn't exist or it doesn't work or there's something
>> wrong with it, or they want something with more chocolate sprinkles on it.
>>
>> I'm annoyed. Seriously, is no one actually paying attention to anything?
>> WTF.
>>
>> --linas
>>
>>> All the Best,
>>>
>>> Matt
>>>
>>> --
>>> Please interpret brevity as me valuing your time, and not as any
>>> negative intention.
>>>
>>>
>>> On Wed, Aug 5, 2020 at 12:16 AM Linas Vepstas wrote:
>>>
>>>> LevelDB/RocksDB and Cassandra are apples and kumquats.
>>>>
>>>> LevelDB/RocksDB are C++ libraries: single-user, non-networked,
>>>> non-distributed. They link directly into the app and store data
>>>> directly in files on the local system. They are "embedded databases".
>>>> So conceptually, they are like the 50-year-old unix dbm, except that
>>>> they have 50 years of computer science behind them, such as bloom
>>>> filters and log-structured merge trees and what-not (e.g. Rocks is
>>>> explicitly optimized for SSDs). LevelDB was created by Google in 2011.
>>>> Then Facebook forked LevelDB to create RocksDB (open-sourced in 2013),
>>>> added a bunch of stuff, and made some parts run faster.
>>>>
>>>> Just like dbm, people use LevelDB/RocksDB to build *other* databases on
>>>> top of it. (That's the beauty of "embedded".) For example, there's some
>>>> version of MariaDB that uses RocksDB for the actual storage.
>>>>
>>>> Cassandra is written in Java; it's a network database, so basically
>>>> it's like postgres, except it's not postgres, because it uses CQL
>>>> instead of SQL, so it's not actually SQL-compatible. Otherwise, it has
>>>> all of the same issues that any other networked client-server database
>>>> has, including the need for an experienced DB admin to set it up, run
>>>> it, and administer it. (This is an easily forgotten but important
>>>> detail, vs. RocksDB, which just ... writes to a file. No DB admin
>>>> required.)
>>>>
>>>> For the app developer (i.e. me), one must write in a custom query
>>>> language (CQL): convert my data into CQL format and send that data via
>>>> TCP/IP to the server, which unpacks it, runs its interpreter to figure
>>>> out what I said/wanted, unpacks my data packets, converts them into its
>>>> own internal format (so, that's a second format conversion), and
>>>> actually performs whatever operations I had specified. This is
>>>> conceptually identical to *any* client-server database.
>>>>
>>>> For CQL I copy from wikipedia:
>>>>
>>>> CREATE KEYSPACE MyKeySpace
>>>>   WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
>>>> USE MyKeySpace;
>>>> CREATE COLUMNFAMILY MyColumns (id text, Last text, First text, PRIMARY KEY(id));
>>>> INSERT INTO MyColumns (id, Last, First) VALUES ('1', 'Doe', 'John');
>>>> SELECT * FROM MyColumns;
>>>>
>>>> Looks identical to SQL except it's not actually compatible. Yuck. This
>>>> offers exactly zero advantages over SQL that I can see; the fact that
>>>> it's key-value somewhere in there offers no perceivable advantage that
>>>> I can make out.
>>>>
>>>> I'll bet you a bottle of good wine or other neuroactive substance that
>>>> the existing atomspace client-server infrastructure is faster than
>>>> Cassandra. That is -- start a cogserver, as is, today, open the rocksdb
>>>> backend under it (so everything going to the cogserver gets stored), and
>>>> then let other atomspaces connect to the cogserver (using the existing
>>>> client-server code), and you will have a distributed atomspace that runs
>>>> faster than Cassandra.
>>>>
>>>> OK, it doesn't have any of those other bells-n-whistles in Cassandra,
>>>> but no one really knows how to do anything useful with those other
>>>> bells-n-whistles, other than to suggest that they might be somehow useful
>>>> in some way, maybe, for something. That surpasses my attention span.
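
To make the "second format conversion" point above concrete, here is a toy Python sketch of the two write paths. Everything in it is a hypothetical stand-in (EmbeddedStore, Client, Server, the one-line INSERT syntax); it uses no real RocksDB, Cassandra or cogserver API, and the network hop is just a function call. It only counts format conversions and says nothing about actual speed.

```python
"""Toy contrast: embedded store vs. client-server store.

Hypothetical stand-ins only; no real RocksDB or Cassandra API is used.
"""
import json


class EmbeddedStore:
    """Stands in for an embedded key-value library linked into the app."""

    def __init__(self):
        self.files = {}        # stands in for files on the local disk
        self.conversions = 0   # format conversions performed so far

    def put(self, key, value):
        # The only conversion: in-RAM value -> bytes written locally.
        self.files[key] = json.dumps(value).encode()
        self.conversions += 1


class Server:
    """Stands in for a networked database server."""

    def __init__(self):
        self.tables = {}
        self.conversions = 0

    def handle(self, wire_bytes):
        # Unpack the packet and interpret the query text.
        verb, key, payload = wire_bytes.decode().split(" ", 2)
        if verb != "INSERT":
            raise ValueError(f"unsupported query: {verb}")
        # Second conversion: query payload -> server's internal format.
        self.tables[key] = json.loads(payload)
        self.conversions += 1


class Client:
    """Stands in for the app-side database driver."""

    def __init__(self, server):
        self.server = server
        self.conversions = 0

    def put(self, key, value):
        # First conversion: in-RAM value -> query-language text.
        query = f"INSERT {key} {json.dumps(value)}"
        self.conversions += 1
        # "Send" the bytes; a real client would use a TCP socket here.
        self.server.handle(query.encode())


if __name__ == "__main__":
    atom = {"type": "ConceptNode", "name": "cat"}

    embedded = EmbeddedStore()
    embedded.put("atom-1", atom)

    server = Server()
    client = Client(server)
    client.put("atom-1", atom)

    print("embedded conversions:", embedded.conversions)
    print("client-server conversions:", client.conversions + server.conversions)
```

The embedded path converts the value once on its way to local storage; the client-server path converts it once into query text and again inside the server, with a network hop in between.
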
>>>>
>>>> --linas
>>>>
>>>>
>>>> On Tue, Aug 4, 2020 at 11:39 PM Ben Goertzel wrote:
>>>>
>>>>> I wonder how different would be the API for RocksDB vs., say,
>>>>> Cassandra which Matt Chapman has recommended (which may have some
>>>>> advantages in terms of allowing more configurable/flexible notions of
>>>>> consistency?)
>>>>>
>>>>> On Tue, Aug 4, 2020 at 4:44 PM Linas Vepstas wrote:
>>>>> >
>>>>> > On Tue, Aug 4, 2020 at 11:51 AM Ben Goertzel wrote:
>>>>> >>
>>>>> >> Wow!
>>>>> >
>>>>> > You're welcome. Querying from the database is now supported. The
>>>>> > demo is in
>>>>> > https://github.com/opencog/atomspace-rocks/blob/master/examples/query-storage.scm
>>>>> >
>>>>> > At the moment it works, but I'm rethinking the API. Do check it
>>>>> > out. Feedback, opinions, suggestions, etc. invited.
>>>>> >
>>>>> > --linas
>>>>> >
>>>>> >> On Tue, Aug 4, 2020, 8:45 AM Linas Vepstas wrote:
>>>>> >>>
>>>>> >>> On Thu, Jul 30, 2020 at 11:20 AM Ben Goertzel wrote:
>>>>> >>>>
>>>>> >>>> -- send a Pattern Matcher query to BackingStore
>>>>> >>>> -- send the Atom-chunk resulting from the query to Atomspace
>>>>> >>>
>>>>> >>> So,
>>>>> >>>
>>>>> >>> Someone needed to prove me wrong, and who better to do that but
>>>>> >>> me. I took the weekend to implement a file-based backing store, using
>>>>> >>> RocksDB (which itself is a variant on LevelDB). It's here:
>>>>> >>> https://github.com/opencog/atomspace-rocks
>>>>> >>>
>>>>> >>> -- It works, all of the old persistent store unit tests pass
>>>>> >>>    (there are 8 of them)
>>>>> >>> -- It's faster than the SQL backend by factors of 2x to 5x, depending
>>>>> >>>    on dataset. With tuning, maybe one could do better. (I have no
>>>>> >>>    plans to tune, right now.)
>>>>> >>>
>>>>> >>> I'm certain I know of a simple/easy way to "send a Pattern Matcher
>>>>> >>> query to BackingStore and send the Atom-chunk resulting from the query
>>>>> >>> to Atomspace" and will implement this afternoon (famous last words...)
>>>>> >>> BTW, you can *already* do this with the cogserver-based network client
>>>>> >>> (i.e. without sql, just the network only) here:
>>>>> >>> https://github.com/opencog/atomspace-cog/blob/master/examples/remote-query.scm
>>>>> >>>
>>>>> >>> By combining these two backends, I think you can get file-backed
>>>>> >>> storage that is also network-enabled. Or rather, you have two key
>>>>> >>> building blocks for exploring both distributed and also decentralized
>>>>> >>> designs.
>>>>> >>>
>>>>> >>> Some background info, from the README:
>>>>> >>>
>>>>> >>> AtomSpace RocksDB Backend
>>>>> >>> =========================
>>>>> >>>
>>>>> >>> Save and restore AtomSpace contents to a RocksDB database. The RocksDB
>>>>> >>> database is a single-user, local-host-only file-backed database. That
>>>>> >>> means that only one AtomSpace can connect to it at any given moment.
>>>>> >>>
>>>>> >>> In ASCII-art:
>>>>> >>>
>>>>> >>> ```
>>>>> >>>  +-------------+
>>>>> >>>  |  AtomSpace  |
>>>>> >>>  |             |
>>>>> >>>  +---- API-----+
>>>>> >>>  |             |
>>>>> >>>  |   RocksDB   |
>>>>> >>>  |    files    |
>>>>> >>>  +-------------+
>>>>> >>> ```
>>>>> >>> RocksDB (see https://rocksdb.org/) is an "embeddable persistent key-value
>>>>> >>> store for fast storage." The goal of layering the AtomSpace on top of it
>>>>> >>> is to provide fast persistent storage for the AtomSpace. There are
>>>>> >>> several advantages to doing this:
>>>>> >>>
>>>>> >>> * RocksDB is file-based, and so it is straightforward to make backup
>>>>> >>>   copies of datasets, as well as to share these copies with others.
>>>>> >>> * RocksDB runs locally, and so the overhead of pushing bytes through
>>>>> >>>   the network is eliminated. The remaining inefficiencies/bottlenecks
>>>>> >>>   have to do with converting between the AtomSpace's natural in-RAM
>>>>> >>>   format, and the position-independent format that all databases need.
>>>>> >>>   (Here, we say "position-independent" in that the DB format does not
>>>>> >>>   contain any C/C++ pointers; all references are managed with local
>>>>> >>>   unique ID's.)
>>>>> >>> * RocksDB is a "real" database, and so enables the storage of datasets
>>>>> >>>   that might not otherwise fit into RAM. This back-end does not try
>>>>> >>>   to guess what your working set is; it is up to you to load, work with
>>>>> >>>   and save those Atoms that are important for you. The [examples](examples)
>>>>> >>>   demonstrate exactly how that can be done.
>>>>> >>>
>>>>> >>> This backend, together with the CogServer-based
>>>>> >>> [network AtomSpace](https://github.com/opencog/atomspace-cog)
>>>>> >>> backend provides a building-block out of which more complex
>>>>> >>> distributed and/or decentralized AtomSpaces can be built.
>>>>> >>>
>>>>> >>> Status
>>>>> >>> ------
>>>>> >>> This is **Version 0.8.0**. All unit tests pass. All known issues
>>>>> >>> have been fixed. This could effectively be version 1.0; waiting on
>>>>> >>> user feedback.
>>>>> >>>
>>>>> >>> -- Linas
>>>>> >>>
>>>>> >>> --
>>>>> >>> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>>>>> >>>        --Peter da Silva
>>>>> >>>
>>>>> >>> --
>>>>> >>> You received this message because you are subscribed to the Google Groups "opencog" group.
>>>>> >>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>> >>> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA37Agw0cg5gJX1fDffvSAjcW1kq4LdMOSuyknaEC_41F1g%40mail.gmail.com .
>>>>>
>>>>> --
>>>>> Ben Goertzel, PhD
>>>>> http://goertzel.org
>>>>>
>>>>> “The only people for me are the mad ones, the ones who are mad to
>>>>> live, mad to talk, mad to be saved, desirous of everything at the same
>>>>> time, the ones who never yawn or say a commonplace thing, but burn,
>>>>> burn, burn like fabulous yellow roman candles exploding like spiders
>>>>> across the stars.” -- Jack Kerouac
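
The README quoted above calls the stored format "position-independent": no pointers, with all references managed through local unique IDs. A minimal Python sketch of that idea follows. It is only an illustration under stated assumptions: Node, Link, save and load are hypothetical names, a plain dict stands in for the key-value store, and the record layout is invented here. It is not the actual atomspace-rocks on-disk format.

```python
"""Toy position-independent encoding for a small graph of atoms.

Hypothetical names and record layout; not the atomspace-rocks format.
"""
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Node:
    atom_type: str
    name: str


@dataclass(frozen=True)
class Link:
    atom_type: str
    outgoing: Tuple[object, ...]   # in-RAM references to other atoms


def save(atom, store: Dict[str, str], ids: Dict[object, int]) -> int:
    """Write atom, and everything it refers to, into store; return its ID."""
    if atom in ids:
        return ids[atom]           # already stored, reuse its ID
    if isinstance(atom, Node):
        record = f"N {atom.atom_type} {atom.name}"
    else:
        # Store children first so the link can refer to them by ID,
        # never by memory address.
        child_ids = [save(child, store, ids) for child in atom.outgoing]
        record = f"L {atom.atom_type} " + " ".join(str(i) for i in child_ids)
    uid = len(ids) + 1             # next unused local ID
    ids[atom] = uid
    store[str(uid)] = record
    return uid


def load(uid: int, store: Dict[str, str]):
    """Rebuild the in-RAM atom with the given ID from its stored record."""
    kind, atom_type, rest = store[str(uid)].split(" ", 2)
    if kind == "N":
        return Node(atom_type, rest)
    return Link(atom_type, tuple(load(int(i), store) for i in rest.split()))


if __name__ == "__main__":
    cat = Node("ConceptNode", "cat")
    animal = Node("ConceptNode", "animal")
    link = Link("InheritanceLink", (cat, animal))

    store: Dict[str, str] = {}
    root = save(link, store, {})
    for key, record in store.items():
        print(key, "->", record)
    # 1 -> N ConceptNode cat
    # 2 -> N ConceptNode animal
    # 3 -> L InheritanceLink 1 2
    assert load(root, store) == link
```

Because the stored records contain only IDs and text, the dict could be copied to another machine or process and reloaded there, which is the property that makes file-based backup and sharing straightforward.
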
