I read it. I feel like you're talking about completely different things, and misrepresenting what Cassandra is. All those Node.js microservices frequently require a common data store (in addition to a local data store), and that store is, more often than not, Dynamo, Firebase, or Cassandra.
But I feel like someone said, "Which kind of plane should we take to China?" and you answered, "Fishing is best done from a boat." True, and maybe we do want to do some fishing when we get to China, but that's not at all what I'm talking about here, and I make my living building the kinds of microservice-based applications that you're describing.

So, I conclude we're talking at levels that are too abstract to resolve our communication failures, and the only path forward is to build a PoC. It's unlikely I'll have the time given my current employment, but now I know it won't happen until I do.

If I can leave only one impression: Tunable Consistency is valuable. Not Eventual Consistency. Not Cassandra or Scylla or Seastar, or Token Rings. An effective multi-mind-agent distributed atomspace probably requires Tunable Consistency, however it's implemented. Arguably, the current DB backing offers a limited form of it, but more powerful forms exist.

Matt

On Tue, Aug 11, 2020, 5:04 PM Linas Vepstas <[email protected]> wrote:

> Sigh.
>
> I dislike writing long emails because I fear no one reads them, or that they are viewed as overly aggressive and pugnacious. But until such time as we have mind-reading or neural laces, it's .. email.
>
> I want to talk about "service meshes". The problem with shopping for cassandra, or any of the other suggested databases, is that they are all "monolithic black boxes". You pick one, and you get what you get: whatever is provided, that's what it is. Sure, some configuration files somewhere allow you to tune this and that, but that's all.
>
> The service mesh idea (and the npm/js idea before that) is to assemble your system out of small, self-contained pieces. Sure, the object-oriented folks have been talking about this for 3 or 4 decades, and it's cited as the raison-d'etre for things like C++. But C++ never lived up to this ideal. There are no generic C++ frameworks. None. At All.
> (OK, so SGI had one or two in the early 1990's ...) Something is ... missing ... in C++. Compare this to node.js and npm, which are wildly successful over-achievers in this category. People regularly build large applications by assembling a cacophony of tiny little javascript parts. Clearly, javascript has something that C++ does not. Something that makes the OO dream achievable not just in theory, but regularly validated in practice.
>
> Now, there are some down-sides to npm apps: they contain hundreds or thousands of parts, and not all of them are well-maintained, and many have published security vulnerabilities that remain unpatched. Worse, patching some of them requires incompatible API changes that would break users. So it has its own prickly and thorny issues that are unique and different from those that other languages (python, scheme, c++) suffer from.
>
> In the cloud world, there has long been, and continues to be, a movement to meshes of containerized applications. Here, docker is the prototypical container -- lxc/lxd/lxe more generally. Managing these containers requires kubernetes, and more: the "service meshes" (istio, microsoft open service mesh) provide a layer (a "control plane") that further manages deployments, error fallbacks, a/b testing, circuit-breakers, load-balancing, etc. The mental model is that containerized apps are just like npm nodes, except they are a million times bigger and beefier (literally) and they all have network interfaces instead of javascript methods/objects. And since they are so much bigger, they need more active management.
>
> Now compare the service-mesh idea to the olde-fashioned ideas of "web shopping carts" or "content management systems" or "customer relationship management systems". Those things were single, monolithic black boxes that you bought from a vendor (or installed via open-source) that automagically did everything for you, once you configured a few templates.
> They worked great, as long as what you wanted was (a) a web shopping cart, and (b) was customizable via some template or config file. If not .. you were SOL.
>
> These monolithic architectures were their downfall, and were the driver to containers, kubernetes and service meshes. The founders of cloud startup XYZ can't spray-paint some config files onto a monolith and then raise $20M in venture funding. But, give them a bunch of pieces-parts containers, that they can hook up in some new, novel and exciting way, plus a little secret sauce, and buzzword-bingo, a unicorn is born.
>
> And this is why Cassandra makes me yawn with disinterest, if not a bit of hostility. It's a big monolithic block. Sure, I can take the AtomSpace, and plaster it onto Cassandra, like wrapping some wet paper around a rock. The ultimate shape is still that of the rock, no matter how brightly-colored or thoughtful that paper wrapped around it is.
>
> So, I'm trying to grab hold of this idea of pieces-parts. OpenCog needs pieces-parts that can be arranged and re-assembled into that mesh that provides the distributed-atomspace attributes and requirements du-jour.
>
> Yes, of course, singularity.net is also pursuing a vision of pieces-parts that can be assembled. Which is why I am a bit dumb-founded that we are entertaining ideas like Cassandra -- it is the very antithesis of modular architecture. It's the opposite of a dapp -- It's a big giant lump, the one ring to rule them all. It's kind of exactly the poster-child for what not to do ...
>
> For a distributed atomspace, what we really need to focus on is inter-operability, so that, like javascript (and unlike c++) it is easy to assemble modules out of other modules. Like containers, there should be some fairly regularized API for communications (I nominate atomese-as-ascii-strings i.e. s-expressions and maybe plan-B atomese-as-json).
> With this under control, we can move on to creating unique, custom services aka agents aka dapps or whatever these other things might be.
>
> Again, I nominate the building-blocks idea: I took the earlier email, and pasted it into the README, here:
> https://github.com/opencog/atomspace-agents
>
> -- Linas
>
> On Tue, Aug 11, 2020 at 5:04 PM Linas Vepstas <[email protected]> wrote:
>
>> This appears to evade/avoid acknowledging issue #1, which is the (CPU) overhead of translating between multiple formats, the competition for RAM that those formats entail, and the need to ship the resulting bytes between API's, or, worse, over (network or local) sockets.
>>
>> Sure, maybe cassandra has nice solutions for issues #2 #3 and #4, such as consistency, replication, etc. but until you address issue #1 frontally and completely, the remaining issues are utterly unimportant and even delusional.
>>
>> --linas
>>
>> On Tue, Aug 11, 2020 at 10:35 AM Ben Goertzel <[email protected]> wrote:
>>
>>> Matt,
>>>
>>> So regarding Cassandra, it's clear there are many cool things there... From what I understand, the key differentiating functionality it seems potentially able to offer would be: The ability to replicate atoms locally accompanied by eventual consistency ...
>>>
>>> As a first step, I wonder if it would make sense to try some simple experiments w/ Cassandra to see if it really does this effectively for an OpenCog context? If you or anyone else w/ Cassandra experience has time to experiment w/ this, it might be quite interesting...
>>>
>>> Is Cassandra's notion of eventual consistency significantly different from that in Amazon's DynamoDB ?
>>>
>>> It seems that in some cases in OpenCog we might want to let two versions of an Atom drift even further/longer than is commonly allowed to happen in most Dynamo-based systems...
>>> but this really comes down to, how flexible is the eventual consistency management / configuration in these things?
>>>
>>> ben
>>>
>>> On Wed, Jul 29, 2020 at 12:19 PM Matt Chapman <[email protected]> wrote:
>>> >
>>> > > Which peers?
>>> >
>>> > As determined by a token ring:
>>> >
>>> > https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/architecture/archDataDistributeDistribute.html
>>> >
>>> > I think you could almost replace "vnode" with "chunk" if you wanted to adopt the Cassandra architecture, although I wouldn't be surprised to see performance problems with a huge number of vnodes, so it might actually need to be a "chunk-hash modulo reasonable number of vnodes".
>>> >
>>> > > How do you find them?
>>> >
>>> > By calculating the partition token via consistent hash, as Cassandra does with Murmur3. This tells you the authoritative source for the chunk you want. You might also have a local cache of other peers that have had replicas of that chunk, in case any of them are more responsive to you. Cassandra calls this process of finding potential replicas "Snitching".
>>> >
>>> > > You are thinking Kademlia (as do I, when I think of publishing) or OpenDHT or IPFS.
>>> >
>>> > Nope. I've only played with IPFS a bit, but I don't expect it to be performant for the atomspace use case. I'm only vaguely familiar with OpenDHT; it seems worth exploring, but I'm sure you understand it far better than I do.
>>> >
>>> > I'm not very familiar with p2p systems like Kademlia, but I suspect that's optimized for consistency & availability over performance, so not the right choice for the atomspace.
>>> >
>>> > By this point, it should be clear that I look to Cassandra for how semi-consistent distributed data storage systems should be designed. (Fwiw, my inspiration for distributed messaging systems comes mostly from Apache Kafka.)
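
The token-ring lookup described above ("Which peers?" / "How do you find them?") can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cassandra's implementation: `TokenRing`, `token`, the peer names and the vnode count are all hypothetical, and BLAKE2b from the Python standard library stands in for Murmur3, which the standard library does not provide.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a 64-bit ring position (BLAKE2b as a stand-in for Murmur3)."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class TokenRing:
    """Hypothetical ring in which every peer owns several virtual nodes (vnodes)."""

    def __init__(self, peers, vnodes_per_peer=8):
        # Each vnode gets its own token; sorting the pairs forms the ring.
        self._ring = sorted(
            (token(f"{peer}#{i}"), peer)
            for peer in peers
            for i in range(vnodes_per_peer)
        )
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, chunk_id: str, n: int = 3):
        """Walk clockwise from the chunk's token, collecting up to n distinct peers."""
        start = bisect.bisect_right(self._tokens, token(chunk_id))
        found = []
        for step in range(len(self._ring)):
            peer = self._ring[(start + step) % len(self._ring)][1]
            if peer not in found:
                found.append(peer)
                if len(found) == n:
                    break
        return found

    def owner(self, chunk_id: str):
        """The first peer clockwise is treated as the authoritative source."""
        return self.replicas(chunk_id, n=1)[0]
```

Calling `TokenRing(["peer-a", "peer-b", "peer-c"]).replicas("chunk-42")` needs no directory lookup and no broadcast: any peer that knows the membership list computes the same answer locally. Removing a peer only reassigns the chunks that peer owned, which is the property that makes the scheme attractive compared with a plain hash-modulo-peer-count mapping.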
>>> >
>>> > > Which is great, if all you're doing is publishing small amounts of static, infrequently-changing information. Not so much, if interacting or blasting out millions of updates. Neither system can handle that -- literally -- tried that, been there, done that. They are simply not designed for that.
>>> >
>>> > Cassandra is. To be fair, Cassandra is optimized for massive scale, which may involve some trade-offs that are not desirable for present-day atomspace use cases.
>>> >
>>> > See also ScyllaDB, a C++ reimplementation of Cassandra.
>>> >
>>> > > Now, perhaps using only a hash-driven system, it is possible to overcome these issues. I do not know how to do this. Perhaps someone does -- perhaps there are even published papers ... I admit I did not do a careful literature search.
>>> >
>>> > http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
>>> > http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
>>> >
>>> > Matt
>>> >
>>> > On Wed, Jul 29, 2020, 9:37 AM Linas Vepstas <[email protected]> wrote:
>>> >>
>>> >> On Wed, Jul 29, 2020 at 1:09 AM Matt Chapman <[email protected]> wrote:
>>> >>>
>>> >>> > I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage.
>>> >>>
>>> >>> > You don't want broadcast messages going out to the whole universe.
>>> >>>
>>> >>> Not sure if you intended to imply it, but the reality of the first statement need not require the 2nd statement. Hashes of atoms/chunks can be mapped via modulo onto hashes of peer IDs so that messages need only go to one or few peers.
>>> >>
>>> >> Which peers? How do you find them? You are thinking Kademlia (as do I, when I think of publishing) or OpenDHT or IPFS.
>>> >> Which is great, if all you're doing is publishing small amounts of static, infrequently-changing information. Not so much, if interacting or blasting out millions of updates. Neither system can handle that -- literally -- tried that, been there, done that. They are simply not designed for that.
>>> >>
>>> >> Now, perhaps using only a hash-driven system, it is possible to overcome these issues. I do not know how to do this. Perhaps someone does -- perhaps there are even published papers ... I admit I did not do a careful literature search.
>>> >>
>>> >> But, basically, before we are even out of the gate, we already have a snowball of problems with no obvious solution. Haven't even written any code, and are beset by technical problems. That's not an auspicious beginning.
>>> >>
>>> >> If you have something more specific, let me know. Right now, I simply don't know how to do this.
>>> >>
>>> >> --linas
>>> >>>
>>> >>> Specialization has a cost, in that you need to maintain some central directory or gossip protocol so that peers can learn which other peers are specialized to which purpose.
>>> >>>
>>> >>> An ideal general intelligence network may very well include both a large number of generalist, undifferentiated peers and clusters of highly interconnected specialized peers. If peers are neurons, I think this describes the human nervous system also, no?
>>> >>>
>>> >>> To borrow terms from my previous message, generalist peers own many atoms, and replicate few, while specialist peers own few or none, but replicate many.
>>> >>>
>>> >>> Matt
>>> >>>
>>> >>> On Tue, Jul 28, 2020, 10:36 PM Linas Vepstas <[email protected]> wrote:
>>> >>>>
>>> >>>> On Tue, Jul 28, 2020 at 11:41 PM Ben Goertzel <[email protected]> wrote:
>>> >>>>>
>>> >>>>> Hmm...
>>> >>>>> you are right that OpenCog hypergraphs have natural chunks defined by recursive incoming sets. However, I think these chunks are going to be too small, in most real-life Atomspaces, to serve the purpose of chunking for a distributed Atomspace.
>>> >>>>>
>>> >>>>> I.e. it is true that in most cases the recursive incoming set of an Atom should all be in the same chunk. But I think we will probably need to deal with chunks that are larger than the recursive incoming set of a single Atom, in very many cases.
>>> >>>>
>>> >>>> I like the abstract to the Ja-be-ja paper, will read and ponder. It sounds exciting.
>>> >>>>
>>> >>>> But ... the properties of a chunk depend on what you want to do with it.
>>> >>>>
>>> >>>> For example: if some peer wants to declare a list of everything it holds, then clearly, creating a list of all of its atoms is self-defeating. But if some user wants some specific chunk, well, how does the user ask for that? How does the user know what to ask for? How does the user say "hey I want that chunk which has these contents"? Should the user say "deliver to me all chunks that contain Atom X"? If the user says this, then how does the peer/server know if it has any chunks with Atom X in it? Does the peer/server keep a giant index of all atoms it has, and what chunks they are in? Is every peer/server obliged to waste some CPU cycles to figure out if it's holding Atom X? This gets yucky, fast.
>>> >>>>
>>> >>>> This is where QueryLinks are marvelous: the Query clearly states "this is what I want" and the query is just a single Atom, and it can be given an unambiguous, locally-computable (easily-computable; we already do this) 80-bit or a 128-bit (or bigger) hash and that hash can be blasted out to the network (I'm thinking Kademlia, again) in a compact way - it's not a lot of bytes.
>>> >>>> The request for the "query chunk" is completely unambiguous, and the user does not have to make any guesses whatsoever about what may be contained in that chunk. Whatever is in there, is in there. This solves the naming problem above.
>>> >>>>
>>> >>>>> What happens when the results for that (new) BindLink query are spread among multiple peers on the network in some complex way?
>>> >>>>
>>> >>>> I'm going to avoid this question for now, because "it depends" and "not sure" and "I have some ideas".
>>> >>>>
>>> >>>> My gut impulse is that the problem splits into two parts: first, find the peers that you want to work with, second, figure out how to work with those peers.
>>> >>>>
>>> >>>> The first part needs to be fairly static, where a peer can advertise "hey this is the kind of data I hold, this is the kind of work I'm willing to perform." Once a group of peers is located, many of the scaling issues go away: groups of peers tend to be small. If they are not, you organize them hierarchically, the way you might organize people, with specialists for certain tasks.
>>> >>>>
>>> >>>> I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage. I think we'll run into all sorts of conceptual difficulties and design problems if you try to do that. If nothing else, it starts smelling like quorum-sensing in bacteria. Which is not an efficient way to communicate. You don't want broadcast messages going out to the whole universe. Think instead of atomspaces connecting to one-another like dendrites and axons: a limited number, a small number of connections between atomspaces, but point-to-point, sharing only the data that is relevant for that particular peer-group.
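
The QueryLink naming idea above (hash the query, and use that hash as the name of the chunk being requested) might look roughly like this. It is a sketch under assumptions, not the hash the AtomSpace actually computes: `query_key` is a hypothetical helper, BLAKE2b is an arbitrary choice of hash, and the whitespace normalization is a simplification that would also alter whitespace inside quoted node names.

```python
import hashlib


def query_key(sexpr: str, bits: int = 128) -> str:
    """Return a hex key naming the chunk that answers the given query string."""
    # Collapse runs of whitespace so that line breaks and indentation in the
    # s-expression do not change the key (an assumed, simplistic canonical form).
    canonical = " ".join(sexpr.split())
    digest = hashlib.blake2b(canonical.encode("utf-8"), digest_size=bits // 8)
    return digest.hexdigest()


# Illustrative string only; it is not validated as well-formed Atomese.
EXAMPLE_QUERY = """
(QueryLink
    (VariableNode "$x")
    (InheritanceLink (VariableNode "$x") (ConceptNode "animal")))
"""
```

`query_key(EXAMPLE_QUERY)` yields 32 hex digits (128 bits), and `query_key(EXAMPLE_QUERY, bits=80)` yields 20. The requester never has to guess what the chunk contains; it only needs the query text, and any peer can recompute the same key.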
>>> >>>>
>>> >>>> -- Linas
>>> >>>>
>>> >>>> --
>>> >>>> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>>> >>>> --Peter da Silva
>>> >>>>
>>> >>>> --
>>> >>>> You received this message because you are subscribed to the Google Groups "opencog" group.
>>> >>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>> >>>> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA35zN4aaSrZ2Dpu4qLUL1bYfjAF_rGiS_xxg2-E-SBqY3Q%40mail.gmail.com.
>>>
>>> --
>>> Ben Goertzel, PhD
>>> http://goertzel.org
>>>
>>> “The only people for me are the mad ones, the ones who are mad to live, mad to talk, mad to be saved, desirous of everything at the same time, the ones who never yawn or say a commonplace thing, but burn, burn, burn like fabulous yellow roman candles exploding like spiders across the stars.” -- Jack Kerouac

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAPE4pjC7D%2BQLwZbAKbZe2Mi3dftw9_o%2BMATbgnvoqN7hZ_wp8A%40mail.gmail.com.
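
To make the Tunable Consistency point at the top of this message a little more concrete, here is a toy in-memory model. Everything in it is an assumption made for illustration: `TunableStore` and `Replica` are hypothetical names, the single version counter stands in for the timestamps or vector clocks a real system would need, and nothing here reflects how Cassandra or the AtomSpace is implemented. The only idea it shows is that the caller chooses, per operation, how many replicas must take part (`w` for writes, `r` for reads).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    store: dict = field(default_factory=dict)  # key -> (version, value)
    up: bool = True


class TunableStore:
    """Toy replicated store with per-operation consistency levels."""

    def __init__(self, n: int = 3):
        self.replicas = [Replica() for _ in range(n)]
        self._version = 0  # simplification: one global version counter

    def write(self, key, value, w: int) -> None:
        """Send the write to every reachable replica; fail unless w can acknowledge."""
        live = [rep for rep in self.replicas if rep.up]
        if len(live) < w:
            raise RuntimeError(f"only {len(live)} replicas reachable, need {w}")
        self._version += 1
        for rep in live:
            rep.store[key] = (self._version, value)

    def read(self, key, r: int):
        """Consult r reachable replicas and return the newest value among them."""
        live = [rep for rep in self.replicas if rep.up]
        if len(live) < r:
            raise RuntimeError(f"only {len(live)} replicas reachable, need {r}")
        answers = [rep.store.get(key, (0, None)) for rep in live[:r]]
        return max(answers, key=lambda pair: pair[0])[1]
```

With three replicas, suppose "v1" is written everywhere, then "v2" is written with `w=2` while the first replica is down, and then that replica comes back. A read with `r=1` may land on the stale replica and return "v1", whereas a read with `r=2` must overlap the write set (because r + w > n) and returns "v2". The tuning knob is that an agent which can tolerate drift pays for one replica, and an agent which cannot pays for a quorum.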
