Hi Anton,

On Wed, Jan 16, 2019 at 11:10 AM Anton Kolonin @ Gmail <[email protected]> wrote:
> From strategic product/service delivery perspective having AtomSpace as
> hyper-graph database (storage) layer isolated from application (business
> logic) layer would make a lot of sense, if that is possible.

I think this was always the case. The choice of words "business logic" is kind-of funny. It's quite accurate, but I've found that there is a class of catch-phrases that Ben thinks are boring (he will literally walk out of the room), and I think this is one of them.

We've never tried to make the AtomSpace a business product. We've never found a way to talk about the atomspace in a business-obvious, developer-obvious kind of way (compare, again, to grakn.ai). This means that almost all developers struggle mightily to figure out how to use it, and almost always fail. Compare, again, to grakn.ai: it's not just tutorials and demos and examples and documentation; it's the need to create an example that *everyone* can relate to, and make that the primary example.

> As I see it, one of the problems preventing this nice architectural
> isolation between the layers is atom type hierarchy which is bound to
> low-level implementation of the AtomSpace (storage) concepts on one end
> and to high-level (business logic) aspects like NLP on the other end.

What's the problem? Per a different thread, there is a need for a better FFI (or rather a "foreign type interface" rather than a "foreign function interface") but I think this is a well-defined, easily solvable problem that no one has been interested in solving, until now.

> Ideally, I would imagine having AtomSpace as a C/C++ graph database
> loadable with any atom type hierarchy, isolated from any specific atom
> type hierarchies like one used in OpenCog.

But that is the case already, is it not? For example, agi-bio has its own hierarchy; the atomspace does not know about it, but it can store it and load it and pattern-match it and backward-chain with it just fine.
My question might be rhetorical; I know of many things that are wrong, incomplete, poorly implemented ... but it is hard to figure out which of these are important, and which are not. So you'd have to make clear which ones are the important ones.

> On top of this, there could be separate projects and any applications
> and scripts in any languages such as Scheme or C/C++ or whaterve,
> loading any atom type hierarchies into with any AGI/NLP/etc. applications.

I think that has been possible since about "forever", so you would have to give a more detailed example.

Now, there are many things that one could do to make the atomspace better/easier for "ordinary" users. I've thought a lot about these. But doing so takes focus and effort. The historical focus has been on PLN and various conceptions of AGI, and essentially zero focus on "normal" applications.

> But there is more important thing that is concerning me regarding the
> architecture. To my understanding, unlikely any conventional database
> used in industry, the OpenCog is not supposed to work multi-user
> environments. For instance if you have SQL table about animals, you may
> have multiple users querying different segments of the table related to
> different animals.

Do you mean "access permissions"? Read-write? There's some minimal support for that; you can have a read-only atomspace (e.g. some huge genome dataset) and then a read-write layer on top of it (so that some scientist can modify portions of the dataset, without screwing up the total, and without having to make a private, personal copy of the huge dataset.) This works now, but it's minimal; no fancy features.

If you mean "atomic update", then no; the atomspace is more BASE-like than ACID-like. This could be interesting to talk about.

If you mean "table schemas", we've got a prototype of that, called "deep types". In SQL you must always have a table schema.
In prolog/datalog, you never need a schema (and I'm not sure it is even possible to specify a schema in those languages). I promise not to mention XML schema. Ooops. Same idea - you can write XML without a schema, but there are people who insist that their app has to have one, and so -- XML schemas. We have a sketch for that in the atomspace -- see the wiki page on type constructors. The basics work. No one actually uses it for anything.

If you mean inner and outer joins, the pattern matcher already does that, automatically.

> Seemingly, it does not work the same way in OpenCog - if two independent
> users start MST-Parsing on two different corpora, they will have data
> messed up together,

Why? Open a bug; this should work perfectly. Once, long long ago, I ran parsing in parallel on 3 different machines; the data was not "messed", it summed up very nicely. I have not actually tried this (or even thought about it) with the current pipeline in opencog/nlp, so yes, there may be bugs. They should be fixed. There's a potential performance penalty from syncing too often. There might be issues with atomic updates; I don't think we have atomic counters fully implemented (work for that was started, but not finished) but you can certainly do language learning without atomic counters.

> if they start inference or pattern matching activity
> on different topics, the topics will be messed up together.

Huh? This should work perfectly. Open a bug.

> The way it
> is supposed to get solved is having different AtomSpaces for each corpus

??

> or for each inference process but the AtomSpaces are really heavyweight
> and you can not create AtomSpaces dynamically for the user sessions.

?? Why can't you create atomspaces dynamically?

cog-new-atomspace
cog-push-atomspace
cog-pop-atomspace
cog-atomspace-readonly?
cog-set-atomspace!

and 8 more of these kinds of functions.
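
As a rough illustration of the layering and per-session creation discussed above, here is a toy model in Python. It is not the AtomSpace API: `LayeredSpace` and `new_session` are hypothetical names invented for this sketch, and a plain dict stands in for real atom storage. It only shows the idea of one shared read-only base with cheap, independent read-write layers on top, one per user.

```python
class LayeredSpace:
    """Toy stand-in for an atomspace layered over an optional base.

    Hypothetical illustration only; not the real AtomSpace API.
    """

    def __init__(self, base=None):
        self.base = base        # parent layer, consulted on lookup misses
        self.readonly = False   # when True, add() is refused
        self._atoms = {}        # local contents of this layer only

    def add(self, name, value):
        if self.readonly:
            raise PermissionError("space is read-only")
        # Writes always land in this layer; the base is never touched.
        self._atoms[name] = value

    def get(self, name):
        # Local contents shadow the base; otherwise fall through to it.
        if name in self._atoms:
            return self._atoms[name]
        if self.base is not None:
            return self.base.get(name)
        raise KeyError(name)


def new_session(base):
    """Create an empty read-write layer over `base` (nothing is copied)."""
    return LayeredSpace(base=base)


# A large shared dataset, loaded once and then frozen.
genome = LayeredSpace()
genome.add("gene-A", "original annotation")
genome.readonly = True

# Two independent user sessions over the same base.
alice = new_session(genome)
bob = new_session(genome)
alice.add("gene-A", "alice's revision")
```

In this sketch each session sees its own edits plus everything in the base, and creating a session copies nothing from the base, so its cost does not depend on how large the base is.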
> Well, we may have pool of N AtomSpaces serving queue of M users, so if N
> < M then M-N users are staying in queue. But I anticipate that context
> AtomSpace initialization for every user coming from the queue could be
> as expensive as creation of the new AtomSpace for every user...

I don't understand. You can do this just fine. There's already a pool of temporary atomspaces. You'd have to define what "expensive" means. I think you can create an atomspace in milliseconds or maybe tens of milliseconds at most. Destroying a database with millions of atoms in it is slow ... but that is a different issue.

> Something to get addressed before considering exposure of OpenCog-based
> services to SingularityNET.

There are 1001 things that generic databases do that the atomspace does not do, or, at least, not efficiently, quickly, easily. It would be nice to have those features. Up until now, there has been very low demand for these, because the user base is tiny.

There is one very very important issue that is being ignored: we need to be able to load the ghost rules for Sophia much more quickly than we do. Last time someone measured, it was unacceptably slow. I'm not sure of what the status on that is.

-- Linas

--
cassette tapes - analog TV - film cameras - you
