I find this statement confusing: "I can’t write a pattern matching query to retrieve an atom using its id/name". There is one and only one such atom, ever, by definition... There is nothing to query; if you know the name, you know the atom.
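To make that point concrete, here is a minimal sketch in plain Python. It is a toy model, not the AtomSpace API: the class `ToyAtomSpace` and its methods `add_node` and `get_node` are hypothetical names invented for illustration, and the sketch assumes an atom is identified by its type together with its name.

```python
class ToyAtomSpace:
    """Toy stand-in for an atomspace: at most one atom per (type, name)."""

    def __init__(self):
        self._nodes = {}  # (type, name) -> the single stored atom

    def add_node(self, atom_type, name):
        # Adding the same (type, name) again returns the existing atom
        # instead of creating a second copy.
        key = (atom_type, name)
        return self._nodes.setdefault(key, (atom_type, name))

    def get_node(self, atom_type, name):
        # Pure lookup: returns None for a missing atom and never creates it.
        return self._nodes.get((atom_type, name))


space = ToyAtomSpace()
a = space.add_node("Concept", "Uniprot: 1234")
b = space.add_node("Concept", "Uniprot: 1234")
missing = space.get_node("Concept", "Uniprot: 9999")
```

In this toy, `a` and `b` are the very same object, so knowing the type and name is all that is needed; `get_node` shows the separate question of asking whether an atom exists without creating it.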
There was talk previously about "substring matching": for example, you have atoms named "Uniprot: 1234" and "Uniprot: 5678", and you want to find all atoms whose names start with the eight characters "Uniprot:". There are (at least) three solutions for this. One is to create a RegexNode, but this is ugly from a theoretical standpoint. A second is to create a UniProtNode and use that; queries are then simple, because you just ask for all UniProtNodes. A third (recommended) way is to write

    (MemberLink (Node "Uniprot: 1234") (Concept "the-set-of-all-uniprots"))

This third way is recommended because, in a sense, the atomspace is nothing but one giant network of interconnected partial indexes. There is an index from (Node "Uniprot: 1234") to everything that makes use of it; it's called "the incoming set", and it is a real index (a C++ std::set, if I recall). The same holds for (Concept "the-set-of-all-uniprots"). What the pattern matcher "actually does" is stitch together these partial indexes into a whole, and then prune away the irrelevant parts.

-- Linas

... unless you mean "can I ask if (Node "uniprot: 1234") exists, without accidentally creating it if it does not?" ... You can do this from the C++, Scheme and Python APIs, but you cannot do this in Atomese.

On Thu, Aug 27, 2020 at 4:07 AM Abdulrahman Semrie <[email protected]> wrote:

> > TL;DR: you can already do that. It's already supported.
>
> It’s partially supported. As you’ve described, we can cache the result of
> a pattern matching query and it is already supported. However, since I
> can’t write a pattern matching query to retrieve an atom using its id/name
> from the atomspace, there is no way to cache/index. If there were some
> ExistsLink that inherits from QueryLink, which you could use to retrieve an
> atom by its name if it exists or return a false truth value, then what
> you’ve described could be done.
>
> --
>
> Regards,
>
> Abdulrahman Semrie
>
> On Thursday, Aug 27, 2020 at 2:46 AM, Linas Vepstas <[email protected]> wrote:
> TL;DR: you can already do that. It's already supported.
>
> Please follow me on this train of thought.
>
> 1) What is an "index"? Well, it's a pre-defined cache of all atoms of some
> shape or pattern.
>
> 2) How can one specify an index? Well, if it's a pattern, then a pattern
> query can be used.
>
> 3) Where should the index be stored, or kept? Well, it can be stored or
> kept with the pattern that defines the shape of the index.
>
> Before I move on to the next thought, let me point out that 1-2-3 can be
> directly solved today. Define a pattern, e.g. a query link. Run it. Store
> the results on the query, as a value. You can "do this yourself", today;
> it's easy, but it becomes even easier if you are willing to read the docs
> for `cog-execute-cache!` (appended below).
>
> 4) How should the index be updated? Ah, well, that is actually the tricky
> question, the hard question, the place where all of the interesting
> technology debates and thinking are centered. One strategy is to update
> the index every single time an Atom is added to/removed from the atomspace.
> But recomputing the index every time is wildly inefficient, burning through
> vast quantities of CPU time. What else can one do? Well, maybe recompute on
> demand. Or recompute every few minutes. Or maybe once a night (aka
> "eventually consistent"). Maybe store a time-stamp on the index, to tell
> you how old it is. Or maybe have an append-only log of atomspace changes...
> I can propose many different kinds of solutions. They all have space and
> time overhead, and/or assorted usability issues. Which of these best suits
> your needs, I have trouble guessing, so you would have to explain what the
> problem is (if any).
>
> --linas
>
> Here's the docs:
> cog-execute-cache! EXEC KEY [METADATA [FRESH]]
>
> Execute or return cached execution results.
> This is a caching version of the `cog-execute!` call.
>
> If the optional FRESH boolean flag is #f, then if there is a Value
> stored at KEY on EXEC, return that Value. The default value of FRESH
> is #f, so the default behavior is always to return the cached value.
> If the optional FRESH boolean flag is #t, or if there is no Value
> stored at KEY, then the `cog-execute!` function is called on EXEC,
> and the result is stored at KEY.
>
> The METADATA Atom is optional. If it is specified, then metadata
> about the execution is placed on EXEC at the key METADATA.
> Currently, this is just a timestamp of when this execution was
> performed. The format of the meta-data is subject to change; this
> is currently an experimental feature, driven by user requirements.
>
> At this time, execution is synchronous. It may be worthwhile to have
> an asynchronous version of this call, where the execution is performed
> at some other time. This has not been done yet.
>
> On Wed, Aug 26, 2020 at 7:41 AM Abdulrahman Semrie <[email protected]>
> wrote:
>
>> In the current atomspace, atoms are indexed by their type, i.e., given a
>> type we can retrieve all the atoms that have that type. But there is no
>> other way of adding custom indices in the atomspace. For example, if we
>> want to index nodes by their name, there is no way of doing this.
>>
>> As discussed in this issue
>> <https://github.com/MOZI-AI/annotation-scheme/issues/192>, we plan to
>> expand the annotation-service, which uses the AtomSpace to store genomics
>> data, to support the annotation of more types in addition to genes.
>> Currently, when a user submits a list of ids to the service, it is assumed
>> that these ids/symbols represent `GeneNode`s. But in the case where the
>> input can be a protein, a drug molecule, a pathway or a gene, there is no
>> direct way of retrieving the type of the atom with the given name
>> unless we iterate through all atoms searching for that particular id.
>> This isn't a good approach from a performance standpoint. But if we had a
>> custom index, e.g. `name_index`, on the ids/names of the atoms, it would be
>> easier to search the atoms by name and identify the type that the atom
>> belongs to.
>>
>> Hence, if there is a way to add custom indices to the atomspace, it will
>> greatly simplify some searches. Or maybe there is a way to do what I
>> described above without the need for an index. If so, please share it.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "opencog" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/opencog/27892502-0dfb-4042-a805-30a1520f6250n%40googlegroups.com
>
> --
> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>         --Peter da Silva
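As a postscript, the MemberLink suggestion and the FRESH flag of `cog-execute-cache!` discussed in this thread can be modelled in a few lines of plain Python. This is only a toy sketch under stated assumptions: `ToyIndexedSpace`, `add_link`, `members_of` and `execute_cached` are hypothetical names, not part of any OpenCog API, and the real incoming set and pattern matcher do far more than this.

```python
from collections import defaultdict


class ToyIndexedSpace:
    """Toy model: atoms are tuples, and every atom has an incoming set."""

    def __init__(self):
        self.incoming = defaultdict(set)  # atom -> links that contain it

    def add_link(self, link_type, *outgoing):
        link = (link_type, *outgoing)
        for atom in outgoing:
            self.incoming[atom].add(link)  # each atom's partial index grows
        return link

    def members_of(self, set_atom):
        # Follow the incoming set of the set atom and keep only the
        # MemberLinks that have it in the second position.
        return sorted(
            link[1]
            for link in self.incoming[set_atom]
            if len(link) == 3 and link[0] == "MemberLink" and link[2] == set_atom
        )


def execute_cached(cache, key, query, fresh=False):
    # Modelled on the FRESH flag in the quoted docs: reuse the stored result
    # unless fresh is True or nothing is stored under key yet.
    if fresh or key not in cache:
        cache[key] = query()
    return cache[key]


space = ToyIndexedSpace()
uniprots = ("Concept", "the-set-of-all-uniprots")
space.add_link("MemberLink", ("Concept", "Uniprot: 1234"), uniprots)
space.add_link("MemberLink", ("Concept", "Uniprot: 5678"), uniprots)

cache = {}
query = lambda: space.members_of(uniprots)
first = execute_cached(cache, "uniprots", query)

# A new member is added; the cached answer is stale until refreshed.
space.add_link("MemberLink", ("Concept", "Uniprot: 9999"), uniprots)
stale = execute_cached(cache, "uniprots", query)
refreshed = execute_cached(cache, "uniprots", query, fresh=True)
```

Here `stale` still holds two members while `refreshed` holds three, which is the "how should the index be updated?" question from point 4 in miniature: the cache is cheap to read, but something has to decide when to recompute it.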
