> A second is to create a UniProtNode and use that; queries are then simple because you just ask for all UniprotNodes.
We are already using this approach. We have added new, data-source-specific types to the atomspace and we use those types in pattern matching queries.

> A third (recommended) way is to write (MemberLink (Node "Uniprot: 1234") (Concept "the-set-of-all-uniprots"))

Can you please explain why this approach is recommended compared to the second one? Doesn't using this approach add many links that can be avoided by having a specific type?

> ... unless you mean "can I ask if (Node "uniprot: 1234") exists, without accidentally creating it if it does not?"

More like "can I ask if any node with name "uniprot:1234" exists? If so, can you return that node."

> you can do this from the C++, scheme and python API's, but you cannot do this in Atomese.

If I know the type and the name, yes, I can do this from C++, scheme and python. I'm actually doing this in the C++ code for the rpc server. But in the case I'm describing, I only know the name and not the type. And to create a Handle to retrieve the atom, I need both the type and the name.

On Thursday, August 27, 2020 at 8:33:29 PM UTC+3 linas wrote:

> I just provided three different solutions to that task... -- linas
>
> On Thu, Aug 27, 2020 at 11:14 AM Ben Goertzel <[email protected]> wrote:
>
>> I think perhaps what Xabush wants is to be able to query
>>
>> " Find me all Atoms whose name string contains the substring "ABDPDQ". "
>>
>> even if he doesn't know what types these Atoms may be ?
>>
>> ben
>>
>> On Thu, Aug 27, 2020 at 9:09 AM Linas Vepstas <[email protected]> wrote:
>>
>>> This statement I find confusing: "I can’t write a pattern matching query
>>> to retrieve an atom using its id/name" There is one and only one such atom,
>>> ever, by definition... There is nothing to query; if you know the name, you
>>> know the atom.
>>>
>>> There was talk previously about "substring matching", for example, you
>>> have atoms named "Uniprot: 1234" and "Uniprot: 5678" and you want to find
>>> all atoms that start with the eight characters "Uniprot:". There are (at
>>> least) three solutions for this. One is to create a RegexNode, but this is
>>> ugly from a theoretical standpoint. A second is to create a UniProtNode and
>>> use that; queries are then simple because you just ask for all
>>> UniprotNodes. A third (recommended) way is to write
>>> (MemberLink (Node "Uniprot: 1234") (Concept "the-set-of-all-uniprots"))
>>>
>>> This third way is recommended because, in a sense, the atomspace is
>>> nothing but one giant network of interconnected partial indexes. There is
>>> an index from (Node "Uniprot: 1234") to everything that makes use of it --
>>> it's called "the incoming set" and it is a real index - a C++ std::set if I
>>> recall. Same for (Concept "the-set-of-all-uniprots"), and what the pattern
>>> matcher "actually does" is to stitch together these partial indexes into a
>>> whole, and then prune away the irrelevant parts.
>>>
>>> -- Linas
>>>
>>> ... unless you mean "can I ask if (Node "uniprot: 1234") exists, without
>>> accidentally creating it if it does not?" ... you can do this from the C++,
>>> scheme and python API's, but you cannot do this in Atomese.
>>>
>>> On Thu, Aug 27, 2020 at 4:07 AM Abdulrahman Semrie <[email protected]> wrote:
>>>
>>>> TL;DR: you can already do that. It's already supported.
>>>>
>>>> It’s partially supported. As you’ve described, we can cache the result
>>>> of a pattern matching query and it is already supported. However, since I
>>>> can’t write a pattern matching query to retrieve an atom using its id/name
>>>> from the atomspace, there is no way to cache/index.
>>>> If there was some ExistsLink that inherits from QueryLink that you can use
>>>> to retrieve an atom by its name if it exists or return a false truth value,
>>>> then what you’ve described can be done.
>>>>
>>>> --
>>>>
>>>> Regards,
>>>>
>>>> Abdulrahman Semrie
>>>> <https://canarymail.io>
>>>>
>>>> On Thursday, Aug 27, 2020 at 2:46 AM, Linas Vepstas <[email protected]> wrote:
>>>> TL;DR: you can already do that. It's already supported.
>>>>
>>>> Please follow me on this train of thought.
>>>>
>>>> 1) What is an "index"? Well, it's a pre-defined cache of all atoms of
>>>> some shape or pattern.
>>>>
>>>> 2) How can one specify an index? Well, if it's a pattern, then a
>>>> pattern query can be used.
>>>>
>>>> 3) Where should the index be stored, or kept? Well, it can be stored or
>>>> kept with the pattern that defines the shape of the index.
>>>>
>>>> Before I move on to the next thought, let me point out that 1-2-3 can
>>>> be directly solved today. Define a pattern, e.g. a query link. Run it.
>>>> Store the results on the query, as a value. You can "do this yourself",
>>>> today, it's easy, but it becomes even easier if you are willing to read the
>>>> docs for `cog-execute-cache!` (appended below)
>>>>
>>>> 4) How should the index be updated? Ah, well, that is actually the
>>>> tricky question, the hard question, the place where all of the interesting
>>>> technology debates and thinking are centered. One strategy is to update
>>>> the index every single time an Atom is added to/removed from the atomspace.
>>>> But recomputing the index every time is wildly inefficient, burning through
>>>> vast quantities of CPU time. What else can one do? Well, maybe recompute on
>>>> demand. Or recompute every few minutes. Or maybe once a night. (aka
>>>> "eventually consistent") Maybe store a time-stamp on the index, to tell
>>>> you how old it is. Or maybe have an append-only log of atomspace changes...
>>>> I can propose many different kinds of solutions. They all have space and
>>>> time-overhead, and/or assorted usability issues. Which of these best suits
>>>> your needs, I have trouble guessing, so you would have to explain what the
>>>> problem is (if any).
>>>>
>>>> --linas
>>>>
>>>> Here's the docs:
>>>> cog-execute-cache! EXEC KEY [METADATA [FRESH]]
>>>>
>>>> Execute or return cached execution results. This is a caching version
>>>> of the `cog-execute!` call.
>>>>
>>>> If the optional FRESH boolean flag is #f, then if there is a Value
>>>> stored at KEY on EXEC, return that Value. The default value of FRESH
>>>> is #f, so the default behavior is always to return the cached value.
>>>> If the optional FRESH boolean flag is #t, or if there is no Value
>>>> stored at KEY, then the `cog-execute!` function is called on EXEC,
>>>> and the result is stored at KEY.
>>>>
>>>> The METADATA Atom is optional. If it is specified, then metadata
>>>> about the execution is placed on EXEC at the key METADATA.
>>>> Currently, this is just a timestamp of when this execution was
>>>> performed. The format of the meta-data is subject to change; this
>>>> is currently an experimental feature, driven by user requirements.
>>>>
>>>> At this time, execution is synchronous. It may be worthwhile to have
>>>> an asynchronous version of this call, where the execution is performed
>>>> at some other time. This has not been done yet.
>>>>
>>>> On Wed, Aug 26, 2020 at 7:41 AM Abdulrahman Semrie <[email protected]> wrote:
>>>>
>>>>> In the current atomspace, atoms are indexed by their type, i.e., given a
>>>>> type we can retrieve all the atoms that have that type. But there is no
>>>>> other way of adding custom indices in the atomspace. For example, if we
>>>>> want to index nodes by their name, there is no way of doing this.
>>>>>
>>>>> As discussed in this issue
>>>>> <https://github.com/MOZI-AI/annotation-scheme/issues/192>, we plan to
>>>>> expand the annotation-service, which uses the AtomSpace to store genomics
>>>>> data, to support the annotation of more types in addition to genes.
>>>>> Currently, when a user submits a list of ids to the service, it is assumed
>>>>> that these ids/symbols represent `GeneNode`s. But in the case where the
>>>>> input can be a protein, a drug molecule, a pathway or a gene, there is no
>>>>> direct way of retrieving the type of the atom with the given name
>>>>> unless we iterate through all atoms searching for that particular id. This
>>>>> isn't a good approach from a performance standpoint. But if we had a
>>>>> custom index, e.g. `name_index`, on the ids/names of the atoms, it would be
>>>>> easier to search the atoms by name and identify the type that the atom
>>>>> belongs to.
>>>>>
>>>>> Hence, if there is a way to add custom indices to the atomspace, it
>>>>> will greatly simplify some searches. Or maybe there is a way to do what I
>>>>> described above without the need for an index. If so, please share it.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "opencog" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/opencog/27892502-0dfb-4042-a805-30a1520f6250n%40googlegroups.com
>>>>
>>>> --
>>>> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>>>> --Peter da Silva
>>
>> --
>> Ben Goertzel, PhD
>> http://goertzel.org
>>
>> “The only people for me are the mad ones, the ones who are mad to live,
>> mad to talk, mad to be saved, desirous of everything at the same time, the
>> ones who never yawn or say a commonplace thing, but burn, burn, burn like
>> fabulous yellow roman candles exploding like spiders across the stars.” --
>> Jack Kerouac
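
To make the "I only know the name, not the type" case concrete, here is a minimal, self-contained Python sketch of a secondary name index kept alongside the usual by-type index. It is a toy model only and does not use the OpenCog API: `ToyAtomSpace`, `Node`, `add_node`, `get_by_type` and `find_by_name` are all hypothetical names, and the node names are made up for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    """A toy atom, uniquely identified by its (type, name) pair."""
    atom_type: str
    name: str


class ToyAtomSpace:
    """Hypothetical stand-in for an atomspace; NOT the OpenCog API."""

    def __init__(self):
        self._by_type = defaultdict(set)  # type -> nodes (index by type)
        self._by_name = defaultdict(set)  # name -> nodes (the custom index)

    def add_node(self, atom_type, name):
        """Insert a node and update both indexes at insertion time."""
        node = Node(atom_type, name)
        self._by_type[atom_type].add(node)
        self._by_name[name].add(node)
        return node

    def get_by_type(self, atom_type):
        """Return all nodes of the given type."""
        return set(self._by_type.get(atom_type, ()))

    def find_by_name(self, name):
        """Return every node with this name, whatever its type.

        Uses .get() so that asking about an unknown name does not
        create an empty entry as a side effect.
        """
        return set(self._by_name.get(name, ()))


space = ToyAtomSpace()
space.add_node("UniprotNode", "uniprot:1234")
space.add_node("GeneNode", "gene:5678")
space.add_node("ConceptNode", "uniprot:1234")

matches = space.find_by_name("uniprot:1234")
types = sorted(node.atom_type for node in matches)
missing = space.find_by_name("no-such-id")
```

Because an atom is identified by type plus name, one name can map to several nodes, so the lookup returns a set rather than a single node. Updating the index on every insert is the eager strategy from point 4 above; the lazy alternatives trade freshness for less work per insert.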

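
The caching behaviour described in the quoted `cog-execute-cache!` docs (return the stored value unless FRESH is set or nothing is stored yet, and optionally record a timestamp) can be modelled in a few lines of plain Python. This is only a sketch of those semantics as the docs state them, not the real implementation; `CachedQuery`, `execute_cache` and the key names are hypothetical.

```python
import time


class CachedQuery:
    """Toy model of a query that stores its own results as keyed values."""

    def __init__(self, run):
        self._run = run      # callable that performs the actual query
        self.values = {}     # key -> stored value (results and metadata)

    def execute_cache(self, key, metadata_key=None, fresh=False):
        # Return the cached result unless a fresh run is requested
        # or nothing has been stored under this key yet.
        if not fresh and key in self.values:
            return self.values[key]
        result = self._run()
        self.values[key] = result
        if metadata_key is not None:
            # Metadata is just a timestamp of when the query last ran.
            self.values[metadata_key] = time.time()
        return result


names = ["Uniprot: 1234", "Uniprot: 5678", "Gene: 42"]
calls = []


def find_uniprots():
    """Stand-in for a pattern query: all names starting with 'Uniprot:'."""
    calls.append(1)
    return [n for n in names if n.startswith("Uniprot:")]


query = CachedQuery(find_uniprots)
first = query.execute_cache("result", metadata_key="when")
names.append("Uniprot: 9999")                  # the data changes afterwards
stale = query.execute_cache("result")          # served from the cache
fresh = query.execute_cache("result", fresh=True)  # recomputed
```

The second call returns the stale two-element result without rerunning the query, which is the "eventually consistent" trade-off discussed in point 4; the stored timestamp lets a caller decide when a fresh run is worth the cost.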