I find this statement confusing: "I can’t write a pattern matching query to
retrieve an atom using its id/name". There is one and only one such atom,
ever, by definition... There is nothing to query; if you know the name, you
know the atom.

There was talk previously about "substring matching". For example, you have
atoms named "Uniprot: 1234" and "Uniprot: 5678" and you want to find all
atoms whose names start with the eight characters "Uniprot:". There are (at
least) three solutions for this. One is to create a RegexNode, but this is
ugly from a theoretical standpoint. A second is to create a UniprotNode type
and use that; queries are then simple because you just ask for all
UniprotNodes. A third (recommended) way is to write

  (MemberLink (Node "Uniprot: 1234") (Concept "the-set-of-all-uniprots"))

for each such atom, and then ask for the members of that set.

This third way is recommended because, in a sense, the atomspace is nothing
but one giant network of interconnected partial indexes. There is an index
from (Node "Uniprot: 1234") to everything that makes use of it. It's called
"the incoming set" and it is a real index: a C++ std::set, if I recall. The
same holds for (Concept "the-set-of-all-uniprots"). What the pattern matcher
"actually does" is to stitch together these partial indexes into a whole,
and then prune away the irrelevant parts.
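
To make the "incoming set as index" idea concrete, here is a minimal toy
sketch in Python. It is not the AtomSpace API: the class ToyAtomSpace, the
tuple encoding of atoms, and the method names add_link and members_of are all
made up for illustration. It only shows that finding the members of a set
means walking the incoming set of one Concept, not scanning every atom.

```python
from collections import defaultdict


class ToyAtomSpace:
    """Toy stand-in for the AtomSpace (hypothetical, not the OpenCog API)."""

    def __init__(self):
        # For each atom, the set of links that contain it: its "incoming set".
        self.incoming = defaultdict(set)

    def add_link(self, link_type, *outgoing):
        # A link is encoded as a tuple: (type, atom, atom, ...).
        link = (link_type,) + outgoing
        for atom in outgoing:
            self.incoming[atom].add(link)
        return link

    def members_of(self, concept):
        # Only the incoming set of `concept` is examined; links of the
        # wrong type or shape are pruned away.
        return sorted(
            link[1]
            for link in self.incoming[concept]
            if link[0] == "MemberLink" and link[2] == concept
        )


space = ToyAtomSpace()
uniprots = ("Concept", "the-set-of-all-uniprots")
for name in ("Uniprot: 1234", "Uniprot: 5678"):
    space.add_link("MemberLink", ("Node", name), uniprots)
# An unrelated link that also uses one of the nodes.
space.add_link("ListLink", ("Node", "Uniprot: 1234"), ("Node", "unrelated"))

found = space.members_of(uniprots)
print(found)
```

The unrelated ListLink sits in the incoming set of (Node "Uniprot: 1234") but
never enters the membership query, because that query starts from the Concept.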

-- Linas

... unless you mean "can I ask if (Node "Uniprot: 1234") exists, without
accidentally creating it if it does not?" ... you can do this from the C++,
Scheme and Python APIs, but you cannot do this in Atomese.
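
The distinction between "look up" and "create if absent" can be sketched with
a toy table in Python. Again, this is an illustration under assumptions, not
the real API: ToyAtomTable, add_node and find_node are hypothetical names.

```python
class ToyAtomTable:
    """Toy model of atom uniqueness (hypothetical, not the OpenCog API)."""

    def __init__(self):
        # One entry per (type, name) pair: there is only ever one such atom.
        self._atoms = {}

    def add_node(self, node_type, name):
        """Return the unique atom for (type, name), creating it if absent."""
        return self._atoms.setdefault((node_type, name), (node_type, name))

    def find_node(self, node_type, name):
        """Return the atom if it already exists, else None. Never creates."""
        return self._atoms.get((node_type, name))


table = ToyAtomTable()
missing = table.find_node("Node", "Uniprot: 1234")   # not there yet
created = table.add_node("Node", "Uniprot: 1234")    # creates it
again = table.add_node("Node", "Uniprot: 1234")      # returns the same atom
present = table.find_node("Node", "Uniprot: 1234")   # now it is found
```

Writing an atom down in Atomese behaves like add_node here, which is why the
existence check has to come from one of the language APIs.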




On Thu, Aug 27, 2020 at 4:07 AM Abdulrahman Semrie <[email protected]>
wrote:

>
>
> TL;DR: you can already do that.  It's already supported.
>
> It’s partially supported. As you’ve described, we can cache the result of
> a pattern matching query, and that is already supported. However, since I
> can’t write a pattern matching query to retrieve an atom using its id/name
> from the atomspace, there is no way to cache/index it. If there were some
> ExistsLink, inheriting from QueryLink, that you could use to retrieve an
> atom by its name if it exists, or else return a false truth value, then
> what you’ve described could be done.
>
> —
>
> Regards,
>
> Abdulrahman Semrie
> <https://canarymail.io>
>
> On Thursday, Aug 27, 2020 at 2:46 AM, Linas Vepstas <
> [email protected]> wrote:
> TL;DR: you can already do that.  It's already supported.
>
> Please follow me on this train of thought.
>
> 1) What is an "index"? Well, it's a pre-defined cache of all atoms of some
> shape or pattern.
>
> 2) How can one specify an index?  Well, if it's a pattern, then a pattern
> query can be used.
>
> 3) Where should the index be stored, or kept? Well, it can be stored or
> kept with the pattern that defines the shape of the index.
>
> Before I move on to the next thought, let me point out that 1-2-3 can be
> directly solved today. Define a pattern, e.g. a query link. Run it. Store
> the results on the query, as a value. You can "do this yourself", today;
> it's easy, but it becomes even easier if you are willing to read the docs
> for `cog-execute-cache!` (appended below).
>
> 4) How should the index be updated? Ah, well, that is actually the tricky
> question, the hard question, the place where all of the interesting
> technology debates and thinking are centered.  One strategy is to update
> the index every single time an Atom is added to/removed from the atomspace.
> But recomputing the index every time is wildly inefficient, burning through
> vast quantities of CPU time. What else can one do? Well, maybe recompute on
> demand. Or recompute every few minutes. Or maybe once a night. (aka
> "eventually consistent")  Maybe store a time-stamp on the index, to tell
> you how old it is. Or maybe have an append-only log of atomspace changes...
> I can propose many different kinds of solutions. They all have space and
> time-overhead, and/or assorted usability issues. Which of these best suits
> your needs, I have trouble guessing, so you would have to explain what the
> problem is (if any).
>
> --linas
>
> Here's the docs:
>  cog-execute-cache! EXEC KEY [METADATA [FRESH]]
>
>    Execute or return cached execution results. This is a caching version
>    of the `cog-execute!` call.
>
>    If the optional FRESH boolean flag is #f, then if there is a Value
>    stored at KEY on EXEC, return that Value. The default value of FRESH
>    is #f, so the default behavior is always to return the cached value.
>    If the optional FRESH boolean flag is #t, or if there is no Value
>    stored at KEY, then the `cog-execute!` function is called on EXEC,
>    and the result is stored at KEY.
>
>    The METADATA Atom is optional.  If it is specified, then metadata
>    about the execution is placed on EXEC at the key METADATA.
>    Currently, this is just a timestamp of when this execution was
>    performed. The format of the meta-data is subject to change; this
>    is currently an experimental feature, driven by user requirements.
>
>    At this time, execution is synchronous. It may be worthwhile to have
>    an asynchronous version of this call, where the execution is performed
>    at some other time. This has not been done yet.
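
The caching behavior described in these docs can be mimicked in a few lines of
Python. This is only a sketch of the semantics, under assumptions: the
function execute_cached, the plain dict standing in for the Values attached to
EXEC, and the key names are all hypothetical, and the real call operates on
Atoms and Values, not on dicts.

```python
import time


def execute_cached(values, key, run, metadata_key=None, fresh=False):
    """Return the result cached at `key`, or call `run` and cache it.

    `values` stands in for the Values stored on EXEC (an assumption).
    """
    # FRESH is false and something is cached: return it without re-running.
    if not fresh and key in values:
        return values[key]
    # Otherwise execute, then store the result at the key.
    result = run()
    values[key] = result
    # Optional metadata: here, just a timestamp of this execution.
    if metadata_key is not None:
        values[metadata_key] = time.time()
    return result


calls = []


def run_query():
    # Stand-in for an expensive pattern query.
    calls.append(1)
    return ["Uniprot: 1234", "Uniprot: 5678"]


values = {}
first = execute_cached(values, "index", run_query, metadata_key="stamp")
second = execute_cached(values, "index", run_query)               # cached
third = execute_cached(values, "index", run_query, fresh=True)    # re-run
```

Only the first and third calls actually run the query; the second is served
from the cache, which is what makes the stored result usable as an index.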
>
> On Wed, Aug 26, 2020 at 7:41 AM Abdulrahman Semrie <[email protected]>
> wrote:
>
>>
>> In the current atomspace, atoms are indexed by their type, i.e. given a
>> type we can retrieve all the atoms that have that type. But there is no
>> other way of adding custom indices in the atomspace. For example, if we
>> want to index nodes by their name, there is no way of doing this.
>>
>> As discussed in this issue
>> <https://github.com/MOZI-AI/annotation-scheme/issues/192>, we plan to
>> expand the annotation-service, which uses the AtomSpace to store genomics
>> data, to support the annotation of more types in addition to genes.
>> Currently, when a user submits a list of ids to the service, it is assumed
>> that these ids/symbols represent `GeneNode`s. But in the case where the
>> input can be a protein, a drug molecule, a pathway or a gene, there is no
>> direct way of retrieving the type of the atom with the given name
>> unless we iterate through all atoms searching for that particular id. This
>> isn't a good approach from a performance standpoint. But if we had a
>> custom index, e.g. `name_index`, on the ids/names of the atoms, it would be
>> easier to search the atoms by name and identify the type that the atom
>> belongs to.
>>
>> Hence, if there is a way to add custom indices to the atomspace, it would
>> greatly simplify some searches. Or maybe there is a way to do what I
>> described above without the need for an index. If so, please share it.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "opencog" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/opencog/27892502-0dfb-4042-a805-30a1520f6250n%40googlegroups.com
>> <https://groups.google.com/d/msgid/opencog/27892502-0dfb-4042-a805-30a1520f6250n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> --
> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>         --Peter da Silva
>


-- 
Verbogeny is one of the pleasurettes of a creatific thinkerizer.
        --Peter da Silva
