Re: [opencog-dev] PartitionLink, biological pathways, human bodies, etc.

Linas Vepstas Wed, 09 Aug 2017 13:54:30 -0700

Quick comment, I did not review the Doc.

On Tue, Aug 8, 2017 at 3:11 PM, Michael Duncan <[email protected]> wrote:


> the pathway links are predicates defined here
> <https://docs.google.com/document/d/1R_AiCCRuWKI92JUCYXJRnKeYw-MiwKLR3kr9fJyYZfs/edit>.
> the pathways are
>
> DefineLink
>

I strongly urge that EquivalenceLink be used here, and not DefineLink.
DefineLink is meant for something else.


>      DefinedPredicateNode "GO pathway term name"
>
Isn't an ordinary PredicateNode enough?


>      AndLink
>

Again: the AndLink is completely unordered, so in this case it would be
exactly the same thing as a SetLink: it's just a collection of "stuff" (a
collection of protein relationships, it seems).  Because it's not ordered,
it's not a "path" per se, it's just a set.

You keep saying that you use AndLink because it's "true" when everything is
in it, but that is also the case for SetLink.  When I say "x and y are in
set A", it's always true that x is in A and y is in A, and you don't need
an AndLink to say this.  The SetLink is enough. The SetLink is effectively
an AndLink, as far as its truthiness goes.
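
To make the point about ordering concrete, here is a minimal sketch in
plain Python. It is not Atomese and not the OpenCog API: the tuples and the
helper all_members_hold are hypothetical stand-ins, and frozenset is only
assumed to behave like an unordered link for the purpose of illustration.

```python
# Toy stand-ins, not the OpenCog API: a relationship is a plain
# (name, source, target) tuple.
r1 = ("protein relationship 1", "x", "y")
r2 = ("protein relationship 2", "y", "z")

# An unordered collection: the order in which members are listed
# is not part of its identity.
as_listed = frozenset([r1, r2])
reversed_listing = frozenset([r2, r1])
same_collection = (as_listed == reversed_listing)


def all_members_hold(collection, known_true):
    """Return True when every member of the collection is known to hold."""
    return all(member in known_true for member in collection)


# "Everything in it is true" is a statement about each member, so a
# plain set already carries it; no separate conjunction is needed.
conjunction_holds = all_members_hold(as_listed, {r1, r2})

# An ordered container, by contrast, does distinguish the two listings.
ordered_differs = ([r1, r2] != [r2, r1])
```

Under these assumptions the two unordered listings compare equal, so
nothing path-like survives, while the ordered lists do not.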


>           Predicate "protein relationship 1"
>                 ProteinNode "x"
>                 ProteinNode "y"
>           Predicate "protein relationship 2"
>                 ProteinNode "y"
>                 ProteinNode "z"
>           ....
>


I'm proposing this:

MemberLink
      DefinedPredicateNode "GO pathway term name"
      Predicate "protein relationship 1"
            ProteinNode "x"
            ProteinNode "y"

MemberLink
      DefinedPredicateNode "GO pathway term name"
      Predicate "protein relationship 2"
            ProteinNode "y"
            ProteinNode "z"

....
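
The per-relationship membership idea above can be sketched roughly as
follows, in plain Python rather than Atomese. Everything here is a
hypothetical stand-in (the MembershipStore class, the strength parameter,
and the extra "protein relationship 3" with protein "w"); it is not the
AtomSpace API. It only illustrates why one record per membership makes it
easy to add or drop a member, hang a value on a single membership, and
query by member, which are the problems raised for long lists in the
quoted message further down.

```python
class MembershipStore:
    """Toy store holding one record per (member, collection) pair."""

    def __init__(self):
        # collection name -> {member: strength}
        self._members = {}

    def add(self, member, collection, strength=1.0):
        """Record that member belongs to collection with a strength."""
        self._members.setdefault(collection, {})[member] = strength

    def remove(self, member, collection):
        """Drop one membership; the other memberships are untouched."""
        self._members.get(collection, {}).pop(member, None)

    def members(self, collection):
        """Return the set of members currently in collection."""
        return set(self._members.get(collection, {}))

    def strength(self, member, collection):
        """Return the stored strength, or None if not a member."""
        return self._members.get(collection, {}).get(member)


pathway = "GO pathway term name"
store = MembershipStore()
store.add(("protein relationship 1", "x", "y"), pathway)
store.add(("protein relationship 2", "y", "z"), pathway)

# "Oops, I forgot one": add it later without rebuilding anything.
# This third relationship is made up for the example.
r3 = ("protein relationship 3", "z", "w")
store.add(r3, pathway, strength=0.9)

# Which relationships in this pathway involve protein "y"?
involving_y = {m for m in store.members(pathway) if "y" in m[1:]}
```

Each membership here is independent, so nothing that refers to the pathway
name has to be told about a replacement object when a member changes.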


--linas

>
>
>
> On Tuesday, August 8, 2017 at 11:00:54 AM UTC-7, Michael Duncan wrote:
>>
>> the AndLink semantics are for the simplified pathway representation for
>> the current demo/toy bio-atomspace which only has binary links between
>> proteins and abstracts out small molecules. so the pathway for the krebs
>> cycle for instance is just directed links between the enzymes:  ... ->
>> isocitrate dehydrogenase -> alpha-ketoglutarate dehydrogenase ->
>> Succinyl-CoA synthetase -> ...
>>
>> linas's semantics look good for when the complete biopax pathway
>> representation is translated into atomese.
>>
>> even then my intuition is that the AndLink semantics should be useful in
>> pln inference about say the likelihood of a pathway being involved in
>> distinguishing a case-control phenotype based on moses models of relative
>> gene expression levels.
>>
>> On Monday, August 7, 2017 at 11:31:59 AM UTC-7, Ben Goertzel wrote:
>>>
>>> a pathway in biology is actually a network with directed arrows and
>>> generally lots of loops.... there are even some hyperlinks e.g. for
>>> catalysis... a pathway is a subhypergraph...
>>>
>>>
>>>
>>> On Aug 7, 2017 11:25, "Linas Vepstas" <[email protected]> wrote:
>>>
>>>> no clue why it's appropriate for biological pathways. Mike is designing
>>>> that, not me.
>>>>
>>>> Anyway, a "pathway" is an ordered sequence where the ordering matters.
>>>> Neither SetLink, nor AndLink are ordered. So if you actually want to have a
>>>> path, i.e. a sequence of directed arrows, well .. you  need to find a
>>>> representation of  biological pathways as directed arrows. But this is
>>>> familiar ground, for opencog...
>>>>
>>>> --linas
>>>>
>>>> On Mon, Aug 7, 2017 at 1:21 PM, Ben Goertzel <[email protected]> wrote:
>>>>
>>>>> OK I get that... but I don't see why it is appropriate for biological
>>>>> pathways...
>>>>>
>>>>> On Tue, Aug 8, 2017 at 2:19 AM, Linas Vepstas <[email protected]>
>>>>> wrote:
>>>>> > First, let's review SetLink:
>>>>> >
>>>>> >  SetLink
>>>>> >     ConceptNode "x"
>>>>> >     ConceptNode "y"
>>>>> >     ConceptNode "z"
>>>>> >
>>>>> >
>>>>> >  EquivalenceLink
>>>>> >     ConceptNode "last three letters of the alphabet"
>>>>> >     SetLink
>>>>> >        ConceptNode "x"
>>>>> >        ConceptNode "y"
>>>>> >        ConceptNode "z"
>>>>> >
>>>>> >
>>>>> >  MemberLink
>>>>> >      ConceptNode "x"
>>>>> >      ConceptNode "last three letters of the alphabet"
>>>>> >   MemberLink
>>>>> >      ConceptNode "y"
>>>>> >      ConceptNode "last three letters of the alphabet"
>>>>> >   MemberLink
>>>>> >      ConceptNode "z"
>>>>> >      ConceptNode "last three letters of the alphabet"
>>>>> >
>>>>> > Again, with TV's:
>>>>> >
>>>>> >   MemberLink  <1.0>
>>>>> >      ConceptNode "z"
>>>>> >      ConceptNode "last letters of the alphabet"
>>>>> >   MemberLink  <0.9>
>>>>> >      ConceptNode "w"
>>>>> >      ConceptNode "last letters of the alphabet"
>>>>> >   MemberLink  <0.8>
>>>>> >      ConceptNode "s"
>>>>> >      ConceptNode "last letters of the alphabet"
>>>>> >   MemberLink  <0.2>
>>>>> >      ConceptNode "m"
>>>>> >      ConceptNode "last letters of the alphabet"
>>>>> >
>>>>> >
>>>>> >
>>>>> > Sooo .. AndMemberLink would be just like the above, except that
>>>>> > wherever you see SetLink above, you would have AndLink, and wherever
>>>>> > you see MemberLink above, you would have AndMemberLink.
>>>>> >
>>>>> > --linas
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Aug 7, 2017 at 1:11 PM, Ben Goertzel <[email protected]>
>>>>> > wrote:
>>>>> >>
>>>>> >> I don't understand the proposed semantics of AndMemberLink, could you
>>>>> >> explain?
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Sat, Aug 5, 2017 at 1:07 AM, Michael Duncan <[email protected]>
>>>>> >> wrote:
>>>>> >> > i actually think an AndLink-like semantics better fits biochemical
>>>>> >> > pathways at a computationally tractable level than partitions, in
>>>>> >> > that below the level of a whole organism, where one pathway ends and
>>>>> >> > another begins is largely arbitrary.  also, if one link is missing
>>>>> >> > then the whole thing doesn't work, but the last bit of a dead end
>>>>> >> > might be the start of another path that goes to the same place, more
>>>>> >> > like words and phrases that can be rearranged and swapped in
>>>>> >> > different ways to say the same thing.  linas's idea of
>>>>> >> > AndMemberLinks and OrMemberLinks would get around the size
>>>>> >> > limitation and also seem like they would be useful for reasoning on
>>>>> >> > moses models.
>>>>> >> >
>>>>> >> > On Monday, July 31, 2017 at 5:55:16 PM UTC-4, linas wrote:
>>>>> >> >>
>>>>> >> >> Hi Ben, Mike,
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> On Fri, Jul 21, 2017 at 9:41 PM, Ben Goertzel <[email protected]>
>>>>> >> >> wrote:
>>>>> >> >>>
>>>>> >> >>> Some interesting representational issues have come up in the context
>>>>> >> >>> of Atomspace representation of pathways, which appear to have more
>>>>> >> >>> general implications…
>>>>> >> >>>
>>>>> >> >>> It seems the semantics we want for a biological pathway is sort of
>>>>> >> >>> like “the pathway P is a set of relationships R1, R2, …, R20” in
>>>>> >> >>> kinda the same sense that “the human body is a set of organs: brain,
>>>>> >> >>> heart, lungs, legs, etc.”
>>>>> >> >>>
>>>>> >> >>> First of all it seems what we have here is a part of relationship…
>>>>> >> >>> maybe we want
>>>>> >> >>>
>>>>> >> >>> PartLink
>>>>> >> >>>     ConceptNode “heart”
>>>>> >> >>>     ConceptNode “human-body”
>>>>> >> >>>
>>>>> >> >>> and
>>>>> >> >>>
>>>>> >> >>> PartLink
>>>>> >> >>>     >relationship<
>>>>> >> >>>     >pathway<
>>>>> >> >>>
>>>>> >> >>> PartLink and PartOfLink have come and gone in
>>>>> >> >>> OpenCog/Novamente/Webmind history...
>>>>> >> >>>
>>>>> >> >>> An argument that PartLink should have fundamental status and a
>>>>> >> >>> well-defined fuzzy truth value is given in this paper:
>>>>> >> >>>
>>>>> >> >>> https://www.academia.edu/1016959/Fuzzy_mereology
>>>>> >> >>>
>>>>> >> >>> However what we need for biological pathways and human bodies seems
>>>>> >> >>> like a bit more.   We want to say that a human body consists of a
>>>>> >> >>> certain set of parts... not just that each of them is a part...
>>>>> >> >>> We're doing a decomposition.
>>>>> >> >>>
>>>>> >> >>> One way to do this would be
>>>>> >> >>>
>>>>> >> >>> PartitionLink
>>>>> >> >>>    ConceptNode “human-body”
>>>>> >> >>>    ListLink
>>>>> >> >>>       ConceptNode “legs”
>>>>> >> >>>       ConceptNode “arms”
>>>>> >> >>>       ConceptNode “brain”
>>>>> >> >>>       etc.
>>>>> >> >>>
>>>>> >> >>> Relatedly, we could also have
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> As mentioned earlier, there are several problems with this format.
>>>>> >> >> One is the "oops I forgot to mention xyz in the list" or "gosh I
>>>>> >> >> should have left out pqr" and this becomes a big problem:  you have
>>>>> >> >> to delete the PartitionLink, delete the ListLink, create a new list
>>>>> >> >> and partition. In the meanwhile, some other subsystem might be
>>>>> >> >> holding a handle to the old, now-wrong PartitionLink, and there is
>>>>> >> >> no effective way of announcing "hey stop using that old thing, get
>>>>> >> >> my new thing now".
>>>>> >> >>
>>>>> >> >> A second problem is that the above doesn't have anywhere to hang
>>>>> >> >> additional data: e.g. "legs are a big part of the human body, having
>>>>> >> >> a mass of nearly half of the body." You can't just slap that on as a
>>>>> >> >> (truth)value, cause there's nowhere to put that value.
>>>>> >> >>
>>>>> >> >> Third problem is that large list-links are hard to handle in the
>>>>> >> >> pattern matcher. It's much much harder to write a query of the form
>>>>> >> >> "find me all values of $X where
>>>>> >> >>
>>>>> >> >> PartitionLink
>>>>> >> >>    ConceptNode “human-body”
>>>>> >> >>    ListLink
>>>>> >> >>       ConceptNode “legs”
>>>>> >> >>       VariableNode  “$X”
>>>>> >> >>       ConceptNode “brain”
>>>>> >> >>
>>>>> >> >> because, ... well the ListLink is an ordered link, not an unordered
>>>>> >> >> link. If you forget to include the pqr (added above) then the search
>>>>> >> >> will fail. You could try to use unordered links and globnodes, but
>>>>> >> >> these lead to other difficulties, including that the n! possible
>>>>> >> >> permutations of an unordered link become factorially large when the
>>>>> >> >> unordered link has n items in it. Recall that old factorial-70 trick
>>>>> >> >> used to make calculators overflow.
>>>>> >> >>
>>>>> >> >> In general, any link with more than 3 or 4 or 5 items in it is bad
>>>>> >> >> news. This is a generic statement about knowledge representation in
>>>>> >> >> opencog.
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>> OverlappingPartitionLink
>>>>> >> >>>     C
>>>>> >> >>>     L
>>>>> >> >>>
>>>>> >> >>> if we want to encompass cases where the partition elements in L can
>>>>> >> >>> overlap; or
>>>>> >> >>>
>>>>> >> >>> CoveringLink
>>>>> >> >>>     C
>>>>> >> >>>     L
>>>>> >> >>>
>>>>> >> >>> if we want to encompass cases where the partition elements in L can
>>>>> >> >>> overlap, AND the elements in L may encompass some stuff that’s not
>>>>> >> >>> in C
>>>>> >> >>>
>>>>> >> >>> For the pathway case, we could then say
>>>>> >> >>>
>>>>> >> >>> PartitionLink
>>>>> >> >>>     ConceptNode “Krebs cycle”
>>>>> >> >>>     ListLink
>>>>> >> >>>         >relationship 1<
>>>>> >> >>>         >relationship 2<
>>>>> >> >>>         etc.
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> Now this solves the semantics problem but doesn’t solve the problem
>>>>> >> >>> of having a long ListLink….  A biological pathway might have 100s or
>>>>> >> >>> 1000s of relationships in it, and we don't usually want to make
>>>>> >> >>> lists that big in the Atomspace...
>>>>> >> >>>
>>>>> >> >>> To solve this we could do something like (for the human body case)
>>>>> >> >>>
>>>>> >> >>> PartitionLink
>>>>> >> >>>    ConceptNode “human-body”
>>>>> >> >>>    PartitionNode “body-partition-1”
>>>>> >> >>>
>>>>> >> >>> PartitionElementLink
>>>>> >> >>>    PartitionNode “body-partition-1”
>>>>> >> >>>    ConceptNode “legs”
>>>>> >> >>>
>>>>> >> >>> PartitionElementLink
>>>>> >> >>>    PartitionNode “body-partition-1”
>>>>> >> >>>    ConceptNode “arms”
>>>>> >> >>>
>>>>> >> >>> etc.
>>>>> >> >>>
>>>>> >> >>> and similarly (for the biological pathway case)
>>>>> >> >>>
>>>>> >> >>> PartitionLink
>>>>> >> >>>     ConceptNode “Krebs cycle”
>>>>> >> >>>     PartitionNode “krebs-partition-1”
>>>>> >> >>>
>>>>> >> >>> PartitionElementLink
>>>>> >> >>>     PartitionNode “krebs-partition-1”
>>>>> >> >>>     >relationship 1<
>>>>> >> >>>
>>>>> >> >>> PartitionElementLink
>>>>> >> >>>     PartitionNode “krebs-partition-1”
>>>>> >> >>>     >relationship 2<
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> Yeah, sure. Not sure why the existing MemberLink is not sufficient
>>>>> >> >> for your purposes. The MemberLink has reasonably-well-defined
>>>>> >> >> semantics, there are already rules for handling it in PLN (or there
>>>>> >> >> will be rules -- I think it's something Nil has thought about).  I'm
>>>>> >> >> not clear on why you'd want to invent something that is just like
>>>>> >> >> MemberLink but is different.
>>>>> >> >>
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> ...
>>>>> >> >>>
>>>>> >> >>> There could be some nice truth value math regarding these, e.g. we
>>>>> >> >>> could introduce Ellerman's "logical entropy" which is really a
>>>>> >> >>> partition entropy.   There are also connections with some recent
>>>>> >> >>> theoretical work I've been doing on "graphtropy" (using "distinction
>>>>> >> >>> graphs" that generalize partitions), which I'll post a paper on
>>>>> >> >>> sometime in the next week or two....   But that will be another
>>>>> >> >>> email for another day...
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> Yeah graphical-entropy is something that I keep trying to work on,
>>>>> >> >> except that every new urgent disaster of the day distracts me from
>>>>> >> >> it.
>>>>> >> >>
>>>>> >> >> --linas
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> -- Ben
>>>>> >> >>>
>>>>> >> > --
>>>>> >> > You received this message because you are subscribed to the Google
>>>>> >> > Groups "opencog" group.
>>>>> >> > To unsubscribe from this group and stop receiving emails from it,
>>>>> >> > send an email to [email protected].
>>>>> >> > To post to this group, send email to [email protected].
>>>>> >> > Visit this group at https://groups.google.com/group/opencog.
>>>>> >> > To view this discussion on the web visit
>>>>> >> > https://groups.google.com/d/msgid/opencog/e1df7273-da14-45f5-8d0d-5ebad0d31217%40googlegroups.com.
>>>>> >> >
>>>>> >> > For more options, visit https://groups.google.com/d/optout.
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >> Ben Goertzel, PhD
>>>>> >> http://goertzel.org
>>>>> >>
>>>>> >> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
>>>>> >> boundary, I am the peak." -- Alexander Scriabin
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
