Re: [opencog-dev] Re: Pattern mining from PLN inference histories

Shujing Ke Tue, 20 Jun 2017 17:32:02 -0700

Correcting a typo:

In point 4 of my previous email, the 1-gram form of pattern B should be:
ImplicationLink
    AndLink
        InheritanceLink  x  rich
        InheritanceLink  y  cute
    EvaluationLink married x y


On Wed, Jun 21, 2017 at 2:29 AM, Shujing Ke <[email protected]> wrote:

> Hi, Ben and Nil,
>
> Thanks for all your responses. I may be a bit slow this week: it is too
> warm here and my baby is sick. He has barely eaten or drunk anything since
> yesterday morning.
>
> *1. About the output format and TV of patterns*
> The pattern miner outputs the raw patterns found in the input data,
> without further processing, because different modules in OpenCog and
> different applications may require different output formats; there
> should not be only one. For now we can base our discussion on the raw
> pattern format. Once we are sure the contents of the patterns are right,
> we can discuss the output formats for the different modules. If I have
> time then, I can implement them; if I don't, it should also be easy for
> each module's developer to turn the raw patterns into the format they
> want. It is better for this conversion to live in a separate layer
> outside the pattern miner, which makes it more convenient for each module
> to change its pattern format in the future. Otherwise, any module that
> wants to change a format has to modify the pattern miner core.
>
>
> *2. About the pattern gram*
> Actually the gram does not indicate the size of a pattern; it is just
> the number of root links in the pattern.
>
> A ==> B, B==>C  |- A==>C
> A==>C, C ==> D |- A ==>D
> HebbianLink (D,B)
> useful(A==>D)
>
> Yes, it could be a 4-gram, but it could also be a 1-gram, depending on
> the input data.
> If you have one big link like:
>
> ImplicationLink
>      AndLink
>           ImplicationLink A B
>           ImplicationLink B C
>           ImplicationLink C D
>      ImplicationLink A D
>
> Then this pattern will be 1-gram.
>
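A minimal Python sketch of the point above, that gram counts root links and says little about pattern size. Atoms are modelled as plain nested tuples; this is a stand-in for illustration, not the OpenCog API, and the helper names (`gram`, `count_links`, `impl`) are hypothetical. The 4-gram here stores the three premises and the conclusion as four separate ImplicationLinks, which is a simplification of the four-line example quoted earlier.

```python
# Atoms are nested tuples: (type_name, child, child, ...). Nodes are bare strings.
# This is a plain-Python stand-in for illustration, not the OpenCog API.

def gram(pattern):
    """Gram of a pattern = number of root links (top-level entries) it contains."""
    return len(pattern)

def count_links(atom):
    """Total number of links in an atom, counting nested ones."""
    if isinstance(atom, str):  # a node, not a link
        return 0
    return 1 + sum(count_links(child) for child in atom[1:])

def impl(a, b):
    return ("ImplicationLink", a, b)

# The same premises and conclusion, stored in two different ways.
as_separate_links = [impl("A", "B"), impl("B", "C"), impl("C", "D"), impl("A", "D")]
as_one_big_link = [
    ("ImplicationLink",
        ("AndLink", impl("A", "B"), impl("B", "C"), impl("C", "D")),
        impl("A", "D")),
]

print(gram(as_separate_links), sum(count_links(a) for a in as_separate_links))  # 4 4
print(gram(as_one_big_link), sum(count_links(a) for a in as_one_big_link))      # 1 6
```

The 1-gram version actually contains more links than the 4-gram version, which is why gram should not be read as a size measure.
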
> Take the cockroach pattern as another example.
> Suppose you have handles 666 and 777:
>
> ImplicationLink [handle=666]
>      EvaluationLink
>          PredicateNode "eat"
>          ListLink
>                ConceptNode "Ben"
>                ConceptNode "cockroach"
>      InheritanceLink
>           ConceptNode "Ben"
>           ConceptNode "weird"
>
>
> ImplicationLink [handle=777]
>      EvaluationLink
>          PredicateNode "eat"
>          ListLink
>                ConceptNode "Nil"
>                ConceptNode "cockroach"
>      InheritanceLink
>           ConceptNode "Nil"
>           ConceptNode "weird"
>
> If only ImplicationLinks are allowed to be root links, then Pattern 1
> below is a 1-gram pattern:
> *Pattern 1:*
> ImplicationLink
>      EvaluationLink
>          PredicateNode "eat"
>          ListLink
>                ConceptNode "var1"
>                ConceptNode "cockroach"
>      InheritanceLink
>           ConceptNode "var1"
>           ConceptNode "weird"
>
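To make the relation between the two instances and Pattern 1 concrete, here is a small Python sketch that generalizes links [666] and [777] by replacing the nodes on which they differ with a shared variable. It only illustrates what the pattern abstracts over; it is not a claim about how the Pattern Miner enumerates patterns internally. Atoms are plain tuples, the function names are hypothetical, and variables are written as `ConceptNode "var1"` only to match the notation used in this thread.

```python
# Atoms as nested tuples: nodes are (type, name), links are (type, child, ...).
# Plain-Python stand-in for illustration, not the OpenCog API.

def is_node(atom):
    return len(atom) == 2 and isinstance(atom[1], str)

def generalize(a, b, variables):
    """Return the common structure of a and b; differing parts become variables.

    The same pair of differing sub-atoms always maps to the same variable,
    so "Ben"/"Nil" yields var1 in both places where it occurs.
    """
    if a == b:
        return a
    if not is_node(a) and not is_node(b) and a[0] == b[0] and len(a) == len(b):
        children = tuple(generalize(x, y, variables) for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    key = (a, b)
    if key not in variables:
        variables[key] = ("ConceptNode", "var%d" % (len(variables) + 1))
    return variables[key]

def eats_cockroach_is_weird(who):
    person = ("ConceptNode", who)
    return ("ImplicationLink",
            ("EvaluationLink",
                ("PredicateNode", "eat"),
                ("ListLink", person, ("ConceptNode", "cockroach"))),
            ("InheritanceLink", person, ("ConceptNode", "weird")))

link_666 = eats_cockroach_is_weird("Ben")
link_777 = eats_cockroach_is_weird("Nil")
pattern_1 = generalize(link_666, link_777, {})
print(pattern_1 == eats_cockroach_is_weird("var1"))  # True
```
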
> If EvaluationLinks and InheritanceLinks are also allowed to be root links,
> then patterns 2, 3 and 4 are all 2-gram patterns, because each contains two
> root links. Of course, in this case patterns 3 and 4 do not make much sense,
> but in the DBpedia data these types of patterns are what we want. So we
> need to specify which link types may be root links for each application,
> to avoid mining a lot of useless patterns. This can be set in the config
> file or through the scm interface, via the white and black lists of link
> types.
>
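A hedged Python sketch of how a white list and a black list of link types could restrict which links are considered as roots. The function and parameter names are hypothetical and do not correspond to the actual config keys or scm calls of the Pattern Miner; atoms are again plain tuples rather than real AtomSpace handles.

```python
# Atoms as nested tuples: nodes are (type, name), links are (type, child, ...).
# Names below are hypothetical; they are not the Pattern Miner's real options.

def is_node(atom):
    return len(atom) == 2 and isinstance(atom[1], str)

def sublinks(atom):
    """Yield the atom itself and every link nested inside it (nodes skipped)."""
    if is_node(atom):
        return
    yield atom
    for child in atom[1:]:
        yield from sublinks(child)

def root_candidates(atom, whitelist=None, blacklist=()):
    """Links allowed as pattern roots under the given type lists.

    whitelist=None means "every type is allowed unless blacklisted".
    """
    result = []
    for link in sublinks(atom):
        link_type = link[0]
        if link_type in blacklist:
            continue
        if whitelist is not None and link_type not in whitelist:
            continue
        result.append(link)
    return result

ben = ("ConceptNode", "Ben")
link_666 = ("ImplicationLink",
            ("EvaluationLink",
                ("PredicateNode", "eat"),
                ("ListLink", ben, ("ConceptNode", "cockroach"))),
            ("InheritanceLink", ben, ("ConceptNode", "weird")))

only_implication = root_candidates(link_666, whitelist={"ImplicationLink"})
three_types = root_candidates(
    link_666, whitelist={"ImplicationLink", "EvaluationLink", "InheritanceLink"})
print(len(only_implication), len(three_types))  # 1 3
```

With only ImplicationLink whitelisted there is a single candidate root, so only 1-gram patterns like Pattern 1 can arise from this link; allowing the other two types opens up the 2-gram combinations.
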
> *Pattern 2:*
> EvaluationLink
>      PredicateNode "eat"
>      ListLink
>           ConceptNode "var1"
>           ConceptNode "cockroach"
>
> InheritanceLink
>      ConceptNode "var1"
>      ConceptNode "weird"
>
> *Pattern 3:*
> EvaluationLink
>      PredicateNode "eat"
>      ListLink
>           ConceptNode "var1"
>           ConceptNode "cockroach"
>
> ImplicationLink
>      EvaluationLink
>          PredicateNode "eat"
>          ListLink
>                ConceptNode "var1"
>                ConceptNode "cockroach"
>      InheritanceLink
>           ConceptNode "var1"
>           ConceptNode "weird"
>
> *Pattern 4:*
> ImplicationLink
>      EvaluationLink
>          PredicateNode "eat"
>          ListLink
>                ConceptNode "Ben"
>                ConceptNode "var1"
>      InheritanceLink
>           ConceptNode "Ben"
>           ConceptNode "var2"
>
> InheritanceLink
>      ConceptNode "Nil"
>      ConceptNode "var2"
>
> *3. About unifying the link order in unordered links in the input data*
> It probably won't take too much time to code, because it should be quite
> similar to the logic of the pattern isomorphism identification algorithm
> that I already have in the pattern miner, where it is quite an important
> part. I should be able to reuse that logic.
>
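One simple way to unify the order of children in unordered links is to sort them recursively into a canonical order, sketched below in Python. This is an assumption about what such a step could look like, not the Pattern Miner's isomorphism algorithm, which must also deal with variable renaming. The set of unordered types is limited to AndLink, OrLink and SetLink for the example, and `canonical` is a hypothetical name.

```python
# Atoms as nested tuples: nodes are (type, name), links are (type, child, ...).
# Sketch only: sorts children of unordered links; does not rename variables.

UNORDERED = {"AndLink", "OrLink", "SetLink"}

def is_node(atom):
    return len(atom) == 2 and isinstance(atom[1], str)

def canonical(atom):
    """Return a copy of atom with the children of unordered links sorted."""
    if is_node(atom):
        return atom
    children = [canonical(child) for child in atom[1:]]
    if atom[0] in UNORDERED:
        children.sort(key=repr)  # any total, deterministic order will do
    return (atom[0],) + tuple(children)

def inh(x, concept):
    return ("InheritanceLink", ("ConceptNode", x), ("ConceptNode", concept))

married = ("EvaluationLink",
           ("PredicateNode", "married"),
           ("ListLink", ("ConceptNode", "x"), ("ConceptNode", "y")))

# The same implication written with the AndLink children in either order.
version_1 = ("ImplicationLink", ("AndLink", inh("x", "rich"), inh("y", "cute")), married)
version_2 = ("ImplicationLink", ("AndLink", inh("y", "cute"), inh("x", "rich")), married)

print(version_1 == version_2)                        # False
print(canonical(version_1) == canonical(version_2))  # True
```

Ordered links such as ImplicationLink and ListLink keep their child order, so swapping the two sides of an implication still gives a different atom.
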
> *4. About the interestingness evaluation*
>
> I didn't quite get the meaning of the rich(x) and z(y) and married(x,y)
> example.
> I think it is also related to the pattern gram. Take the 2 patterns below,
> where x, y and z are variables:
> pattern A:  rich(x) and z(y) and married(x,y)
> pattern B:  rich(x) and cute(y) and married(x,y)
>
> If they are represented as 3-gram patterns, then it may be possible to
> evaluate their interestingness simply by surprisingness:
> pattern A:
> InheritanceLink  x  rich
> InheritanceLink  y  z
> EvaluationLink married x y
>
> pattern B:
> InheritanceLink  x  rich
> InheritanceLink  y  cute
> EvaluationLink married x y
>
> If they are represented as 1-gram patterns, then I can implement an
> interestingness evaluation based on the variables inside one root link.
> pattern A:
> ImplicationLink
>     AndLink
>         InheritanceLink  x  rich
>         InheritanceLink  y  z
>     EvaluationLink married x y
>
> pattern B:
> ImplicationLink
>     AndLink
>         InheritanceLink  x  rich
>         InheritanceLink  y  z
>     EvaluationLink married x y
>
> *5. A suggestion to make up a very simple, tiny test data file*
> I suggest that Nil make up a simple test data file, just to test whether
> the output patterns are what you want and whether the frequency counts are
> correct. For example, I made up a simple data set before, the
> ugly-man-drink-soda file, which contains 10 men and 10 women; among them
> 5 women and 5 men are ugly, and also 5 women and 5 men drink soda. The
> miner is expected to find the pattern "ugly man drinks soda". For such a
> tiny file we can actually check every output pattern and its count to see
> if there is any bug. If it passes, we can apply the miner to a big corpus.
> Otherwise, a big corpus produces too many outputs and the result is hard
> to examine.
>
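A tiny corpus of this kind can be generated and checked by hand, as in the Python sketch below. The email does not say how ugliness and soda drinking overlap, so the sketch assumes that the 5 ugly men are exactly the 5 soda-drinking men, while the ugly women and the soda-drinking women are disjoint; the atom layout, the "drink" predicate name and all function names are likewise assumptions. The counts are computed by brute force, to serve as the expected values a miner's output could be compared against.

```python
# Toy corpus in the spirit of the ugly-man-drink-soda file.
# ASSUMPTION: ugly men == soda-drinking men; ugly women and soda-drinking
# women are disjoint. The original file's exact contents are not known here.

def inh(x, concept):
    return ("InheritanceLink", ("ConceptNode", x), ("ConceptNode", concept))

def drinks(x, what):
    return ("EvaluationLink",
            ("PredicateNode", "drink"),
            ("ListLink", ("ConceptNode", x), ("ConceptNode", what)))

def build_corpus():
    corpus, people = set(), []
    for i in range(10):
        man, woman = "man%d" % i, "woman%d" % i
        people += [man, woman]
        corpus.add(inh(man, "man"))
        corpus.add(inh(woman, "woman"))
        if i < 5:
            corpus.add(inh(man, "ugly"))
            corpus.add(drinks(man, "soda"))
            corpus.add(inh(woman, "ugly"))
        else:
            corpus.add(drinks(woman, "soda"))
    return corpus, people

def count_matches(corpus, people, concepts, drinks_soda):
    """Brute-force count of people matching all concepts (and soda, if asked)."""
    return sum(
        all(inh(p, c) in corpus for c in concepts)
        and (not drinks_soda or drinks(p, "soda") in corpus)
        for p in people)

corpus, people = build_corpus()
print(len(corpus))                                             # 40
print(count_matches(corpus, people, ["man", "ugly"], True))    # 5
print(count_matches(corpus, people, ["woman", "ugly"], True))  # 0
```
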
> Thanks,
> Shujing
>
> On Tue, Jun 20, 2017 at 4:48 AM, Ben Goertzel <[email protected]> wrote:
>
>> On Tue, Jun 20, 2017 at 2:29 AM, Nil Geisweiller
>> <[email protected]> wrote:
>> > What do you mean exactly by "useful(A==>D)"?
>>
>>
>> What I was thinking was:  If the implication [666], e.g.
>>
>> ImplicationLink [handle=666]
>>      EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "Ben"
>>                ConceptNode "cockroach"
>>
>>      InheritanceLink
>>           ConceptNode "Ben"
>>           ConceptNode "weird"
>>
>>
>> was used or created by the BC, and was found to be useful for whatever
>> inference the BC was doing when it used or created [666], then the
>> utility of this link should be annotated via
>>
>> EvaluationLink
>>      PredicateNode "useful"
>>      ListLink
>>              [666]
>>              [111]
>>
>>
>> where [111] is the handle of the target of the BC inference the BC was
>> doing when it created [666].
>>
>> So maybe my example should look more like
>>
>> A ==> B, B==>C  |- A==>C
>> A==>C, C ==> D |- A ==>D
>> HebbianLink (D,B)
>> useful(A==>D, T)
>>
>>
>> where T is a variable that matches the target of prior BC inferences...
>>
>> ben
>>
>> --
>> Ben Goertzel, PhD
>> http://goertzel.org
>>
>> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
>> boundary, I am the peak." -- Alexander Scriabin
>>
>
>
