Re: [opencog-dev] Re: Pattern mining from PLN inference histories

Shujing Ke Tue, 20 Jun 2017 17:40:48 -0700

Actually, for point 2 in my previous email, a clearer example is Pattern 5
below, a 2-gram pattern found when only ImplicationLinks are allowed to be
root links:


*Pattern 5:*
ImplicationLink
     EvaluationLink
         PredicateNode "var1"
         ListLink
               ConceptNode "Ben"
               ConceptNode "var2"
     InheritanceLink
          ConceptNode "Ben"
          ConceptNode "var3"

ImplicationLink
     EvaluationLink
         PredicateNode "var1"
         ListLink
               ConceptNode "Nil"
               ConceptNode "var2"
     InheritanceLink
          ConceptNode "Nil"
          ConceptNode "var3"

It is of course not interesting, but these two ImplicationLinks are connected
via "eat", "weird" and "cockroach", which is why I think that in PLN data only
1-gram patterns are worth mining.
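
A minimal Python sketch of how Pattern 5 matches the two links, for illustration only and not the pattern miner's actual code. It assumes atoms are written as nested tuples, that any node name starting with "var" is a pattern variable (as in the notation above), and the helper names unify and implication are hypothetical:

```python
def is_node(atom):
    # A node is a (type, name) pair; a link is (type, child, child, ...).
    return len(atom) == 2 and isinstance(atom[1], str)

def unify(pattern, atom, bindings):
    """Return bindings extended so that pattern matches atom, or None."""
    if is_node(pattern):
        if not is_node(atom) or pattern[0] != atom[0]:
            return None
        name = pattern[1]
        if name.startswith("var"):  # assumption: "var..." marks a variable
            if name in bindings and bindings[name] != atom[1]:
                return None
            return {**bindings, name: atom[1]}
        return bindings if name == atom[1] else None
    if is_node(atom) or pattern[0] != atom[0] or len(pattern) != len(atom):
        return None
    for sub_pattern, sub_atom in zip(pattern[1:], atom[1:]):
        bindings = unify(sub_pattern, sub_atom, bindings)
        if bindings is None:
            return None
    return bindings

def implication(who, predicate, obj, concept):
    # Builds: predicate(who, obj) implies who inherits from concept.
    return ("ImplicationLink",
            ("EvaluationLink",
             ("PredicateNode", predicate),
             ("ListLink", ("ConceptNode", who), ("ConceptNode", obj))),
            ("InheritanceLink", ("ConceptNode", who), ("ConceptNode", concept)))

link_666 = implication("Ben", "eat", "cockroach", "weird")
link_777 = implication("Nil", "eat", "cockroach", "weird")

# Pattern 5: two root links that share var1, var2 and var3.
pattern_5 = [implication("Ben", "var1", "var2", "var3"),
             implication("Nil", "var1", "var2", "var3")]

gram = len(pattern_5)  # the gram is just the number of root links
bindings = {}
for root, link in zip(pattern_5, (link_666, link_777)):
    bindings = unify(root, link, bindings)
    if bindings is None:
        break
print(gram, bindings)
```

The shared bindings are exactly the constants the two links have in common, which is why the 2-gram pattern adds nothing beyond what the 1-gram patterns already capture.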


On Wed, Jun 21, 2017 at 2:31 AM, Shujing Ke <[email protected]> wrote:

> Correcting a typo:
>
> In point 4 of my previous email, pattern B should be:
> ImplicationLink
>     AndLink
>         InheritanceLink  x  rich
>         InheritanceLink  y  cute
>     EvaluationLink married x y
>
> On Wed, Jun 21, 2017 at 2:29 AM, Shujing Ke <[email protected]> wrote:
>
>> Hi, Ben and Nil,
>>
>> Thanks for all your responses. I may be a bit slow this week: it is too
>> warm here and my baby is sick; he has barely eaten or drunk anything since
>> yesterday morning.
>>
>> *1. About the output format and TV of patterns*
>> The pattern miner will output the raw patterns found in the input data
>> (without further processing), because different modules in OpenCog and
>> different applications may require different output formats. There
>> shouldn't be only one output format. For now we can base our discussion on
>> the raw pattern format. Once we are sure the contents of the patterns are
>> right, we can discuss the output formats for different modules. If I have
>> time then, I can implement them; if I don't, it should also be easy for
>> each module's developer to turn the raw patterns into the format they
>> want. It is better to do this in a separate layer outside the pattern
>> miner, which makes it more convenient for each module to change its
>> pattern format in future. Otherwise, whenever a module wants to change a
>> format, the pattern miner core has to be modified.
>>
>>
>> *2. About the pattern gram*
>> Actually the gram doesn't exactly indicate the size of a pattern; it
>> just means the number of root links in a pattern.
>>
>> A ==> B, B==>C  |- A==>C
>> A==>C, C ==> D |- A ==>D
>> HebbianLink (D,B)
>> useful(A==>D)
>>
>> Yes, it could be a 4-gram, but it can also be a 1-gram, depending on the
>> input data.
>> If you have one big link like:
>>
>> ImplicationLink
>>      AndLink
>>           ImplicationLink A B
>>           ImplicationLink B C
>>           ImplicationLink C D
>>     ImplicationLink A D
>>
>> Then this pattern will be 1-gram.
>>
>> Take the cockroach pattern as another example.
>> Suppose you have handles 666 and 777:
>>
>> ImplicationLink [handle=666]
>>      EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "Ben"
>>                ConceptNode "cockroach"
>>      InheritanceLink
>>           ConceptNode "Ben"
>>           ConceptNode "weird"
>>
>>
>> ImplicationLink [handle=777]
>>      EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "Nil"
>>                ConceptNode "cockroach"
>>      InheritanceLink
>>           ConceptNode "Nil"
>>           ConceptNode "weird"
>>
>> If only ImplicationLinks are allowed to be root links, then Pattern 1 below
>> is a 1-gram pattern:
>> *Pattern 1:*
>> ImplicationLink
>>      EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "var1"
>>                ConceptNode "cockroach"
>>      InheritanceLink
>>           ConceptNode "var1"
>>           ConceptNode "weird"
>>
>> If EvaluationLinks and InheritanceLinks are also allowed to be root links,
>> then Patterns 2, 3 and 4 are all 2-gram patterns, because they contain two
>> root links. Of course, in this case, Patterns 3 and 4 do not make much sense,
>> but in the DBpedia data, these types of patterns are what we want. So we
>> need to specify which link types may be root links for each application,
>> to avoid mining a lot of useless patterns. This can be set in the config
>> file or through the scm interface, via the white and black lists of link
>> types.
>>
>> *Pattern 2:*
>> EvaluationLink
>>      PredicateNode "eat"
>>      ListLink
>>           ConceptNode "var1"
>>           ConceptNode "cockroach"
>>
>> InheritanceLink
>>      ConceptNode "var1"
>>      ConceptNode "weird"
>>
>> *Pattern 3:*
>>  EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "var1"
>>                ConceptNode "cockroach"
>>
>> ImplicationLink
>>      EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "var1"
>>                ConceptNode "cockroach"
>>      InheritanceLink
>>           ConceptNode "var1"
>>           ConceptNode "weird"
>>
>> *Pattern 4:*
>> ImplicationLink
>>      EvaluationLink
>>          PredicateNode "eat"
>>          ListLink
>>                ConceptNode "Ben"
>>                ConceptNode "var1"
>>      InheritanceLink
>>           ConceptNode "Ben"
>>           ConceptNode "var2"
>>
>>  InheritanceLink
>>      ConceptNode "Nil"
>>      ConceptNode "var2"
>>
>> *3. About unifying link order in unordered links in the input data*
>> It probably won't take much time to code, because it should be quite
>> similar to the logic of the pattern isomorphism identification algorithm
>> that I already have in the pattern miner, where it is quite an important
>> part. I should be able to reuse that logic.
>>
>> *4. About the interestingness evaluation*
>>
>> I didn't quite get the meaning of the rich(x) and z(y) and married(x,y)
>> example.
>> I think it is also related to the pattern gram. Take the 2 patterns below,
>> where x, y, z are variables:
>> pattern A:  rich(x) and z(y) and married(x,y)
>> pattern B:  rich(x) and cute(y) and married(x,y)
>>
>> If they are represented as 3-gram patterns, then it may be possible to
>> evaluate their interestingness just by surprisingness:
>> pattern A:
>> InheritanceLink  x  rich
>> InheritanceLink  y  z
>> EvaluationLink married x y
>>
>> pattern B:
>> InheritanceLink  x  rich
>> InheritanceLink  y  cute
>> EvaluationLink married x y
>>
>> If they are represented as 1-gram patterns, then I can implement an
>> interestingness evaluation based on the variables inside one root link.
>> pattern A:
>> ImplicationLink
>>     AndLink
>>         InheritanceLink  x  rich
>>         InheritanceLink  y  z
>>     EvaluationLink married x y
>>
>> pattern B:
>> ImplicationLink
>>     AndLink
>>         InheritanceLink  x  rich
>>         InheritanceLink  y  z
>>     EvaluationLink married x y
>>
>> *5. A suggestion to make up a very simple tiny test data file*
>> I suggest that Nil make up a simple test data file just to test whether the
>> output patterns are what you want and whether the frequency counts are
>> correct. For example, I made up a simple data set before, the
>> ugly-man-drink-soda file, which contains 10 men and 10 women, among them 5
>> women and 5 men who are ugly and also 5 women and 5 men who drink soda; the
>> miner is expected to find the pattern "ugly man drink soda". For such a
>> tiny file, we can actually check every output pattern and its count to see
>> if there is any bug. If it passes, then we can apply the miner to a big
>> corpus. Otherwise, a big corpus produces too many outputs and it is hard
>> to examine the result.
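
A tiny data set of this kind can be built and checked by hand in a few lines of Python. The sketch below is a hypothetical reconstruction, not the original ugly-man-drink-soda file: the email gives only the totals, so which individuals are ugly or drink soda is an assumption (the ugly men are taken to be exactly the soda-drinking men, while for women the two groups overlap only partly).

```python
people = []
for i in range(10):
    # Assumption: the 5 ugly men are exactly the 5 men who drink soda.
    people.append({"name": f"man{i}", "man": True,
                   "ugly": i < 5, "soda": i < 5})
for i in range(10):
    # Assumption: 5 ugly women and 5 soda-drinking women, only 2 are both.
    people.append({"name": f"woman{i}", "man": False,
                   "ugly": i < 5, "soda": 3 <= i < 8})

def count(**conditions):
    """Number of people for whom every given attribute has the given value."""
    return sum(all(person[key] == value for key, value in conditions.items())
               for person in people)

# The totals described in the email.
print(count(man=True), count(man=False))
print(count(man=True, ugly=True), count(man=False, ugly=True))
print(count(man=True, soda=True), count(man=False, soda=True))

# Counts a miner should reproduce for the candidate patterns.
print(count(man=True, ugly=True, soda=True))
print(count(man=False, ugly=True, soda=True))
```

Because every count is small enough to verify by eye, any mismatch between these numbers and the miner's frequency counts points directly at a bug.
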
>>
>> Thanks,
>> Shujing
>>
>> On Tue, Jun 20, 2017 at 4:48 AM, Ben Goertzel <[email protected]> wrote:
>>
>>> On Tue, Jun 20, 2017 at 2:29 AM, Nil Geisweiller
>>> <[email protected]> wrote:
>>> > What do you mean exactly by "useful(A==>D)"?
>>>
>>>
>>> What I was thinking was:  If the implication [666], e.g.
>>>
>>> ImplicationLink [handle=666]
>>>      EvaluationLink
>>>          PredicateNode "eat"
>>>          ListLink
>>>                ConceptNode "Ben"
>>>                ConceptNode "cockroach"
>>>
>>>      InheritanceLink
>>>           ConceptNode "Ben"
>>>           ConceptNode "weird"
>>>
>>>
>>> was used or created by the BC, and was found to be useful for whatever
>>> inference the BC was doing when it used or created [666], then the
>>> utility of this link should be annotated via
>>>
>>> EvaluationLink
>>>      PredicateNode "useful"
>>>      ListLink
>>>              [666]
>>>              [111]
>>>
>>>
>>> where [111] is the handle of the target of the BC inference the BC was
>>> doing when it created [666].
>>>
>>> So maybe my example should look more like
>>>
>>> A ==> B, B==>C  |- A==>C
>>> A==>C, C ==> D |- A ==>D
>>> HebbianLink (D,B)
>>> useful(A==>D, T)
>>>
>>>
>>> where T is a variable that matches the target of prior BC inferences...
>>>
>>> ben
>>>
>>> --
>>> Ben Goertzel, PhD
>>> http://goertzel.org
>>>
>>> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
>>> boundary, I am the peak." -- Alexander Scriabin
>>>
>>
>>
>
