Re: [opencog-dev] Re: Pattern mining from PLN inference histories

Ben Goertzel Mon, 19 Jun 2017 09:50:45 -0700

(Nil, please look at the end of this email, I have a suggestion for
you there...)



On Wed, Jun 14, 2017 at 9:32 PM, Shujing Ke <shujin...@gmail.com> wrote:
> 3. The interestingness evaluation is different from previous applications
> Our interestingness evaluation is based on a surprisingness measure, which
> includes Surprisingness_I and Surprisingness_II:
> Surprisingness_I: how difficult it is to infer the actual frequency of an
> n-gram pattern from the frequencies of all its (n-1)-gram to 1-gram
> subpatterns.
> Surprisingness_II: how difficult it is to infer the actual frequency of an
> n-gram pattern from the frequencies of all its (n+1)-gram superpatterns.
> But in the PLN corpus we only mine 1-grams, and I guess the interesting
> patterns you want to identify here are the patterns with "the max degree of
> abstraction", for example:
> pattern1: (x and y are friends) (x is musician) (y is musician) (z is
> musician) (z and y are friends)->(x and z are friends)
> pattern2: (x and y are friends) (x is var_job) (y is var_job) (z is var_job)
> (z and y are friends)->(x and z are friends)
> If pattern 1 occurs 10 times and pattern 2 also occurs 10 times, it means
> that pattern 2 only holds when var_job = musician, so abstracting to
> pattern 2 makes no sense. So pattern 1 is already the max degree of
> abstraction in this case. If my understanding is right, then I will need to
> write a new interestingness evaluation for this, because it is different
> from the surprisingness measure.

Hmm... well I am not sure if I am interpreting your example right...

Is the idea of the example that x, y and z are pattern-miner
variables, whereas var_job is an Atomspace VariableNode (not, in the
current implementation, what would be a PatternVariableNode)?

In that case ...let's consider a simpler example

pattern 1 = rich(x) and cute(y) and married(x,y)

pattern 2 = ThereExists z: rich(x) and z(y) and married(x,y)

pattern 3 = ForAll z: rich(x) and z(y) and married(x,y)

So in each of these cases, as I intend them: Everything except the x
and y is assumed to be there in the Atomspace.   Only the x and y are
the PatternVariables...

If we have 100 occurrences of the pattern

ThereExists z: rich(x) and z(y) and married(x,y)

i.e. 100 cases such as

ThereExists z: rich(Bill) and z(Jane) and married(Bill, Jane)

ThereExists z: rich(Mary) and z(Kate) and married(Mary, Kate)

... etc.

-- then this is a valid pattern, right?
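
To make the distinction concrete, here is a rough Python sketch (not the
actual pattern miner code; the tuple encoding, the "$PV_" prefix and the
function name match are all made up for illustration). Pattern variables
get bound during matching, whereas the quantified z is ordinary structure
that must simply be present in each instance:

```python
# Terms are nested tuples. Strings starting with "$PV_" are pattern-miner
# variables (hypothetical convention). An Atomspace VariableNode is just
# ordinary structure that has to match literally.

def match(pattern, term, bindings):
    """Return extended bindings if term is an instance of pattern, else None."""
    if isinstance(pattern, str) and pattern.startswith("$PV_"):
        if pattern in bindings:
            return bindings if bindings[pattern] == term else None
        new = dict(bindings)
        new[pattern] = term
        return new
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            bindings = match(p, t, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None

Z = ("VariableNode", "z")  # quantified variable, part of the data itself

def instance(x, y):
    # ThereExists z: rich(x) and z(y) and married(x, y)
    return ("Exists", Z, ("And", ("rich", x), (Z, y), ("married", x, y)))

PATTERN = instance("$PV_x", "$PV_y")

atomspace = [
    instance("Bill", "Jane"),
    instance("Mary", "Kate"),
    # Not an instance: the married pair does not line up with rich/z.
    ("Exists", Z, ("And", ("rich", "Tom"), (Z, "Ann"), ("married", "Tom", "Sue"))),
]

matches = [b for b in (match(PATTERN, a, {}) for a in atomspace) if b is not None]
count = len(matches)
```

Only the first two entries count as occurrences; the presence of z inside
them plays no role in the binding of x and y.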

A more realistic example would be in calculus where you have many patterns like

ForAll epsilon: epsilon>0, ( ThereExists delta: ( delta>0 and
abs(x-y)<delta ==> abs( f(x) - f(y)) < epsilon))

ForAll epsilon: epsilon>0, ( ThereExists delta: ( delta>0 and
abs(x-y)<delta ==> abs( g(x) - g(y)) < epsilon))

...

where "epsilon" and "delta" and "x" and "y" are (in an OpenCog
representation) VariableNodes ...

So in this case, f and g are the constants being matched by
PatternVariables, so we could have a PatternVariable $PV and the
pattern miner could recognize the pattern

ForAll epsilon: epsilon>0, ( ThereExists delta: ( delta>0 and
abs(x-y)<delta ==> abs( $PV(x) - $PV(y)) < epsilon))

This is a nice pattern to find, as it's a pattern that exists in lots
of calculus proofs (it just says $PV is continuous...) ... it's no
problem that the pattern has lots of quantified variables and
quantifiers in it...
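
As a rough sketch of how such a pattern could be recovered from two
instances, here is a toy anti-unification in Python. This is only an
illustration under my own assumptions: the tuple encoding of the
epsilon-delta statement, and the names continuity and generalize, are
invented here and are not how the pattern miner actually works.

```python
def generalize(a, b, table):
    """Keep what two terms share; replace each differing pair of subterms
    with a pattern variable (the same pair always gets the same variable)."""
    if a == b:
        return a
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        return tuple(generalize(x, y, table) for x, y in zip(a, b))
    if (a, b) not in table:
        table[(a, b)] = "$PV" if not table else "$PV%d" % (len(table) + 1)
    return table[(a, b)]

def continuity(fn):
    # One possible encoding of the epsilon-delta statement for function fn.
    x, y, eps, delta = [("VariableNode", n) for n in ("x", "y", "epsilon", "delta")]
    premise = ("<", ("abs", ("-", x, y)), delta)
    conclusion = ("<", ("abs", ("-", (fn, x), (fn, y))), eps)
    body = ("And", (">", delta, "0"), ("Implies", premise, conclusion))
    return ("ForAll", eps, (">", eps, "0"), ("Exists", delta, body))

table = {}
pattern = generalize(continuity("f"), continuity("g"), table)
```

The quantified variables are identical in both instances, so they pass
through untouched; only the f/g slot is abstracted into $PV.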

...

In the PLN case, if we take an example possible pattern like "two
deductions in a row, involving associated entities, are often useful"
that would look like

A ==> B, B==>C  |- A==>C
A==>C, C ==> D |- A ==>D
HebbianLink (D,B)
useful(A==>D)

So the first two of these 4 lines are going to be embedded in a single
ExecutionOutputLink, I guess....  Then the other two will be their own
separate links in the Atomspace...

Suppose this pattern occurs 10 times in the Atomspace.   Each of these
times, we will have different Atoms in the slots for A, B, C, D.  Some
of these may be complex, e.g. we might have in one case

A  equals

MemberLink
    VariableNode $X
    SatisfyingSet
        EvaluationLink
            PredicateNode "piece of poop"
            ListLink
                VariableNode $X
                ConceptNode "cheese doodle"


or whatever...  In this case the fact that there's a VariableNode $X
in the interior of A doesn't matter.
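
Here is a rough Python sketch of what matching this four-line pattern
against a toy inference history might look like. Everything in it is an
assumption made for illustration: the flat tuple representation (the real
thing would go through ExecutionOutputLinks in the Atomspace), and the
names imp, is_deduction and find_chains.

```python
def imp(a, b):
    return ("Implies", a, b)

def is_deduction(step):
    """If step has the shape  P==>Q, Q==>R |- P==>R  return (P, Q, R), else None."""
    (h1, p, q), (h2, q2, r), (h3, p2, r2) = step
    ok = h1 == h2 == h3 == "Implies" and q == q2 and p == p2 and r == r2
    return (p, q, r) if ok else None

def find_chains(steps, hebbian, useful):
    """Find two chained deductions with HebbianLink(D, B) and useful(A==>D)."""
    hits = []
    for first in steps:
        d1 = is_deduction(first)
        if d1 is None:
            continue
        a, b, c = d1
        for second in steps:
            d2 = is_deduction(second)
            if d2 is None:
                continue
            a2, c2, d = d2
            if (a2, c2) == (a, c) and (d, b) in hebbian and imp(a, d) in useful:
                hits.append({"A": a, "B": b, "C": c, "D": d})
    return hits

# A is a complex Atom with a VariableNode inside; it is matched as a whole.
X = ("VariableNode", "$X")
A = ("MemberLink", X,
     ("SatisfyingSet",
      ("EvaluationLink", ("PredicateNode", "piece of poop"),
       ("ListLink", X, ("ConceptNode", "cheese doodle")))))
B, C, D = [("ConceptNode", n) for n in ("B", "C", "D")]

steps = [
    (imp(A, B), imp(B, C), imp(A, C)),
    (imp(A, C), imp(C, D), imp(A, D)),
    (imp(B, C), imp(C, D), imp(B, D)),  # a deduction outside the useful chain
]
hebbian = {(D, B)}
useful = {imp(A, D)}

hits = find_chains(steps, hebbian, useful)
```

The $X inside A is never inspected on its own; A fills the slot as one
opaque term, which is the point of the example.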

Nil, it will take some work, but maybe it's worthwhile for you to
create a test Atomspace in which my above example pattern

A ==> B, B==>C  |- A==>C
A==>C, C ==> D |- A ==>D
HebbianLink (D,B)
useful(A==>D)

is a surprising pattern, and in which some of the examples of A, B, C
or D have some complexity to them (some internal quantified
variables).

Having a more "real" example like this might help avoid any confusion
and aid Shujing in getting the pattern miner to work on PLN inference
histories in a useful way.

ben




-- 
Ben Goertzel, PhD
http://goertzel.org

"I am God! I am nothing, I'm play, I am freedom, I am life. I am the
boundary, I am the peak." -- Alexander Scriabin
