[opencog-dev] Re: Pattern Matching performance

Linas Vepstas Mon, 12 Mar 2018 22:05:40 -0700

Resending. My email agent is formatting and indenting bizarrely.
I don't understand why its 2018 and something as simple as email
has ugly and unreliable formating.  Is this what the heat death of
the universe looks like?


--linas

On Mon, Mar 12, 2018 at 11:55 PM, Linas Vepstas <[email protected]> wrote:
> Hi,
>
> To repeat my earlier remarks, and add a few more:
>
> On Mon, Mar 12, 2018 at 3:42 PM, Alexey Potapov <[email protected]> wrote:
>>
>> Yes. But Pattern Matcher compares all pairs of bounding boxes making it
>> quadratic.

> Yes, there is supposed to be a space-server that is supposed to be optimized
> for these kinds of searches. There is one, but its super-minimal, and not at 
> all
> integrated with the pattern matcher, or anything else for that matter.  Smart
> searches on bounding boxes is completely green-field development, for the
> atomspace. We've kind of got very nearly nothing for this.


>>> Yes, but keep in mind that the PM is also Turing complete because you can
>>> call any function within a query (including the PM itself).
>>
>>
>> This is quite problematic. Basic processes should not execute anything
>> dangerous that can take too much time or loop forever and cannot be
>> interrupted. Thus, we should either not treat PM as a basic process, or
>> should restrict its capabilities (and shift responsibility for evaluating
>> arbitrary code to other processes).


> This is misleading or a mis-understanding. Its not the correct way to think
> about it.  The current pattern matcher is 2 or three or 4 things matched into
> one:
>
> A) A generic subgraph isomorphism solver. Since this is an NP complete
> problem, it's certainly possible to create pathologically slow queries.
>
> B) A way of combining subgraphs using a crisp-logic boolean algebra
> (actually a Heyting algebra) which we have very vague intentions of
> promoting into something probabilistic.  Which, of course, if this was done,
> would layer on an additional combinatoric explosion.  It would be fruitful to
> discuss the wisdom or stupidity of this particular task. Or alternative 
> designs
> for it.
>
> C) The ability to perform subgraph isomorphism with so-called "axiom 
> schemata".
> An "axiom schemata" is roughly an infinite collection of relations, for 
> example
> "less than" over the integers or rationals or reals.   This means that the 
> pattern
> matcher is kind-of-like-ish a "satisfiability modulo theories" (SMT) solver.  
> The
> API for specifying a theory at this point is rather very simplistic. The 
> "virtual link"
> is  that API. It says, basically "implement your model theory here, as C++ 
> code,
> and we will automatically do the satisfiability modulo your theory for you"
>
> Currently supported theories are the equational theory (EqualLink) and numeric
> inequaltiy (GreaterThanLink)  An example of a nice-to-have theory would be
> linear algebra - done right, this could  solve your space-time bounding box 
> problem
> for starters, and linear programming type problems if anyone cared about that.
> Maybe matroids. Whatever. Dunno.  Another nice-to-have would be naive set 
> theory
> which could help lay a cornerstone of probability done right (TM) but this 
> would be
> a long and difficult but veryinteresting discussion.
>
> D) Once a matching subgraph is found, you can launch arbitrary 
> C++/scheme/python
> code to do something with that subgraph. So that's unbounded.


>>> So, the first thing you should do is build a good benchmark tool, then we,
>>> you and the rest of opencog community, can supply it with a collection of
>>> critical tests.
>>
>> How do you see such tool?

> Unclear. I am interested in a tool that tells me if performance got worse 
> after
> a particular code change or bug fix.  Some of our fixes accidentally slow 
> things
> down (by a lot) and no one noticies for months or half a year.

>> We have some datasets of varying sizes and
>> queries, and simply run PM on these queries and measure the time. What can
>> be unified/automated?

> Yes!?

>> Running all tests and writing log files with
>> computation times? Anything else?

> I don't care about log files.

>> I agree that we need several (many?) tests to be sure that some changes
>> didn't affect any types of queries, but isn't it just a script? Or do you
>> have something much more complicated in mind?

> Take a look at 3D graphics performance: there are two types of benchmark:
> triangles per second, lines per second, texture maps per second.

> The other type is "for game XYZ, frames per second".

> We need both types of benchmarks.  Probably the first more than the second,
> because it helps developers more.  The second kind just tells you how screwed
> up the system is today, without telling you why, or where to look.


>> In our case, the query processing time becomes unrealistically large for a
>> one-minute video.  If we consider the problem of search in the entire
>> episodic memory, it should be even not linear, but logarithmic. Low degree
>> polynomial complexity is ok for the task dimensions ~1000, not millions or
>> billions...

> No clue what you are searching here.  Logarithmic searches mean binary tree,
> quad-tree, octree.  The space-time sever is an octree, but its not integrated 
> with
> the pattern matcher, and has a super-naive API.


> We have binary-trees and hash tables for atoms-by-name, by-type, but zero
> sophistication for numeric values. See above comments about "satisfiability
> modulo theories".


> Linas
>
>
> --
> cassette tapes - analog TV - film cameras - you



-- 
cassette tapes - analog TV - film cameras - you

-- 
You received this message because you are subscribed to the Google Groups 
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/opencog/CAHrUA347GFq0%3DeBZuXYPegD1%2B4%3D2weY6X68fnFF9WOTwUWWh8g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

[opencog-dev] Re: Pattern Matching performance

Reply via email to