On Tue, Apr 16, 2019 at 12:02 AM Nil Geisweiller <[email protected]> wrote:
> On 4/16/19 6:52 AM, Linas Vepstas wrote:
> > What about PLN? Well, today's PLN, built on the pattern matcher, will
> > run thousands of CPU cycles and then do a small handful of floating-point
> > ops.

I'm making several "meta" claims, which perhaps I should be more specific about.

First claim:

* The reason that deep learning has been so effective and successful is that they found a way of avoiding useless calculations. This is by a combination of two tricks: backpropagation and dimensional reduction.

* Backpropagation means that an order of magnitude (factors of N or N log N or N^2) of useless floating-point computations are eliminated, and only the useful, non-redundant calculations are kept.

* Dimensional reduction is the weak spot, the Achilles heel of NN's. It reduces the problem space to a size where results can be obtained relatively quickly; however, the reduced problem space is too small for human-level reasoning and language.

* Using highly-sparse matrix math instead of dimensional reduction preserves everything important, while eliminating the weaknesses of dimensional reduction. The calculation space remains small, but the representation space remains large/huge. NN's have small computation spaces (that's good, it makes them fast) but also small representation spaces (really bad, it destroys structure).

* I'm concerned that the current PLN architecture, of performing complex graph searches (using integer pattern-matching code) followed by infrequent numeric (floating-point) work, is very inefficient. That is, when comparing to NN's, the search-and-traversal is a waste of time and effort, and only the floating-point computations matter (are meaningful). So I'm wondering if the PLN algo can be reformulated to be more backpropagation-like, which means the math becomes a kind of "inner loop", while the graph-traversal parts of it become the "outer loop". This last sentence is extremely imprecise: it is meant to be inspirational, not practical.
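To make the backpropagation point concrete, here's a toy sketch (all names mine, nothing to do with any real NN library): reverse-mode differentiation gets the gradient with respect to all N inputs from one backward pass that reuses the forward intermediates, whereas one-at-a-time differentiation repeats the whole forward pass N times -- that's the factor-of-N of redundant float work being eliminated.

```python
def forward(xs):
    """Toy network: y = (sum of x_i)^2.  Returns y and the
    intermediate s, so the backward pass can reuse it."""
    s = sum(xs)          # N additions
    return s * s, s

def grad_reverse(xs):
    """One backward pass, reusing the forward intermediate:
    dy/dx_i = 2s for every i, computed once."""
    _, s = forward(xs)
    return [2.0 * s] * len(xs)

def grad_naive(xs, eps=1e-6):
    """N separate forward passes (finite differences):
    O(N^2) additions instead of O(N)."""
    y0, _ = forward(xs)
    g = []
    for i in range(len(xs)):
        bumped = list(xs)
        bumped[i] += eps
        yi, _ = forward(bumped)
        g.append((yi - y0) / eps)
    return g

xs = [1.0, 2.0, 3.0]
print(grad_reverse(xs))   # [12.0, 12.0, 12.0]
print(grad_naive(xs))     # approximately the same, with N times the work
```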
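And a toy sketch of the sparse-matrix bullet (pure Python, my own made-up representation; a real implementation would use something like CSR storage): the representation space is N-by-N and can be made as huge as you like, but the float work is bounded by the number of non-zero entries, not by N^2.

```python
# Representation space: N*N possible entries -- huge.
N = 100_000

# Highly sparse "matrix": only the non-zero entries are stored.
sparse = {(0, 1): 0.5, (1, 2): 0.25, (2, 0): 0.125}

def spmv(matrix, vec):
    """Sparse matrix-vector multiply.  The multiply-add count
    scales with the number of stored entries, not with N*N."""
    out = {}
    ops = 0
    for (i, j), w in matrix.items():
        if j in vec:
            out[i] = out.get(i, 0.0) + w * vec[j]
            ops += 1   # one multiply-add per stored entry, at most
    return out, ops

vec = {1: 2.0, 2: 4.0}       # also sparse
result, ops = spmv(sparse, vec)
print(result, ops)           # float work bounded by nnz = 3, not N*N
```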
To rephrase: PLN (and symbolic-AI approaches in general) preserves the large/huge representation space. That's good. Algorithmically, my gut intuition is that traditional symbolic-AI algos (such as PLN) are extremely CPU-inefficient: they spend too much time graph-searching and graph-traversing, and not enough time actually computing things (i.e. multiplying and adding floats).

The attempt with the ultra-super-sparse matrix code is to retain the giant representational spaces of symbolic AI, while minimizing the graph-search effort. This is done by working with a single, small, simple, fixed graph; i.e. by making multiply-add the inner loop, where the graph is held fixed, and making the outer loop be the exploration of different graph shapes, of how bigger graphs are assembled from smaller parts.

I think I have a fairly clear conception of how this works for language learning; I do not yet have an equivalent conception for reasoning. However, it's important to obtain this, and for more reasons than one: I think it's a better theoretical foundation, but also, it's exactly what IPU-style machines are very efficient at computing.

The "sheaves" paper is trying to explain exactly how to exchange the inner and outer loops. Viz., graphs are the slowly-changing things, the combinations of which sit in the outer loop (classical symbolic AI), and the weights/probabilities are rapidly updated in the inner loop. This works because everything looks regular, uniform at the local level: a vertex and its nearest neighbors all look "the same", simply because they are small, simple, tiny. Work with these small, tiny components *before* they are all joined up to form some mega-graph.

This is the same trick that NN deep learning is using, except that existing NN deep-learning algos also collapse/blur/average away the large-scale structure (which is fundamentally wrong). NN's do this because they do not know how to identify and manage that large-scale structure. They're blind to it.
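Here's roughly what I mean by the inner/outer loop exchange, as a toy sketch (everything here is hypothetical, my own stand-in names; the real objective would be some likelihood or truth-value computation, not this placeholder): the outer loop does the slow symbolic work of proposing graph shapes assembled from small parts, and the inner loop does nothing but regular multiply-add weight updates on a graph that is held fixed.

```python
def inner_loop(edges, weights, steps=200, lr=0.1):
    """Fast numeric loop: the graph is held fixed, only the floats
    change.  Every vertex looks 'the same' locally: a node and its
    nearest neighbors.  Toy update: pull neighbors toward agreement."""
    for _ in range(steps):
        for (a, b) in edges:
            delta = lr * (weights[b] - weights[a])
            weights[a] += delta   # one multiply-add per edge
            weights[b] -= delta
    return weights

def score(weights):
    """Toy objective (stand-in for a real likelihood): prefer
    weights that have come into agreement."""
    mean = sum(weights.values()) / len(weights)
    return -sum((w - mean) ** 2 for w in weights.values())

# Outer loop: explore different graph shapes assembled from small parts.
candidate_graphs = [
    [("a", "b"), ("b", "c")],              # chain
    [("a", "b"), ("b", "c"), ("c", "a")],  # triangle
]
init = {"a": 1.0, "b": 0.0, "c": 0.5}
best = None
for edges in candidate_graphs:
    weights = inner_loop(edges, dict(init))
    s = score(weights)
    if best is None or s > best[0]:
        best = (s, edges)
print(best[1])
```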
The weight-vector projections in NN's mash together, average together any/all relationships that are more than two nearest-neighbor distances apart.

-- Linas

-- 
cassette tapes - analog TV - film cameras - you
