Ben,

On Wed, Feb 20, 2019 at 2:39 AM Ben Goertzel <b...@goertzel.org> wrote:
> ... The unfortunate fact is we can't currently feed as much data into
> our OpenCog self-adapting graph as we can into a BERT type model, given
> available resources... thus using the latter to help tweak weights in
> the former may have significant tactical advantage...

You can't feed as much data into your graph as you can into a BERT-type model? How are you feeding data into your graph? Shouldn't this just be observation? Isn't -> it -> as -> simple -> as -> this -> ?

This stuff is very close. First there is Linas's observation that vector representations too have linearities and are thus inadequate. That maps to the insight I noted before about my vector model: by starting with vectors I had already thrown a lot of contextual variation away, and I needed a graph representation. And I like what seems to be a link to the need for any formalization to be category-theoretic, or at least some kind of gauge/invariant theory -- even QM maths at one point, a la Coecke et al.

But I still fear there is some idea of learning which is trapping you. Looking at what Linas has written in the other thread:

LV: "Use symbolic systems, but use fractional values, not 0/1 relations. Find a good way of updating the weights. So, deep-learning is a very effective weight-update algorithm. But there are other ways of updating weights too (that are probably just as good or better). Next, clarify the vector-space-vs-graph-algebra issue, and then you can clearly articulate how to update weights on symbolic systems, as well."

For sure there are other ways of updating the weights which are just as good or better! How much better for the weights to be virtual, corresponding to clusters of observed links. The "update" mechanism can then just be a clustering.

Deep learning is not a great update mechanism. Firstly because it does not have the formal power of the graph you are trying to inform. Right? We just agreed vectors have linearities, didn't we?
(LV: "Although vector spaces are linear, semantics isn't; not really.")

So their power will never be enough. Using DL you are crippling the power of the full graph you have just decided you need.

And there are other things too. Deep nets need linearities in the way their weights are updated, so information can propagate down through the layers. And they impose a structure on your graph, in the connectivity and the layers, quite apart from linearities. All those carefully crafted "attention" layers etc. are a hack on full connectivity. To use them is to throw away so much of the power of a full graph. And for what? So you have a weight-update mechanism? A weight-update mechanism which makes assumptions you want to throw away.

And they don't even "update weights" to find new structure all the time, in real time -- which is really what the distinction between symbolism and distributed representation should be; I worry that may be getting lost. Even my vector model had that, by substituting vectors into other vectors. That was the point of it: to portray the cognition problem as one of creating/generating patterns, not learning them. That gets lost if you tie everything to a deep-learning weight estimation.

Rather, the weight-update mechanism should just be a clustering. Probably just oscillations on the network.

As regards formalization: we can sweat blood to formalize groupings, to visualize patterns of connectivity as symbols. As Coecke etc. theorized, that formalism will probably need to be category-theoretic or, in another thread of literature, use quantum-mechanical maths. But the vector-space-vs-graph-algebra issue and the complex maths go away if you are not worried about formalization and only want a functioning system. It's backwards to insist on formalization, so you can formulate the problem in terms of updating weights on symbols, when the network is already a perfectly good representation in itself. The groupings are easier to generate than to describe.
(You can formalize them, but you will just get a formalism which is indeterminate, like QM. You will need the network to resolve the indeterminacy anyway -- embodiment.)

Perhaps we will need to crystallize out a formalization to move to reasoning systems. But for raw perception, the network will be enough. And raw perception is the big failure at the moment -- self-driving cars etc.

LV: "the path forward is at the intersection of the two: a net of symbols, a net with weights, a net with gradient-descent properties, a net with probabilities and probability update formulas."

It can be seen this way. But both the symbols and the weights should be virtual, corresponding to clusters, with the clusters projected out in real time. Deep nets are a really bad way to do this.

I must be missing something in the data format of your network. I don't see an argument why it can't be as simple as:

1) Establish a network of sequential observations. (Super easy for text. Just -> like -> this -> .)

2) Set the network oscillating. (To project out the symbolism, weights, probabilities, etc. as groups of observations with synchronized oscillations in this network, resolved by varying inhibition.)

-Rob

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T581199cf280badd7-M22a2f111cd6027d40196733a
Delivery options: https://agi.topicbox.com/groups/agi/subscription
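P.S. A minimal sketch of what step (1), a network of sequential observations, could look like for text. All names and the data format here are mine and purely illustrative (not OpenCog's, and not a claim about anyone's actual system); the point is only that the "weights" are nothing but counts of observed links, so the update mechanism is observation itself.

```python
# Sketch of step (1): a network of sequential observations over text.
# Illustrative only -- the graph is a map from each word to the words
# observed to follow it, with counts as "virtual" link weights.
from collections import defaultdict

def observe(text, graph=None):
    """Record each observed word -> next-word link in the graph.
    Weights are just counts of observed links; nothing is trained."""
    if graph is None:
        graph = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for a, b in zip(words, words[1:]):
        graph[a][b] += 1  # Isn't -> it -> as -> simple -> as -> this
    return graph

g = observe("Isn't it as simple as this ?")
print(dict(g["as"]))  # {'simple': 1, 'this': 1}
```

Step (2) would then operate over exactly this link structure: the clusters projected out by synchronized oscillations are groupings of these observed links, and the counts above are the virtual weights the clustering reads off.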