Linas, OK, let's continue the discussion about probabilities.

2018-05-22 21:18 GMT+03:00 Linas Vepstas <[email protected]>:
> Do not assume that a probability is what you actually want. Let me give
> three examples.
>
> In real life, when you see a crow, and it is dark, and you want to talk
> about it, you just say "black crow" as an identifier of the object in the
> scene. You don't pull out your photometer and measure its darkness at
> 87.68% and a blueish hue of 77%. Why? Because you don't need to do that to
> have a conversation about its presence, location, movement, etc. You only
> need to evaluate crow-ness and blackness sufficiently to distinguish it
> from all other elements of the scene, and then you can assign P==100% for
> most practical purposes.

I cannot agree; this is controversial at the very least. My vision system does perform photometry. Why? Because it is designed to reconstruct invariant physical properties (like albedo) of the surrounding world. Humans may not have direct conscious access to the specific values, but many visual 'illusions' show that this reconstruction happens. E.g., the 'chessboard' (checker-shadow) illusion shows exactly that we are not interested in raw brightnesses, but try to recover the reflective properties of observable surfaces.

Then, what is 'crow-ness'? If somebody tells me, "Hey, there is a white crow," my vision system will try to detect and recognize it given the a priori information about its presence, and then conclude whether it really is a crow or not. Your "crow-ness" is a rough estimate of P(crow|...). There are many practical situations in which I'm not sure I see a crow, because it is too distant, or flying too fast, or is a badly drawn picture, so I conclude that this might be a crow with higher or lower confidence. Even if I don't need detailed information at the level of consciousness to talk about this particular crow, this doesn't mean that my vision system doesn't try to assign a probability to this object being a crow.
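The point that "crow-ness" is a rough estimate of P(crow|...) can be illustrated with a toy naive-Bayes calculation. This is only a sketch: the class names, priors, and likelihood values below are made-up illustrative numbers, not anything from real vision data.

```python
# Sketch: "crow-ness" as a rough estimate of P(crow | evidence),
# via a naive-Bayes approximation. All numbers are hypothetical.
import math

prior = {"crow": 0.2, "raven": 0.05, "other_bird": 0.75}

# P(feature | class) for two coarse observations; illustrative values only
likelihood = {
    "crow":       {"black": 0.9, "caws": 0.8},
    "raven":      {"black": 0.9, "caws": 0.6},
    "other_bird": {"black": 0.1, "caws": 0.05},
}

def posterior(observations):
    # naive Bayes: multiply class-conditional likelihoods, then normalize
    scores = {
        c: prior[c] * math.prod(likelihood[c][o] for o in observations)
        for c in prior
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

p = posterior(["black", "caws"])
print(p["crow"])  # confidence that this object is a crow, given the evidence
```

The posterior is only as good as the (crude) independence assumption and the made-up numbers, which is exactly the point: an approximate probability is still a probability.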
Even if my vision system works in a discriminative way and tries to distinguish crows from other objects, this can still be treated as assigning probabilities. These are merely conditional, not joint, probabilities, so they cannot be used in any other direction of inference, only to estimate whether this is a crow or not; but they are probabilities nonetheless.

When I say 'probability', I don't, of course, mean its precise objective value. At the least, I mean a variational approximation of subjective posteriors, or a mean-field approximation, or a naive-Bayes approximation, or a fuzzy-logic approximation, or whatever other approximation. Of course, we don't try to calculate these probabilities precisely, because that is computationally too expensive. But approximate probabilities are still probabilities. People who say that these are not probabilities actually insist on one particular way of approximating probabilities, and they are wrong, because there is no universal way of approximating probabilities. Sometimes you can calculate them precisely. Sometimes you can use a variational approximation with DNNs. Sometimes you can use fuzzy logic. Sometimes you may not even calculate these probabilities or their approximations explicitly, as in model-free reinforcement learning, where the probabilities are 'summed out' inside the Q-functions, but they are still there in the Bellman equation.

So, no, probabilities are what I want. In most cases I don't want precise values of these probabilities, though (I would like them, but I know this is infeasible). So if you just meant that precise values of probabilities are not needed (in the sense that computing them would be a waste of limited computational resources), then yes, I know.

> In neural nets, the sigma-function is a non-linear component, used to
> boost results towards extremes.
> Whatever sum of weights or evidence or whatever it is that you have, as
> inputs feeding the neural net, you apply the non-linear sigma, to try to
> sharpen everything closer to either 0% or 100% -- to discriminate. To
> increase contrast. This is kind-of the "secret" as to why neural nets
> work, and probabilities don't.

The sigma-function (which is actually not that popular now for hidden layers) does not itself boost results towards extremes. What does the boosting is the cross-entropy loss. The sigmoid, in fact, makes the outputs of neurons probabilities, and the cross-entropy loss is the consequence of this probabilistic (or information-theoretic, if you wish) interpretation. So these are exactly probabilities, and they are what help to achieve the result you are talking about.

> In "integrated information" theory, you work with a large complex network
> of things that are all inter-related, all interconnected. The goal of
> applying the theory is to find those extensions of the net that are highly
> interlinked, interconnected, and then to draw an accurate boundary around
> them. If and when you can perceive that boundary, you can give everything
> inside one name, and everything outside a different name. The names
> assigned are unambiguous, unique, even if the actual boundary is perhaps
> uncertain, even if there is a gradation, a smooth-ish transition from the
> highly-interconnected thing, to the mostly disconnected parts. The act of
> name-tagging is what gives a handle on being able to think about the object
> in symbolic terms.

Sounds like yet another approximation to probabilities ;)

--
Alexey
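The claim that the sigmoid yields probabilities while the cross-entropy loss is what drives them toward 0 or 1 can be checked with a minimal single-logit sketch. The learning rate and step count are arbitrary illustrative choices; the gradient of binary cross-entropy with respect to the logit simplifies to (p - y) for the sigmoid/cross-entropy pair.

```python
# Sketch: the sigmoid maps a logit to a probability in (0, 1);
# gradient descent on the cross-entropy loss is what pushes that
# probability toward an extreme (here, toward 1 for target y = 1).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    # binary cross-entropy: -[y log p + (1 - y) log(1 - p)]
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

z, y, lr = 0.0, 1.0, 1.0   # logit, target, learning rate (illustrative)
for _ in range(50):
    p = sigmoid(z)
    # d(bce)/dz = p - y for the sigmoid + cross-entropy combination
    z -= lr * (p - y)

p = sigmoid(z)
print(round(p, 3))  # the output probability has been pushed toward 1
```

Replacing the cross-entropy loss with, say, a squared error over the same sigmoid would weaken this "contrast-boosting" effect, since its gradient vanishes near the extremes; the discrimination Linas describes comes from the loss, not the squashing function alone.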
