Linas,
OK, let's continue the discussion about probabilities.

2018-05-22 21:18 GMT+03:00 Linas Vepstas <[email protected]>:

> Do not assume that a probability is what you actually want.  Let me give
> three examples.
>
> In real life, when you see a crow, and it is dark, and you want to talk
> about it, you just say "black crow" as an identifier of the object in the
> scene.  You don't pull out your photometer and measure it's darkness at
> 87.68% and a blueish hue of 77%. Why? Because you don't need to do that to
> have a conversation about it's presence, location, movement, etc. You only
> need to evaluate crow-ness and blackness sufficiently to distinguish it
> from all other elements of the scene, and then you can assign P==100% for
> most practical purposes.
>

I cannot agree; this is controversial at the very least. My vision system
does perform photometry. Why? Because it is designed to reconstruct
invariant physical properties of the surrounding world, such as albedo.
Humans may not have direct conscious access to the specific values, but
many visual 'illusions' demonstrate that this reconstruction happens. The
'checkerboard shadow illusion', for example, shows precisely that we are
not interested in raw brightness values, but instead try to reconstruct
the reflective properties of observed surfaces. Then, what is 'crow-ness'?
If somebody tells me "Hey, there is a white crow", my vision system will
try to detect and recognize it given the a priori information about its
presence, and then conclude whether it really is a crow or not. Your
"crow-ness" is a rough estimate of P(crow|...). There are very many
practical situations in which I'm not sure I see a crow, because it is too
distant, or flying too fast, or a badly drawn picture, so I conclude that
this might be a crow with higher or lower confidence. Even if I don't need
detailed information at the level of consciousness to talk about this
particular crow, that doesn't mean my vision system isn't trying to assign
a probability to this object being a crow. Even if my vision system works
in a discriminative way and merely tries to distinguish crows from other
objects, this can still be treated as assigning probabilities. They are
just conditional probabilities, not joint ones, so they cannot be used in
any other direction of inference, only to estimate whether this is a crow
or not; but they are probabilities.
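
This last point can be sketched concretely. Below is a toy discriminative
"crow detector" written as hand-rolled logistic regression; the features,
weights, and numbers are all invented for illustration. It outputs only the
conditional probability P(crow | features), and nothing in it lets you
recover the joint distribution, so the number can answer "crow or not" and
nothing else:

```python
import math

# Hypothetical hand-set weights for a toy discriminative "crow detector".
# Features: (darkness in [0, 1], apparent wingspan in metres).
# All values here are invented for illustration.
w = [4.0, 2.5]
b = -3.0

def p_crow_given_x(x):
    """Conditional probability P(crow | features) via logistic regression."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# A dark, crow-sized object yields a high conditional probability.
p = p_crow_given_x([0.9, 0.5])

# Note: this model gives only P(crow | x).  Without a model of P(x) we
# cannot recover the joint P(crow, x), so the estimate cannot be reused in
# any other direction of inference -- exactly the limitation above.
```

The approximation is crude, but it is still a probability: a number in
(0, 1) that behaves like a subjective posterior for the single question it
was trained to answer.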

When I say 'probability', of course, I don't mean its precise objective
value. I mean, at the least, a variational approximation of subjective
posteriors, or a mean-field approximation, or a naive Bayes approximation,
or a fuzzy-logic approximation, or whatever other approximation. Of course,
we don't try to calculate these probabilities precisely, because that is
computationally too expensive. But approximate probabilities are still
probabilities. People who say that these are not probabilities are actually
insisting on one particular way of approximating probabilities, and they
are completely wrong, because there is no universal way to approximate
probabilities. Sometimes you can calculate them precisely. Sometimes you
can use a variational approximation with DNNs. Sometimes you can use fuzzy
logic. Sometimes you don't even calculate these probabilities or their
approximations explicitly, as in model-free reinforcement learning, where
the probabilities are 'summed out' inside the Q-functions, but they are
still there in the Bellman equation.
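
The reinforcement-learning case can be illustrated with a minimal tabular
Q-learning sketch. The toy environment below and its hidden transition
probability are invented for illustration: the learner never sees
P(s'|s,a), yet the sampled updates average over it, so the probabilities
end up 'summed out' inside Q, exactly as in the Bellman equation:

```python
import random

random.seed(0)

# Toy 2-state environment with a stochastic transition.  The probability
# 0.8 below is invented for illustration and is hidden from the learner.
def step(state, action):
    """Environment: returns (next_state, reward); P(s'|s,a) stays hidden."""
    if action == 1 and random.random() < 0.8:
        return 1, 1.0
    return 0, 0.0

gamma, alpha = 0.9, 0.1
Q = [[0.0, 0.0], [0.0, 0.0]]   # Q[state][action]

state = 0
for _ in range(5000):
    action = random.randrange(2)
    nxt, reward = step(state, action)
    # Tabular Q-learning update: the expectation over P(s'|s,a) in the
    # Bellman equation is replaced by an average over sampled transitions,
    # so the transition probabilities are implicitly 'summed out' in Q.
    Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
    state = nxt
```

After training, the learned Q-values reflect the hidden 0.8 transition
probability even though it was never represented explicitly anywhere.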

So, no, probabilities are what I want. In most cases I don't want precise
values of these probabilities, though (I would like them, but I know that
is infeasible). So, if you just meant that precise values of probabilities
are not needed (in the sense that computing them would be a waste of
limited computational resources), then yes, I know.



> In neural nets, the sigma-function is a non-linear component, used to
> boost results towards extremes. whatever sum of weights or evidence or
> whatever it is that you have, as inputs feeding the neural net, you apply
> the non-linear sigma, to try to sharpen everything closer to either 0% or
> 100% -- to discriminate. To increase contrast.  This is kind-of the
> "secret" as to why neural nets work, and probabilities don't.
>

The sigmoid function (which, incidentally, is no longer very popular for
hidden layers) does not by itself boost results towards extremes. What
boosts them is the cross-entropy loss. And the sigmoid actually makes the
outputs of the neurons probabilities (the cross-entropy loss is itself a
consequence of this probabilistic, or information-theoretic if you wish,
interpretation). So these are exactly probabilities, and they are what
achieve the very result you are talking about.
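
A minimal numerical sketch of this point: the sigmoid alone leaves
mid-range logits far from 0 or 1, while the binary cross-entropy loss
keeps penalizing any output that has not reached its label, and that
pressure is what drives the outputs toward the extremes:

```python
import math

def sigmoid(z):
    """Squashes a logit into (0, 1); the output is a Bernoulli probability."""
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    """Binary cross-entropy: the negative log-likelihood of Bernoulli(p)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# The sigmoid is gentle around 0: it does not saturate outputs by itself.
p_mid = sigmoid(0.5)   # roughly 0.62, nowhere near 0 or 1

# The cross-entropy loss is what pushes toward extremes: its gradient with
# respect to the logit is (p - y), which stays nonzero until p matches the
# label exactly, so training keeps sharpening confident outputs.
loss_confident = bce(sigmoid(4.0), 1)   # small loss for a confident output
loss_hesitant  = bce(sigmoid(0.5), 1)   # larger loss for a hesitant one
```

Because the sigmoid output is a probability, the (p - y) gradient has a
direct reading: the residual between the predicted probability and the
observed outcome.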


>
> In "integrated information" theory, you work with a large complex network
> of things that are all inter-related, all interconnected.  The goal of
> applying the theory is to find those extensions of the net that are highly
> interlinked, interconnected, and then to draw an accurate boundary around
> them.   If and when you can perceive that boundary, you can give everything
> inside one name, and everything outside a different name.  The names
> assigned are unambiguous, unique, even if the actual boundary is perhaps
> uncertain, even if there is a gradation, a smooth-ish transition from the
> highly-interconnected thing, to the mostly disconnected parts.   The act of
> name-tagging is what gives a handle on being able to think about the object
> in symbolic terms.
>

Sounds like yet another approximation to probabilities ;)

-- Alexey
