Hi Alexey,
these are valid points. Currently, as you probably already understand,
the (only?) way to match values is to resort to grounded schemata, see
for instance
https://github.com/opencog/opencog/blob/ea987668ed713c55c2df087b81f55736d7469772/opencog/learning/miner/rules/shallow-abstraction.scm#L72
where absolutely-true-eval is defined here
https://github.com/opencog/atomspace/blob/master/opencog/scm/opencog/rule-engine/rule-engine-utils.scm#L434
For similar reasons PLN formulas are programmed with grounded schemata.
A way to address that would be to complement Atomese with links encoding
operators to access and modify values (GetValueLink, etc.). This wouldn't
make the pattern matcher more efficient (initially), but at least it
would allow OpenCog to reason about values.
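A rough Python sketch of what such value-access operators could look like. All names here (Atom, get_value, set_value) are illustrative stand-ins, not existing Atomese/OpenCog API; the point is only that values attached to atoms become readable and writable through declarative operators rather than grounded Scheme schemata:

```python
# Hypothetical sketch: each atom carries a key->value table, and two
# operator functions read/write those values, playing the role that a
# GetValueLink/SetValueLink pair could play in Atomese.

class Atom:
    def __init__(self, name):
        self.name = name
        self._values = {}          # key -> mutable value, like Atom Values

def set_value(atom, key, value):   # would be a SetValueLink in Atomese
    atom._values[key] = value
    return value

def get_value(atom, key):          # would be a GetValueLink in Atomese
    return atom._values.get(key)

pixel = Atom("pixel-444-555")
set_value(pixel, "rgb", (0.3, 0.7, 0.8))
print(get_value(pixel, "rgb"))     # -> (0.3, 0.7, 0.8)
```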
Nil
On 05/20/2018 06:53 PM, Alexey Potapov wrote:
Ben, Nil, Linas, Cassio, and whoever might be interested,
2018-05-20 12:54 GMT+03:00 Ben Goertzel <[email protected]>:
> But how will you calculate P(image|crow,black)?
Well as you know, if you really want to, something like "the RGB value
of the pixel at coordinate (444,555) is within a distance .01 of
(.3,.7,.8)" can be represented as a logical atom ... so there is no
problem using logic to reason about perceptual data in a very raw
way if you want to
OTOH I don't really want to do it that way... instead, as you know, I
want to model visual data using deep NNs of the right sort, and then
feed info about the structured latent variables of these NNs and their
interrelationships into the logical reasoning engine.... This is
because it seems like NNs, rather than explicit logic or probabilistic
programming, are more efficient at processing large-scale raw video
data...
Yeah... and here is the dilemma.
We consider two different yet connected tasks:
– Connecting OpenCog with deep neural networks (more specifically, with
the Tensorflow library);
– Implementing efficient probabilistic programming with the use of OpenCog.
Both tasks can be considered as a part of the Semantic Vision problem,
but their solution can be useful in a more general context.
*OpenCog + Tensorflow*
The depth of OpenCog+Tensorflow integration can vary considerably. Shallow
integration implies that Tensorflow is used as an external module, and
communication between Tensorflow and OpenCog is limited to passing
activities of neurons, which are represented both by Tensorflow and
Atomspace nodes.
The most restricted way is just to run (pre-trained) TF models on input
data and to set values of Atomspace nodes in correspondence with the
activities of output neurons. What will be missing in this case:
feedback connections from the cognitive level to the perception system;
online (and joint) training of neural networks and OpenCog.
Let us consider the Visual Question Answering (VQA) task as a motivating
example. How will OpenCog be able to answer such questions as “What is
the color of the dress of the girl standing to the left of the man in a
blue coat?” If our network is pre-trained to detect and recognize all
objects in the image and supplement them with detailed descriptions of
colors, shapes, poses, textures, etc., then Pattern Matcher will be able
to answer such questions (converted to corresponding queries). However,
this approach is not computationally feasible: there are too many
objects in images, and too many grounded predicates which can be applied
to them. Thus, the question should influence the process of how the
image is interpreted.
For example, even if we detected bounding boxes (BBs) for all objects
and inserted them into AtomSpace, the predicate “left of” is not
immediately evaluated for all pairs of BBs. Instead, it will be evaluated
during query execution by the Pattern Matcher (hopefully) only for the
relevant BBs labeled as “girl” and “man”. Similarly, a grounded predicate
“is blue” implemented by a neural subnetwork can be computed only in the
course of query execution, meaning that the work of the Pattern Matcher
should be extended down to the neural network level. Indeed, purely DNN
solutions for VQA usually implement some top-down processes, at least in
the form of attention mechanisms.
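The lazy-evaluation point can be sketched in plain Python (a toy, not OpenCog API): bounding boxes sit in a store, and the grounded predicate is only evaluated for the candidate pairs the query actually narrows down to, never for all O(n^2) pairs:

```python
# Toy illustration of query-driven lazy evaluation of "left of" over
# bounding boxes; the data and all names are made up for this sketch.

boxes = [
    {"label": "girl", "x": 50,  "y": 120, "w": 40, "h": 90},
    {"label": "man",  "x": 200, "y": 110, "w": 50, "h": 100},
    {"label": "dog",  "x": 310, "y": 180, "w": 60, "h": 40},
]

def left_of(a, b):
    # grounded predicate: a's box ends before b's box starts
    return a["x"] + a["w"] < b["x"]

def query(label_a, label_b):
    # the matcher narrows candidates by label first, then evaluates
    # the grounded predicate only on the surviving pairs
    cands_a = [b for b in boxes if b["label"] == label_a]
    cands_b = [b for b in boxes if b["label"] == label_b]
    return [(a, b) for a in cands_a for b in cands_b if left_of(a, b)]

matches = query("girl", "man")
print(len(matches))   # 1: only the single girl/man pair is ever tested
```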
Apparently, a cognitive feedback to perception is necessary for AGI in
general.
It is not a problem to feed Tensorflow models with data generated by
OpenCog via placeholders, but OpenCog will also need some interface for
executing computational graphs in Tensorflow. This can be done by
binding corresponding Session.run calls with Grounded Predicate/Schema
nodes.
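A minimal sketch of that binding, with a plain callable standing in for the Tensorflow forward pass so the example stays self-contained; with real TF1 the body of `run_model` would be something like `sess.run(output_tensor, {placeholder: image})`. The `GroundedSchema` class and the `"py:..."` naming are illustrative, modeled loosely on how GroundedSchemaNode maps names to host-language procedures:

```python
# Hypothetical sketch: a registry that maps schema names to Python
# procedures, so an atom naming "py:run_model" can trigger a model run.

def run_model(image):
    # stand-in for a TF forward pass: "detect" the mean brightness
    return sum(image) / len(image)

class GroundedSchema:
    """Maps a schema name to a host-language procedure, analogous to
    what GroundedSchemaNode "py:..." does in OpenCog (names here are
    assumptions, not the actual API)."""
    registry = {}

    @classmethod
    def register(cls, name, fn):
        cls.registry[name] = fn

    @classmethod
    def execute(cls, name, *args):
        return cls.registry[name](*args)

GroundedSchema.register("py:run_model", run_model)
result = GroundedSchema.execute("py:run_model", [0.2, 0.4, 0.6])
print(result)   # ≈ 0.4
```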
The question is how to combine OpenCog and neural networks on the
algorithmic level. Let us return to the VQA query considered above. We
can imagine a grounded schema node which detects all bounding boxes with
a given class label and inserts them into Atomspace, so that the Pattern
Matcher or Backward Chainer can further evaluate some grounded
predicates over them, finally finding an answer to the question.
However, the question can be “What is the rightmost object in the
scene?” In this case, we don’t expect our system to find all objects,
but rather to examine the image starting from its right border. We can
imagine queries that presuppose other strategies of image
processing/examination. In general, we would like not to hardcode all
possible cases, but to have a general mechanism which can be trained to
execute different queries.
To make neural networks transparent to the Pattern Matcher, we need to
make the nodes of Tensorflow graphs inhabitants of Atomspace as well.
The same is needed for the general case of unsupervised learning. In
particular, architecture search is needed in order to achieve better
generalization with neural networks, or simply to choose an appropriate
structure of the latent code. Thus, OpenCog should be able to add or
delete nodes in Tensorflow graphs.
These nodes correspond not just to neural layers, but also to operations
over them. One can imagine TensorNode nodes connected by PlusLink,
TimesLink, etc. There can be tricky technical issues with Tensorflow
(e.g. compilation of dynamic graphs), but they should be solvable.
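A toy Python sketch of that idea: the graph is built symbolically (so a pattern matcher could in principle inspect or rewrite it) and is only evaluated on demand, like a TF computational graph. TensorNode, PlusLink, and TimesLink are the names proposed above, not existing Atomese types:

```python
# Hypothetical sketch of a symbolic tensor graph made of node/link
# objects; tensors are plain Python lists to keep the example minimal.

class TensorNode:
    def __init__(self, name, value=None):
        self.name, self.value = name, value
    def eval(self, bindings):
        # placeholder-like: take the bound input if given, else stored value
        return bindings.get(self.name, self.value)

class PlusLink:
    def __init__(self, a, b): self.a, self.b = a, b
    def eval(self, bindings):
        return [x + y for x, y in zip(self.a.eval(bindings),
                                      self.b.eval(bindings))]

class TimesLink:
    def __init__(self, a, b): self.a, self.b = a, b
    def eval(self, bindings):
        return [x * y for x, y in zip(self.a.eval(bindings),
                                      self.b.eval(bindings))]

x = TensorNode("x")                       # placeholder-like input
w = TensorNode("w", [2.0, 3.0])           # stored weights
graph = PlusLink(TimesLink(x, w), TensorNode("b", [1.0, 1.0]))
print(graph.eval({"x": [1.0, 2.0]}))      # [3.0, 7.0]
```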
A conceptual problem consists in the fact that the Pattern Matcher works
with Atoms, but not with Values. Apparently, activities of neurons
should be Values. However, evaluation of, e.g., GreaterThanLink requires
NumberNode nodes. Operations over (truth) values are usually implemented
in Scheme within rules fed to the URE. This might be enough for dealing
with individual neuron activities as truth values and with neural
networks as grounded predicates, but patterns in values cannot be
matched or mined directly (while the idea of SynerGANs implied the
necessity to mine patterns in activities of neurons of the latent code).
I was going to illustrate with concrete examples the same kind of
problems with implementing probabilistic programming in OpenCog, but I
guess it's already TL;DR.
So, briefly speaking, we need the Pattern Matcher and Pattern Miner to
work over Values/Valuations, which is not the case now (OpenCog uses
only truth and attention values, and Atomese/Pattern Matcher doesn't
have built-in semantics even for them). I cite Linas here:
"Atoms are:
* slow to create, hard to destroy
* are indexed and globally unique
* are searchable
* are immutable
Values are:
* fast and easy to create, destroy, change
* values are highly mutable.
* values are not indexed, are not searchable, are not globally unique."
But we need "fast and easy to create, destroy, change, highly mutable,
but searchable" entities. So, this is not only a technical, but also a
conceptual problem...
I would really like to hear your opinion on this. What should we do?
Resort to the most shallow integration between OpenCog and DNNs? In this
case, SynerGANs will not work since we will not be able to mine patterns
in values, and we will not be able to use the Pattern Matcher to solve
VQA. Express the output of DNNs as Atoms? Linas objected even to the
idea of expressing the coordinates and labels of bounding boxes as
Atoms. To do this with activities of neurons would be even worse. Put
everything into the Space-Time server? But then the idea of using the
power of the Pattern Matcher, URE, etc. will not be achievable. Extend
the Pattern Matcher to work with Values? Maybe...
I like the idea of embedding the TF computational graph into Atomspace,
but tf.mul works over Values (tensors), not NumberNodes. Thus, in this
case, it will be required to make all links (like TimesLink) work not
only with NumberNodes, but also with Values... but I foresee objections
from Linas here... Also, I believe it should be useful in general, since
Values are not first-class objects in Atomese: you have to use
Scheme/Python/C to describe how to recalculate truth values; you cannot
reason about them directly...
Or should we try to use a sort of PPL as a bridge between Values and
Atoms? Maybe... Or we should do something unifying all of these.
The question is not just about binding vision and PLN. It is more
general. Say, if you are driving a car, you estimate the distances and
velocities of other cars and take actions on this basis. These are also
Values, and you 'reason' over them using both 'number crunching' and
'logic' simultaneously (I don't mean procedural knowledge here in the
sense of GroundedSchemaNode). So, I don't think that we should limit
ourselves to a shallow integration and use DNNs/PPL/etc. only
peripherally...
Ben Goertzel <[email protected]>:
if one stays in the world of finite discrete
distributions, one can construct probabilistic logics with
sampling-based semantics... https://arxiv.org/pdf/1602.06420.pdf
Sounds quite interesting. I'll study it in detail...
-- Alexey