[opencog-dev] sections of a graph, langauge

Linas Vepstas Sun, 24 Sep 2017 00:06:09 -0700

I've been talking about these ideas for a while, and thought I'd write them
up in a single place, in some easy-to-understand way. So here they are,
again:

https://github.com/opencog/atomspace/blob/master/opencog/sheaf/README.md

So, in case you heard me mumble about this, here's it is again.

--linas

Sheafs - Quick Start
====================
Sheafs provide a simple and convenient mechanism for working with
graphs "locally", by making the nearest-neighbors of a vertex apparent.

The traditional textbook-canonical way of specifying a graph is to
state that it is a set of vertexes, and a set of edges that connect
pairs of vertexes. The problem with this description is that given
any vertex, one has no idea of what edges are connected to it, without
scanning the entire set of edges. Another problem is that vertexes and
edges are not composable; that is, when they are composed together,
they are no longer vertexes or edges, but a more general type: a
"subgraph". By contrast, sheaves carry local information, and are
composable.

Given a vertex V, a "section" is defined as a set of pairs (V,E) of
that vertex V and all edges E that are attached to it. That's it!
Very simple! A section can be envisioned as a "spider", with the
vertex V as the body of the spider, and the edges as the legs of the
spider.

Sections are composable, in that several can be connected together
by joining ("connecting") edges. The result is still a section, in
that it has a central blob as the spider-body, and a whole bunch of
legs sticking out. Composing sections in such a way that the edges
connect only in legal ways is called "parsing".

Another way of visualizing sections is to envision a jigsaw-puzzle
piece instead of a spider. The vertex V is a label on the puzzle-piece,
and each leg is a tab or slot on the puzzle-piece. The tabs or slots
are now obviously connectors: this emphasizes that jigsaw-puzzle pieces
can be conneted together legally only when the connectors fit togehter.
Again: the act of fiting together puzzle-pieces in a legal fashion is
termed "parsing".

In standard mathematical terminology, the spider-body or jigsaw-label
is called the "germ". It is meant to evoke the idea of a germinating
seed, as will become clear below.

Sections in Atomese
-------------------
Sections are represented in Atomese as follows:
```
Section
Atom "foo" ; usually a Node of some kind.
ConnectorSeq
Connector
Atom "bar" ; for example, a WordNode.
LabelAtom "foo-to-bar label" ; can be a PredicateNode.
Connector
....
```
Here, `(Atom "foo")` is the spider-body or germ. Each leg of the spider
is represented by a `Connector` link. In the above example, the vertex
"foo" has an edge "foo-bar" attached to it, with `(Atom "bar")` being
the far-endpoint of that edge. The edge carries an optional label,
shown as `LabelAtom`, above.

The "foo-bar" edge can also be represented as an `EvaluationLink`,
which is the structure used in many other parts of OpenCog.
```
EvaluationLink
LabelAtom "foo-to-bar label" ; can be a PredicateNode.
ListLink
Atom "foo"
Atom "bar"
```
This `EvaluationLink`, and the `Section...Connector` structure are meant
to be sort-of, more-or-less equivalent and interchangeable. (In many
cases, thy can be taken to be equivalent; however, the `Section...
Connector` structure is more general and can describe more kinds of
structures more simply than an EvaluationLink can. This will be made
clear below).

Connectors
----------
Note that `Connector`s are like "half-edges": in isolation, they carry
the edge-label, and the far endpoint, but not the near endpoint. The
term "connector" is used to emphasize that these are meant to behave
like the tabs on a jigsaw-puzzle piece: that they serve to connect to
other connectors on other sections.

There are no explicit or implicit rules for how to connect up
connectors; this is application-dependent. However, the above example
implicitly implies that a connection is legal only when the germ-atom
of one section is the same as a connector-atom in another section.
Likewise, the edge-labels should match up or be coherent in some way.

The `ConnectorSeq` link is an ordered link, rather than an unordered
set. This is because in many applications, the sequential order of the
connectors matter.

Germs and Etales
----------------
In the above, the germ `(Atom "foo")` can be taken to be a label "foo"
on the vertex that is the germ of the section. There is no requirement
that this label be unique, or that it uniquely identify a vertex. Thus,
in general, there will be many different sections having the same germ.

In this case, it is common to visualize sections as pieces of paper,
stacked one on top another, aligned so that the germs are always above
one-another. This stacking is refered to as a "stalk". Alternately, the
stalk can be visualized as a fir-tree or pine-tree, where annual growths
define the trunk, and then branches shoot off to the side, on level
planes or "etales". A single section is then just that etale, that plane
of branches from that single year of growth. Unlike a pine-tree, however,
sections, like jigsaw-puzzle pieces, can be assembled together to make
a bigger section. Such sub-assemblies are still "etales", in that they
are all at the same level.

Sheaf Theory
------------
The collection of all possible stalks, and the sections on them,
together with the rules that dictate how the connectors can connect
is terms a "sheaf".

The axioms of sheaf theory can be understood as saying that the
puzzle-pieces can only be assmebled in certain legal ways, and that
sub-assemblies can be intersected with one-another, as long as they
are correctly lined up. The Wikipedia page for Sheaf Theory bears
very little resemblance to the description above, even though they are
both taking about the same concepts. That is because the typical
application of Sheaf Theory in mathematics deals with sections that
have an infinite number (countable and uncountable) of connectors, and
the connector labels form a ring or a field. That is not the case here,
and so the situation is conceptually simpler, here.

In relation to Graph Theory
---------------------------
Note that the above started by talking about graphs, but ends at a
somewhat different place. It is worth reviewing the differences.
In the canonical description of a graph, it is implicitly assumed,
without statement, that ecah vertex is unique and unambigous and
obviously a different vertex than eny of the others. Likewise for
edges: they join pairs of vertexes, and there is no confusion about
what is being connected.

This is not the case for sheaves: although a germ can be visualized as
a single vertex, the fact that there is a stalk above the germ implies
that many vertexes in the graph are taken to be "the same vertex". Yet,
despite being the "same vertex", each section connects to the graph in
different ways. Thus, a sheaf can be visualized as a graph with a
certain quotient operation applied, a lumping together of certain
vertexes into a common set, the "germ".

In graph theory, an edge unambiguously connects two vertexes. By
contrast, the connectors on a section are a bit more ambiguous: they can
connect to anything else that is legally connectable: the connectors
must match, must be contractable. The connectability of connectors
are given by rules, but those rules are "user-defined" (although they
usually match connectors to germs and force edge-label agreement).

In relation to Tensor Algebra
-----------------------------
A tensor can also be visualized as a kind-of spider or puzzle-piece: the
legs or tabs are just the indexes on the tensor, and the number of legs
is the degree of the tensor. The act of contracting tensor indexes is
the same as connecting together connectors.

More properly, a tensor is a linear combination of such puzzle-pieces;
it is a linear combination of the etales of the stalk. Thus, for
example, an N-dimensional vector in an N-dimensional space can be
visualized as a linear sum of N different connectors (that is, as a sum
of one-legged spiders). Each basis element of the vector space
corresponds to a single connector. When one takes a dot-product of
two vectors, one joins connector-to-connector, aligning the basis
elements of the vector, and sums up the values.

Thus, if one has a list of connectors, and a floating-point value
attached to each, this can be visualized as a vector. Examples arise
in linguistics, where one has lists of words, and the associated counts:
these form a vector.

In the more general case, a tensor of degree K is a linear sum of spiders
or puzzle-pieces with K legs/connectors on them. A tensor is a "stalk"
with a number attached to each "etale" or section. Tensors can be
contracted together to form other tensors simply by connecting some of
the legs/connectors. Multiplication and co-multiplication in a tensor
algebra are just specific rules for connecting connectors so that
linearity is preserved.

The N-gram in linguistics an example of an degree-N tensor. The count of
the number of occurances that an N-gram is seen is the weight.
Sometimes the collection of N-grams is called a "vector", but this is
not quite right, because when multiple N-grams are assebled to form a
sentence, the words must match-up correctly: N-grams are tensor
elements.

--
*"The problem is not that artificial intelligence will get too smart and
take over the world," computer scientist Pedro Domingos writes, "the
problem is that it's too stupid and already has." *

--
You received this message because you are subscribed to the Google Groups
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit
https://groups.google.com/d/msgid/opencog/CAHrUA34mKNA_iuZfjXNjPzh498NZyjgjWd%2B2wyy12Yep0NdcuA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

[opencog-dev] sections of a graph, langauge

Reply via email to