You may feel that I've already written too much about tensors, but I'd like
to make one last remark. This is partly for my own benefit, as writing
helps clarify my thoughts, and partly for you, as I hope to do something
constructive here for scheme. It may seem off-topic for srfi-194, but it
does have a certain relevance for scheme in general, as I hope to show,
with luck, by the end of this email.

(I am cc'ing two other mailing lists, so this is not just about scheme
srfi's, but also about tuples and natural language.)

Let's start with "the tensor product", and let me start with classical,
i.e. non-lisp notation.  We'll get back to lisp notation, which is the
point of the email(!?)

Let F(x,y,z) be a "function", returning a real number, for integer-valued
arguments x,y,z. Likewise G(u,v). Pick a specific set of integers
a,b,c,d,e; that is, let a,b,c,d,e be constants.

The tensor product of F and G is simply the numerical value (the
real number) that is the ordinary, numeric product of the numerical values
of F and G. Trivial enough, eh? The result is a tensor T(x,y,z,u,v) whose
numerical value at T(a,b,c,d,e) is just F(a,b,c)G(d,e). A key property of
the tensor product (the whole thing that makes all this non-trivial) is
that it "erases" where the "information" "came from". That is, there
might have been some other functions H(a,b) and K(c,d,e) such that
T(a,b,c,d,e) = H(a,b)K(c,d,e) ... we no longer know "where the indexes
came from" (did they come from H and K, or from F and G?) and we don't know
where the values came from, either: if T(a,b,c,d,e)=42, we don't know if
F(a,b,c)=6 and G(d,e)=7 or if F(a,b,c)=2 and G(d,e)=21 ... you just can't
tell any more.
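To make this concrete, here is a tiny sketch in Python (illustrative only;
the arity of F is hard-coded at three for clarity):

```python
# Illustrative sketch: the tensor product of two numeric-valued functions
# is just the ordinary product of their values, taken over the
# concatenated argument list.
def tensor_product(F, G):
    def T(*args):
        return F(*args[:3]) * G(*args[3:])   # F's arity fixed at 3 here
    return T

F = lambda x, y, z: x * y * z   # toy stand-in: F(1,2,3) = 6
G = lambda u, v: u * v          # toy stand-in: G(1,7) = 7
T = tensor_product(F, G)
print(T(1, 2, 3, 1, 7))         # 42 -- and from 42 alone, the
                                # factorization 6*7 vs 2*21 is unrecoverable
```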

Let's try that again, in lisp notation. Earlier, I mentioned that
simply-typed lambda calculus is the internal language of Cartesian
products. Perhaps you already know this and it's obvious (it's
textbook-101), but let me review anyway. The Cartesian product of spaces
X,Y,Z is X x Y x Z, and if I sample a,b,c from them I have the ordered
tuple (a,b,c) or, in lisp notation, the list ('a 'b 'c) ... let me drop
the quotes, and just write (a b c).

The magic of Cartesian products is that you can nest them: thus, the
Cartesian category consists of all possible strings of letters and spaces
and balanced parens, e.g. (a (b c) a (d e) (p (q (r (s))))), where all of
these single letters are constants. Now, mathematicians love variables, so
we allow variables x,y,z... Mathematicians also love to plug values into
variables; by convention, the act of plugging in is called "beta
reduction" and the notational device of what-to-plug-where is called
"lambda", and so one may write (λxyz. (a (b x) a (p q y))) (d e f), by
which we mean "plug in d,e,f for x,y,z" (or "connect", per my earlier
email). All that we've done here is work with the Cartesian product,
constants, variables, a notational trick for connecting-up called lambda,
and beta reduction, and bingo ... simply-typed lambda calculus. Textbook
chapter-1 stuff. You've known this forever. ("The internal language of
cartesian-closed categories is simply-typed lambda calculus", to put it
formally.)
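For what it's worth, here is the same plug-in move in, say, Python (nested
tuples standing in for the parenthesized lists; purely illustrative):

```python
# Beta reduction is just ordinary function application: plug d,e,f in
# for x,y,z. Note that z is bound but unused, so 'f' vanishes.
expr = lambda x, y, z: ('a', ('b', x), 'a', ('p', 'q', y))
print(expr('d', 'e', 'f'))   # ('a', ('b', 'd'), 'a', ('p', 'q', 'e'))
```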

So what's the deal with the tensor product? Associated with the list (a b
c) ... that is, the Cartesian product of a and b and c ... is a number, say
"6", and associated with the list (d e) is another number, say "7". The
tensor product is the list concatenation (a b c d e) with the associated
value 42. The key point is the "erasure" of parenthesis: we are
emphatically NOT writing ((a b c) (d e)) -- list concatenation "erases"
the knowledge of the two original lists. In this sense, the tensor product
is "the same thing as" list concatenation. Well, in this example, we also
multiplied 6 and 7 to get 42, which is how numerically-valued tensors work
(how they must work). For the generic case, for foo-valued tensors, one
only needs to provide a product to multiply two foo's together (and this
product needs to "forget" the initial factors).
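A sketch of this, with a foo-valued tensor entry modeled as an
(index-list, value) pair; the names here are made up for illustration:

```python
# Illustrative sketch: the tensor product concatenates the index lists and
# multiplies the values. The grouping ((a b c) (d e)) is erased -- only
# the flat (a b c d e) survives.
def tensor_concat(entry1, entry2, product=lambda p, q: p * q):
    idx1, val1 = entry1
    idx2, val2 = entry2
    return (idx1 + idx2, product(val1, val2))

t = tensor_concat((['a', 'b', 'c'], 6), (['d', 'e'], 7))
print(t)   # (['a', 'b', 'c', 'd', 'e'], 42)
```

For foo-valued tensors, one passes in a different `product`; the default
here is ordinary multiplication.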

Besides products, it is convenient to add (sum together) tensors. For
foo-valued tensors, one might not have any valid kind of addition of foo's.
One can still go meta and say "this foo or that foo", where by "or" I mean
"disjunctive or" -- pick one; menu-choice. Some people like to use the v
symbol for disjunctive-or, and the & symbol for list concatenation, and
thus one is naturally led to ask what happens when considering the algebra
of these two symbols. They are distributive in one sense, but not the
other, so it is NOT a boolean algebra (confusion about this -- confusion
with boolean-and and boolean-or -- has led many a beginner astray).
Throw in variables and lambda and beta reduction and bingo, you've got ...
umm, I'm unclear on the formal name. Let's call it "tensor logic" (the
internal logic of the closed monoidal category; linear logic lives in this
vicinity).

Inner product was mentioned in earlier emails. For two numerically-valued
vectors A=[a_1, a_2, ..., a_n] and B=[b_1, b_2, ..., b_n] it was defined as
inner(A,B) = A.B = a_1 b_1 + a_2 b_2 + ... + a_n b_n, and so for
foo-valued vectors it is likewise: (a_1 & b_1) v (a_2 & b_2) v ... v
(a_n & b_n), where a_1 & b_1 is the product of foo a_1 and foo b_1, etc.
The join here is the meta-level disjunctive-or from above.
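A generic inner product, parameterized by the & and v operations
(a hypothetical helper, just to show the shape of the thing):

```python
# inner(A,B) = (a_1 & b_1) v (a_2 & b_2) v ... where "conj" plays the
# role of & and "disjoin" plays the role of v. With * and + this is the
# ordinary dot product; other choices keep it foo-valued.
def inner(A, B, conj, disjoin):
    terms = [conj(a, b) for a, b in zip(A, B)]
    result = terms[0]
    for t in terms[1:]:
        result = disjoin(result, t)
    return result

# The ordinary numeric dot product as the special case & = *, v = +
print(inner([0, 1], [0, 1], lambda a, b: a * b, lambda p, q: p + q))  # 1
```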

I previously talked about beta-reduction, saying that it's the same thing
as "connecting" -- that plugging in 42 for x in f(x) gives f(42) by
"connecting" 42 to x. But I also said that "contracting together tensor
indexes" is "connecting" -- that is, the inner product is "connecting".
So, which one is it? Both.

Type theory to the rescue. When I say "plugging in 42 for x in f(x)", I
really mean that "x must be of integer type", that "42 is an exemplar of
the integer type", and that "plugging in" is allowed only when the types
match. Now, an integer can be 1 v 2 v 3 v ... (disjunctive-or), and thus
the result of "plugging in" is f(1) v f(2) v f(3) v ... well, that's just
the inner product. To belabor the point, and get awkward with the
notation, I should have written (f_1 & 1) v (f_2 & 2) v (f_3 & 3) v ...,
where of course f_1 & 1 = f(1) is how one takes the "product" of foo f_1
with the foo 1 to get foobar f(1).
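Modeling the disjunctive-or as a set of alternatives (pick one), "plugging
in" the whole integer type looks like this sketch:

```python
# Sketch: beta-reducing f against the "integer type" 1 v 2 v 3 yields
# f(1) v f(2) v f(3) -- exactly the inner-product shape. A Python set
# stands in for the disjunction (pick one; menu-choice).
def plug(f, alternatives):
    return {f(n) for n in alternatives}

f = lambda n: n * n
print(plug(f, range(1, 4)))   # the disjunction f(1) v f(2) v f(3)
```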

How far away is this from a "tensor srfi"? Some ingredient seems missing
...

The difference between simply-typed lambda calculus and lambda calculus
with multiple types is ... well, I guess it's dramatic. The tensor
algebra, as canonically defined in mathematics, is simply-typed. That is,
all of the tensor indexes are of the same type: for tensor T(x,y,z), the
x,y,z are all the same type. This is usually not acceptable for
programming, where you might want to say "x is an int and y is a string
and z is a float" (somehow, lisp/scheme programmers are happy without
explicit types...). Even in physics, tensors get typed: the quark
wave-function (a complex-number-valued function) has three tensor indexes:
a Dirac-spinor index, a color index, a flavor index; thou shalt not mix
these up; typing is respected.

So this is where I have trouble with srfi-179: the indexes are always
integers (ordinals). Sure, for a quark wave-function, the Dirac-spinor
index ranges over 1..4, the color index over 1..3, and the flavor index
over 1..6, but these are really different types. It is kind of "an
accident" that ordinal numbers can be used. Physicists don't use them;
they say "u,d,s,c,b,t" for flavor and "r,g,b" for color. Historically, the
Dirac-spinor index is 1..4, but sometimes also (1..2) \oplus (1..2).

So, I'm thinking more like a mash-up of srfi-111 (boxes) with Amirouche's
srfi-168 (tuple-store database). To represent a quark, I'm thinking I want
to define a box to hold a complex number, and I want to index the box with
a triple of (spin, color, flavor) ... but I still need "connectors" -- the
QCD axial-vector current is psi-bar gamma_5 gamma_\mu psi, and thus I have
to contract/connect the color and flavor and spin indexes correctly. This
is the inner product -- I need the inner product. srfi-168 doesn't tell me
how to write inner products of tuples. srfi-179 almost seems to, but not
quite, from what I can tell.
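A sparse, typed-index tensor can be sketched as nothing more than a map
from (spin, color, flavor) triples to amplitudes -- the index names and
values here are illustrative, not any real srfi API:

```python
# Illustrative only. Only the non-zero entries are stored; everything
# else is implicitly zero. The dict keys are typed index-tuples, not
# ordinals.
quark = {}   # (spin, color, flavor) -> complex amplitude

quark[('up', 'r', 'u')] = 0.5 + 0.1j
quark[('down', 'g', 's')] = -0.3j

print(quark.get(('up', 'r', 'u'), 0))   # (0.5+0.1j)
print(quark.get(('up', 'b', 'c'), 0))   # 0  (implicitly zero: not stored)
```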

OK, in earlier emails, I alluded to natural language, suggesting that
lexical grammars are really just tensors. This is worth articulating, since
I feel a proper tensor srfi needs to be able to handle this case.

I am going to use link-grammar
https://www.abisource.com/projects/link-grammar/ as my prototypical
example. It defines a lexical grammar for English, Russian and several
other languages (Arabic, Persian, ...). The lexical entries are all
tensors. Let me give an example. You might open the `4.0.dict` file and
find these entries (highly simplified):

John Mary: S+ or O-;
told: S- & O+;

The S+ and O- are "connectors", and the grammar rule is that types must
match and polarities must be opposite (contra/covariant, upper/lower, if
you wish to go that way... I don't, but it's possible). The "or" is the
disjunctive-or from above. The & is the tensor product from above. The
only valid sentences of the language are "John told Mary", "Mary told
John", "John told John" and "Mary told Mary". S and O are linguistic
short-hand for "subject" and "object".

Mary and John are both vectors, or 1-tensors. S corresponds to the vector
[1,0] and O corresponds to the vector [0,1] ... to be more precise, John
is "either the vector [1,0] or the vector [0,1] but not both" -- it is
disjoined; the menu-choices are called "disjuncts".

"told" is a verb, represented as matrix, or 2-tensor. It has the shape
[[0,1],[1,0]]  or, in 2-D

0 1
1 0

The sentence fragment "John told" is the vector [0,1], which we recognize
as being of type O -- the object -- this sentence is missing the object.
If we contract with an object [0,1], we get the scalar 1, and thus a valid
sentence. If we contract with a subject S [1,0], we get the scalar zero --
an invalid sentence. This example is perhaps confusing, as John and Mary
can be either subjects or objects; I guess I should have used "rock" and
"tree" instead, and "saw", and a grammar that allows "John saw the tree"
but not other combinations. This email is long, and writing out the
combinations is tedious, so I leave this as an "exercise for the reader".
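For the record, here is the "John told Mary" contraction worked out
numerically, with plain lists and no library (an illustration of the
claim, not a parser):

```python
# John-as-subject is S = [1,0], Mary-as-object is O = [0,1], and "told"
# is the 2-tensor [[0,1],[1,0]]. A sentence is valid iff the full
# contraction is a non-zero scalar.
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

S, O = [1, 0], [0, 1]
told = [[0, 1], [1, 0]]

fragment = mat_vec(told, S)   # "John told" -> [0, 1], i.e. type O
print(dot(fragment, O))       # 1: "John told Mary" is a valid sentence
print(dot(fragment, S))       # 0: contracting with a subject is invalid
```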

Now, these vectors and tensors can be very sparse -- besides S and O,
there is A for adjective and D for determiner and P for preposition, and
many more. Writing these out as long vectors of 0's and 1's is tedious,
hence the letters; the vectors are mostly all zeros. There is a bit of
trickery where I can replace the disjunctive-or by plain-old vector
addition, and the number 1 by a probability, so that these look like
probability vectors and Markov matrices, but I wander off-topic, so let's
not do that (yet).

Type-theoretically, S and O are "primitive types", and what is meant by
"being able to connect" is that the types must match, which means the same
thing as saying that the inner product must be non-zero. Connecting
together two link-grammar connectors is the same thing as taking the inner
product.

The tensor product S- & O+ is a compound type, constructed with the
compound-type constructor; formally, this is called "the product type
constructor". It is analogous to a java/c++ "class" -- thus, formally,
"classes" in java/c++ are really just tensors in disguise; they are the
tensor product of their members. A class with 2 members is a 2-tensor; a
class with 3 members is a 3-tensor, etc. Recall that the tensor product
was just list concatenation. Scheme has run-time list concatenation; java
and C++ do not. That is, I cannot declare a new c++/java class at
run-time. (However, javascript... hmmm... well, I wander astray again...
there is a reason why javascript is popular, and this is part of it.)

So, in my imagined srfi, I take Bradley's srfi-179 "intervals" and replace
a single interval by a set of possible values (a disjoined set -- pick any
value, but only pick one). But since I now work with sets, most of the
mechanics of intervals is useless; I don't need the ordinals, I need the
things that the ordinals are labelling. Thus I really need tuples (and it
should now be clear that tuples are tensors).

I need the tensor product, but that is just tuple concatenation (which I
don't see in srfi-168, but maybe it is hiding in there).

I need the inner product, in several forms: as a type-matching
constraint, which is much faster than computing the inner product of
[1,0,0,0,...] and [0,0,1,0,...] only to discover that it's zero. But also
as a traditional inner product, so that my types can be weighted by
probabilities ... because that is actually very useful ... cough neural
net cough ...

I need the tuple database, so that I can store a lexis -- a large
collection of tensors (i.e. elements of tensors).  The tuple database is
highly advantageous for sparse tensors, where almost all entries are zero.

I'm hoping this is not too much to ask for: tuple concatenation, a tuple
store, an inner product. That's it.  How hard can that be?
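To show how little is being asked for, here is the whole wishlist sketched
as one tiny module (a hypothetical, made-up API, not any existing srfi):

```python
# The three asks: a tuple store (the lexis), tuple concatenation (the
# tensor product), and an inner product that, being sparse, only touches
# the index-tuples both stores actually share -- never the zeros.
class TupleStore:
    """A lexis: a sparse map from index-tuples to values."""
    def __init__(self):
        self.entries = {}
    def put(self, idx, val):
        self.entries[tuple(idx)] = val
    def get(self, idx):
        return self.entries.get(tuple(idx), 0)

def concat(t1, t2):               # the tensor product
    return t1 + t2

def inner_product(a, b):          # type-matching inner product
    common = a.entries.keys() & b.entries.keys()
    return sum(a.get(k) * b.get(k) for k in common)

lexis, other = TupleStore(), TupleStore()
lexis.put(('S',), 1)
other.put(('S',), 1)
other.put(('O',), 1)
print(inner_product(lexis, other))   # 1: the S connectors match
```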

This email is too long already, but it does need a foot-note: my personal,
primary usage for this beast is not to store and compute quark axial-vector
currents (or random matrices), but to generate "sentences", i.e. all the
non-zero scalars contracted from tensors, aka the "grammatically valid
sentences"; or, alternately, given a tuple-store (a lexis), to data-mine
it to discover how the indexes connected up (aka "parse a sentence"). So
the programming API for this imagined srfi would need to be amenable to
such usage.

Three things: tuple concatenation, a tuple store, an inner product. How
hard can that be?

-- linas
