"mathematical purity" ... how you can use vector/tensor algebra with texts
I'd suggest using the search word "embeddings" instead of "tensor".
The concept is being used in other fields, even physics, but (sticking
with linguistics) if you've not looked into Word2Vec yet, it is a good
place to appreciate how human language and linear algebra come together.
It is normally introduced as a ready-made model of dim 300, trained on
millions of words. Like you, I wanted to understand what it was actually
doing, so a few years ago I did a presentation using just two dimensions
and a handful of words and sentences, then plotted the embeddings found
for each word. You can add or remove a sentence at a time to see what it
is learning from each.
You can see how each dimension is being given some meaning, even if the
result is not how a human linguist would have structured things.
It is also a good test bed for finding the limits, such as playing
around with ambiguous words and proper nouns, increasing the amount of
training data without increasing dimension, etc.
Darren
P.S. The embedding layer is the first layer in transformers, the layer
where tokens ("words") are turned into numbers, typically of dim 512 or
higher. But note that they are randomly generated, not initialized from
word2vec or similar. And any modification to their initial randomness is
to please the layers above, not humans trying to peer inside the box.
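The embedding layer itself is just a randomly initialized lookup table
from token ids to vectors. A sketch, where the vocabulary size and the
init scale are illustrative and d_model=512 matches the figure above:

```python
import numpy as np

# A transformer-style embedding layer: random init, no word2vec warm start.
vocab_size, d_model = 1000, 512
rng = np.random.default_rng(42)
embedding = rng.normal(scale=0.02, size=(vocab_size, d_model))

# Token ids as a tokenizer might produce them (values are arbitrary here).
token_ids = np.array([17, 3, 101])

# The "layer" is a row lookup: one d_model-sized vector per token.
vectors = embedding[token_ids]
print(vectors.shape)    # (3, 512)
```

During training these rows are updated by backpropagation like any other
weights, which is why they end up shaped for the layers above rather than
for human inspection.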
P.P.S. I think you might also enjoy
https://transformer-circuits.pub/2021/framework/index.html which
explores how transformers work at a very low level.
The gap between their minimalist models and something like ChatGPT is
huge, though, and reading their work isn't going to help you appreciate
why ChatGPT says stupid things to you.
_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]