I'd like to go through each step and describe the data flow in plain
words, or at least make a start on it.  Complex steps could be broken
into substeps.

Step 1: Input preprocessing.
PerceiverTextPreprocessor looks up a token embedding and a position
embedding for each input position and sums them into one tensor.  Both
are simple trainable embedding tables.  An embedding maps each token to
a vector that training shapes to reflect how the token is used.

Tokens or bytes -> PerceiverTextPreprocessor -> Embedding vectors
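
To make step 1 concrete, here is a rough sketch of the lookup-and-sum
in plain Python.  It is not the library's code: make_table, preprocess
and the toy sizes are names and numbers I made up for illustration, and
it ignores any special tokens the real tokenizer may add.

```python
import random

def make_table(rows, dim, rng):
    """A trainable embedding starts life as a table of small random vectors."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rows)]

def preprocess(token_ids, token_table, position_table):
    """Look up each token's vector and add the vector for its position."""
    out = []
    for position, token_id in enumerate(token_ids):
        tok = token_table[token_id]        # depends on what the token is
        pos = position_table[position]     # depends on where the token is
        out.append([t + p for t, p in zip(tok, pos)])
    return out

rng = random.Random(0)
VOCAB_SIZE, MAX_POSITIONS, DIM = 256, 16, 4   # toy sizes (assumed, not the real config)
token_table = make_table(VOCAB_SIZE, DIM, rng)
position_table = make_table(MAX_POSITIONS, DIM, rng)

token_ids = list("hi!".encode("utf-8"))       # raw bytes used directly as token ids
vectors = preprocess(token_ids, token_table, position_table)
print(len(vectors), len(vectors[0]))          # one DIM-sized vector per input byte
```

In training, both tables would be updated by gradient descent; here
they just stay at their random initial values.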
