----
I think I can roughly summarise that even if the output tokenizer is
trainable, there isn't a big return in learning it, since the other
model parts would just train a different way around it.

I'm now interested in brainstorming another approach: something to
replace the tokenizer entirely. We could use the model's existing
output to train the replacement, and then finetune it toward the
desired result.
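
If I were to sketch the first half of that, it might look something
like the below. This is only a rough distillation sketch in PyTorch,
under my own assumptions: ByteEncoder is an invented stand-in for
whatever replaces the tokenizer, and teacher_repr stands in for
whatever output representation we extract from the existing model.

import torch
import torch.nn as nn

class ByteEncoder(nn.Module):
    # Hypothetical tokenizer replacement: maps raw bytes straight to
    # model-sized vectors. Name and architecture are made up here.
    def __init__(self, d_model: int = 768):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)  # one row per byte value
        self.mix = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        x = self.byte_embed(byte_ids)   # (batch, bytes, d_model); byte_ids is a LongTensor
        out, _ = self.mix(x)            # contextualise the byte stream
        return out.mean(dim=1)          # pool to one vector per sequence

def distill_step(student, teacher_repr, byte_ids, opt):
    # Phase 1: train the replacement to reproduce the model's existing
    # output (teacher_repr, however it's obtained from the current model).
    # Phase 2, finetuning toward the desired result, would then follow
    # with the rest of the model unfrozen.
    loss = nn.functional.mse_loss(student(byte_ids), teacher_repr)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# e.g. student = ByteEncoder(); opt = torch.optim.Adam(student.parameters(), lr=1e-4)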

Time to move to the other thread. Overstayed this one a little.
