in one of the chats i learned that in machine learning “grokking” is the
phenomenon whereby, if a model keeps training long after the training
loss bottoms out, it can suddenly generalize far beyond the training
set, reversing the effects of overfitting. doesn’t sound like it is
meant in that sense here.