this says it matches performance with 25x fewer parameters by using retrieval:
https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens

if we're stuck copying work we bump into, then it makes sense to find out
what has cited that paper (or bump into something else)
