i’m looking at LM-Infinite a little bit, maybe i can make some
progress toward getting it working

paper: https://arxiv.org/pdf/2308.16137.pdf
partial implementation:
https://github.com/kyegomez/LM-Infinite/blob/main/infinite/main.py

it seems like the theory is that the earliest tokens, which would
otherwise fall out of a sliding context window, are instead kept
pinned at the very start of the window (alongside a local window of
recent tokens), in some empirically-determined shape, and that this
radically improves the quality of long-context outputs
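
here’s a minimal sketch of that Λ-shaped mask as i understand it from
the paper -- each query attends to the first few tokens of the
sequence plus a recent local window, and to nothing in between. the
names lambda_mask, n_global, and n_local are mine, not from the repo,
and the defaults are guesses:

import torch

def lambda_mask(seq_len: int, n_global: int = 10,
                n_local: int = 4096) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask; True where query i may see key j."""
    q = torch.arange(seq_len).unsqueeze(1)   # query positions, as a column
    k = torch.arange(seq_len).unsqueeze(0)   # key positions, as a row
    causal = k <= q                          # never attend to the future
    starting = k < n_global                  # the pinned starting tokens
    local = (q - k) < n_local                # sliding window of recent tokens
    return causal & (starting | local)

a boolean mask like this can be passed straight to
torch.nn.functional.scaled_dot_product_attention as attn_mask (True
meaning “may attend”), so it slots into a stock attention layer
without touching the weights.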

the partial implementation doesn’t include the new calculation of the
position encodings, which differs depending on which model the length
extension is applied to.
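
as i read the paper, the position-encoding change is a “distance
ceiling”: relative distances are clamped so the model never sees an
offset larger than anything it saw in pretraining. a generic sketch of
just the clamp (clamped_distance and ceiling are my names; the default
is a guess):

import torch

def clamped_distance(seq_len: int, ceiling: int = 4096) -> torch.Tensor:
    """Relative-distance matrix d[i, j] = min(i - j, ceiling - 1)."""
    q = torch.arange(seq_len).unsqueeze(1)
    k = torch.arange(seq_len).unsqueeze(0)
    # negative entries correspond to future keys, which the Λ mask
    # above already blocks, so they can be ignored here
    return (q - k).clamp(max=ceiling - 1)

wiring d into the model is the per-model part: an additive-bias scheme
(ALiBi-style) could use d directly, while a RoPE model would have to
re-rotate far-away keys as if they sat at the ceiling distance. that’s
exactly the piece the partial implementation leaves out.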