https://huggingface.co/cerebras/btlm-3b-8k-base/discussions/25

Context length schedule and performance
#25

by baffo32 - opened less than a minute ago
Discussion

> Hey,
>
> I’m looking at your chart showing an incredible performance improvement
> from greatly extending the context length during a smaller portion of
> training at the end.
>
> It’s quite notable that most of the gains are at the untrained context
> lengths.
>
> It looks to me like steadily increasing the context length throughout
> training could possibly flatline the chart, since these relative gains
> are so big.
>
> Has anyone tried training on steadily increasing context lengths?
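
To make the question concrete, here is a rough sketch of what "steadily increasing" could mean: a per-step context length that ramps linearly over training. This is not how the model was actually trained. The function name, the start and end lengths (2048 and 8192), and the rounding multiple are all illustrative assumptions.

```python
def context_length_at(step, total_steps, start_len=2048, end_len=8192, multiple=256):
    """Hypothetical linear context-length schedule.

    Returns the sequence length to use at `step`, ramping from `start_len`
    to `end_len` over `total_steps` and rounding down to a multiple of
    `multiple` so batch shapes stay hardware friendly.
    All default values are assumptions chosen for illustration.
    """
    if total_steps <= 1:
        return end_len
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    raw = start_len + frac * (end_len - start_len)
    # Round down to the nearest multiple, never going below the start length.
    return max(start_len, int(raw) // multiple * multiple)


if __name__ == "__main__":
    total = 1001
    for step in (0, 250, 500, 750, 1000):
        print(step, context_length_at(step, total))
```

A real run would also need to rescale the batch size so the number of tokens per step stays roughly constant as the sequences get longer.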