This commit finally produces the correct Perceiver output without bailing.
I might do the GPT-2 model next, since I ran into a lot of unexpected
difficulties here.

commit ab72f4a6a2a9095587b02c262ae1b20801172315 (HEAD -> memory-efficient-attention, xloem/memory-efficient-attention)
Author: xloem <[email protected]>
Date:   Thu Jan 27 13:14:23 2022 +0000

    handle missing attention mask, add code for head_mask, comment out debugging break
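For context, the two behaviors the commit message names (tolerating a missing
attention mask and applying a head_mask) typically look something like the
following in a plain attention forward pass. This is a generic NumPy sketch of
the usual Transformer convention, not the actual patch, and the function and
argument names here are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, attention_mask=None, head_mask=None):
    # q, k, v: (num_heads, seq_len, head_dim)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    if attention_mask is not None:
        # additive mask: large negative values at masked positions;
        # when it is None, every position attends to every other
        scores = scores + attention_mask
    probs = softmax(scores)
    if head_mask is not None:
        # head_mask: (num_heads, 1, 1); zeros disable pruned heads
        probs = probs * head_mask
    return probs @ v
```

Passing `attention_mask=None` simply skips the additive-mask step, and a
zero entry in `head_mask` zeroes out that head's contribution entirely.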
