https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85253

--- Comment #3 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Yep, looking at the code, it seems that in this special
case, we need one more row in the temporary buffer.

This patch seems to cure it:

Index: m4/matmul_internal.m4
===================================================================
--- m4/matmul_internal.m4       (revision 259152)
+++ m4/matmul_internal.m4       (working copy)
@@ -234,7 +234,7 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl

       /* Adjust size of t1 to what is needed.  */
       index_type t1_dim;
-      t1_dim = (a_dim1-1) * 256 + b_dim1;
+      t1_dim = (a_dim1- (ycount > 1)) * 256 + b_dim1;
       if (t1_dim > 65536)
        t1_dim = 65536;
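
To make the size change concrete, here is a minimal standalone sketch of the two formulas from the hunk above. It is not the libgfortran code: the typedef for index_type and the helper names clamp_t1, t1_dim_old and t1_dim_new are assumptions made for illustration, and only the arithmetic is taken from the patch. In C, (ycount > 1) evaluates to 1 or 0, so the new expression matches the old one whenever ycount > 1 and asks for 256 more elements otherwise, before the clamp to 65536.

```c
#include <stddef.h>

/* Assumption: stand-in for libgfortran's index_type. */
typedef ptrdiff_t index_type;

/* Same upper limit as in the patched hunk. */
static index_type clamp_t1(index_type t1_dim)
{
  return t1_dim > 65536 ? 65536 : t1_dim;
}

/* Formula before the patch. */
static index_type t1_dim_old(index_type a_dim1, index_type b_dim1)
{
  return clamp_t1((a_dim1 - 1) * 256 + b_dim1);
}

/* Formula after the patch: (ycount > 1) is 1 or 0 in C, so one
   block of 256 fewer is subtracted when ycount is at most 1.  */
static index_type t1_dim_new(index_type a_dim1, index_type b_dim1,
                             index_type ycount)
{
  return clamp_t1((a_dim1 - (ycount > 1)) * 256 + b_dim1);
}
```

For example, with a_dim1 = 3 and b_dim1 = 5 the old formula gives 517, while the new one gives 773 for ycount = 1 and 517 for ycount = 4.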
