On 3/4/24 23:19, Philippe Mathieu-Daudé wrote:
+    uint32_t *za_row = &za[H4(tile_vslice_index(row))];
+    uint32_t n = zn[H4(row)];
+
+    for (col = 0; col < oprsz; ++col) {
+        uint8_t pb = pm[H1(col >> 1)] >> ((col & 1) * 4);
+        uint32_t *a = &za_row[col];

Shouldn't this be:

    uint32_t *a = &za_row[H4(col)];

to work on big endian hosts?

Oof.  Yes, the H4 adjustment belongs here, not on za_row above.

r~
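
For readers following along, here is a minimal sketch of why the index into a row of 32-bit elements needs the adjustment on a big-endian host. It assumes the usual QEMU convention that vector data is held as host-endian 64-bit words and that H4(x) amounts to x ^ 1 on big-endian hosts (identity on little-endian ones). The helper names (h4_big_endian, store_be64, load_be32_slot, read_element) are hypothetical and exist only for this illustration; the big-endian layout is simulated with a byte buffer so the sketch behaves the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for H4 on a big-endian host (assumed to be x ^ 1). */
static size_t h4_big_endian(size_t i)
{
    return i ^ 1;
}

/* Write a 64-bit word in big-endian byte order, as a big-endian host would. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int b = 0; b < 8; b++) {
        p[b] = (uint8_t)(v >> (56 - 8 * b));
    }
}

/* Read 32-bit slot idx the way a big-endian host indexes a uint32_t array. */
static uint32_t load_be32_slot(const uint8_t *p, size_t idx)
{
    const uint8_t *q = p + 4 * idx;
    return ((uint32_t)q[0] << 24) | ((uint32_t)q[1] << 16) |
           ((uint32_t)q[2] << 8) | (uint32_t)q[3];
}

/*
 * Fetch architectural 32-bit element elem (element 0 is the low half of
 * words[0]) from a simulated big-endian host layout, with or without the
 * H4-style index adjustment.
 */
static uint32_t read_element(const uint64_t words[2], size_t elem, int use_h4)
{
    uint8_t buf[16];

    assert(elem < 4);
    store_be64(buf, words[0]);
    store_be64(buf + 8, words[1]);
    return load_be32_slot(buf, use_h4 ? h4_big_endian(elem) : elem);
}
```

With words[0] holding elements 0 and 1 and words[1] holding elements 2 and 3, the adjusted index returns the requested element, while the unadjusted index returns its neighbour within the same 64-bit word. That swap is the failure mode the review comment is pointing at for &za_row[col].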