Hi all,

As mentioned in the previous ACEv1 thread, we implemented it in a
different way. We used internal patterns instead of inline assembly,
treating tmm as a fake register. Users still need to manage register
allocation for now by passing the register number, but this lets the
compiler see the dependencies between instructions, and it may be
more convenient for potential future tmm register allocation, since
all the patterns are there for reference. Since in ACE tmms are
accumulator units rather than calculation units, this is acceptable.

Also, since ACE and legacy AMX should not be used together, we use
different intrinsic names for the instructions they share. See the
individual patch descriptions for how we chose those names.

Bootstrapped and regtested on x86_64-pc-linux-gnu. Discussion of
this patch series is welcome.

Thx,
Haochen

