mengceng15 opened a new pull request, #11699:
URL: https://github.com/apache/tvm/pull/11699
This PR is a minor fix to AutoTVM task extraction for the int8 VNNI dense
operator. The VNNI dense strategy is used only when the input, weight, and
output dtypes are uint8, int8, and int32 (u8s8s32) and the weight layout is
NC16n4c.
Without altering the weight layout, for a simple dense workload:
def @main(%data: Tensor[(1, 16), uint8] /* ty=Tensor[(1, 16), uint8] */,
          %weight: Tensor[(32, 16), int8] /* ty=Tensor[(32, 16), int8] */) -> Tensor[(1, 32), int32] {
  nn.dense(%data, %weight, units=None, out_dtype="int32") /* ty=Tensor[(1, 32), int32] */
}
two non-VNNI tasks are extracted:
[Task(func_name=dense_nopack.x86, args=(('TENSOR', (1, 16), 'uint8'),
('TENSOR', (32, 16), 'int8'), None, 'int32'), kwargs={},
workload=('dense_nopack.x86', ('TENSOR', (1, 16), 'uint8'), ('TENSOR', (32,
16), 'int8'), None, 'int32')),
Task(func_name=dense_pack.x86, args=(('TENSOR', (1, 16), 'uint8'),
('TENSOR', (32, 16), 'int8'), None, 'int32'), kwargs={},
workload=('dense_pack.x86', ('TENSOR', (1, 16), 'uint8'), ('TENSOR', (32, 16),
'int8'), None, 'int32'))]
With this fix, one VNNI task is extracted:
[Task(func_name=dense_vnni.x86, args=(('TENSOR', (1, 16), 'uint8'),
('TENSOR', (2, 4, 16, 4), 'int8'), None, 'int32'), kwargs={},
workload=('dense_vnni.x86', ('TENSOR', (1, 16), 'uint8'), ('TENSOR', (2, 4, 16,
4), 'int8'), None, 'int32'))]
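
For readers unfamiliar with the NC16n4c layout, the following is a minimal pure-Python sketch of how a (32, 16) weight maps to the (2, 4, 16, 4) shape seen in the VNNI task above. It is an illustration only: `pack_nc16n4c` and `shape_of` are hypothetical helper names, not TVM APIs, and the sketch assumes N is divisible by 16 and K by 4.

```python
def pack_nc16n4c(weight, n_block=16, k_block=4):
    """Repack a row-major (N, K) weight into (N//16, K//4, 16, 4) blocks.

    Hypothetical helper for illustration; not part of TVM.
    packed[no][ko][ni][ki] == weight[no * 16 + ni][ko * 4 + ki]
    """
    n, k = len(weight), len(weight[0])
    # Assumption: the blocked layout needs N % 16 == 0 and K % 4 == 0.
    if n % n_block or k % k_block:
        raise ValueError("N must be divisible by 16 and K by 4")
    return [
        [
            [
                [weight[no * n_block + ni][ko * k_block + ki]
                 for ki in range(k_block)]
                for ni in range(n_block)
            ]
            for ko in range(k // k_block)
        ]
        for no in range(n // n_block)
    ]


def shape_of(nested):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# A (32, 16) weight, matching the dense workload in this PR description.
weight = [[n * 16 + k for k in range(16)] for n in range(32)]
packed = pack_nc16n4c(weight)
print(shape_of(packed))  # (2, 4, 16, 4)
```

The innermost dimension groups 4 consecutive values along the reduction axis and the next dimension groups 16 output channels, which is why the weight argument of the extracted VNNI task is 4-D rather than (32, 16).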