Lunderberg commented on code in PR #16098:
URL: https://github.com/apache/tvm/pull/16098#discussion_r1391636011
##########
python/tvm/relax/op/distributed/distributed.py:
##########
@@ -59,3 +59,28 @@ def redistribute(input: Expr, device_mesh: DeviceMesh, placement: Placement) ->
The tensor after redistribution.
"""
return _ffi_api.redistribute(input, device_mesh, placement) # type: ignore
+
+
+def redistribute_replica_to_shard(input: Expr, num_workers: int, axis: int) -> Expr:
Review Comment:
That is correct: we would require runtime support, but only if `num_workers` is
still dynamic after being lowered to either the disco runtime or the ccl op
legalization. It is easier to write a single dynamic implementation and then
specialize it to the various static cases than it is to write several distinct
static implementations. However, writing the dynamic implementation in the
first place requires that it be expressible, even though it will be specialized
away later during lowering.
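
The pattern described above (one dynamic implementation, later specialized to
static cases) can be sketched in plain Python. This is a hypothetical
illustration of the general idea, not TVM code; `redistribute_dynamic` and
`specialize` are invented names, and list slicing stands in for the actual
tensor sharding:

```python
def redistribute_dynamic(data, num_workers, axis):
    """Dynamic version: num_workers is an ordinary runtime value.

    Splits `data` into `num_workers` equal chunks along a flat axis
    (axis handling is elided for brevity in this sketch).
    """
    chunk = len(data) // num_workers
    return [data[i * chunk:(i + 1) * chunk] for i in range(num_workers)]


def specialize(num_workers):
    """Partially evaluate the dynamic version for a known worker count.

    This mirrors the lowering step that replaces the dynamic parameter
    with a compile-time constant, so only one implementation is written
    but each static case gets its own specialized form.
    """
    def specialized(data, axis=0):
        return redistribute_dynamic(data, num_workers, axis)
    return specialized


# One dynamic definition serves any worker count...
shards = redistribute_dynamic([1, 2, 3, 4, 5, 6], 3, axis=0)
# ...and static specializations fall out of it for free.
shard_in_two = specialize(2)
```

The point of the sketch is that `specialize` requires `redistribute_dynamic`
to exist and be expressible first, even though callers with a static worker
count never see the dynamic parameter after specialization.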
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]