lecoan opened a new issue, #15019:
URL: https://github.com/apache/tvm/issues/15019

   While trying to use Apache TVM for heterogeneous execution, I've 
encountered a problem with the PlanDevices pass. Specifically, when two 
operators share the same input but are assigned to different target devices, 
the PlanDevices pass fails.
   
   As an example, consider the expression `(a + b) - (b + c)`. If I assign 
the first `add` operator to the CPU and the second one to the GPU, the 
PlanDevices pass fails: it appears unable to determine the appropriate device 
for `b`.
   
   Given that it's common for multiple layers in a neural network to process 
the same input, this seems to be a bug in TVM that warrants attention.
   
   ### Expected behavior
   
   There are two behaviors I would expect from the PlanDevices pass in this 
scenario:
   
   1. Automatic insertion of a `device_copy`: if the CPU is the default device, 
the PlanDevices pass should insert a `device_copy` between `b` and the second 
`add` operator.
   2. Input replication: the PlanDevices pass could replicate `b` on both the 
CPU and the GPU.
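   
   Concretely, for the first option, the rewritten IR might look roughly like 
the following. This is a hand-written sketch, not actual PlanDevices output; 
the `device_copy` attribute spellings are assumptions and may not match the 
parser exactly:
   
   ```
   def @main(%a, %b, %c) {
       %0 = add(%a, %b);        /* stays on the CPU */
       %1 = device_copy(%b, src_virtual_device=CPU, dst_virtual_device=GPU);
       %2 = add(%1, %c);        /* runs on the GPU */
       subtract(%0, %2)         /* %2 copied back to wherever subtract runs */
   }
   ```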
   
   ### Actual behavior
   
   Assuming the PlanDevices pass visits `(a + b)` first, it pins `b` to the 
CPU. When it then visits `(b + c)`, it throws an error while attempting to 
place `b` on the GPU.
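   
   The failure mode can be modeled without TVM. The toy planner below (a 
simplified sketch, not TVM's actual implementation; `plan_devices` and 
`DeviceConflict` are hypothetical names) treats device planning as 
single-assignment: each value gets exactly one device, and a second, 
different constraint on the same value is a hard error rather than a trigger 
for inserting a `device_copy`:
   
   ```python
   class DeviceConflict(Exception):
       """Raised when one value is constrained to two different devices."""
   
   def plan_devices(constraints):
       """constraints: iterable of (var_name, device) pairs collected while
       visiting the program. Returns the final var -> device assignment, or
       raises DeviceConflict on the first contradictory constraint."""
       assignment = {}
       for var, device in constraints:
           if var in assignment and assignment[var] != device:
               # Mirrors the failure in this issue: %b is first pinned to
               # the CPU by (a + b), then (b + c) demands it on the GPU.
               raise DeviceConflict(
                   f"{var} already on {assignment[var]}, "
                   f"cannot also be on {device}"
               )
           assignment[var] = device
       return assignment
   
   # (a + b) on CPU constrains a and b; (b + c) on GPU constrains b and c.
   constraints = [("a", "cpu"), ("b", "cpu"), ("b", "cuda"), ("c", "cuda")]
   try:
       plan_devices(constraints)
   except DeviceConflict as e:
       print(e)  # b already on cpu, cannot also be on cuda
   ```
   
   Either expected behavior above amounts to resolving such a conflict by 
copying or replicating the value instead of raising.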
   
   Here is the error message:
   
   ```log
   TVMError: Function parameters and result VirtualDevices do not match those 
of call. Call:
   free_var %b: Tensor[(5, 7), float32] ;
   free_var %c: Tensor[(5, 7), float32] ;
   %0 = add(%b, %c) ;
   on_device(%0, virtual_device=VirtualDevice(device_type=2, 
virtual_device_id=0, target=Target(id=12cebdda0, kind='cuda', keys={'cuda', 
'gpu'}, attrs={'max_num_threads': 1024, 'thread_warp_size': 32, 'arch': 
"sm_50"}, host=Target(id=12ceba810, kind='llvm', keys={'cpu'})))) 
   with function virtual devices:
   fn(?4828570296?VirtualDevice(device_type=2, virtual_device_id=0, 
target=Target(id=11fcaac00, kind='cuda', keys={'cuda', 'gpu'}, 
attrs={'max_num_threads': 1024, 'thread_warp_size': 32, 'arch': "sm_50"}, 
host=Target(id=11fcc96b0, kind='llvm', keys={'cpu'})))):?4828570408?
   and implied call virtual devices:
   fn(?4828403160?VirtualDevice(device_type=1, virtual_device_id=0, 
target=Target(id=11fcb2f80, kind='llvm', keys={'cpu'}, 
host=Target(id=11fcc96b0, kind='llvm', keys={'cpu'})))):?4828554904?
   ```
   
   ### Environment
   
   - OS: Linux
   - TVM: Latest commit (4267fbf6a173cd742acb293fab4f77693dc4b887)
   
   ### Steps to reproduce
   
   Below is minimal reproduction code that attempts to set devices for 
`(a + b) - (b + c)`, where the first `add` operator is assigned to the CPU and 
the second one to the GPU:
   
   ```python
   import tvm
   from tvm import relay
   
   HOST_DEVICE = tvm.device("cpu")
   HOST_TARGET = tvm.target.Target("llvm")
   
   CPU_DEVICE = tvm.device("cpu")
   CPU_TARGET = tvm.target.Target("llvm").with_host(HOST_TARGET)
   
   GPU_DEVICE = tvm.device("cuda")
   GPU_TARGET = tvm.target.Target("cuda").with_host(HOST_TARGET)
   CPU = tvm.target.VirtualDevice(CPU_DEVICE, CPU_TARGET)  # device_type=1
   GPU = tvm.target.VirtualDevice(GPU_DEVICE, GPU_TARGET)  # device_type=2
   
   metatable = {"VirtualDevice": [CPU, GPU]}
   
   mod = tvm.relay.parse(
           """
           #[version = "0.0.5"]
           def @main(%a: Tensor[(5, 7), float32], %b: Tensor[(5, 7), float32],
                       %c: Tensor[(5, 7), float32]) {
               %0 = add(%a, %b);
               %1 = on_device(%0, virtual_device=meta[VirtualDevice][0]);
               %2 = add(%b, %c);
               %3 = on_device(%2, virtual_device=meta[VirtualDevice][1]);
               subtract(%1, %3)
           }
           """,
           "from_string",
           None,
           metatable,
       )
   
   DEFAULT = GPU
   CTXT = tvm.transform.PassContext(config={"relay.fallback_device_type": 
DEFAULT.device_type_int})
   TARGETS = [CPU_TARGET, GPU_TARGET]
   
   config = tvm.target.make_compilation_config(CTXT, TARGETS)
   mod = relay.transform.InferType()(mod)
   mod = relay.transform.PlanDevices(config)(mod)
   mod = relay.transform.InferType()(mod)
   ```
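   
   A possible workaround is to copy `%b` explicitly before the GPU `add`, so 
that each consumer sees `%b` on a single device. The fragment below is an 
untested sketch of the modified `@main` body; the `device_copy` attribute 
spellings are assumptions and may differ from what the Relay parser accepts:
   
   ```
   %0 = add(%a, %b);
   %1 = on_device(%0, virtual_device=meta[VirtualDevice][0]);
   %2 = device_copy(%b, src_virtual_device=meta[VirtualDevice][0],
                    dst_virtual_device=meta[VirtualDevice][1]);
   %3 = add(%2, %c);
   %4 = on_device(%3, virtual_device=meta[VirtualDevice][1]);
   subtract(%1, %4)
   ```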
   
   ### Triage
   
   * needs-triage

