tqchen commented on pull request #5914:
URL: https://github.com/apache/incubator-tvm/pull/5914#issuecomment-648903111


   I like the overall util for cache flushing; however, it would be great to 
discuss the interface of cache eviction. In terms of the API choices:
   
   - While it is understandable that we would like to keep the first 
argument (the activation) and flush the rest of the arguments, that is still a 
very specific setup (ideally it should be configurable).
   - Right now things are configured through an environment variable; is that 
the best way to configure the API?
   - The current logic does not check for contexts other than CPU, and will 
result in undefined behavior when we use OpenCL or CUDA (because the opaque 
data pointer does not correspond to a CPU address). It might also cause 
problems when an argument is not a DLTensor.
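
   To make the context-check concern concrete, here is a minimal pure-Python 
sketch (not the actual TVM implementation; `FakeTensor`, `cpu_cache_flush`, 
and the `begin` parameter are illustrative assumptions) of a flush helper 
that skips the first `begin` arguments and silently ignores non-CPU tensors 
instead of touching opaque device pointers:

   ```python
   # Device type code for CPU in the DLPack convention.
   kDLCPU = 1

   class FakeTensor:
       """Stand-in for a DLTensor-like argument (illustration only)."""
       def __init__(self, device_type):
           self.device_type = device_type
           self.flushed = False

   def cpu_cache_flush(args, begin=1):
       """Flush the caches of args[begin:], skipping non-CPU tensors."""
       for arg in args[begin:]:
           if getattr(arg, "device_type", None) != kDLCPU:
               # Opaque OpenCL/CUDA pointers are not CPU addresses;
               # flushing them would be undefined behavior.
               continue
           arg.flushed = True  # placeholder for the real cache-line flush loop

   # Usage: the first argument (activation) is kept, the CPU weight is
   # flushed, and the GPU tensor is left alone.
   activation = FakeTensor(kDLCPU)
   weight = FakeTensor(kDLCPU)
   gpu_arg = FakeTensor(2)  # e.g. a CUDA device type code
   cpu_cache_flush([activation, weight, gpu_arg])
   ```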
   
   Here are a few alternative API choices for configuring the cache flushing 
behavior.
   
   ### A0: Fold cache flushing factor into time_evaluator
   
   ```python
   mod = load_module()
   # flush the CPU caches of arguments starting from index 1
   f = mod.time_evaluator("myfunc", repeat=10, cache_flush_cpu_args_begin=1)
   ```
   
   
   ### A1: Decoupled Composite style
   ```python
   mod = load_module()
   # cache_flush_packed is a packed func that performs the CPU cache flush
   cache_flush_packed = remote.get_function("cpu_cache_flush")(begin=1)
   # fprepare is a callback that will be called before each evaluation;
   # it takes the same args as the evaluated function.
   f = mod.time_evaluator("myfunc", repeat=10, fprepare=cache_flush_packed)
   ```
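
   To show what the A1 contract would look like from the evaluator's side, 
here is a hedged pure-Python sketch (the `fprepare` name follows the proposal 
above; everything else is an illustrative assumption, not TVM's 
implementation) of a timer that invokes the prepare callback before each 
timed run:

   ```python
   import time

   def time_evaluator(func, args, repeat=10, fprepare=None):
       """Time func(*args) `repeat` times, calling fprepare first each run."""
       costs = []
       for _ in range(repeat):
           if fprepare is not None:
               # e.g. a packed func that flushes the CPU caches of args
               fprepare(*args)
           t0 = time.perf_counter()
           func(*args)
           costs.append(time.perf_counter() - t0)
       return costs

   # Usage: a recording callback confirms fprepare runs once per repetition.
   calls = []
   costs = time_evaluator(lambda x: x, [42], repeat=3,
                          fprepare=lambda x: calls.append(x))
   ```

   The appeal of this decoupled style is that the evaluator stays agnostic: 
any packed func can be plugged in as `fprepare`, so cache flushing is just 
one possible preparation step rather than a hard-coded feature.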
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
