janetsc commented on code in PR #13844:
URL: https://github.com/apache/tvm/pull/13844#discussion_r1091235281


##########
src/runtime/hexagon/hexagon_buffer.cc:
##########
@@ -235,19 +245,20 @@ void hexagon_buffer_copy_across_regions(const BufferSet& 
dest, const BufferSet&
 
   // Finally, do the memory copies.
   for (const auto& copy : macro_copies) {
-    // clean Hexagon cache before / after memcpy to ensure clean cache state 
to enable usage of DMA
-    // bypass mode for increased DMA bandwidth
     // TODO(HWE): Switch to ION Buffer to avoid need for memcpy and 
potentially lighten or alleviate
     // the burden of cache invalidation in this code
-    qurt_mem_cache_clean(reinterpret_cast<qurt_addr_t>(copy.dest), 
copy.num_bytes,
-                         QURT_MEM_CACHE_INVALIDATE, QURT_MEM_DCACHE);
-    qurt_mem_cache_clean(reinterpret_cast<qurt_addr_t>(copy.src), 
copy.num_bytes,
-                         QURT_MEM_CACHE_INVALIDATE, QURT_MEM_DCACHE);
     memcpy(copy.dest, copy.src, copy.num_bytes);
+
+    // We must flush and invalidate both the destination and source of the 
memcpy from the cache to
+    // ensure that DMA with cache bypass enabled will function properly. DMA 
with cache bypass
+    // enabled assumes that HexagonBuffer objects are not cached unless 
explicitly modified by the
+    // primfunc. We must flush and invalidate after the memcpy to uphold this 
assumption. Note that
+    // we need not flush and invalidate prior to the memcpy because it is 
coherent - issued through
+    // the cache.
     qurt_mem_cache_clean(reinterpret_cast<qurt_addr_t>(copy.dest), 
copy.num_bytes,
-                         QURT_MEM_CACHE_INVALIDATE, QURT_MEM_DCACHE);
+                         QURT_MEM_CACHE_FLUSH_INVALIDATE, QURT_MEM_DCACHE);
     qurt_mem_cache_clean(reinterpret_cast<qurt_addr_t>(copy.src), 
copy.num_bytes,
-                         QURT_MEM_CACHE_INVALIDATE, QURT_MEM_DCACHE);
+                         QURT_MEM_CACHE_FLUSH_INVALIDATE, QURT_MEM_DCACHE);

Review Comment:
   Can you explain why src would need a cache operation after the copy?  I 
think only dst needs to be flushed, my comment was to remove the operation on 
src, as the data src points to does not change after the memcpy.  src should be 
invalidated before a memcpy where the hexagon buffer is being copied to Hexagon.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to