janetsc commented on code in PR #13844:
URL: https://github.com/apache/tvm/pull/13844#discussion_r1091253341
##########
src/runtime/hexagon/hexagon_buffer.cc:
##########
@@ -235,19 +245,20 @@ void hexagon_buffer_copy_across_regions(const BufferSet& dest, const BufferSet&
// Finally, do the memory copies.
for (const auto& copy : macro_copies) {
- // clean Hexagon cache before / after memcpy to ensure clean cache state to enable usage of DMA
- // bypass mode for increased DMA bandwidth
// TODO(HWE): Switch to ION Buffer to avoid need for memcpy and potentially lighten or alleviate
// the burden of cache invalidation in this code
- qurt_mem_cache_clean(reinterpret_cast<qurt_addr_t>(copy.dest), copy.num_bytes,
- QURT_MEM_CACHE_INVALIDATE, QURT_MEM_DCACHE);
- qurt_mem_cache_clean(reinterpret_cast<qurt_addr_t>(copy.src), copy.num_bytes,
- QURT_MEM_CACHE_INVALIDATE, QURT_MEM_DCACHE);
Review Comment:
Some clarifications (from our offline discussion, saving here for posterity):
I was wondering what gets passed in the FastRPC call. If it is a pointer
to the allocation and not a pointer to the Hexagon buffer object, then we
shouldn't need any of these cache operations surrounding the memcpy. (Per
Karl's comment, the FastRPC calls have code that makes sure buffers passed as
arguments are coherent.)
It sounds like DDR is going to be just one allocation in that object. And
we do have the map of allocations to buffer objects, so I think we can get
rid of every cache operation except a flush on dst, issued after the memcpy,
when copying to the device.
That flush is needed only in case no primfunc modifies the data before a
DMA to VTCM. In that case, we want to make sure the data starts out flushed
to DDR.
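A minimal host-side sketch of the proposal above, not the actual change in this PR: memcpy first, then flush only dst, and only when copying to the device. `MemoryCopy`, `FlushFn`, `CopyWithFlush`, and the `dest_is_device` flag are hypothetical names for illustration. The cache call is injected as a callback so the sketch builds without the QuRT headers; on target it would presumably wrap `qurt_mem_cache_clean` with a flush opcode instead of `QURT_MEM_CACHE_INVALIDATE`.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for one entry of `macro_copies` in the diff above.
struct MemoryCopy {
  void* dest;
  const void* src;
  std::size_t num_bytes;
};

// Hypothetical hook for the cache maintenance call. On device this is assumed
// to wrap something like:
//   qurt_mem_cache_clean(addr, num_bytes, QURT_MEM_CACHE_FLUSH, QURT_MEM_DCACHE);
// It is a callback here so the sketch can run on a host.
using FlushFn = std::function<void(void* addr, std::size_t num_bytes)>;

// Copy each region, then flush only the destination, and only when the
// destination lives on the device. No cache operation touches src, and
// nothing happens before the memcpy.
void CopyWithFlush(const std::vector<MemoryCopy>& macro_copies,
                   bool dest_is_device, const FlushFn& flush_dcache) {
  for (const auto& copy : macro_copies) {
    std::memcpy(copy.dest, copy.src, copy.num_bytes);
    if (dest_is_device) {
      // Make sure the data starts out flushed to DDR in case no primfunc
      // modifies it before a DMA to VTCM.
      flush_dcache(copy.dest, copy.num_bytes);
    }
  }
}
```

Whether the flush can also be skipped when the FastRPC layer already guarantees coherency for the destination is left open here; the sketch only encodes the "flush dst after memcpy" rule.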