Nakul Jindal created SYSTEMML-1396:
--------------------------------------

             Summary: Enable lazily freeing cuda allocated memory chunks
                 Key: SYSTEMML-1396
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1396
             Project: SystemML
          Issue Type: Improvement
          Components: Runtime
            Reporter: Nakul Jindal
            Assignee: Nakul Jindal
             Fix For: SystemML 1.0

Deallocation of cuda memory chunks is currently done lazily. That came about because {{cudaFree}} operations are expensive. After adding extra instrumentation, it was determined that {{cudaAlloc}} operations are fairly expensive as well. Most GPU operations run in loops that allocate and deallocate memory chunks of the same size on every iteration. It would be more efficient to keep such a chunk and "clear it out" (set the memory to 0) instead of freeing it and allocating it again.
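
The reuse idea described above could be sketched roughly as follows. This is an illustration only, not SystemML's implementation: `ChunkPool` and its methods are hypothetical names, and `std::malloc`, `std::free` and `std::memset` stand in for the CUDA runtime's `cudaMalloc`, `cudaFree` and `cudaMemset` so that the sketch runs without a GPU.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <unordered_map>
#include <vector>

// Hypothetical size-keyed pool of released chunks (not SystemML code).
// Host calls are stand-ins: malloc ~ cudaMalloc, free ~ cudaFree,
// memset ~ cudaMemset.
class ChunkPool {
public:
    // Return a zeroed chunk of exactly `bytes` bytes, reusing a
    // previously released chunk of the same size when one is parked.
    void* acquire(std::size_t bytes) {
        auto it = free_.find(bytes);
        if (it != free_.end() && !it->second.empty()) {
            void* p = it->second.back();
            it->second.pop_back();
            std::memset(p, 0, bytes);  // "clear out" instead of reallocating
            ++reused_;
            return p;
        }
        void* p = std::malloc(bytes);  // expensive path: a real allocation
        if (p == nullptr) return nullptr;
        std::memset(p, 0, bytes);
        ++allocated_;
        return p;
    }

    // Lazy free: park the chunk under its size instead of freeing it.
    void release(void* p, std::size_t bytes) {
        if (p != nullptr) free_[bytes].push_back(p);
    }

    // Really free every parked chunk, e.g. when memory runs short.
    void clear() {
        for (auto& entry : free_)
            for (void* p : entry.second) std::free(p);
        free_.clear();
    }

    ~ChunkPool() { clear(); }

    std::size_t allocated() const { return allocated_; }
    std::size_t reused() const { return reused_; }

private:
    std::unordered_map<std::size_t, std::vector<void*>> free_;
    std::size_t allocated_ = 0;  // number of real allocations performed
    std::size_t reused_ = 0;     // number of requests served from the pool
};
```

In a loop that requests the same chunk size on every iteration, only the first `acquire` takes the allocation path, and each later one pays for a memset instead. Whether that trade is a net win on a real device depends on the relative cost of `cudaMemset` and `cudaMalloc`, which this host-side sketch does not measure.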