Nakul Jindal created SYSTEMML-1396:
--------------------------------------

             Summary: Enable lazily freeing cuda allocated memory chunks
                 Key: SYSTEMML-1396
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1396
             Project: SystemML
          Issue Type: Improvement
          Components: Runtime
            Reporter: Nakul Jindal
            Assignee: Nakul Jindal
             Fix For: SystemML 1.0


CUDA memory chunks are currently deallocated lazily. That approach was 
adopted because {{cudaFree}} operations are expensive. After adding extra 
instrumentation, it was determined that {{cudaAlloc}} operations are fairly 
expensive as well. 
Most GPU operations run in loops that allocate and deallocate memory chunks 
of the same size on every iteration. It would be more efficient to keep such 
a chunk and "clear it out" (set its memory to 0) for the next iteration 
instead of freeing and reallocating it.
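
The reuse idea described above could be sketched roughly as follows. This is 
only an illustration and not SystemML's actual implementation: the class 
{{ChunkPool}} and all of its methods are hypothetical, and plain host 
{{byte[]}} arrays stand in for device memory so that the sketch runs without 
a GPU. In a real GPU backend, the commented spots would presumably map to the 
CUDA runtime calls {{cudaMalloc}}, {{cudaMemset}} and {{cudaFree}}.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical size-keyed chunk pool; byte[] stands in for a device allocation. */
class ChunkPool {
    // Released chunks, grouped by size in bytes, waiting to be reused.
    private final Map<Integer, ArrayDeque<byte[]>> freeBySize = new HashMap<>();
    private int freshAllocations = 0;
    private int reuses = 0;

    /** Returns a zeroed chunk of the requested size, reusing a cached one if possible. */
    byte[] allocate(int size) {
        ArrayDeque<byte[]> cached = freeBySize.get(size);
        if (cached != null && !cached.isEmpty()) {
            byte[] chunk = cached.pop();
            Arrays.fill(chunk, (byte) 0); // stand-in for cudaMemset(ptr, 0, size)
            reuses++;
            return chunk;
        }
        freshAllocations++;
        return new byte[size];            // stand-in for cudaMalloc
    }

    /** Keeps the chunk for later reuse instead of freeing it right away. */
    void release(byte[] chunk) {
        freeBySize.computeIfAbsent(chunk.length, k -> new ArrayDeque<>()).push(chunk);
    }

    /** Drops all cached chunks, e.g. under memory pressure (stand-in for cudaFree). */
    void clear() {
        freeBySize.clear();
    }

    int freshAllocations() { return freshAllocations; }

    int reuses() { return reuses; }

    public static void main(String[] args) {
        ChunkPool pool = new ChunkPool();
        // Mimics a loop that needs a same-sized scratch buffer on every iteration.
        for (int i = 0; i < 1000; i++) {
            byte[] scratch = pool.allocate(4096);
            scratch[0] = 1;               // pretend a GPU kernel wrote to the buffer
            pool.release(scratch);
        }
        System.out.println("fresh allocations: " + pool.freshAllocations()
                + ", reuses: " + pool.reuses());
    }
}
```

In this sketch, only the first request for a given size pays the allocation 
cost; later iterations pay only for zeroing. Whether zeroing is actually 
cheaper than a free/allocate pair on a given device would still need to be 
confirmed with the instrumentation mentioned above.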



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
