szha commented on issue #14973: [MXNET-1404] Added the GPU memory profiler
URL: https://github.com/apache/incubator-mxnet/pull/14973#issuecomment-495969814

We had a discussion yesterday about the design (w/ @ArmageddonKnight @anirudh2290 @eric-haibin-lin). We all agreed that this is an awesome feature to add to mxnet. There were some concerns about the current design; one particular point is the requirement to pass a name from the frontend for each NDArray, along with the addition of a named imperative invoke interface. The motivation was to be able to identify and track the source of an allocation down to specific code.

As agreed, here I offer an alternative design (and a mock-up of the user experience) for automatically naming NDArrays in a way that makes them easy to identify, without requiring manual naming of all NDArrays. It can be done by providing interfaces that make it easy for users to mark a region of code. That, combined with information from the operators, should allow users to easily identify the part of the code responsible for a given memory allocation.

The design enables:
- user-specified scopes in code,
- identifying arrays within those scopes,
- without changes to existing execution interfaces such as imperative invoke.

Suppose a user wants to profile a function that looks like:

```
def function1(nd1, nd2):
    r1 = mx.nd.op1(nd1)
    r2 = mx.nd.op1(nd2)
    r3 = mx.nd.op2(r1, r2)
    return r3
```

One easy way is to offer users an interface for marking the scope:

```
def function1(nd1, nd2):
    with mx.profiler.scope('function1'):
        r1 = mx.nd.op1(nd1)
        r2 = mx.nd.op1(nd2)
        r3 = mx.nd.op2(r1, r2)
        return r3
```

When entering the profiler scope, it can invoke a new C API that sets the thread-local profiler scope name, prepending the outer scope name (if any) to the user-specified scope name, and saves the old scope. When exiting, it restores the old scope.
When an allocation happens, the entry should be named according to the current scope name, with a counter-based suffix per op to differentiate multiple invocations of the same op (i.e. the allocation identifier should look like `{scope}.{op}.{counter}`). For the code example above, this produces the following allocation records:

```
name,bytes
function1.op1.1,x
function1.op1.2,y
function1.op2.1,z
```

If the user wants finer granularity within the scope, that is possible too:

```
def function1(nd1, nd2):
    with mx.profiler.scope('function1'):
        r1 = mx.nd.op1(nd1)
        r2 = mx.nd.op1(nd2)
        with mx.profiler.scope('sec1'):
            r3 = mx.nd.op2(r1, r2)
        return r3
```

which results in the following records:

```
name,bytes
function1.op1.1,x
function1.op1.2,y
function1.sec1.op2.1,z
```

For easier usage, we can also support the decorator syntax:

```
@mx.profiler.scope('function1')
def function1(nd1, nd2):
    r1 = mx.nd.op1(nd1)
    r2 = mx.nd.op1(nd2)
    r3 = mx.nd.op2(r1, r2)
    return r3
```

The decorator should wrap the decorated code in the aforementioned scope.

For Gluon, we can update the `__call__` function in `Block` and `HybridBlock` to automatically use the class names as scopes. For optimizers, we can update the `update` function to achieve a similar effect.

Also, notice that the allocation identifier already carries a hierarchy that can be used for aggregation. This allows results to be aggregated without requiring a mapping from the users, which means there is no longer a need to ask for `SETME.py` changes.

This design can be applied to both Symbol and NDArray. The allocation identifier should be treated separately from variable names in Symbol, so that it provides a consistent experience for Gluon HybridBlocks.
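As a sketch of how the hierarchical identifier enables aggregation without any user-supplied mapping, one could roll up the `name,bytes` records by crediting each allocation to every dotted prefix of its identifier. The function name and byte counts below are illustrative, not part of the proposed API:

```python
import csv
import io
from collections import defaultdict

# Hypothetical helper: aggregate allocation records along the hierarchy
# encoded in the {scope}.{op}.{counter} identifier.
def aggregate(records_csv):
    totals = defaultdict(int)
    rows = csv.reader(io.StringIO(records_csv))
    next(rows)  # skip the 'name,bytes' header
    for name, nbytes in rows:
        parts = name.split('.')
        # Credit the allocation to every level of the hierarchy.
        for depth in range(1, len(parts) + 1):
            totals['.'.join(parts[:depth])] += int(nbytes)
    return dict(totals)

records = """name,bytes
function1.op1.1,100
function1.op1.2,200
function1.sec1.op2.1,300
"""
totals = aggregate(records)
print(totals['function1'])       # 600: everything under the function1 scope
print(totals['function1.sec1'])  # 300: just the sec1 sub-scope
```

With this scheme, any level of the report (whole scope, sub-scope, individual op) is just a prefix lookup, which is why no separate user-provided mapping is needed.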
