KaiserSozo opened a new issue #11101: Gluon performance and memory consumption
URL: https://github.com/apache/incubator-mxnet/issues/11101
 
 
   Working on GPU, I have the following code:
   
    for i, data in enumerate(trainingInputs):
        calcT = time.time()
        data = data.as_in_context(ctx)
        # forward pass through the SOM
        output, win_index, delta, mask = netSom(data)
        calc += time.time() - calcT

        copyT = time.time()
        # manual update of the SOM weights and firing rates
        weightsData = weights.data()
        ratesData = rates.data()
        ratesData[win_index] += 1
        weightsData[win_index] += delta
        ratesData.wait_to_read()
        weightsData.wait_to_read()
        train_accuracy += output.asscalar()
        copy += time.time() - copyT
   
   The calculation time accumulated in the calc variable is five times less than the copy time accumulated in the copy variable. Why is that, and how can it be reduced?
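   My guess (an assumption on my part) is that MXNet executes operations asynchronously, so the timer around the calc section only measures the time to enqueue the work, and the wait_to_read() calls in the copy section then absorb the actual compute time. A minimal sketch of this effect, with the context and shape as placeholders:

    import time
    import mxnet as mx

    ctx = mx.cpu()  # placeholder; mx.gpu(0) in my actual setup
    a = mx.nd.random.uniform(shape=(2048, 2048), ctx=ctx)

    t = time.time()
    b = mx.nd.dot(a, a)   # asynchronous: returns once the op is enqueued
    enqueue = time.time() - t

    t = time.time()
    b.wait_to_read()      # blocks until the product has actually been computed
    compute = time.time() - t

    # enqueue is typically near zero; the real work shows up in whichever
    # section contains the first blocking call (wait_to_read, asscalar, ...)
    print(enqueue, compute)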
   I also noticed that if I remove the wait_to_read() calls, the copy time drops to 0, but memory consumption keeps growing and eventually causes a memory allocation failure. I see almost the same behaviour in the following code using gluon:
   
    for i, (data, label) in enumerate(itertools.izip(trainingInputs, trainingOutputs)):  # zip() on Python 3
        calcT = time.time()
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        output, win_index, delta, mask = netSom(data)
        data = data.reshape((-1, inputsCount))
        with autograd.record():
            args = (data, mask)
            output = net(*args)
            l2loss = loss(output, label)
        l2loss.backward()
        calc += time.time() - calcT

        copyT = time.time()
        trainer.step(data.shape[0])  # apply the gradient update
        copy += time.time() - copyT

    testT = time.time()
    test_accuracy = evaluate_accuracyMLP(testInputs, testOutputs, net,
                                         netSom, inputsCount, activeNeuronsCount)
    test += time.time() - testT
   
   Here the calculation and copy (gradient update) times are almost equal. I am also looking for a way to decrease the copy time; it should be substantially less than the calculation time. The same strange behaviour persists here as well: if I remove the evaluate_accuracyMLP call, memory consumption keeps increasing until it fails with an error.
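   If the growing memory really comes from the engine queueing operations faster than they are executed, one workaround sketch would be to drain the queue periodically instead of blocking on every iteration (the interval of 100 below is an arbitrary choice, not a tested value):

    import mxnet as mx

    for i, data in enumerate(trainingInputs):
        # ... enqueue the forward pass and the updates exactly as above ...
        if (i + 1) % 100 == 0:
            # Block until all pending operations have finished, so their
            # outputs and temporaries can be freed before the queue grows
            # without bound.
            mx.nd.waitall()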

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
