There is one easy thing to try: call a_theano_function.free(): http://deeplearning.net/software/theano/library/compile/function.html?highlight=function#theano.compile.function_module.Function.free
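A minimal sketch of that pattern, for illustration only: catch the MemoryError, call free() on the function if it has one, and skip the offending example. The helper name train_skipping_oom and its arguments are hypothetical, not part of Theano. It accepts any callable, so it can be tried without Theano installed. Note that, as David's message says, free() checks allow_gc, and continuing after a MemoryError is untested in Theano, so this may still segfault with a real compiled function.

```python
def train_skipping_oom(train_fn, batches):
    """Call train_fn on each batch; on MemoryError, free storage and skip it.

    train_fn: any callable (e.g. a compiled theano.function). If it has a
              free() method, that method is called after a MemoryError.
    batches:  iterable of argument tuples, one tuple per training example.
    Returns (results, skipped), where skipped holds the indices of the
    batches that raised MemoryError.
    """
    results, skipped = [], []
    for i, batch in enumerate(batches):
        try:
            results.append(train_fn(*batch))
        except MemoryError:
            skipped.append(i)
            # Assumption: free() releases the intermediate storage held by
            # the function. Whether that is enough to continue safely is
            # exactly the open question in this thread.
            free = getattr(train_fn, "free", None)
            if callable(free):
                free()
    return results, skipped
```

Usage would look like results, skipped = train_skipping_oom(train, examples), where train is the compiled function and each element of examples is a tuple of its inputs.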
Normally we don't free all the memory when a MemoryError is raised; calling free() should do that. But as Pascal said, we haven't tested this, and we don't know of anyone continuing their process after this happens, so there could be some refcount problems. I'm not convinced that fixing them would be too big a task. We simply don't know whether the problem is small or very large. Maybe only a few ops have problems. You probably don't need to fix all of them, just the ones where your memory error happens.

But we don't have time for this. If you are a good C/C++ programmer and are ready to learn a little of the Python/NumPy C interface and the Theano C interface, you could try it. I would suggest that you start from:

http://deeplearning.net/software/theano/extending/extending_theano_c.html

If you want to understand the Theano graph better and not just the C code, the preceding page can be useful too (but it is not strictly necessary for fixing this problem in a single op):

http://deeplearning.net/software/theano/extending/extending_theano.html

Fred

On Thu, Aug 25, 2016 at 10:01 AM, Pascal Lamblin <[email protected]> wrote:

> Hi,
>
> Unfortunately, this is not a case we handle well, and since we do not
> really test recovery from errors, I would bet many of the Ops do not
> behave correctly wrt refcounts when encountering a MemoryError, which
> could explain the segfault.
>
> I think the real solution would be to clean those up (and test...), but
> I'm afraid it would be too big of a task.
>
> On Tue, Aug 23, 2016, David Knowles wrote:
> > I have a training function where the input varies in size. Some of my
> > training examples are considerably larger than others, so these can result
> > in MemoryErrors. I would like to be able to just ignore these samples, so
> > I'm catching MemoryErrors and then attempting to continue training.
> >
> > My initial issue was that the training function didn't seem to free memory
> > used by intermediate calculations before throwing the MemoryError, so
> > subsequent training examples would have very little memory left to use.
> > Deleting the training function does free up this memory, but then I would
> > have to set up my model/training again using checkpointed parameters (this
> > would be doable but tedious). So instead I tried co-opting
> > theano.function.free() removing the check for allow_gc. This frees the
> > memory... and then seg faults when I try to continue training.
> >
> > So... is there a valid solution to clean up a theano.function after it
> > threw a MemoryError? If not I'll find some workaround where I try to work
> > out the biggest example I can handle for a given network ahead of time.
> >
> > Thanks,
> >
> > David.
> >
> > --
> >
> > ---
> > You received this message because you are subscribed to the Google Groups "theano-users" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> > For more options, visit https://groups.google.com/d/optout.
>
> --
> Pascal
