On 04/07/13 04:17, Andre' Walker-Loud wrote:
Hi All,
I wrote some code that is running out of memory.
How do you know? What are the symptoms? Do you get an exception? Computer
crashes? Something else?
It involves a set of three nested loops, manipulating a data file (array) of
dimension ~ 300 x 256 x 1 x 2.
Is it a data file, or an array? They're different things.
It uses some third party software, but my guess is I am just not aware of how
to use proper memory management and it is not the 3rd party software that is
the culprit.
As a general rule, you shouldn't need to worry about such things, at least 99%
of the time.
Memory management is new to me, and so I am looking for some general guidance.
I had assumed that reusing a variable name in a loop would automatically flush
the memory by just overwriting it. But this is probably wrong. Below is a
very generic version of what I am doing. I hope there is something obvious I
am doing wrong or not doing which I can do to dump the memory in each cycle of the
innermost loop. Hopefully, what I have below is meaningful enough, but again,
I am new to this, so we shall see.
Completely non-meaningful.
################################################
# generic code skeleton
# import a class I wrote to utilize the 3rd party software
import my_class
Looking at the context here, "my_class" is a misleading name, since it's
actually a module, not a class.
# instantiate the function do_stuff
my_func = my_class.do_stuff()
This is getting confusing. Either you've oversimplified your pseudo-code, or
you're using words in ways that do not agree with standard terminology. Or
both. You don't instantiate functions, you instantiate a class, which gives you
an instance (an object), not a function.
So I'm lost here -- I have no idea what my_class is (possibly a module?), or
do_stuff (possibly a class?) or my_func (possibly an instance?).
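To make the terminology concrete, here is a minimal sketch of the distinction. Every name in it (double, DoStuff, worker) is made up for illustration, and I am only guessing that your do_stuff is a class whose instances can be called like functions.

```python
# Hypothetical names throughout; this is not the original poster's code.

def double(x):
    """A plain function: you call it, you don't instantiate it."""
    return 2 * x


class DoStuff(object):
    """A class: calling the class creates (instantiates) a new instance."""

    def __init__(self):
        self._result = None

    def __call__(self, data):
        # Defining __call__ makes instances callable, like a function.
        self._result = sum(data)

    def results(self):
        return self._result


worker = DoStuff()        # instantiate the class, giving an instance
worker([1.0, 2.0, 3.0])   # call the instance, which runs __call__
total = worker.results()  # fetch the stored result from the instance
```

If that guess is right, then "my_func" is an instance, not a function, and a name such as "worker" would describe it better.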
# I am manipulating a data array of size ~ 300 x 256 x 1 x 2
data = my_data # my_data is imported just once and has the size above
Where, and how, is my_data imported from? What is it? You say it is "a data
array" (what sort of data array?) of size 300x256x1x2 -- that's a four-dimensional
array, with 153600 entries. What sort of entries? Is that 153600 bytes (about 150K) or
153600 x 64-bit floats (about 1.2 MB)? Or 153600 data structures, each one holding 1MB of
data (about 153 GB)?
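The arithmetic behind those three guesses can be sketched as follows. The shape comes from your description; the three entry sizes are only my assumptions about what the entries might be.

```python
import struct
from functools import reduce
from operator import mul

shape = (300, 256, 1, 2)                 # dimensions quoted in the question
n_entries = reduce(mul, shape, 1)        # total number of entries

bytes_per_float = struct.calcsize("d")   # size of a C double (64-bit float)

as_bytes = n_entries * 1                 # assumption: one byte per entry
as_floats = n_entries * bytes_per_float  # assumption: one 64-bit float per entry
as_megabyte_blobs = n_entries * 10**6    # assumption: 1 MB of data per entry

print(n_entries, as_bytes, as_floats, as_megabyte_blobs)
```

If the data really is a numpy array, its nbytes attribute reports the total for you, so there is no need to guess.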
# instantiate a 3d array of size 20 x 10 x 10 and fill it with all zeros
my_array = numpy.zeros([20,10,10])
At last, we see something concrete! A numpy array. Is this the same
sort of array used above?
# loop over parameters and fill array with desired output
for i in range(loop_1):
    for j in range(loop_2):
        for k in range(loop_3):
How big are loop_1, loop_2, loop_3?
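If you are using Python 2, you should consider using xrange() rather than
range(). range() builds the entire list in memory up front, while xrange()
produces the values one at a time, so for very large numbers xrange will be
more memory efficient. (In Python 3, range() already behaves that way.)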
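A rough way to see the difference, assuming Python 3, where range() is lazy like Python 2's xrange() and list(range(n)) stands in for what Python 2's range() builds:

```python
import sys

n = 1000000

lazy = range(n)          # Python 3 range: lazy, like Python 2's xrange
eager = list(range(n))   # a real list: what Python 2's range would build

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)  # the list object only, not the ints in it

print(lazy_size, eager_size)
```

The lazy object stays the same small size however large n gets, while the list grows with n. For loops of a few hundred iterations the difference will not matter.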
            # create tmp_data that has a shape which is the same as data
            # except the first dimension can range from 1 - 1024 instead of being fixed at 300
            ''' Is the next line where I am causing memory problems? '''
            tmp_data = my_class.chop_data(data,i,j,k)
How can we possibly tell if chop_data is causing memory problems when you don't
show us what chop_data does?
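What you can do is measure it yourself. Here is a minimal sketch, assuming Python 3.4 or later for the tracemalloc module. The chop_data below is a made-up stand-in that slices a list; substitute your real one.

```python
import tracemalloc


def chop_data(data, i, j, k):
    """Hypothetical stand-in: returns a new list whose length depends on i, j, k."""
    n = 1 + (i + j + k) % len(data)
    return data[:n]


data = list(range(1024))

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

for i in range(3):
    for j in range(3):
        for k in range(3):
            tmp_data = chop_data(data, i, j, k)

# get_traced_memory() returns (current, peak) in bytes since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

growth = current - baseline
print("growth: %d bytes, peak: %d bytes" % (growth, peak))
```

If the growth figure keeps climbing as you add iterations, something is holding on to the data from earlier passes. If it stays flat, the loop itself is probably not the problem.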
            my_func(tmp_data)
            my_func.third_party_function()
Again, no idea what they do.
            my_array[i,j,k] = my_func.results() # this is just a floating point number
            ''' should I do something to flush tmp_data? '''
No. Python will automatically garbage collect it as needed.
Well, that's not quite true. It depends on what tmp_data actually is. So,
*probably* no. But without seeing the code that creates tmp_data, I cannot be sure.
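Here is a small sketch of both cases, assuming CPython, which frees an object as soon as nothing refers to it any more. Chunk and cache are invented names standing in for whatever chop_data returns and for anything, such as a library, that might keep its own reference.

```python
import gc
import weakref


class Chunk(object):
    """Hypothetical stand-in for whatever chop_data returns."""

    def __init__(self, n):
        self.values = [0.0] * n


tmp_data = Chunk(1000)
watcher = weakref.ref(tmp_data)   # observe the object without keeping it alive

tmp_data = Chunk(1000)            # rebind the name, as each loop pass does
gc.collect()                      # not needed in CPython; helps other implementations
freed_after_rebinding = watcher() is None

# If anything else still refers to the object, rebinding frees nothing.
cache = []
tmp_data = Chunk(1000)
cache.append(tmp_data)            # an extra reference, e.g. kept inside a library
watcher = weakref.ref(tmp_data)
tmp_data = Chunk(1000)
gc.collect()
still_alive_while_cached = watcher() is not None

print(freed_after_rebinding, still_alive_while_cached)
```

So reusing the name is enough in the first case, and in the second case no amount of flushing tmp_data will help until the other reference goes away.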
--
Steven