--- begin forwarded text
Date: Wed, 7 Aug 2002 07:32:59 -0400
To: [email protected]
From: Chris Pelkie <[EMAIL PROTECTED]>
Subject: Re: [opendx-users] Estimating memory use
Cc:
Bcc:
X-Attachments:
Hello DX users,
I am working with a large dataset generated by high-resolution CT scans. It
consists of 918 slices at 1024x1024, 16-bit (~1.7 GB). Obviously this causes
huge problems with memory, so I was hoping to restrict the data to regions
of interest using the Include module. We are currently tweaking our SGI
Origin 3400 to increase the amount of RAM available to dx.
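As a quick back-of-envelope check on that figure (a sketch; the slice count
and dimensions are the ones just quoted):

    # Raw footprint of the volume as described: 918 slices of
    # 1024 x 1024 voxels at 2 bytes (16 bits) per voxel.
    slices, width, height, bytes_per_voxel = 918, 1024, 1024, 2

    raw_bytes = slices * width * height * bytes_per_voxel
    print(f"{raw_bytes} bytes = {raw_bytes / 2**30:.2f} GiB")
    # -> 1925185536 bytes = 1.79 GiB, consistent with the ~1.7 GB quoted,
    #    and that is before DX makes any working copies.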
What I was wondering is whether there is any way, other than trial and
error, to estimate the maximum amount of memory required to run a visual
program, or used by an individual module. I thought that looking at memory
usage in the message window might do it, but that only shows what is cached
at the end of the program. If I cache only one step I can see the memory
used by the results of that step, but I would think that a module like
Include may use more memory than the results it caches.
The modules I have in my programs (that I assume use significant memory) are:
Import
Extract
Compute
Statistics
Mark
Include
Unmark
Export
The User's Guide, Chapter 4, discusses this to some degree.
Basic rules:
1. DX caches everything new.
2. DX shares anything old that doesn't change within the function.
3. DX never does garbage collection, so sooner or later you hit the wall
   and restart the server.
4. DX is not a database, so using one externally offers great advantages.*
So, module by module (a rough tallying sketch follows this list):

Import: expect a huge drawdown on memory as your files are loaded.
Extract: probably just makes a pointer to the array (I'm not sure on this
one).
Compute: definitely creates a new result array (I think even if the
expression is just "a"; at least that's how you expand regular positions to
explicit tuples); the array length equals the "a" input's length; the type
depends on what you do inside Compute.
Statistics: a puny little object containing a few numbers.
Mark: shouldn't use memory itself: it should create a new named pointer to
the "marked" array and rename the "data" pointer to "saved data"; however,
it may be that the allocation and duplication of "marked" happens here for
efficiency; the old "data" is not duplicated, just renamed.
Include: depends on Cull status. If you don't Cull, you create a new
"invalids" array, which is either one byte per position or connection, or
one int per invalid (a sparse representation); if you do Cull, you output a
new field with new positions, connections, and data, which could be really
big (if you Culled only a couple of values) or very small (if you Culled
98% of the data).
Unmark: should just swap pointer names around.
Export: shouldn't take any memory other than a file pointer/name ref (a few
bytes).
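To turn those rules of thumb into numbers, here is a minimal tallying
sketch. It is hypothetical, not anything DX provides; the per-module costs
just encode the guesses above, so adjust them to taste:

    # Hypothetical worst-case memory tally for the pipeline above.
    N = 918 * 1024 * 1024      # voxels in the full volume
    INT = 4                    # bytes per int (invalids as a sparse list)

    def estimate(n_items, data_bytes=2, invalid_fraction=0.5, cull=True):
        """Return {module: estimated bytes allocated} for one pass."""
        data = n_items * data_bytes
        n_invalid = int(n_items * invalid_fraction)
        return {
            "Import":     data,     # whole array comes in
            "Extract":    0,        # pointer only (a guess)
            "Compute":    data,     # new result array
            "Statistics": 64,       # a few numbers
            "Mark":       data,     # if "marked" is duplicated here
            # Include: without Cull, the cheaper of one byte per item vs
            # one int per invalid; with Cull, a new field (positions,
            # connections, data) for the surviving fraction.
            "Include":    (1 - invalid_fraction) * data * 3 if cull
                          else min(n_items, n_invalid * INT),
            "Unmark":     0,        # pointer swap
            "Export":     0,        # file I/O only
        }

    costs = estimate(N)
    for module, b in costs.items():
        print(f"{module:10s} {b / 2**30:6.2f} GiB")
    print(f"{'total':10s} {sum(costs.values()) / 2**30:6.2f} GiB")

Since DX never garbage-collects within a session (rule 3 above), the worst
case is roughly the sum of those allocations, not the largest single one.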
*We are developing some interesting and powerful connections between
OpenDX and MS SQL Server via Perl and Python scripts. The effort is only
about two weeks old, but two hatchling projects are coming along nicely,
one in fracture mechanics and the other in bioinformatics.
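The pattern is simple enough to sketch. The following is illustrative only:
it assumes the pyodbc driver, and the DSN, table, and column names are
hypothetical; DX would Import the resulting whitespace-delimited file via a
matching .general header file:

    # Sketch of the database-to-DX pattern: query SQL Server, dump rows
    # to a flat text file that DX can Import. All names are hypothetical.
    import pyodbc

    conn = pyodbc.connect("DSN=mydsn;UID=user;PWD=secret")
    cursor = conn.cursor()
    cursor.execute("SELECT x, y, z, value FROM measurements")

    with open("measurements.dat", "w") as out:
        for x, y, z, value in cursor.fetchall():
            out.write(f"{x} {y} {z} {value}\n")
    conn.close()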
How about rethinking your preprocessing: use Slab (twice) to whack off a
smaller area on each slice (one slice per iteration), maybe use Reduce as
well, write those smaller slices out as a series, then Import, Stack
(series positions as Z), and visualize. At some level of Slab dimension and
Reduce factor within that volume, you should be able to load a usable
object (sketched below).
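If it is easier to do that cropping before the data ever reaches DX, the
same idea outside DX might look like this (a sketch; the filenames, ROI
bounds, and reduction factor are assumptions):

    # Pre-crop and downsample raw 16-bit CT slices before Import.
    import numpy as np

    WIDTH, HEIGHT = 1024, 1024
    ROI = (slice(256, 768), slice(256, 768))  # hypothetical region of interest
    REDUCE = 2                                # keep every 2nd voxel in x and y

    for i in range(918):
        raw = np.fromfile(f"slice{i:04d}.raw", dtype=np.uint16)
        img = raw.reshape(HEIGHT, WIDTH)
        small = img[ROI][::REDUCE, ::REDUCE]  # crop, then reduce
        small.tofile(f"roi{i:04d}.raw")       # Import/Stack these in DX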
--
Chris Pelkie
Scientific Visualization Producer
618 Rhodes Hall, Cornell Theory Center, Cornell University
Ithaca, NY 14853
--- end forwarded text
--
Chris Pelkie
Managing Partner
Practical Video LLC
30 West Meadow Drive, Ithaca, NY 14850
