Title: Fwd: Re: [opendx-users] Estimating memory use

--- begin forwarded text


Date: Wed, 7 Aug 2002 07:32:59 -0400
To: [email protected]
From: Chris Pelkie <[EMAIL PROTECTED]>
Subject: Re: [opendx-users] Estimating memory use

Hello DX users,

I am working with a large dataset generated by high-resolution CT scans. It
consists of 918 slices @ 1024x1024, 16-bit (~1.8 GB). Obviously this causes
huge problems with memory, so I was hoping to restrict the data to regions of
interest using the Include module. We are currently tweaking our SGI Origin
3400 to increase the amount of RAM available to DX.

What I was wondering is whether there is any way, other than trial and error,
to estimate the maximum amount of memory required to run a visual program, or
used by an individual module. I thought that looking at memory usage in the
message window might do it, but that only shows what is cached at the end of
the program. If I cache only one step, I can see the memory used by the
results of that step, but I would think that a module like Include may use
more memory than the results it caches.

The modules I have in my programs (that I assume use significant memory) are:
Import
Extract
Compute
Statistics
Mark
Include
Unmark
Export

The User's Guide, Chapter 4, discusses this to some degree.

Basic rules:
1. DX caches everything new
2. DX shares anything old that doesn't change within the function
3. DX never does garbage collection, so sooner or later you hit the wall and have to restart the server
4. DX is not a database, so using one externally offers great advantages*

So,
Import: expect a huge drawdown on memory as your files are loaded
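
To put a number on that drawdown for the dataset described above, here's a quick back-of-the-envelope calculation (plain Python arithmetic, nothing DX-specific; the slice count and dimensions are the ones quoted in the question):

# Rough size of the raw CT volume once it is imported as a single field.
slices = 918
nx, ny = 1024, 1024
bytes_per_voxel = 2                      # 16-bit data

raw_bytes = slices * nx * ny * bytes_per_voxel
print(f"raw data array: {raw_bytes / 2**30:.2f} GiB")        # ~1.8 GiB

# If a later step promotes the data to 32-bit float (easy to do in Compute),
# budget roughly twice that again for the new array.
print(f"float copy:     {raw_bytes * 2 / 2**30:.2f} GiB")     # ~3.6 GiB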

Extract: probably just makes a pointer to the array (I'm not sure on this one)

Compute: definitely creates a new result array (I think even if the expression is just "a"; at least that's how you expand regular positions to explicit tuples); the array length equals the "a" input's length; the type depends on what you do inside Compute
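
To see what expanding regular positions to explicit tuples would cost on a volume this size, here's a rough estimate; it assumes DX stores explicit positions as three 32-bit floats per point, which I believe is the usual case:

# Cost of turning a regular 1024 x 1024 x 918 grid into explicit positions.
n_points = 1024 * 1024 * 918             # ~0.96 billion positions

regular_bytes  = 3 * 2 * 4               # origin + delta per axis: a handful of bytes
explicit_bytes = n_points * 3 * 4        # one (x, y, z) float triple per point

print(f"regular (compact) positions: {regular_bytes} bytes")
print(f"explicit positions:          {explicit_bytes / 2**30:.1f} GiB")   # ~10.8 GiB

Anything that forces explicit positions on the full volume dwarfs the data itself, which is another argument for subsetting first.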

Statistics: puny little object containing a few numbers

Mark: shouldn't use memory itself; it should create a new named pointer to the "marked" array and rename the "data" pointer to "saved data". However, it may be that the allocation and duplication of "marked" happens here for efficiency; the old "data" is not duplicated, just renamed

Include: depends on the Cull setting. If you don't Cull, you create a new "invalids" array, which is either one byte per position or connection, or one int per invalid element (a sparse representation). If you do Cull, you output a new field with new positions, connections, and data, which could be really big (if you Culled only a couple of values) or very small (if you Culled 98% of the data)
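
The two cases differ enormously in cost. A rough comparison, assuming one byte per element for the dense invalid component and a 4-byte int per entry for the sparse one:

n_elements = 1024 * 1024 * 918            # positions (or connections) in the full volume

# Without Cull: Include just adds an invalid component to the existing field.
dense_invalids  = n_elements * 1          # one byte per element
sparse_invalids = (n_elements // 100) * 4 # one int per invalid, if ~1% were invalid

print(f"dense invalid component:  {dense_invalids / 2**30:.2f} GiB")    # ~0.9 GiB
print(f"sparse invalid list (1%): {sparse_invalids / 2**20:.1f} MiB")   # ~37 MiB

# With Cull: the output is a whole new field (positions + connections + data),
# so its size scales with however much of the data survives the Include.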

Unmark: should just swap pointer names around

Export: shouldn't take any memory other than a file pointer/name reference (a few bytes)
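
Putting rules 1-3 together with the notes above, a crude way to answer the original question ("how much memory will this visual program need?") is to sum the new arrays each module produces, since nothing is freed until the server restarts. The per-module numbers below are only placeholders; substitute your own estimates:

# Worst-case peak ~= sum of every new result DX creates and caches (rules 1-3).
raw = 918 * 1024 * 1024 * 2                    # the imported 16-bit volume

budget = {
    "Import":  raw,                            # the data array itself
    "Compute": raw * 2,                        # e.g. if promoted to 32-bit float
    "Include": raw // 2,                       # whatever survives (or its invalid component)
    "Mark/Unmark/Statistics/Export": 0,        # effectively free
}

peak = sum(budget.values())
for name, size in budget.items():
    print(f"{name:35s} {size / 2**30:6.2f} GiB")
print(f"{'estimated peak':35s} {peak / 2**30:6.2f} GiB")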

*We are developing some interesting and powerful connections between OpenDX and MS SQL Server via Perl and Python scripts. Only about two weeks old, but two hatchling projects are coming along nicely, one in fracture mechanics and the other in bioinformatics.

How about rethinking your preprocessing: use Slab (twice) to whack off a smaller area of each slice (one slice per iteration), maybe use Reduce as well, write those smaller slices out as a series, then Import, Stack (series positions as Z), and visualize. At some level of Slab dimension and Reduce within that volume, you should be able to load a usable object.
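
If it's easier to do that cropping outside of DX, here's a minimal NumPy sketch of the same idea. The filenames, region-of-interest bounds, and downsampling factor are made up for illustration, and the raw-slice layout (1024x1024, 16-bit, headerless) is assumed from the description above:

import numpy as np

NX, NY = 1024, 1024                            # assumed slice dimensions
ROI = (slice(300, 700), slice(200, 600))       # hypothetical region of interest (rows, cols)
STEP = 2                                       # keep every 2nd voxel, like a crude Reduce

for i in range(918):
    # Read one raw 16-bit slice (filename pattern is hypothetical).
    slab = np.fromfile(f"ct_slice_{i:04d}.raw", dtype=np.uint16).reshape(NY, NX)

    # "Slab twice": crop to the region of interest, then downsample.
    cropped = slab[ROI][::STEP, ::STEP]

    # Write the smaller slice out; Import/Stack these in DX afterwards.
    cropped.tofile(f"roi_slice_{i:04d}.raw")

With these made-up numbers each output slice is 200 x 200 x 2 bytes = 80 KB instead of 2 MB, so the stacked volume drops from roughly 1.8 GiB to about 70 MiB.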

--
Chris Pelkie
Scientific Visualization Producer
618 Rhodes Hall Cornell Theory Center, Cornell University
Ithaca, NY 14853

--- end forwarded text


-- 
Chris Pelkie
Managing Partner
Practical Video LLC
30 West Meadow Drive,  Ithaca,  NY  14850


