Re: Large table in memory

Blaicher, Christopher Y. Sun, 23 Dec 2012 10:22:21 -0800

My concern about memory was not about storage above the bar or even above the 
line, but about real storage.  I have been involved in situations where a sort 
that worked fine last week didn't work so well this week, even though nothing 
had changed in the operating system or the sort.


What did change was that a new DB2 system had been brought up, and it used a 
fair chunk of real.  Sorts will use additional virtual memory if the real 
memory that would back it is not currently being used.  Adding an almost 1.2G 
table that is highly referenced (why else would you make it an in-memory 
table?) is going to keep most of that table in real memory.

I have seen LPARs with 2G and LPARs with 100G of real.  Using a new 1.2G 
probably won't have much effect on a large system, but drop it into a system 
on the smaller end of the scale and you will see effects.  Maybe in paging, 
maybe not.  Syncsort, at least, doesn't like to cause a rise in the paging 
rate, so you can sometimes see large sorts drop off in performance before you 
see a paging increase.

I guess my main point is that when making significant changes you need to have 
situational awareness.  A good choice for one situation may be a terrible 
choice for another.

On the question of design, I would vote for using hashing, if you have a 
hashable key.  It eliminates the maintenance of the binary tree and it is 
faster.
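
To illustrate the hashing suggestion, here is a minimal sketch of a chained 
hash table in Python.  It is not anyone's actual implementation: the class 
name, the bucket count, and the use of Python's built-in hash() in place of a 
real hash function are all assumptions made for the example.  The point it 
shows is that a lookup inspects only one bucket's chain, and there is no tree 
to maintain on insert.

```python
class HashTable:
    """Fixed-size chained hash table (illustrative sketch only)."""

    def __init__(self, n_buckets=1024):
        # One list ("chain") per bucket; n_buckets is an arbitrary assumption.
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key):
        # Python's built-in hash() stands in for a real hash function.
        return hash(key) % len(self.buckets)

    def put(self, key, record):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, record)  # replace the existing entry
                return
        chain.append((key, record))       # new key: append to the chain

    def get(self, key):
        """Return (record, comparisons); record is None if the key is absent."""
        comparisons = 0
        for k, record in self.buckets[self._index(key)]:
            comparisons += 1
            if k == key:
                return record, comparisons
        return None, comparisons
```

With a reasonable hash function and enough buckets the chains stay short, so 
the comparison count per lookup stays small however large the table grows.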

Chris Blaicher
Senior Software Engineer, Software Services
Syncsort Incorporated
50 Tice Boulevard, Woodcliff Lake, NJ 07677
P: 201-930-8260  |  M: 512-627-3803
E: [email protected]

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf 
Of John Gilmore
Sent: Sunday, December 23, 2012 8:39 AM
To: [email protected]
Subject: Re: Large table in memory

Joel Ewing's post makes several important points.  Hashing would certainly 
reduce the size of the key space to be searched, making your search scheme a 
little less important.

Another possibility that you should consider is the use of a binary-search tree 
(BST).  Even without AVL balancing or the like it should be possible to search 
such a BST using about 10 comparison operations per search.  If I were doing it 
I would use a 40-byte node block (NB) comprised of four pointers (one to the 
left subtree, one to the right subtree, one to a key block, and one to the 
information you need) and eight bytes for modal switches.  This scheme makes 
NBs reusable, and 
it separates the management of the BST from that of the heap you will need for 
records.  (I assume that since records vary significantly in size they can also 
grow.)
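
The 40-byte node block described above can be sketched with Python's struct 
module.  This assumes 8-byte (64-bit) pointers, which is what four pointers 
plus eight bytes of switches adding up to 40 implies.  The field names and the 
big-endian, unpadded layout are assumptions made for illustration; the post 
does not specify them.

```python
import struct

# Hypothetical layout: four 8-byte pointers followed by 8 bytes of switches.
# ">" = big-endian with no padding, "Q" = unsigned 64-bit, "8s" = 8 raw bytes.
NODE_BLOCK = struct.Struct(">QQQQ8s")


def pack_node(left, right, key_ptr, info_ptr, switches=b"\x00" * 8):
    """Build one 40-byte node block from four addresses and the switch bytes."""
    return NODE_BLOCK.pack(left, right, key_ptr, info_ptr, switches)


def unpack_node(buf):
    """Split a 40-byte node block back into its named fields."""
    left, right, key_ptr, info_ptr, switches = NODE_BLOCK.unpack(buf)
    return {
        "left": left,          # address of the left subtree's node block
        "right": right,        # address of the right subtree's node block
        "key_ptr": key_ptr,    # address of the key block
        "info_ptr": info_ptr,  # address of the record in the separate heap
        "switches": switches,  # eight bytes of modal switches
    }
```

Because the node block holds only pointers and switches, every NB is the same 
size whatever the record length, which is what makes the blocks reusable.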

Another design issue you need to consider very carefully is that of backup.  
The BST could be traversed readily (in, say, ascending key sequence), making 
both incremental and full backup operations easy to perform.
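
As a rough sketch of that backup traversal, the Python below walks an 
unbalanced BST in ascending key sequence.  The "changed" attribute is a 
hypothetical stand-in for one of the modal switches, used here to pick out 
nodes for an incremental backup; the post does not say how incremental backup 
would actually be selected, so treat that part as an assumption.

```python
class Node:
    def __init__(self, key, info):
        self.key = key
        self.info = info
        self.left = None
        self.right = None
        self.changed = True  # hypothetical "modified since last backup" switch


def insert(root, key, info):
    """Insert or update without rebalancing; return the root of the tree."""
    if root is None:
        return Node(key, info)
    node = root
    while True:
        if key == node.key:
            node.info = info
            node.changed = True
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key, info))
            return root
        node = child


def in_order(root):
    """Yield nodes in ascending key sequence, using an explicit stack."""
    stack, node = [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node
        node = node.right


def backup(root, incremental=False):
    """Return (key, info) pairs in key order and clear the switch on each one written."""
    out = []
    for node in in_order(root):
        if not incremental or node.changed:
            out.append((node.key, node.info))
            node.changed = False
    return out
```

A full backup writes every node; an incremental one writes only the nodes 
inserted or updated since the previous backup, still in key order.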

Finally, the anxieties about the use of [virtual] storage above the bar that 
have been expressed here are, in my view, overstated.  If you manage that 
storage at all well its use should be unproblematic.

People whose primary job is storage management will of course be concerned, 
properly, about the possibility that your system could malfunction in ways 
that made enormous, unanticipated demands on limited real storage; and for 
this reason you should involve them actively and from the beginning in what 
you are doing.

John Gilmore, Ashland, MA 01721 - USA

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
[email protected] with the message: INFO IBM-MAIN



