One of my application programmers asked for information on a design decision.

Given two files, neither of which is in order by the match key.
Assume the smaller file has at most one record per possible key value.
The larger file may have multiple records for some keys and may also have key
values not found in the smaller file.
Assume optimal blocking and buffering for both files in both solutions.
Which of these two approaches is more efficient?

Sort the smaller file by that key, load it to VSAM, then pass the unordered
larger file and do random retrievals from the VSAM file.

Sort both PS files by the key in question and pass the sorted files for
matching by key.
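
To make the two options concrete, here is a rough sketch in Python rather than in anything we would actually run on z/OS. It is only an illustration under assumptions: records are (key, data) tuples, a dict stands in for the VSAM file, the built-in sorted() stands in for SORT, and the function names match_by_lookup and match_by_merge are made up for this example. It shows the logic of each approach, not its I/O cost.

```python
from itertools import groupby
from operator import itemgetter

def match_by_lookup(small, large):
    """Option 1: keyed lookup. A dict stands in for the VSAM file (assumption)."""
    index = {key: data for key, data in small}  # at most one record per key
    for key, data in large:                     # larger file read in arrival order
        if key in index:                        # one random retrieval per record
            yield key, index[key], data

def match_by_merge(small, large):
    """Option 2: sort both inputs, then one sequential match pass."""
    small_sorted = iter(sorted(small, key=itemgetter(0)))
    current = next(small_sorted, None)
    large_sorted = sorted(large, key=itemgetter(0))
    for key, group in groupby(large_sorted, key=itemgetter(0)):
        # Advance the smaller file until its key catches up with this key.
        while current is not None and current[0] < key:
            current = next(small_sorted, None)
        if current is not None and current[0] == key:
            for _, data in group:               # all duplicates in the larger file
                yield key, current[1], data
```

Both functions produce the same set of matched records; only the order differs, since the lookup version follows the arrival order of the larger file and the merge version follows key order.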

We both (sorta) think the answer is ... "It depends" but on what criteria?

Number of records in each file? Which is more important? And by how much?
Record lengths? Same as above?
Length of the key field in question? (Within SORT and VSAM length restrictions,
of course.)
Key bias in the larger file?
Ratio of hits to non-hits?

Anyone have a nice formula?  :-)

David Speake

