Re: DFSORT statement sequence

David Betten Mon, 09 Feb 2009 06:22:12 -0800

> Thanks for your kind answer. Does that mean that AFTER reading all the
> matching input records, that subset is THEN sorted?


Correct.  As each record is read, the INCLUDE criteria are evaluated.
If the record does not meet the INCLUDE criteria, it is discarded and not
included in the sort.
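
As a rough sketch of that single-pass behavior: INCLUDE and SORT can sit in the
same control statement set, and only the records that pass INCLUDE reach the
sort. The data set names, step name, and field positions below are
hypothetical placeholders, not taken from the original poster's job.

```jcl
//* Single step: filter with INCLUDE, then sort only the survivors.
//* Data set names and field positions are illustrative assumptions.
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=YOUR.SORTED.SUBSET,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,10),RLSE)
//SYSIN    DD *
* Keep only records with 'ABC' in positions 1-3 (assumed layout)
  INCLUDE COND=(1,3,CH,EQ,C'ABC')
* Sort the kept records on an 8-byte character key at position 10
  SORT FIELDS=(10,8,CH,A)
/*
```

The order of the two statements in SYSIN does not change the processing
order: the INCLUDE test is applied as records are read, before they are sorted.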


> The reason is I have to select a wide variety of different inputs to
> produce a subset and THEN sort that lot for reporting. My goal: to have as
> few bytes to sort after selecting what I need with one pass of the input
> dataset.
>
> I wonder, if I do all my COPY and using INCLUDE statements and then do this
> two stage method:
>
> COPY    FROM(INPUT) TO(TEMP1) USING(<dd with INCLUDE>)
> COPY    FROM(TEMP1) TO(TEMP2) USING(<dd with SORT>)
>
> Will this work better for large input datasets?


This is one of those "it depends" situations.  If a large percentage of
the records are likely to be included, I think you are fine doing a single
sort with the INCLUDE statements.  If only a small percentage are likely
to be included, you may be better off doing the copy with INCLUDE first,
followed by a separate sort of the much smaller result.

The reason I say this is that when sorting with INCLUDE, DFSORT has no way
of knowing how many records will match the criteria.  Therefore, we need to
allocate resources (main storage, work space, etc.) under the assumption
that all records will be included.  So you may have some wasted resources
if only a small percentage of the records are actually included.
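
For the case where few records are expected to survive, the two-stage method
from the question might look roughly like the ICETOOL job below. This is only
a sketch: the data set names, space values, field positions, and the
INCL/SRTF control names are assumptions made for illustration. Note that the
second operator is written as SORT rather than COPY, since ICETOOL's COPY
operator does not sort.

```jcl
//* Two-stage sketch: COPY with INCLUDE into TEMP1, then SORT TEMP1.
//* All names, sizes, and field positions are illustrative assumptions.
//TOOLSTEP EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//INPUT    DD DSN=YOUR.INPUT.DATA,DISP=SHR
//TEMP1    DD DSN=&&TEMP1,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(10,10))
//TEMP2    DD DSN=YOUR.SORTED.SUBSET,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,10),RLSE)
//TOOLIN   DD *
* Stage 1: one pass of the input, keeping only the wanted records
  COPY FROM(INPUT) TO(TEMP1) USING(INCL)
* Stage 2: sort the (hopefully much smaller) subset
  SORT FROM(TEMP1) TO(TEMP2) USING(SRTF)
/*
//INCLCNTL DD *
  INCLUDE COND=(1,3,CH,EQ,C'ABC')
/*
//SRTFCNTL DD *
  SORT FIELDS=(10,8,CH,A)
/*
```

With this layout the sort in stage 2 sees only the size of TEMP1, so its
resources can be sized for the subset instead of the full input. The cost is
the extra write and read of TEMP1, which is why the single sort tends to be
the better choice when most records are kept.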
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
