> Thanks for your kind answer. Does that mean that AFTER reading all the
> matching input records, that subset is THEN sorted?
Correct. As each record is processed, the INCLUDE criteria are evaluated. If the record does not meet the INCLUDE criteria, it is discarded and not included in the sort.

> The reason is I have to select a wide variety of different inputs to produce a
> subset and THEN sort that lot for reporting. My goal: to have as few bytes to
> sort after selecting what I need with one pass of the input dataset.
>
> I wonder, if I do all my COPY and using INCLUDE statements and then do this
> two stage method:
>
> COPY FROM(INPUT) TO(TEMP1) USING(<dd with INCLUDE>)
> COPY FROM(TEMP1) TO(TEMP2) USING(<dd with SORT>)
>
> Will this work better for large input datasets?

This is one of those "it depends" situations. If a large percentage of the records are likely to be included, I think you are fine with just doing one sort with the INCLUDE statements. If only a small percentage are likely to be included, you may be better off doing the copy with INCLUDE, followed by the sort.

The reason I say this is that when sorting with INCLUDE, DFSORT has no way of knowing in advance how many records will match the criteria. Therefore, we need to allocate resources (main storage, work space, etc.) under the assumption that all records will be included. So you may have some wasted resources if only a small percentage of the records are actually included.
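
As a rough sketch of the two-stage approach discussed above, an ICETOOL step might look something like the following. The data set name, the space values, the field positions, and the INCLUDE condition are all made up for illustration and would need to be replaced with your own. Note also that this sketch uses the ICETOOL SORT operator for the second pass, which is an assumption about what was intended by the second COPY in the question.

```jcl
//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* Hypothetical input data set name
//INPUT    DD DSN=YOUR.INPUT.DATA,DISP=SHR
//* Temporary data set holding only the selected records
//TEMP1    DD DSN=&&TEMP1,DISP=(,PASS),UNIT=SYSDA,
//            SPACE=(CYL,(10,10))
//OUTPUT   DD SYSOUT=*
//TOOLIN   DD *
* Pass 1: copy only the records that meet the INCLUDE criteria
  COPY FROM(INPUT) TO(TEMP1) USING(INCL)
* Pass 2: sort the (smaller) subset
  SORT FROM(TEMP1) TO(OUTPUT) USING(SRT1)
/*
//INCLCNTL DD *
* Hypothetical selection: keep records with 'ABC' in columns 1-3
  INCLUDE COND=(1,3,CH,EQ,C'ABC')
/*
//SRT1CNTL DD *
* Hypothetical sort key: columns 1-10, character, ascending
  SORT FIELDS=(1,10,CH,A)
/*
```

For comparison, the single-pass version would put both the INCLUDE and SORT statements in one control data set and use a single operator, for example SORT FROM(INPUT) TO(OUTPUT) USING(BOTH) with a BOTHCNTL DD containing the same two statements. Which one performs better depends on the percentage of records selected, as described above.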

