> I am wondering/believing that JOINKEYS could do all of this in one
single swoop..> Now, waiting for Sri ... ;-)
Elardus Engelbrecht.
Like minds think alike. : ) Massimo has already posted a solution using
Joinkeys. However he came up with an idea where the input file is SORTED
twice ( Subtask 1 and subtask2 ). For smaller files this is acceptable, but
if the input file have millions of records then it is advisable NOT to SORT
the file twice. The idea of matching the file to itself is to use a COPY
operation, so that you don't require any resources. If do have to SORT the
data, then I would prefer 1 pass of sorting the data like shown below,
Just evaluate description and if it is 'CORPORATIVO' then have a temporary
value of zero at position 81 and it is anything other than the required
string , then have the temp value as 1.
Sort on the Item_nbr and Sum the temporary value at 81. By doing so, any
item_nbr that has a different value other than 'CORPORATIVO' will have a
total greater than 1 and those are the records that you need to eliminate.
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD *
4046340¦CORPORATIVO UNIQUE - PICK
4046564¦CORPORATIVO HAS ANOTHER DESC - DROP
4046564¦ESTADO HAS ANOTHER DESC - DROP
4047131¦CORPORATIVO UNIQUE - PICK
4047460¦CORPORATIVO UNIQUE - PICK
4047479¦CORPORATIVO UNIQUE - PICK
4047480¦CORPORATIVO HAS ANOTHER DESC - DROP
4047480¦RON THOMAS HAS ANOTHER DESC - DROP
4047481¦CORPORATIVO DUPLICATE DESC - PICK
4047481¦CORPORATIVO DUPLICATE DESC - PICK
//SORTOUT DD SYSOUT=*
//SYSIN DD *
INREC OVERLAY=(81:12,20,CHANGE=(3,C'CORPORATIVO',C'000'),
NOMATCH=(C'001'))
SORT FIELDS=(01,10,CH,A)
SUM FIELDS=(81,3,ZD)
OUTFIL OMIT=(81,3,ZD,GT,0),
BUILD=(1,80)
//*
The output from this job is
4046340¦CORPORATIVO UNIQUE - PICK
4047131¦CORPORATIVO UNIQUE - PICK
4047460¦CORPORATIVO UNIQUE - PICK
4047479¦CORPORATIVO UNIQUE - PICK
4047481¦CORPORATIVO DUPLICATE DESC - PICK
Notice how the Item_Nbr "4047481" has 2 records and both of them have
CORPORATIVO, but we write out only 1 record
Since OP has shown the data to be already sorted, we can use JOINKEYS to
drop the records that have a different description other than
'CORPORATIVO' by matching the file to itself. Subtask1 will include all
the records with 'CORPORATIVO' and Subtask2 will include records other
than 'CORPORATIVO' . The options SORTED,NOSEQCK will make Joinkeys use a
COPY operation.
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//INA DD *
4046340¦CORPORATIVO UNIQUE - PICK
4046564¦CORPORATIVO HAS ANOTHER DESC - DROP
4046564¦ESTADO HAS ANOTHER DESC - DROP
4047131¦CORPORATIVO UNIQUE - PICK
4047460¦CORPORATIVO UNIQUE - PICK
4047479¦CORPORATIVO UNIQUE - PICK
4047480¦CORPORATIVO HAS ANOTHER DESC - DROP
4047480¦RON THOMAS HAS ANOTHER DESC - DROP
4047481¦CORPORATIVO DUPLICATE DESC - PICK
4047481¦CORPORATIVO DUPLICATE DESC - PICK
//INB DD *
4046340¦CORPORATIVO UNIQUE - PICK
4046564¦CORPORATIVO HAS ANOTHER DESC - DROP
4046564¦ESTADO HAS ANOTHER DESC - DROP
4047131¦CORPORATIVO UNIQUE - PICK
4047460¦CORPORATIVO UNIQUE - PICK
4047479¦CORPORATIVO UNIQUE - PICK
4047480¦CORPORATIVO HAS ANOTHER DESC - DROP
4047480¦RON THOMAS HAS ANOTHER DESC - DROP
4047481¦CORPORATIVO DUPLICATE DESC - PICK
4047481¦CORPORATIVO DUPLICATE DESC - PICK
//SORTOUT DD SYSOUT=*
//SYSIN DD *
OPTION COPY
JOINKEYS F1=INA,FIELDS=(1,10,A),SORTED,NOSEQCK
JOINKEYS F2=INB,FIELDS=(1,10,A),SORTED,NOSEQCK
JOIN UNPAIRED,F1,ONLY
//*
//JNF1CNTL DD *
INCLUDE COND=(12,20,CH,EQ,C'CORPORATIVO')
/*
//JNF2CNTL DD *
INCLUDE COND=(12,20,CH,NE,C'CORPORATIVO')
/*
The output from this job is
4046340¦CORPORATIVO UNIQUE - PICK
4047131¦CORPORATIVO UNIQUE - PICK
4047460¦CORPORATIVO UNIQUE - PICK
4047479¦CORPORATIVO UNIQUE - PICK
4047481¦CORPORATIVO DUPLICATE DESC - PICK
4047481¦CORPORATIVO DUPLICATE DESC - PICK
Notice how we picked the 2 duplicate records for the item_Nbr "4047481"
If the data is NOT sorted and you still want to use JOINKEYS method then,
then you can optimize the job a bit
On the Subtask2, all you care is just the Item_Nbr, so that you can match
it. So build the Subtask2 with just the Item_Nbr using INREC statement.
This will reduce the resources required to sort ( 80 byte sorting vs 10
byte sorting). You also do NOT require SUM FIELDS=NONE as the intention is
to DROP the matched records. And since we are using JOIN UNPAIRED,F1,ONLY
the matched records will be dropped even if the Join ended up with a
Cartesian join.
The updated control cards for Joinkeys would be as follows.
//SYSIN DD *
OPTION COPY
JOINKEYS F1=INA,FIELDS=(1,10,A)
JOINKEYS F2=INB,FIELDS=(1,10,A)
JOIN UNPAIRED,F1,ONLY
//*
//JNF1CNTL DD *
INCLUDE COND=(12,20,CH,EQ,C'CORPORATIVO')
/*
//JNF2CNTL DD *
INCLUDE COND=(12,20,CH,NE,C'CORPORATIVO')
INREC BUILD=(1,10)
/*
Further if you have any questions please let me know
Thanks,
Kolusu
DFSORT Development
IBM Corporation
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN