I have a fixed-record-length data file which has "subsections" starting with a 
"data characteristics" record (call it a "subsection header").  One of the 
fields in the "subsection header" is a 2-byte zoned decimal "data length" value 
which identifies the number of SIGNIFICANT columns in the "data" portion of the 
records following the "subsection header".

What I would like to do is to blank out the NON-significant data columns in the 
data records that follow each "subsection header".  The data records have 
suffered from some "data pollution" where non-significant data has been 
accidentally stored beyond the significant data columns.

Each "subsection header" may have a different "significant data" length value 
for the following data records.

Example INPUT data (column 1 = record type [H = header, D = data], data starts 
in column 3 in each record):

H 05  COMMENT: "05" IS SIGNIFICANT DATA COLUMNS
D 12345 XYZ ABC DEF
D 45678 GHI JKL MNO
H 10
D 1234567890 ABCDEFGHIJKL
D 9876543210 MNOPQRSTUVWXYZ

Example OUTPUT data (column 1 = record type [H = header, D = data], data starts 
in column 3):

H 05  COMMENT: "05" IS SIGNIFICANT DATA COLUMNS
D 12345
D 45678
H 10
D 1234567890
D 9876543210

Obviously I can write a pretty simple script or program to accomplish this 
"data cleaning" operation, but I wondered if it would be possible using just 
SORT.
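For comparison, here is a sketch of the "simple script" version of the logic, in Python. It assumes the layout shown in the example above (column 1 = record type, data starting in column 3, and the header's length field readable as plain digits in columns 3-4; a real zoned-decimal field would need decoding first). The function name and truncation approach are illustrative, not a definitive implementation:

```python
def clean_records(lines):
    """Blank non-significant data columns in D records that follow each H record.

    Assumes: column 1 is the record type (H = header, D = data), data starts
    in column 3, and the header carries the significant-data length in
    columns 3-4 as plain digits (stand-in for the 2-byte zoned decimal field).
    """
    sig_len = 0
    out = []
    for line in lines:
        if line[:1] == "H":
            # Header: pick up the significant-data length for the records below.
            sig_len = int(line[2:4])
            out.append(line)
        elif line[:1] == "D":
            # Data: keep record type, separator, and only the significant columns.
            # For true fixed-length output you would pad with blanks instead,
            # e.g. line[:2 + sig_len].ljust(len(line)).
            out.append(line[:2 + sig_len])
        else:
            out.append(line)
    return out
```

The same idea in SORT terms would need the header's length value to be "remembered" across the following data records, which is what makes the pure-SORT version the interesting part of the question.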

The data volume is in the range of about 100K-200K records per file if that 
matters.

TIA for any ideas you can offer.

Peter
--

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN