This is one of my ruminations, and if you find them boring you may safely skip
the rest of this post.
The subject of this thread in all of its particularity is not of much interest,
but it is of great interest if it is viewed more abstractly. This time,
moreover, it is easy to generalize.
Most of the posts dealt with blanks, x'40' in EBCDIC and x'20' in ASCII, but
repetitions of nuls, x'00' in both, are more important in some few contexts.
The limitation to 16 characters, one instance and 15 repetitions of it, is
artificial.
Finally, the limitation to a single character is stultifying: many apparently
different but in fact conceptually identical problems require that repetitions
of any of a subset of characters be identified.
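
To make the generalization concrete, here is a minimal sketch in C (not assembler) of measuring a run of bytes drawn from an arbitrary set, with no fixed limit of 16. The names build_run_table and run_length are hypothetical, invented for this illustration. The set is passed with an explicit length so that it may include the nul, x'00'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build a 256-byte table in the TRT style: zero for every byte that
   belongs to the set (keep scanning), nonzero for every other byte
   (stop here). Hypothetical helper, for illustration only. */
static void build_run_table(unsigned char table[256],
                            const unsigned char *set, size_t set_len)
{
    assert(set != NULL);
    memset(table, 1, 256);            /* by default every byte ends a run */
    for (size_t i = 0; i < set_len; i++)
        table[set[i]] = 0;            /* members of the set are skipped   */
}

/* Return the length of the leading run of set members in s[0..len-1]. */
static size_t run_length(const unsigned char *s, size_t len,
                         const unsigned char table[256])
{
    size_t i = 0;
    assert(s != NULL);
    while (i < len && table[s[i]] == 0)
        i++;
    return i;
}
```

Changing the set changes only the table; the scanning loop is untouched.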
Consider the two PL/I statements, useful because they are context-sensitive:
    declare transac3 file record sequential buffered ;
    open file(transac3) input ;
In both we want to break out the tokens, 'declare', 'transac3', 'file',
'record', 'sequential', 'buffered', ';' in the first and 'open', 'file', '(',
'transac3', ')', 'input', ';' in the second. Here, as Robin has already
pointed out, a TRT is better than a CLC[L].
Indeed, well-written lexical breakout routines consist of little more than a
small finite-state-machine and a set of TRT tables. In particular, they do not
process inputs one character at a time (as the computer science 101
illustrations always do).
There is a sense in which this is well known. The current PROP in its
discussion of the TRT instruction says:
    TRANSLATE AND TEST may be used to scan the first operand for characters with
    special meaning. The second operand, or list, is set up with all-zero function
    bytes for those characters to be skipped over and with nonzero function bytes
    for the characters to be detected.
For my example it is thus possible to both stop on [any of] a blank, a left
parenthesis, a right parenthesis, or a semicolon and to distinguish them
(without further testing) after stopping.
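
The idea can be sketched in C as an emulation of the table scan; this is an illustration only, not the instruction itself. The function-byte values, the names build_delim_table and scan_for_delim, and the use of ASCII character literals are all assumptions made for the example. The real TRT reports its result in the condition code and in general registers 1 and 2 rather than through a return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function-byte values: zero means "skip over", and each
   nonzero value names the delimiter that stopped the scan. */
enum { FN_SKIP = 0, FN_BLANK = 1, FN_LPAREN = 2, FN_RPAREN = 3, FN_SEMI = 4 };

/* Fill a 256-byte table: all zero except the four delimiters. */
static void build_delim_table(unsigned char table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = FN_SKIP;
    table[(unsigned char)' '] = FN_BLANK;
    table[(unsigned char)'('] = FN_LPAREN;
    table[(unsigned char)')'] = FN_RPAREN;
    table[(unsigned char)';'] = FN_SEMI;
}

/* Scan s[0..len-1] left to right. On the first byte whose function byte
   is nonzero, store its offset in *pos and return the function byte.
   If no such byte exists, store len in *pos and return FN_SKIP. */
static unsigned char scan_for_delim(const unsigned char *s, size_t len,
                                    const unsigned char table[256],
                                    size_t *pos)
{
    assert(s != NULL && pos != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char fn = table[s[i]];
        if (fn != FN_SKIP) {
            *pos = i;
            return fn;       /* tells the caller which delimiter it was */
        }
    }
    *pos = len;
    return FN_SKIP;
}
```

The returned function byte can index a branch table or drive a state transition directly, which is what makes further testing of the stopping character unnecessary.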
Why use a TRT, which requires a table of, usually, 256 bytes, when something
more compact can be put together for the special case of 16 blanks (or nuls)?
The answer can be boiled down to a single word, reusability.
Efficiency is a vexed question, and I cannot consider it here in any really
satisfactory way. No one wants to write inefficient or inelegant code, but
preoccupation with the relative efficiencies of alternative single
instructions, both of which consume only nanoseconds, is a mug's game. Worse,
the avoidance of single millicode-based instructions, their replacement by a
notionally and temporarily more efficient sequence of hardware-based ones, is,
I think, perverse.
Enough! I have already offended just about everyone.
John Gilmore Ashland, MA 01721-1817 USA