On 14 March 2018 at 11:51, Charles Mills <[email protected]> wrote:
> 1.       Is there a machine instruction that will find one string within
> another? That given "Now is the time" and "is" would find the "is" and
> return a pointer to it? A machine instruction analog of Rexx POS?

I am almost certain that there is not.

> 2.       Searching the PoOp for such an instruction led me to CUSE. It does
> not seem that CUSE could be used for this - is that correct? If I am reading
> CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would
> return the position of "is". Is my reading correct? What would that be good
> for? What would be a reasonable real-world use?

The usefulness or otherwise of CUSE has been discussed here a couple
of times over the years. IIRC DB2 (sorry, Db2) rows were mentioned.
Certainly CUSE by itself is not a Rexx-style POS. I found the
description in the POP unclear, and so wrote and tested some little
programs to see what it does. It finds a substring match *at the same
offset* in both strings, so it cannot do what is needed for POS.
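
As a rough illustration of that "same offset" behaviour, here is a small
Python model. It is only a sketch of what is described above, not the full
instruction definition: the function name is made up, and padding, unequal
operand lengths beyond the shorter one, and partial completion are all ignored.

```python
def common_substring_offset(a: bytes, b: bytes, n: int) -> int:
    """Hypothetical model of the CUSE idea: return the first offset at which
    a and b hold the same n consecutive bytes, or -1 if there is none."""
    if n < 1:
        raise ValueError("substring length must be at least 1")
    run = 0  # length of the current run of equal bytes at equal offsets
    for i in range(min(len(a), len(b))):
        run = run + 1 if a[i] == b[i] else 0
        if run == n:
            return i - n + 1  # offset where the matching run started
    return -1


if __name__ == "__main__":
    # " is" sits at offset 3 in both strings, so the model reports 3.
    print(common_substring_offset(b"Now is the time", b"All is well", 3))
    # "is" occurs in the first string but not at the same offset: no match.
    print(common_substring_offset(b"Now is the time", b"is", 2))
```

The second call shows why this cannot serve as POS: the needle is present,
but not at the offset where the comparison looks for it.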

To implement POS I ended up using SRST to find the first character,
followed by CLCL for the rest, and it turns out that the register
setup is not too bad for this use if you choose carefully. I couldn't
see a useful way to exploit CUSE after SRST for this, but it's quite
possible I missed the trick. And it's certainly possible that an
old-school approach using CLC in a loop might be faster than my two
step process.

To do POS my way you have to remember a couple of (obvious but easy
to forget) things:

- if your SRST finds the first character but the CLCL doesn't match the
whole string, then you have to loop back and try for the first
character again. For example, if you searched the string "Now is the
time it seems" for "it", you would hit the "i" in "is" and then the "i"
in "time" before reaching the real match.

- you have to keep track of the length remaining in the string you are
searching in when you do the CLCL. Unlike SRST, which uses an
end-pointer that should stay valid, CLCL needs a current length for
each operand. Somehow I wanted CLCL's padding to be useful here, but
it isn't.
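
The two points above can be sketched in Python, purely as a model of the
control flow and not as the actual assembler. The function name pos is
hypothetical; bytes.find stands in for SRST and a slice comparison stands in
for CLCL. The 1-based result and the 0 for "not found" follow Rexx POS.

```python
def pos(needle: bytes, haystack: bytes) -> int:
    """Model of the SRST + CLCL loop: 1-based position of needle in
    haystack, or 0 if it is absent (or needle is empty)."""
    if not needle or len(needle) > len(haystack):
        return 0
    start = 0
    end = len(haystack)  # plays the role of SRST's end pointer; never changes
    while True:
        # SRST stand-in: look for the first character of the needle.
        hit = haystack.find(needle[:1], start, end)
        if hit == -1:
            return 0
        # CLCL needs a current length, so recompute what is left from the hit.
        remaining = end - hit
        if remaining < len(needle):
            return 0  # not enough bytes left for a full match
        # CLCL stand-in: compare the rest of the needle at this position.
        if haystack[hit:hit + len(needle)] == needle:
            return hit + 1
        # First character matched but the whole needle did not: loop back.
        start = hit + 1


if __name__ == "__main__":
    print(pos(b"it", b"Now is the time it seems"))  # 17
    print(pos(b"is", b"Now is the time"))           # 5
```

In the first call the loop goes round three times: the "i" in "is" and the
"i" in "time" both fail the full comparison before the match at position 17.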

And a more subtle one:
- SRST does not change the pointer to the string (the second register)
unless you get CC3. For this instruction CC3 can occur only if the
string is > 256 bytes. So if you know in advance that your string will
never be that long, you may be able to avoid having to fix up that register if you
are going to need it to calculate a length for the CLCL. OTOH you may
be setting up a code maintainer for a nightmare if you assume this and
conditions change later. It's also almost impossible to keep track in
one's head of which of the "CPU determined number of bytes"
instructions guarantee at least how many bytes. Some are 256, some are
8, and many are 1.
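
The CC3 case can be modelled in Python as well. This is a sketch under stated
assumptions: srst_model and search_char are made-up names, the returned tuple
stands in for the condition code and the two registers, and the fixed chunk of
256 is only the guaranteed minimum used for illustration (a real CPU may
process more before giving up).

```python
def srst_model(buf: bytes, ch: int, start: int, end: int, chunk: int = 256):
    """Hypothetical model of one SRST execution.
    Returns (cc, found_at, resume_at)."""
    stop = min(end, start + chunk)
    hit = buf.find(bytes([ch]), start, stop)
    if hit != -1:
        return 1, hit, start   # CC1: found; string pointer left unchanged
    if stop == end:
        return 2, None, start  # CC2: end reached; string pointer unchanged
    return 3, None, stop       # CC3: partial; string pointer was advanced


def search_char(buf: bytes, ch: int, start: int, end: int, chunk: int = 256):
    """Re-drive the model on CC3 until it finds ch or hits the end."""
    pos = start
    while True:
        cc, hit, pos = srst_model(buf, ch, pos, end, chunk)
        if cc != 3:
            return hit  # an index on CC1, None on CC2


if __name__ == "__main__":
    long_buf = b"x" * 600 + b"i"
    print(srst_model(long_buf, ord("i"), 0, len(long_buf)))   # (3, None, 256)
    print(search_char(long_buf, ord("i"), 0, len(long_buf)))  # 600
```

Code that reuses the original start value after the search is only safe in
this model when CC3 never happens, which is exactly the assumption that a
later change in string length would silently break.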

I'm not a Compiler Guy, but I am fascinated that this kind of subtlety
can be encoded in instruction description tables that compiler code
generation can use.

Tony H.
