On 14 March 2018 at 11:51, Charles Mills <[email protected]> wrote:

> 1. Is there a machine instruction that will find one string within
> another? That given "Now is the time" and "is" would find the "is" and
> return a pointer to it? A machine instruction analog of Rexx POS?

I am almost certain that there is not.

> 2. Searching the PoOp for such an instruction led me to CUSE. It does
> not seem that CUSE could be used for this - is that correct? If I am reading
> CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would
> return the position of "is". Is my reading correct? What would that be good
> for? What would be a reasonable real-world use?

The usefulness or otherwise of CUSE has been discussed here a couple of times over the years. IIRC DB2 (sorry, Db2) rows were mentioned. Certainly CUSE by itself is not a Rexx-style POS. I found the description in the POP unclear, and so wrote and tested some little programs to see what it does. It finds a substring match *at the same offset* in both strings, so it cannot do what is needed for POS.

To implement POS I ended up using SRST to find the first character, followed by CLCL for the rest, and it turns out that the register setup is not too bad for this use if you choose carefully. I couldn't see a useful way to exploit CUSE after SRST for this, but it's quite possible I missed the trick. And it's certainly possible that an old-school approach using CLC in a loop might be faster than my two-step process.

To do POS my way you have to remember a couple of (obvious but easy to forget) things:

- if your SRST finds the first character but the CLCL doesn't match the whole string, then you have to loop back and try for the first character again. For example, if you searched a string "Now is the time it seems" for "it".

- you have to keep track of the length remaining in the string you are searching when you do the CLCL. Unlike SRST, which uses an end-pointer that should stay valid, CLCL needs a current length for each operand. Somehow I wanted CLCL's padding to be useful here, but it isn't.

And a more subtle one:

- SRST does not change the pointer to the string (the second register) unless you get CC3. For this instruction CC3 can occur only if the string is longer than 256 bytes. So if you know in advance that your strings are never longer than that, you may be able to avoid having to fix up that register if you are going to need it to calculate a length for the CLCL. OTOH you may be setting up a code maintainer for a nightmare if you assume this and conditions change later.

It's also almost impossible to keep track in one's head of which of the "CPU determined number of bytes" instructions guarantee at least how many bytes. Some are 256, some are 8, and many are 1. I'm not a Compiler Guy, but I am fascinated that this kind of subtlety can be encoded in instruction description tables that compiler code generation can use.

Tony H.
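
For anyone who wants to see the shape of the two-step search without the register setup, here is a minimal C sketch of the logic described above. It is only a model, not the assembler code: `memchr` stands in for SRST and `memcmp` for CLCL, and the function name `pos_two_step` and its parameters are made up for illustration. It shows the two things that are easy to forget: looping back after a false start, and recomputing the remaining length before each compare.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the 1-based position of needle in hay,
   or 0 if it is absent or empty (the Rexx POS convention). */
size_t pos_two_step(const char *hay, size_t hay_len,
                    const char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > hay_len)
        return 0;

    const char *end = hay + hay_len;  /* end pointer stays valid throughout */
    const char *cur = hay;

    while (cur < end) {
        /* Step 1 (stand-in for SRST): scan for the first character. */
        const char *hit = memchr(cur, needle[0], (size_t)(end - cur));
        if (hit == NULL)
            return 0;

        /* The compare needs a current length, so recompute what is left. */
        size_t remaining = (size_t)(end - hit);
        if (remaining < needle_len)
            return 0;                 /* too few bytes left for a match */

        /* Step 2 (stand-in for CLCL): compare the whole needle. */
        if (memcmp(hit, needle, needle_len) == 0)
            return (size_t)(hit - hay) + 1;

        /* False start: loop back and look for the first character again. */
        cur = hit + 1;
    }
    return 0;
}
```

Searching "Now is the time it seems" for "it" takes three trips round the loop: the "i" in "is" and the "i" in "time" are both false starts before the real match is found.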

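The "same offset" behaviour of CUSE can be modelled in a few lines of C as well. This is a simplified sketch based only on the description above: it ignores the pad byte (the shorter string simply ends the search) and any CPU-determined limit, treating a zero length as an immediate match is a choice made for the sketch, and the name `same_offset_match` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical model: returns the 0-based offset of the first run of
   sub_len bytes that are equal at the same offset in a and b, or -1. */
long same_offset_match(const char *a, size_t a_len,
                       const char *b, size_t b_len, size_t sub_len)
{
    size_t n = a_len < b_len ? a_len : b_len;  /* no padding in this model */
    size_t run = 0;                            /* current count of equal bytes */

    if (sub_len == 0)
        return 0;                              /* sketch choice: trivial match */

    for (size_t i = 0; i < n; i++) {
        if (a[i] == b[i]) {
            if (++run == sub_len)
                return (long)(i + 1 - sub_len); /* start of the equal run */
        } else {
            run = 0;                           /* mismatch resets the run */
        }
    }
    return -1;
}
```

With "Now is the time", "All is well" and a length of 3, this model reports offset 3, the start of " is" in both strings. Given "Now is the time" and just "is", it finds nothing, because "is" is not at the same offset in both operands, which is why it is no help for POS.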