I want to publicly thank Retired Mainframer for pointing me in the right 
direction.  The problem was that the EBCDIC-oriented regmatch_t structure is 
defined like this:
    typedef struct {         /* substring locations - from regexec() */         
          __off_t   rm_so;   /* offset of substring                  */         
          mbstate_t rm_ss;   /* Shift state at start of substring    */         
          __off_t   rm_eo;   /* offset of first char after substring */         
          mbstate_t rm_es;   /* Shift state at end of substring      */         
    } regmatch_t;

__off_t is clearly defined earlier as a 32-bit entity, but mbstate_t is defined 
as short, which I assumed to be a 16-bit entity.  Either short is NOT 16 bits but 
a 32-bit entity as well, or the C compiler leaves 2 bytes of padding in order to 
keep rm_eo on the correct integral boundary.  Either way, in REAL LIFE there are 
32 bits between the end of rm_so and the start of rm_eo.  I did not bother to 
investigate much further but translated the structure as:
       10  :PREFIX:-regmatch-t.
           15 :PREFIX:-rm-so      PIC       S9(9) COMP-5.
      * offset of substring
           15 :PREFIX:-rm-ss      PIC       S9(4) COMP-5.
      * Shift state at start of substring
           15 FILLER              PIC XX.
      * The filler was added because, although the C header declares
      * rm_ss and rm_es as short (i.e. S9(4) COMP-5), each one
      * occupies 4 bytes, either because short is not short after
      * all or because of integral boundary alignment.
           15 :PREFIX:-rm-eo      PIC       S9(9) COMP-5.
      * offset of first char after substring
           15 :PREFIX:-rm-es      PIC       S9(4) COMP-5.
      * Shift state at end of substring
           15 FILLER              PIC XX.

compensating for the additional two bytes after each pair.  After all, we 
always use EBCDIC!

I've published the results of my work as FILE928 on the CBTTAPE, with a demo 
program and an explanation of how to extract a captured substring (in addition 
to the regular match/no-match capability).
I will also publish it on my own website later.

The reason for this work is that the only serious argument I've ever heard for 
not using my port of the PCRE library was that "management would never allow 
use of Open Source".  One can freely take my copybooks, which only describe 
structures, and use them with the bona fide IBM-supplied functions.  Please feel 
free to copy the definitions only and avoid the "Open Source" mumbo jumbo if you 
do not intend to distribute your poor COBOL program!

ZA


----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
