I want to publicly thank Retired Mainframer for pointing me in the right
direction. The problem was that the EBCDIC-oriented regmatch_t structure is
defined like this:
typedef struct {        /* substring locations - from regexec() */
    __off_t   rm_so;    /* offset of substring */
    mbstate_t rm_ss;    /* Shift state at start of substring */
    __off_t   rm_eo;    /* offset of first char after substring */
    mbstate_t rm_es;    /* Shift state at end of substring */
} regmatch_t;
__off_t is clearly defined earlier as a 32-bit entity, but mbstate_t is defined
as short, which I assumed to be a 16-bit entity. Either short is NOT 16 bits but
a 32-bit entity as well, or the C compiler leaves 2 bytes of padding after it
to keep the next field on its proper integral boundary. Either way, in REAL
LIFE there are 32 bits between rm_so and rm_eo. I did not bother to investigate
much further and translated the structure as:
           10  :PREFIX:-regmatch-t.
               15  :PREFIX:-rm-so  PIC S9(9) COMP-5.
      *            offset of substring
               15  :PREFIX:-rm-ss  PIC S9(4) COMP-5.
      *            shift state at start of substring
               15  FILLER          PIC XX.
      *            The filler was added because, although the C header
      *            declares rm_ss and rm_es as short (i.e. S9(4)
      *            COMP-5), each one actually takes up 4 bytes, either
      *            because short is not short after all or because of
      *            integral boundary alignment.
               15  :PREFIX:-rm-eo  PIC S9(9) COMP-5.
      *            offset of first char after substring
               15  :PREFIX:-rm-es  PIC S9(4) COMP-5.
      *            shift state at end of substring
               15  FILLER          PIC XX.
compensating for the additional two bytes after each pair. After all, we
always use EBCDIC!
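The padding theory can be checked on any C compiler with a minimal sketch like the one below. The demo_* names are hypothetical stand-ins, not the real z/OS header definitions: the sketch simply assumes a 32-bit offset type and a plain short for the shift state, as described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for __off_t and mbstate_t (assumptions, not the z/OS header). */
typedef int32_t demo_off_t;
typedef short   demo_mbstate_t;

typedef struct {
    demo_off_t     rm_so;   /* offset of substring */
    demo_mbstate_t rm_ss;   /* shift state at start of substring */
    demo_off_t     rm_eo;   /* offset of first char after substring */
    demo_mbstate_t rm_es;   /* shift state at end of substring */
} demo_regmatch_t;

/* Number of padding bytes the compiler inserted between rm_ss and rm_eo. */
size_t demo_padding_after_ss(void)
{
    return offsetof(demo_regmatch_t, rm_eo)
         - (offsetof(demo_regmatch_t, rm_ss) + sizeof(demo_mbstate_t));
}

/* Prints every field offset and the total size of the structure. */
void demo_print_layout(void)
{
    printf("sizeof(short) = %zu\n", sizeof(short));
    printf("rm_so at %zu\n", offsetof(demo_regmatch_t, rm_so));
    printf("rm_ss at %zu\n", offsetof(demo_regmatch_t, rm_ss));
    printf("rm_eo at %zu\n", offsetof(demo_regmatch_t, rm_eo));
    printf("rm_es at %zu\n", offsetof(demo_regmatch_t, rm_es));
    printf("padding after rm_ss = %zu\n", demo_padding_after_ss());
    printf("sizeof(demo_regmatch_t) = %zu\n", sizeof(demo_regmatch_t));
}
```

Calling demo_print_layout() from main shows which explanation applies. If short is 2 bytes and the 32-bit field must sit on a 4-byte boundary, the padding is 2 bytes after each short and the whole structure is 16 bytes, which is exactly what the two FILLER PIC XX items in the copybook account for.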
I've published the results of my work as FILE928 in the CBTTAPE with a demo
program and an explanation of how to extract a captured substring (in addition
to the regular match/no-match capability).
I will also publish it on my own website later.
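For readers who think in C rather than COBOL, the substring extraction mentioned above comes down to the following. This is only a sketch using the standard POSIX regcomp/regexec/regfree calls, not the FILE928 demo program; the helper name extract_group and the limit of 10 groups are assumptions made for illustration.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_GROUPS 10   /* arbitrary limit chosen for this sketch */

/*
 * Copies capture group `group` (0 = whole match) of the first match of
 * `pattern` in `text` into `out`. Returns 0 on success, -1 if the pattern
 * does not compile, nothing matches, the group did not take part in the
 * match, or `out` is too small.
 */
int extract_group(const char *pattern, const char *text, size_t group,
                  char *out, size_t out_size)
{
    regex_t    re;
    regmatch_t m[DEMO_MAX_GROUPS];
    int        rc = -1;

    if (group >= DEMO_MAX_GROUPS || out_size == 0)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, text, DEMO_MAX_GROUPS, m, 0) == 0
        && m[group].rm_so != -1) {
        /* rm_so is the offset of the substring, rm_eo the offset of the
           first character after it, so the length is their difference. */
        size_t len = (size_t)(m[group].rm_eo - m[group].rm_so);
        if (len < out_size) {
            memcpy(out, text + m[group].rm_so, len);
            out[len] = '\0';
            rc = 0;
        }
    }
    regfree(&re);
    return rc;
}
```

With the pattern "([0-9]+)-([0-9]+)" and the text "order 123-456 shipped", asking for group 2 copies "456" into the output buffer. The COBOL equivalent does the same arithmetic on rm-so and rm-eo, typically with reference modification on the subject string.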
The reason for this work is that the only serious argument I've ever heard for
not using my port of the PCRE library was that "management would never allow
use of Open Source". One can freely take my copybooks, which only describe
structures, and use them with the bona fide IBM-supplied functions. Please feel
free to copy the definition only and avoid the "Open Source" mumbo jumbo if you
do not intend to distribute your poor COBOL program!
ZA