Hi all,

When testing a bit in an I/O or peripheral register, or in a global variable, I noticed that the generated code rarely makes use of the STM8's BTJT/BTJF instructions and instead does LD+BCP+JRxx.
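As a minimal sketch of the kind of C source in question (the names status_reg, FLAG_READY and FLAG_ERROR are made up for illustration, and a plain volatile global stands in for a real memory-mapped register so that the example also builds on a host compiler):

```c
#include <stdint.h>

/* Hypothetical stand-in for a peripheral register. On a real STM8 target
   this would live at a fixed address; a volatile global is assumed here
   so the sketch stays self-contained. */
volatile uint8_t status_reg;

#define FLAG_READY 0x01u  /* bit 0 */
#define FLAG_ERROR 0x80u  /* bit 7 */

/* Single-bit test on bit 0: the kind of test the rules below target. */
uint8_t is_ready(void)
{
    if (status_reg & FLAG_READY)
        return 1;
    return 0;
}

/* Single-bit test on bit 7. */
uint8_t has_error(void)
{
    if (status_reg & FLAG_ERROR)
        return 1;
    return 0;
}
```

Each function tests exactly one bit of a byte at a fixed address, which is the case BTJT/BTJF can handle directly without loading A first.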

I have drafted some peephole rules which I think will optimise these operations. They seem to work in the limited testing I have done with simple cases. Can anyone please review and let me know if I have missed anything, or if anything is incorrect?

Or would it be more appropriate to make this kind of change in code generation in the first place?

// -----------------------------------------------------------------------------
// REPLACE LD + SRL + JRNC WITH BTJF
// 3/4 cycles, 6 bytes versus 2/3 cycles, 5 bytes
// -----------------------------------------------------------------------------

replace restart {
    ld    a, %1
    srl    a
    jrnc    %2
} by {
    ; peephole replaced load + shift-right + jump-if-no-carry with btjf
    btjf    %1, #0, %2
} if notUsed('a'), notUsed('n'), notUsed('z'), notUsed('c'), operandsLiteral(%1), immdInRange(0 65535 '+' 0 %1 %3)

// -----------------------------------------------------------------------------
// REPLACE LD + BCP + JREQ WITH BTJF
// 3/4 cycles, 7 bytes versus 2/3 cycles, 5 bytes
// -----------------------------------------------------------------------------

replace restart {
    ld    a, %1
    bcp    a, #0x01
    jreq    %2
} by {
    ; peephole replaced load + bit-compare + jump-if-zero with btjf
    btjf    %1, #0, %2
} if notUsed('a'), notUsed('n'), notUsed('z'), notUsed('c'), operandsLiteral(%1), immdInRange(0 65535 '+' 0 %1 %3)

// [...repeat for bits 1 through 6...]

replace restart {
    ld    a, %1
    bcp    a, #0x80
    jreq    %2
} by {
    ; peephole replaced load + bit-compare + jump-if-zero with btjf
    btjf    %1, #7, %2
} if notUsed('a'), notUsed('n'), notUsed('z'), notUsed('c'), operandsLiteral(%1), immdInRange(0 65535 '+' 0 %1 %3)

// -----------------------------------------------------------------------------
// REPLACE LD + SRL + JRC WITH BTJT
// 3/4 cycles, 6 bytes versus 2/3 cycles, 5 bytes
// -----------------------------------------------------------------------------

replace restart {
    ld    a, %1
    srl    a
    jrc    %2
} by {
    ; peephole replaced load + shift-right + jump-if-carry with btjt
    btjt    %1, #0, %2
} if notUsed('a'), notUsed('n'), notUsed('z'), notUsed('c'), operandsLiteral(%1), immdInRange(0 65535 '+' 0 %1 %3)

// -----------------------------------------------------------------------------
// REPLACE LD + BCP + JRNE WITH BTJT
// 3/4 cycles, 7 bytes versus 2/3 cycles, 5 bytes
// -----------------------------------------------------------------------------

replace restart {
    ld    a, %1
    bcp    a, #0x01
    jrne    %2
} by {
    ; peephole replaced load + bit-compare + jump-if-non-zero with btjt
    btjt    %1, #0, %2
} if notUsed('a'), notUsed('n'), notUsed('z'), notUsed('c'), operandsLiteral(%1), immdInRange(0 65535 '+' 0 %1 %3)

// [...repeat for bits 1 through 6...]

replace restart {
    ld    a, %1
    bcp    a, #0x80
    jrne    %2
} by {
    ; peephole replaced load + bit-compare + jump-if-non-zero with btjt
    btjt    %1, #7, %2
} if notUsed('a'), notUsed('n'), notUsed('z'), notUsed('c'), operandsLiteral(%1), immdInRange(0 65535 '+' 0 %1 %3)

// -----------------------------------------------------------------------------
// REPLACE BTJT + JRA WITH BTJF
// 3/4 cycles, 7 bytes versus 2/3 cycles, 5 bytes
// -----------------------------------------------------------------------------

replace restart {
    btjt    %1, %2, %3
    jra    %4
%3:
} by {
    ; peephole removed jra by using inverse bit-test-jump logic
    btjf    %1, %2, %4
} if labelRefCountChange(%3 -1)

I could not figure out a way to transform a bit-mask value back to a bit number, so I have had to repeat the BCP rules for each bit. The existing bset/bres/bcpl rules seem to do the same, though, so maybe it's not actually possible to make the transformation within a rule.
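For clarity, the mapping a single rule would need is sketched below in C. This only illustrates the arithmetic; mask_to_bit is a hypothetical helper, not an existing SDCC function or peephole condition.

```c
#include <stdint.h>

/* Hypothetical helper: returns the bit number (0..7) when mask has
   exactly one bit set, or -1 when it is zero or has several bits set. */
int mask_to_bit(uint8_t mask)
{
    int bit = 0;

    /* Reject zero and any mask with more than one bit set. */
    if (mask == 0 || (mask & (mask - 1)) != 0)
        return -1;

    /* Shift right until the single set bit reaches position 0. */
    while ((mask & 1u) == 0) {
        mask >>= 1;
        bit++;
    }
    return bit;
}
```

With something like this available to a rule, #0x01 would map to #0 and #0x80 to #7, and a mask such as #0x03 would be rejected because it does not name a single bit.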

One other thing to note: I figured it's not worth optimising ld+jrmi/jrpl (used when testing bit 7) to btjt/btjf, because both forms take the same number of cycles and bytes.

Regards,
Basil Hussain


_______________________________________________
Sdcc-user mailing list
Sdcc-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sdcc-user
