Re: looking for faster Ideas...

Craig Bennett Tue, 27 Jan 2004 16:22:52 -0800

George,

I don't know if this will help you, but part of the problem with a CASE
statement is that every condition is tested in turn until one matches, and
EVERY condition is tested when nothing matches. If you don't have a large
number of items to remove, most of that testing is wasted.


When I need to parse some data and need to do it fast (and I don't care
that I may end up writing a very long, tedious program; sometimes I even
write a program to build the final program), I find that a state machine
model with computed gosubs based on ASCII character numbers can be quicker.

I started writing the code below, but then remembered I had to work :(

The basic idea is to test only the characters you need, one by one, where
each letter in a match gets its own internal subroutine, e.g.:

LOOP WHILE POS LE (DATALEN - MATCHLEN) DO
    * Just match A-Z and we are only looking for names starting with A and T
    * Under UV, BYTEVAL(MYDATA, POS) is MUCH quicker than SEQ.
    CHARCODE = SEQ(MYDATA[POS, 1])
    * A is character 65, so CHARCODE - 63 picks the second label for A.
    ON CHARCODE - 63 GOSUB NOMATCH,
                                                        FIRSTCHARA,
                                                        NOMATCH,    ;* B
                                                        NOMATCH,    ;* C
                                                        NOMATCH,    ;* D
                                                        NOMATCH,    ;* E
                                                        NOMATCH,    ;* F
                                                        NOMATCH,    ;* G
                                                        NOMATCH,    ;* H
                                                        NOMATCH,    ;* I
                                                        NOMATCH,    ;* J
                                                        NOMATCH,    ;* K
                                                        NOMATCH,    ;* L
                                                        NOMATCH,    ;* M
                                                        NOMATCH,    ;* N
                                                        NOMATCH,    ;* O
                                                        NOMATCH,    ;* P
                                                        NOMATCH,    ;* Q
                                                        NOMATCH,    ;* R
                                                        NOMATCH,    ;* S
                                                        FIRSTCHART,
                                                        NOMATCH,    ;* U
                                                        NOMATCH,    ;* V
                                                        NOMATCH,    ;* W
                                                        NOMATCH,    ;* X
                                                        NOMATCH,    ;* Y
                                                        NOMATCH,    ;* Z
                                                        NOMATCH
    POS += 1    ;* advance, otherwise the loop never ends

REPEAT

NOMATCH:
    * Set a flag to false
    MATCH.NAME = 0
RETURN

FIRSTCHARA:
    POS += 1
    CHARCODE = SEQ(MYDATA[POS, 1])
    ON CHARCODE - 63 GOSUB NOMATCH,
                                                        NOMATCH,    ;* A
                                                        SECONDCHARB,
                                                        NOMATCH,    ;* C
                                                        NOMATCH,    ;* D
                                                        NOMATCH,    ;* E
                                                        NOMATCH,    ;* F
                                                        NOMATCH,    ;* G
                                                        NOMATCH,    ;* H
                                                        NOMATCH,    ;* I
                                                        NOMATCH,    ;* J
                                                        NOMATCH,    ;* K
                                                        NOMATCH,    ;* L
                                                        NOMATCH,    ;* M
                                                        NOMATCH,    ;* N
                                                        NOMATCH,    ;* O
                                                        NOMATCH,    ;* P
                                                        NOMATCH,    ;* Q
                                                        NOMATCH,    ;* R
                                                        NOMATCH,    ;* S
                                                        SECONDCHART,
                                                        NOMATCH,    ;* U
                                                        NOMATCH,    ;* V
                                                        NOMATCH,    ;* W
                                                        NOMATCH,    ;* X
                                                        NOMATCH,    ;* Y
                                                        NOMATCH,    ;* Z
                                                        NOMATCH
RETURN

etc
etc
etc
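
For what it's worth, here is a cut-down sketch of the same idea that should
run as it stands. It only counts occurrences of "AB", so the tables shrink
to three entries (anything else, A, B). The sample data, HITS, GETINDEX and
FOUNDAB are made up for the illustration. Unlike the fragment above, it
peeks at the next character instead of moving POS inside the subroutine, it
lets the loop run up to the last possible start position, and it clamps the
index itself rather than relying on how ON ... GOSUB treats an out-of-range
value. A is character 65, so CHARCODE - 63 gives 2 for A and 3 for B.

MYDATA = "XABYABZ"
DATALEN = LEN(MYDATA)
MATCHLEN = 2
HITS = 0
POS = 1
LOOP WHILE POS LE (DATALEN - MATCHLEN + 1) DO
    CHARCODE = SEQ(MYDATA[POS, 1])
    GOSUB GETINDEX
    ON IDX GOSUB NOMATCH, FIRSTCHARA, NOMATCH
    POS += 1
REPEAT
PRINT HITS    ;* 2 for the sample data
STOP

GETINDEX:
    * A (65) gives 2, B (66) gives 3, everything else is forced to 1
    IDX = CHARCODE - 63
    IF IDX LT 1 OR IDX GT 3 THEN IDX = 1
RETURN

NOMATCH:
    MATCH.NAME = 0
RETURN

FIRSTCHARA:
    * Saw an A: look at the following character without consuming it
    CHARCODE = SEQ(MYDATA[POS + 1, 1])
    GOSUB GETINDEX
    ON IDX GOSUB NOMATCH, NOMATCH, FOUNDAB
RETURN

FOUNDAB:
    MATCH.NAME = 1
    HITS += 1
RETURN

The STOP matters: without it the main program falls through into the
subroutines and hits a RETURN with no GOSUB behind it.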

Tedious to write, but a program generator that builds it (perhaps into an
include file) is easy to write, and then your testing is faster.
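
As a rough sketch of what such a generator might look like, the program
below builds the first-state table as a dynamic array of source lines and
prints it. WANTED, TARGETS, SRC and NLINES are names invented for this
sketch, and writing the lines into a program file instead of printing them
is left out because the file names would be pure guesswork. The offset
assumes A (character 65) should land on the second label.

WANTED = "AT"                                   ;* letters that start a match
TARGETS = "FIRSTCHARA" : @FM : "FIRSTCHART"     ;* one label per letter in WANTED
SRC = "    ON CHARCODE - 63 GOSUB NOMATCH,"
FOR CODE = 65 TO 90                             ;* A to Z
    LETTER = CHAR(CODE)
    P = INDEX(WANTED, LETTER, 1)
    IF P THEN
        * This letter continues a match: emit its subroutine label
        SRC<-1> = "        " : TARGETS<P> : ","
    END ELSE
        SRC<-1> = "        NOMATCH,    ;* " : LETTER
    END
NEXT CODE
SRC<-1> = "        NOMATCH"                     ;* catch-all for codes above Z
NLINES = DCOUNT(SRC, @FM)
FOR I = 1 TO NLINES
    PRINT SRC<I>
NEXT I

The same loop, driven by a different WANTED/TARGETS pair, would produce the
table for each later state.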

Use three sets of (many) computed gosubs to match for name, then zip, then
address.


No benchmarks, no promises.


Craig

_______________________________________________
u2-users mailing list
[EMAIL PROTECTED]
http://www.oliver.com/mailman/listinfo/u2-users
