On Sunday, 30 July 2017 at 01:15:50 UTC, Stefan Koch wrote:
[ ... ]

Shortly after posting this I came up with a more effective variant of the parseBf which reduces the generate bytecode from around 650 instructions to around 430 instructions.

const(RepeatedToken[]) parseBf(const string input) pure
{
    uint pos;
    RepeatedToken[] result;
    result.length = input.length + 2;
// the maximal number of diffrent tokens is equal to the chars in the input
    // plus the begin and end token

    result[0] = Token(BFTokenEnum.ProgrammBegin);
    uint resultLen = 0;

    while (pos < input.length)
    {
        uint lastToken = (result[resultLen] >> 24);
        uint thisToken = BFTokenEnum.ProgrammEnd;
        final switch (input[pos++]) with (BFTokenEnum)
        {
        case '>':
            thisToken = IncPtr;
            break;
        case '<':
            thisToken = DecPtr;
            break;
        case '+':
            thisToken = IncVal;
            break;
        case '-':
            thisToken = DecVal;
            break;
        case '.':
            thisToken = OutputVal;
            break;
        case ',':
            thisToken = InputVal;
            break;
        case '[':
            thisToken = LoopBegin;
            break;
        case ']':
            thisToken = LoopEnd;
            break;
        case '\r':
            pos++;
            goto case '\n';
        case '\n':
            //TODO handle lines and proper position information;
            break;
        }


        if (lastToken == thisToken)
        {
            result[resultLen]++;
        }
        else if (thisToken != BFTokenEnum.ProgrammEnd)
        {
result[++resultLen] = Token(cast(BFTokenEnum)thisToken);
        }

    }

    result[++resultLen] = Token(BFTokenEnum.ProgrammEnd);
    return result[0 .. resultLen + 1];
}

In theory it would be possible to get rid of the switch altogether ;)
And compress the function ever more.
But readability would suffer and we'd rely on the enum-members to correspond to the tokens, which maybe undesirable.

Reply via email to