Further to my earlier ramblings and worries about binary operators
and overloading etc....

Here is a proposal for the numerical part of the SV API that
provides a framework for arbitrary precision arithmetic, while still
allowing standard ints and floats to be handled efficiently.
Some of the ideas here could be borrowed to handle the string part of
the API, especially regarding unicode and its various representations.
It is also based on some of my early muddled thinking on how operator
overloading should in general be handled at this level and may provide
further discussion on that topic.

If nothing else, since this document shows examples both for the body
of a binary op, and for the code that calls it, it may add some focus
to discussions of the meaning of the functions in the API, even if it
is only to show how it shouldn't be done!

The summary of this mini API looks like:

p = sv->precision()

i = sv->get_int(sv)
    sv->get_intn(sv,buf,n)
r = sv->get_real(sv)
    sv->get_realn(sv,buf,plen,elen)

sv->set_int(sv,i)
sv->set_intn(sv,buf,n)
sv->set_real(sv,r)
sv->set_realn(sv,buf,plen,elen)
--or possibly--
sv = vtable->newSViv(7)
sv = vtable->newSVnv(7.0)

[ There is also a need for
sv1->overwrite(sv1,sv2) or similar, which is needed for assignment
]

Some guiding principles:

1) binary operators should in general always return
an SV rather than just an int or char*, say.
Otherwise ops won't have any way of propagating the 'specialness' of their
operands (eg the sum of two complex numbers should also be complex).

2) in general we want the result of a binop to be of the
same type as the 'biggest' of its operands, eg

$real + $int                    is a real
$int + $real                    is a real
$bigint + $int                  is a bigint
$real + $complex                is a complex
$complex + $bigcomplex          is a bigcomplex

[the last example above won't calculate a sensible result since the API
in this proposal only provides limited support for multi-dimensional
scalars such as complex numbers. In fact $complex + $bigcomplex is
probably equivalent to $bigcomplex + modulus($complex) ]

3) an operation on 2 numerical SVs of the same type should be efficient

4) an operation on 2 numerical SVs of different types should in general work,
but need not in general be efficient.


How this all works

1) Precision

The 'bigness' of an operand is determined by a constant integer property
per SV class, which for lack of a better name, I call precision.
sv->precision() returns the precision associated with a particular class
(and all variables of the same class have the same precision value).

We need some formula to calculate precision per class.
One approach would be the following:
Let d be the dimension of the variable (eg complex numbers would be 2)
Let t be:
        0 for integer
        1 for floating point
let b be the bits of storage used (eg 32 for an INT32, 64 for a double)
then define the precision by ((d<<24) + (t<<23) + b) or similar.
Classes which allow arbitrary dimension or precision should set d or t
to its maximum value.
This means that complex numbers always win out over plain numbers,
floats are better than ints, and if they are otherwise the same,
the bits of storage used wins out.
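
As a rough illustration of the formula above, here is a minimal sketch in C.
The helper name sv_precision(), the PREC_* constants and the exact field
widths (8 bits for d, 1 bit for t, 23 bits for b) are my assumptions for the
example, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for the 't' field. */
enum { PREC_INT = 0, PREC_FLOAT = 1 };

/* Hypothetical helper: packs dimension (d), float flag (t) and storage
 * bits (b) into one integer that can be compared with plain '>='.
 * The shifts are parenthesised because in C '+' binds tighter than '<<'. */
static uint32_t sv_precision(uint32_t d, uint32_t t, uint32_t b)
{
    /* b is masked to 23 bits so that it cannot spill into the t field */
    return (d << 24) + (t << 23) + (b & 0x7FFFFFu);
}
```

With these widths a 32-bit float outranks a 64-bit int, and anything of
dimension 2 outranks anything of dimension 1, which is the ordering
described above.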

With this, the operator to subtract 2 elements on the stack might look
something like this:


pp_subtract {
        SV *sv2 = POP;  // right operand (assuming it was pushed last)
        SV *sv1 = POP;  // left operand
        SV *result;
        if (
                // if they're the same type (common case),
                // avoid the overhead of calling precision() twice
                   sv1->vtable == sv2->vtable
                || sv1->precision() >= sv2->precision()
        )
                result = sv1->subtract(sv1,sv2,0);
        else
                result = sv2->subtract(sv2,sv1,1); // operands swapped
        PUSH(result);
}


Note that the action of this code is to call the subtract function associated
with the 'biggest' arg.
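
To make the dispatch rule concrete, here is a small self-contained model of
it in C. Everything in it is an assumption made for illustration: the VTable
layout, storing precision as a vtable field rather than a method, the shared
double payload, and the names new_sv(), int_vtable, real_vtable and
sv_subtract().

```c
#include <assert.h>
#include <stdlib.h>

typedef struct SV SV;

/* Hypothetical minimal vtable: only the entries the dispatcher needs. */
typedef struct {
    unsigned long precision;
    SV *(*subtract)(SV *self, SV *other, int reversed);
} VTable;

struct SV {
    const VTable *vtable;
    double value;            /* toy payload shared by both classes */
};

static SV *new_sv(const VTable *vt, double v)
{
    SV *sv = malloc(sizeof *sv);
    if (!sv)
        abort();
    sv->vtable = vt;
    sv->value = v;
    return sv;
}

static SV *int_subtract(SV *self, SV *other, int reversed);
static SV *real_subtract(SV *self, SV *other, int reversed);

/* precision = (d<<24) + (t<<23) + b, with d = 1 for both classes */
static const VTable int_vtable  = { (1UL << 24) + (0UL << 23) + 32UL, int_subtract };
static const VTable real_vtable = { (1UL << 24) + (1UL << 23) + 64UL, real_subtract };

static SV *int_subtract(SV *self, SV *other, int reversed)
{
    long a = (long)self->value, b = (long)other->value;
    return new_sv(&int_vtable, (double)(reversed ? b - a : a - b));
}

static SV *real_subtract(SV *self, SV *other, int reversed)
{
    double a = self->value, b = other->value;
    return new_sv(&real_vtable, reversed ? b - a : a - b);
}

/* Computes left - right using the subtract function of whichever
 * operand has the higher precision; 'reversed' tells that function
 * that its own operand was the right-hand one. */
static SV *sv_subtract(SV *left, SV *right)
{
    if (left->vtable == right->vtable
        || left->vtable->precision >= right->vtable->precision)
        return left->vtable->subtract(left, right, 0);
    return right->vtable->subtract(right, left, 1);
}
```

In this model $int - $real and $real - $int both end up in real_subtract(),
so the result is a real either way, and the reversed flag keeps the sign
right.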

The implementation of numeric ops for a particular class is based on the
idea that if both operands are of the same class, the op is carried out
efficiently by directly accessing the internal representations of both
operands; otherwise the 2nd operand is asked to extract its value in a
standard but not necessarily efficient form that is portable between all
classes. With this in mind, a scalar class must provide the following methods:

i = sv->get_int(sv)

The scalar returns its value as a 'standard integer' (eg whatever integer
type is used when 'use integer' is in effect). If its internal value is
too big, an exception is thrown.
This function is typically used in places like array subscripts, the mode
value in chmod $mode, etc etc.

sv->get_intn(sv,buf,n)

here, the sv is passed a buf of size n bytes, and is asked to fill it with
an n*8 bit integer representation of its internal value - again throwing
an exception if it won't fit. In practice this will just be a case of
filling the top part of the buffer with a 0 or 1 sign-extension.
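
A possible body for get_intn might look roughly like the sketch below. It is
only an illustration under assumptions of mine: the internal value is held in
an int64_t, the buffer is two's complement with the most significant byte
first (the proposal does not fix a byte order), and a -1 return code stands
in for throwing an exception. The name store_intn() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes 'value' into buf as an n-byte
 * two's-complement integer, most significant byte first.
 * Returns 0 on success, -1 if the value does not fit in n bytes. */
static int store_intn(int64_t value, unsigned char *buf, size_t n)
{
    unsigned char fill = value < 0 ? 0xFF : 0x00;  /* sign-extension byte */
    uint64_t bits = (uint64_t)value;
    size_t i;

    if (n == 0)
        return -1;
    if (n < sizeof value) {
        /* range check only matters when the buffer is narrower */
        int64_t max = ((int64_t)1 << (8 * n - 1)) - 1;
        int64_t min = -max - 1;
        if (value < min || value > max)
            return -1;
    }
    for (i = 0; i < n; i++) {
        size_t shift = 8 * (n - 1 - i);
        /* bytes above the 64-bit value are pure sign extension */
        buf[i] = shift < 64 ? (unsigned char)(bits >> shift) : fill;
    }
    return 0;
}
```

For a buffer wider than the internal value, the loop does nothing more than
pad the top bytes with 0x00 or 0xFF, which is the sign-extension case
mentioned above.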

In principle the API could have a few additional methods like
get_int4() if it is found there are some very common cases; but in general
we don't want a method for every conceivable int size.

r = sv->get_real(sv)

same idea as get_int.

sv->get_realn(sv,buf,plen,elen)

same idea as get_intn, but in this case buf will be filled with
plen bytes of mantissa, and elen bytes of exponent. The exact format
of this is standardised, and may not necessarily match any internal
floating-point representation (and so may be quite inefficient to extract).

sv->set_int(sv,i)
sv->set_intn(sv,buf,n)
sv->set_real(sv,r)
sv->set_realn(sv,buf,plen,elen)

The analogues of get_*.
I'm not sure whether all these are actually needed.
What may be needed instead is the equivalent of newSViv and newSVnv
for handling numeric literals in the code.

Given the ability to extract the value of any arbitrary-precision number
in a standard format, the code for a binop will look something like
the following.

First, suppose we have #defined INT_SIZE to be the standard size in bytes
of an integer on this platform, ie what get_int() returns.

Then for a hypothetical 16-bit integer SV class, the subtract function
(as called from pp_subtract above) might look something like

SV *
int16_subtract(SV* sv1, SV* sv2, int reversed)
{
        SV* result;
        char buf[2];
        int i;
        if (sv2->vtable == int16_vtable) {
                // normal case - both args of type INT16
                // just grab the value directly from the internals
                i = sv2->value;
        } else {
// optimise if standard int on this platform is INT16
#if INT_SIZE == 2
                i = sv2->get_int(sv2);
#else
                // 2nd arg is unknown - ask it to represent itself as 2 bytes
                sv2->get_intn(sv2,buf,2);
                i = typecast_or_conversion_or_whatever(buf);
#endif
        }
        // construct the return value
        result = a_new_empty_SV_if_you_please();
        result->vtable = int16_vtable;  // make it our type
        result->value = reversed ? (i - sv1->value) : (sv1->value - i);
        return result;
}
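
For what it's worth, the typecast_or_conversion_or_whatever() placeholder
above could be something like the following sketch. It assumes the 2-byte
buffer holds a two's-complement value with the most significant byte first;
the proposal does not specify a byte order, so that is purely my assumption,
as is the name decode_int16().

```c
#include <assert.h>

/* Hypothetical stand-in for typecast_or_conversion_or_whatever():
 * reads a 2-byte, most-significant-byte-first, two's-complement buffer. */
static int decode_int16(const char *buf)
{
    /* go through unsigned char so that plain char's signedness is harmless */
    long v = ((long)(unsigned char)buf[0] << 8) | (long)(unsigned char)buf[1];

    /* values with the top bit set are negative in two's complement */
    if (v >= 0x8000L)
        v -= 0x10000L;
    return (int)v;
}
```

Doing the conversion arithmetically rather than with a pointer cast avoids
depending on the host's byte order or alignment rules.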


------

As a bit of an aside, I think there is a need for a
sv1->overwrite(sv1,sv2) function whose job is to rip the existing
guts out of sv2 and replace them with a copy of the guts of sv1.
(This is what sv_setsv currently does).
This is needed for assignments, eg

$x = expression
expression returns an SV of the appropriate type; the existing contents
of $x must be thrown away and replaced with the results of the expression.
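
A very rough sketch of what such an overwrite might do, assuming (purely for
illustration) that an SV is a vtable pointer plus a heap-allocated body of
known length. The struct layout and the name sv_overwrite() are mine, and a
real class would presumably need a smarter copy than memcpy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical SV layout: class identity plus an opaque heap-allocated body. */
typedef struct {
    const void *vtable;
    void *body;
    size_t body_len;
} SV;

/* Replaces the guts of dst with a copy of the guts of src.
 * Returns 0 on success, -1 if the copy could not be allocated
 * (in which case dst is left untouched). */
static int sv_overwrite(const SV *src, SV *dst)
{
    void *copy = NULL;

    if (src == dst)
        return 0;                    /* $x = $x: nothing to do */
    if (src->body_len) {
        copy = malloc(src->body_len);
        if (!copy)
            return -1;
        memcpy(copy, src->body, src->body_len);
    }
    free(dst->body);                 /* throw away the old contents */
    dst->vtable = src->vtable;       /* dst takes on the class of src */
    dst->body = copy;
    dst->body_len = src->body_len;
    return 0;
}
```

The copy is made before the old body is freed, so a failed allocation leaves
$x as it was.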

