On Wed, Apr 3, 2013 at 6:16 PM, Kenneth Zadeck <[email protected]> wrote:
> On 04/03/2013 09:53 AM, Richard Biener wrote:
>>
>> On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck <[email protected]>
>> wrote:
>>>
>>> On 04/03/2013 05:17 AM, Richard Biener wrote:
>>>
>>>> In the end you will have variable-size storage in TREE_INT_CST,
>>>> thus you will at least have to emit _code_ copying over metadata
>>>> and data from the tree representation to the wide-int (similarly
>>>> for RTX CONST_DOUBLE/INT).  I'm objecting to the amount of code
>>>> you emit, and agree that the runtime cost is copying the metadata
>>>> (hopefully optimizable via CSE / SRA) and in most cases one (or
>>>> two) iterations of the loop copying the data (not optimizable).
>>>
>>> I did get rid of the bitsize in the wide-int patch, so at this
>>> point the metadata is just the precision and the len - not really a
>>> lot here.  As usual we pay a high price in gcc for not pushing the
>>> tree rep down into the rtl level; had we done that, it would have
>>> been acceptable to have the tree type bleed into the wide-int code.
>>>
>>>
>>>
>>>>> 2) You present this as if the implementor actually should care
>>>>> about the implementation, and you give three alternatives: the
>>>>> double_int, the current one, and HWI.  We have tried to make it
>>>>> so that the client should not care.  Certainly in my experience
>>>>> here, I have not found a place to care.
>>>>
>>>> Well, similarly to the copying overhead for tree, your approach
>>>> requires overloading operations for HOST_WIDE_INT operands to be
>>>> able to say wi + 1 (which is certainly desirable), or else paying
>>>> the overhead of using wide_int_one ().
>>>>
>>>>> In my opinion double_int needs to go away.  That is the main
>>>>> thrust of my patches.  There is no place in a compiler for an ABI
>>>>> that depends on constants fitting into two words whose size is
>>>>> defined by the host.
>>>>
>>>> That's true.  I'm not arguing to preserve "double-int" - I'm
>>>> arguing to preserve a way to ask for an integer type on the host
>>>> with (at least) N bits.  Almost all double-int users really ask
>>>> for an integer type on the host that has at least as many bits as
>>>> the pointer representation (or word_mode) on the target (we do
>>>> have HOST_WIDE_INT == 32 bits for 64-bit pointer targets).  No
>>>> double-int user specifically wants 2 * HOST_WIDE_INT precision -
>>>> that is just what happens to be there.  Thus I am providing a way
>>>> to say "get me a host integer with at least N bits" (VRP asks for
>>>> this, for example).
>>>>
>>>> What I was asking for is that whatever can provide the above
>>>> should share the functional interface with wide-int (or the other
>>>> way around).  And I was claiming that wide-int is too fat, because
>>>> current users of double-int eventually store double-ints
>>>> permanently.
>>>
>>> The problem is that, in truth, double_int is too fat.
>>> 99.something% of all constants fit in one HWI and that is likely to
>>> be true forever (I understand that tree VRP may need some thought
>>> here).  The rtl level has, for as long as I have known it, had two
>>> reps for integer constants, so it was relatively easy to slide
>>> CONST_WIDE_INT in.  It seems like the right trickery here, rather
>>> than adding a storage model for wide-ints, might be a way to use
>>> C++ to invisibly support several (and by "several" I really mean
>>> two) classes of TREE_CSTs.
>>
>> The truth is that _now_ TREE_INT_CSTs use double-ints and we have
>> CONST_INT and CONST_DOUBLE.  What I (and you) propose would get us
>> to use variable-size storage for both, allowing us to use just a
>> single HOST_WIDE_INT in the majority of cases.  In my view the
>> constant length of the variable-size storage for TREE_INT_CSTs is
>> determined by its type (thus, it doesn't have "optimized"
>> variable-size storage but "unoptimized" fixed-size storage based on
>> the maximum storage requirement for the type).  Similarly, RTX
>> CONST_INT would have fixed-size storage based on the mode-size of
>> the constant.  Using optimized space (thus using the encoding
>> properties) requires you to fit the 'short len' somewhere, which
>> possibly will not pay off in the end (for tree we do have that
>> storage available, so we could go with optimized storage for it; not
>> sure about RTL, I don't see available space there).
>
> There are two questions here: one is that you object to the fact
> that we represent small constants efficiently
Huh?  Where do I object to that?  I question whether, for the storage
in tree and RTX, the encoding trick pays off if you need another
HWI-aligned word to store the len.  But see below.
> and the second is that we take advantage of the fact that fixed-size
> stack allocation is effectively free for short-lived objects like
> wide-ints (as I use them).
I don't question that and I am not asking you to change it.  As part
of what I ask for, a more optimal (smaller) stack allocation would be
_possible_ (but not required).
> At the rtl level your idea does not work.  rtl constants do not have
> a mode or type.  So if you do not compress, how are you going to
> determine how many words you need for the constant 1?  I would love
> to have a rep that had the mode in it, but that is a huge change
> requiring a lot of hacking to every port.
Quoting from your RTL parts patch:
+struct GTY((variable_size)) hwivec_def {
+ int num_elem; /* number of elements */
+ HOST_WIDE_INT elem[1];
+};
+
+#define HWI_GET_NUM_ELEM(HWIVEC) ((HWIVEC)->num_elem)
+#define HWI_PUT_NUM_ELEM(HWIVEC, NUM) ((HWIVEC)->num_elem = (NUM))
+
/* RTL expression ("rtx"). */
struct GTY((chain_next ("RTX_NEXT (&%h)"),
@@ -343,6 +352,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"),
struct block_symbol block_sym;
struct real_value rv;
struct fixed_value fv;
+ struct hwivec_def hwiv;
} GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
};
that 'wastes' one HOST_WIDE_INT.  So the most efficient encoding of 0
will require two HOST_WIDE_INTs of storage - the same as double-int
currently.  99.9% of all constants will fit into two HOST_WIDE_INTs,
thus having the variable-size encoding for the storage in RTX (as
opposed to having num_elem == GET_MODE_BITSIZE /
HOST_BITS_PER_WIDE_INT) is a waste of time.  num_elem for a
non-optimized encoding is available from the mode stored in the
integer constant RTX...
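To illustrate (a hypothetical helper, not from either patch): with a
non-optimized encoding the element count need not be stored at all,
since it is derivable from the mode:

  /* Number of HWIs needed for a full (non-compressed) encoding of a
     constant of mode MODE.  Illustrative sketch only.  */
  static inline unsigned int
  cwi_num_elem (enum machine_mode mode)
  {
    return ((GET_MODE_BITSIZE (mode) + HOST_BITS_PER_WIDE_INT - 1)
            / HOST_BITS_PER_WIDE_INT);
  }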
You say
> At the rtl level your idea does not work. rtl constants do not have a mode
> or type.
That is not true, and it does not matter - let me tell you why.
Quoting your patch:
+#if TARGET_SUPPORTS_WIDE_INT
+
+/* Match CONST_*s that can represent compile-time constant integers. */
+#define CASE_CONST_SCALAR_INT \
+ case CONST_INT: \
+ case CONST_WIDE_INT
which means you are only replacing CONST_DOUBLE with wide-int.  And
_all_ CONST_DOUBLEs have a mode; otherwise you'd have no way of
creating the wide-int in the first place.  Which means any future
transition of CONST_INT to the wide-int code (which is desirable!)
will need to fix the fact that CONST_INTs do not have a mode.
Or even simpler - just retain CONST_INT behavior!  CONST_INTs need a
single HWI of storage (moving that to wide-int with a len field
regresses here) and CONST_INTs are always sign-extended.  Preserve
that!  Make VOIDmode integer constants have a single HWI as storage,
sign-extended.
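A sketch of the invariant I mean, using the num_elem accessor from the
hunk quoted above (illustrative only, not code from the patch):

  /* A VOIDmode integer constant keeps today's CONST_INT property:
     exactly one sign-extended HWI of storage.  */
  gcc_assert (GET_MODE (x) != VOIDmode
              || HWI_GET_NUM_ELEM (&x->u.hwiv) == 1);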
> I understand that this makes me vulnerable to the argument that we
> should not let the rtl level ever dictate anything about the tree
> level, but the truth is that a variable-len rep is almost always used
> for big integers.  In our code, most constants of large types are
> small numbers.  (Remember, I got into this because the tree constant
> prop thinks that left-shifting any number by anything greater than
> 128 is always 0, and discovered that that was just the tip of the
> iceberg.)  But mostly I support the decision to canonicalize numbers
> to the smallest number of HWIs because most of the algorithms to do
> the math can be short-circuited.  I admit that if I had to
> effectively unpack most numbers to do the math, the canonicalization
> would be a waste.  However, this is not really relevant to this
> conversation.  Yes, you could get rid of the len, but it is such a
> small part of the picture.
Getting rid of 'len' in the RTX storage was only ever a question of
whether that is an efficient way forward.  Once you consider unifying
CONST_INT and CONST_WIDE_INT, keeping an explicit 'len' is not.  And
even for CONST_WIDE_INTs alone (which most of the time would need two
HWIs of storage, as otherwise you'd use a CONST_INT) dropping it would
be an improvement.
> Furthermore, I am constrained at the rtl level because it is just too
> dirty to share the storage.  We tried that, and the amount of
> whack-a-mole we were playing was killing us.
The above argument (not encoding len explicitly in the RTX) is
unrelated to "sharing the storage" (whatever exactly you are referring
to here).
> I am comfortable making big changes at the portable level because I
> can easily test them, but changes to fundamental data structures
> inside every port are more than I am willing to do.  If you are going
> to do that, then you are going to have to start giving rtl constants
> modes, and that is a huge change (which, while desirable, I cannot
> do).
All CONST_DOUBLEs have modes.  You do not touch CONST_INTs.  VOIDmode
CONST_WIDE_INTs seem perfectly possible to me (see above).  So there
is nothing to fix, even if fixing it is desirable, as you say and as I
agree.
> At the tree level you could share the storage; I admit that that is
> easy, I just do not think that it is desirable.  The stack allocation
> inside the wide-int is very cheap, and the fact that the number that
> comes out is almost always one HWI makes this very efficient.  Even
> if I were starting from scratch, this would be a strong contender for
> the right representation.
>
> Fundamentally, I do not believe that the amount of copying that I am
> proposing at the tree level will be significant.  Yes, if you force
> me to use an uncompressed format, you can make what I propose
> expensive.
An uncompressed format at the tree / RTX level will in 99.9% of all
cases be one or two HWIs.  Realize that all my arguments and requests
are really independent (even though I make them together).
As we now have the power of C++ and templates, I see no good reason
not to do the optimization of eliding the copying from the RTX / tree
representation.  It can be done as a followup, I guess (with the
disadvantage that already-converted client code would be left around
to "optimize" later).
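To make concrete what I mean by eliding the copy, here is a minimal,
self-contained C++ sketch (every name in it is invented for
illustration; none of this is from either patch).  Read-only
operations are written once against a storage concept; the reference
storage is just a pointer/length view onto HWIs that already live in
the tree or RTX, so constructing it copies nothing.

  #include <algorithm>

  typedef long hwi;              /* stand-in for HOST_WIDE_INT */

  /* Copying storage: what a straightforward conversion does.
     For the sketch, assume len <= 4.  */
  struct copy_storage
  {
    hwi buf[4];
    unsigned len;
    copy_storage (const hwi *p, unsigned l) : len (l)
    { std::copy (p, p + l, buf); }
    const hwi *elts () const { return buf; }
  };

  /* Reference storage: a zero-copy view of constant data that
     already lives in a tree or RTX.  */
  struct ref_storage
  {
    const hwi *p;
    unsigned len;
    ref_storage (const hwi *q, unsigned l) : p (q), len (l) {}
    const hwi *elts () const { return p; }
  };

  /* Read-only operations are written once, against either storage
     model; the ref_storage instantiation copies no data at all.  */
  template <typename S>
  bool
  wi_is_minus_one (const S &s)
  {
    for (unsigned i = 0; i < s.len; ++i)
      if (s.elts ()[i] != (hwi) -1)
        return false;
    return true;
  }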
Btw, is the current state of the patches accessible as a branch somewhere?
>>>>> This is not a beauty-contest argument: public ports are beginning
>>>>> to use modes that are larger than two x86-64 HWIs, and I have a
>>>>> private port that has such modes.  It is my experience that any
>>>>> pass that uses this interface has one of three behaviors: it
>>>>> silently gets the wrong answer, it ICEs, or it fails to do the
>>>>> transformation.  If we leave double_int as an available option,
>>>>> then any use of it potentially will have one of these three
>>>>> behaviors.  And so one of my strong objections to this direction
>>>>> is that I do not want to fight this kind of bug for the rest of
>>>>> my life.  Having a single storage model that just always works is
>>>>> in my opinion a highly desirable option.  What you have never
>>>>> answered in a concrete manner is: if we decide to provide this
>>>>> generality, what would it be used for?  There is no place in a
>>>>> portable compiler where the right answer for every target is two
>>>>> HOST_WIDE_INTs.
>>>>>
>>>>> However, I will admit that the HWI option has some merits.  We
>>>>> try to address this in our implementation by restricting what is
>>>>> done inline in wide-int.h to the cases that fit in an HWI, and
>>>>> then only dropping into the heavy code in wide-int.c if the mode
>>>>> is larger (which it rarely will be).  However, a case could be
>>>>> made that for certain kinds of things, like string lengths and
>>>>> such, we could use another interface or, as you argue, a
>>>>> different storage model with the same interface.  I just do not
>>>>> see that the cost of the conversion code is really going to show
>>>>> up on anyone's radar.
>>>>
>>>> What's the issue with abstracting away the model so a fixed-size
>>>> 'len' is possible?  (Leaving aside the argument that this would
>>>> easily allow an adaptor to tree.)
>>>
>>> I have a particularly pessimistic perspective because I have
>>> already written most of this patch.  It is not that I do not want
>>> to change that code, it is that I have seen a certain set of
>>> mistakes that were made and I do not want to fix them more than
>>> once.  At the rtl level you can see the transition from only
>>> supporting 32-bit ints, to supporting 64-bit ints, to finally
>>> supporting two HWIs, and that transition code is not pretty.  My
>>> rtl patch fixes the places where an optimization was only made if
>>> the data type was 32 bits or smaller, as well as the places where
>>> the optimization was made only if the data type is smaller than 64
>>> bits (in addition to fixing all of the places where the code ICEs
>>> or simply gets the wrong answer if it is larger than TImode).  The
>>> tree level is only modestly better, I believe only because it is
>>> newer.  I have not seen any 32-bit-only code, but it is littered
>>> with transformations that only work for 64 bits.  What is that
>>> 64-bit-only code going to look like in 5 years?
>>
>> The user interface of wide-int does not depend on whether a storage
>> model is abstracted or not.  If you take advantage of the storage
>> model by making its interface leaner, then it will.  But I guess
>> converting everything before settling on the wide-int interface may
>> not have been the wisest choice in the end (providing a wide-int
>> that can literally replace double-int would have got you testing
>> coverage without any change besides double-int.[ch] and
>> wide-int.[ch]).
>
> I think that it was exactly the correct decision.  We have made
> significant changes to the structure as we have gone along.  I have
> basically done most of what you have suggested (the interface is
> completely functional, I got rid of the bitsize...).  What you are
> running into is that Mike Stump, Richard Sandiford and I actually
> believe that the storage model is a fundamentally bad idea.
Well, that must be because of C++ ignorance.  Let's set that issue
aside for now - be sure I'll come back to it.
>>> I want to make it easier to write the general code than to write
>>> code that only solves the problem for the size of port the
>>> implementor is currently working on.  So I perceive the storage
>>> model as a way of having to keep fighting this battle forever,
>>> because it will allow the implementor to decide that the
>>> optimization only needs to be done for a particular size of
>>> integer.
>>>
>>> However, I get the fact that from your perspective, what you really
>>> want is a solution to the data structure problem in tree-vrp.
>>
>> No, that's just a convenient example. What I really want is a wide-int
>> that is less visibly a replacement for CONST_DOUBLE.
>
> I think that that is unfair.  Variable-length reps are the standard
> technique for doing wide math.  I am just proposing using data
> structures that are common best practice in the rest of the world,
> and adapting them so that they match the gcc world better than just
> hacking in a GMP interface.
I don't object to variable-length reps.  I question the need for
'bitsize' (gone! yay!) and now 'precision' in class wide_int.  I
question the need for and efficiency of encoding 'len' in RTX
CONST_WIDE_INT.
>>> My patch for tree-vrp scans the entire function to find the largest
>>> type used in the function and then does all of the math at 2x that
>>> size.  But I have to admit that I thought it was weird that you use
>>> tree CSTs as your long-term storage.  If this were my pass, I would
>>> have allocated a 2D array of some type that was as large as the
>>> function in one dimension and twice as large as the largest int
>>> used in the other dimension, not overloaded TREE_CST, and then had
>>> a set of friend functions in double_int to get in and out.  Of
>>> course you do not need friends in double_int because the rep is
>>> exposed, but in wide-int that is now hidden since it is now purely
>>> functional.
>>>
>>> I just have to believe that there is a better way to do tree-vrp
>>> than messing up wide-int for the rest of the compiler.
>>
>> It's not "messing up", it's making wide-int a generally useful thing and
>> not tying it so closely to RTL.
>
> Again, this is unfair.
>
>>
>>>>> 3) Your trick will work at the tree level, but not at the rtl
>>>>> level.  The wide-int code cannot share storage with the
>>>>> CONST_INTs.  We tried this, and there are a million bugs that
>>>>> would have to be fixed to make it work.  It could have worked if
>>>>> CONST_INTs had carried a mode around, but since they do not, you
>>>>> end up with the same CONST_INT sharing the rep for several
>>>>> different types, and that just did not work unless you were
>>>>> willing to do substantial cleanups.
>>>>
>>>> I don't believe you.  I am only asking for the adaptors to tree
>>>> and RTL to work in an RVALUE-ish way (no modification, as
>>>> obviously RTL and tree constants may be shared).  I think your
>>>> claim stems from the precision and bitsize members in your
>>>> wide-int, which I believe are a design red herring.  I suppose we
>>>> should concentrate on addressing that one first.  Thus, let me
>>>> repeat a few questions on your design to eventually let you
>>>> understand my concerns:
>>>>
>>>> Given two wide-ints, a and b, what precision will the result of
>>>> a + b have?  Consider a having precision 32 and b having precision
>>>> 64 on a 32-bit HWI host.
>>>>
>>>> You define wide-int to have no sign information:
>>>>
>>>> + The representation does not contain any information about
>>>> + signedness of the represented value, so it can be used to represent
>>>> + both signed and unsigned numbers. For operations where the results
>>>> + depend on signedness (division, comparisons), the signedness must
>>>> + be specified separately. For operations where the signedness
>>>> + matters, one of the operands to the operation specifies either
>>>> + wide_int::SIGNED or wide_int::UNSIGNED.
>>>>
>>>> but the result of operations on mixed-precision operands _does_
>>>> depend on the sign, yet most operations do not get a signedness
>>>> argument.  Nor would one help, because the result depends on the
>>>> signedness of the individual operands!
>>>>
>>>> double-int gets around this by having a common "precision" to
>>>> which all smaller-precision constants have to be
>>>> sign-/zero-extended.  So do CONST_INT and CONST_DOUBLE.
>>>>
>>>> Note that even with the same precision you have introduced the
>>>> same problem with the variable len.
>>>>
>>>> My proposal is simple - all wide-ints are signed!  wide-int is
>>>> basically an arbitrary-precision signed integer format.  The sign
>>>> is encoded in the most significant bit of the last active HWI of
>>>> the representation (val[len - 1] &
>>>> (1 << (HOST_BITS_PER_WIDE_INT - 1))).  All values with less
>>>> precision than len * HOST_BITS_PER_WIDE_INT are properly
>>>> sign-/zero-extended to precision len * HOST_BITS_PER_WIDE_INT.
>>>>
>>>> This lets you define mixed-len operations by implicitly
>>>> sign-/zero-extending the operands to whatever len is required for
>>>> the operation.
>>>>
>>>> Furthermore it allows you to get rid of the precision member (and
>>>> bitsize anyway).  Conversion from tree / RTL requires information
>>>> on the signedness of the input (trivial for tree; for RTL all
>>>> constants are signed - well, sign-extended).  Whenever you want to
>>>> transfer the wide-int to tree / RTL you have to sign-/zero-extend
>>>> according to the desired precision.  If you need something other
>>>> than arbitrary-precision arithmetic you have to explicitly
>>>> truncate / extend at suitable places - with overflow checking
>>>> being trivial here.  For optimization purposes selected operations
>>>> may benefit from a combined implementation receiving a target
>>>> precision and signedness.  Whatever extra metadata RTL requires
>>>> does not belong in wide-int but in the RTX.  Trivially a mode
>>>> comes to mind (on tree we have a type), and trivially each RTX has
>>>> a mode.  And each mode has a precision and bitsize.  It lacks a
>>>> sign, so all RTL integer constants are sign-extended for encoding
>>>> efficiency purposes.  Mixed-mode operations will not occur
>>>> (mixed-len operations will); mixed-mode ops are exclusively
>>>> sign-/zero-extensions and truncations.
>>>>
>>>> Representation of (unsigned HOST_WIDE_INT)-1 would necessarily
>>>> be { 0, (unsigned HOST_WIDE_INT)-1 }, representation of -1 in any
>>>> precision would be { -1 }.
>>>>
>>>> That was my proposal. Now, can you please properly specify yours?
>>
>> And you chose not to answer that fundamental question of how your
>> wide-int is _supposed_ to work?  Ok, maybe I shouldn't have
>> distracted you with the bits before this.
>
> Sorry, I missed this question by accident.
>
> We did not do infinite precision by design.  We looked at the set of
> programming languages that gcc either compiles or might compile, and
> we looked at the set of machines that gcc targets, and neither of
> these two sets of entities defines its operations in terms of
> infinite-precision math.  They always do math within a particular
> precision.  Scripting languages do use infinite precision, but gcc
> does not compile any of them.  So unless you are going to restrict
> your set of operations to those that satisfy the properties of a
> ring, it is generally not strictly safe to do your math in infinite
> precision.
>
> We do all of the math in the precision defined by the types or modes
> that are passed in.
>
> Constants are defined to be the size of the precision unless they can
> be compressed.  The len field tells how many words are actually
> needed to exactly represent the constant, if
> (len - 1) * HOST_BITS_PER_WIDE_INT is less than the precision.  The
> decompression process is to add a number of words to the high-order
> side of the number.  These words must contain zero if the highest
> represented bit is 0, and -1 if the highest represented bit is 1.
So the compressed representation is signed. Please document that.
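For reference, the decompression rule just described amounts to this
(an illustrative sketch of the rule as stated, not code from the
patch):

  /* Extend the LEN-word compressed value VAL to BLOCKS words by
     replicating the sign of the highest represented bit.  */
  static void
  wi_decompress (HOST_WIDE_INT *val, unsigned int len,
                 unsigned int blocks)
  {
    HOST_WIDE_INT ext = val[len - 1] < 0 ? (HOST_WIDE_INT) -1 : 0;
    for (unsigned int i = len; i < blocks; i++)
      val[i] = ext;
  }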
> I.e., this looks a lot like sign extension, but the numbers
> themselves are not inherently signed or unsigned as in your
> representation.  It is up to the operators to imply the signedness.
> So the unsigned multiply operation is different from the signed
> multiply operation, but both are applied to the bits without an
> interpretation of their sign.
Which means that the inconsistency of some ops getting a precision
argument vs. others arbitrarily taking the precision of the first
operand is a design bug.
Note that with your encoding scheme, as far as I understand it, the
"sign" of { -1 } depends on the precision: if the precision is less
than that of a HWI, then { -1 } is signed; if it is exactly the
precision of a HWI, the sign is unspecified - it could be -1U or -1
(as far as I understand, it is not an invalid encoding of -1U?); and
if the precision is bigger than that of a HWI, the sign is again
unspecified (it could be -1U or -1 in the larger precision)?
"Unspecified sign" means that you cannot combine two operands with
different 'len' / 'precision' without knowing the sign of the operands
you need to extend.  Note that knowing the sign of the operation is
not enough.
So I believe that for correctness you need at least a gcc_assert that
the operands have the same precision - as you assume here, for
example:
+wide_int
+wide_int::operator + (const wide_int &op1) const
+{
+ wide_int result;
+
+ if (precision <= HOST_BITS_PER_WIDE_INT)
+ {
+ result.len = 1;
+ result.bitsize = bitsize;
+ result.precision = precision;
+ result.val[0] = val[0] + op1.val[0];
+ if (precision < HOST_BITS_PER_WIDE_INT)
+ result.val[0] = sext_hwi (result.val[0], precision);
+ }
which is clearly wrong if op1.precision != precision.  It also
explains why wide_int::one needs a precision ... (I hope you got rid
of the overloads.)
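Concretely, the fast path above silently assumes equal precisions; the
missing guard would be something like (illustrative):

  gcc_assert (precision == op1.precision);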
I'm not sure to what extent you already got rid of the
precision/bitsize-taking functions like
+ inline wide_int lshift (unsigned int y, ShiftOp z, unsigned int bitsize,
+ unsigned int precision) const;
+ wide_int popcount (unsigned int bitsize, unsigned int prec) const;
They don't make sense, as *this already has a precision.
Btw, my proposal would be to get rid of 'precision', as 'precision' is
not needed to interpret the encoding as I would define it (the MSB
would _always_ be a sign bit) and, as you say, the operation has a
sign.  Truncating to a desired target precision can be done after the
fact or as part of the operation.  You already have to deal with
extension, since even operands of equal precision can have different
lens.
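To show this is workable, here is a sketch of mixed-len addition under
the all-signed encoding I propose (illustrative only; all names are
invented): operands are implicitly sign-extended to a common len and
no precision member is consulted.

  /* Add the signed, canonically sign-extended values A (ALEN words)
     and B (BLEN words), writing the result to RES, which must have
     room for MAX (ALEN, BLEN) + 1 words.  Canonicalizing the result
     len back down is omitted for brevity.  */
  static unsigned int
  wi_add (HOST_WIDE_INT *res,
          const HOST_WIDE_INT *a, unsigned int alen,
          const HOST_WIDE_INT *b, unsigned int blen)
  {
    /* One extra word so a carry out of the operands' sign words
       cannot be lost.  */
    unsigned int len = (alen > blen ? alen : blen) + 1;
    unsigned HOST_WIDE_INT carry = 0;
    for (unsigned int i = 0; i < len; i++)
      {
        /* Words beyond an operand's len are copies of its sign.  */
        unsigned HOST_WIDE_INT ai
          = (unsigned HOST_WIDE_INT) (i < alen ? a[i]
                                      : a[alen - 1] < 0 ? -1 : 0);
        unsigned HOST_WIDE_INT bi
          = (unsigned HOST_WIDE_INT) (i < blen ? b[i]
                                      : b[blen - 1] < 0 ? -1 : 0);
        unsigned HOST_WIDE_INT sum = ai + bi + carry;
        carry = sum < ai || (carry && sum == ai);
        res[i] = (HOST_WIDE_INT) sum;
      }
    return len;
  }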
I'd like to see the current state of wide-int somewhere, btw., given
you've made significant changes since the last post.
Thanks,
Richard.
>
>> Richard.
>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> On 04/02/2013 11:04 AM, Richard Biener wrote:
>>>>>>
>>>>>> On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> This patch contains a large number of the changes requested by Richi.
>>>>>>> It
>>>>>>> does not contain any of the changes that he requested to abstract the
>>>>>>> storage layer. That suggestion appears to be quite unworkable.
>>>>>>
>>>>>> I of course took this claim as a challenge ... with the following
>>>>>> result.
>>>>>> It is
>>>>>> of course quite workable ;)
>>>>>>
>>>>>> The attached patch implements the core wide-int class and three
>>>>>> storage models (fixed-size for things like plain HWI and
>>>>>> double-int, variable-size similar to how your wide-int works,
>>>>>> and an adaptor for the double-int as contained in trees).  With
>>>>>> that you can now do
>>>>>>
>>>>>> HOST_WIDE_INT
>>>>>> wi_test (tree x)
>>>>>> {
>>>>>>   // template argument deduction doesn't do the magic we want it
>>>>>>   // to do to make this kind of implicit conversion work
>>>>>>   // overload resolution considers this kind of conversion so we
>>>>>>   // need some magic that combines both ... but seeding the
>>>>>>   // overload set with some instantiations doesn't seem to be
>>>>>>   // possible :/
>>>>>>   // wide_int<> w = x + 1;
>>>>>>   wide_int<> w;
>>>>>>   w += x;
>>>>>>   w += 1;
>>>>>>   // template argument deduction doesn't deduce the return value
>>>>>>   // type, not considering the template default argument either ...
>>>>>>   // w = wi (x) + 1;
>>>>>>   // we could support this by providing rvalue-to-lvalue promotion
>>>>>>   // via a traits class?
>>>>>>   // otoh it would lead to sub-optimal code anyway so we should
>>>>>>   // make the result available as a reference parameter and only
>>>>>>   // support wide_int<> res; add (res, x, 1); ?
>>>>>>   w = wi (x).operator+<wide_int<> > (1);
>>>>>>   wide_int<>::add (w, x, 1);
>>>>>>   return w.to_hwi ();
>>>>>> }
>>>>>>
>>>>>> we are somewhat limited with C++ unless we want to get really
>>>>>> fancy.  Eventually providing operator+ just doesn't make much
>>>>>> sense for generic wide-int combinations (though then the issue
>>>>>> is that its operands are no longer commutative, which I think is
>>>>>> the case with your wide-int or double-int as well - they don't
>>>>>> support "1 + wide_int" for obvious reasons).
>>>>>>
>>>>>> So there are implementation design choices left undecided.
>>>>>>
>>>>>> Oh, and the operation implementations are crap (they compute
>>>>>> nonsense).
>>>>>>
>>>>>> But you should get the idea.
>>>>>>
>>>>>> Richard.
>>>>>
>>>>>
>