Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-22 Thread Richard Biener
On Sun, Apr 21, 2013 at 10:54 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:
 Richard,

 i pulled these two frags out of your comments because i wanted to get some
 input from you on it while i addressed the other issues you raised.


 +  enum SignOp {
 +/* Many of the math functions produce different results depending
 +   on if they are SIGNED or UNSIGNED.  In general, there are two
 +   different functions, whose names are prefixed with an 'S'
 +   or a 'U'.  However, for some math functions there is also a
 +   routine that does not have the prefix and takes a SignOp
 +   parameter of SIGNED or UNSIGNED.  */
 +SIGNED,
 +UNSIGNED
 +  };

 You seem to insist on that.  It should propagate to the various parts
 of the compiler that have settled for the 'uns' integer argument.
 Having one piece behave different is just weird.  I suppose I will
 find code like

  wi.ext (prec, uns ? UNSIGNED : SIGNED)


 there is a lot more flexibility on my part than you perceive with respect to
 this point.   My primary issue is that i do not want to have an interface
 that has 0 and 1 as a programmer visible part.   Beyond that i am open to
 suggestion.

 The poster children of my hate are the host_integer_p and the tree_low_cst
 interfaces. I did not want the wide int stuff to look like these.  I see
 several problems with these:

 1) of the 314 places where tree_low_cst is called in the gcc directory (not
 the subdirectories where the front ends live), NONE of the calls have a
 variable second parameter.   There are a handful of places, as one expects,
 in the front ends that do, but NONE in the middle end.
 2) there are a small number of the places where host_integer_p is called
 with one parameter and then it is followed by a call to tree_low_cst that
 has the value with the other sex.   I am sure these are mistakes, but having
 the 0s and 1s flying around does not make it easy to spot them.
 3) tree_low_cst implies that the tree cst has only two hwis in it.
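The hazard in point 2 can be sketched with toy stand-ins (illustrative stubs only, not the real tree accessors; names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for the host_integer_p / tree_low_cst pattern.  The
// second argument is the bare 0/1 flag under discussion
// (1 = treat as unsigned, 0 = treat as signed).
static bool toy_fits_p (int64_t v, int uns) { return uns ? v >= 0 : true; }
static int64_t toy_low (int64_t v, int uns) { (void) uns; return v; }

// The point-2 hazard: a check/access pair whose flags disagree
// compiles without any complaint.
static int64_t mismatched_read (int64_t v)
{
  if (toy_fits_p (v, 1))    /* checked as unsigned ...     */
    return toy_low (v, 0);  /* ... but fetched as signed.  */
  return 0;
}

// With a symbolic enum, each call site documents its own intent.
enum SignOp { SIGNED, UNSIGNED };
static bool toy_fits_p (int64_t v, SignOp sgn)
{
  return sgn == UNSIGNED ? v >= 0 : true;
}
```

A call like toy_fits_p (v, UNSIGNED) is self-describing in a way toy_fits_p (v, 1) is not, which is the whole argument for the enum.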

 While i do not want to propagate an interface with 0 and 1 into wide-int, i
 can understand your dislike of having a wide-int only solution for this.

 I will point out that for your particular example, uns is almost always set
 by a call to TYPE_UNSIGNED.  There could easily be a different type accessor
 that converts this part of the type to the right thing to pass in here.   I
 think that there is certainly some place for there to be a unified SYMBOLIC
 api that controls the signedness everywhere in the compiler.

 I would like to move toward this direction, but you have been so negative to
 the places where i have made it convenient to directly convert from tree or
 rtl into or out of wide-int that i have hesitated to do something that
 directly links trees and wide-int. So i would like to ask you what you
 would like?

Ideally I'd like the wide-int introduction to _not_ be the introduction of
a unified symbolic way that controls signedness.  We do have two
kinds of interfaces currently - one that uses different API entries,
like build_int_cstu vs. build_int_cst or double_int::from_shwi vs. from_uhwi,
and one that uses the aforementioned integer flag 'uns' with 0 being
signed and 1 being unsigned.

I think the _uhwi vs. _shwi and _cstu variants are perfectly fine
(but only for compile-time constant uses as you say), and the wide-int
interface makes use of this kind, too.

Proposing a better API for the 'uns' flag separately from wide-int would
be a better way to get anybody else than me chime in (I have the feeling
that the wide-int series seems to scare off every other reviewer besides me...).
I can live with the SIGNED/UNSIGNED enum, but existing APIs should
be changed to use that.

For wide-int I suggest to go the route you don't want to go.  Stick to
existing practice and use the integer 'uns' flag.  It's as good as
SIGNED/UNSIGNED for _variable_ cases (and yes, a lot less descriptive
for constant cases).  For wide-int, always add a static interface
if there is a variable one and convert variable uses to the proper static
interface.

That said, a lot of my pushback is because I feel a little lonesome in this
wide-int review and don't want to lone-some decide about that (generic)
interface part as well.

 +  template <typename T>
 +inline bool gt_p (T c, SignOp sgn) const;
 +  template <typename T>
 +inline bool gts_p (T c) const;
 +  template <typename T>
 +inline bool gtu_p (T c) const;

 it's bad that we can't use the sign information we have available in
 almost
 all cases ... (where precision is not an exact multiple of
 HOST_BITS_PER_WIDE_INT
 and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
 a sign - you just have to possibly waste a word of zeroes for positive
 values where at the moment precision is an exact multiple of
 HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
 Which of course means that the encoding can be one word larger than
 maximally required by 'precision'.

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-22 Thread Kenneth Zadeck

On 04/19/2013 09:31 AM, Richard Biener wrote:

+   number of elements of the vector that are in use.  When LEN *
+   HOST_BITS_PER_WIDE_INT < the precision, the value has been
+   compressed.  The values of the elements of the vector greater than
+   LEN - 1. are all equal to the highest order bit of LEN.

equal to the highest order bit of element LEN - 1. ?

Fixed, you are correct.

I have gone thru the entire wide-int patch to clean this up.   The 
bottom line is that if the precision is not a multiple of the size of a 
HWI then everything above that precision is assumed to be identical to 
the sign bit.
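As a rough sketch of that rule (a hypothetical helper, not the patch's actual code): a leading HWI can be dropped exactly when it equals the sign extension of the element below it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef int64_t HWI; // stand-in for HOST_WIDE_INT (64-bit host assumed)

// Sketch of the compression invariant described above: leading elements
// that are pure sign extension of the element below are dropped, so len
// can be smaller than precision / HOST_BITS_PER_WIDE_INT.
static size_t canonical_len (const std::vector<HWI> &val)
{
  size_t len = val.size ();
  while (len > 1)
    {
      HWI sign = val[len - 2] < 0 ? (HWI) -1 : 0;
      if (val[len - 1] != sign)
        break;
      len--;
    }
  return len;
}
```

For example {5, 0} compresses to len 1, while {INT64_MIN, 0} must keep len 2, because the zero block is what records that the value is positive.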



Especially _not_ equal to the precision - 1 bit of the value, correct?
I do not understand your question here, because in the case talked about 
above, the bit at precision - 1 would not have been explicitly represented.


Anyway,  i went thru this top part carefully and made many things clearer.

+   The representation does not contain any information inherent about
+   signedness of the represented value, so it can be used to represent
+   both signed and unsigned numbers.   For operations where the results
+   depend on signedness (division, comparisons), the signedness must
+   be specified separately.  For operations where the signedness
+   matters, one of the operands to the operation specifies either
+   wide_int::SIGNED or wide_int::UNSIGNED.

The last sentence is somehow duplicated.

fixed


+   The numbers are stored as sign extended numbers as a means of
+   compression.  Leading HOST_WIDE_INTS that contain strings of either
+   -1 or 0 are removed as long as they can be reconstructed from the
+   top bit that is being represented.

I'd put this paragraph before the one that talks about signedness, next
to the one that already talks about encoding.

done

+   All constructors for wide_int take either a precision, an enum
+   machine_mode or tree_type.  */

That's probably no longer true (I'll now check).

yes you are correct


+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target.  The full multiplication is used by tree-vrp.  tree-vrp
+ currently does a 2x largest mode by 2x largest mode yielding a 4x
+ largest mode result.  If operations are added that require larger
+ buffers, then VAL needs to be changed.  */
+  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
+  unsigned short len;
+  unsigned int precision;

I wonder if there is a technical reason to stick to HOST_WIDE_INTs?
I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate
(to get a 32bit value on 32bit x86 for example).  I of course see that
conversion to/from HOST_WIDE_INT is an important operation
that would get slightly more complicated.

Maybe just quickly checking the code generated on 32bit x86 for
HOST_WIDE_INT vs. HOST_WIDEST_FAST_INT tells us whether
it's worth considering (it would be bad if each add/multiply would
end up calling to libgcc for example - I know that doesn't happen
for x86, but maybe it would happen for an arm hosted gcc
targeting x86_64?)
This is an interesting point.   my guess is that it is unlikely to be 
worth the work.
consider add: most machines have add with carry and well written 32 
bit ports would have used an add with carry sequence rather than making 
the libcall.   If i rewrite wide-int in terms of HOST_WIDEST_FAST_INT, then 
i have to do some messy code to compute the carry which is unlikely to 
translate into the proper carry instructions.   Not to mention the cost 
overhead of converting to and from HFI given that gcc is written almost 
entirely using HWIs.
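For reference, a portable HWI-blocked add with the carry computed by hand (a sketch of the kind of code being discussed, not the patch's implementation) looks like this; a good port maps the inner step to add-with-carry instructions:

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t HWI;   // stand-in for HOST_WIDE_INT
typedef uint64_t UHWI; // unsigned counterpart

// Sketch: add two equal-length HWI vectors, propagating carry manually.
// Rewritten over a narrower "fast" int, this carry logic gets messier
// and is unlikely to translate into the hardware carry instructions.
static void wide_add (const HWI *a, const HWI *b, HWI *res, int len)
{
  UHWI carry = 0;
  for (int i = 0; i < len; i++)
    {
      UHWI x = (UHWI) a[i];
      UHWI y = (UHWI) b[i];
      UHWI sum = x + y + carry;
      // Carry out iff the unsigned sum wrapped; the second test catches
      // the carry == 1, y == ~0 case where sum lands exactly on x.
      carry = (sum < x || (carry && sum == x)) ? 1 : 0;
      res[i] = (HWI) sum;
    }
}
```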


I thought about the possible idea of just converting the mul and div 
functions.   This would be easy because i already reblock them into 
HOST_WIDE_HALF_INTs to do the math.  I could just do a different 
reblocking.   However, i think that it is unlikely that doing this would 
ever show up on anyone's performance counts.   Either way you do the 
same number of multiply instructions, it is just the subroutine wrapper 
that could possibly go away.
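The half-int reblocking mentioned here can be sketched as follows (a schematic of the general technique, not the patch's mul code): each HWI is split into halves so every partial product fits in a full HWI.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t UHWI;  // stand-in for unsigned HOST_WIDE_INT
typedef uint32_t UHALF; // stand-in for HOST_WIDE_HALF_INT

// Sketch: a 64x64 -> 128 multiply built from 32-bit half blocks.
// Returns the low word and stores the high word through *high.
static UHWI umul_wide (UHWI a, UHWI b, UHWI *high)
{
  UHWI al = (UHALF) a, ah = a >> 32;
  UHWI bl = (UHALF) b, bh = b >> 32;
  UHWI p0 = al * bl;   // low  x low
  UHWI p1 = al * bh;   // low  x high
  UHWI p2 = ah * bl;   // high x low
  UHWI p3 = ah * bh;   // high x high
  // Middle column: carries from p0 plus the low halves of p1 and p2.
  UHWI mid = (p0 >> 32) + (UHALF) p1 + (UHALF) p2;
  *high = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
  return (mid << 32) | (UHALF) p0;
}
```

Either way the same number of half-word multiplies is issued, which is the point being made: only the subroutine wrapper could go away.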



+  enum ShiftOp {
+NONE,
+/* There are two uses for the wide-int shifting functions.  The
+   first use is as an emulation of the target hardware.  The
+   second use is as service routines for other optimizations.  The
+   first case needs to be identified by passing TRUNC as the value
+   of ShiftOp so that shift amount is properly handled according to the
+   SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
+   amount is always truncated by the bytesize of the mode of
+   THIS.  */
+TRUNC
+  };

double-int simply honors SHIFT_COUNT_TRUNCATED.  Why differ
from that (and thus change behavior in existing code - not sure if you
do that with introducing wide-int)?
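The distinction in the quoted ShiftOp comment can be sketched like this (hypothetical helper; SHIFT_COUNT_TRUNCATED is a target macro, modelled here as a plain flag):

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t UHWI; // stand-in for unsigned HOST_WIDE_INT

enum ShiftOp { NONE, TRUNC };
static const bool shift_count_truncated = true; // target-dependent

// Sketch of the two shift-amount policies: TRUNC emulates the target
// hardware (honoring SHIFT_COUNT_TRUNCATED), while the service-routine
// case always reduces the count mod the width.
static UHWI lshift (UHWI x, unsigned cnt, unsigned prec, ShiftOp op)
{
  if (op == TRUNC)
    {
      if (shift_count_truncated)
        cnt %= prec; // emulate a target with a wrapping shifter
    }
  else
    cnt %= prec;     // service routine: count always truncated

  UHWI r = cnt >= 64 ? 0 : x << cnt;
  if (prec < 64)
    r &= ((UHWI) 1 << prec) - 1; // keep only prec bits of the result
  return r;
}
```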
I believe that GCC is supposed to be a little schizophrenic here, at 
least according to the doc.  When it is doing 
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-22 Thread Richard Sandiford
Richard Biener richard.guent...@gmail.com writes:
 At the rtl level your idea does not work.   rtl constants do not have a mode
 or type.

 Which is not true and does not matter.  I tell you why.  Quote:

It _is_ true, as long as you read rtl constants as rtl integer constants :-)

 +#if TARGET_SUPPORTS_WIDE_INT
 +
 +/* Match CONST_*s that can represent compile-time constant integers.  */
 +#define CASE_CONST_SCALAR_INT \
 +   case CONST_INT: \
 +   case CONST_WIDE_INT

 which means you are only replacing CONST_DOUBLE with wide-int.
 And _all_ CONST_DOUBLE have a mode.  Otherwise you'd have no
 way of creating the wide-int in the first place.

No, integer CONST_DOUBLEs have VOIDmode, just like CONST_INT.
Only floating-point CONST_DOUBLEs have a real mode.

 I understand that this makes me vulnerable to the argument that we should
 not let the rtl level ever dictate anything about the tree level, but the
 truth is that a variable len rep is almost always used for big integers.
 In our code, most constants of large types are small numbers.   (Remember i
 got into this because the tree constant prop thinks that left shifting any
 number by anything greater than 128 is always 0 and discovered that that was
 just the tip of the iceberg.) But mostly i support the decision to canonize
 numbers to the smallest number of HWIs because most of the algorithms to do
 the math can be short circuited.I admit that if i had to effectively
 unpack most numbers to do the math, that the canonization would be a waste.
 However, this is not really relevant to this conversation.   Yes, you could
 get rid of the len, but this is such a small part of the picture.

 Getting rid of 'len' in the RTX storage was only a question of whether it
 is an efficient way to go forward.  And with considering to unify
 CONST_INTs and CONST_WIDE_INTs it is not.  And even for CONST_WIDE_INTs
 (which most of the time would be 2 HWI storage, as otherwise you'd use
 a CONST_INT) it would be an improvement.

FWIW, I don't really see any advantage in unifying CONST_INT and
CONST_WIDE_INT, for the reasons Kenny has already given.  CONST_INT
can represent a large majority of the integers and it is already a
fairly efficient representation.

It's more important that we don't pick a design that forces one
choice or the other.  And I think Kenny's patch achieves that goal,
because the choice is hidden behind macros and behind the wide_int
interface.

Thanks,
Richard


Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-22 Thread Richard Biener
Richard Sandiford rdsandif...@googlemail.com wrote:

Richard Biener richard.guent...@gmail.com writes:
 At the rtl level your idea does not work.   rtl constants do not
have a mode
 or type.

 Which is not true and does not matter.  I tell you why.  Quote:

It _is_ true, as long as you read rtl constants as rtl integer
constants :-)

 +#if TARGET_SUPPORTS_WIDE_INT
 +
 +/* Match CONST_*s that can represent compile-time constant integers.
 */
 +#define CASE_CONST_SCALAR_INT \
 +   case CONST_INT: \
 +   case CONST_WIDE_INT

 which means you are only replacing CONST_DOUBLE with wide-int.
 And _all_ CONST_DOUBLE have a mode.  Otherwise you'd have no
 way of creating the wide-int in the first place.

No, integer CONST_DOUBLEs have VOIDmode, just like CONST_INT.
Only floating-point CONST_DOUBLEs have a real mode.

I stand corrected. Now that's one more argument for infinite precision 
constants, as the mode is then certainly provided by the operations similar to 
the sign. That is, the mode (or size, or precision) of 1 certainly does not 
matter.

 I understand that this makes me vulnerable to the argument that we
should
 not let the rtl level ever dictate anything about the tree level,
but the
 truth is that a variable len rep is almost always used for big
integers.
 In our code, most constants of large types are small numbers.  
(Remember i
 got into this because the tree constant prop thinks that left
shifting any
 number by anything greater than 128 is always 0 and discovered that
that was
 just the tip of the iceberg.) But mostly i support the decision to
canonize
 numbers to the smallest number of HWIs because most of the
algorithms to do
 the math can be short circuited.I admit that if i had to
effectively
 unpack most numbers to do the math, that the canonization would be a
waste.
 However, this is not really relevant to this conversation.   Yes,
you could
 get rid of the len, but this is such a small part of the picture.

 Getting rid of 'len' in the RTX storage was only a question of
whether it
 is an efficient way to go forward.  And with considering to unify
 CONST_INTs and CONST_WIDE_INTs it is not.  And even for
CONST_WIDE_INTs
 (which most of the time would be 2 HWI storage, as otherwise you'd
use
 a CONST_INT) it would be an improvement.

FWIW, I don't really see any advantage in unifying CONST_INT and
CONST_WIDE_INT, for the reasons Kenny has already given.  CONST_INT
can represent a large majority of the integers and it is already a
fairly efficient representation.

It's more important that we don't pick a design that forces one
choice or the other.  And I think Kenny's patch achieves that goal,
because the choice is hidden behind macros and behind the wide_int
interface.

Not unifying const-int and double-int in the end would be odd.

Richard.

Thanks,
Richard




Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-22 Thread Richard Sandiford
Richard Biener richard.guent...@gmail.com writes:
 Richard Sandiford rdsandif...@googlemail.com wrote:
Richard Biener richard.guent...@gmail.com writes:
 At the rtl level your idea does not work.   rtl constants do not
have a mode
 or type.

 Which is not true and does not matter.  I tell you why.  Quote:

It _is_ true, as long as you read rtl constants as rtl integer
constants :-)

 +#if TARGET_SUPPORTS_WIDE_INT
 +
 +/* Match CONST_*s that can represent compile-time constant integers.
 */
 +#define CASE_CONST_SCALAR_INT \
 +   case CONST_INT: \
 +   case CONST_WIDE_INT

 which means you are only replacing CONST_DOUBLE with wide-int.
 And _all_ CONST_DOUBLE have a mode.  Otherwise you'd have no
 way of creating the wide-int in the first place.

No, integer CONST_DOUBLEs have VOIDmode, just like CONST_INT.
Only floating-point CONST_DOUBLEs have a real mode.

 I stand corrected. Now that's one more argument for infinite precision
 constants, as the mode is then certainly provided by the operations
 similar to the sign. That is, the mode (or size, or precision) of 1
 certainly does not matter.

I disagree.  Although CONST_INT and CONST_DOUBLE don't _store_ a mode,
they are always interpreted according to a particular mode.  It's just
that that mode has to be specified separately.  That's why so many
rtl functions have (enum machine_mode, rtx) pairs.

Infinite precision seems very alien to rtl, where everything is
interpreted according to a particular mode (whether that mode is
stored in the rtx or not).

For one thing, I don't see how infinite precision could work in an
environment where signedness often isn't defined.  E.g. if you optimise
an addition of two rtl constants, you don't know (and aren't supposed
to know) whether the values involved are signed or unsigned.  With
fixed-precision arithmetic it doesn't matter, because both operands must
have the same precision, and because bits outside the precision are not
significant.  With infinite precision arithmetic, the choice carries
over to the next operation.  E.g., to take a 4-bit example, you don't
know when constructing a wide_int from an rtx whether 0b1000 represents
8 or -8.  But if you have no precision to say how many bits are significant,
you have to pick one.  Which do you choose?  And why should we have to
make a choice at all?  (Note that this is a different question to
whether the internal wide_int representation is sign-extending or not,
which is purely an implementation detail.  The same implementation
principle applies to CONST_INTs: the HWI in a CONST_INT is always
sign-extended from the msb of the represented value, although of course
the CONST_INT itself doesn't tell you which bit the msb is; that has to
be determined separately.)
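The canonicalization described in that parenthesis can be sketched as (hypothetical helper name):

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t HWI; // stand-in for HOST_WIDE_INT

// Sketch: store a value sign-extended from the msb of its precision,
// as CONST_INT's HWI is stored.  Which bit is the msb must be supplied
// separately - the representation itself does not say.
static HWI sext_hwi (HWI v, unsigned prec)
{
  if (prec >= 64)
    return v;
  HWI mask = ((HWI) 1 << prec) - 1;
  v &= mask;
  if (v & ((HWI) 1 << (prec - 1)))
    v |= ~mask; // replicate the msb into all higher bits
  return v;
}
```

In the 4-bit example, the pattern 0b1000 is stored as -8; whether a consumer then treats the value as 8 or -8 is decided by the operation, not by the storage.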

A particular wide_int isn't, and IMO shouldn't be, inherently signed
or unsigned.  The rtl model is that signedness is a question of
interpretation rather than representation.  I realise trees are
different, because signedness is a property of the type rather
than operations on the type, but I still think fixed precision
works with both tree and rtl whereas infinite precision doesn't
work with rtl.

I also fear there are going to be lots of bugs where we forget to
truncate the result of an N-bit operation from infinite precision
to N bits before using it in the next operation (as per Kenny's ring
explanation).  With finite precision, and with all-important asserts
that the operands have consistent precisions, we shouldn't have any
hidden bugs like that.
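The failure mode being described can be sketched in miniature: N-bit arithmetic emulated in a wider type, where forgetting the truncation lets high bits leak into the next operation (illustrative example, 8-bit ring):

```cpp
#include <cassert>
#include <cstdint>

// Reduce a value into the 8-bit ring, the step that is easy to forget
// when intermediate results are kept at "infinite" precision.
static uint32_t trunc8 (uint32_t v) { return v & 0xff; }

static bool shifted_values_agree (uint32_t x)
{
  uint32_t untruncated = x << 4;       // wide intermediate result
  uint32_t machine = trunc8 (x << 4);  // what an 8-bit target computes
  return untruncated == machine;       // false once bits shift out
}
```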

If there are parts of gcc that really want to do infinite-precision
arithmetic, mpz_t ought to be as good as anything.

Thanks,
Richard


Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-22 Thread Kenneth Zadeck

On 04/22/2013 08:20 AM, Richard Biener wrote:



That said, a lot of my pushback is because I feel a little lonesome in this
wide-int review and don't want to lone-some decide about that (generic)
interface part as well.
yeh,  now sandiford is back from vacation so there are two of us to beat 
on you about how bad it would be to do infinite precision!!!

be careful what you wish for.

kenny


Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-21 Thread Kenneth Zadeck

Richard,

i pulled these two frags out of your comments because i wanted to get 
some input from you on it while i addressed the other issues you raised.



+  enum SignOp {
+/* Many of the math functions produce different results depending
+   on if they are SIGNED or UNSIGNED.  In general, there are two
+   different functions, whose names are prefixed with an 'S'
+   or a 'U'.  However, for some math functions there is also a
+   routine that does not have the prefix and takes a SignOp
+   parameter of SIGNED or UNSIGNED.  */
+SIGNED,
+UNSIGNED
+  };

You seem to insist on that.  It should propagate to the various parts
of the compiler that have settled for the 'uns' integer argument.
Having one piece behave different is just weird.  I suppose I will
find code like

 wi.ext (prec, uns ? UNSIGNED : SIGNED)


there is a lot more flexibility on my part than you perceive with 
respect to this point.   My primary issue is that i do not want to have 
an interface that has 0 and 1 as a programmer visible part.   Beyond 
that i am open to suggestion.


The poster children of my hate are the host_integer_p and the tree_low_cst 
interfaces. I did not want the wide int stuff to look like these.  I see 
several problems with these:


1) of the 314 places where tree_low_cst is called in the gcc directory 
(not the subdirectories where the front ends live), NONE of the calls 
have a variable second parameter.   There are a handful of places, as 
one expects, in the front ends that do, but NONE in the middle end.
2) there are a small number of the places where host_integer_p is called 
with one parameter and then it is followed by a call to tree_low_cst 
that has the value with the other sex.   I am sure these are mistakes, 
but having the 0s and 1s flying around does not make it easy to spot them.

3) tree_low_cst implies that the tree cst has only two hwis in it.

While i do not want to propagate an interface with 0 and 1 into 
wide-int, i can understand your dislike of having a wide-int only 
solution for this.


I will point out that for your particular example, uns is almost always 
set by a call to TYPE_UNSIGNED.  There could easily be a different type 
accessor that converts this part of the type to the right thing to pass 
in here.   I think that there is certainly some place for there to be a 
unified SYMBOLIC api that controls the signedness everywhere in the 
compiler.


I would like to move toward this direction, but you have been so 
negative to the places where i have made it convenient to directly 
convert from tree or rtl into or out of wide-int that i have hesitated 
to do something that directly links trees and wide-int. So i would like 
to ask you what would like?





+  template <typename T>
+inline bool gt_p (T c, SignOp sgn) const;
+  template <typename T>
+inline bool gts_p (T c) const;
+  template <typename T>
+inline bool gtu_p (T c) const;

it's bad that we can't use the sign information we have available in almost
all cases ... (where precision is not an exact multiple of
HOST_BITS_PER_WIDE_INT
and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
a sign - you just have to possibly waste a word of zeroes for positive
values where at the moment precision is an exact multiple of
HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
Which of course means that the encoding can be one word larger than
maximally required by 'precision'.

Going back to point 1 above,   the front ends structure the middle end 
code where (generally) the sign that is used is encoded in the operator 
that one is looking at.  So the majority of uses in the middle end 
fall into the second or third templates, and the first template is 
there as a convenience routine for the middle ends.

The front ends certainly use the first template.

This is how the rtl level has survived so long without a sign bit in the 
modes, the operators tell the whole story.   The truth is that in the 
middle end, the story is the same - it is the operators (most of the 
time) that drive the calls being made.


There is an assumption that you are making that i certainly do not 
believe is true in the backends and i kind of doubt is true in the 
middle ends.   That is that the sign of the compare ALWAYS matches the 
sign of the operands.   Given that i have never seen any code that 
verifies this in the middle end, i am going to assume that it is not 
true, because it is always true in gcc that anything that we do not 
explicitly verify generally turns out to only be generally true and you 
can spend your life tracking down the end cases.  This is a needless 
complication.
At the rtl level, this is completely doomed by GEN_INT, which takes 
neither a mode nor an indication of sign.   To assume that there is any 
meaningful sign information there is a horror story waiting to be 
written (sure what could go wrong if we go into the old house?  whats 
that 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-19 Thread Richard Biener
On Tue, Apr 16, 2013 at 10:07 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:
 Richard,

 I made major changes to wide-int along the lines you suggested. Each of the
 binary operations is now a template.
 There are 5 possible implementations of those operations, one for each of
 HWI, unsigned HWI, wide-int, rtl, and tree.   Note that this is not exactly
 as you suggested, but it is along the same lines.

 The HWI template sign extends the value to the precision of the first
 operand, the unsigned HWI is the same except that it is an unsigned
 extension.   The wide-int version is used as before, but is in truth rarely
 used.  The rtl and tree logically convert the value to a wide-int but in
 practice do something more efficient than converting to the wide-int.   What
 they do is look inside the rtl or the tree and pass a pointer to the data
 and a length to the binary operation.  This is perfectly safe in the
 position of a second operand to the binary operation because the lifetime is
 guaranteed to be very short.  The wide-int implementation was also modified
 to do the same pointer trick allowing all 5 templates to share the same use
 of the data.
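A schematic of that pointer trick (hypothetical names; the real templates differ): every second-operand kind is reduced to a short-lived (pointer, len) view of its HWI blocks, so one binary-op body serves them all without copying.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t HWI; // stand-in for HOST_WIDE_INT

// A borrowed view of a constant's HWI blocks.  A tree or rtx converter
// would point into the constant's own array; the view must not outlive
// the call, which is why this is safe only for the second operand.
struct hwi_view
{
  const HWI *val;
  unsigned len;
};

// Converter for a plain HWI operand.
static hwi_view view_of (const HWI &x) { return { &x, 1 }; }

// One templated binary-op body shared by every operand kind that can
// produce an hwi_view (low-block add only, for illustration).
template <typename T>
static HWI add_low (HWI a, const T &b)
{
  hwi_view v = view_of (b);
  return a + v.val[0];
}
```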

 Note that currently the tree code is more crufty than one would like.   This
 will clean up nicely when the tree-cst is changed to represent the value
 with an array and a length field.

 So now, at least for the second operand of binary operations, the storage is
 never copied.I do not believe that there is a good similar trick for the
 first operand.  i did not consider something like wide_int::add (a, b) to be
 a viable option; it seems to miss the point of using an object oriented
 language.   So I think that you really have to copy the data into an
 instance of a wide int.

 However, while all of this avoids ever having to pass a precision into the
 second operand, this patch does preserve the finite math implementation of
 wide-int.Finite math is really what people expect an optimizer to do,
 because it seamlessly matches what the machine is going to do.

 I hope at this point, i can get a comprehensive review on these patches.   I
 believe that I have done what is required.

 There are two other patches that will be submitted in the next few minutes.
 The first one is an updated version of the rtl level patch.   The only
 changes from what you have seen before are that the binary operations now
 use the templated binary operations.  The second one is the first of the
 tree level patches.   It converts builtins.c to use wide-int and it
 removes all assumptions that tree-csts are built with two HWIs.

 Once builtins.c is accepted, i will convert the rest of the middle end
 patches.   They will all be converted in a similar way.

+   number of elements of the vector that are in use.  When LEN *
+   HOST_BITS_PER_WIDE_INT < the precision, the value has been
+   compressed.  The values of the elements of the vector greater than
+   LEN - 1. are all equal to the highest order bit of LEN.

equal to the highest order bit of element LEN - 1. ?

Especially _not_ equal to the precision - 1 bit of the value, correct?

+   The representation does not contain any information inherent about
+   signedness of the represented value, so it can be used to represent
+   both signed and unsigned numbers.   For operations where the results
+   depend on signedness (division, comparisons), the signedness must
+   be specified separately.  For operations where the signedness
+   matters, one of the operands to the operation specifies either
+   wide_int::SIGNED or wide_int::UNSIGNED.

The last sentence is somehow duplicated.

+   The numbers are stored as sign extended numbers as a means of
+   compression.  Leading HOST_WIDE_INTS that contain strings of either
+   -1 or 0 are removed as long as they can be reconstructed from the
+   top bit that is being represented.

I'd put this paragraph before the one that talks about signedness, next
to the one that already talks about encoding.

+   All constructors for wide_int take either a precision, an enum
+   machine_mode or tree_type.  */

That's probably no longer true (I'll now check).

+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target.  The full multiplication is used by tree-vrp.  tree-vrp
+ currently does a 2x largest mode by 2x largest mode yielding a 4x
+ largest mode result.  If operations are added that require larger
+ buffers, then VAL needs to be changed.  */
+  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
+  unsigned short len;
+  unsigned int precision;

I wonder if there is a technical reason to stick to HOST_WIDE_INTs?
I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate
(to get a 32bit value on 32bit x86 for example).  I of course see that
conversion to/from HOST_WIDE_INT is an important operation
that would get slightly more complicated. 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-08 Thread Richard Biener
On Fri, Apr 5, 2013 at 2:34 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 Richard,

 There has been something that has bothered me about you proposal for the
 storage manager and i think i can now characterize that problem.  Say i want
 to compute the expression

 (a + b) / c

 converting from tree values, using wide-int as the engine and then storing
 the result in a tree.   (A very common operation for the various simplifiers
 in gcc.)

 in my version of wide-int where there is only the stack allocated fix size
 allocation for the data, the compiler arranges for 6 instances of wide-int
 that are statically allocated on the stack when the function is entered.
 There would be 3 copies of the precision and data to get things started and
 one allocation variable sized object at the end when the INT_CST is built
 and one copy to put it back.   As i have argued, these copies are of
 negligible size.

 In your world, to get things started, you would do 3 pointer copies to get
 the values out of the tree to set the expression leaves but then you will
 call the allocator 3 times to get space to hold the intermediate nodes
 before you get to pointer copy the result back into the result cst which
 still needs an allocation to build it. I am assuming that we can play the
 same game at the tree level that we do at the rtl level where we do 1
 variable sized allocation to get the entire INT_CST rather than doing 1
 fixed sized allocation and 1 variable sized one.

 even if we take the simpler example of a + b, you still lose.   The cost of
 the extra allocation and it's subsequent recovery is more than my copies.
 In fact, even in the simplest case of someone going from a HWI thru wide_int
 into tree, you have 2 allocations vs my 1.

Just to clarify, my code wouldn't handle

  tree a, b, c;
  tree res = (a + b) / c;

transparently.  The most complex form of the above that I think would
be reasonable to handle would be

  tree a, b, c;
  wide_int wires = (wi (a) + b) / c;
  tree res = build_int_cst (TREE_TYPE (a), wires);

and the code as posted would even require you to specify the
return type of operator+ and operator/ explicitly, like

 wide_int wires = (wi (a).operator+<wi_embed_var>
(b)).operator/<wi_embed_var> (c);

but as I said I just didn't bother to decide that the return type is
always of the wide_int<variable-len-storage> kind.

Now, the only real allocation that happens is done by build_int_cst.
There is one wide_int on the stack to hold the a + b result and one
separate wide_int to hold wires (it's literally written in the code).
There are no pointer copies involved in the end - the result from
converting a tree to a wide_int<tree-storage> is the original 'tree'
pointer itself, thus a register.

 I just do not see the cost savings and if there are no cost savings, you
 certainly cannot say that having these templates is simpler than not having
 the templates.

I think you are missing the point - by abstracting away the storage
you don't necessarily need to add the templates.  But you open up
a very easy route for doing so and you make the operations _trivially_
work on the tree / RTL storage with no overhead in generated code
and minimal overhead in the amount of code in GCC itself.  In my
prototype the overhead of adding 'tree' support is to place

class wi_tree_int_cst
{
  tree cst;
public:
  void construct (tree c) { cst = c; }
  const HOST_WIDE_INT *storage() const { return reinterpret_cast
<const HOST_WIDE_INT *> (&TREE_INT_CST (cst)); }
  unsigned len() const { return 2; }
};

template <>
class wi_traits <tree>
{
public:
typedef wide_int <wi_tree_int_cst> wi_t;
wi_traits(tree t)
  {
wi_tree_int_cst ws;
ws.construct (t);
w.construct (ws);
  }
wi_t* operator->() { return &w; }
private:
wi_t w;
};

into tree.h.
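The storage-model idea being prototyped here can be sketched in a self-contained form. The names below (fixed_storage, adaptor_storage, low_word) are illustrative stand-ins, not the posted patch: the point is only that operations written once against a small storage interface work identically whether the data is owned by the object or merely pointed at (as with the tree adaptor above), so constructing the adaptor is a pointer copy rather than a data copy.

```cpp
#include <cassert>

typedef long long hwi;  // stand-in for HOST_WIDE_INT

// Owning storage: the value lives in the object itself, like the
// stack-allocated fixed-size wide-int.
struct fixed_storage
{
  hwi val[4];
  unsigned int num;
  const hwi *storage () const { return val; }
  unsigned int len () const { return num; }
};

// Adaptor storage: the value stays where it already is (e.g. inside a
// tree node); construct() copies a pointer, not the data.
struct adaptor_storage
{
  const hwi *ext;
  unsigned int num;
  void construct (const hwi *p, unsigned int n) { ext = p; num = n; }
  const hwi *storage () const { return ext; }
  unsigned int len () const { return num; }
};

// Operations are written once against the storage interface, so they
// compile for both models with no per-model code.
template <typename S>
hwi low_word (const S &s)
{
  return s.len () ? s.storage ()[0] : 0;
}
```

Reading through the adaptor touches the original buffer directly, which is the "no pointer copies involved in the end" property described in the message.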

Richard.

 Kenny


 On 04/02/2013 11:04 AM, Richard Biener wrote:

 On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 This patch contains a large number of the changes requested by Richi.
 It
 does not contain any of the changes that he requested to abstract the
 storage layer.   That suggestion appears to be quite unworkable.

 I of course took this claim as a challenge ... with the following result.
 It is
 of course quite workable ;)

 The attached patch implements the core wide-int class and three storage
 models (fixed size for things like plain HWI and double-int, variable size
 similar to how your wide-int works and an adaptor for the double-int as
 contained in trees).  With that you can now do

 HOST_WIDE_INT
 wi_test (tree x)
 {
// template argument deduction doesn't do the magic we want it to do
// to make this kind of implicit conversions work
// overload resolution considers this kind of conversions so we
// need some magic that combines both ... but seeding the overload
// set with some instantiations doesn't seem to be possible :/
// wide_int w = x + 1;
wide_int w;
w += x;
w += 1;
// template argument 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-05 Thread Kenneth Zadeck

Richard,

There has been something that has bothered me about your proposal for the 
storage manager and i think i can now characterize that problem.  Say i 
want to compute the expression


(a + b) / c

converting from tree values, using wide-int as the engine and then 
storing the result in a tree.   (A very common operation for the various 
simplifiers in gcc.)


in my version of wide-int where there is only the stack-allocated 
fixed-size allocation for the data, the compiler arranges for 6 instances of 
wide-int that are statically allocated on the stack when the function 
is entered.  There would be 3 copies of the precision and data to get 
things started, one allocation of a variable-sized object at the end when 
the INT_CST is built, and one copy to put it back.   As i have argued, 
these copies are of negligible size.


In your world, to get things started, you would do 3 pointer copies to 
get the values out of the tree to set the expression leaves but then you 
will call the allocator 3 times to get space to hold the intermediate 
nodes before you get to pointer copy the result back into the result cst 
which still needs an allocation to build it. I am assuming that we can 
play the same game at the tree level that we do at the rtl level where 
we do 1 variable sized allocation to get the entire INT_CST rather than 
doing 1 fixed sized allocation and 1 variable sized one.


even if we take the simpler example of a + b, you still lose.   The 
cost of the extra allocation and its subsequent recovery is more than 
my copies.   In fact, even in the simplest case of someone going from a 
HWI thru wide_int into tree, you have 2 allocations vs my 1.


I just do not see the cost savings and if there are no cost savings, you 
certainly cannot say that having these templates is simpler than not 
having the templates.


Kenny

On 04/02/2013 11:04 AM, Richard Biener wrote:

On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

This patch contains a large number of the changes requested by Richi.   It
does not contain any of the changes that he requested to abstract the
storage layer.   That suggestion appears to be quite unworkable.

I of course took this claim as a challenge ... with the following result.  It is
of course quite workable ;)

The attached patch implements the core wide-int class and three storage
models (fixed size for things like plain HWI and double-int, variable size
similar to how your wide-int works and an adaptor for the double-int as
contained in trees).  With that you can now do

HOST_WIDE_INT
wi_test (tree x)
{
   // template argument deduction doesn't do the magic we want it to do
   // to make this kind of implicit conversions work
   // overload resolution considers this kind of conversions so we
   // need some magic that combines both ... but seeding the overload
   // set with some instantiations doesn't seem to be possible :/
   // wide_int w = x + 1;
   wide_int w;
   w += x;
   w += 1;
   // template argument deduction doesn't deduce the return value type,
   // not considering the template default argument either ...
   // w = wi (x) + 1;
   // we could support this by providing rvalue-to-lvalue promotion
   // via a traits class?
   // otoh it would lead to sub-optimal code anyway so we should
   // make the result available as reference parameter and only support
   // wide_int  res; add (res, x, 1); ?
   w = wi (x).operator+<wide_int> (1);
   wide_int::add(w, x, 1);
   return w.to_hwi ();
}

we are somewhat limited with C++ unless we want to get really fancy.
Eventually providing operator+ just doesn't make much sense for
generic wide-int combinations (though then the issue is its operands
are no longer commutative which I think is the case with your wide-int
or double-int as well - they don't support 1 + wide_int for obvious reasons).
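On the commutativity point specifically, a free (non-member) operator+ can supply the 1 + wide_int form that a member operator cannot. A minimal sketch with a hypothetical single-HWI toy class, not the posted wide_int:

```cpp
#include <cassert>

// Toy stand-in for wide_int holding a single HOST_WIDE_INT.
struct toy_wide_int
{
  long long val;
  explicit toy_wide_int (long long v) : val (v) {}
};

// A member operator+ alone only supports w + 1.  A pair of free
// overloads accepts the scalar on either side, restoring commutativity.
toy_wide_int operator+ (const toy_wide_int &a, long long b)
{
  return toy_wide_int (a.val + b);
}

toy_wide_int operator+ (long long a, const toy_wide_int &b)
{
  return b + a;  // delegate to the other overload
}
```

This is the standard C++ workaround; whether it is worth the extra overload set for every mixed-operand combination is exactly the design question left open above.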

So there are implementation design choices left undecided.

Oh, and the operation implementations are crap (they compute nonsense).

But you should get the idea.

Richard.




Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-04 Thread Richard Biener
On Wed, Apr 3, 2013 at 6:16 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 On 04/03/2013 09:53 AM, Richard Biener wrote:

 On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck zad...@naturalbridge.com
 wrote:

 On 04/03/2013 05:17 AM, Richard Biener wrote:

 In the end you will have a variable-size storage in TREE_INT_CST thus
 you will have at least to emit _code_ copying over meta-data and data
 from the tree representation to the wide-int (similar for RTX
 CONST_DOUBLE/INT).
 I'm objecting to the amount of code you emit and agree that the runtime
 cost is copying the meta-data (hopefully optimizable via CSE / SRA)
 and in most cases one (or two) iterations of the loop copying the data
 (not optimizable).

 i did get rid of the bitsize in the wide-int patch so at this point the
 meta
 data is the precision and the len.
 not really a lot here.   As usual we pay a high price in gcc for not
 pushing
 the tree rep down into the rtl level; had we done so, it would have been acceptable
 to
 have the tree type bleed into the wide-int code.



 2)  You present this as if the implementor actually should care about
 the
 implementation and you give 3 alternatives:  the double_int, the
 current
 one, and HWI. We have tried to make it so that the client should
 not
 care.   Certainly in my experience here, I have not found a place to
 care.

 Well, similar as for the copying overhead for tree your approach
 requires
 overloading operations for HOST_WIDE_INT operands to be able to
 say wi + 1 (which is certainly desirable), or the overhead of using
 wide_int_one ().

 In my opinion double_int needs to go away.  That is the main thrust of
 my
 patches.   There is no place in a compiler for an abi that depends on
 constants fitting into two words whose size is defined by the host.

 That's true.  I'm not arguing to preserve double-int - I'm arguing to
 preserve a way to ask for an integer type on the host with (at least)
 N bits.  Almost all double-int users really ask for an integer type on
 the
 host that has at least as many bits as the pointer representation (or
 word_mode) on
 the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer
 targets).  No double-int user specifically wants 2 * HOST_WIDE_INT
 precision - that is just what happens to be there.  Thus I am providing
 a way to say get me a host integer with at least N bits (VRP asks for
 this, for example).

 What I was asking for is that whatever can provide the above should
 share
 the functional interface with wide-int (or the other way around).  And
 I
 was claiming that wide-int is too fat, because current users of
 double-int
 eventually store double-ints permanently.

 The problem is that, in truth, double int is too fat. 99.something% of
 all
 constants fit in 1 hwi and that is likely to be true forever (i
 understand
 that tree vpn may need some thought here).  The rtl level has, for
 as
 long as i have known it, had 2 reps for integer constants. So it was
 relatively easy to slide the CONST_WIDE_INT in.  It seems like the right
 trickery here rather than adding a storage model for wide-ints might be a
 way to use the c++ to invisibly support several (and by several i
 really
 mean 2) classes of TREE_CSTs.

 The truth is that _now_ TREE_INT_CSTs use double-ints and we have
 CONST_INT and CONST_DOUBLE.  What I (and you) propose would
 get us to use variable-size storage for both, allowing to just use a
 single
 HOST_WIDE_INT in the majority of cases.  In my view the constant
 length of the variable-size storage for TREE_INT_CSTs is determined
 by its type (thus, it doesn't have optimized variable-size storage
 but unoptimized fixed-size storage based on the maximum storage
 requirement for the type).  Similar for RTX CONST_INT which would
 have fixed-size storage based on the mode-size of the constant.
 Using optimized space (thus using the encoding properties) requires you
 to fit the 'short len' somewhere which possibly will not pay off in the
 end
 (for tree we do have that storage available, so we could go with optimized
 storage for it, not sure with RTL, I don't see available space there).

 There are two questions here:   one is the fact that you object to the fact
 that we represent small constants efficiently

Huh?  Where do I object to that?  I question that for the storage in tree
and RTX the encoding trick pays off if you need another HWI-aligned
word to store the len.  But see below.

 and the second is that we take
 advantage of the fact that fixed size stack allocation is effectively free
 for short lived objects like wide-ints (as i use them).

I don't question that and I am not asking you to change that.  As part of
what I ask for a more optimal (smaller) stack allocation would be _possible_
(but not required).

 At the rtl level your idea does not work.   rtl constants do not have a mode
  or type.  So if you do not compress, how are you going to determine how
 many words you need for the constant 1?   I would love to 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-03 Thread Richard Biener
On Tue, Apr 2, 2013 at 7:35 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 Yes, I agree that you win the challenge that it can be done.What you
 have always failed to address is why anyone would want to do this.  Or how
 this would at all be desirable.But I completely agree that from a purely
 abstract point of view you can add a storage model.

 Now here is why we REALLY do not want to go down this road:

 1)  The following comment from your earlier mail is completely wrong


 +#ifdef NEW_REP_FOR_INT_CST
 +  /* This is the code once the tree level is converted.  */
 +  wide_int result;
 +  int i;
 +
 +  tree type = TREE_TYPE (tcst);
 +
 +  result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type));
 +  result.precision = TYPE_PRECISION (type);
 +  result.len = TREE_INT_CST_LEN (tcst);
 +  for (i = 0; i < result.len; i++)
 +result.val[i] = TREE_INT_CST_ELT (tcst, i);
 +
 +  return result;
 +#else


 this also shows the main reason I was asking for storage abstraction.
 The initialization from tree is way too expensive.


 In almost all cases, constants will fit in a single HWI.  Thus, the only
 thing that you are copying is the length and a single HWI. So you are
 dragging in a lot of machinery just to save these two copies?   Certainly
 there has to be more to it than that.

In the end you will have a variable-size storage in TREE_INT_CST thus
you will have at least to emit _code_ copying over meta-data and data
from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT).
I'm objecting to the amount of code you emit and agree that the runtime
cost is copying the meta-data (hopefully optimizable via CSE / SRA)
and in most cases one (or two) iterations of the loop copying the data
(not optimizable).

 2)  You present this as if the implementor actually should care about the
 implementation and you give 3 alternatives:  the double_int, the current
 one, and HWI. We have tried to make it so that the client should not
 care.   Certainly in my experience here, I have not found a place to care.

Well, similar as for the copying overhead for tree your approach requires
overloading operations for HOST_WIDE_INT operands to be able to
say wi + 1 (which is certainly desirable), or the overhead of using
wide_int_one ().

 In my opinion double_int needs to go away.  That is the main thrust of my
 patches.   There is no place in a compiler for an abi that depends on
 constants fitting into 2 words whose size is defined by the host.

That's true.  I'm not arguing to preserve double-int - I'm arguing to
preserve a way to ask for an integer type on the host with (at least)
N bits.  Almost all double-int users really ask for an integer type on the
host that has at least as many bits as the pointer representation (or
word_mode) on
the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer
targets).  No double-int user specifically wants 2 * HOST_WIDE_INT
precision - that is just what happens to be there.  Thus I am providing
a way to say get me a host integer with at least N bits (VRP asks for
this, for example).

What I was asking for is that whatever can provide the above should share
the functional interface with wide-int (or the other way around).  And I
was claiming that wide-int is too fat, because current users of double-int
eventually store double-ints permanently.

 This is not a beauty contest argument, we have public ports that are beginning to
 use modes that are larger than two x86-64 HWIs and i have a private port
 that has such modes and it is my experience that any pass that uses this
 interface has one of three behaviors: it silently gets the wrong answer, it
 ices, or it fails to do the transformation.  If we leave double_int as an
 available option, then any use of it potentially will have one of these
 three behaviors.  And so one of my strong objections to this direction is
 that i do not want to fight this kind of bug for the rest of my life.
 Having a single storage model that just always works is in my opinion a
 highly desirable option.  What you have never answered in a concrete manner
 is, if we decide to provide this generality, what it would be used for.
 There is no place in a portable compiler where the right answer for every
 target is two HOST wide integers.

 However, i will admit that the HWI option has some merits.   We try to
 address this in our implementation by dividing what is done inline in
 wide-int.h into the cases that fit in an HWI, only dropping into the heavy
 code in wide-int.c if mode is larger (which it rarely will be).   However, a
 case could be made that for certain kinds of things like string lengths and
 such, we could use another interface or as you argue, a different storage
 model with the same interface.   I just do not see that the cost of the
 conversion code is really going to show up on anyone's radar.

What's the issue with abstracting away the model so a fixed-size 'len'
is possible?  (let away the argument that 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-03 Thread Kenneth Zadeck


On 04/03/2013 05:17 AM, Richard Biener wrote:


In the end you will have a variable-size storage in TREE_INT_CST thus
you will have at least to emit _code_ copying over meta-data and data
from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT).
I'm objecting to the amount of code you emit and agree that the runtime
cost is copying the meta-data (hopefully optimizable via CSE / SRA)
and in most cases one (or two) iterations of the loop copying the data
(not optimizable).
i did get rid of the bitsize in the wide-int patch so at this point the 
meta data is the precision and the len.
not really a lot here.   As usual we pay a high price in gcc for not 
pushing the tree rep down into the rtl level; had we done so, it would have 
been acceptable to have the tree type bleed into the wide-int code.




2)  You present this as if the implementor actually should care about the
implementation and you give 3 alternatives:  the double_int, the current
one, and HWI. We have tried to make it so that the client should not
care.   Certainly in my experience here, I have not found a place to care.

Well, similar as for the copying overhead for tree your approach requires
overloading operations for HOST_WIDE_INT operands to be able to
say wi + 1 (which is certainly desirable), or the overhead of using
wide_int_one ().


In my opinion double_int needs to go away.  That is the main thrust of my
patches.   There is no place in a compiler for an abi that depends on
constants fitting into two words whose size is defined by the host.

That's true.  I'm not arguing to preserve double-int - I'm arguing to
preserve a way to ask for an integer type on the host with (at least)
N bits.  Almost all double-int users really ask for an integer type on the
host that has at least as many bits as the pointer representation (or
word_mode) on
the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer
targets).  No double-int user specifically wants 2 * HOST_WIDE_INT
precision - that is just what happens to be there.  Thus I am providing
a way to say get me a host integer with at least N bits (VRP asks for
this, for example).

What I was asking for is that whatever can provide the above should share
the functional interface with wide-int (or the other way around).  And I
was claiming that wide-int is too fat, because current users of double-int
eventually store double-ints permanently.
The problem is that, in truth, double int is too fat. 99.something% of 
all constants fit in 1 hwi and that is likely to be true forever (i 
understand that tree vpn may need some thought here).  The rtl level 
has, for as long as i have known it, had 2 reps for integer 
constants. So it was relatively easy to slide the CONST_WIDE_INT in.  It 
seems like the right trickery here rather than adding a storage model 
for wide-ints might be a way to use the c++ to invisibly support several 
(and by several i really mean 2) classes of TREE_CSTs.





This is not a beauty contest argument, we have public ports that are beginning to
use modes that are larger than two x86-64 HWIs and i have a private port
that has such modes and it is my experience that any pass that uses this
interface has one of three behaviors: it silently gets the wrong answer, it
ices, or it fails to do the transformation.  If we leave double_int as an
available option, then any use of it potentially will have one of these
three behaviors.  And so one of my strong objections to this direction is
that i do not want to fight this kind of bug for the rest of my life.
Having a single storage model that just always works is in my opinion a
highly desirable option.  What you have never answered in a concrete manner
is, if we decide to provide this generality, what it would be used for.
There is no place in a portable compiler where the right answer for every
target is two HOST wide integers.

However, i will admit that the HWI option has some merits.   We try to
address this in our implementation by dividing what is done inline in
wide-int.h into the cases that fit in an HWI, only dropping into the heavy
code in wide-int.c if mode is larger (which it rarely will be).   However, a
case could be made that for certain kinds of things like string lengths and
such, we could use another interface or as you argue, a different storage
model with the same interface.   I just do not see that the cost of the
conversion code is really going to show up on anyone's radar.
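The inline/out-of-line split described here can be sketched as follows. The routine names and the trivial fallback body are hypothetical, not the patch's actual functions; the shape is what matters: the single-HWI case, which the message says covers nearly all constants, never leaves the header.

```cpp
#include <cassert>

typedef long long hwi;

// Out-of-line fallback, which in the real patch would live in wide-int.c.
// For this sketch it only combines the low words.
hwi add_large (const hwi *a, unsigned int alen,
               const hwi *b, unsigned int blen)
{
  (void) alen;
  (void) blen;
  return a[0] + b[0];
}

// Inline fast path, which would live in wide-int.h: single-HWI operands
// are by far the common case, so they are handled without a call.
inline hwi add_fast (const hwi *a, unsigned int alen,
                     const hwi *b, unsigned int blen)
{
  if (alen == 1 && blen == 1)
    return a[0] + b[0];                   // common case, fully inline
  return add_large (a, alen, b, blen);    // rare: drop into wide-int.c
}
```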

What's the issue with abstracting away the model so a fixed-size 'len'
is possible?  (let away the argument that this would easily allow an
adaptor to tree)
I have a particularly pessimistic perspective because i have already 
written most of this patch.   It is not that i do not want to change 
that code, it is that i have seen a certain set of mistakes that were 
made and i do not want to fix them more than once.   At the rtl level 
you can see the transition from only supporting 32 bit ints to 
supporting 64 bit 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-03 Thread Richard Biener
On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

 On 04/03/2013 05:17 AM, Richard Biener wrote:

 In the end you will have a variable-size storage in TREE_INT_CST thus
 you will have at least to emit _code_ copying over meta-data and data
 from the tree representation to the wide-int (similar for RTX
 CONST_DOUBLE/INT).
 I'm objecting to the amount of code you emit and agree that the runtime
 cost is copying the meta-data (hopefully optimizable via CSE / SRA)
 and in most cases one (or two) iterations of the loop copying the data
 (not optimizable).

 i did get rid of the bitsize in the wide-int patch so at this point the meta
 data is the precision and the len.
 not really a lot here.   As usual we pay a high price in gcc for not pushing
the tree rep down into the rtl level; had we done so, it would have been acceptable to
 have the tree type bleed into the wide-int code.



 2)  You present this as if the implementor actually should care about the
 implementation and you give 3 alternatives:  the double_int, the current
 one, and HWI. We have tried to make it so that the client should not
 care.   Certainly in my experience here, I have not found a place to
 care.

 Well, similar as for the copying overhead for tree your approach requires
 overloading operations for HOST_WIDE_INT operands to be able to
 say wi + 1 (which is certainly desirable), or the overhead of using
 wide_int_one ().

 In my opinion double_int needs to go away.  That is the main thrust of my
 patches.   There is no place in a compiler for an abi that depends on
 constants fitting into two words whose size is defined by the host.

 That's true.  I'm not arguing to preserve double-int - I'm arguing to
 preserve a way to ask for an integer type on the host with (at least)
 N bits.  Almost all double-int users really ask for an integer type on the
 host that has at least as many bits as the pointer representation (or
 word_mode) on
 the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer
 targets).  No double-int user specifically wants 2 * HOST_WIDE_INT
 precision - that is just what happens to be there.  Thus I am providing
 a way to say get me a host integer with at least N bits (VRP asks for
 this, for example).

 What I was asking for is that whatever can provide the above should share
 the functional interface with wide-int (or the other way around).  And I
 was claiming that wide-int is too fat, because current users of double-int
 eventually store double-ints permanently.

 The problem is that, in truth, double int is too fat. 99.something% of all
 constants fit in 1 hwi and that is likely to be true forever (i understand
 that tree vpn may need some thought here).  The rtl level, which has, for as
 long as i have known it, had 2 reps for integer constants. So it was
 relatively easy to slide the CONST_WIDE_INT in.  It seems like the right
 trickery here rather than adding a storage model for wide-ints might be a
 way to use the c++ to invisibly support several (and by several i really
 mean 2) classes of TREE_CSTs.

The truth is that _now_ TREE_INT_CSTs use double-ints and we have
CONST_INT and CONST_DOUBLE.  What I (and you) propose would
get us to use variable-size storage for both, allowing to just use a single
HOST_WIDE_INT in the majority of cases.  In my view the constant
length of the variable-size storage for TREE_INT_CSTs is determined
by its type (thus, it doesn't have optimized variable-size storage
but unoptimized fixed-size storage based on the maximum storage
requirement for the type).  Similar for RTX CONST_INT which would
have fixed-size storage based on the mode-size of the constant.
Using optimized space (thus using the encoding properties) requires you
to fit the 'short len' somewhere which possibly will not pay off in the end
(for tree we do have that storage available, so we could go with optimized
storage for it, not sure with RTL, I don't see available space there).
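The "encoding properties" weighed in this paragraph refer to the compressed scheme where a constant of wide precision is stored in only `len` HWIs and the missing high blocks are implied by sign extension of the top stored block. A hedged sketch of the read side, with made-up names:

```cpp
#include <cassert>

typedef long long hwi;

// Compressed multi-word constant: only the low `len` blocks are stored.
// Any block at index >= len is the sign extension of block len - 1,
// so small values of wide types need just one HWI of storage.
hwi get_block (const hwi *val, unsigned int len, unsigned int i)
{
  if (i < len)
    return val[i];
  return val[len - 1] < 0 ? (hwi) -1 : 0;  // implied sign extension
}
```

The cost Richard is pointing at is the other side of this trade: the `short len` field itself has to be stored somewhere in the tree or RTX node.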

 This is not a beauty contest argument, we have public ports that are beginning
 to
 use modes that are larger than two x86-64 HWIs and i have a private port
 that has such modes and it is my experience that any pass that uses this
 interface has one of three behaviors: it silently gets the wrong answer,
 it
 ices, or it fails to do the transformation.  If we leave double_int as an
 available option, then any use of it potentially will have one of these
 three behaviors.  And so one of my strong objections to this direction is
 that i do not want to fight this kind of bug for the rest of my life.
 Having a single storage model that just always works is in my opinion a
 highly desirable option.  What you have never answered in a concrete
 manner
 is, if we decide to provide this generality, what it would be used for.
 There is no place in a portable compiler where the right answer for every
 target is two HOST wide integers.

 However, i will admit that the HWI option has some merits.   We try to
 address this 

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-03 Thread Kenneth Zadeck

On 04/03/2013 09:53 AM, Richard Biener wrote:

On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

On 04/03/2013 05:17 AM, Richard Biener wrote:


In the end you will have a variable-size storage in TREE_INT_CST thus
you will have at least to emit _code_ copying over meta-data and data
from the tree representation to the wide-int (similar for RTX
CONST_DOUBLE/INT).
I'm objecting to the amount of code you emit and agree that the runtime
cost is copying the meta-data (hopefully optimizable via CSE / SRA)
and in most cases one (or two) iterations of the loop copying the data
(not optimizable).

i did get rid of the bitsize in the wide-int patch so at this point the meta
data is the precision and the len.
not really a lot here.   As usual we pay a high price in gcc for not pushing
the tree rep down into the rtl level; had we done so, it would have been acceptable to
have the tree type bleed into the wide-int code.




2)  You present this as if the implementor actually should care about the
implementation and you give 3 alternatives:  the double_int, the current
one, and HWI. We have tried to make it so that the client should not
care.   Certainly in my experience here, I have not found a place to
care.

Well, similar as for the copying overhead for tree your approach requires
overloading operations for HOST_WIDE_INT operands to be able to
say wi + 1 (which is certainly desirable), or the overhead of using
wide_int_one ().


In my opinion double_int needs to go away.  That is the main thrust of my
patches.   There is no place in a compiler for an abi that depends on
constants fitting into two words whose size is defined by the host.

That's true.  I'm not arguing to preserve double-int - I'm arguing to
preserve a way to ask for an integer type on the host with (at least)
N bits.  Almost all double-int users really ask for an integer type on the
host that has at least as many bits as the pointer representation (or
word_mode) on
the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer
targets).  No double-int user specifically wants 2 * HOST_WIDE_INT
precision - that is just what happens to be there.  Thus I am providing
a way to say get me a host integer with at least N bits (VRP asks for
this, for example).

What I was asking for is that whatever can provide the above should share
the functional interface with wide-int (or the other way around).  And I
was claiming that wide-int is too fat, because current users of double-int
eventually store double-ints permanently.

The problem is that, in truth, double int is too fat. 99.something% of all
constants fit in 1 hwi and that is likely to be true forever (i understand
that tree vpn may need some thought here).  The rtl level has, for as 
long as i have known it, had 2 reps for integer constants. So it was
relatively easy to slide the CONST_WIDE_INT in.  It seems like the right
trickery here rather than adding a storage model for wide-ints might be a
way to use the c++ to invisibly support several (and by several i really
mean 2) classes of TREE_CSTs.

The truth is that _now_ TREE_INT_CSTs use double-ints and we have
CONST_INT and CONST_DOUBLE.  What I (and you) propose would
get us to use variable-size storage for both, allowing to just use a single
HOST_WIDE_INT in the majority of cases.  In my view the constant
length of the variable-size storage for TREE_INT_CSTs is determined
by its type (thus, it doesn't have optimized variable-size storage
but unoptimized fixed-size storage based on the maximum storage
requirement for the type).  Similar for RTX CONST_INT which would
have fixed-size storage based on the mode-size of the constant.
Using optimized space (thus using the encoding properties) requires you
to fit the 'short len' somewhere which possibly will not pay off in the end
(for tree we do have that storage available, so we could go with optimized
storage for it, not sure with RTL, I don't see available space there).
There are two questions here: one is that you object to the fact that
we represent small constants efficiently, and the second is that we take
advantage of the fact that fixed-size stack allocation is effectively
free for short-lived objects like wide-ints (as I use them).


At the rtl level your idea does not work.  rtl constants do not have a
mode or type.  So if you do not compress, how are you going to
determine how many words you need for the constant 1?  I would love to
have a rep that had the mode in it, but that is a huge change that
requires a lot of hacking to every port.


I understand that this makes me vulnerable to the argument that we
should not let the rtl level ever dictate anything about the tree level,
but the truth is that a variable-length rep is almost always used for big
integers.  In our code, most constants of large types are small
numbers.  (Remember, I got into this because the tree constant prop
thinks that left shifting any number by anything

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-02 Thread Richard Biener
On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
zad...@naturalbridge.com wrote:
 This patch contains a large number of the changes requested by Richi.   It
 does not contain any of the changes that he requested to abstract the
 storage layer.   That suggestion appears to be quite unworkable.

I of course took this claim as a challenge ... with the following result.  It is
of course quite workable ;)

The attached patch implements the core wide-int class and three storage
models (fixed size for things like plain HWI and double-int, variable size
similar to how your wide-int works and an adaptor for the double-int as
contained in trees).  With that you can now do

HOST_WIDE_INT
wi_test (tree x)
{
  // template argument deduction doesn't do the magic we want it to do
  // to make this kind of implicit conversions work
  // overload resolution considers this kind of conversions so we
  // need some magic that combines both ... but seeding the overload
  // set with some instantiations doesn't seem to be possible :/
  // wide_int w = x + 1;
  wide_int w;
  w += x;
  w += 1;
  // template argument deduction doesn't deduce the return value type,
  // not considering the template default argument either ...
  // w = wi (x) + 1;
  // we could support this by providing rvalue-to-lvalue promotion
  // via a traits class?
  // otoh it would lead to sub-optimal code anyway so we should
  // make the result available as reference parameter and only support
  // wide_int  res; add (res, x, 1); ?
  w = wi (x).operator+<wide_int> (1);
  wide_int::add(w, x, 1);
  return w.to_hwi ();
}

we are somewhat limited with C++ unless we want to get really fancy.
Eventually providing operator+ just doesn't make much sense for
generic wide-int combinations (though then the issue is its operands
are no longer commutative which I think is the case with your wide-int
or double-int as well - they don't support 1 + wide_int for obvious reasons).
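The commutativity point can be seen with a toy type.  This is a purely illustrative stand-in (the struct "wide" is not GCC's class): with only a member operator+, the expression 1 + w has no viable overload, while free (non-member) overloads restore both operand orders.

```cpp
/* Toy wide-int stand-in used only to illustrate why mixed-operand
   addition needs non-member operators to be commutative.  */
struct wide
{
  long v;
};

/* Free overloads: both w + 1 and 1 + w are now well-formed, which a
   member operator+ alone could not provide.  */
static wide
operator+ (wide a, long b)
{
  wide r; r.v = a.v + b; return r;
}

static wide
operator+ (long a, wide b)
{
  wide r; r.v = a + b.v; return r;
}
```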

So there are implementation design choices left undecided.

Oh, and the operation implementations are crap (they compute nonsense).

But you should get the idea.

Richard.
#include "config.h"
#include "system.h"

#include "coretypes.h"
#include "hwint.h"
#include "tree.h"


/* ???  wide-int should probably use HOST_WIDEST_FAST_INT as storage,
   not HOST_WIDE_INT.  Yeah, we could even template on that ...  */

/* Fixed-length embedded storage.  wi_embed<2> is double-int,
   wi_embed<1> is a plain HOST_WIDE_INT.  Can be used for
   small fixed-(minimum)-size calculations on hosts that have
   no suitable integer type.  */

template <unsigned sz>
class wi_embed
{
private:
  HOST_WIDE_INT s[sz];

public:
  void construct () {}
  HOST_WIDE_INT* storage() { return s; }
  const HOST_WIDE_INT* storage() const { return s; }
  unsigned len() const { return sz; }
  void set_len(unsigned l) { gcc_checking_assert (l <= sz); }
};


/* Fixed maximum-length embedded storage but variable dynamic size.  */

//#define MAXSZ (4 * (MAX_MODE_INT_SIZE / HOST_BITS_PER_WIDE_INT))
#define MAXSZ 8

template <unsigned max_sz>
class wi_embed_var
{
private:
  unsigned len_;
  HOST_WIDE_INT s[max_sz];

public:
  void construct () { len_ = 0; }
  HOST_WIDE_INT* storage() { return s; }
  const HOST_WIDE_INT* storage() const { return s; }
  unsigned len() const { return len_; }
  void set_len(unsigned l) { len_ = l; }
};


/* The wide-int class.  Defaults to variable-length storage
   (alternatively use a typedef to avoid the need to write wide_int<>).  */

template <class S = wi_embed_var<MAXSZ> >
class wide_int;


/* Avoid constructors / destructors to make sure this is a C++04 POD.  */

/* Basic wide_int class.  The storage model allows for rvalue
   storage abstraction avoiding copying from for example tree
   or RTX and to avoid the need of explicit construction for
   integral arguments of up to HWI size.

   A storage model needs to provide the following methods:
- construct (), default-initialize the storage
- unsigned len () const, the size of the storage in HWI quantities
- const HOST_WIDE_INT *storage () const, return a pointer
  to read-only HOST_WIDE_INT storage of size len ().
- HOST_WIDE_INT *storage (), return a pointer to writable
  HOST_WIDE_INT storage of size len ().  This method is optional.
- void set_len (unsigned l), adjust the size of the storage
  to at least l HWI words.

   Conversions of wide_int _to_ tree or RTX or HWI are explicit.

   Conversions to wide_int happen with overloads to the global
   function template wi () or via wide_int_traits specializations.  */

/* ???  With mixed length operations there are encoding issues
   for signed vs. unsigned numbers.  The easiest encoding is to
   say wide-ints are always signed which means that -1U needs
   the MSB of the wide-int storage as zero which means an extra
   word with zeros.  The sign-bit of a wide-int is then always
   storage()[len() - 1] & (1 << (HOST_BITS_PER_WIDE_INT - 1)).  */

template <class S>
class wide_int : private S
{
  /* Allow 
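The always-signed encoding described in the ??? comment above can be sketched as follows.  This is an illustration, not the patch's code (hwi stands in for HOST_WIDE_INT): the sign of a value is the MSB of its topmost significant word, which is why an unsigned all-ones value needs an extra zero word on top.

```cpp
typedef long hwi;  /* illustrative stand-in for HOST_WIDE_INT */
static const int HWI_BITS = 8 * sizeof (hwi);

/* Sign bit under the always-signed encoding: the most significant
   bit of the top significant word, storage[len - 1].  */
static bool
negative_p (const hwi *storage, unsigned len)
{
  return (storage[len - 1] >> (HWI_BITS - 1)) & 1;
}
```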

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-04-02 Thread Kenneth Zadeck
Yes, I agree that you win the challenge that it can be done.  What you
have always failed to address is why anyone would want to do this, or
how this would be at all desirable.  But I completely agree that from
a purely abstract point of view you can add a storage model.


Now here is why we REALLY do not want to go down this road:

1)  The following comment from your earlier mail is completely wrong


+#ifdef NEW_REP_FOR_INT_CST
+  /* This is the code once the tree level is converted.  */
+  wide_int result;
+  int i;
+
+  tree type = TREE_TYPE (tcst);
+
+  result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type));
+  result.precision = TYPE_PRECISION (type);
+  result.len = TREE_INT_CST_LEN (tcst);
+  for (i = 0; i < result.len; i++)
+    result.val[i] = TREE_INT_CST_ELT (tcst, i);
+
+  return result;
+#else



this also shows the main reason I was asking for storage abstraction.
The initialization from tree is way too expensive.


In almost all cases, constants will fit in a single HWI.  Thus, the only
thing that you are copying is the length and a single HWI.  So you are
dragging in a lot of machinery just to save these two copies?
Certainly there has to be more to it than that.


2)  You present this as if the implementor actually should care about
the implementation, and you give three alternatives: the double_int
one, the current one, and HWI.  We have tried to make it so that the
client should not care.  Certainly in my experience here, I have not
found a place that needs to care.


In my opinion double_int needs to go away.  That is the main thrust of
my patches.  There is no place in a compiler for an abi that depends on
constants fitting into two words whose size is defined by the host.
This is not a beauty contest argument: we have public ports that are
beginning to use modes that are larger than two x86-64 HWIs, and I have a
private port that has such modes, and it is my experience that any pass
that uses this interface has one of three behaviors: it silently gets
the wrong answer, it ICEs, or it fails to do the transformation.  If we
leave double_int as an available option, then any use of it potentially
will have one of these three behaviors.  And so one of my strong
objections to this direction is that I do not want to fight this kind of
bug for the rest of my life.  Having a single storage model that just
always works is in my opinion a highly desirable option.  What you have
never answered in a concrete manner is: if we decide to provide this
generality, what would it be used for?  There is no place in a
portable compiler where the right answer for every target is two HOST
wide integers.


However, I will admit that the HWI option has some merits.  We try to
address this in our implementation by dividing what is done inline in
wide-int.h into the cases that fit in an HWI, and then only dropping into
the heavy code in wide-int.c if the mode is larger (which it rarely will
be).  However, a case could be made that for certain kinds of things like
string lengths and such, we could use another interface or, as you argue,
a different storage model with the same interface.  I just do not see
that the cost of the conversion code is really going to show up on
anyone's radar.


3) Your trick will work at the tree level, but not at the rtl level.
The wide-int code cannot share storage with the CONST_INTs.  We tried
this, and there are a million bugs that would have to be fixed to make
it work.  It could have worked if CONST_INTs had carried a mode
around, but since they do not, you end up with the same CONST_INT
sharing the rep for several different types, and that just did not work
unless you were willing to do substantial cleanups.


On 04/02/2013 11:04 AM, Richard Biener wrote:

On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

This patch contains a large number of the changes requested by Richi.   It
does not contain any of the changes that he requested to abstract the
storage layer.   That suggestion appears to be quite unworkable.

I of course took this claim as a challenge ... with the following result.  It is
of course quite workable ;)

[... quoted text identical to the message above snipped ...]

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

2013-03-27 Thread Richard Biener
On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
zad...@naturalbridge.com wrote:
 This patch contains a large number of the changes requested by Richi.   It
 does not contain any of the changes that he requested to abstract the
 storage layer.   That suggestion appears to be quite unworkable.

 I believe that the wide-int class addresses the needs of gcc for performing
 math on any size integer regardless of the platform that hosts the
 compiler.  The interface is admittedly large, but it is large for a reason:
 these are the operations that are commonly performed by the client
 optimizations in the compiler.

 I would like to get this patch preapproved for the next stage 1.

Please clean from dead code like

+// using wide_int::;

and

+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wh ("wide_int::from_shwi %s " HOST_WIDE_INT_PRINT_HEX ")\n",
+             result, op0);
+#endif

and

+#ifdef NEW_REP_FOR_INT_CST
+  /* This is the code once the tree level is converted.  */
+  wide_int result;
+  int i;
+
+  tree type = TREE_TYPE (tcst);
+
+  result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type));
+  result.precision = TYPE_PRECISION (type);
+  result.len = TREE_INT_CST_LEN (tcst);
+  for (i = 0; i < result.len; i++)
+    result.val[i] = TREE_INT_CST_ELT (tcst, i);
+
+  return result;
+#else

this also shows the main reason I was asking for storage abstraction.
The initialization from tree is way too expensive.

+/* Convert an integer cst into a wide int expanded to BITSIZE and
+   PRECISION.  This call is used by tree passes like vrp that expect
+   that the math is done in an infinite precision style.  BITSIZE and
+   PRECISION are generally determined to be twice the largest type
+   seen in the function.  */
+
+wide_int
+wide_int::from_tree_as_infinite_precision (const_tree tcst,
+  unsigned int bitsize,
+  unsigned int precision)
+{

I know you have converted everything, but to make this patch reviewable
I'd like you to strip the initial wide_int down to a bare minimum.

Only then people will have a reasonable chance to play with interface
changes (such as providing a storage abstraction).

+/* Check the upper HOST_WIDE_INTs of src to see if the length can be
+   shortened.  An upper HOST_WIDE_INT is unnecessary if it is all ones
+   or zeros and the top bit of the next lower word matches.
+
+   This function may change the representation of THIS, but does not
+   change the value that THIS represents.  It does not sign extend in
+   the case that the size of the mode is less than
+   HOST_BITS_PER_WIDE_INT.  */
+
+void
+wide_int::canonize ()

this shouldn't be necessary - it's an optimization - and due to value
semantics (yes - I know you have a weird mix of value semantics
and modify-in-place in wide_int) the new length should be computed
transparently when creating a new value.
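For concreteness, the shortening rule that the quoted comment describes (a top word is redundant when it is all zeros or all ones and agrees with the sign bit of the word below it) might look like the sketch below.  Names are illustrative, not wide-int.c's.

```cpp
typedef long hwi;  /* illustrative stand-in for HOST_WIDE_INT */
static const int HWI_BITS = 8 * sizeof (hwi);

/* Return the canonical length of VAL: drop each top word that is just
   the sign extension of the word below it (all zeros or all ones,
   matching that word's sign bit).  Relies on arithmetic right shift
   of signed values, as GCC's own sources do.  */
static unsigned
canonical_len (const hwi *val, unsigned len)
{
  while (len > 1)
    {
      hwi below_sign = val[len - 2] >> (HWI_BITS - 1);  /* 0 or -1 */
      if (val[len - 1] != below_sign)
        break;
      len--;
    }
  return len;
}
```

Note how the unsigned all-ones case keeps its extra zero word: the top word is zero but the word below has its sign bit set, so the two disagree and nothing is dropped.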

Well.  Leaving wide-int.c for now.

+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target.  The full multiplication is use by tree-vrp.  tree-vpn
+ currently does a 2x largest mode by 2x largest mode yielding a 4x
+ largest mode result.  If operations are added that require larger
+ buffers, then VAL needs to be changed.  */
+  HOST_WIDE_INT val[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

as you cover partial int modes in MAX_BITSIZE_MODE_ANY_INT the
above may come up short.  Please properly round up.
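The rounding being asked for is ordinary round-up division; a sketch with an assumed 64-bit host word:

```cpp
static const int HOST_BITS_PER_WIDE_INT = 64;  /* assumed host word size */

/* Number of HOST_WIDE_INTs needed for BITS bits, rounding up so that
   partial-int modes whose bitsize is not a multiple of the word size
   still get a full word for the remainder.  */
static int
words_for_bits (int bits)
{
  return (bits + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT;
}
```

Plain truncating division would give 130 / 64 == 2 and lose the top two bits; the rounded form yields 3.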

+  unsigned short len;
+  unsigned int bitsize;
+  unsigned int precision;

I see we didn't get away with this mix of bitsize and precision.  I'm probably
going to try to revisit the past discussions - but can you point me to a single
place in the RTL conversion where they make a difference?  Bits beyond
precision are either undefined or properly zero-/sign-extended.  Implicit
extension beyond the len val members should then provide valid bits
up to bitsize (if anyone cares).  That's how double-ints work on tree
INTEGER_CSTs
which only care for precision, even with partial integer mode types
(ok, I never came across one of these beasts - testcase / target?).

[abstraction possibility - have both wide_ints with actual mode and
wide_ints with arbitrary bitsize/precision]

+  enum ShiftOp {
+NONE,
+/* There are two uses for the wide-int shifting functions.  The
+   first use is as an emulation of the target hardware.  The
+   second use is as service routines for other optimizations.  The
+   first case needs to be identified by passing TRUNC as the value
+   of ShiftOp so that shift amount is properly handled according to the
+   SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
+   amount is always truncated by the bytesize of the mode of
+   THIS.  */
+TRUNC
+  };

I think I have expressed my opinion on this.  (and SHIFT_COUNT_TRUNCATED
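For reference, the TRUNC emulation that the quoted ShiftOp comment describes amounts to reducing the shift count modulo the operand width rather than performing an out-of-range (undefined) shift.  A hedged sketch under that assumption, with uhwi standing in for an unsigned HOST_WIDE_INT:

```cpp
typedef unsigned long uhwi;  /* stand-in for unsigned HOST_WIDE_INT */
static const unsigned HWI_BITS = 8 * sizeof (uhwi);

/* Shift with the count truncated to the operand width, mimicking a
   SHIFT_COUNT_TRUNCATED target: a count of HWI_BITS acts like 0,
   HWI_BITS + 1 like 1, and so on.  */
static uhwi
shift_left_trunc (uhwi x, unsigned count)
{
  return x << (count % HWI_BITS);
}
```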

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Sandiford
Richard Biener richard.guent...@gmail.com writes:
 On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 On 10/25/2012 06:42 AM, Richard Biener wrote:

 On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote:

 On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com
 wrote:

 On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 On 10/23/2012 10:12 AM, Richard Biener wrote:

 +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
 HOST_BITS_PER_WIDE_INT];

 are we sure this rounds properly?  Consider a port with max byte mode
 size 4 on a 64bit host.

 I do not believe that this can happen.   The core compiler includes all
 modes up to TI mode, so by default we already go up to 128 bits.

 And mode bitsizes are always power-of-two?  I suppose so.

 Actually, no, they are not.  Partial int modes can have bit sizes that
 are not power of two, and, if there isn't an int mode that is bigger, we'd
 want to round up the partial int bit size.  Something like ((2 *
 MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
 HOST_BITS_PER_WIDE_INT should do it.

 I still would like to have the ability to provide specializations of
 wide_int
 for small sizes, thus ideally wide_int would be a template templated
 on the number of HWIs in val.  Interface-wise wide_int<2> should be
 identical to double_int, thus we should be able to do

 typedef wide_int<2> double_int;

 If you want to go down this path after the patches get in, go for it.
 I see no use at all for this.
 This was not meant to be a plug-in replacement for double int.  The
 goal of this patch is to get the compiler to do the constant math the
 way that the target does it.  Any such instantiation is by definition
 placing some predefined limit that some target may not want.

 Well, what I don't really like is that we now have two implementations
 of functions that perform integer math on two-HWI sized integers.  What
 I also don't like too much is that we have two different interfaces to
 operate
 on them!  Can't you see how I come to not liking this?  Especially the
 latter …

 double_int is logically dead.  Refactoring wide-int and double-int is a
 waste of time, as the time is better spent removing double-int from the
 compiler.  All the necessary semantics and code of double-int _has_ been
 refactored into wide-int already.  Changing wide-int in any way to vend
 anything to double-int is wrong, as once double-int is removed, then all 
 the
 api changes to make double-int share from wide-int is wasted and must then
 be removed.  The path forward is the complete removal of double-int; it is
 wrong, has been wrong and always will be wrong, nothing can change that.

 double_int, compared to wide_int, is fast and lean.  I doubt we will
 get rid of it - you will make compile-time math a _lot_ slower.  Just
 profile when you for example change get_inner_reference to use wide_ints.

 To be able to remove double_int in favor of wide_int requires _at least_
 templating wide_int on 'len' and providing specializations for 1 and 2.

 It might be a non-issue for math that operates on trees or RTXen due to
 the allocation overhead we pay, but in recent years we transitioned
 important
 paths away from using tree math to using double_ints _for speed reasons_.

 Richard.

 I do not know why you believe this about the speed.  double_int always
 does synthetic math since you do everything at 128-bit precision.

 the thing about wide int, is that since it does math to the precision's
 size, it almost never uses synthetic operations since the sizes for
 almost every instance can be done using the native math on the machine.
 almost every call has a check to see if the operation can be done natively.
 I seriously doubt that you are going to do TI mode math much faster than i
 do it and if you do who cares.

 the number of calls does not affect the performance in any negative way and
 in fact is more efficient since common things that require more than one
 operation in double-int are typically done in a single operation.

 Simple double-int operations like

 inline double_int
 double_int::and_not (double_int b) const
 {
   double_int result;
   result.low = low & ~b.low;
   result.high = high & ~b.high;
   return result;
 }

 are always going to be faster than conditionally executing only one operation
 (but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more
appropriate here.

It's interesting that you gave this example, since before you were
complaining about too many fused ops.  Clearly this one could be
removed in favour of separate and() and not() operations, but why
not provide a fused one if there are clients who'll make use of it?
I think Kenny's API is just taking that to its logical conclusion.
There doesn't seem to be anything sacrosanct about the current choice
of what's fused and what isn't.
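The fused operation under discussion computes a & ~b in one pass; over a multi-word value, separate not() and and() steps would traverse the words twice.  A sketch of the fused form (illustrative, not double-int's actual code):

```cpp
typedef unsigned long uhwi;  /* stand-in for unsigned HOST_WIDE_INT */

/* Fused and_not over LEN-word operands: a single traversal computing
   r = a & ~b, instead of a full not() pass followed by an and() pass.  */
static void
and_not (uhwi *r, const uhwi *a, const uhwi *b, unsigned len)
{
  for (unsigned i = 0; i < len; i++)
    r[i] = a[i] & ~b[i];
}
```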

The speed problem we had using trees 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Biener
On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Richard Biener richard.guent...@gmail.com writes:
 On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 [... quoted text identical to the message above snipped ...]
 Simple double-int operations like

 inline double_int
 double_int::and_not (double_int b) const
 {
   double_int result;
   result.low = low & ~b.low;
   result.high = high & ~b.high;
   return result;
 }

 are always going to be faster than conditionally executing only one operation
 (but inside an offline function).

 OK, this is really in reply to the 4.8 thing, but it felt more
 appropriate here.

 It's interesting that you gave this example, since before you were
 complaining about too many fused ops.  Clearly this one could be
 removed in favour of separate and() and not() operations, but why
 not provide a fused one if there are clients who'll make use of it?

I was more concerned about fused operations that use precision
or bitsize as input.  That is for example

 +  bool 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Sandiford
Richard Biener richard.guent...@gmail.com writes:
 On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Richard Biener richard.guent...@gmail.com writes:
 On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 [... quoted text identical to the message above snipped ...]
 Simple double-int operations like

 inline double_int
 double_int::and_not (double_int b) const
 {
   double_int result;
   result.low = low & ~b.low;
   result.high = high & ~b.high;
   return result;
 }

 are always going to be faster than conditionally executing only one 
 operation
 (but inside an offline function).

 OK, this is really in reply to the 4.8 thing, but it felt more
 appropriate here.

 It's interesting that you gave this example, since before you were
 complaining about too many fused ops.  Clearly this one could be
 removed in favour of separate and() and not() operations, but why
 not provide a fused one if there are clients who'll make use of it?

 I was more concerned about fused operations that use 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Biener
On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Richard Biener richard.guent...@gmail.com writes:
 On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Richard Biener richard.guent...@gmail.com writes:
 On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 On 10/25/2012 06:42 AM, Richard Biener wrote:

 On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net 
 wrote:

 On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com
 wrote:

 On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 On 10/23/2012 10:12 AM, Richard Biener wrote:

 +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
 HOST_BITS_PER_WIDE_INT];

 are we sure this rounds properly?  Consider a port with max byte mode
 size 4 on a 64bit host.

 I do not believe that this can happen.  The core compiler includes all
 modes up to TI mode, so by default we already go up to 128 bits.

 And mode bitsizes are always power-of-two?  I suppose so.

 Actually, no, they are not.  Partial int modes can have bit sizes that
 are not power of two, and, if there isn't an int mode that is bigger, 
 we'd
 want to round up the partial int bit size.  Something like ((2 *
 MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
 HOST_BITS_PER_WIDE_INT) should do it.

 I still would like to have the ability to provide specializations of
 wide_int for small sizes, thus ideally wide_int would be a template
 templated on the number of HWIs in val.  Interface-wise wide_int<2>
 should be identical to double_int, thus we should be able to do

 typedef wide_int<2> double_int;

 If you want to go down this path after the patches get in, go for it.
 I see no use at all for this.
 This was not meant to be a plug-in replacement for double_int.  The
 goal of this patch is to get the compiler to do the constant math the
 way that the target does it.  Any such instantiation is by definition
 placing some predefined limit that some target may not want.

 Well, what I don't really like is that we now have two implementations
 of functions that perform integer math on two-HWI sized integers.  What
 I also don't like too much is that we have two different interfaces to
 operate
 on them!  Can't you see how I come to not liking this?  Especially the
 latter …

 double_int is logically dead.  Refactoring wide-int and double-int is a
 waste of time, as the time is better spent removing double-int from the
 compiler.  All the necessary semantics and code of double-int _has_ been
 refactored into wide-int already.  Changing wide-int in any way to vend
 anything to double-int is wrong, as once double-int is removed,
 then all the
 API changes to make double-int share from wide-int are wasted and must
 then
 be removed.  The path forward is the complete removal of double-int; it 
 is
 wrong, has been wrong and always will be wrong, nothing can change that.

 double_int, compared to wide_int, is fast and lean.  I doubt we will
 get rid of it - you
 will make compile-time math a _lot_ slower.  Just profile when you for
 example
 change get_inner_reference to use wide_ints.

 To be able to remove double_int in favor of wide_int requires _at least_
 templating wide_int on 'len' and providing specializations for 1 and 2.

 It might be a non-issue for math that operates on trees or RTXen due to
 the allocation overhead we pay, but in recent years we transitioned
 important
 paths away from using tree math to using double_ints _for speed reasons_.

 Richard.

 i do not know why you believe this about the speed.  double_int always
 does synthetic math, since you do everything at 128-bit precision.

 the thing about wide int is that, since it does math to the precision's
 size, it almost never uses synthetic operations, since the sizes for
 almost every instance can be done using the native math on the machine.
 almost every call has a check to see if the operation can be done
 natively.  I seriously doubt that you are going to do TI mode math much
 faster than i do it, and if you do, who cares.

 the number of calls does not affect the performance in any negative way,
 and in fact is more efficient since common things that require more than
 one operation in double_int are typically done in a single operation.

 Simple double-int operations like

 inline double_int
 double_int::and_not (double_int b) const
 {
   double_int result;
   result.low = low & ~b.low;
   result.high = high & ~b.high;
   return result;
 }

 are always going to be faster than conditionally executing only one 
 operation
 (but inside an offline function).

 OK, this is really in reply to the 4.8 thing, but it felt more
 appropriate here.

 It's interesting that you gave this example, since before you were
 complaining about too many fused ops.  Clearly this one could be
 removed in favour of separate and() and not() operations, but why
 not provide a fused one if 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Sandiford

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Biener
Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Sandiford
Richard Biener richard.guent...@gmail.com writes:
 But that means that wide_int has to model a P-bit operation as a
 normal len*HOST_WIDE_INT operation and then fix up the result
 after the fact, which seems unnecessarily convoluted.

 It does that right now.  The operations are carried out in a loop
 over len HOST_WIDE_INT parts, the last HWI is then special-treated
 to account for precision/size.  (yes, 'len' is also used as optimization - the
 fact that len ends up being mutable is another thing I dislike about
 wide-int.  If wide-ints are cheap then all ops should be non-mutating
 (at least to 'len')).

But the point of having a mutating len is that things like zero and -1
are common even for OImode values.  So if you're doing something potentially
expensive like OImode multiplication, why do it to the number of
HOST_WIDE_INTs needed for an OImode value when the value we're
processing has only one significant HOST_WIDE_INT?

  I still don't
 see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision
 X*HOST_WIDE_INT operation for any X) has any special meaning.

 Well, the same reason as a HOST_WIDE_INT variable has a meaning.
 We use it to constrain what we (efficiently) want to work on.  For example
 CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when
 doing bit-constant-propagation in loops (for TImode integers on a x86_64 
 host).

But what about targets with modes wider than TImode?  Would double_int
still be appropriate then?  If not, why does CCP have to use a templated
type with a fixed number of HWIs (and all arithmetic done to a fixed
number of HWIs) rather than one that can adapt to the runtime values,
like wide_int can?

 Oh, and I don't necessary see a use of double_int in its current form
 but for an integer representation on the host that is efficient to manipulate
 integer constants of a target dependent size.  For example the target
 detail that we have partial integer modes with bitsize > precision and that
 the bits > precision apparently have a meaning when looking at the
 bit-representation of a constant should not be part of the base class
 of wide-int (I doubt it belongs to wide-int at all, but I guess you know more
 about the reason we track bitsize in addition to precision - I think it's
 abstraction at the wrong level, the tree level does fine without knowing
 about bitsize).

TBH I'm uneasy about the bitsize thing too.  I think bitsize is only
tracked for shift truncation, and if so, I agree it makes sense
to do that separately.

But anyway, this whole discussion seems to have reached a stalemate.
Or I suppose a de-facto rejection, since you're the only person in
a position to approve the thing :-)

Richard


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


It's interesting that you gave this example, since before you were
complaining about too many fused ops.  Clearly this one could be
removed in favour of separate and() and not() operations, but why
not provide a fused one if there are clients who'll make use of it?

I was more 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Biener
On Wed, Oct 31, 2012 at 2:30 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Richard Biener richard.guent...@gmail.com writes:
 But that means that wide_int has to model a P-bit operation as a
 normal len*HOST_WIDE_INT operation and then fix up the result
 after the fact, which seems unnecessarily convoluted.

 It does that right now.  The operations are carried out in a loop
 over len HOST_WIDE_INT parts, the last HWI is then special-treated
 to account for precision/size.  (yes, 'len' is also used as optimization - 
 the
 fact that len ends up being mutable is another thing I dislike about
 wide-int.  If wide-ints are cheap then all ops should be non-mutating
 (at least to 'len')).

 But the point of having a mutating len is that things like zero and -1
 are common even for OImode values.  So if you're doing something potentially
 expensive like OImode multiplication, why do it to the number of
 HOST_WIDE_INTs needed for an OImode value when the value we're
 processing has only one significant HOST_WIDE_INT?

I don't propose doing that.  I propose that no wide-int member function
may _change_ its len (to something larger).  Only that way you can
avoid allocating wasted space for zero and -1.  That way also the
artificial limit on 2 * largest-int-mode-hwis goes.

  I still don't
 see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision
 X*HOST_WIDE_INT operation for any X) has any special meaning.

 Well, the same reason as a HOST_WIDE_INT variable has a meaning.
 We use it to constrain what we (efficiently) want to work on.  For example
 CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when
 doing bit-constant-propagation in loops (for TImode integers on a x86_64 
 host).

 But what about targets with modes wider than TImode?  Would double_int
 still be appropriate then?  If not, why does CCP have to use a templated
 type with a fixed number of HWIs (and all arithmetic done to a fixed
 number of HWIs) rather than one that can adapt to the runtime values,
 like wide_int can?

Because nobody cares about accurate bit-tracking for modes larger than
TImode.  And because no convenient abstraction was available ;)

 Oh, and I don't necessary see a use of double_int in its current form
 but for an integer representation on the host that is efficient to manipulate
 integer constants of a target dependent size.  For example the target
 detail that we have partial integer modes with bitsize > precision and that
 the bits > precision apparently have a meaning when looking at the
 bit-representation of a constant should not be part of the base class
 of wide-int (I doubt it belongs to wide-int at all, but I guess you know more
 about the reason we track bitsize in addition to precision - I think it's
 abstraction at the wrong level, the tree level does fine without knowing
 about bitsize).

 TBH I'm uneasy about the bitsize thing too.  I think bitsize is only
 tracked for shift truncation, and if so, I agree it makes sense
 to do that separately.

So, can we please remove all traces of bitsize from wide-int then?

 But anyway, this whole discussion seems to have reached a stalemate.
 Or I suppose a de-facto rejection, since you're the only person in
 a position to approve the thing :-)

There are many (silent) people that are able to approve the thing.  But the
point is I have too many issues with the current patch that I'm unable
to point at a specific thing I want Kenny to change after which the patch
would be fine.  So I rely on some guesswork from Kenny, giving my
advices (leaner API, less fused ops, get rid of bitsize, think of
abstracting the core HWI[len] operation, there should be no tree or
RTL dependencies in the wide-int API) to produce an updated variant.
Which of course takes time, which of course crosses my vacation, which
in the end means it isn't going to make 4.8 (I _do_ like the idea of not
having a dependence on host properties for integer constant representation).

Btw, a good hint at what a minimal wide-int API would look like is if
you _just_ replace double-int users with it.  Then you obviously have to
implement only the double-int interface and conversion from/to double-int.

Richard.


 Richard


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Richard Biener

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 10:05 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 2:54 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

On 10/31/2012 08:11 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:

Richard Biener richard.guent...@gmail.com writes:

On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:

Richard Biener richard.guent...@gmail.com writes:

On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

On 10/25/2012 06:42 AM, Richard Biener wrote:

On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net
wrote:

On Oct 24, 2012, at 2:43 AM, Richard Biener
richard.guent...@gmail.com
wrote:

On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte
mode
size 4 on a 64bit host.

I do not believe that this can happen.  The core compiler includes all
modes up to TI mode, so by default we already go up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

Actually, no, they are not.  Partial int modes can have bit sizes that
are not a power of two, and, if there isn't an int mode that is bigger,
we'd want to round up the partial int bit size.  Something like
(2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
HOST_BITS_PER_WIDE_INT should do it.


I still would like to have the ability to provide
specializations of wide_int
for small sizes, thus ideally wide_int would be a template
templated
on the number of HWIs in val.  Interface-wise wide_int<2> should
be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for
it.  I see no use at all for this.
This was not meant to be a plug-in replacement for double_int.
The goal of this patch is to get the compiler to do the constant
math the way that the target does it.  Any such instantiation is
by definition placing some predefined limit that some target may
not want.

Well, what I don't really like is that we now have two
implementations
of functions that perform integer math on two-HWI sized integers.
What
I also don't like too much is that we have two different
interfaces to
operate
on them!  Can't you see how I come to not liking this?  Especially
the
latter …

double_int is logically dead.  Refactoring wide-int and double-int
is a waste of time, as the time is better spent removing double-int
from the compiler.  All the necessary semantics and code of
double-int _has_ been refactored into wide-int already.  Changing
wide-int in any way to vend anything to double-int is wrong, as
once double-int is removed, all the api changes to make double-int
share from wide-int are wasted and must then be removed.  The path
forward is the complete removal of double-int; it is wrong, has
been wrong and always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you
for
example
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at
least_
templating wide_int on 'len' and providing specializations for 1 and
2.

It might be a non-issue for math that operates on trees or RTXen due
to
the allocation overhead we pay, but in recent years we transitioned
important
paths away from using tree math to using double_ints _for speed
reasons_.

Richard.

I do not know why you believe this about the speed.  double_int
always does synthetic math, since you do everything at 128-bit
precision.

The thing about wide_int is that since it does math to the
precision's size, it almost never uses synthetic operations, since
the sizes for almost every instance can be done using the native
math on the machine.  Almost every call has a check to see if the
operation can be done natively.  I seriously doubt that you are
going to do TImode math much faster than I do it, and if you do,
who cares.

The number of calls does not affect the performance in any negative
way, and in fact it is more efficient, since common things that
require more than one operation in double_int are typically done
in a single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
  double_int result;
  result.low = low & ~b.low;
  result.high = high & ~b.high;
  return result;
}

are always going to be faster than conditionally executing only one
operation
(but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more
appropriate here.

It's interesting that you gave this example, since before you were
complaining about too many fused ops.  Clearly this one could be
removed in favour 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 09:54 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 2:30 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:

Richard Biener richard.guent...@gmail.com writes:

But that means that wide_int has to model a P-bit operation as a
normal len*HOST_WIDE_INT operation and then fix up the result
after the fact, which seems unnecessarily convoluted.

It does that right now.  The operations are carried out in a loop
over len HOST_WIDE_INT parts, the last HWI is then special-treated
to account for precision/size.  (yes, 'len' is also used as optimization - the
fact that len ends up being mutable is another thing I dislike about
wide-int.  If wide-ints are cheap then all ops should be non-mutating
(at least to 'len')).

But the point of having a mutating len is that things like zero and -1
are common even for OImode values.  So if you're doing something potentially
expensive like OImode multiplication, why do it to the number of
HOST_WIDE_INTs needed for an OImode value when the value we're
processing has only one significant HOST_WIDE_INT?

I don't propose doing that.  I propose that no wide-int member function
may _change_ its len (to something larger).  Only that way you can
avoid allocating wasted space for zero and -1.  That way also the
artificial limit on 2 * largest-int-mode-hwis goes.

It is now 4x, not 2x, to accommodate the extra bit in tree-vrp.

Remember that the space burden is minimal: wide-ints are not
persistent and there are never more than a handful alive at a time.



  I still don't
see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision
X*HOST_WIDE_INT operation for any X) has any special meaning.

Well, the same reason as a HOST_WIDE_INT variable has a meaning.
We use it to constrain what we (efficiently) want to work on.  For example
CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when
doing bit-constant-propagation in loops (for TImode integers on a x86_64 host).

But what about targets with modes wider than TImode?  Would double_int
still be appropriate then?  If not, why does CCP have to use a templated
type with a fixed number of HWIs (and all arithmetic done to a fixed
number of HWIs) rather than one that can adapt to the runtime values,
like wide_int can?

Because nobody cares about accurate bit-tracking for modes larger than
TImode.  And because no convenient abstraction was available ;)

Yes, but tree-vrp does not even work for TImode, and there are no
tests to scale it back when it does see TImode.  I understand that
these can be added, but so far they have not been.


I would also point out that I was corrected on this point by (I believe)
Lawrence.  He points out that tree-vrp is still important for
converting signed to unsigned for larger modes.




Oh, and I don't necessarily see a use of double_int in its current form
but for an integer representation on the host that is efficient to manipulate
integer constants of a target dependent size.  For example the target
detail that we have partial integer modes with bitsize > precision and that
the bits > precision apparently have a meaning when looking at the
bit-representation of a constant should not be part of the base class
of wide-int (I doubt it belongs to wide-int at all, but I guess you know more
about the reason we track bitsize in addition to precision - I think it's
abstraction at the wrong level, the tree level does fine without knowing
about bitsize).

TBH I'm uneasy about the bitsize thing too.  I think bitsize is only
tracked for shift truncation, and if so, I agree it makes sense
to do that separately.

So, can we please remove all traces of bitsize from wide-int then?


But anyway, this whole discussion seems to have reached a stalemate.
Or I suppose a de-facto rejection, since you're the only person in
a position to approve the thing :-)

There are many (silent) people that are able to approve the thing.  But the
point is I have too many issues with the current patch that I'm unable
to point at a specific thing I want Kenny to change after which the patch
would be fine.  So I rely on some guesswork from Kenny, giving my
advice (leaner API, less fused ops, get rid of bitsize, think of
abstracting the core HWI[len] operation, there should be no tree or
RTL dependencies in the wide-int API) to produce an updated variant.
Which of course takes time, which of course crosses my vacation, which
in the end means it isn't going to make 4.8 (I _do_ like the idea of not
having a dependence on host properties for integer constant representation).

Btw, a good hint at what a minimal wide-int API would look like is if
you _just_ replace double-int users with it.  Then you obviously have to
implement only the double-int interface and conversion from/to double-int.

Richard.



Richard




Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 09:30 AM, Richard Sandiford wrote:

Richard Biener richard.guent...@gmail.com writes:

But that means that wide_int has to model a P-bit operation as a
normal len*HOST_WIDE_INT operation and then fix up the result
after the fact, which seems unnecessarily convoluted.

It does that right now.  The operations are carried out in a loop
over len HOST_WIDE_INT parts, the last HWI is then special-treated
to account for precision/size.  (yes, 'len' is also used as optimization - the
fact that len ends up being mutable is another thing I dislike about
wide-int.  If wide-ints are cheap then all ops should be non-mutating
(at least to 'len')).

But the point of having a mutating len is that things like zero and -1
are common even for OImode values.  So if you're doing something potentially
expensive like OImode multiplication, why do it to the number of
HOST_WIDE_INTs needed for an OImode value when the value we're
processing has only one significant HOST_WIDE_INT?

I think with a little thought I can add some special constructors and
get rid of the mutating aspects of the interface.





  I still don't
see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision
X*HOST_WIDE_INT operation for any X) has any special meaning.

Well, the same reason as a HOST_WIDE_INT variable has a meaning.
We use it to constrain what we (efficiently) want to work on.  For example
CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when
doing bit-constant-propagation in loops (for TImode integers on a x86_64 host).

But what about targets with modes wider than TImode?  Would double_int
still be appropriate then?  If not, why does CCP have to use a templated
type with a fixed number of HWIs (and all arithmetic done to a fixed
number of HWIs) rather than one that can adapt to the runtime values,
like wide_int can?


Oh, and I don't necessarily see a use of double_int in its current form
but for an integer representation on the host that is efficient to manipulate
integer constants of a target dependent size.  For example the target
detail that we have partial integer modes with bitsize > precision and that
the bits > precision apparently have a meaning when looking at the
bit-representation of a constant should not be part of the base class
of wide-int (I doubt it belongs to wide-int at all, but I guess you know more
about the reason we track bitsize in addition to precision - I think it's
abstraction at the wrong level, the tree level does fine without knowing
about bitsize).

TBH I'm uneasy about the bitsize thing too.  I think bitsize is only
tracked for shift truncation, and if so, I agree it makes sense
to do that separately.

But anyway, this whole discussion seems to have reached a stalemate.
Or I suppose a de-facto rejection, since you're the only person in
a position to approve the thing :-)

Richard




Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Mike Stump
On Oct 31, 2012, at 5:44 AM, Richard Biener richard.guent...@gmail.com wrote:
 the
 fact that len ends up being mutable is another thing I dislike about
 wide-int.

We expose len for construction only, it is non-mutating.  During construction, 
there is no previous value.

  If wide-ints are cheap then all ops should be non-mutating
 (at least to 'len')).

It is.  Construction modifies the object as construction must be defined as 
initializing the state of the data.  Before construction, there is no data, so, 
we are constructing the data, not mutating the data.  Surely you don't object 
to construction?


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Mike Stump
On Oct 31, 2012, at 6:54 AM, Richard Biener richard.guent...@gmail.com wrote:
 I propose that no wide-int member function
 may _change_ it's len (to something larger).

We never do that, so we already do as you wish.  We construct wide ints, and 
we have member functions to construct values.  We need to construct values as 
some parts of the compiler want to create values.  The construction of values 
can be removed when the rest of the compiler no longer wishes to construct 
values.  LTO is an example of a client that wanted to construct a value.  I'll 
let the LTO people chime in if they wish to no longer construct values.


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Mike Stump
On Oct 31, 2012, at 7:05 AM, Richard Biener richard.guent...@gmail.com wrote:
 You have an artificial limit on what 'len' can be.

No.  There is no limit, and nothing artificial.  We take the maximum of the 
needs of the target, the maximum of the front ends and the maximum of the 
mid end and the back end.  We can drop a category, if that category no longer 
wishes to be our client.  Any client is free to stop using wide-int, any time 
they want.  For example, vrp could use gmp, if they wanted to, and the need to 
serve them drops.  You have imagined the cost is high to do this; the reality 
is all long lived objects are small, and all short lived objects are so 
transitory that we are talking about maybe 5 live at a time.

And you do not accommodate
users that do not want to pay the storage penalty for that arbitrary upper 
limit
choice.

This is also wrong.  First, there is no arbitrary upper limit.  Second, all 
long lived objects are small.  We accommodated them by having all long lived 
objects be small.  The transitory objects are big, but there are only 5 of them 
alive at a time.

  That's all because 'len' may grow (mutate).

This is also wrong.


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-25 Thread Richard Biener
On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote:
 On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com 
 wrote:
 On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 On 10/23/2012 10:12 AM, Richard Biener wrote:

 +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
 HOST_BITS_PER_WIDE_INT];

 are we sure this rounds properly?  Consider a port with max byte mode
 size 4 on a 64bit host.

 I do not believe that this can happen.   The core compiler includes all
 modes up to TI mode, so by default we already up to 128 bits.

 And mode bitsizes are always power-of-two?  I suppose so.

 Actually, no, they are not.  Partial int modes can have bit sizes that are 
 not power of two, and, if there isn't an int mode that is bigger, we'd want 
 to round up the partial int bit size.  Something like ((2 * 
 MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /  
 HOST_BITS_PER_WIDE_INT should do it.

 I still would like to have the ability to provide specializations of
 wide_int
 for small sizes, thus ideally wide_int would be a template templated
 on the number of HWIs in val.  Interface-wise wide_int<2> should be
 identical to double_int, thus we should be able to do

 typedef wide_int<2> double_int;

 If you want to go down this path after the patches get in, go for it.  I
 see no use at all for this.
 This was not meant to be a plug-in replacement for double-int.  The goal of
 this patch is to get the compiler to do the constant math the way that the
 target does it.   Any such instantiation is by definition placing some
 predefined limit that some target may not want.

 Well, what I don't really like is that we now have two implementations
 of functions that perform integer math on two-HWI sized integers.  What
 I also don't like too much is that we have two different interfaces to 
 operate
 on them!  Can't you see how I come to not liking this?  Especially the
 latter …

 double_int is logically dead.  Refactoring wide-int and double-int is a waste
 of time, as the time is better spent removing double-int from the compiler.
 All the necessary semantics and code of double-int _have_ been refactored into
 wide-int already.  Changing wide-int in any way to vend anything to
 double-int is wrong, as once double-int is removed, all the API changes
 made so that double-int shares from wide-int are wasted and must then be
 removed.  The path forward is the complete removal of double-int; it is
 wrong, has been wrong and always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you, for example,
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at least_
templating wide_int on 'len' and providing specializations for 1 and 2.

It might be a non-issue for math that operates on trees or RTXen due to
the allocation overhead we pay, but in recent years we transitioned important
paths away from using tree math to using double_ints _for speed reasons_.

Richard.


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-25 Thread Kenneth Zadeck


On 10/25/2012 06:42 AM, Richard Biener wrote:

On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote:

On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote:

On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.

I do not believe that this can happen.   The core compiler includes all
modes up to TI mode, so by default we are already up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

Actually, no, they are not.  Partial int modes can have bit sizes that are not 
power of two, and, if there isn't an int mode that is bigger, we'd want to 
round up the partial int bit size.  Something like (2 * 
MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /  
HOST_BITS_PER_WIDE_INT should do it.


I still would like to have the ability to provide specializations of
wide_int
for small sizes, thus ideally wide_int would be a template templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for it.  I
see no use at all for this.
This was not meant to be a plug-in replacement for double-int.  The goal of
this patch is to get the compiler to do the constant math the way that the
target does it.   Any such instantiation is by definition placing some
predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations
of functions that perform integer math on two-HWI sized integers.  What
I also don't like too much is that we have two different interfaces to operate
on them!  Can't you see how I come to not liking this?  Especially the
latter …

double_int is logically dead.  Refactoring wide-int and double-int is a waste
of time, as the time is better spent removing double-int from the compiler.
All the necessary semantics and code of double-int _have_ been refactored into
wide-int already.  Changing wide-int in any way to vend anything to double-int
is wrong, as once double-int is removed, all the API changes made so that
double-int shares from wide-int are wasted and must then be removed.  The path
forward is the complete removal of double-int; it is wrong, has been wrong and
always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you, for example,
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at least_
templating wide_int on 'len' and providing specializations for 1 and 2.

It might be a non-issue for math that operates on trees or RTXen due to
the allocation overhead we pay, but in recent years we transitioned important
paths away from using tree math to using double_ints _for speed reasons_.

Richard.
i do not know why you believe this about the speed. double int 
always does synthetic math since you do everything at 128 bit precision.


the thing about wide int, is that since it does math to the precision's 
size, it almost never uses synthetic operations since the sizes for 
almost every instance can be done using the native math on the 
machine.   almost every call has a check to see if the operation can be 
done natively.  I seriously doubt that you are going to do TI mode 
math much faster than i do it and if you do who cares.


the number of calls does not affect the performance in any negative way 
and in fact is more efficient since common things that require more than 
one operation in double-int are typically done in a single operation.




Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-25 Thread Richard Biener
On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

 On 10/25/2012 06:42 AM, Richard Biener wrote:

 On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote:

 On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com
 wrote:

 On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 On 10/23/2012 10:12 AM, Richard Biener wrote:

 +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
 HOST_BITS_PER_WIDE_INT];

 are we sure this rounds properly?  Consider a port with max byte mode
 size 4 on a 64bit host.

 I do not believe that this can happen.   The core compiler includes all
 modes up to TI mode, so by default we are already up to 128 bits.

 And mode bitsizes are always power-of-two?  I suppose so.

 Actually, no, they are not.  Partial int modes can have bit sizes that
 are not power of two, and, if there isn't an int mode that is bigger, we'd
 want to round up the partial int bit size.  Something like (2 *
 MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
 HOST_BITS_PER_WIDE_INT should do it.

 I still would like to have the ability to provide specializations of
 wide_int
 for small sizes, thus ideally wide_int would be a template templated
 on the number of HWIs in val.  Interface-wise wide_int<2> should be
 identical to double_int, thus we should be able to do

 typedef wide_int<2> double_int;

 If you want to go down this path after the patches get in, go for it.
 I
 see no use at all for this.
 This was not meant to be a plug-in replacement for double-int.  The
 goal of
 this patch is to get the compiler to do the constant math the way that
 the
 target does it.   Any such instantiation is by definition placing some
 predefined limit that some target may not want.

 Well, what I don't really like is that we now have two implementations
 of functions that perform integer math on two-HWI sized integers.  What
 I also don't like too much is that we have two different interfaces to
 operate
 on them!  Can't you see how I come to not liking this?  Especially the
 latter …

 double_int is logically dead.  Refactoring wide-int and double-int is a
 waste of time, as the time is better spent removing double-int from the
 compiler.  All the necessary semantics and code of double-int _have_ been
 refactored into wide-int already.  Changing wide-int in any way to vend
 anything to double-int is wrong, as once double-int is removed, all the
 API changes made so that double-int shares from wide-int are wasted and must
 then be removed.  The path forward is the complete removal of double-int; it
 is wrong, has been wrong and always will be wrong, nothing can change that.

 double_int, compared to wide_int, is fast and lean.  I doubt we will
 get rid of it - you
 will make compile-time math a _lot_ slower.  Just profile when you, for
 example,
 change get_inner_reference to use wide_ints.

 To be able to remove double_int in favor of wide_int requires _at least_
 templating wide_int on 'len' and providing specializations for 1 and 2.

 It might be a non-issue for math that operates on trees or RTXen due to
 the allocation overhead we pay, but in recent years we transitioned
 important
 paths away from using tree math to using double_ints _for speed reasons_.

 Richard.

 i do not know why you believe this about the speed. double int always
 does synthetic math since you do everything at 128 bit precision.

 the thing about wide int, is that since it does math to the precision's
 size, it almost never uses synthetic operations since the sizes for
 almost every instance can be done using the native math on the machine.
 almost every call has a check to see if the operation can be done natively.
 I seriously doubt that you are going to do TI mode math much faster than i
 do it and if you do who cares.

 the number of calls does not affect the performance in any negative way and
 in fact is more efficient since common things that require more than one
 operation in double-int are typically done in a single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
  double_int result;
  result.low = low & ~b.low;
  result.high = high & ~b.high;
  return result;
}

are always going to be faster than conditionally executing only one operation
(but inside an offline function).

Richard.


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-24 Thread Richard Biener
On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

 On 10/23/2012 10:12 AM, Richard Biener wrote:

 On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com
 wrote:

 This patch implements the wide-int class.  This is a more general
 version
 of the double-int class and is meant to be the eventual replacement for
 that
 class.  The use of this class removes all dependencies of the host from
 the target compiler's integer math.

 I have made all of the changes i agreed to in the earlier emails. In
 particular, this class internally maintains a bitsize and precision but
 not
 a mode.  The class now is neutral about modes and tree-types.  The
 functions that take modes or tree-types are just convenience functions
 that
 translate the parameters into bitsize and precision, and wherever there
 is
 a call that takes a mode, there is a corresponding call that takes a
 tree-type.

 All of the little changes that richi suggested have also been made.

 The buffer size is now twice the size needed by the largest integer
 mode.
 This gives enough room for tree-vrp to do full multiplies on any type
 that
 the target supports.

 Tested on x86-64.

 This patch depends on the first three patches.   I am still waiting on
 final
 approval on the hwint.h patch.

 Ok to commit?

 diff --git a/gcc/wide-int.h b/gcc/wide-int.h
 new file mode 100644
 index 000..efd2c01
 --- /dev/null
 +++ b/gcc/wide-int.h
 ...
 +#ifndef GENERATOR_FILE


 The whole file is guarded with that ... why?  That is bound to be fragile
 once
 use of wide-int spreads?  How do generator programs end up including
 this file if they don't need it at all?

 This is so that wide-int can be included at the level of the generators.
 There is some stuff that needs to see this type that is done during the
 build phase that cannot see the types that are included in wide-int.h.

 +#include "tree.h"
 +#include "hwint.h"
 +#include "options.h"
 +#include "tm.h"
 +#include "insn-modes.h"
 +#include "machmode.h"
 +#include "double-int.h"
 +#include <gmp.h>
 +#include "insn-modes.h"
 +

 That's a lot of tree and rtl dependencies.  double-int.h avoids these by
 placing conversion routines in different headers or by only resorting to
 types in coretypes.h.  Please try to reduce the above to a minimum.

 +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
 HOST_BITS_PER_WIDE_INT];

 are we sure this rounds properly?  Consider a port with max byte mode
 size 4 on a 64bit host.

 I do not believe that this can happen.   The core compiler includes all
 modes up to TI mode, so by default we are already up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

 I still would like to have the ability to provide specializations of
 wide_int
 for small sizes, thus ideally wide_int would be a template templated
 on the number of HWIs in val.  Interface-wise wide_int<2> should be
 identical to double_int, thus we should be able to do

 typedef wide_int<2> double_int;

 If you want to go down this path after the patches get in, go for it.  I
 see no use at all for this.
 This was not meant to be a plug-in replacement for double-int.  The goal of
 this patch is to get the compiler to do the constant math the way that the
 target does it.   Any such instantiation is by definition placing some
 predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations
of functions that perform integer math on two-HWI sized integers.  What
I also don't like too much is that we have two different interfaces to operate
on them!  Can't you see how I come to not liking this?  Especially the
latter ...

 in double-int.h and replace its implementation with a specialization of
 wide_int.  Due to a number of divergences (double_int is not a subset
 of wide_int) that doesn't seem easily possible (one reason is the
 ShiftOp and related enums you use).  Of course wide_int is not a
 template either.  For the hypothetical embedded target above we'd
 end up using wide_int<1>, an even more trivial specialization.

 I realize again this wide-int is not what your wide-int is (because you
 add a precision member).  Still factoring out the commons of
 wide-int and double-int into a wide_int_raw template should be
 possible.

 +class wide_int {
 +  /* Internal representation.  */
 +
 +  /* VAL is set to a size that is capable of computing a full
 + multiplication on the largest mode that is represented on the
 + target.  The full multiplication is used by tree-vrp.  If
 + operations are added that require larger buffers, then VAL needs
 + to be changed.  */
 +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
 HOST_BITS_PER_WIDE_INT];
 +  unsigned short len;
 +  unsigned int bitsize;
 +  unsigned int precision;

 The len, bitsize and precision members need documentation.  At least
 one sounds redundant.

 + public:
 +  enum ShiftOp {
 +NONE,
 NONE is never a descriptive name ... I suppose 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Richard Biener
On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 This patch implements the wide-int class.  This is a more general version
 of the double-int class and is meant to be the eventual replacement for that
 class.  The use of this class removes all dependencies of the host from
 the target compiler's integer math.

 I have made all of the changes i agreed to in the earlier emails. In
 particular, this class internally maintains a bitsize and precision but not
 a mode.  The class now is neutral about modes and tree-types.  The
 functions that take modes or tree-types are just convenience functions that
 translate the parameters into bitsize and precision, and wherever there is
 a call that takes a mode, there is a corresponding call that takes a
 tree-type.

 All of the little changes that richi suggested have also been made.

 The buffer size is now twice the size needed by the largest integer mode.
 This gives enough room for tree-vrp to do full multiplies on any type that
 the target supports.

 Tested on x86-64.

 This patch depends on the first three patches.   I am still waiting on final
 approval on the hwint.h patch.

 Ok to commit?

diff --git a/gcc/wide-int.h b/gcc/wide-int.h
new file mode 100644
index 000..efd2c01
--- /dev/null
+++ b/gcc/wide-int.h
...
+#ifndef GENERATOR_FILE

The whole file is guarded with that ... why?  That is bound to be fragile once
use of wide-int spreads?  How do generator programs end up including
this file if they don't need it at all?

+#include "tree.h"
+#include "hwint.h"
+#include "options.h"
+#include "tm.h"
+#include "insn-modes.h"
+#include "machmode.h"
+#include "double-int.h"
+#include <gmp.h>
+#include "insn-modes.h"
+

That's a lot of tree and rtl dependencies.  double-int.h avoids these by
placing conversion routines in different headers or by only resorting to
types in coretypes.h.  Please try to reduce the above to a minimum.

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.

I still would like to have the ability to provide specializations of wide_int
for small sizes, thus ideally wide_int would be a template templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

in double-int.h and replace its implementation with a specialization of
wide_int.  Due to a number of divergences (double_int is not a subset
of wide_int) that doesn't seem easily possible (one reason is the
ShiftOp and related enums you use).  Of course wide_int is not a
template either.  For the hypothetical embedded target above we'd
end up using wide_int<1>, an even more trivial specialization.

I realize again this wide-int is not what your wide-int is (because you
add a precision member).  Still factoring out the commons of
wide-int and double-int into a wide_int_raw template should be
possible.

+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target.  The full multiplication is used by tree-vrp.  If
+ operations are added that require larger buffers, then VAL needs
+ to be changed.  */
+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+  unsigned short len;
+  unsigned int bitsize;
+  unsigned int precision;

The len, bitsize and precision members need documentation.  At least
one sounds redundant.

+ public:
+  enum ShiftOp {
+NONE,
NONE is never a descriptive name ... I suppose this is for arithmetic vs.
logical shifts?
+/* There are two uses for the wide-int shifting functions.  The
+   first use is as an emulation of the target hardware.  The
+   second use is as service routines for other optimizations.  The
+   first case needs to be identified by passing TRUNC as the value
+   of ShiftOp so that shift amount is properly handled according to the
+   SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
+   amount is always truncated by the bytesize of the mode of
+   THIS.  */
+TRUNC

ah, no, it's for SHIFT_COUNT_TRUNCATED.  mode of THIS?  Now
it's precision I suppose.  That said, handling SHIFT_COUNT_TRUNCATED
in wide-int sounds over-engineered, the caller should be responsible
of applying SHIFT_COUNT_TRUNCATED when needed.

+  enum SignOp {
+/* Many of the math functions produce different results depending
+   on if they are SIGNED or UNSIGNED.  In general, there are two
+   different functions, whose names are prefixed with an 'S'
+   or a 'U'.  However, for some math functions there is also a
+   routine that does not have the prefix and takes a SignOp
+   parameter of SIGNED or UNSIGNED.  */
+SIGNED,
+UNSIGNED
+  };

double-int and _all_ of the rest of the middle-end uses a 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Kenneth Zadeck


On 10/23/2012 10:12 AM, Richard Biener wrote:

On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

This patch implements the wide-int class.  This is a more general version
of the double-int class and is meant to be the eventual replacement for that
class.  The use of this class removes all dependencies of the host from
the target compiler's integer math.

I have made all of the changes i agreed to in the earlier emails. In
particular, this class internally maintains a bitsize and precision but not
a mode.  The class now is neutral about modes and tree-types.  The
functions that take modes or tree-types are just convenience functions that
translate the parameters into bitsize and precision, and wherever there is
a call that takes a mode, there is a corresponding call that takes a
tree-type.

All of the little changes that richi suggested have also been made.

The buffer size is now twice the size needed by the largest integer mode.
This gives enough room for tree-vrp to do full multiplies on any type that
the target supports.

Tested on x86-64.

This patch depends on the first three patches.   I am still waiting on final
approval on the hwint.h patch.

Ok to commit?

diff --git a/gcc/wide-int.h b/gcc/wide-int.h
new file mode 100644
index 000..efd2c01
--- /dev/null
+++ b/gcc/wide-int.h
...
+#ifndef GENERATOR_FILE



The whole file is guarded with that ... why?  That is bound to be fragile once
use of wide-int spreads?  How do generator programs end up including
this file if they don't need it at all?
This is so that wide-int can be included at the level of the 
generators.   There is some stuff that needs to see this type that is done 
during the build phase that cannot see the types that are included 
in wide-int.h.

+#include "tree.h"
+#include "hwint.h"
+#include "options.h"
+#include "tm.h"
+#include "insn-modes.h"
+#include "machmode.h"
+#include "double-int.h"
+#include <gmp.h>
+#include "insn-modes.h"
+

That's a lot of tree and rtl dependencies.  double-int.h avoids these by
placing conversion routines in different headers or by only resorting to
types in coretypes.h.  Please try to reduce the above to a minimum.

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.
I do not believe that this can happen.   The core compiler includes all 
modes up to TI mode, so by default we are already up to 128 bits.

I still would like to have the ability to provide specializations of wide_int
for small sizes, thus ideally wide_int would be a template templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;
If you want to go down this path after the patches get in, go for it.
I see no use at all for this.
This was not meant to be a plug-in replacement for double-int.  The goal 
of this patch is to get the compiler to do the constant math the way 
that the target does it.   Any such instantiation is by definition 
placing some predefined limit that some target may not want.



in double-int.h and replace its implementation with a specialization of
wide_int.  Due to a number of divergences (double_int is not a subset
of wide_int) that doesn't seem easily possible (one reason is the
ShiftOp and related enums you use).  Of course wide_int is not a
template either.  For the hypothetical embedded target above we'd
end up using wide_int<1>, an even more trivial specialization.

I realize again this wide-int is not what your wide-int is (because you
add a precision member).  Still factoring out the commons of
wide-int and double-int into a wide_int_raw template should be
possible.

+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target.  The full multiplication is used by tree-vrp.  If
+ operations are added that require larger buffers, then VAL needs
+ to be changed.  */
+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+  unsigned short len;
+  unsigned int bitsize;
+  unsigned int precision;

The len, bitsize and precision members need documentation.  At least
one sounds redundant.

+ public:
+  enum ShiftOp {
+NONE,
NONE is never a descriptive name ... I suppose this is for arithmetic vs.
logical shifts?

suggest something

+/* There are two uses for the wide-int shifting functions.  The
+   first use is as an emulation of the target hardware.  The
+   second use is as service routines for other optimizations.  The
+   first case needs to be identified by passing TRUNC as the value
+   of ShiftOp so that shift amount is properly handled according to the
+   SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
+   amount is always truncated by the bytesize of 

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Lawrence Crowl
On 10/23/12, Richard Biener richard.guent...@gmail.com wrote:
 I wonder if for the various ways to specify precision/len there
 is a nice C++ way of moving this detail out of wide-int.  I can
 think only of one:

 struct WIntSpec {
   WIntSpec (unsigned int len, unsigned int precision);
   WIntSpec (const_tree);
   WIntSpec (enum machine_mode);
   unsigned int len;
   unsigned int precision;
 };

 and then (sorry to pick one of the less useful functions):

   inline static wide_int zero (WIntSpec)

 which you should be able to call like

   wide_int::zero (SImode)
   wide_int::zero (integer_type_node)

 and (ugly)

   wide_int::zero (WIntSpec (32, 32))

 with C++0x wide_int::zero ({32, 32}) should be possible?  Or we
 keep the precision overload.  At least providing the WIntSpec
 abstraction allows custom ways of specifying required bits to
 not pollute wide-int itself too much.  Lawrence?

Yes, in C++11, wide_int::zero ({32, 32}) is possible using an
implicit conversion to WIntSpec from an initializer_list.  However,
at present we are limited to C++03 to enable older compilers as
boot compilers.

-- 
Lawrence Crowl


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Lawrence Crowl
On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote:
 On 10/23/2012 10:12 AM, Richard Biener wrote:
  +  inline bool minus_one_p () const;
  +  inline bool zero_p () const;
  +  inline bool one_p () const;
  +  inline bool neg_p () const;
 
  what's wrong with w == -1, w == 0, w == 1, etc.?

 I would love to do this and you seem to be somewhat knowledgeable
 of c++.  But i cannot for the life of me figure out how to do it.

Starting from the simple case, you write an operator ==.

as global operator:  bool operator == (wide_int w, int i);
as member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
  switch (i)
{
  case -1: return w.minus_one_p ();
  case  0: return w.zero_p ();
  case  1: return w.one_p ();
  default: unexpected
}
}

 say i have a TImode number, which must be represented in 4 ints
 on a 32 bit host (the same issue happens on 64 bit hosts, but
 the examples are simpler on 32 bit hosts) and i compare it to -1.
 The value that i am going to see as the argument of the function
 is going to have the value 0xffffffff.  but the value that i have
 internally is 128 bits.  do i take this and zero or sign extend it?

What would you have done with w.minus_one_p ()?

 in particular if someone wants to compare a number to 0xdeadbeef i
 have no idea what to do.  I tried defining two different functions,
 one that took a signed and one that took an unsigned number but
 then i wanted a cast in front of all the positive numbers.

This is where it does get tricky.  For signed arguments, you should sign
extend.  For unsigned arguments, you should not.  At present, we need
multiple overloads to avoid type ambiguities.

bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);
inline bool operator == (wide_int w, long int i)
  { return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned long int i)
  { return w == (unsigned long long int) i; }
inline bool operator == (wide_int w, int i)
  { return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned int i)
  { return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.)

Even so, there is room for potential bugs when wide_int does not
carry around whether or not it is signed.  The problem is that
regardless of what the programmer thinks of the sign of the wide int,
the comparison will use the sign of the int.

 If there is a way to do this, then i will do it, but it is going
 to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to either carry the sign information
around in either the type or the class data.  (I prefer type, but
with a mechanism to carry it as data when needed.)  Such comparisons
would then require consistency in signedness between the wide int
and the plain int.

 I know that double-int does some of this and it does not carry
 around a notion of signedness either.  is this just code that has
 not been fully tested or is there a trick in c++ that i am missing?

The double int class only provides == and !=, and only with other
double ints.  Otherwise, it has the same value query functions that
you do above.  In the case of double int, the goal was to simplify
use of the existing semantics.  If you are changing the semantics,
consider incorporating sign explicitly.

-- 
Lawrence Crowl


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Kenneth Zadeck


On 10/23/2012 02:38 PM, Lawrence Crowl wrote:

On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;

what's wrong with w == -1, w == 0, w == 1, etc.?

I would love to do this and you seem to be somewhat knowledgeable
of c++.  But i cannot for the life of me figure out how to do it.

Starting from the simple case, you write an operator ==.

as global operator:  bool operator == (wide_int w, int i);
as member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
   switch (i)
 {
   case -1: return w.minus_one_p ();
   case  0: return w.zero_p ();
   case  1: return w.one_p ();
   default: unexpected
 }
}

no, this seems wrong.  you do not want to write code that can only 
fail at runtime unless there is a damn good reason to do that.

say i have a TImode number, which must be represented in 4 ints
on a 32 bit host (the same issue happens on 64 bit hosts, but
the examples are simpler on 32 bit hosts) and i compare it to -1.
The value that i am going to see as the argument of the function
is going to have the value 0xffffffff.  but the value that i have
internally is 128 bits.  do i take this and zero or sign extend it?

What would you have done with w.minus_one_p ()?
the code knows that -1 is a negative number and it knows the precision 
of w.  That is enough information.   So it logically builds a -1 that 
has enough bits to do the conversion.
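The idea described above — the precision of w plus the knowledge that -1 is negative is enough to build a comparison value of the right width — can be sketched as a toy function.  This is an illustration only, not the actual wide_int internals; the name extend_to_precision and the 32-bit block layout are inventions for this example.

```cpp
#include <cstdint>
#include <vector>

// Toy sketch: extend a signed host integer to 'precision' bits,
// stored as 32-bit blocks, least significant block first.  The
// arithmetic right shift replicates the sign bit, so -1 fills every
// block with 0xffffffff.  (Right-shifting a negative value is
// implementation-defined before C++20, but arithmetic on the hosts
// GCC cares about.)
std::vector<uint32_t>
extend_to_precision (int64_t value, unsigned precision)
{
  std::vector<uint32_t> blocks ((precision + 31) / 32);
  for (uint32_t &b : blocks)
    {
      b = (uint32_t) value;   // take the low 32 bits
      value >>= 32;           // propagate the sign into higher blocks
    }
  return blocks;
}
```

With something along these lines, comparing a 128-bit w against -1 is just a block-by-block comparison against the four 0xffffffff blocks produced by extend_to_precision (-1, 128).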




in particular if someone wants to compare a number to 0xdeadbeef i
have no idea what to do.  I tried defining two different functions,
one that took a signed and one that took an unsigned number, but
then i wanted a cast in front of all the positive numbers.

This is where it does get tricky.  For signed arguments, you should sign
extend.  For unsigned arguments, you should not.  At present, we need
multiple overloads to avoid type ambiguities.

bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);
inline bool operator == (wide_int w, long int i)
   { return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned long int i)
   { return w == (unsigned long long int) i; }
inline bool operator == (wide_int w, int i)
   { return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned int i)
   { return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.)
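The need for the forwarding overloads on the narrower types can be seen with a stripped-down stand-in (toy_wide_int and its single-word representation are inventions for this sketch, not the real class): with only the two 'long long' overloads, a plain int literal converts equally well to long long and unsigned long long, so 'w == 0' would be ambiguous.

```cpp
// Toy stand-in for wide_int, only to illustrate the overload set;
// the real class stores multiple HOST_WIDE_INT blocks.
struct toy_wide_int
{
  long long val;
};

bool operator == (toy_wide_int w, long long i)
{ return w.val == i; }

bool operator == (toy_wide_int w, unsigned long long i)
{ return w.val >= 0 && (unsigned long long) w.val == i; }

// Without these forwarding overloads, 'w == 0' is ambiguous, because
// int converts equally well to either wider type.  With them, an int
// argument gets an exact match and is then sign or zero extended
// explicitly.
inline bool operator == (toy_wide_int w, int i)
{ return w == (long long) i; }

inline bool operator == (toy_wide_int w, unsigned int i)
{ return w == (unsigned long long) i; }
```

In this sketch, toy_wide_int {-1} == -1 takes the signed path and compares true, while toy_wide_int {-1} == 0u takes the unsigned path and compares false, matching the sign-extension rule described above.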

Even so, there is room for potential bugs when wide_int does not
carry around whether or not it is signed.  The problem is that
regardless of what the programmer thinks of the sign of the wide int,
the comparison will use the sign of the int.
when they do we can revisit this.   but i looked at this and i said the 
potential bugs were not worth the effort.



If there is a way to do this, then i will do it, but it is going
to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to carry the sign information
around in either the type or the class data.  (I prefer type, but
with a mechanism to carry it as data when needed.)  Such comparisons
would then require consistency in signedness between the wide int
and the plain int.
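The "carry the sign in the type" suggestion could look something like the following sketch.  The names toy_wide, signed_wide, and unsigned_wide are inventions for this example, not a proposed GCC API; the point is only that mismatched signedness then fails at compile time rather than silently comparing wrong.

```cpp
// Hedged sketch of carrying signedness in the type.  A real wide int
// would hold multiple blocks; one word suffices to show the typing.
template <bool IS_SIGNED>
struct toy_wide
{
  long long val;
};

using signed_wide   = toy_wide<true>;
using unsigned_wide = toy_wide<false>;

// Each comparison only accepts an integer of matching signedness, so
// e.g. comparing a signed_wide against 0ULL does not compile.
inline bool operator == (signed_wide w, long long i)
{ return w.val == i; }

inline bool operator == (unsigned_wide w, unsigned long long i)
{ return (unsigned long long) w.val == i; }
```

Whether this is workable is exactly what is disputed below: it requires the sign to be known at every use site, which the rtl level does not provide.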
carrying the sign information is a non-starter.  The rtl level does
not have it and the middle end violates it more often than not.  My
view was to design this having looked at all of the usage.  I have
basically converted the whole compiler before i released the abi.  I am
still getting out the errors and breaking it up in reviewable sized
patches, but i knew very very well who my clients were before i wrote
the abi.



I know that double-int does some of this and it does not carry
around a notion of signedness either.  is this just code that has
not been fully tested or is there a trick in c++ that i am missing?

The double int class only provides == and !=, and only with other
double ints.  Otherwise, it has the same value query functions that
you do above.  In the case of double int, the goal was to simplify
use of the existing semantics.  If you are changing the semantics,
consider incorporating sign explicitly.


i have, and it does not work.


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Lawrence Crowl
On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote:
 On 10/23/2012 02:38 PM, Lawrence Crowl wrote:
 On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote:
 On 10/23/2012 10:12 AM, Richard Biener wrote:
 +  inline bool minus_one_p () const;
 +  inline bool zero_p () const;
 +  inline bool one_p () const;
 +  inline bool neg_p () const;

 what's wrong with w == -1, w == 0, w == 1, etc.?
 I would love to do this and you seem to be somewhat knowledgeable
 of c++.  But i cannot for the life of me figure out how to do it.
 Starting from the simple case, you write an operator ==.

 as global operator:  bool operator == (wide_int w, int i);
 as member operator:  bool wide_int::operator == (int i);

 In the simple case,

 bool operator == (wide_int w, int i)
 {
switch (i)
  {
case -1: return w.minus_one_p ();
case  0: return w.zero_p ();
case  1: return w.one_p ();
default: unexpected
  }
 }

 no, this seems wrong.  you do not want to write code that can only
 fail at runtime unless there is a damn good reason to do that.

Well, that's because it's the oversimplified case.  :-)

 say i have a TImode number, which must be represented in 4 ints
 on a 32 bit host (the same issue happens on 64 bit hosts, but
 the examples are simpler on 32 bit hosts) and i compare it to -1.
 The value that i am going to see as the argument of the function
 is going to have the value 0xffffffff.  but the value that i have
 internally is 128 bits.  do i take this and zero or sign extend it?

 What would you have done with w.minus_one_p ()?

 the code knows that -1 is a negative number and it knows the
 precision of w.  That is enough information.  So it logically
 builds a -1 that has enough bits to do the conversion.

And the code could also know that '-n' is a negative number and do
the identical conversion.  It would certainly be more difficult to
write and to get all the edge cases right.

 in particular if someone wants to compare a number to 0xdeadbeef i
 have no idea what to do.  I tried defining two different functions,
 one that took a signed and one that took an unsigned number, but
 then i wanted a cast in front of all the positive numbers.
 This is where it does get tricky.  For signed arguments, you should sign
 extend.  For unsigned arguments, you should not.  At present, we need
 multiple overloads to avoid type ambiguities.

 bool operator == (wide_int w, long long int i);
 bool operator == (wide_int w, unsigned long long int i);
 inline bool operator == (wide_int w, long int i)
{ return w == (long long int) i; }
 inline bool operator == (wide_int w, unsigned long int i)
{ return w == (unsigned long long int) i; }
 inline bool operator == (wide_int w, int i)
{ return w == (long long int) i; }
 inline bool operator == (wide_int w, unsigned int i)
{ return w == (unsigned long long int) i; }

 (There is a proposal before the C++ committee to fix this problem.)

 Even so, there is room for potential bugs when wide_int does not
 carry around whether or not it is signed.  The problem is that
 regardless of what the programmer thinks of the sign of the wide int,
 the comparison will use the sign of the int.

 when they do we can revisit this.   but i looked at this and i said the
 potential bugs were not worth the effort.

I won't disagree.  I was answering what I thought were questions on
what was possible.

 If there is a way to do this, then i will do it, but it is going
 to have to work properly for things larger than a HOST_WIDE_INT.
 The long-term solution, IMHO, is to carry the sign information
 around in either the type or the class data.  (I prefer type, but
 with a mechanism to carry it as data when needed.)  Such comparisons
 would then require consistency in signedness between the wide int
 and the plain int.

 carrying the sign information is a non-starter.  The rtl level does
 not have it and the middle end violates it more often than not.  My
 view was to design this having looked at all of the usage.  I have
 basically converted the whole compiler before i released the abi.  I am
 still getting out the errors and breaking it up in reviewable sized
 patches, but i knew very very well who my clients were before i wrote
 the abi.

Okay.

 I know that double-int does some of this and it does not carry
 around a notion of signedness either.  is this just code that has
 not been fully tested or is there a trick in c++ that i am missing?
 The double int class only provides == and !=, and only with other
 double ints.  Otherwise, it has the same value query functions that
 you do above.  In the case of double int, the goal was to simplify
 use of the existing semantics.  If you are changing the semantics,
 consider incorporating sign explicitly.

 i have, and it does not work.

Unfortunate.

-- 
Lawrence Crowl


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Kenneth Zadeck


On 10/23/2012 04:25 PM, Lawrence Crowl wrote:

On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote:

On 10/23/2012 02:38 PM, Lawrence Crowl wrote:

On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;

what's wrong with w == -1, w == 0, w == 1, etc.?

I would love to do this and you seem to be somewhat knowledgeable
of c++.  But i cannot for the life of me figure out how to do it.

Starting from the simple case, you write an operator ==.

as global operator:  bool operator == (wide_int w, int i);
as member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
switch (i)
  {
case -1: return w.minus_one_p ();
case  0: return w.zero_p ();
case  1: return w.one_p ();
default: unexpected
  }
}

no, this seems wrong.  you do not want to write code that can only
fail at runtime unless there is a damn good reason to do that.

Well, that's because it's the oversimplified case.  :-)


say i have a TImode number, which must be represented in 4 ints
on a 32 bit host (the same issue happens on 64 bit hosts, but
the examples are simpler on 32 bit hosts) and i compare it to -1.
The value that i am going to see as the argument of the function
is going to have the value 0xffffffff.  but the value that i have
internally is 128 bits.  do i take this and zero or sign extend it?

What would you have done with w.minus_one_p ()?

the code knows that -1 is a negative number and it knows the
precision of w.  That is enough information.  So it logically
builds a -1 that has enough bits to do the conversion.

And the code could also know that '-n' is a negative number and do
the identical conversion.  It would certainly be more difficult to
write and to get all the edge cases right.
I am not a c++ hacker.   if someone wants to go there later, we can 
investigate this.

but it seems like a can of worms right now.



in particular if someone wants to compare a number to 0xdeadbeef i
have no idea what to do.  I tried defining two different functions,
one that took a signed and one that took an unsigned number, but
then i wanted a cast in front of all the positive numbers.

This is where it does get tricky.  For signed arguments, you should sign
extend.  For unsigned arguments, you should not.  At present, we need
multiple overloads to avoid type ambiguities.

bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);
inline bool operator == (wide_int w, long int i)
{ return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned long int i)
{ return w == (unsigned long long int) i; }
inline bool operator == (wide_int w, int i)
{ return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned int i)
{ return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.)

Even so, there is room for potential bugs when wide_int does not
carry around whether or not it is signed.  The problem is that
regardless of what the programmer thinks of the sign of the wide int,
the comparison will use the sign of the int.

when they do we can revisit this.   but i looked at this and i said the
potential bugs were not worth the effort.

I won't disagree.  I was answering what I thought were questions on
what was possible.


If there is a way to do this, then i will do it, but it is going
to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to carry the sign information
around in either the type or the class data.  (I prefer type, but
with a mechanism to carry it as data when needed.)  Such comparisons
would then require consistency in signedness between the wide int
and the plain int.

carrying the sign information is a non-starter.  The rtl level does
not have it and the middle end violates it more often than not.  My
view was to design this having looked at all of the usage.  I have
basically converted the whole compiler before i released the abi.  I am
still getting out the errors and breaking it up in reviewable sized
patches, but i knew very very well who my clients were before i wrote
the abi.

Okay.


I know that double-int does some of this and it does not carry
around a notion of signedness either.  is this just code that has
not been fully tested or is there a trick in c++ that i am missing?

The double int class only provides == and !=, and only with other
double ints.  Otherwise, it has the same value query functions that
you do above.  In the case of double int, the goal was to simplify
use of the existing semantics.  If you are changing the semantics,
consider incorporating sign explicitly.

i have, and it does not work.

Unfortunate.

There is certainly a desire here not