Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Sun, Apr 21, 2013 at 10:54 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Richard, i pulled these two frags out of your comments because i wanted to get some input from you on it while i addressed the other issues you raised. + enum SignOp { +/* Many of the math functions produce different results depending + on if they are SIGNED or UNSIGNED. In general, there are two + different functions, whose names are prefixed with an 'S and + or an 'U'. However, for some math functions there is also a + routine that does not have the prefix and takes an SignOp + parameter of SIGNED or UNSIGNED. */ +SIGNED, +UNSIGNED + }; You seem to insist on that. It should propagate to the various parts of the compiler that have settled for the 'uns' integer argument. Having one piece behave different is just weird. I suppose I will find code like wi.ext (prec, uns ? UNSIGNED : SIGNED) there is a lot more flexibility on my part than you perceive with respect to this point. My primary issue is that i do not want to have is an interface that has 0 and 1 as a programmer visible part. Beyond that i am open to suggestion. The poster child of my hate are the host_integer_p and the tree_low_cst interfaces. I did not want the wide int stuff to look like these. I see several problems with these: 1) of the 314 places where tree_low_cst is called in the gcc directory (not the subdirectories where the front ends live), NONE of the calls have a variable second parameter. There are a handful of places, as one expects, in the front ends that do, but NONE in the middle end. 2) there are a small number of the places where host_integer_p is called with one parameter and then it is followed by a call to tree_low_cst that has the value with the other sex. I am sure these are mistakes, but having the 0s and 1s flying around does not make it easy to spot them. 3) tree_low_cst implies that the tree cst has only two hwis in it. 
While i do not want to propagate an interface with 0 and 1 into wide-int, i can understand your dislike of having a wide-int only solution for this. I will point out that for your particular example, uns is almost always set by a call to TYPE_UNSIGNED. There could easily be a different type accessor that converts this part of the type to the right thing to pass in here. I think that there is certainly some place for there to be a unified SYMBOLIC api that controls the signedness everywhere in the compiler. I would like to move toward this direction, but you have been so negative to the places where i have made it convenient to directly convert from tree or rtl into or out of wide-int that i have hesitated to do something that directly links trees and wide-int. So i would like to ask you what would like? Ideally I'd like the wide-int introduction to _not_ be the introduction of a unified symbolic way that controls signedness. We do have two kinds of interfaces currently - one that uses different API entries, like build_int_cstu vs. build_int_cst or double_int::from_shwi vs. from_uhwi, and one that uses the aforementioned integer flag 'uns' with 0 being signed and 1 being unsigned. I think the _uhwi vs. _shwi and _cstu variants are perfectly fine (but only for compile-time constant uses as you say), and the wide-int interface makes use of this kind, too. Proposing a better API for the 'uns' flag separately from wide-int would be a better way to get anybody else than me chime in (I have the feeling that the wide-int series seems to scare off every other reviewer besides me...). I can live with the SIGNED/UNSIGNED enum, but existing APIs should be changed to use that. For wide-int I suggest to go the route you don't want to go. Stick to existing practice and use the integer 'uns' flag. It's as good as SIGNED/UNSIGNED for _variable_ cases (and yes, a lot less descriptive for constant cases). 
For wide-int, always add a static interface if there is a variable one, and convert variable uses to the proper static interface.

That said, a lot of my pushback is because I feel a little lonesome in this wide-int review and don't want to decide, all on my own, about that (generic) interface part as well.

+  template <typename T>
+  inline bool gt_p (T c, SignOp sgn) const;
+  template <typename T>
+  inline bool gts_p (T c) const;
+  template <typename T>
+  inline bool gtu_p (T c) const;

it's bad that we can't use the sign information we have available in almost all cases ... (where precision is not an exact multiple of HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT).

It isn't hard to encode a sign - you just have to possibly waste a word of zeroes for positive values where at the moment precision is an exact multiple of HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT. Which of course means that the encoding can be one word larger than maximally required by 'precision'.
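To make the two interface styles under discussion concrete, here is a minimal sketch. It is not code from the patch: toy_int is a hypothetical one-word stand-in for wide_int, and sign_from_uns is a hypothetical adaptor from the existing integer 'uns' flag convention (0 signed, 1 unsigned) to the symbolic enum.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the SignOp enum quoted from the patch.
enum SignOp { SIGNED, UNSIGNED };

// Hypothetical one-word value; the real wide_int holds an array of
// HOST_WIDE_INTs plus a length and a precision.
struct toy_int {
  int64_t val;

  // Static interfaces: the signedness is part of the function name.
  bool gts_p(int64_t c) const { return val > c; }
  bool gtu_p(int64_t c) const {
    return static_cast<uint64_t>(val) > static_cast<uint64_t>(c);
  }

  // Variable interface: the signedness is chosen at run time and
  // simply dispatches to one of the static entry points.
  bool gt_p(int64_t c, SignOp sgn) const {
    return sgn == SIGNED ? gts_p(c) : gtu_p(c);
  }
};

// Hypothetical adaptor from the integer 'uns' flag to the enum.
SignOp sign_from_uns(int uns) { return uns ? UNSIGNED : SIGNED; }
```

With the bit pattern of -1, gts_p (1) is false while gtu_p (1) is true, which is why a call site that knows its signedness at compile time reads more clearly with the static names than with a bare 0 or 1.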
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On 04/19/2013 09:31 AM, Richard Biener wrote: + number of elements of the vector that are in use. When LEN * + HOST_BITS_PER_WIDE_INT the precision, the value has been + compressed. The values of the elements of the vector greater than + LEN - 1. are all equal to the highest order bit of LEN. equal to the highest order bit of element LEN - 1. ? Fixed, you are correct. I have gone thru the entire wide-int patch to clean this up. The bottom line is that if the precision is not a multiple of the size of a HWI then everything above that precision is assumed to be identical to the sign bit. Especially _not_ equal to the precision - 1 bit of the value, correct? I do not understand your question here, because in the case talked about above, the bit at precision - 1 would not have been explicitly represented. Anyway, i went thru this top part carefully and made many things clearer. + The representation does not contain any information inherant about + signedness of the represented value, so it can be used to represent + both signed and unsigned numbers. For operations where the results + depend on signedness (division, comparisons), the signedness must + be specified separately. For operations where the signness + matters, one of the operands to the operation specifies either + wide_int::SIGNED or wide_int::UNSIGNED. The last sentence is somehow duplicated. fixed + The numbers are stored as sign entended numbers as a means of + compression. Leading HOST_WIDE_INTS that contain strings of either + -1 or 0 are removed as long as they can be reconstructed from the + top bit that is being represented. I'd put this paragraph before the one that talks about signedness, next to the one that already talks about encoding. done + All constructors for wide_int take either a precision, an enum + machine_mode or tree_type. */ That's probably no longer true (I'll now check). yes you are correct +class wide_int { + /* Internal representation. 
*/
+
+  /* VAL is set to a size that is capable of computing a full
+     multiplication on the largest mode that is represented on the
+     target.  The full multiplication is used by tree-vrp.  tree-vrp
+     currently does a 2x largest mode by 2x largest mode yielding a 4x
+     largest mode result.  If operations are added that require larger
+     buffers, then VAL needs to be changed.  */
+  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
+  unsigned short len;
+  unsigned int precision;

I wonder if there is a technical reason to stick to HOST_WIDE_INTs? I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate (to get a 32bit value on 32bit x86 for example). I of course see that conversion to/from HOST_WIDE_INT is an important operation that would get slightly more complicated. Maybe just quickly checking the code generated on 32bit x86 for HOST_WIDE_INT vs. HOST_WIDEST_FAST_INT tells us whether it's worth considering (it would be bad if each add/multiply would end up calling to libgcc for example - I know that doesn't happen for x86, but maybe it would happen for an arm hosted gcc targeting x86_64?)

This is an interesting point. my guess is that it is unlikely to be worth the work. consider add: most machines have add with carry, and well written 32 bit ports would have used an add with carry sequence rather than making the libcall. If i rewrite wide-int in terms of host_fastest_int, then i have to do some messy code to compute the carry, which is unlikely to translate into the proper carry instructions. Not to mention the cost overhead of converting to and from HFI, given that gcc is written almost entirely using HWIs.

I thought about the possible idea of just converting the mul and div functions. This would be easy because i already reblock them into HOST_WIDE_HALF_INTs to do the math. I could just do a different reblocking. However, i think that it is unlikely that doing this would ever show up on anyone's performance counts. Either way you do the same number of multiply instructions; it is just the subroutine wrapper that could possibly go away.

+  enum ShiftOp {
+    NONE,
+    /* There are two uses for the wide-int shifting functions.  The
+       first use is as an emulation of the target hardware.  The
+       second use is as service routines for other optimizations.  The
+       first case needs to be identified by passing TRUNC as the value
+       of ShiftOp so that shift amount is properly handled according to the
+       SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
+       amount is always truncated by the bytesize of the mode of
+       THIS.  */
+    TRUNC
+  };

double-int simply honors SHIFT_COUNT_TRUNCATED. Why differ from that (and thus change behavior in existing code - not sure if you do that with introducing wide-int)?

I believe that GCC is supposed to be a little schizophrenic here, at least according to the doc. when it is doing
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
Richard Biener richard.guent...@gmail.com writes: At the rtl level your idea does not work. rtl constants do not have a mode or type. Which is not true and does not matter. I tell you why. Quote: It _is_ true, as long as you read rtl constants as rtl integer constants :-) +#if TARGET_SUPPORTS_WIDE_INT + +/* Match CONST_*s that can represent compile-time constant integers. */ +#define CASE_CONST_SCALAR_INT \ + case CONST_INT: \ + case CONST_WIDE_INT which means you are only replacing CONST_DOUBLE with wide-int. And _all_ CONST_DOUBLE have a mode. Otherwise you'd have no way of creating the wide-int in the first place. No, integer CONST_DOUBLEs have VOIDmode, just like CONST_INT. Only floating-point CONST_DOUBLEs have a real mode. I understand that this makes me vulnerable to the argument that we should not let the rtl level ever dictate anything about the tree level, but the truth is that a variable len rep is almost always used for big integers. In our code, most constants of large types are small numbers. (Remember i got into this because the tree constant prop thinks that left shifting any number by anything greater than 128 is always 0 and discovered that that was just the tip of the iceberg.) But mostly i support the decision to canonize numbers to the smallest number of HWIs because most of the algorithms to do the math can be short circuited.I admit that if i had to effectively unpack most numbers to do the math, that the canonization would be a waste. However, this is not really relevant to this conversation. Yes, you could get rid of the len, but this such a small part of picture. Getting rid of 'len' in the RTX storage was only a question of whether it is an efficient way to go forward. And with considering to unify CONST_INTs and CONST_WIDE_INTs it is not. And even for CONST_WIDE_INTs (which most of the time would be 2 HWI storage, as otherwise you'd use a CONST_INT) it would be an improvement. 
FWIW, I don't really see any advantage in unifying CONST_INT and CONST_WIDE_INT, for the reasons Kenny has already given. CONST_INT can represent a large majority of the integers and it is already a fairly efficient representation. It's more important that we don't pick a design that forces one choice or the other. And I think Kenny's patch achieves that goal, because the choice is hidden behind macros and behind the wide_int interface. Thanks, Richard
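The canonization to the smallest number of HWIs that Kenny describes above can be sketched roughly as follows. This is an illustration only, not the patch's code: it assumes a least-significant-first array of 64-bit words standing in for HOST_WIDE_INT, it ignores the partial top word that arises when the precision is not a multiple of the word size, and the names canonical_len and elt are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Return the smallest length that still represents the value: a top
// word is dropped while it is pure sign extension (all zeros or all
// ones) of the word below it.
unsigned canonical_len(const int64_t *val, unsigned len) {
  while (len > 1) {
    int64_t top = val[len - 1];
    int64_t extension_of_below = val[len - 2] < 0 ? -1 : 0;
    if (top != extension_of_below)
      break;  // The top word carries information; keep it.
    --len;
  }
  return len;
}

// Read word I of a compressed value: words at or beyond LEN are
// reconstructed from the sign of the highest stored word.
int64_t elt(const int64_t *val, unsigned len, unsigned i) {
  if (i < len)
    return val[i];
  return val[len - 1] < 0 ? -1 : 0;
}
```

Small constants of a large type collapse to a single word, which is what lets most of the arithmetic short circuit. A positive value whose low word has its top bit set still needs an explicit word of zeros above it, the cost Richard Biener mentions when he talks about encoding a sign.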
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: At the rtl level your idea does not work. rtl constants do not have a mode or type. Which is not true and does not matter. I tell you why. Quote: It _is_ true, as long as you read rtl constants as rtl integer constants :-) +#if TARGET_SUPPORTS_WIDE_INT + +/* Match CONST_*s that can represent compile-time constant integers. */ +#define CASE_CONST_SCALAR_INT \ + case CONST_INT: \ + case CONST_WIDE_INT which means you are only replacing CONST_DOUBLE with wide-int. And _all_ CONST_DOUBLE have a mode. Otherwise you'd have no way of creating the wide-int in the first place. No, integer CONST_DOUBLEs have VOIDmode, just like CONST_INT. Only floating-point CONST_DOUBLEs have a real mode. I stand corrected. Now that's one more argument for infinite precision constants, as the mode is then certainly provided by the operations similar to the sign. That is, the mode (or size, or precision) of 1 certainly does not matter. I understand that this makes me vulnerable to the argument that we should not let the rtl level ever dictate anything about the tree level, but the truth is that a variable len rep is almost always used for big integers. In our code, most constants of large types are small numbers. (Remember i got into this because the tree constant prop thinks that left shifting any number by anything greater than 128 is always 0 and discovered that that was just the tip of the iceberg.) But mostly i support the decision to canonize numbers to the smallest number of HWIs because most of the algorithms to do the math can be short circuited.I admit that if i had to effectively unpack most numbers to do the math, that the canonization would be a waste. However, this is not really relevant to this conversation. Yes, you could get rid of the len, but this such a small part of picture. 
Getting rid of 'len' in the RTX storage was only a question of whether it is an efficient way to go forward. And with considering to unify CONST_INTs and CONST_WIDE_INTs it is not. And even for CONST_WIDE_INTs (which most of the time would be 2 HWI storage, as otherwise you'd use a CONST_INT) it would be an improvement. FWIW, I don't really see any advantage in unifying CONST_INT and CONST_WIDE_INT, for the reasons Kenny has already given. CONST_INT can represent a large majority of the integers and it is already a fairly efficient representation. It's more important that we don't pick a design that forces one choice or the other. And I think Kenny's patch achieves that goal, because the choice is hidden behind macros and behind the wide_int interface. Not unifying const-int and double-int in the end would be odd. Richard. Thanks, Richard
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
Richard Biener richard.guent...@gmail.com writes: Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: At the rtl level your idea does not work. rtl constants do not have a mode or type. Which is not true and does not matter. I tell you why. Quote: It _is_ true, as long as you read rtl constants as rtl integer constants :-) +#if TARGET_SUPPORTS_WIDE_INT + +/* Match CONST_*s that can represent compile-time constant integers. */ +#define CASE_CONST_SCALAR_INT \ + case CONST_INT: \ + case CONST_WIDE_INT which means you are only replacing CONST_DOUBLE with wide-int. And _all_ CONST_DOUBLE have a mode. Otherwise you'd have no way of creating the wide-int in the first place. No, integer CONST_DOUBLEs have VOIDmode, just like CONST_INT. Only floating-point CONST_DOUBLEs have a real mode. I stand corrected. Now that's one more argument for infinite precision constants, as the mode is then certainly provided by the operations similar to the sign. That is, the mode (or size, or precision) of 1 certainly does not matter. I disagree. Although CONST_INT and CONST_DOUBLE don't _store_ a mode, they are always interpreted according to a particular mode. It's just that that mode has to be specified separately. That's why so many rtl functions have (enum machine_mode, rtx) pairs. Infinite precision seems very alien to rtl, where everything is interpreted according to a particular mode (whether that mode is stored in the rtx or not). For one thing, I don't see how infinite precision could work in an environment where signedness often isn't defined. E.g. if you optimise an addition of two rtl constants, you don't know (and aren't supposed to know) whether the values involved are signed or unsigned. With fixed-precision arithmetic it doesn't matter, because both operands must have the same precision, and because bits outside the precision are not significant. 
With infinite precision arithmetic, the choice carries over to the next operation. E.g., to take a 4-bit example, you don't know when constructing a wide_int from an rtx whether 0b1000 represents 8 or -8. But if you have no precision to say how many bits are significant, you have to pick one. Which do you choose? And why should we have to make a choice at all? (Note that this is a different question to whether the internal wide_int representation is sign-extending or not, which is purely an implementation detail. The same implementation principle applies to CONST_INTs: the HWI in a CONST_INT is always sign-extended from the msb of the represented value, although of course the CONST_INT itself doesn't tell you which bit the msb is; that has to be determined separately.) A particular wide_int isn't, and IMO shouldn't be, inherently signed or unsigned. The rtl model is that signedness is a question of interpretation rather than representation. I realise trees are different, because signedness is a property of the type rather than operations on the type, but I still think fixed precision works with both tree and rtl whereas infinite precision doesn't work with rtl. I also fear there are going to be lots of bugs where we forget to truncate the result of an N-bit operation from infinite precision to N bits before using it in the next operation (as per Kenny's ring explanation). With finite precision, and with all-important asserts that the operands have consistent precisions, we shouldn't have any hidden bugs like that. If there are parts of gcc that really want to do infinite-precision arithmetic, mpz_t ought to be as good as anything. Thanks, Richard
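The 4-bit example above can be checked with a small sketch. It is an illustration under stated assumptions, not GCC code: the helper names are hypothetical, values are held in a 64-bit host word, and the precision is assumed to be between 1 and 63.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low PRECISION bits.
uint64_t truncate_to(uint64_t bits, unsigned precision) {
  return bits & ((uint64_t{1} << precision) - 1);
}

// Interpret the low PRECISION bits as an unsigned number.
uint64_t as_unsigned(uint64_t bits, unsigned precision) {
  return truncate_to(bits, precision);
}

// Interpret the low PRECISION bits as a two's complement number.
int64_t as_signed(uint64_t bits, unsigned precision) {
  uint64_t sign = uint64_t{1} << (precision - 1);
  uint64_t low = truncate_to(bits, precision);
  return static_cast<int64_t>(low ^ sign) - static_cast<int64_t>(sign);
}

// Fixed-precision addition: no signedness is needed to compute the
// result bits, because bits outside the precision are not significant.
uint64_t add_fixed(uint64_t a, uint64_t b, unsigned precision) {
  return truncate_to(a + b, precision);
}
```

The pattern 0x8 at precision 4 reads as 8 unsigned and as -8 signed. Adding 0x3 gives 0xb either way, which reads as 11 (8 + 3) or as -5 (-8 + 3), so the operation itself never had to pick an interpretation.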
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On 04/22/2013 08:20 AM, Richard Biener wrote:

That said, a lot of my pushback is because I feel a little lonesome in this wide-int review and don't want to decide, all on my own, about that (generic) interface part as well.

yeh, now sandiford is back from vacation so there are two of us to beat on you about how bad it would be to do infinite precision!!! be careful what you wish for.

kenny
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
Richard, i pulled these two frags out of your comments because i wanted to get some input from you on it while i addressed the other issues you raised. + enum SignOp { +/* Many of the math functions produce different results depending + on if they are SIGNED or UNSIGNED. In general, there are two + different functions, whose names are prefixed with an 'S and + or an 'U'. However, for some math functions there is also a + routine that does not have the prefix and takes an SignOp + parameter of SIGNED or UNSIGNED. */ +SIGNED, +UNSIGNED + }; You seem to insist on that. It should propagate to the various parts of the compiler that have settled for the 'uns' integer argument. Having one piece behave different is just weird. I suppose I will find code like wi.ext (prec, uns ? UNSIGNED : SIGNED) there is a lot more flexibility on my part than you perceive with respect to this point. My primary issue is that i do not want to have is an interface that has 0 and 1 as a programmer visible part. Beyond that i am open to suggestion. The poster child of my hate are the host_integer_p and the tree_low_cst interfaces. I did not want the wide int stuff to look like these. I see several problems with these: 1) of the 314 places where tree_low_cst is called in the gcc directory (not the subdirectories where the front ends live), NONE of the calls have a variable second parameter. There are a handful of places, as one expects, in the front ends that do, but NONE in the middle end. 2) there are a small number of the places where host_integer_p is called with one parameter and then it is followed by a call to tree_low_cst that has the value with the other sex. I am sure these are mistakes, but having the 0s and 1s flying around does not make it easy to spot them. 3) tree_low_cst implies that the tree cst has only two hwis in it. While i do not want to propagate an interface with 0 and 1 into wide-int, i can understand your dislike of having a wide-int only solution for this. 
I will point out that for your particular example, uns is almost always set by a call to TYPE_UNSIGNED. There could easily be a different type accessor that converts this part of the type to the right thing to pass in here. I think that there is certainly some place for there to be a unified SYMBOLIC api that controls the signedness everywhere in the compiler.

I would like to move toward this direction, but you have been so negative to the places where i have made it convenient to directly convert from tree or rtl into or out of wide-int that i have hesitated to do something that directly links trees and wide-int. So i would like to ask you what you would like?

+  template <typename T>
+  inline bool gt_p (T c, SignOp sgn) const;
+  template <typename T>
+  inline bool gts_p (T c) const;
+  template <typename T>
+  inline bool gtu_p (T c) const;

it's bad that we can't use the sign information we have available in almost all cases ... (where precision is not an exact multiple of HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT).

It isn't hard to encode a sign - you just have to possibly waste a word of zeroes for positive values where at the moment precision is an exact multiple of HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT. Which of course means that the encoding can be one word larger than maximally required by 'precision'.

Going back to point 1 above, the front ends structure the middle end code where (generally) the sign that is used is encoded in the operator that one is looking at. So the majority of uses in the middle end fall into the second or third templates, and the first template is there as a convenience routine for the middle ends. The front ends certainly use the first template.

This is how the rtl level has survived so long without a sign bit in the modes: the operators tell the whole story. The truth is that in the middle end, the story is the same - it is the operators (most of the time) that drive the calls being made.

There is an assumption that you are making that i certainly do not believe is true in the backends and i kind of doubt is true in the middle ends. That is that the sign of the compare ALWAYS matches the sign of the operands. Given that i have never seen any code that verifies this in the middle end, i am going to assume that it is not true, because it is always true in gcc that anything that we do not explicitly verify generally turns out to only be generally true and you can spend your life tracking down the end cases. This is a needless complication.

At the rtl level, this is completely doomed by the GEN_INT, which takes neither a mode nor an indication of sign. To assume that there is any meaningful sign information there is a horror story waiting to be written (sure what could go wrong if we go into the old house? whats that
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Tue, Apr 16, 2013 at 10:07 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

Richard,

I made major changes to wide-int along the lines you suggested. Each of the binary operations is now a template. There are 5 possible implementations of those operations, one for each of HWI, unsigned HWI, wide-int, rtl, and tree. Note that this is not exactly as you suggested, but it is along the same lines.

The HWI template sign extends the value to the precision of the first operand; the unsigned HWI is the same except that it is an unsigned extension. The wide-int version is used as before, but is in truth rarely used. The rtl and tree logically convert the value to a wide-int but in practice do something more efficient than converting to the wide-int. What they do is look inside the rtl or the tree and pass a pointer to the data and a length to the binary operation. This is perfectly safe in the position of a second operand to the binary operation because the lifetime is guaranteed to be very short. The wide-int implementation was also modified to do the same pointer trick, allowing all 5 templates to share the same use of the data.

Note that currently the tree code is more crufty than one would like. This will clean up nicely when the tree-cst is changed to represent the value with an array and a length field.

So now, at least for the second operand of binary operations, the storage is never copied. I do not believe that there is a good similar trick for the first operand. i did not consider something like wide_int::add (a, b) to be a viable option; it seems to miss the point of using an object oriented language. So I think that you really have to copy the data into an instance of a wide int.

However, while all of this avoids ever having to pass a precision into the second operand, this patch does preserve the finite math implementation of wide-int. Finite math is really what people expect an optimizer to do, because it seamlessly matches what the machine is going to do.
I hope at this point, i can get a comprehensive review on these patches. I believe that I have done what is required. There are two other patches that will be submitted in the next few minutes. The first one is an updated version of the rtl level patch. The only changes from what you have seen before are that the binary operations now use the templated binary operations. The second one is the first of the tree level patches. It converts builtins.c to use both use wide-int and it removes all assumptions that tree-csts are built with two HWIs. Once builtins.c is accepted, i will convert the rest of the middle end patches. They will all be converted in a similar way. + number of elements of the vector that are in use. When LEN * + HOST_BITS_PER_WIDE_INT the precision, the value has been + compressed. The values of the elements of the vector greater than + LEN - 1. are all equal to the highest order bit of LEN. equal to the highest order bit of element LEN - 1. ? Especially _not_ equal to the precision - 1 bit of the value, correct? + The representation does not contain any information inherant about + signedness of the represented value, so it can be used to represent + both signed and unsigned numbers. For operations where the results + depend on signedness (division, comparisons), the signedness must + be specified separately. For operations where the signness + matters, one of the operands to the operation specifies either + wide_int::SIGNED or wide_int::UNSIGNED. The last sentence is somehow duplicated. + The numbers are stored as sign entended numbers as a means of + compression. Leading HOST_WIDE_INTS that contain strings of either + -1 or 0 are removed as long as they can be reconstructed from the + top bit that is being represented. I'd put this paragraph before the one that talks about signedness, next to the one that already talks about encoding. + All constructors for wide_int take either a precision, an enum + machine_mode or tree_type. 
*/ That's probably no longer true (I'll now check). +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is use by tree-vrp. tree-vpn + currently does a 2x largest mode by 2x largest mode yielding a 4x + largest mode result. If operations are added that require larger + buffers, then VAL needs to be changed. */ + HOST_WIDE_INT val[WIDE_INT_MAX_ELTS]; + unsigned short len; + unsigned int precision; I wonder if there is a technical reason to stick to HOST_WIDE_INTs? I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate (to get a 32bit value on 32bit x86 for example). I of course see that conversion to/from HOST_WIDE_INT is an important operation that would get slightly more
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Fri, Apr 5, 2013 at 2:34 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

Richard,

There has been something that has bothered me about your proposal for the storage manager and i think i can now characterize that problem. Say i want to compute the expression (a + b) / c converting from tree values, using wide-int as the engine and then storing the result in a tree. (A very common operation for the various simplifiers in gcc.)

in my version of wide-int, where there is only the stack allocated fixed size allocation for the data, the compiler arranges for 6 instances of wide-int that are statically allocated on the stack when the function is entered. There would be 3 copies of the precision and data to get things started and one allocation of a variable sized object at the end when the INT_CST is built and one copy to put it back. As i have argued, these copies are of negligible size.

In your world, to get things started, you would do 3 pointer copies to get the values out of the tree to set the expression leaves, but then you will call the allocator 3 times to get space to hold the intermediate nodes before you get to pointer copy the result back into the result cst, which still needs an allocation to build it. I am assuming that we can play the same game at the tree level that we do at the rtl level where we do 1 variable sized allocation to get the entire INT_CST rather than doing 1 fixed sized allocation and 1 variable sized one.

even if we take the simpler example of a + b, you still lose. The cost of the extra allocation and its subsequent recovery is more than my copies. In fact, even in the simplest case of someone going from a HWI thru wide_int into tree, you have 2 allocations vs my 1.

Just to clarify, my code wouldn't handle

  tree a, b, c;
  tree res = (a + b) / c;

transparently.
The most complex form of the above that I think would be reasonable to handle would be

  tree a, b, c;
  wide_int wires = (wi (a) + b) / c;
  tree res = build_int_cst (TREE_TYPE (a), wires);

and the code as posted would even require you to specify the return type of operator+ and operator/ explicitly, like

  wide_int wires = (wi (a).operator+<wi_embed_var> (b)).operator/<wi_embed_var> (c);

but as I said I just didn't bother to decide that the return type is always of wide_int variable-len-storage kind.

Now, the only real allocation that happens is done by build_int_cst. There is one wide_int on the stack to hold the a + b result and one separate wide_int to hold wires (it's literally written in the code). There are no pointer copies involved in the end - the result from converting a tree to a wide_int<tree-storage> is the original 'tree' pointer itself, thus a register.

I just do not see the cost savings and if there are no cost savings, you certainly cannot say that having these templates is simpler than not having the templates.

I think you are missing the point - by abstracting away the storage you don't necessarily need to add the templates. But you open up a very easy route for doing so and you make the operations _trivially_ work on the tree / RTL storage with no overhead in generated code and minimal overhead in the amount of code in GCC itself. In my prototype the overhead of adding 'tree' support is to place

  class wi_tree_int_cst
  {
    tree cst;
  public:
    void construct (tree c) { cst = c; }
    const HOST_WIDE_INT *storage() const
      { return reinterpret_cast <HOST_WIDE_INT *>(&TREE_INT_CST (cst)); }
    unsigned len() const { return 2; }
  };

  template <>
  class wi_traits <tree>
  {
  public:
    typedef wide_int <wi_tree_int_cst> wi_t;
    wi_traits(tree t)
      {
        wi_tree_int_cst ws;
        ws.construct (t);
        w.construct (ws);
      }
    wi_t* operator->() { return &w; }
  private:
    wi_t w;
  };

into tree.h.

Richard.
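As a rough, self-contained illustration of the storage abstraction idea argued for above, the sketch below parameterizes a toy integer class on a storage policy that only has to provide storage() and len(). None of these names come from the prototype: fixed_storage, borrowed_storage and toy_wide_int are hypothetical, and a plain array of 64-bit words stands in for the contents of a tree constant.

```cpp
#include <cassert>
#include <cstdint>

// Storage model that owns a small fixed-size buffer.
struct fixed_storage {
  int64_t val[2];
  unsigned n;
  const int64_t *storage() const { return val; }
  unsigned len() const { return n; }
};

// Storage model that borrows words living elsewhere, for example
// inside a constant node; nothing is copied or allocated.
struct borrowed_storage {
  const int64_t *ptr;
  unsigned n;
  const int64_t *storage() const { return ptr; }
  unsigned len() const { return n; }
};

// The operations are written once against the storage interface.
template <typename Storage>
class toy_wide_int {
  Storage s;

public:
  explicit toy_wide_int(const Storage &st) : s(st) {}
  const int64_t *storage() const { return s.storage(); }
  unsigned len() const { return s.len(); }

  // Equality across different storage models.
  template <typename Other>
  bool eq_p(const toy_wide_int<Other> &o) const {
    if (len() != o.len())
      return false;
    for (unsigned i = 0; i < len(); ++i)
      if (storage()[i] != o.storage()[i])
        return false;
    return true;
  }
};
```

A toy_wide_int<borrowed_storage> built over an existing array reads the words in place, which is the "no pointer copies, no allocation" property claimed for the tree adaptor. Whether that outweighs the cost of the extra template machinery is exactly what the two authors disagree about.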
Kenny On 04/02/2013 11:04 AM, Richard Biener wrote: On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch contains a large number of the changes requested by Richi. It does not contain any of the changes that he requested to abstract the storage layer. That suggestion appears to be quite unworkable. I of course took this claim as a challenge ... with the following result. It is of course quite workable ;) The attached patch implements the core wide-int class and three storage models (fixed size for things like plain HWI and double-int, variable size similar to how your wide-int works and an adaptor for the double-int as contained in trees). With that you can now do HOST_WIDE_INT wi_test (tree x) { // template argument deduction doesn't do the magic we want it to do // to make this kind of implicit conversions work // overload resolution considers this kind of conversions so we // need some magic that combines both ... but seeding the overload // set with some instantiations doesn't seem to be possible :/ // wide_int w = x + 1; wide_int w; w += x; w += 1; // template argument
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
Richard,

There has been something that has bothered me about your proposal for the storage manager and i think i can now characterize that problem. Say i want to compute the expression (a + b) / c converting from tree values, using wide-int as the engine and then storing the result in a tree. (A very common operation for the various simplifiers in gcc.)

In my version of wide-int, where there is only the stack allocated fixed size allocation for the data, the compiler arranges for 6 instances of wide-int that are statically allocated on the stack when the function is entered. There would be 3 copies of the precision and data to get things started, one allocation of a variable sized object at the end when the INT_CST is built, and one copy to put it back. As i have argued, these copies are of negligible size.

In your world, to get things started, you would do 3 pointer copies to get the values out of the tree to set the expression leaves but then you will call the allocator 3 times to get space to hold the intermediate nodes before you get to pointer copy the result back into the result cst, which still needs an allocation to build it. I am assuming that we can play the same game at the tree level that we do at the rtl level, where we do 1 variable sized allocation to get the entire INT_CST rather than doing 1 fixed sized allocation and 1 variable sized one.

Even if we take the simpler example of a + b, you still lose. The cost of the extra allocation and its subsequent recovery is more than my copies. In fact, even in the simplest case of someone going from a HWI thru wide_int into tree, you have 2 allocations vs my 1.

I just do not see the cost savings and if there are no cost savings, you certainly cannot say that having these templates is simpler than not having the templates.
Kenny On 04/02/2013 11:04 AM, Richard Biener wrote: On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch contains a large number of the changes requested by Richi. It does not contain any of the changes that he requested to abstract the storage layer. That suggestion appears to be quite unworkable. I of course took this claim as a challenge ... with the following result. It is of course quite workable ;) The attached patch implements the core wide-int class and three storage models (fixed size for things like plain HWI and double-int, variable size similar to how your wide-int works and an adaptor for the double-int as contained in trees). With that you can now do HOST_WIDE_INT wi_test (tree x) { // template argument deduction doesn't do the magic we want it to do // to make this kind of implicit conversions work // overload resolution considers this kind of conversions so we // need some magic that combines both ... but seeding the overload // set with some instantiations doesn't seem to be possible :/ // wide_int w = x + 1; wide_int w; w += x; w += 1; // template argument deduction doesn't deduce the return value type, // not considering the template default argument either ... // w = wi (x) + 1; // we could support this by providing rvalue-to-lvalue promotion // via a traits class? // otoh it would lead to sub-optimal code anyway so we should // make the result available as reference parameter and only support // wide_int res; add (res, x, 1); ? w = wi (x).operator+wide_int (1); wide_int::add(w, x, 1); return w.to_hwi (); } we are somewhat limited with C++ unless we want to get really fancy. Eventually providing operator+ just doesn't make much sense for generic wide-int combinations (though then the issue is its operands are no longer commutative which I think is the case with your wide-int or double-int as well - they don't suport 1 + wide_int for obvious reasons). So there are implementation design choices left undecided. 
Oh, and the operation implementations are crap (they compute nonsense). But you should get the idea. Richard.
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Wed, Apr 3, 2013 at 6:16 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/03/2013 09:53 AM, Richard Biener wrote: On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/03/2013 05:17 AM, Richard Biener wrote: In the end you will have a variable-size storage in TREE_INT_CST thus you will have at least to emit _code_ copying over meta-data and data from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT). I'm objecting to the amount of code you emit and agree that the runtime cost is copying the meta-data (hopefully optimizable via CSE / SRA) and in most cases one (or two) iterations of the loop copying the data (not optimizable). i did get rid of the bitsize in the wide-int patch so at this point the meta data is the precision and the len. not really a lot here. As usual we pay a high price in gcc for not pushing the tree rep down into the rtl level, then it would have been acceptable to have the tree type bleed into the wide-int code. 2) You present this as if the implementor actually should care about the implementation and you give 3 alternatives: the double_int, the current one, and HWI. We have tried to make it so that the client should not care. Certainly in my experience here, I have not found a place to care. Well, similar as for the copying overhead for tree your approach requires overloading operations for HOST_WIDE_INT operands to be able to say wi + 1 (which is certainly desirable), or the overhead of using wide_int_one (). In my opinion double_int needs to go away. That is the main thrust of my patches. There is no place in a compiler for an abi that depends on constants fitting into 2 two words whose size is defined by the host. That's true. I'm not arguing to preserve double-int - I'm arguing to preserve a way to ask for an integer type on the host with (at least) N bits. 
Almost all double-int users really ask for an integer type on the host that has at least as many bits as the pointer representation (or word_mode) on the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer targets). No double-int user specifically wants 2 * HOST_WIDE_INT precision - that is just what happens to be there. Thus I am providing a way to say get me a host integer with at least N bits (VRP asks for this, for example). What I was asking for is that whatever can provide the above should share the functional interface with wide-int (or the othert way around). And I was claiming that wide-int is too fat, because current users of double-int eventually store double-ints permanently. The problem is that, in truth, double int is too fat. 99.something% of all constants fit in 1 hwi and that is likely to be true forever (i understand that tree vpn may need some thought here). The rtl level, which has, for as long as i have known it, had 2 reps for integer constants. So it was relatively easy to slide the CONST_WIDE_INT in. It seems like the right trickery here rather than adding a storage model for wide-ints might be a way to use the c++ to invisibly support several (and by several i really mean 2) classes of TREE_CSTs. The truth is that _now_ TREE_INT_CSTs use double-ints and we have CONST_INT and CONST_DOUBLE. What I (and you) propose would get us to use variable-size storage for both, allowing to just use a single HOST_WIDE_INT in the majority of cases. In my view the constant length of the variable-size storage for TREE_INT_CSTs is determined by its type (thus, it doesn't have optimized variable-size storage but unoptimized fixed-size storage based on the maximum storage requirement for the type). Similar for RTX CONST_INT which would have fixed-size storage based on the mode-size of the constant. 
Using optimized space (thus using the encoding properties) requires you to fit the 'short len' somewhere which possibly will not pay off in the end (for tree we do have that storage available, so we could go with optimized storage for it, not sure with RTL, I don't see available space there). There are two questions here: one is the fact that you object to the fact that we represent small constants efficiently Huh? Where do I object to that? I question that for the storage in tree and RTX the encoding trick pays off if you need another HWI-aligned word to store the len. But see below. and the second is that we take advantage of the fact that fixed size stack allocation is effectively free for short lived objects like wide-ints (as i use them). I don't question that and I am not asking you to change that. As part of what I ask for a more optimal (smaller) stack allocation would be _possible_ (but not required). At the rtl level your idea does not work. rtl constants do not have a mode or type.So if you do not compress, how are you going to determine how many words you need for the constant 1. I would love to
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Tue, Apr 2, 2013 at 7:35 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

Yes, I agree that you win the challenge that it can be done. What you have always failed to address is why anyone would want to do this. Or how this would at all be desirable. But I completely agree that from a purely abstract point of view you can add a storage model. Now here is why we REALLY do not want to go down this road:

1) The following comment from your earlier mail is completely wrong

+#ifdef NEW_REP_FOR_INT_CST
+  /* This is the code once the tree level is converted.  */
+  wide_int result;
+  int i;
+
+  tree type = TREE_TYPE (tcst);
+
+  result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type));
+  result.precision = TYPE_PRECISION (type);
+  result.len = TREE_INT_CST_LEN (tcst);
+  for (i = 0; i < result.len; i++)
+    result.val[i] = TREE_INT_CST_ELT (tcst, i);
+
+  return result;
+#else

this also shows the main reason I was asking for storage abstraction. The initialization from tree is way too expensive.

In almost all cases, constants will fit in a single HWI. Thus, the only thing that you are copying is the length and a single HWI. So you are dragging in a lot of machinery just to save these two copies? Certainly there has to be more to it than that.

In the end you will have a variable-size storage in TREE_INT_CST thus you will have at least to emit _code_ copying over meta-data and data from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT). I'm objecting to the amount of code you emit and agree that the runtime cost is copying the meta-data (hopefully optimizable via CSE / SRA) and in most cases one (or two) iterations of the loop copying the data (not optimizable).

2) You present this as if the implementor actually should care about the implementation and you give 3 alternatives: the double_int, the current one, and HWI. We have tried to make it so that the client should not care. Certainly in my experience here, I have not found a place to care.
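To make the cost being argued about concrete, the conversion loop quoted above can be sketched outside of GCC roughly as follows. This is only an illustration: fake_int_cst, fixed_wide_int, MAX_WORDS and from_cst are hypothetical names, and the real tree accessors (TREE_INT_CST_LEN, TREE_INT_CST_ELT) are replaced by plain struct fields.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int64_t hwi_t;       // stand-in for HOST_WIDE_INT (assumption)
const unsigned MAX_WORDS = 4;     // assumed upper bound on words per constant

// Hypothetical stand-in for a variable-length tree constant.
struct fake_int_cst
{
  unsigned precision;
  unsigned len;
  hwi_t elt[MAX_WORDS];
};

// Fixed-size, stack-allocated value in the style described in the thread.
struct fixed_wide_int
{
  unsigned precision;
  unsigned len;
  hwi_t val[MAX_WORDS];
};

// Copy the meta-data (precision, len) and then only the LEN live words.
fixed_wide_int from_cst (const fake_int_cst &c)
{
  fixed_wide_int r;
  r.precision = c.precision;
  r.len = c.len;
  for (unsigned i = 0; i < r.len; i++)
    r.val[i] = c.elt[i];
  return r;
}
```

For a constant that fits in one word the loop body runs once, which is the "length and a single HWI" case described above; the disagreement is about the amount of code emitted for this at every conversion site, not about the run time of a single copy.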
Well, similar as for the copying overhead for tree your approach requires overloading operations for HOST_WIDE_INT operands to be able to say wi + 1 (which is certainly desirable), or the overhead of using wide_int_one ().

In my opinion double_int needs to go away. That is the main thrust of my patches. There is no place in a compiler for an abi that depends on constants fitting into two words whose size is defined by the host.

That's true. I'm not arguing to preserve double-int - I'm arguing to preserve a way to ask for an integer type on the host with (at least) N bits. Almost all double-int users really ask for an integer type on the host that has at least as many bits as the pointer representation (or word_mode) on the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer targets). No double-int user specifically wants 2 * HOST_WIDE_INT precision - that is just what happens to be there. Thus I am providing a way to say get me a host integer with at least N bits (VRP asks for this, for example). What I was asking for is that whatever can provide the above should share the functional interface with wide-int (or the other way around). And I was claiming that wide-int is too fat, because current users of double-int eventually store double-ints permanently.

This is not a beauty contest argument: we have public ports that are beginning to use modes that are larger than two x86-64 HWIs and i have a private port that has such modes, and it is my experience that any pass that uses this interface has one of three behaviors: it silently gets the wrong answer, it ICEs, or it fails to do the transformation. If we leave double_int as an available option, then any use of it potentially will have one of these three behaviors. And so one of my strong objections to this direction is that i do not want to fight this kind of bug for the rest of my life. Having a single storage model that just always works is in my opinion a highly desirable option.
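The request above for "a host integer with at least N bits" could be expressed with a small template along these lines. This is a hedged sketch rather than anything from the posted patch: words_for_bits and host_int_at_least are made-up names, hwi_t stands in for HOST_WIDE_INT, and no arithmetic is provided.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

typedef std::int64_t hwi_t;                       // stand-in for HOST_WIDE_INT
const unsigned HWI_BITS = sizeof (hwi_t) * CHAR_BIT;

// Number of host words needed to hold at least BITS bits (BITS > 0).
template <unsigned Bits>
struct words_for_bits
{
  enum { value = (Bits + HWI_BITS - 1) / HWI_BITS };
};

// Fixed-size storage whose word count is derived from the requested
// bit count at compile time instead of being hard-wired to two words.
template <unsigned Bits>
struct host_int_at_least
{
  hwi_t word[words_for_bits<Bits>::value];
  static unsigned len () { return words_for_bits<Bits>::value; }
};
```

With a 64-bit hwi_t, host_int_at_least<65> occupies two words and host_int_at_least<129> three, so a client states the precision it needs rather than relying on "2 * HOST_WIDE_INT" happening to be enough.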
What you have never answered in a concrete manner is, if we decide to provide this generality, what it would be used for. There is no place in a portable compiler where the right answer for every target is two HOST wide integers. However, i will admit that the HWI option has some merits. We try to address this in our implementation by dividing what is done inline in wide-int.h to the cases that fit in an HWI and then only drop into the heavy code in wide-int.c if mode is larger (which it rarely will be). However, a case could be made that for certain kinds of things like string lengths and such, we could use another interface or as you argue, a different storage model with the same interface. I just do not see that the cost of the conversion code is really going to show up on anyone's radar. What's the issue with abstracting away the model so a fixed-size 'len' is possible? (let away the argument that
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On 04/03/2013 05:17 AM, Richard Biener wrote: In the end you will have a variable-size storage in TREE_INT_CST thus you will have at least to emit _code_ copying over meta-data and data from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT). I'm objecting to the amount of code you emit and agree that the runtime cost is copying the meta-data (hopefully optimizable via CSE / SRA) and in most cases one (or two) iterations of the loop copying the data (not optimizable). i did get rid of the bitsize in the wide-int patch so at this point the meta data is the precision and the len. not really a lot here. As usual we pay a high price in gcc for not pushing the tree rep down into the rtl level, then it would have been acceptable to have the tree type bleed into the wide-int code. 2) You present this as if the implementor actually should care about the implementation and you give 3 alternatives: the double_int, the current one, and HWI. We have tried to make it so that the client should not care. Certainly in my experience here, I have not found a place to care. Well, similar as for the copying overhead for tree your approach requires overloading operations for HOST_WIDE_INT operands to be able to say wi + 1 (which is certainly desirable), or the overhead of using wide_int_one (). In my opinion double_int needs to go away. That is the main thrust of my patches. There is no place in a compiler for an abi that depends on constants fitting into 2 two words whose size is defined by the host. That's true. I'm not arguing to preserve double-int - I'm arguing to preserve a way to ask for an integer type on the host with (at least) N bits. Almost all double-int users really ask for an integer type on the host that has at least as many bits as the pointer representation (or word_mode) on the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer targets). 
No double-int user specifically wants 2 * HOST_WIDE_INT precision - that is just what happens to be there. Thus I am providing a way to say get me a host integer with at least N bits (VRP asks for this, for example). What I was asking for is that whatever can provide the above should share the functional interface with wide-int (or the othert way around). And I was claiming that wide-int is too fat, because current users of double-int eventually store double-ints permanently. The problem is that, in truth, double int is too fat. 99.something% of all constants fit in 1 hwi and that is likely to be true forever (i understand that tree vpn may need some thought here). The rtl level, which has, for as long as i have known it, had 2 reps for integer constants. So it was relatively easy to slide the CONST_WIDE_INT in. It seems like the right trickery here rather than adding a storage model for wide-ints might be a way to use the c++ to invisibly support several (and by several i really mean 2) classes of TREE_CSTs. This is not a beauty contest argument, we have public ports are beginning to use modes that are larger than two x86-64 HWIs and i have a private port that has such modes and it is my experience that any pass that uses this interface has one of three behaviors: it silently gets the wrong answer, it ices, or it fails to do the transformation. If we leave double_int as an available option, then any use of it potentially will have one of these three behaviors. And so one of my strong objections to this direction is that i do not want to fight this kind of bug for the rest of my life. Having a single storage model that just always works is in my opinion a highly desirable option. What you have never answered in a concrete manner is, if we decide to provide this generality, what it would be used for. There is no place in a portable compiler where the right answer for every target is two HOST wide integers. However, i will admit that the HWI option has some merits. 
We try to address this in our implementation by dividing what is done inline in wide-int.h to the cases that fit in an HWI and then only drop into the heavy code in wide-int.c if mode is larger (which it rarely will be). However, a case could be made that for certain kinds of things like string lengths and such, we could use another interface or as you argue, a different storage model with the same interface. I just do not see that the cost of the conversion code is really going to show up on anyone's radar. What's the issue with abstracting away the model so a fixed-size 'len' is possible? (let away the argument that this would easily allow an adaptor to tree) I have a particularly pessimistic perspective because i have already written most of this patch. It is not that i do not want to change that code, it is that i have seen a certain set of mistakes that were made and i do not want to fix them more than once. At the rtl level you can see the transition from only supporting 32 bit ints to supporting 64 bit
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/03/2013 05:17 AM, Richard Biener wrote: In the end you will have a variable-size storage in TREE_INT_CST thus you will have at least to emit _code_ copying over meta-data and data from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT). I'm objecting to the amount of code you emit and agree that the runtime cost is copying the meta-data (hopefully optimizable via CSE / SRA) and in most cases one (or two) iterations of the loop copying the data (not optimizable). i did get rid of the bitsize in the wide-int patch so at this point the meta data is the precision and the len. not really a lot here. As usual we pay a high price in gcc for not pushing the tree rep down into the rtl level, then it would have been acceptable to have the tree type bleed into the wide-int code. 2) You present this as if the implementor actually should care about the implementation and you give 3 alternatives: the double_int, the current one, and HWI. We have tried to make it so that the client should not care. Certainly in my experience here, I have not found a place to care. Well, similar as for the copying overhead for tree your approach requires overloading operations for HOST_WIDE_INT operands to be able to say wi + 1 (which is certainly desirable), or the overhead of using wide_int_one (). In my opinion double_int needs to go away. That is the main thrust of my patches. There is no place in a compiler for an abi that depends on constants fitting into 2 two words whose size is defined by the host. That's true. I'm not arguing to preserve double-int - I'm arguing to preserve a way to ask for an integer type on the host with (at least) N bits. Almost all double-int users really ask for an integer type on the host that has at least as many bits as the pointer representation (or word_mode) on the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer targets). 
No double-int user specifically wants 2 * HOST_WIDE_INT precision - that is just what happens to be there. Thus I am providing a way to say get me a host integer with at least N bits (VRP asks for this, for example). What I was asking for is that whatever can provide the above should share the functional interface with wide-int (or the othert way around). And I was claiming that wide-int is too fat, because current users of double-int eventually store double-ints permanently. The problem is that, in truth, double int is too fat. 99.something% of all constants fit in 1 hwi and that is likely to be true forever (i understand that tree vpn may need some thought here). The rtl level, which has, for as long as i have known it, had 2 reps for integer constants. So it was relatively easy to slide the CONST_WIDE_INT in. It seems like the right trickery here rather than adding a storage model for wide-ints might be a way to use the c++ to invisibly support several (and by several i really mean 2) classes of TREE_CSTs. The truth is that _now_ TREE_INT_CSTs use double-ints and we have CONST_INT and CONST_DOUBLE. What I (and you) propose would get us to use variable-size storage for both, allowing to just use a single HOST_WIDE_INT in the majority of cases. In my view the constant length of the variable-size storage for TREE_INT_CSTs is determined by its type (thus, it doesn't have optimized variable-size storage but unoptimized fixed-size storage based on the maximum storage requirement for the type). Similar for RTX CONST_INT which would have fixed-size storage based on the mode-size of the constant. Using optimized space (thus using the encoding properties) requires you to fit the 'short len' somewhere which possibly will not pay off in the end (for tree we do have that storage available, so we could go with optimized storage for it, not sure with RTL, I don't see available space there). 
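The "optimized" variable-size storage mentioned above depends on being able to drop high words that carry no information. Assuming the encoding is the usual sign-extension compression (an assumption on my part; the thread only says "encoding properties"), the stored length can be computed as in this sketch. compressed_len and hwi_t are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int64_t hwi_t;   // stand-in for HOST_WIDE_INT (assumption)

// Return the smallest LEN >= 1 such that words [LEN, N) of VAL are just
// the sign extension of word LEN - 1 and therefore need not be stored.
unsigned compressed_len (const hwi_t *val, unsigned n)
{
  unsigned len = n;
  while (len > 1)
    {
      hwi_t top = val[len - 1];
      hwi_t extension = val[len - 2] < 0 ? -1 : 0;
      if (top != extension)
        break;                // the top word carries real information
      --len;
    }
  return len;
}
```

Under this scheme small values of a wide type, positive or negative, collapse to one word, while a value such as 2^64 - 1 keeps a second, all-zero word so that it is not mistaken for -1. The price is that the resulting 'short len' has to be stored somewhere, which is the space question raised above.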
This is not a beauty contest argument, we have public ports are beginning to use modes that are larger than two x86-64 HWIs and i have a private port that has such modes and it is my experience that any pass that uses this interface has one of three behaviors: it silently gets the wrong answer, it ices, or it fails to do the transformation. If we leave double_int as an available option, then any use of it potentially will have one of these three behaviors. And so one of my strong objections to this direction is that i do not want to fight this kind of bug for the rest of my life. Having a single storage model that just always works is in my opinion a highly desirable option. What you have never answered in a concrete manner is, if we decide to provide this generality, what it would be used for. There is no place in a portable compiler where the right answer for every target is two HOST wide integers. However, i will admit that the HWI option has some merits. We try to address this
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On 04/03/2013 09:53 AM, Richard Biener wrote: On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/03/2013 05:17 AM, Richard Biener wrote: In the end you will have a variable-size storage in TREE_INT_CST thus you will have at least to emit _code_ copying over meta-data and data from the tree representation to the wide-int (similar for RTX CONST_DOUBLE/INT). I'm objecting to the amount of code you emit and agree that the runtime cost is copying the meta-data (hopefully optimizable via CSE / SRA) and in most cases one (or two) iterations of the loop copying the data (not optimizable). i did get rid of the bitsize in the wide-int patch so at this point the meta data is the precision and the len. not really a lot here. As usual we pay a high price in gcc for not pushing the tree rep down into the rtl level, then it would have been acceptable to have the tree type bleed into the wide-int code. 2) You present this as if the implementor actually should care about the implementation and you give 3 alternatives: the double_int, the current one, and HWI. We have tried to make it so that the client should not care. Certainly in my experience here, I have not found a place to care. Well, similar as for the copying overhead for tree your approach requires overloading operations for HOST_WIDE_INT operands to be able to say wi + 1 (which is certainly desirable), or the overhead of using wide_int_one (). In my opinion double_int needs to go away. That is the main thrust of my patches. There is no place in a compiler for an abi that depends on constants fitting into 2 two words whose size is defined by the host. That's true. I'm not arguing to preserve double-int - I'm arguing to preserve a way to ask for an integer type on the host with (at least) N bits. 
Almost all double-int users really ask for an integer type on the host that has at least as many bits as the pointer representation (or word_mode) on the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer targets). No double-int user specifically wants 2 * HOST_WIDE_INT precision - that is just what happens to be there. Thus I am providing a way to say get me a host integer with at least N bits (VRP asks for this, for example). What I was asking for is that whatever can provide the above should share the functional interface with wide-int (or the othert way around). And I was claiming that wide-int is too fat, because current users of double-int eventually store double-ints permanently. The problem is that, in truth, double int is too fat. 99.something% of all constants fit in 1 hwi and that is likely to be true forever (i understand that tree vpn may need some thought here). The rtl level, which has, for as long as i have known it, had 2 reps for integer constants. So it was relatively easy to slide the CONST_WIDE_INT in. It seems like the right trickery here rather than adding a storage model for wide-ints might be a way to use the c++ to invisibly support several (and by several i really mean 2) classes of TREE_CSTs. The truth is that _now_ TREE_INT_CSTs use double-ints and we have CONST_INT and CONST_DOUBLE. What I (and you) propose would get us to use variable-size storage for both, allowing to just use a single HOST_WIDE_INT in the majority of cases. In my view the constant length of the variable-size storage for TREE_INT_CSTs is determined by its type (thus, it doesn't have optimized variable-size storage but unoptimized fixed-size storage based on the maximum storage requirement for the type). Similar for RTX CONST_INT which would have fixed-size storage based on the mode-size of the constant. 
Using optimized space (thus using the encoding properties) requires you to fit the 'short len' somewhere which possibly will not pay off in the end (for tree we do have that storage available, so we could go with optimized storage for it, not sure with RTL, I don't see available space there).

There are two questions here: one is the fact that you object to the fact that we represent small constants efficiently and the second is that we take advantage of the fact that fixed size stack allocation is effectively free for short lived objects like wide-ints (as i use them).

At the rtl level your idea does not work. rtl constants do not have a mode or type. So if you do not compress, how are you going to determine how many words you need for the constant 1? I would love to have a rep that had the mode in it. But it is a huge change that requires a lot of hacking to every port.

I understand that this makes me vulnerable to the argument that we should not let the rtl level ever dictate anything about the tree level, but the truth is that a variable len rep is almost always used for big integers. In our code, most constants of large types are small numbers. (Remember i got into this because the tree constant prop thinks that left shifting any number by anything
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck zad...@naturalbridge.com wrote:

This patch contains a large number of the changes requested by Richi. It does not contain any of the changes that he requested to abstract the storage layer. That suggestion appears to be quite unworkable.

I of course took this claim as a challenge ... with the following result. It is of course quite workable ;)

The attached patch implements the core wide-int class and three storage models (fixed size for things like plain HWI and double-int, variable size similar to how your wide-int works and an adaptor for the double-int as contained in trees). With that you can now do

HOST_WIDE_INT
wi_test (tree x)
{
  // template argument deduction doesn't do the magic we want it to do
  // to make this kind of implicit conversions work
  // overload resolution considers this kind of conversions so we
  // need some magic that combines both ... but seeding the overload
  // set with some instantiations doesn't seem to be possible :/
  // wide_int<> w = x + 1;
  wide_int<> w;
  w += x;
  w += 1;
  // template argument deduction doesn't deduce the return value type,
  // not considering the template default argument either ...
  // w = wi (x) + 1;
  // we could support this by providing rvalue-to-lvalue promotion
  // via a traits class?
  // otoh it would lead to sub-optimal code anyway so we should
  // make the result available as reference parameter and only support
  // wide_int <> res; add (res, x, 1); ?
  w = wi (x).operator+<wide_int<> >(1);
  wide_int<>::add(w, x, 1);
  return w.to_hwi ();
}

we are somewhat limited with C++ unless we want to get really fancy. Eventually providing operator+ just doesn't make much sense for generic wide-int combinations (though then the issue is its operands are no longer commutative which I think is the case with your wide-int or double-int as well - they don't support 1 + wide_int for obvious reasons).

So there are implementation design choices left undecided.
Oh, and the operation implementations are crap (they compute nonsense). But you should get the idea.

Richard.

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "hwint.h"
#include "tree.h"

/* ??? wide-int should probably use HOST_WIDEST_FAST_INT as storage,
   not HOST_WIDE_INT.  Yeah, we could even template on that ...  */

/* Fixed-length embedded storage.  wi_embed<2> is double-int,
   wi_embed<1> is a plain HOST_WIDE_INT.  Can be used for
   small fixed-(minimum)-size calculations on hosts that have
   no suitable integer type.  */

template <unsigned sz>
class wi_embed
{
private:
  HOST_WIDE_INT s[sz];

public:
  void construct () {}
  HOST_WIDE_INT* storage() { return s; }
  const HOST_WIDE_INT* storage() const { return s; }
  unsigned len() const { return sz; }
  void set_len(unsigned l) { gcc_checking_assert (l <= sz); }
};

/* Fixed maximum-length embedded storage but variable dynamic size.  */

//#define MAXSZ (4 * (MAX_MODE_INT_SIZE / HOST_BITS_PER_WIDE_INT))
#define MAXSZ 8

template <unsigned max_sz>
class wi_embed_var
{
private:
  unsigned len_;
  HOST_WIDE_INT s[max_sz];

public:
  void construct () { len_ = 0; }
  HOST_WIDE_INT* storage() { return s; }
  const HOST_WIDE_INT* storage() const { return s; }
  unsigned len() const { return len_; }
  void set_len(unsigned l) { len_ = l; }
};

/* The wide-int class.  Defaults to variable-length storage
   (alternatively use a typedef to avoid the need to use wide_int<>).  */

template <class S = wi_embed_var<MAXSZ> >
class wide_int;

/* Avoid constructors / destructors to make sure this is a C++04 POD.  */

/* Basic wide_int class.  The storage model allows for rvalue
   storage abstraction avoiding copying from for example tree
   or RTX and to avoid the need of explicit construction for
   integral arguments of up to HWI size.

   A storage model needs to provide the following methods:
    - construct (), default-initialize the storage
    - unsigned len () const, the size of the storage in HWI quantities
    - const HOST_WIDE_INT *storage () const, return a pointer
      to read-only HOST_WIDE_INT storage of size len ().
    - HOST_WIDE_INT *storage (), return a pointer to writable
      HOST_WIDE_INT storage of size len ().  This method is optional.
    - void set_len (unsigned l), adjust the size of the storage
      to at least l HWI words.

   Conversions of wide_int _to_ tree or RTX or HWI are explicit.

   Conversions to wide_int happen with overloads to the global
   function template wi () or via wide_int_traits specializations.  */

/* ??? With mixed length operations there are encoding issues
   for signed vs. unsigned numbers.  The easiest encoding is to
   say wide-ints are always signed which means that -1U needs
   the MSB of the wide-int storage as zero which means an extra
   word with zeros.  The sign-bit of a wide-int is then always
   storage()[len() (1 (HOST_BITS_PER_WIDE_INT - 1))].  */

template <class S>
class wide_int : private S
{
  /* Allow
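To make the storage-model contract listed above a bit more concrete, here is a minimal standalone sketch. It is not the attached patch: hwi stands in for HOST_WIDE_INT, and fixed_storage, var_storage and toy_int are hypothetical names invented for this illustration. It only shows that a client class written against construct / len / storage / set_len can be instantiated with either a fixed-size or a variable-size model.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for GCC's HOST_WIDE_INT (an assumption for this sketch).
typedef int64_t hwi;

// Fixed-size model: the length is a compile-time constant.
template <unsigned sz>
class fixed_storage
{
  hwi s[sz];

public:
  void construct () { for (unsigned i = 0; i < sz; ++i) s[i] = 0; }
  hwi *storage () { return s; }
  const hwi *storage () const { return s; }
  unsigned len () const { return sz; }
  void set_len (unsigned l) { assert (l <= sz); }
};

// Variable-length model with a fixed maximum number of words.
template <unsigned max_sz>
class var_storage
{
  unsigned len_;
  hwi s[max_sz];

public:
  void construct ()
  {
    len_ = 0;
    for (unsigned i = 0; i < max_sz; ++i) s[i] = 0;
  }
  hwi *storage () { return s; }
  const hwi *storage () const { return s; }
  unsigned len () const { return len_; }
  void set_len (unsigned l) { assert (l <= max_sz); len_ = l; }
};

// Hypothetical client that only relies on the storage-model methods.
template <class S>
class toy_int : private S
{
public:
  static toy_int from_hwi (hwi v)
  {
    toy_int r;
    r.construct ();
    r.set_len (1);
    r.storage ()[0] = v;
    return r;
  }
  hwi low () const { return this->storage ()[0]; }
  unsigned words () const { return this->len (); }
};
```

With this, toy_int<fixed_storage<2> > always reports two words while toy_int<var_storage<8> > reports only the words actually in use, which is the distinction between the double-int-like and the variable-size models described above.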
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
Yes, I agree that you win the challenge that it can be done. What you have always failed to address is why anyone would want to do this, or how this would at all be desirable. But I completely agree that from a purely abstract point of view you can add a storage model.

Now here is why we REALLY do not want to go down this road:

1) The following comment from your earlier mail is completely wrong

+#ifdef NEW_REP_FOR_INT_CST
+ /* This is the code once the tree level is converted. */
+ wide_int result;
+ int i;
+
+ tree type = TREE_TYPE (tcst);
+
+ result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type));
+ result.precision = TYPE_PRECISION (type);
+ result.len = TREE_INT_CST_LEN (tcst);
+ for (i = 0; i < result.len; i++)
+result.val[i] = TREE_INT_CST_ELT (tcst, i);
+
+ return result;
+#else

this also shows the main reason I was asking for storage abstraction. The initialization from tree is way too expensive.

In almost all cases, constants will fit in a single HWI. Thus, the only thing that you are copying is the length and a single HWI. So you are dragging in a lot of machinery just to save these two copies? Certainly there has to be more to it than that.

2) You present this as if the implementor actually should care about the implementation and you give 3 alternatives: the double_int, the current one, and HWI. We have tried to make it so that the client should not care. Certainly in my experience here, I have not found a place to care.

In my opinion double_int needs to go away. That is the main thrust of my patches. There is no place in a compiler for an abi that depends on constants fitting into two words whose size is defined by the host.

This is not a beauty contest argument: public ports are beginning to use modes that are larger than two x86-64 HWIs, and i have a private port that has such modes. It is my experience that any pass that uses this interface has one of three behaviors: it silently gets the wrong answer, it ICEs, or it fails to do the transformation. If we leave double_int as an available option, then any use of it potentially will have one of these three behaviors. And so one of my strong objections to this direction is that i do not want to fight this kind of bug for the rest of my life. Having a single storage model that just always works is in my opinion a highly desirable option. What you have never answered in a concrete manner is, if we decide to provide this generality, what it would be used for. There is no place in a portable compiler where the right answer for every target is two HOST wide integers.

However, i will admit that the HWI option has some merits. We try to address this in our implementation by dividing what is done inline in wide-int.h to the cases that fit in an HWI and then only drop into the heavy code in wide-int.c if the mode is larger (which it rarely will be). However, a case could be made that for certain kinds of things like string lengths and such, we could use another interface or, as you argue, a different storage model with the same interface. I just do not see that the cost of the conversion code is really going to show up on anyone's radar.

3) your trick will work at the tree level, but not at the rtl level. The wide-int code cannot share storage with the CONST_INTs. We tried this, and there are a million bugs that would have to be fixed to make it work. It could have worked if CONST_INTs had carried a mode around, but since they do not, you end up with the same CONST_INT sharing the rep for several different types and that just did not work unless you are willing to do substantial cleanups.
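The split described in point 2 (an inline fast path for values that fit in one HWI, with everything else handled out of line) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in wide-int.h or wide-int.c: toy_wide_int, add, add_large and sext are hypothetical names, a HWI is assumed to be 64 bits, both operands are assumed to share the same precision and length, and the slow path assumes a precision that is a multiple of 64.

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t hwi;               // stand-in for HOST_WIDE_INT (assumption)
const unsigned HWI_BITS = 64;
const unsigned MAX_WORDS = 4;

struct toy_wide_int
{
  hwi val[MAX_WORDS];
  unsigned len;        // number of words in use
  unsigned precision;  // precision in bits
};

// Sign-extend the low PRECISION bits of X (PRECISION >= 1).
inline hwi sext (uint64_t x, unsigned precision)
{
  if (precision >= HWI_BITS)
    return (hwi) x;
  uint64_t mask = (uint64_t (1) << precision) - 1;
  uint64_t sign = uint64_t (1) << (precision - 1);
  x &= mask;
  return (hwi) ((x ^ sign) - sign);
}

// "Heavy" path: word-by-word addition with carry.  In a real split
// this would live out of line in a .c file.
toy_wide_int add_large (const toy_wide_int &a, const toy_wide_int &b)
{
  toy_wide_int r;
  r.precision = a.precision;
  r.len = a.len;
  for (unsigned i = 0; i < MAX_WORDS; ++i)
    r.val[i] = 0;
  uint64_t carry = 0;
  for (unsigned i = 0; i < a.len; ++i)
    {
      uint64_t x = a.val[i], y = b.val[i];
      uint64_t sum = x + y;
      uint64_t c1 = sum < x;          // carry out of x + y
      uint64_t sum2 = sum + carry;
      uint64_t c2 = sum2 < sum;       // carry out of adding the carry in
      r.val[i] = (hwi) sum2;
      carry = c1 | c2;
    }
  return r;
}

// Fast path: one native addition when the precision fits in a HWI.
inline toy_wide_int add (const toy_wide_int &a, const toy_wide_int &b)
{
  if (a.precision <= HWI_BITS)
    {
      toy_wide_int r;
      for (unsigned i = 0; i < MAX_WORDS; ++i)
        r.val[i] = 0;
      r.len = 1;
      r.precision = a.precision;
      r.val[0] = sext ((uint64_t) a.val[0] + (uint64_t) b.val[0],
                       a.precision);
      return r;
    }
  return add_large (a, b);
}
```

For a 32-bit precision, adding 1 to 0x7fffffff stays on the fast path and wraps to the most negative 32-bit value; only a 128-bit operand pair reaches add_large.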
On 04/02/2013 11:04 AM, Richard Biener wrote:

On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck zad...@naturalbridge.com wrote:

This patch contains a large number of the changes requested by Richi. It does not contain any of the changes that he requested to abstract the storage layer. That suggestion appears to be quite unworkable.

I of course took this claim as a challenge ... with the following result. It is of course quite workable ;)

The attached patch implements the core wide-int class and three storage models (fixed size for things like plain HWI and double-int, variable size similar to how your wide-int works and an adaptor for the double-int as contained in trees). With that you can now do

HOST_WIDE_INT
wi_test (tree x)
{
  // template argument deduction doesn't do the magic we want it to do
  // to make this kind of implicit conversions work
  // overload resolution considers this kind of conversions so we
  // need some magic that combines both ... but seeding the overload
  // set with some instantiations doesn't seem to be possible :/
  // wide_int<> w = x + 1;
  wide_int<> w;
  w += x;
  w += 1;
  // template argument deduction doesn't deduce the return value type,
  // not considering the template default argument
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck zad...@naturalbridge.com wrote:

This patch contains a large number of the changes requested by Richi. It does not contain any of the changes that he requested to abstract the storage layer. That suggestion appears to be quite unworkable.

I believe that the wide-int class addresses the needs of gcc for performing math on any size integer regardless of the platform that hosts the compiler. The interface is admittedly large, but it is large for a reason: these are the operations that are commonly performed by the client optimizations in the compiler.

I would like to get this patch preapproved for the next stage 1.

Please clean from dead code like

+// using wide_int::;

and

+#ifdef DEBUG_WIDE_INT
+ if (dump_file)
+debug_wh ("wide_int::from_shwi %s " HOST_WIDE_INT_PRINT_HEX ")\n",
+ result, op0);
+#endif

and

+#ifdef NEW_REP_FOR_INT_CST
+ /* This is the code once the tree level is converted. */
+ wide_int result;
+ int i;
+
+ tree type = TREE_TYPE (tcst);
+
+ result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type));
+ result.precision = TYPE_PRECISION (type);
+ result.len = TREE_INT_CST_LEN (tcst);
+ for (i = 0; i < result.len; i++)
+result.val[i] = TREE_INT_CST_ELT (tcst, i);
+
+ return result;
+#else

this also shows the main reason I was asking for storage abstraction. The initialization from tree is way too expensive.

+/* Convert a integer cst into a wide int expanded to BITSIZE and
+ PRECISION. This call is used by tree passes like vrp that expect
+ that the math is done in an infinite precision style. BITSIZE and
+ PRECISION are generally determined to be twice the largest type
+ seen in the function. */
+
+wide_int
+wide_int::from_tree_as_infinite_precision (const_tree tcst,
+ unsigned int bitsize,
+ unsigned int precision)
+{

I know you have converted everything, but to make this patch reviewable I'd like you to strip the initial wide_int down to a bare minimum.

Only then will people have a reasonable chance to play with interface changes (such as providing a storage abstraction).

+/* Check the upper HOST_WIDE_INTs of src to see if the length can be
+ shortened. An upper HOST_WIDE_INT is unnecessary if it is all ones
+ or zeros and the top bit of the next lower word matches.
+
+ This function may change the representation of THIS, but does not
+ change the value that THIS represents. It does not sign extend in
+ the case that the size of the mode is less than
+ HOST_BITS_PER_WIDE_INT. */
+
+void
+wide_int::canonize ()

this shouldn't be necessary - it's an optimization - and due to value semantics (yes - I know you have a weird mix of value semantics and modify-in-place in wide_int) the new length should be computed transparently when creating a new value.

Well. Leaving wide-int.c for now.

+class wide_int {
+ /* Internal representation. */
+
+ /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target. The full multiplication is use by tree-vrp. tree-vpn
+ currently does a 2x largest mode by 2x largest mode yielding a 4x
+ largest mode result. If operations are added that require larger
+ buffers, then VAL needs to be changed. */
+ HOST_WIDE_INT val[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

as you cover partial int modes in MAX_BITSIZE_MODE_ANY_INT the above may come out too short. Please properly round up.

+ unsigned short len;
+ unsigned int bitsize;
+ unsigned int precision;

I see we didn't get away with this mix of bitsize and precision. I'm probably going to try revisit the past discussions - but can you point me to a single place in the RTL conversion where they make a difference? Bits beyond precision are either undefined or properly zero-/sign-extended. Implicit extension beyond len val members should then provide in valid bits up to bitsize (if anyone cares).

That's how double-ints work on tree INTEGER_CSTs which only care for precision, even with partial integer mode types (ok, I never came along one of these beasts - testcase / target?).

[abstraction possibility - have both wide_ints with actual mode and wide_ints with arbitrary bitsize/precision]

+ enum ShiftOp {
+NONE,
+/* There are two uses for the wide-int shifting functions. The
+ first use is as an emulation of the target hardware. The
+ second use is as service routines for other optimizations. The
+ first case needs to be identified by passing TRUNC as the value
+ of ShiftOp so that shift amount is properly handled according to the
+ SHIFT_COUNT_TRUNCATED flag. For the second case, the shift
+ amount is always truncated by the bytesize of the mode of
+ THIS. */
+TRUNC
+ };

I think I have expressed my opinion on this. (and SHIFT_COUNT_TRUNCATED
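For readers following the canonize () discussion, the rule stated in the quoted comment (an upper word is unnecessary if it is all ones or all zeros and the top bit of the next lower word matches) can be sketched as a pure function that computes the shortened length. This is a hypothetical helper written for illustration, with int64_t standing in for HOST_WIDE_INT; it is not the function from the patch and it ignores the precision handling the comment mentions.

```cpp
#include <cassert>
#include <cstdint>

// Return the shortest length that still represents the same signed
// value: drop an upper word while it is a pure sign extension of the
// word below it.  At least one word is always kept.
inline unsigned canonical_len (const int64_t *val, unsigned len)
{
  while (len > 1)
    {
      int64_t top = val[len - 1];
      bool below_negative = val[len - 2] < 0;
      if ((top == 0 && !below_negative) || (top == -1 && below_negative))
        --len;
      else
        break;
    }
  return len;
}
```

Note that {-1, 0} keeps both words: the zero upper word is what distinguishes the large positive value from -1, which is the same "extra word with zeros" point made about signed encodings elsewhere in the thread.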
Re: patch to fix constant math - 4th patch - the wide-int class.
Richard Biener richard.guent...@gmail.com writes:
On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
On 10/25/2012 06:42 AM, Richard Biener wrote:
On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote:
On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote:
On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
On 10/23/2012 10:12 AM, Richard Biener wrote:

+ HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host.

I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits.

And mode bitsizes are always power-of-two? I suppose so.

Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) should do it.

I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. The goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers.

What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter …

double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_.

Richard.

i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares.

the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
  double_int result;
  result.low = low & ~b.low;
  result.high = high & ~b.high;
  return result;
}

are always going to be faster than conditionally executing only one operation (but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour of separate and() and not() operations, but why not provide a fused one if there are clients who'll make use of it? I think Kenny's API is just taking that to its logical conclusion. There doesn't seem to be anything sacrosanct about the current choice of what's fused and what isn't.

The speed problem we had using trees
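The rounding concern raised at the top of this quoted exchange is plain ceiling division. A small sketch, with hypothetical helper names and made-up bit sizes (a 20-bit or 72-bit partial mode is an assumption chosen only to show the effect, not a claim about any real port):

```cpp
#include <cassert>

// Number of words obtained by truncating division, as in the
// declaration under review.
constexpr unsigned words_truncating (unsigned bits, unsigned bits_per_word)
{
  return bits / bits_per_word;
}

// Number of words after rounding up, matching the suggested
// (bits + bits_per_word - 1) / bits_per_word form.
constexpr unsigned words_rounded_up (unsigned bits, unsigned bits_per_word)
{
  return (bits + bits_per_word - 1) / bits_per_word;
}
```

When the doubled bit size is an exact multiple of the word size the two agree; when it is not, the truncating form yields an array that is one word too short (or zero-sized).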
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote:
Richard Biener richard.guent...@gmail.com writes:
On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
On 10/25/2012 06:42 AM, Richard Biener wrote:
On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote:
On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote:
On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
On 10/23/2012 10:12 AM, Richard Biener wrote:

+ HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host.

I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits.

And mode bitsizes are always power-of-two? I suppose so.

Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) should do it.

I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. The goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter …

double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_.

Richard.

i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively.

I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
  double_int result;
  result.low = low & ~b.low;
  result.high = high & ~b.high;
  return result;
}

are always going to be faster than conditionally executing only one operation (but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour of separate and() and not() operations, but why not provide a fused one if there are clients who'll make use of it?

I was more concerned about fused operations that use precision or bitsize as input. That is for example

+ bool
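The "fused versus separate operations" point about and_not can be illustrated with a toy two-word type. This is a sketch only: toy_double_int, bit_and and bit_not are hypothetical names, and both halves are modelled as uint64_t for simplicity, which is an assumption rather than the real double_int layout.

```cpp
#include <cassert>
#include <cstdint>

// Toy two-word integer, loosely modelled on the double_int snippet
// quoted above.
struct toy_double_int
{
  uint64_t low;
  uint64_t high;

  // Separate operations.
  toy_double_int bit_not () const
  {
    toy_double_int r;
    r.low = ~low;
    r.high = ~high;
    return r;
  }

  toy_double_int bit_and (toy_double_int b) const
  {
    toy_double_int r;
    r.low = low & b.low;
    r.high = high & b.high;
    return r;
  }

  // Fused operation: one pass over both words, no temporary.
  toy_double_int and_not (toy_double_int b) const
  {
    toy_double_int r;
    r.low = low & ~b.low;
    r.high = high & ~b.high;
    return r;
  }
};
```

a.and_not (b) and a.bit_and (b.bit_not ()) give the same result; the fused form just avoids building the intermediate value, which is the trade-off being debated for the larger wide-int interface.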
Re: patch to fix constant math - 4th patch - the wide-int class.
Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. 
Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Reactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int is wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never does uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. 
I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not effect the performance in any negative way and it fact is more efficient since common things that require more than one operation in double in are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low ~b.low; result.high = high ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour of separate and() and not() operations, but why not provide a fused one if there are clients who'll make use of it? I was more concerned about fused operations that use
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. 
Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Reactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int is wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never does uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. 
I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not effect the performance in any negative way and it fact is more efficient since common things that require more than one operation in double in are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low ~b.low; result.high = high ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour of separate and() and not() operations, but why not provide a fused one if
Re: patch to fix constant math - 4th patch - the wide-int class.
Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like (2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. The goal of this patch is to get the compiler to do the constant math the way that the target does it.
Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine.
almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour of separate and()
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 31, 2012 at 1:22 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like (2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int.
The goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision.
the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were
Re: patch to fix constant math - 4th patch - the wide-int class.
Richard Biener richard.guent...@gmail.com writes:

But that means that wide_int has to model a P-bit operation as a normal len*HOST_WIDE_INT operation and then fix up the result after the fact, which seems unnecessarily convoluted.

It does that right now. The operations are carried out in a loop over len HOST_WIDE_INT parts, the last HWI is then special-treated to account for precision/size. (yes, 'len' is also used as optimization - the fact that len ends up being mutable is another thing I dislike about wide-int. If wide-ints are cheap then all ops should be non-mutating (at least to 'len')).

But the point of having a mutating len is that things like zero and -1 are common even for OImode values. So if you're doing something potentially expensive like OImode multiplication, why do it to the number of HOST_WIDE_INTs needed for an OImode value when the value we're processing has only one significant HOST_WIDE_INT?

I still don't see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision X*HOST_WIDE_INT operation for any X) has any special meaning.

Well, the same reason as a HOST_WIDE_INT variable has a meaning. We use it to constrain what we (efficiently) want to work on. For example CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when doing bit-constant-propagation in loops (for TImode integers on a x86_64 host).

But what about targets with modes wider than TImode? Would double_int still be appropriate then? If not, why does CCP have to use a templated type with a fixed number of HWIs (and all arithmetic done to a fixed number of HWIs) rather than one that can adapt to the runtime values, like wide_int can?

Oh, and I don't necessarily see a use of double_int in its current form but for an integer representation on the host that is efficient to manipulate integer constants of a target dependent size. For example the target detail that we have partial integer modes with bitsize > precision and that the bits > precision apparently have a meaning when looking at the bit-representation of a constant should not be part of the base class of wide-int (I doubt it belongs to wide-int at all, but I guess you know more about the reason we track bitsize in addition to precision - I think it's abstraction at the wrong level, the tree level does fine without knowing about bitsize).

TBH I'm uneasy about the bitsize thing too. I think bitsize is only tracked for shift truncation, and if so, I agree it makes sense to do that separately.

But anyway, this whole discussion seems to have reached a stalemate. Or I suppose a de-facto rejection, since you're the only person in a position to approve the thing :-)

Richard
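The argument for a mutable len can be illustrated with a minimal sketch. It assumes a least-significant-first array of 64-bit blocks standing in for HOST_WIDE_INT, with any dropped upper blocks implied by sign extension of the last kept block. The names `hwi` and `canonical_len` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for HOST_WIDE_INT; assumed to be 64 bits here.
using hwi = std::int64_t;

// Return how many blocks (least significant first) are needed to
// represent the value if every dropped upper block is implied by sign
// extension of the highest block that is kept.
std::size_t canonical_len(const std::vector<hwi> &val)
{
  std::size_t len = val.size();
  while (len > 1)
    {
      hwi top = val[len - 1];
      hwi below = val[len - 2];
      // The top block is redundant only if it equals the sign fill
      // (all zeros or all ones) of the block directly below it.
      hwi sign_fill = below < 0 ? -1 : 0;
      if (top != sign_fill)
        break;
      --len;
    }
  return len;
}
```

With four blocks standing in for an OImode value, zero and -1 both reduce to a single significant block, so a multiplication loop bounded by this length would touch one block instead of four.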
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/31/2012 08:11 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like (2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. The goal of this patch is to get the compiler to do the constant math the way that the target does it.
Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine.
almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour of separate and() and not() operations, but why not provide a fused one if there are clients who'll make use of it? I was more
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 31, 2012 at 2:30 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes:

But that means that wide_int has to model a P-bit operation as a normal len*HOST_WIDE_INT operation and then fix up the result after the fact, which seems unnecessarily convoluted.

It does that right now. The operations are carried out in a loop over len HOST_WIDE_INT parts, the last HWI is then special-treated to account for precision/size. (yes, 'len' is also used as optimization - the fact that len ends up being mutable is another thing I dislike about wide-int. If wide-ints are cheap then all ops should be non-mutating (at least to 'len')).

But the point of having a mutating len is that things like zero and -1 are common even for OImode values. So if you're doing something potentially expensive like OImode multiplication, why do it to the number of HOST_WIDE_INTs needed for an OImode value when the value we're processing has only one significant HOST_WIDE_INT?

I don't propose doing that. I propose that no wide-int member function may _change_ its len (to something larger). Only that way you can avoid allocating wasted space for zero and -1. That way also the artificial limit on 2 * largest-int-mode-hwis goes.

I still don't see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision X*HOST_WIDE_INT operation for any X) has any special meaning.

Well, the same reason as a HOST_WIDE_INT variable has a meaning. We use it to constrain what we (efficiently) want to work on. For example CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when doing bit-constant-propagation in loops (for TImode integers on a x86_64 host).

But what about targets with modes wider than TImode? Would double_int still be appropriate then? If not, why does CCP have to use a templated type with a fixed number of HWIs (and all arithmetic done to a fixed number of HWIs) rather than one that can adapt to the runtime values, like wide_int can?

Because nobody cares about accurate bit-tracking for modes larger than TImode. And because no convenient abstraction was available ;)

Oh, and I don't necessarily see a use of double_int in its current form but for an integer representation on the host that is efficient to manipulate integer constants of a target dependent size. For example the target detail that we have partial integer modes with bitsize > precision and that the bits > precision apparently have a meaning when looking at the bit-representation of a constant should not be part of the base class of wide-int (I doubt it belongs to wide-int at all, but I guess you know more about the reason we track bitsize in addition to precision - I think it's abstraction at the wrong level, the tree level does fine without knowing about bitsize).

TBH I'm uneasy about the bitsize thing too. I think bitsize is only tracked for shift truncation, and if so, I agree it makes sense to do that separately.

So, can we please remove all traces of bitsize from wide-int then?

But anyway, this whole discussion seems to have reached a stalemate. Or I suppose a de-facto rejection, since you're the only person in a position to approve the thing :-)

There are many (silent) people that are able to approve the thing. But the point is I have too many issues with the current patch that I'm unable to point at a specific thing I want Kenny to change after which the patch would be fine. So I rely on some guesswork from Kenny, given my advice (leaner API, fewer fused ops, get rid of bitsize, think of abstracting the core HWI[len] operation, no tree or RTL dependencies in the wide-int API), to produce an updated variant. Which of course takes time, which of course crosses my vacation, which in the end means it isn't going to make 4.8 (I _do_ like the idea of not having a dependence on host properties for integer constant representation).

Btw, a good hint at what a minimal wide-int API would look like is if you _just_ replace double-int users with it. Then you obviously have to implement only the double-int interface and conversion from/to double-int.

Richard.

Richard
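One way to picture the "core HWI[len] operation" and the templated variant discussed in this thread is the sketch below. It is only an illustration under stated assumptions: `hwi`, `bits_per_hwi`, `hwis_for_bits` and `fixed_wide_int` are hypothetical names, a 64-bit block is assumed for HOST_WIDE_INT, and none of this is the interface from the patch. It shows the rounded-up block count suggested earlier in the thread and a fused and_not over a compile-time number of blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using hwi = std::int64_t;          // assumed stand-in for HOST_WIDE_INT
constexpr int bits_per_hwi = 64;   // assumed HOST_BITS_PER_WIDE_INT

// Number of blocks needed to hold `bits` bits, rounding up so that a
// bit count that is not a multiple of the block size is not truncated.
constexpr int hwis_for_bits(int bits)
{
  return (bits + bits_per_hwi - 1) / bits_per_hwi;
}

// Hypothetical fixed-size integer templated on the number of blocks.
template <std::size_t N>
struct fixed_wide_int
{
  hwi val[N];

  // Fused "a & ~b", applied block by block.
  fixed_wide_int and_not(const fixed_wide_int &b) const
  {
    fixed_wide_int result;
    for (std::size_t i = 0; i < N; ++i)
      result.val[i] = val[i] & ~b.val[i];
    return result;
  }
};

// The two-block instantiation plays the role of the proposed typedef.
using double_int_like = fixed_wide_int<2>;
```

Because N is known at compile time, the compiler may unroll the loop in and_not, which is the kind of cheap inline operation being claimed for double_int. Plain truncating division would give 80 / 64 == 1 block for an 80-bit requirement, whereas the rounded-up form gives 2.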
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 31, 2012 at 2:54 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/31/2012 08:11 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like (2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int.
The goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision.
the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/31/2012 10:05 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 2:54 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/31/2012 08:11 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like (2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int.
The goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision.
the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in favour
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 31, 2012 at 3:18 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/31/2012 10:05 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 2:54 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/31/2012 08:11 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. 
This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Reactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int is wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. 
the thing about wide int is that, since it does math to the precision's size, it almost never uses synthetic operations, since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it, and if you do, who cares. the number of calls does not affect the performance in any negative way, and in fact it is more efficient, since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK,
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/31/2012 09:54 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 2:30 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: But that means that wide_int has to model a P-bit operation as a normal len*HOST_WIDE_INT operation and then fix up the result after the fact, which seems unnecessarily convoluted. It does that right now. The operations are carried out in a loop over len HOST_WIDE_INT parts, the last HWI is then special-treated to account for precision/size. (yes, 'len' is also used as optimization - the fact that len ends up being mutable is another thing I dislike about wide-int. If wide-ints are cheap then all ops should be non-mutating (at least to 'len')). But the point of having a mutating len is that things like zero and -1 are common even for OImode values. So if you're doing someting potentially expensive like OImode multiplication, why do it to the number of HOST_WIDE_INTs needed for an OImode value when the value we're processing has only one significant HOST_WIDE_INT? I don't propose doing that. I propose that no wide-int member function may _change_ it's len (to something larger). Only that way you can avoid allocating wasted space for zero and -1. That way also the artificial limit on 2 * largest-int-mode-hwis goes. it is now 4x not 2x to accomodate the extra bit in tree-vrp. remember that the space burden is minimal.wide-ints are not persistent and there are never more than a handful at a time. I still don't see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision X*HOST_WIDE_INT operation for any X) has any special meaning. Well, the same reason as a HOST_WIDE_INT variable has a meaning. We use it to constrain what we (efficiently) want to work on. For example CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when doing bit-constant-propagation in loops (for TImode integers on a x86_64 host). But what about targets with modes wider than TImode? 
Would double_int still be appropriate then? If not, why does CCP have to use a templated type with a fixed number of HWIs (and all arithmetic done to a fixed number of HWIs) rather than one that can adapt to the runtime values, like wide_int can? Because nobody cares about accurate bit-tracking for modes larger than TImode. And because no convenient abstraction was available ;) yes, but tree-vrp does not even work for timode. and there are not tests to scale it back when it does see ti-mode. I understand that these can be added, but they so far have not been. I would also point out that i was corrected on this point by (i believe) lawrence. He points out that tree-vrp is still important for converting signed to unsigned for larger modes. Oh, and I don't necessary see a use of double_int in its current form but for an integer representation on the host that is efficient to manipulate integer constants of a target dependent size. For example the target detail that we have partial integer modes with bitsize precision and that the bits precision appearantly have a meaning when looking at the bit-representation of a constant should not be part of the base class of wide-int (I doubt it belongs to wide-int at all, but I guess you know more about the reason we track bitsize in addition to precision - I think it's abstraction at the wrong level, the tree level does fine without knowing about bitsize). TBH I'm uneasy about the bitsize thing too. I think bitsize is only tracked for shift truncation, and if so, I agree it makes sense to do that separately. So, can we please remove all traces of bitsize from wide-int then? But anyway, this whole discussion seems to have reached a stalemate. Or I suppose a de-facto rejection, since you're the only person in a position to approve the thing :-) There are many (silent) people that are able to approve the thing. 
But the point is I have too many issues with the current patch that I'm unable to point at a specific thing I want Kenny to change after which the patch would be fine. So I rely on some guesswork from Kenny, given my advice ("leaner API", "less fused ops", "get rid of bitsize", "think of abstracting the core HWI[len] operation", "there should be no tree or RTL dependencies in the wide-int API"), to produce an updated variant. Which of course takes time, which of course crosses my vacation, which in the end means it isn't going to make 4.8 (I _do_ like the idea of not having a dependence on host properties for integer constant representation). Btw, a good hint at what a minimal wide-int API would look like is if you _just_ replace double-int users with it. Then you obviously have to implement only the double-int interface and conversion from/to double-int. Richard. Richard
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/31/2012 10:24 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 3:18 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/31/2012 10:05 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 2:54 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/31/2012 08:11 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. 
I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Reactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int is wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. 
the thing about wide int is that, since it does math to the precision's size, it almost never uses synthetic operations, since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it, and if you do, who cares. the number of calls does not affect the performance in any negative way, and in fact it is more efficient, since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/31/2012 08:44 AM, Richard Biener wrote: On Wed, Oct 31, 2012 at 1:22 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. 
This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Reactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int is wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. 
the thing about wide int is that, since it does math to the precision's size, it almost never uses synthetic operations, since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it, and if you do, who cares. the number of calls does not affect the performance in any negative way, and in fact it is more efficient, since common things that require more than one operation in double int are typically done in a single operation. Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). OK, this is really in reply to the 4.8 thing, but it felt more appropriate here. It's interesting that you gave this example, since before you were complaining about too many fused ops. Clearly this one could be removed in
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/31/2012 09:30 AM, Richard Sandiford wrote: Richard Biener richard.guent...@gmail.com writes: But that means that wide_int has to model a P-bit operation as a normal len*HOST_WIDE_INT operation and then fix up the result after the fact, which seems unnecessarily convoluted. It does that right now. The operations are carried out in a loop over len HOST_WIDE_INT parts; the last HWI is then special-treated to account for precision/size. (yes, 'len' is also used as an optimization - the fact that len ends up being mutable is another thing I dislike about wide-int. If wide-ints are cheap then all ops should be non-mutating (at least to 'len')). But the point of having a mutating len is that things like zero and -1 are common even for OImode values. So if you're doing something potentially expensive like OImode multiplication, why do it to the number of HOST_WIDE_INTs needed for an OImode value when the value we're processing has only one significant HOST_WIDE_INT? I think with a little thought i can add some special constructors and get rid of the mutating aspects of the interface. I still don't see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision X*HOST_WIDE_INT operation for any X) has any special meaning. Well, the same reason as a HOST_WIDE_INT variable has a meaning. We use it to constrain what we (efficiently) want to work on. For example CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when doing bit-constant-propagation in loops (for TImode integers on an x86_64 host). But what about targets with modes wider than TImode? Would double_int still be appropriate then? If not, why does CCP have to use a templated type with a fixed number of HWIs (and all arithmetic done to a fixed number of HWIs) rather than one that can adapt to the runtime values, like wide_int can?
Oh, and I don't necessarily see a use of double_int in its current form but for an integer representation on the host that is efficient to manipulate integer constants of a target dependent size. For example the target detail that we have partial integer modes with bitsize precision and that the bits precision apparently have a meaning when looking at the bit-representation of a constant should not be part of the base class of wide-int (I doubt it belongs to wide-int at all, but I guess you know more about the reason we track bitsize in addition to precision - I think it's abstraction at the wrong level, the tree level does fine without knowing about bitsize). TBH I'm uneasy about the bitsize thing too. I think bitsize is only tracked for shift truncation, and if so, I agree it makes sense to do that separately. But anyway, this whole discussion seems to have reached a stalemate. Or I suppose a de-facto rejection, since you're the only person in a position to approve the thing :-) Richard
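The point about zero and -1 needing only one significant HOST_WIDE_INT, even for OImode values, can be illustrated with a small sketch. This is not code from the patch: the function name `canonical_len` and the use of `int64_t` as a stand-in for HOST_WIDE_INT are assumptions made purely for illustration. It computes how many low-order words are needed when every dropped high word is just the sign extension of the word below it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the wide-int patch): given a little-endian
// array of signed 64-bit words holding a sign-extended value, return the
// smallest length that still represents the same value. A top word can be
// dropped when it equals the sign extension of the word below it.
inline unsigned canonical_len(const int64_t *val, unsigned len) {
  assert(len >= 1);
  while (len > 1) {
    int64_t top = val[len - 1];
    int64_t sign_ext_of_below = val[len - 2] < 0 ? -1 : 0;
    if (top != sign_ext_of_below)
      break;  // the top word carries real information
    --len;
  }
  return len;
}
```

Under these assumptions a four-word zero or a four-word -1 both reduce to a length of one, while a value such as 2^64 - 1 (low word all ones, higher words zero) still needs two words so that it is not mistaken for -1.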
Re: patch to fix constant math - 4th patch - the wide-int class.
On Oct 31, 2012, at 5:44 AM, Richard Biener richard.guent...@gmail.com wrote: the fact that len ends up being mutable is another thing I dislike about wide-int. We expose len for construction only, it is non-mutating. During construction, there is no previous value. If wide-ints are cheap then all ops should be non-mutating (at least to 'len')). It is. Construction modifies the object as construction must be defined as initializing the state of the data. Before construction, there is no data, so, we are constructing the data, not mutating the data. Surely you don't object to construction?
Re: patch to fix constant math - 4th patch - the wide-int class.
On Oct 31, 2012, at 6:54 AM, Richard Biener richard.guent...@gmail.com wrote: I propose that no wide-int member function may _change_ its len (to something larger). We never do that, so, we already do as you wish. We construct wide ints, and we have member functions to construct values. We need to construct values as some parts of the compiler want to create values. The construction of values can be removed when the rest of the compiler no longer wishes to construct values. LTO is an example of a client that wanted to construct a value. I'll let the LTO people chime in if they wish to no longer construct values.
Re: patch to fix constant math - 4th patch - the wide-int class.
On Oct 31, 2012, at 7:05 AM, Richard Biener richard.guent...@gmail.com wrote: You have an artificial limit on what 'len' can be. No. There is no limit, and nothing artificial. We take the maximum of the needs of the target, the maximum of the front-ends and the maximum of the mid-end and the back-end. We can drop a category, if that category no longer wishes to be our client. Any client is free to stop using wide-int, any time they want. For example, vrp could use gmp, if they wanted to, and the need to serve them drops. You have imagined the cost is high to do this; the reality is all long lived objects are small, and all short lived objects are so transitory that we are talking about maybe 5 live at a time. And you do not accommodate users that do not want to pay the storage penalty for that arbitrary upper limit choice. This is also wrong. First, there is no arbitrary upper limit. Second, all long lived objects are small. We accommodated them by having all long lived objects be small. The transitory objects are big, but there are only 5 of them alive at a time. That's all because 'len' may grow (mutate). This is also wrong.
Re: patch to fix constant math - 4th patch - the wide-int class.
On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it.I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. 
Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard.
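The rounding question raised at the start of this message (whether the size of `val` rounds properly when a partial int mode is not a multiple of the host word size) can be checked with plain integer arithmetic. The sketch below is illustrative only: `kHostBitsPerWideInt` and the two function names are hypothetical stand-ins, not GCC macros, and a 64-bit host word is assumed.

```cpp
#include <cassert>

// Assumed host word size in bits; a stand-in for HOST_BITS_PER_WIDE_INT.
const unsigned kHostBitsPerWideInt = 64;

// Element count as written in the patch: integer division truncates.
inline unsigned hwis_truncating(unsigned max_bitsize) {
  assert(max_bitsize > 0);
  return 2 * max_bitsize / kHostBitsPerWideInt;
}

// Element count with the suggested round-up: add (divisor - 1) before
// dividing, so any remainder bumps the result to the next whole word.
inline unsigned hwis_rounded_up(unsigned max_bitsize) {
  assert(max_bitsize > 0);
  return (2 * max_bitsize + kHostBitsPerWideInt - 1) / kHostBitsPerWideInt;
}
```

For a 128-bit maximum both forms give 4 words. For a hypothetical 40-bit partial mode, twice the size is 80 bits: the truncating form gives 1 word (64 bits, too small) while the rounded-up form gives 2.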
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it.I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. 
Reactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int is wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never does uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively.I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not effect the performance in any negative way and it fact is more efficient since common things that require more than one operation in double in are typically done in a single operation.
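The claim that "almost every call has a check to see if the operation can be done natively" can be pictured with a rough sketch. Nothing here is taken from the patch: `WideSketch`, `sext`, and `add` are hypothetical names, the storage is a fixed four-word array, and only addition is shown. When the precision fits in one host word the add is a single native operation followed by sign extension; otherwise it falls back to a word-by-word loop with carry.

```cpp
#include <cassert>
#include <cstdint>

const unsigned kHwiBits = 64;   // assumed host word size
const unsigned kMaxWords = 4;   // assumed upper bound for this sketch

struct WideSketch {
  uint64_t val[kMaxWords];  // little-endian words
  unsigned precision;       // width of the value in bits
};

// Sign-extend the low `bits` bits of x to a full 64-bit word.
inline uint64_t sext(uint64_t x, unsigned bits) {
  if (bits >= 64) return x;
  uint64_t mask = (uint64_t(1) << bits) - 1;
  uint64_t sign = uint64_t(1) << (bits - 1);
  return ((x & mask) ^ sign) - sign;
}

inline WideSketch add(const WideSketch &a, const WideSketch &b) {
  assert(a.precision == b.precision);
  assert(a.precision >= 1 && a.precision <= kMaxWords * kHwiBits);
  WideSketch r = {};
  r.precision = a.precision;
  if (a.precision <= kHwiBits) {
    // Fast path: one native add, then wrap to the precision.
    r.val[0] = sext(a.val[0] + b.val[0], a.precision);
    return r;
  }
  // Slow path: synthetic multi-word add with carry propagation.
  unsigned words = (a.precision + kHwiBits - 1) / kHwiBits;
  uint64_t carry = 0;
  for (unsigned i = 0; i < words; ++i) {
    uint64_t s = a.val[i] + b.val[i];
    uint64_t c1 = s < a.val[i];
    uint64_t t = s + carry;
    uint64_t c2 = t < s;
    r.val[i] = t;
    carry = c1 | c2;
  }
  unsigned top_bits = a.precision % kHwiBits;
  if (top_bits != 0)
    r.val[words - 1] = sext(r.val[words - 1], top_bits);
  return r;
}
```

Whether the branch costs more than an unconditional two-word operation is exactly what the thread disputes; the sketch only shows the shape of the check, not a measurement.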
Re: patch to fix constant math - 4th patch - the wide-int class.
On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/25/2012 06:42 AM, Richard Biener wrote: On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump mikest...@comcast.net wrote: On Oct 24, 2012, at 2:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. Actually, no, they are not. Partial int modes can have bit sizes that are not power of two, and, if there isn't an int mode that is bigger, we'd want to round up the partial int bit size. Something like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT should do it. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! 
Can't you see how I come to not liking this? Especially the latter … double_int is logically dead. Refactoring wide-int and double-int is a waste of time, as the time is better spent removing double-int from the compiler. All the necessary semantics and code of double-int _has_ been refactored into wide-int already. Changing wide-int in any way to vend anything to double-int is wrong, as once double-int is removed, then all the api changes to make double-int share from wide-int are wasted and must then be removed. The path forward is the complete removal of double-int; it is wrong, has been wrong and always will be wrong, nothing can change that. double_int, compared to wide_int, is fast and lean. I doubt we will get rid of it - you will make compile-time math a _lot_ slower. Just profile when you for example change get_inner_reference to use wide_ints. To be able to remove double_int in favor of wide_int requires _at least_ templating wide_int on 'len' and providing specializations for 1 and 2. It might be a non-issue for math that operates on trees or RTXen due to the allocation overhead we pay, but in recent years we transitioned important paths away from using tree math to using double_ints _for speed reasons_. Richard. i do not know why you believe this about the speed. double int always does synthetic math since you do everything at 128 bit precision. the thing about wide int, is that since it does math to the precision's size, it almost never uses synthetic operations since the sizes for almost every instance can be done using the native math on the machine. almost every call has a check to see if the operation can be done natively. I seriously doubt that you are going to do TI mode math much faster than i do it and if you do who cares. the number of calls does not affect the performance in any negative way and in fact is more efficient since common things that require more than one operation in double int are typically done in a single operation.
Simple double-int operations like inline double_int double_int::and_not (double_int b) const { double_int result; result.low = low & ~b.low; result.high = high & ~b.high; return result; } are always going to be faster than conditionally executing only one operation (but inside an offline function). Richard.
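The rounding question raised earlier in this message (whether `2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT` always yields enough array elements) can be checked with a little arithmetic. The sketch below is only an illustration: the function names and the 64-bit constant are assumptions standing in for the GCC macros, and the 40-bit mode is a made-up partial int mode, not one taken from any real port.

```cpp
#include <cassert>

// Hypothetical stand-in for HOST_BITS_PER_WIDE_INT on a 64-bit host.
static const unsigned host_bits_per_wide_int = 64;

// Element count as written in the patch: integer division truncates.
inline unsigned hwis_truncating(unsigned max_mode_bits) {
  return 2 * max_mode_bits / host_bits_per_wide_int;
}

// Element count using the rounded-up form suggested in the thread.
inline unsigned hwis_rounded_up(unsigned max_mode_bits) {
  return (2 * max_mode_bits + host_bits_per_wide_int - 1)
         / host_bits_per_wide_int;
}
```

For a 128-bit maximum mode both forms give 4 elements. For the made-up 40-bit mode, a full multiply needs 80 bits: the truncating form gives 1 element (64 bits, too small), while the rounded-up form gives 2.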
Re: patch to fix constant math - 4th patch - the wide-int class.
On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch implements the wide-int class.this is a more general version of the double-int class and is meant to be the eventual replacement for that class.The use of this class removes all dependencies of the host from the target compiler's integer math. I have made all of the changes i agreed to in the earlier emails. In particular, this class internally maintains a bitsize and precision but not a mode. The class now is neutral about modes and tree-types.the functions that take modes or tree-types are just convenience functions that translate the parameters into bitsize and precision and where ever there is a call that takes a mode, there is a corresponding call that takes a tree-type. All of the little changes that richi suggested have also been made. The buffer sizes is now twice the size needed by the largest integer mode. This gives enough room for tree-vrp to do full multiplies on any type that the target supports. Tested on x86-64. This patch depends on the first three patches. I am still waiting on final approval on the hwint.h patch. Ok to commit? diff --git a/gcc/wide-int.h b/gcc/wide-int.h new file mode 100644 index 000..efd2c01 --- /dev/null +++ b/gcc/wide-int.h ... +#ifndef GENERATOR_FILE The whole file is guarded with that ... why? That is bound to be fragile once use of wide-int spreads? How do generator programs end up including this file if they don't need it at all? This is so that wide-int can be included at the level of the generators. There some stuff that needs to see this type that is done during the build build phase that cannot see the types that are included in wide-int.h. 
+#include tree.h +#include hwint.h +#include options.h +#include tm.h +#include insn-modes.h +#include machmode.h +#include double-int.h +#include gmp.h +#include insn-modes.h + That's a lot of tree and rtl dependencies. double-int.h avoids these by placing conversion routines in different headers or by only resorting to types in coretypes.h. Please try to reduce the above to a minimum. + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. And mode bitsizes are always power-of-two? I suppose so. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it.I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. Well, what I don't really like is that we now have two implementations of functions that perform integer math on two-HWI sized integers. What I also don't like too much is that we have two different interfaces to operate on them! Can't you see how I come to not liking this? Especially the latter ... in double-int.h and replace its implementation with a specialization of wide_int. Due to a number of divergences (double_int is not a subset of wide_int) that doesn't seem easily possible (one reason is the ShiftOp and related enums you use). Of course wide_int is not a template either. 
For the hypothetical embedded target above we'd end up using wide_int1, an even more trivial specialization. I realize again this wide-int is not what your wide-int is (because you add a precision member). Still factoring out the common parts of wide-int and double-int into a wide_int_raw template should be possible. +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is used by tree-vrp. If + operations are added that require larger buffers, then VAL needs + to be changed. */ + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned short len; + unsigned int bitsize; + unsigned int precision; The len, bitsize and precision members need documentation. At least one sounds redundant. + public: + enum ShiftOp { +NONE, NONE is never a descriptive name ... I suppose
Re: patch to fix constant math - 4th patch - the wide-int class.
On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch implements the wide-int class.this is a more general version of the double-int class and is meant to be the eventual replacement for that class.The use of this class removes all dependencies of the host from the target compiler's integer math. I have made all of the changes i agreed to in the earlier emails. In particular, this class internally maintains a bitsize and precision but not a mode. The class now is neutral about modes and tree-types.the functions that take modes or tree-types are just convenience functions that translate the parameters into bitsize and precision and where ever there is a call that takes a mode, there is a corresponding call that takes a tree-type. All of the little changes that richi suggested have also been made. The buffer sizes is now twice the size needed by the largest integer mode. This gives enough room for tree-vrp to do full multiplies on any type that the target supports. Tested on x86-64. This patch depends on the first three patches. I am still waiting on final approval on the hwint.h patch. Ok to commit? diff --git a/gcc/wide-int.h b/gcc/wide-int.h new file mode 100644 index 000..efd2c01 --- /dev/null +++ b/gcc/wide-int.h ... +#ifndef GENERATOR_FILE The whole file is guarded with that ... why? That is bound to be fragile once use of wide-int spreads? How do generator programs end up including this file if they don't need it at all? +#include tree.h +#include hwint.h +#include options.h +#include tm.h +#include insn-modes.h +#include machmode.h +#include double-int.h +#include gmp.h +#include insn-modes.h + That's a lot of tree and rtl dependencies. double-int.h avoids these by placing conversion routines in different headers or by only resorting to types in coretypes.h. Please try to reduce the above to a minimum. + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? 
Consider a port with max byte mode size 4 on a 64bit host. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; in double-int.h and replace its implementation with a specialization of wide_int. Due to a number of divergences (double_int is not a subset of wide_int) that doesn't seem easily possible (one reason is the ShiftOp and related enums you use). Of course wide_int is not a template either. For the hypotetical embedded target above we'd end up using wide_int1, a even more trivial specialization. I realize again this wide-int is not what your wide-int is (because you add a precision member). Still factoring out the commons of wide-int and double-int into a wide_int_raw template should be possible. +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is use by tree-vrp. If + operations are added that require larger buffers, then VAL needs + to be changed. */ + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned short len; + unsigned int bitsize; + unsigned int precision; The len, bitsize and precision members need documentation. At least one sounds redundant. + public: + enum ShiftOp { +NONE, NONE is never a descriptive name ... I suppose this is for arithmetic vs. logical shifts? +/* There are two uses for the wide-int shifting functions. The + first use is as an emulation of the target hardware. The + second use is as service routines for other optimizations. The + first case needs to be identified by passing TRUNC as the value + of ShiftOp so that shift amount is properly handled according to the + SHIFT_COUNT_TRUNCATED flag. 
For the second case, the shift + amount is always truncated by the bytesize of the mode of + THIS. */ +TRUNC ah, no, it's for SHIFT_COUNT_TRUNCATED. mode of THIS? Now it's precision I suppose. That said, handling SHIFT_COUNT_TRUNCATED in wide-int sounds over-engineered, the caller should be responsible of applying SHIFT_COUNT_TRUNCATED when needed. + enum SignOp { +/* Many of the math functions produce different results depending + on if they are SIGNED or UNSIGNED. In general, there are two + different functions, whose names are prefixed with an 'S and + or an 'U'. However, for some math functions there is also a + routine that does not have the prefix and takes an SignOp + parameter of SIGNED or UNSIGNED. */ +SIGNED, +UNSIGNED + }; double-int and _all_ of the rest of the middle-end uses a
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/2012 10:12 AM, Richard Biener wrote: On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch implements the wide-int class.this is a more general version of the double-int class and is meant to be the eventual replacement for that class.The use of this class removes all dependencies of the host from the target compiler's integer math. I have made all of the changes i agreed to in the earlier emails. In particular, this class internally maintains a bitsize and precision but not a mode. The class now is neutral about modes and tree-types.the functions that take modes or tree-types are just convenience functions that translate the parameters into bitsize and precision and where ever there is a call that takes a mode, there is a corresponding call that takes a tree-type. All of the little changes that richi suggested have also been made. The buffer sizes is now twice the size needed by the largest integer mode. This gives enough room for tree-vrp to do full multiplies on any type that the target supports. Tested on x86-64. This patch depends on the first three patches. I am still waiting on final approval on the hwint.h patch. Ok to commit? diff --git a/gcc/wide-int.h b/gcc/wide-int.h new file mode 100644 index 000..efd2c01 --- /dev/null +++ b/gcc/wide-int.h ... +#ifndef GENERATOR_FILE The whole file is guarded with that ... why? That is bound to be fragile once use of wide-int spreads? How do generator programs end up including this file if they don't need it at all? This is so that wide-int can be included at the level of the generators. There some stuff that needs to see this type that is done during the build build phase that cannot see the types that are included in wide-int.h. +#include tree.h +#include hwint.h +#include options.h +#include tm.h +#include insn-modes.h +#include machmode.h +#include double-int.h +#include gmp.h +#include insn-modes.h + That's a lot of tree and rtl dependencies. 
double-int.h avoids these by placing conversion routines in different headers or by only resorting to types in coretypes.h. Please try to reduce the above to a minimum. + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we already up to 128 bits. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int2 should be identical to double_int, thus we should be able to do typedef wide_int2 double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug in replacement for double int. This goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. in double-int.h and replace its implementation with a specialization of wide_int. Due to a number of divergences (double_int is not a subset of wide_int) that doesn't seem easily possible (one reason is the ShiftOp and related enums you use). Of course wide_int is not a template either. For the hypotetical embedded target above we'd end up using wide_int1, a even more trivial specialization. I realize again this wide-int is not what your wide-int is (because you add a precision member). Still factoring out the commons of wide-int and double-int into a wide_int_raw template should be possible. +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is use by tree-vrp. 
If + operations are added that require larger buffers, then VAL needs + to be changed. */ + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned short len; + unsigned int bitsize; + unsigned int precision; The len, bitsize and precision members need documentation. At least one sounds redundant. + public: + enum ShiftOp { +NONE, NONE is never a descriptive name ... I suppose this is for arithmetic vs. logical shifts? suggest something +/* There are two uses for the wide-int shifting functions. The + first use is as an emulation of the target hardware. The + second use is as service routines for other optimizations. The + first case needs to be identified by passing TRUNC as the value + of ShiftOp so that shift amount is properly handled according to the + SHIFT_COUNT_TRUNCATED flag. For the second case, the shift + amount is always truncated by the bytesize of
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/12, Richard Biener richard.guent...@gmail.com wrote:

I wonder if for the various ways to specify precision/len there is a nice C++ way of moving this detail out of wide-int. I can think only of one:

    struct WIntSpec {
      WIntSpec (unsigned int len, unsigned int precision);
      WIntSpec (const_tree);
      WIntSpec (enum machine_mode);
      unsigned int len;
      unsigned int precision;
    };

and then (sorry to pick one of the less useful functions):

    inline static wide_int zero (WIntSpec)

which you should be able to call like

    wide_int::zero (SImode)
    wide_int::zero (integer_type_node)

and (ugly)

    wide_int::zero (WIntSpec (32, 32))

with C++0x

    wide_int::zero ({32, 32})

should be possible? Or we keep the precision overload. At least providing the WIntSpec abstraction allows custom ways of specifying required bits to not pollute wide-int itself too much. Lawrence?

Yes, in C++11, wide_int::zero ({32, 32}) is possible using an implicit conversion to WIntSpec from an initializer_list. However, at present we are limited to C++03 to enable older compilers as boot compilers.

-- Lawrence Crowl
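To make the WIntSpec idea concrete, here is a minimal, self-contained sketch of how converting constructors let one `zero` function accept several ways of naming a size. Everything suffixed `_sketch` is hypothetical and stands in for the real GCC types; in particular, setting `len` equal to the precision for a mode is an assumption made only to mirror the `(32, 32)` example in the message. The brace form needs a C++11 compiler, where it selects the two-argument constructor.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's machine_mode; only two modes are modeled.
enum machine_mode_sketch { SImode_sketch, DImode_sketch };

struct WIntSpec {
  unsigned int len;
  unsigned int precision;

  // Explicit sizes, as in WIntSpec (32, 32).
  WIntSpec(unsigned int l, unsigned int p) : len(l), precision(p) {}

  // Implicit conversion from a mode (assumed mapping: SI = 32, DI = 64).
  WIntSpec(machine_mode_sketch m)
      : len(m == SImode_sketch ? 32 : 64),
        precision(m == SImode_sketch ? 32 : 64) {}
};

// Hypothetical value type that only records what zero() was asked for.
struct wide_int_sketch {
  unsigned int len;
  unsigned int precision;
  long long val;

  static wide_int_sketch zero(WIntSpec spec) {
    wide_int_sketch w;
    w.len = spec.len;
    w.precision = spec.precision;
    w.val = 0;
    return w;
  }
};
```

With this shape, `wide_int_sketch::zero (SImode_sketch)`, `wide_int_sketch::zero (WIntSpec (32, 32))` and, in C++11, `wide_int_sketch::zero ({32, 32})` all go through the same entry point, which is the point of keeping the size-specification variants out of the wide-int class itself.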
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + inline bool minus_one_p () const; + inline bool zero_p () const; + inline bool one_p () const; + inline bool neg_p () const; what's wrong with w == -1, w == 0, w == 1, etc.? I would love to do this and you seem to be somewhat knowledgeable of c++. But i cannot for the life of me figure out how to do it. Starting from the simple case, you write an operator ==. as global operator: bool operator == (wide_int w, int i); as member operator: bool wide_int::operator == (int i); In the simple case, bool operator == (wide_int w, int i) { switch (i) { case -1: return w.minus_one_p (); case 0: return w.zero_p (); case 1: return w.one_p (); default: unexpected } } say i have a TImode number, which must be represented in 4 ints on a 32 bit host (the same issue happens on 64 bit hosts, but the examples are simpler on 32 bit hosts) and i compare it to -1. The value that i am going to see as the argument of the function is going have the value 0x. but the value that i have internally is 128 bits. do i take this and 0 or sign extend it? What would you have done with w.minus_one_p ()? in particular if someone wants to compare a number to 0xdeadbeef i have no idea what to do. I tried defining two different functions, one that took a signed and one that took and unsigned number but then i wanted a cast in front of all the positive numbers. This is where it does get tricky. For signed arguments, you should sign extend. For unsigned arguments, you should not. At present, we need multiple overloads to avoid type ambiguities. 
bool operator == (wide_int w, long long int i); bool operator == (wide_int w, unsigned long long int i); inline bool operator == (wide_int w, long int i) { return w == (long long int) i; } inline bool operator == (wide_int w, unsigned long int i) { return w == (unsigned long long int) i; } inline bool operator == (wide_int w, int i) { return w == (long long int) i; } inline bool operator == (wide_int w, unsigned int i) { return w == (unsigned long long int) i; } (There is a proposal before the C++ committee to fix this problem.) Even so, there is room for potential bugs when wide_int does not carry around whether or not it is signed. The problem is that regardless of what the programmer thinks of the sign of the wide int, the comparison will use the sign of the int. If there is a way to do this, then i will do it, but it is going to have to work properly for things larger than a HOST_WIDE_INT. The long-term solution, IMHO, is to carry the sign information around in either the type or the class data. (I prefer type, but with a mechanism to carry it as data when needed.) Such comparisons would then require consistency in signedness between the wide int and the plain int. I know that double-int does some of this and it does not carry around a notion of signedness either. is this just code that has not been fully tested or is there a trick in c++ that i am missing? The double int class only provides == and !=, and only with other double ints. Otherwise, it has the same value query functions that you do above. In the case of double int, the goal was to simplify use of the existing semantics. If you are changing the semantics, consider incorporating sign explicitly. -- Lawrence Crowl
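The overload set above can be tried out on a toy type. The following is a rough sketch under stated assumptions, not the wide-int implementation: `wide_int_sketch` is a hypothetical fixed 128-bit value held in two 64-bit words, and the two "real" comparisons sign-extend or zero-extend the plain integer as described in the message. The remaining overloads only forward, so that every built-in integer type matches exactly one overload.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value: low word first, no stored signedness.
struct wide_int_sketch {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Signed argument: sign-extend into the upper word before comparing.
inline bool operator==(const wide_int_sketch &w, long long i) {
  std::uint64_t upper = i < 0 ? ~std::uint64_t(0) : std::uint64_t(0);
  return w.lo == static_cast<std::uint64_t>(i) && w.hi == upper;
}

// Unsigned argument: zero-extend, so the upper word must be zero.
inline bool operator==(const wide_int_sketch &w, unsigned long long i) {
  return w.lo == i && w.hi == 0;
}

// Forwarding overloads that avoid ambiguity for narrower types.
inline bool operator==(const wide_int_sketch &w, long i) {
  return w == static_cast<long long>(i);
}
inline bool operator==(const wide_int_sketch &w, unsigned long i) {
  return w == static_cast<unsigned long long>(i);
}
inline bool operator==(const wide_int_sketch &w, int i) {
  return w == static_cast<long long>(i);
}
inline bool operator==(const wide_int_sketch &w, unsigned int i) {
  return w == static_cast<unsigned long long>(i);
}
```

Note that the outcome depends only on the type of the plain integer: an all-ones 128-bit value compares equal to `-1` but not to `~0ULL`, which is exactly the hazard described above when the wide value carries no sign of its own.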
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/2012 02:38 PM, Lawrence Crowl wrote: On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + inline bool minus_one_p () const; + inline bool zero_p () const; + inline bool one_p () const; + inline bool neg_p () const; what's wrong with w == -1, w == 0, w == 1, etc.? I would love to do this and you seem to be somewhat knowledgeable of c++. But i cannot for the life of me figure out how to do it. Starting from the simple case, you write an operator ==. as global operator: bool operator == (wide_int w, int i); as member operator: bool wide_int::operator == (int i); In the simple case, bool operator == (wide_int w, int i) { switch (i) { case -1: return w.minus_one_p (); case 0: return w.zero_p (); case 1: return w.one_p (); default: unexpected } } no, this seems wrong.you do not want to write code that can only fail at runtime unless there is a damn good reason to do that. say i have a TImode number, which must be represented in 4 ints on a 32 bit host (the same issue happens on 64 bit hosts, but the examples are simpler on 32 bit hosts) and i compare it to -1. The value that i am going to see as the argument of the function is going have the value 0x. but the value that i have internally is 128 bits. do i take this and 0 or sign extend it? What would you have done with w.minus_one_p ()? the code knows that -1 is a negative number and it knows the precision of w.That is enough information. So it logically builds a -1 that has enough bits to do the conversion. in particular if someone wants to compare a number to 0xdeadbeef i have no idea what to do. I tried defining two different functions, one that took a signed and one that took and unsigned number but then i wanted a cast in front of all the positive numbers. This is where it does get tricky. For signed arguments, you should sign extend. For unsigned arguments, you should not. At present, we need multiple overloads to avoid type ambiguities. 
bool operator == (wide_int w, long long int i); bool operator == (wide_int w, unsigned long long int i); inline bool operator == (wide_int w, long int i) { return w == (long long int) i; } inline bool operator == (wide_int w, unsigned long int i) { return w == (unsigned long long int) i; } inline bool operator == (wide_int w, int i) { return w == (long long int) i; } inline bool operator == (wide_int w, unsigned int i) { return w == (unsigned long long int) i; } (There is a proposal before the C++ committee to fix this problem.) Even so, there is room for potential bugs when wide_int does not carry around whether or not it is signed. The problem is that regardless of what the programmer thinks of the sign of the wide int, the comparison will use the sign of the int. when they do we can revisit this. but i looked at this and i said the potential bugs were not worth the effort. If there is a way to do this, then i will do it, but it is going to have to work properly for things larger than a HOST_WIDE_INT. The long-term solution, IMHO, is to carry the sign information around in either the type or the class data. (I prefer type, but with a mechanism to carry it as data when needed.) Such comparisons would then require consistency in signedness between the wide int and the plain int. carrying the sign information is a non-starter. The rtl level does not have it and the middle end violates it more often than not. My view was to design this having looked at all of the usage. I have basically converted the whole compiler before i released the abi. I am still getting out the errors and breaking it up in reviewable sized patches, but i knew very very well who my clients were before i wrote the abi. I know that double-int does some of this and it does not carry around a notion of signedness either. is this just code that has not been fully tested or is there a trick in c++ that i am missing? The double int class only provides == and !=, and only with other double ints.
Otherwise, it has the same value query functions that you do above. In the case of double int, the goal was to simplify use of the existing semantics. If you are changing the semantics, consider incorporating sign explicitly. i have, and it does not work.
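The answer given in this message to "what would you have done with w.minus_one_p ()?" is that the precision of w is enough to decide. A small sketch of that idea follows, using the 32-bit-host TImode example from the thread. The type and its layout are hypothetical (four 32-bit words, least significant first, plus a precision in bits); this is an illustration of the reasoning, not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical value: up to 128 bits in 32-bit words, low word first.
struct wide_value_sketch {
  std::uint32_t val[4];
  unsigned precision;  // number of meaningful bits, 1..128
};

// True when every bit below `precision` is set, i.e. the value is -1
// at that precision. Bits at or above `precision` are ignored.
inline bool minus_one_p(const wide_value_sketch &w) {
  unsigned full_words = w.precision / 32;
  for (unsigned i = 0; i < full_words; ++i)
    if (w.val[i] != 0xffffffffu)
      return false;
  unsigned rest = w.precision % 32;
  if (rest != 0) {
    std::uint32_t mask = (std::uint32_t(1) << rest) - 1;
    if ((w.val[full_words] & mask) != mask)
      return false;
  }
  return true;
}
```

The same low word `0xffffffff` is -1 at precision 32 but not at precision 128 unless the three upper words are also all ones, which is why no sign needs to be stored for this particular query.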
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 02:38 PM, Lawrence Crowl wrote: On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + inline bool minus_one_p () const; + inline bool zero_p () const; + inline bool one_p () const; + inline bool neg_p () const; what's wrong with w == -1, w == 0, w == 1, etc.? I would love to do this and you seem to be somewhat knowledgeable of c++. But i cannot for the life of me figure out how to do it. Starting from the simple case, you write an operator ==. as global operator: bool operator == (wide_int w, int i); as member operator: bool wide_int::operator == (int i); In the simple case, bool operator == (wide_int w, int i) { switch (i) { case -1: return w.minus_one_p (); case 0: return w.zero_p (); case 1: return w.one_p (); default: unexpected } } no, this seems wrong.you do not want to write code that can only fail at runtime unless there is a damn good reason to do that. Well, that's because it's the oversimplified case. :-) say i have a TImode number, which must be represented in 4 ints on a 32 bit host (the same issue happens on 64 bit hosts, but the examples are simpler on 32 bit hosts) and i compare it to -1. The value that i am going to see as the argument of the function is going have the value 0x. but the value that i have internally is 128 bits. do i take this and 0 or sign extend it? What would you have done with w.minus_one_p ()? the code knows that -1 is a negative number and it knows the precision of w. That is enough information. So it logically builds a -1 that has enough bits to do the conversion. And the code could also know that '-n' is a negative number and do the identical conversion. It would certainly be more difficult to write and get all the edge cases. in particular if someone wants to compare a number to 0xdeadbeef i have no idea what to do. 
I tried defining two different functions, one that took a signed and one that took an unsigned number but then i wanted a cast in front of all the positive numbers. This is where it does get tricky. For signed arguments, you should sign extend. For unsigned arguments, you should not. At present, we need multiple overloads to avoid type ambiguities. bool operator == (wide_int w, long long int i); bool operator == (wide_int w, unsigned long long int i); inline bool operator == (wide_int w, long int i) { return w == (long long int) i; } inline bool operator == (wide_int w, unsigned long int i) { return w == (unsigned long long int) i; } inline bool operator == (wide_int w, int i) { return w == (long long int) i; } inline bool operator == (wide_int w, unsigned int i) { return w == (unsigned long long int) i; } (There is a proposal before the C++ committee to fix this problem.) Even so, there is room for potential bugs when wide_int does not carry around whether or not it is signed. The problem is that regardless of what the programmer thinks of the sign of the wide int, the comparison will use the sign of the int. when they do we can revisit this. but i looked at this and i said the potential bugs were not worth the effort. I won't disagree. I was answering what I thought were questions on what was possible. If there is a way to do this, then i will do it, but it is going to have to work properly for things larger than a HOST_WIDE_INT. The long-term solution, IMHO, is to carry the sign information around in either the type or the class data. (I prefer type, but with a mechanism to carry it as data when needed.) Such comparisons would then require consistency in signedness between the wide int and the plain int. carrying the sign information is a non-starter. The rtl level does not have it and the middle end violates it more often than not. My view was to design this having looked at all of the usage.
I have basically converted the whole compiler before i released the abi. I am still getting out the errors and breaking it up in reviewable sized patches, but i knew very very well who my clients were before i wrote the abi. Okay. I know that double-int does some of this and it does not carry around a notion of signedness either. is this just code that has not been fully tested or is there a trick in c++ that i am missing? The double int class only provides == and !=, and only with other double ints. Otherwise, it has the same value query functions that you do above. In the case of double int, the goal was to simplify use of the existing semantics. If you are changing the semantics, consider incorporating sign explicitly. i have, and it does not work. Unfortunate. -- Lawrence Crowl
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/2012 04:25 PM, Lawrence Crowl wrote: On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 02:38 PM, Lawrence Crowl wrote: On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote: + inline bool minus_one_p () const; + inline bool zero_p () const; + inline bool one_p () const; + inline bool neg_p () const; what's wrong with w == -1, w == 0, w == 1, etc.? I would love to do this and you seem to be somewhat knowledgeable of c++. But i cannot for the life of me figure out how to do it. Starting from the simple case, you write an operator ==. as global operator: bool operator == (wide_int w, int i); as member operator: bool wide_int::operator == (int i); In the simple case, bool operator == (wide_int w, int i) { switch (i) { case -1: return w.minus_one_p (); case 0: return w.zero_p (); case 1: return w.one_p (); default: unexpected } } no, this seems wrong.you do not want to write code that can only fail at runtime unless there is a damn good reason to do that. Well, that's because it's the oversimplified case. :-) say i have a TImode number, which must be represented in 4 ints on a 32 bit host (the same issue happens on 64 bit hosts, but the examples are simpler on 32 bit hosts) and i compare it to -1. The value that i am going to see as the argument of the function is going have the value 0x. but the value that i have internally is 128 bits. do i take this and 0 or sign extend it? What would you have done with w.minus_one_p ()? the code knows that -1 is a negative number and it knows the precision of w. That is enough information. So it logically builds a -1 that has enough bits to do the conversion. And the code could also know that '-n' is a negative number and do the identical conversion. It would certainly be more difficult to write and get all the edge cases. I am not a c++ hacker. if someone wants to go there later, we can investigate this. but it seems like a can of worms right now. 
in particular if someone wants to compare a number to 0xdeadbeef i have no idea what to do. I tried defining two different functions, one that took a signed and one that took an unsigned number but then i wanted a cast in front of all the positive numbers.

This is where it does get tricky. For signed arguments, you should sign extend. For unsigned arguments, you should not. At present, we need multiple overloads to avoid type ambiguities.

    bool operator == (wide_int w, long long int i);
    bool operator == (wide_int w, unsigned long long int i);
    inline bool operator == (wide_int w, long int i) { return w == (long long int) i; }
    inline bool operator == (wide_int w, unsigned long int i) { return w == (unsigned long long int) i; }
    inline bool operator == (wide_int w, int i) { return w == (long long int) i; }
    inline bool operator == (wide_int w, unsigned int i) { return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.) Even so, there is room for potential bugs when wide_int does not carry around whether or not it is signed. The problem is that regardless of what the programmer thinks of the sign of the wide int, the comparison will use the sign of the int.

when they do we can revisit this. but i looked at this and i said the potential bugs were not worth the effort.

I won't disagree. I was answering what I thought were questions on what was possible.

If there is a way to do this, then i will do it, but it is going to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to carry the sign information around in either the type or the class data. (I prefer type, but with a mechanism to carry it as data when needed.) Such comparisons would then require consistency in signedness between the wide int and the plain int.
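As a rough illustration of the sign-extend versus zero-extend rule discussed above, here is a minimal sketch. It assumes a toy 128-bit value held in four 32-bit words, mirroring the TImode-on-a-32-bit-host example. The names toy_wide_int and widen are hypothetical and are not part of the patch; the real wide_int class has a variable precision, which this sketch ignores.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a 128-bit value stored as four 32-bit words,
// least significant word first. This is NOT the wide_int class from the patch.
struct toy_wide_int {
  std::array<std::uint32_t, 4> words;
};

// Build the 128-bit image of a 64-bit pattern. The two upper words are
// filled with `fill`: all ones for sign extension of a negative value,
// zero otherwise.
inline toy_wide_int widen(std::uint64_t bits, std::uint32_t fill) {
  return toy_wide_int{{static_cast<std::uint32_t>(bits),
                       static_cast<std::uint32_t>(bits >> 32), fill, fill}};
}

// Signed argument: sign extend to the full width before comparing.
inline bool operator==(const toy_wide_int &w, long long i) {
  const std::uint32_t fill = i < 0 ? 0xffffffffu : 0u;
  return w.words == widen(static_cast<std::uint64_t>(i), fill).words;
}

// Unsigned argument: zero extend to the full width before comparing.
inline bool operator==(const toy_wide_int &w, unsigned long long i) {
  return w.words == widen(i, 0u).words;
}

// Forwarding overloads so that narrower arguments pick the extension
// that matches their own signedness without ambiguity.
inline bool operator==(const toy_wide_int &w, long i) {
  return w == static_cast<long long>(i);
}
inline bool operator==(const toy_wide_int &w, unsigned long i) {
  return w == static_cast<unsigned long long>(i);
}
inline bool operator==(const toy_wide_int &w, int i) {
  return w == static_cast<long long>(i);
}
inline bool operator==(const toy_wide_int &w, unsigned int i) {
  return w == static_cast<unsigned long long>(i);
}
```

With these overloads, a value whose words are all ones compares equal to -1 but not to 0xffffffffu. On hosts where int is 32 bits, an unsuffixed literal such as 0xdeadbeef has type unsigned int, so it is zero extended. As noted in the thread, the extension is chosen purely by the type of the plain integer, regardless of how the programmer thinks of the wide value's sign.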
carrying the sign information is a non starter. The rtl level does not have it and the middle end violates it more often than not. My view was to design this having looked at all of the usage. I have basically converted the whole compiler before i released the abi. I am still getting out the errors and breaking it up in reviewable sized patches, but i knew very very well who my clients were before i wrote the abi.

Okay.

I know that double-int does some of this and it does not carry around a notion of signedness either. is this just code that has not been fully tested or is there a trick in c++ that i am missing?

The double int class only provides == and !=, and only with other double ints. Otherwise, it has the same value query functions that you do above. In the case of double int, the goal was to simplify use of the existing semantics. If you are changing the semantics, consider incorporating sign explicitly.

i have, and it does not work.

Unfortunate. There is certainly a desire here not