On Fri, 07 Aug 2009 01:22:31 -0700, Yigal Chripun <[email protected]>
wrote:
Robert Jacques wrote:
On Thu, 06 Aug 2009 13:34:21 -0700, Yigal Chripun <[email protected]>
wrote:
Ary Borenszweig wrote:
Adam D. Ruppe wrote:
On Thu, Aug 06, 2009 at 03:06:49PM -0400, Paul D. Anderson wrote:
Oh wait...I think "//" is used elsewhere.
Is this a joke?
No. When porting C, C++, Java or C# code just search "//" and
replace it with "--".
Oh wait... I think "--" is used elsewhere.
Why would I as a user want to have two ops that do the same thing?!
Python's solution for this is wrong IMHO since the problem is not with
the op itself but rather with the way C handles numbers.
5 / 2 should always be the more precise FP division unless the
compiler knows or is instructed to do otherwise.
int a = 5 / 2; // compiler knows to use integer division
No, it doesn't. (i.e. Welcome to the limitations of a context-free
grammar) The right hand of the expression has to be evaluated before
the left, or otherwise function overloads, etc, don't work, so there's
no way for the compiler to know the type of the expected result when
5/2 is evaluated.
auto b = 5 / 2; // b is double, FP division
auto c = cast(int)(5/2); // compiler instructed to use integer division
No, the cast would apply to the result of (5/2), not the '/' operator.
auto d = floor(5 / 2); // FP division floored by a function to int
floor returns a real, not an int. I think you were looking for
roundTo!int or roundTo!int(floor(5/2))
auto f = std.math.div(5, 2); // intrinsic that does integer division
What's the rationale for not doing the above, besides C compatibility?
The rationale is that integer division is very common and is usually
assigned back into an int, and not a real. The only issue is literals,
which were going to be handled with polysemous values (but that got
dropped). But 2/5.0 is really not that much overhead.
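For reference, here is a minimal sketch of how D types these expressions today, as opposed to the proposed semantics. The variable names are only for illustration, and 5 / 2.0 is used wherever a floating-point result is wanted, since 5 / 2 is integer division. It assumes a recent Phobos, where floor has a double overload; the post above describes the older signature that returns real.

```d
import std.conv : roundTo, to;
import std.math : floor;

void main()
{
    auto a = 5 / 2;       // both operands are int, so this is integer division
    auto b = 5 / 2.0;     // one double operand promotes the whole division
    static assert(is(typeof(a) == int));
    static assert(is(typeof(b) == double));
    assert(a == 2);
    assert(b == 2.5);

    // The cast converts the already-computed result; it does not pick the operator.
    auto c = cast(int)(5 / 2.0);
    assert(c == 2);

    // floor keeps a floating-point type, so an explicit conversion is still needed.
    auto d = floor(5 / 2.0);
    static assert(!is(typeof(d) == int));
    auto e = to!int(d);
    assert(e == 2);

    // roundTo rounds to the nearest integer instead of truncating.
    assert(roundTo!int(2.6) == 3);
}
```

The static asserts show that the type of each result is fixed by the operands alone, before any assignment or cast is considered.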
As you noted yourself, polysemous types help solve this and also the
compiler can have special handling (peep hole optimizations) for some of
the above cases.
In case d, roundTo does indeed sound better (and since in the above it's
compile time, it should be possible to optimize it).
You've ignored case f, which seems to me the most important: instead of
the current 5/2 == 2 there should be a div intrinsic function such that
div(5, 2) == 2 and that intrinsic will be the appropriate ASM
instruction for integer division.
No, I didn't. I just didn't have anything to say.
In general, you can always round down a float but you can't get the
correct double out of the rounded int, so my question is how much of a
performance hit is it to use the FP as default and only use the div
intrinsic where performance is really an issue?
Now that I think about it, turning integer division into a function is a
really bad idea: 1) there's function call overhead and 2) even if inlined,
register values and the stack still have to be manipulated (as function
call syntax dictates where the inputs and outputs are located). Today, DMD
can choose any set of register inputs and outputs.
As for the int-float conversion, apparently float-int is somewhat slow and
destroys the floating point pipeline.
From http://mega-nerd.com/FPcast/:
The instruction which causes the real damage in this block is fldcw,
(FPU load control word) on lines 8 and 11. Whenever the FPU encounters
this instruction it flushes its pipeline and loads the control word
before continuing operation. The FPUs of modern CPUs like the Pentium
III, Pentium IV and AMD Athlons rely on deep pipelines to achieve higher
peak performance. Unfortunately certain pieces of C code can reduce the
floating point performance of the CPU to the level of a non-pipelined FPU.
So why is the fldcw instruction used? Unfortunately, it is required to
make the calculation meet the ISO C Standard which specifies that
casting from floating point to integer is a truncation operation.
However, if the fistpl instruction was executed without changing the
mode of the FPU, the value would have been rounded instead of truncated.
The standard rounding mode is required for all normal operations like
addition, subtraction, multiplication etc while truncation mode is
required for the float to int cast. Hence if a block of code contains a
float to int cast, the FPU will spend a large amount of its time
switching between the two modes.
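The truncate-versus-round distinction in the quoted passage can be seen from D as well. This is a small sketch, assuming the default round-to-nearest-even FPU mode; lrint is the std.math function that converts using the current rounding mode, and x is just an illustrative value.

```d
import std.math : lrint;

void main()
{
    double x = 2.7;

    // A cast truncates toward zero, as the article describes for C.
    assert(cast(int) x == 2);
    assert(cast(int)(-x) == -2);

    // lrint honours the current rounding mode, so no mode switch is implied.
    assert(lrint(x) == 3);
    assert(lrint(2.5) == 2);   // ties go to the even neighbour
    assert(lrint(3.5) == 4);
}
```

Whether the cast actually costs an fldcw pair depends on the compiler and target. On x86-64 with SSE2, for instance, a truncating conversion instruction (cvttsd2si) is available, so the x87 control word does not have to be touched.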