Robert Jacques wrote:
On Fri, 07 Aug 2009 01:22:31 -0700, Yigal Chripun <[email protected]>
wrote:
Robert Jacques wrote:
On Thu, 06 Aug 2009 13:34:21 -0700, Yigal Chripun
<[email protected]> wrote:
Ary Borenszweig wrote:
Adam D. Ruppe wrote:
On Thu, Aug 06, 2009 at 03:06:49PM -0400, Paul D. Anderson wrote:
Oh wait...I think "//" is used elsewhere.
Is this a joke?
No. When porting C, C++, Java or C# code just search "//" and
replace it with "--".
Oh wait... I think "--" is used elsewhere.
Why would I as a user want to have two ops that do the same thing?!
Python's solution for this is wrong IMHO since the problem is not
with the op itself but rather with the way C handles numbers.
5 / 2 should always be the more precise FP division unless the
compiler knows or is instructed to do otherwise.
int a = 5 / 2; // compiler knows to use integer division
No, it doesn't. (i.e. Welcome to the limitations of a context-free
grammar) The right hand of the expression has to be evaluated before
the left, or otherwise function overloads, etc, don't work, so
there's no way for the compiler to know the type of the expected
result when 5/2 is evaluated.
auto b = 5 / 2; // b is double, FP division
auto c = cast(int)(5/2); // compiler instructed to use integer
division
No, the cast would apply to the result of (5/2), not the '/' operator.
auto d = floor(5 / 2); // FP division floored by a function to int
floor returns a real, not an int. I think you were looking for
roundTo!int or roundTo!int(floor(5/2))
auto f = std.math.div(5, 2); // intrinsic that does integer division
what's the rationale of not doing the above, besides C compatibility?
The rationale is that integer division is very common and is usually
assigned back into an int, and not a real. The only issue is
literals, which were going to be handled with polysemous values (but
that got dropped). But 2/5.0 is really not that much overhead.
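A small sketch of the point made in the replies above: the operand types of '/' decide which division is performed, and neither the type of the target nor a cast applied to the result changes that. It is written in C only because the thread is about C-compatible behaviour (D currently gives 5/2 == 2 as well); the function names are made up for illustration.

```c
/* Both operands are int, so the quotient is computed as an int (2)
   and only afterwards converted to double. */
double int_operands(void) { return 5 / 2; }

/* One floating-point operand is enough to make the division FP. */
double mixed_operands(void) { return 5 / 2.0; }

/* The cast applies to the already-truncated result of (5 / 2),
   not to the '/' operator. */
int cast_of_result(void) { return (int)(5 / 2); }

/* Casting an operand first changes which division is performed. */
double cast_of_operand(void) { return (double)5 / 2; }
```

Here int_operands() yields 2.0 while mixed_operands() and cast_of_operand() yield 2.5, which is the "2/5.0 is really not that much overhead" style of workaround.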
As you noted yourself, polysemous types help solve this and also the
compiler can have special handling (peep hole optimizations) for some
of the above cases.
In case d, roundTo does indeed sound better (and since in the above it's
compile time, it should be possible to optimize it).
You've ignored case f, which seems to me the most important: instead of
the current 5/2 == 2, there should be a div intrinsic function such that
div(5, 2) == 2, and that intrinsic would map to the appropriate ASM
instruction for integer division.
No, I didn't. I just didn't have anything to say.
In general, you can always round down a float but you can't get the
correct double out of the rounded int, so my question is how much of a
performance hit is it to use the FP as default and only use the div
intrinsic where performance is really an issue?
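For comparison, C already ships a library function named div in <stdlib.h>. It is an ordinary function, not the compiler intrinsic proposed above, but it shows what a function-style integer division looks like next to the "divide in FP, then convert" route. The wrapper names below are hypothetical.

```c
#include <stdlib.h>

/* Integer quotient via the standard div(), which returns a struct
   holding both quotient and remainder. */
int quotient_via_div(int n, int d) { return div(n, d).quot; }

/* The remainder comes from the same call. */
int remainder_via_div(int n, int d) { return div(n, d).rem; }

/* FP-first route: divide as double, then truncate with a cast. */
int quotient_via_fp(int n, int d) { return (int)((double)n / d); }
```

Note that C99 integer division truncates toward zero, so quotient_via_div(-5, 2) is -2, whereas flooring -2.5 would give -3. "Rounding down" and integer division only agree for non-negative results.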
Now that I think about it, turning integer division into a function is a
really bad idea: 1) there's function call overhead and 2) even if
inlined, register values and the stack still have to be manipulated (as
function call syntax dictates where the inputs and outputs are located).
Today, DMD can choose any set of register inputs and outputs.
As for the int-float conversion, apparently float-int is somewhat slow
and destroys the floating point pipeline.
From http://mega-nerd.com/FPcast/:
The instruction which causes the real damage in this block is fldcw,
(FPU load control word) on lines 8 and 11. Whenever the FPU encounters
this instruction it flushes its pipeline and loads the control word
before continuing operation. The FPUs of modern CPUs like the Pentium
III, Pentium IV and AMD Athlons rely on deep pipelines to achieve
higher peak performance. Unfortunately certain pieces of C code can
reduce the floating point performance of the CPU to level of a
non-pipelined FPU.
So why is the fldcw instruction used? Unfortunately, it is required to
make the calculation meet the ISO C Standard which specifies that
casting from floating point to integer is a truncation operation.
However, if the fistpl instruction was executed without changing the
mode of the FPU, the value would have been rounded instead of
truncated. The standard rounding mode is required for all normal
operations like addition, subtraction, multiplication etc while
truncation mode is required for the float to int cast. Hence if a
block of code contains a float to int cast, the FPU will spend a large
amount of its time switching between the two modes.
Thanks for the info about the cast.
Regarding the div() function above, I was thinking about using D's naked
asm feature. From what little I know about this, the compiler doesn't
generate the usual asm code for this sort of function to handle
registers, stack, etc., and you're supposed to do everything yourself in
asm.
I don't think your comment above about the function call overhead
applies if div is implemented in such a way, but I'm no expert and
someone more knowledgeable can shed more light on this. (Don?)
Also, div can be implemented in the compiler in the same way C++-style
casts have template syntax (e.g. static_cast<whatever>(thing)) but are
implemented inside the compiler.