Hi Jerry and Steve,
> Well I know 42 is the answer to the ultimate question of the universe so this
> must be OK. I just don't know what the question is.
> OK and thanks,
> Jerry

> +#define CONSTR_LEN_MAX 42
> Actually, I was wondering about the choice myself. With
> most common hardware having fairly robust L1 and L2 cache
> sizes, a double precision array constructor with 42
> elements only occupies 336 bytes. Seems small.

Well, the answer is that I didn't know how to choose a reasonable
constant. I have now run some benchmarks using rdtsc, and
these seem to indicate that the optimum value for CONSTR_LEN_MAX
is actually quite short, 3 or 4; otherwise I just got a slowdown
or a break-even.
So, I committed (r253872) with a length of 4 as a limit. If anybody
comes up with a better number, we can always change this.
So, thanks for the review and the comments.
Regards
Thomas
If somebody wants to check, here is the test case:
main.f90:
module tick
  interface
    function rdtsc()
      integer(kind=8) :: rdtsc
    end function rdtsc
  end interface
end module tick

program main
  use tick
  use tst, only: sub1, sub2
  implicit none
  integer(8) :: t1, t2
  integer :: i   ! loop counter, required under implicit none
  ! Warm-up call; its timing is not printed.
  t1 = rdtsc()
  call sub1(2.0)
  t2 = rdtsc()
  ! print *,"sub1 : ", t2-t1
  t1 = rdtsc()
  do i=1,10000
    call sub1(2.0)
  end do
  t2 = rdtsc()
  print *,"sub1 : ", t2-t1
  t1 = rdtsc()
  do i=1,10000
    call sub2(2.0)
  end do
  t2 = rdtsc()
  print *,"sub2 : ", t2-t1
end program main

tst.f90:
module tst
  integer, parameter :: n=4
  real, dimension(n) :: x
  real, dimension(n), parameter :: s = [(i,i=1,n)]
contains
  ! Element-by-element version.
  subroutine sub1(a)
    real, intent(in) :: a
    x(1) = a * 1.0
    x(2) = a * 2.0
    x(3) = a * 3.0
    x(4) = a * 4.0
  end subroutine sub1
  ! Same computation as a whole-array expression with the constant array s.
  subroutine sub2(a)
    real, intent(in) :: a
    x(:) = a * s(:)
  end subroutine sub2
end module tst

rdtsc.s:
        .file   "rdtsc.s"
        .text
        .globl  rdtsc_
        .type   rdtsc_, @function
rdtsc_:
.LFB0:
        .cfi_startproc
        rdtsc
        shl     $32, %rdx
        or      %rdx, %rax
        ret
        .cfi_endproc
.LFE0:
        .size   rdtsc_, .-rdtsc_
        .section        .note.GNU-stack,"",@progbits
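
On targets other than x86-64, where rdtsc.s will not assemble, the same
comparison can be sketched with the standard system_clock intrinsic. This
variant is an assumption and is not what produced the numbers above. The file
name main_clock.f90, the program name and the raised repetition count nrep are
made up for illustration, and it still has to be built together with tst.f90
from the test case:

```fortran
! main_clock.f90 (hypothetical name): portable variant of main.f90.
! Uses the standard system_clock intrinsic instead of the rdtsc wrapper.
program main_clock
  use tst, only: sub1, sub2
  implicit none
  ! Assumed value: raised from 10000 because system_clock is
  ! much coarser than the time stamp counter.
  integer, parameter :: nrep = 10000000
  integer(8) :: t1, t2, rate
  integer :: i

  call system_clock(count_rate=rate)   ! clock ticks per second

  call sub1(2.0)                       ! warm-up call, not timed

  call system_clock(t1)
  do i = 1, nrep
    call sub1(2.0)
  end do
  call system_clock(t2)
  print *, "sub1 : ", real(t2 - t1, kind=8) / real(rate, kind=8), " s"

  call system_clock(t1)
  do i = 1, nrep
    call sub2(2.0)
  end do
  call system_clock(t2)
  print *, "sub2 : ", real(t2 - t1, kind=8) / real(rate, kind=8), " s"
end program main_clock
```

The results are elapsed seconds rather than cycle counts, so they are only
comparable between sub1 and sub2 within one run, not with the rdtsc figures.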