That's an interesting observation. Vandermonde matrices turn out to be quite an interesting stress test of unusual floating-point computations. Constructing a Vandermonde matrix explicitly very quickly produces matrix elements that overflow to +/-Inf or underflow toward 0, unless the starting vector has entries that are all exactly of magnitude 1. Hence the larger problems you are benchmarking on [1:N] end up largely testing the speed of multiplying finite numbers by infinity. Conversely, had your tests used rand(N) as input (uniform random numbers between 0 and 1), they would have largely tested the speed of multiplying denormalized (subnormal) floats and zeros.
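
For concreteness, here is a minimal sketch of an explicit column-by-column construction in the spirit of the vander2 timed below; the actual definition isn't reproduced in this thread, so the body here is an assumption:

    # Hypothetical sketch of an explicit Vandermonde construction; the
    # actual vander2 being timed below is not shown in this thread.
    function vander2(x)
        n = length(x)
        V = ones(n, n)                  # column 1 holds x.^0 == 1.0
        for i = 2:n, j = 1:n
            V[j, i] = x[j] * V[j, i-1]  # so V[j, i] == x[j]^(i-1)
        end
        return V
    end

With any construction of this shape, every entry with |x[j]| > 1 hits Inf well before column 10^4 (already 2.0^1024 overflows), so for the [1:N] input nearly all of the work is multiplying by infinity.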
On my machine, the speed penalty from denormalized floats turns out to be significant, but the penalty from +Infs does not:

julia> gc()

julia> x = [(-1.0)^n for n=1:10^4]; @time vander2(x);  # lots of multiplications involving +1.0 and -1.0
elapsed time: 1.216966083 seconds (1600000400 bytes allocated, 6.44% gc time)

julia> gc()

julia> x = [1.0:10000.0]; @time vander2(x);  # lots of multiplications involving +Inf
elapsed time: 1.206602925 seconds (1600000400 bytes allocated, 6.04% gc time)

julia> gc()

julia> x = rand(10000); @time vander2(x);  # lots of multiplications involving 0s and denormalized floats
elapsed time: 2.952932852 seconds (1600000400 bytes allocated, 2.82% gc time)

On my machine, the equivalent computations using numpy are significantly slower:

>>> x = [(-1.0)**n for n in range(1, 10001)]; start = time.clock(); np.vander(x); time.clock() - start
...
3.961352000000005
>>> x = [n for n in range(1, 10001)]; start = time.clock(); np.vander(x); time.clock() - start
...
5.036712999999999
>>> x = np.random.rand(10000); start = time.clock(); np.vander(x); time.clock() - start
...
5.462721000000002
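
If you want to confirm that subnormals are what slows down the rand(10000) case, later Julia versions let you ask the hardware to flush subnormals to zero; a sketch, assuming a Julia that provides set_zero_subnormals:

    # Flush subnormal operands/results to zero (non-IEEE behavior, affects
    # the current thread) and re-time the slow case.
    set_zero_subnormals(true)
    x = rand(10000); @time vander2(x);  # should now run close to the +/-1.0 timing
    set_zero_subnormals(false)          # restore IEEE gradual underflow

If the timing drops back to the level of the other two inputs, the gap really is the hardware's subnormal penalty rather than anything in the Julia code.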
