For 1D vectors, even with @inbounds, the base address (pointer) is fetched on every element access, because the compiler does not know whether the array has been resized and its memory reallocated. With unsafe_view, the base address is assumed to be fixed, so the pointer can be cached in a register. That makes it a little faster.
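
For comparison, here is a minimal sketch of the plain-loop alternative with @inbounds, wrapped in a function. It is written in current Julia syntax, which differs from the 2014 syntax used elsewhere in this thread, and the name flip_even! is made up for illustration. It does not use unsafe_view, so it does not reproduce the pointer-caching behavior described above. Wrapping the loop in a function also matters for timing: a loop run at global scope works on untyped globals, which is likely where much of the time and allocation in the quoted @time results comes from.

```julia
# Hypothetical helper (not from NumericExtensions): flip the sign of
# every other element in place, i.e. indices 2, 4, 6, ...
function flip_even!(x::Vector)
    # @inbounds skips the bounds check on each x[k]. The range 2:2:length(x)
    # never goes past the end of a Vector, so this is safe here.
    @inbounds for k in 2:2:length(x)
        x[k] = -x[k]
    end
    return x
end

x = 2 .* rand(10_000) .- 1   # same kind of test data as in the thread
y = copy(x)                  # keep x unchanged for comparison
flip_even!(y)
```

Calling `@time flip_even!(y)` a second time (after the first call has compiled the function) gives a fairer measurement than timing the bare loop at the top level.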
- Dahua

On Monday, January 6, 2014 4:31:05 AM UTC-6, Sheehan Olver wrote:
>
> Won’t it be slightly faster with the view, as doing
>
> x[k] *= -1.
>
> will require a bound check on x[k]?
>
> I also don’t understand the trade-offs between @inbounds vs
> unsafe_view()…just in experiments unsafe_view seems to be faster.
>
>
> On 6 Jan 2014, at 4:53 pm, Stefan Karpinski <[email protected]> wrote:
>
> Why do you need the view? Why not just write a loop that flips the sign of
> every other element?
>
> On Jan 5, 2014, at 10:29 PM, Sheehan Olver <[email protected]> wrote:
>
> I'm looking for a fast (BLAS?) way to multiply every other entry of a
> vector by -1. Attached are some different ways I've tried, all of which
> seem dreadfully slow. (I include a pre-planned DCT for timing comparison:
> it should be much faster but is similar in cost)
>
> n=10000
> x=2rand(n)-1
>
> @time x.*(-1.).^[0:length(x)-1];
> # elapsed time: 0.00030401 seconds (241456 bytes allocated)
>
> using NumericExtensions
>
> y = unsafe_view(x)
> @time for k =2:2:length(x)
>     y[k] = -y[k]
> end
> # elapsed time: 0.000876339 seconds (699616 bytes allocated)
>
> ##pre-planned DCT for comparison
> p=FFTW.plan_r2r(x, FFTW.REDFT00)
>
> @time p(x);
> # elapsed time: 0.000814533 seconds (80928 bytes allocated)
