Wow, what a clever implementation! I'm really impressed that a few simple optimizations (inlining the pixel interpolation code and moving common computations out of the inner loop) made it ~20 times faster than my implementation (and ~3 times faster than Cairo)!
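A minimal sketch of those two optimizations, written in current Julia syntax rather than the 2014-era syntax of the quoted code. The name `resize_bilinear!` is hypothetical, `trunc(Int, x)` stands in for the old `itrunc(x)`, and the sketch assumes a floating-point source of at least 2x2 pixels; it is an illustration, not a drop-in replacement for the prototype quoted below.

```julia
# Hypothetical helper, current Julia syntax. Assumes size(src) >= (2, 2).
function resize_bilinear!(dst::AbstractMatrix, src::AbstractMatrix{<:AbstractFloat})
    h, w = size(src)
    # Scale factors depend only on the array sizes, so compute them once.
    s1 = (h - 1) / max(size(dst, 1) - 1, 1)
    s2 = (w - 1) / max(size(dst, 2) - 1, 1)
    for jd in 0:size(dst, 2)-1
        # Column index and weight are the same for every row: hoist them
        # out of the inner loop.
        jo = s2 * jd
        j0 = min(trunc(Int, jo), w - 2)   # clamp so j0+2 stays in bounds
        fj = jo - j0
        for id in 0:size(dst, 1)-1
            io = s1 * id
            i0 = min(trunc(Int, io), h - 2)
            fi = io - i0
            # Bilinear interpolation written inline, no per-pixel function call.
            dst[id+1, jd+1] =
                (1 - fi) * ((1 - fj) * src[i0+1, j0+1] + fj * src[i0+1, j0+2]) +
                fi * ((1 - fj) * src[i0+2, j0+1] + fj * src[i0+2, j0+2])
        end
    end
    return dst
end

# Usage: upsample a 2x2 Float32 image to 4x4.
small = Float32[1 2; 3 4]
big = resize_bilinear!(zeros(Float32, 4, 4), small)
```

Only one truncation per output pixel happens in the inner loop; the column truncation is paid once per column.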
On Fri, Aug 8, 2014 at 4:11 AM, Tim Holy <[email protected]> wrote:

> Now, it looks like you're doing it right. I expected this, see
> https://github.com/timholy/Grid.jl/pull/38. This is part of what I meant by
> "refactoring" :). However, for image interpolation even further savings beyond
> that pull request are possible: for example, you only need one call to floor
> per pixel, because you can nest the index operations.
>
> Here's a reasonably well-optimized prototype (a flexible implementation will
> take longer to write, but should not cost performance). Note there may be some
> additional optimizations possible.
>
> First, the results:
> julia> include("/tmp/resize.jl");
> Cairo:
> elapsed time: 0.098905264 seconds (53896192 bytes allocated, 38.25% gc time)
> Julia:
> elapsed time: 0.034537582 seconds (6340816 bytes allocated)
>
> Julia is ~3 times faster.
>
> Here's the code (I just copied your Cairo implementation):
>
> using Images, Cairo
>
> function imresize_julia!(resized, original)
>     scale1 = (size(original,1)-1)/(size(resized,1)-0.999f0)
>     scale2 = (size(original,2)-1)/(size(resized,2)-0.999f0)
>     for jr = 0:size(resized,2)-1
>         # Column position and weight are computed once per column
>         jo = scale2*jr
>         ijo = itrunc(jo)
>         fjo = jo - oftype(jo, ijo)
>         @inbounds for ir = 0:size(resized,1)-1
>             io = scale1*ir
>             iio = itrunc(io)
>             fio = io - oftype(io, iio)
>             # The + sits at the end of the line so that the expression
>             # continues; a line starting with + would be a separate statement
>             tmp = (1-fio)*((1-fjo)*original[iio+1,ijo+1] + fjo*original[iio+1,ijo+2]) +
>                   fio*((1-fjo)*original[iio+2,ijo+1] + fjo*original[iio+2,ijo+2])
>             resized[ir+1,jr+1] = convertsafely(eltype(resized), tmp)
>         end
>     end
>     resized
> end
> imresize_julia(original, new_size) = imresize_julia!(similar(original, new_size), original)
> convertsafely{T<:FloatingPoint}(::Type{T}, val) = convert(T, val)
> convertsafely{T<:Integer}(::Type{T}, val::Integer) = convert(T, val)
> convertsafely{T<:Integer}(::Type{T}, val::FloatingPoint) = itrunc(T, val+oftype(val, 0.5))
>
> function imresize_cairo(dat::Array{Uint32, 2}, new_size::(Int, Int))
>     cs = CairoImageSurface(dat, 0)
>     new_dat = zeros(Uint32, new_size)
>     new_cs = CairoImageSurface(new_dat, 0)
>     pat = CairoPattern(cs)
>     pattern_set_filter(pat, Cairo.FILTER_BILINEAR)
>     c = CairoContext(new_cs)
>     h, w = size(dat)
>     new_h, new_w = new_size
>     scale(c, new_h / h, new_w / w)
>     set_source(c, pat)
>     paint(c)
>     return new_cs.data
> end
>
> img = rand(0x00:0xff, 774, 512)
> new_size = (3096, 2048)
> imresize_cairo(convert(Array{Uint32}, img), new_size)  # warm-up call
> println("Cairo:")
> @time imresize_cairo(convert(Array{Uint32}, img), new_size)
> imresize_julia(img, new_size)                          # warm-up call
> println("Julia:")
> @time imresize_julia(img, new_size)
>
> More advantages of doing this in Julia rather than through Cairo:
> - No binary dependencies (as you stated earlier)
> - Cairo is basically 8-bit, which is inadequate for many applications. Cairo's
>   Uint32s are actually for encoding color, and internally they get treated like
>   4 Uint8s---you can't get 16-bit dynamic range, for example.
> - For Cairo you'd have to convert images with different datatypes into Uint32.
>   Measuring the performance of Cairo should include the cost of conversion. With
>   Julia, we can write versions that work for any input. For example, try a
>   Float32 image; you'll see the Julia version is even faster than what I showed
>   above.
> - We can separately implement algorithms for grayscale and color. Part of the
>   reason this is three times faster than Cairo's is that Cairo is basically
>   doing three times the work.
>
> --Tim
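The `convertsafely` methods in the quoted code are what let one loop serve any element type: dispatch picks a plain `convert` for float outputs and a round-then-truncate for integer outputs. A rough equivalent in current Julia syntax might look like the sketch below. The name `to_eltype` is hypothetical, `AbstractFloat` and `trunc(T, x)` replace the older `FloatingPoint` and `itrunc(T, x)`, and the rounding rule (add 0.5, then truncate) is copied from the quoted code, so it is only meaningful for non-negative values.

```julia
# Hypothetical name; mirrors the three convertsafely methods in current syntax.
# Float output: convert directly.
to_eltype(::Type{T}, val) where {T<:AbstractFloat} = convert(T, val)
# Integer output from an integer value: convert directly.
to_eltype(::Type{T}, val::Integer) where {T<:Integer} = convert(T, val)
# Integer output from a float value: add 0.5 and truncate
# (rounds half up for non-negative values; throws if the result does not fit in T).
to_eltype(::Type{T}, val::AbstractFloat) where {T<:Integer} =
    trunc(T, val + oftype(val, 0.5))

# Usage: the same call works whatever the destination element type is.
a = to_eltype(UInt8, 2.5)      # float value into an 8-bit image
b = to_eltype(Float32, 1.25)   # float value into a Float32 image
c = to_eltype(Int, 0x05)       # integer value into an integer image
```

Because the method is chosen from the destination type at compile time, the conversion should add no branching inside the pixel loop.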
