Thank you all for responding. Rather than replying to each individual post, I'll try to summarize what I have gleaned so far; please correct me if I have misunderstood something. This will be a long post, so please forgive my verbosity, but this is not a trivial topic.

1.

I'll start with cubic spline resampling. Thanks to Even for pointing me to the GDAL code. It looks to me as if the sampled B-spline basis function is applied directly to the source data, with no prefiltering applied first; the sampled basis function is then used to produce a weighted average of source data values. If prefiltering is indeed omitted, the resulting interpolation has a slight smoothing effect and does not fulfill the interpolation criterion, meaning that the result of the operation at locations coinciding with source data points does not precisely reproduce the source data. This seems to be what's happening: without a scale change, applying cubic spline resampling slightly smooths the signal.

Application of the b-spline evaluation without previous prefiltering is commonly done, sometimes deliberately because the slight smoothing is intended and the resulting signal will stay within the convex hull of the coefficients (so, there won't be overshoots, which are possible with prefiltered coefficients) - and sometimes erroneously, because the need for prefiltering is not known. For slight downscaling, this smoothing may be 'just right', but it's only 'about right' for a small range of scaling factors. Note here that b-spline prefiltering is not the low-pass filter which has to be applied to the source data before resampling.

b-splines are not 'direct' interpolators. Direct interpolators approximate the continuous signal by applying some operation directly to the given samples. b-splines instead operate on a set of coefficients, which are generated from the original samples by 'prefiltering' them. This can be explained simply: the 'reconstruction' filter for a given b-spline is a low-pass filter, so applying it to the original signal will 'blur' that signal. The prefilter is a high-pass filter which 'sharpens' the original signal by just so much that applying the reconstruction filter reproduces the original signal. The problem with prefiltering is that it's done with an IIR filter and has theoretically infinite support (in practice, a handful of samples at either end suffices), which makes margin treatment and handling of no-data values a dicey issue. This is probably one reason why prefiltering tends to be omitted.
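The effect of omitting the prefilter can be demonstrated with scipy, whose `map_coordinates` exposes a `prefilter` switch (this is an illustration with scipy's implementation, not the GDAL code path):

```python
import numpy as np
from scipy.ndimage import map_coordinates

# A small 1-D test signal.
data = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 2.0, 0.0])
coords = [np.arange(len(data), dtype=float)]  # sample at the original positions

# With prefiltering: the interpolation criterion holds, the original
# values are reproduced exactly at the sample positions.
with_pre = map_coordinates(data, coords, order=3, mode='mirror', prefilter=True)

# Without prefiltering: the cubic B-spline basis acts as a smoothing
# filter (each output is (b[i-1] + 4*b[i] + b[i+1]) / 6 at integer positions).
without_pre = map_coordinates(data, coords, order=3, mode='mirror', prefilter=False)

print(np.allclose(with_pre, data))     # True
print(np.allclose(without_pre, data))  # False: the signal has been smoothed
```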

2.

Looking at the source code, what I understand from it is that, as I stated in my initial post, upsampling is indeed well covered, but the only 'filter' which makes sense in a downsampling context is 'average', with all the disadvantages it has. In the GRASS codebase (thanks to Markus for pointing to it) I noticed that the need for a specialized downsampling technique is acknowledged, and there is a filtering method which is definitely better suited than just averaging over contributing source data values: the 'weight according to area' parameter, which can be passed to the r.resamp.stats method. This technique is also used in astronomy; I first saw it in OpenCV, where it is one of the standard interpolators ('area'). It produces output which preserves the energy content of the input signal: the input is treated as if it were composed of small rectangles of uniform value, and the average is taken over the part of that pattern covered by the extent of the (larger) target pixel. This method is definitely preferable to using r.resamp.stats without the -w parameter. It also has the nice properties of small support and a result which will not over- or undershoot. I'd encourage GDAL to consider adopting this method to avoid the worst of the downsampling errors that come with '-r average'.
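To make the 'weight according to area' idea concrete, here is a minimal 1-D sketch (my own illustration, not the GRASS or OpenCV implementation): each source cell contributes to a target cell in proportion to how much of it the target cell covers, which also handles non-integer factors.

```python
import numpy as np

def area_downsample_1d(src, factor):
    """Downsample 1-D data by `factor` (may be non-integer), weighting
    each source cell by the fraction of it overlapped by the target cell."""
    n_out = int(len(src) / factor)
    out = np.empty(n_out)
    for j in range(n_out):
        lo, hi = j * factor, (j + 1) * factor        # target cell in source units
        i0, i1 = int(np.floor(lo)), int(np.ceil(hi))
        idx = np.arange(i0, i1)
        # Overlap of each source cell [i, i+1) with the target cell [lo, hi).
        w = np.minimum(idx + 1, hi) - np.maximum(idx, lo)
        out[j] = np.dot(w, src[i0:i1]) / w.sum()
    return out

print(area_downsample_1d(np.array([1.0, 3.0, 5.0, 7.0]), 2.0))  # [2. 6.]
print(area_downsample_1d(np.array([0.0, 3.0, 6.0]), 1.5))       # [1. 5.]
```

For integer factors this reduces to a plain block mean; for non-integer factors the boundary cells get fractional weights, which is what '-r average' lacks.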

3.

Downsampling usually has to be preceded by an appropriate low-pass filter. Let me spell out when such a filter is needed:

The highest representable frequency in the target (downsampled) signal is given by the Nyquist frequency for this signal's sampling rate. If the source signal has frequency content higher than that, it *must* be low-pass filtered before resampling, because otherwise the high-frequency content will 'alias' into the target signal and produce artifacts. If such high-frequency content is not present, resampling can proceed without previous low-pass filtering.
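Aliasing is easy to demonstrate numerically: a 42 Hz sine sampled at 100 Hz and then decimated by two (new Nyquist: 25 Hz) becomes indistinguishable from an 8 Hz sine, since 42 Hz folds to |42 - 50| = 8 Hz.

```python
import numpy as np

fs = 100.0                             # source sampling rate
t = np.arange(0, 1, 1 / fs)
high = np.sin(2 * np.pi * 42 * t)      # 42 Hz component

# Decimate by 2 without low-pass filtering -> new rate 50 Hz, Nyquist 25 Hz.
decimated = high[::2]
t2 = t[::2]

# 42 Hz cannot be represented at 50 Hz; it aliases to |42 - 50| = 8 Hz
# (with a sign flip, since 42 = 50 - 8).
alias = np.sin(2 * np.pi * 8 * t2)
print(np.allclose(decimated, -alias))  # True: the 42 Hz content folded to 8 Hz
```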

Oftentimes the high-frequency content is small, and especially when this coincides with a small downscaling factor, omitting the low-pass filter will only produce small errors which are not easily seen, so the result may pass as correct, looking 'okay', when in fact it is slightly wrong, just not noticeably so. With increasing high-frequency content and larger downscaling factors, the error grows and may 'spoil' the signal so much that it becomes noticeable.

So, first, we have to establish whether there is a need for low-pass filtering, and the advice would be to use it if there is relevant high-frequency content. This might be seen from an FFT of the signal. If the FFT is at hand, the high-frequency part can be removed in the frequency domain, and after IFFT the signal is 'eligible' for resampling without further ado. This is a straightforward approach, but it does not play well with no-data values. It's also possible to apply a low-pass filter whether or not there is high-frequency content: if there is none, a good low-pass filter has no effect. So now we come to the next question, namely 'what is a good low-pass filter'.
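A minimal sketch of the frequency-domain route (assuming a real-valued 1-D signal with no no-data values; a hard cutoff like this does introduce some ringing, which is one reason tapered filters are often preferred):

```python
import numpy as np

def fft_lowpass(signal, cutoff_fraction):
    """Zero out all frequency bins above `cutoff_fraction` of the source
    Nyquist frequency, then transform back."""
    spec = np.fft.rfft(signal)
    cut = int(len(spec) * cutoff_fraction)
    spec[cut:] = 0.0
    return np.fft.irfft(spec, n=len(signal))

# Before decimating by 2, keep only content below half the source Nyquist.
x = np.random.default_rng(0).standard_normal(256)
filtered = fft_lowpass(x, 0.5)
decimated = filtered[::2]
print(len(decimated))  # 128
```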

4.

Thanks to Jakob for pointing me to your python code. I'll happily use Python :D

Using a gaussian to smooth (or low-pass filter) the source signal is a valid approach, especially since the radius of the gaussian can be chosen to match the desired degree of smoothing, which, in a downsampling context, depends on the scale change. Using a truncated gaussian (the 'true' gaussian is infinitely large) as a convolution kernel is common and well understood. The suppression of high-frequency content is not as sharp as one might wish; there are filters with smaller transition bands and better stop-band attenuation, but they have to be custom-designed for a given downsampling factor, and from what I've seen this is not commonly done in general-purpose software.

I'd say that using a gaussian - or other convolution-based low-pass filter is a good idea and should outperform the 'weight according to area' method, with the downside of the wider support, which makes the handling of no-data values more difficult. Gaussian kernels will also not produce overshoots (because all weights are positive), which is a plus.

What has to be understood here is that the filter has to match the desired scale change. The kernel has to be produced by a process taking the scale change into account, which can be done for a gaussian by choosing an appropriate standard deviation. So a general-purpose downsampling algorithm would perform these steps:

- generate a custom filter (like, a kernel) given the desired scale change

- apply this filter to the source data

- use an interpolator over the low-pass filtered signal to obtain the target values for the given target locations.
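The three steps above can be sketched in a few lines with scipy. Note that the choice of sigma as half the scale factor is a common heuristic, not a settled value, and the plain linear interpolation in the last step is just for brevity:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def gaussian_downsample(src, factor, sigma_per_unit=0.5):
    """Three-step downsampling sketch:
    1. build a filter matched to the scale change (gaussian with sigma
       proportional to the factor; the 0.5 constant is a heuristic),
    2. apply this filter to the source data,
    3. interpolate the filtered signal at the target locations
       (linear interpolation here for brevity)."""
    sigma = sigma_per_unit * factor
    filtered = gaussian_filter1d(np.asarray(src, dtype=float), sigma, mode='nearest')
    n_out = int(len(src) / factor)
    # Target pixel centers expressed in source coordinates.
    positions = (np.arange(n_out) + 0.5) * factor - 0.5
    return np.interp(positions, np.arange(len(src)), filtered)

out = gaussian_downsample(np.arange(16, dtype=float), 4.0)
print(len(out))  # 4
```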

Not taking the scale change into account and using a fixed kernel will only be appropriate for a specific scale change, so it's no good for the general purpose case, but can be used for specific scale changes, like halving the resolution ('half-band filter'), which may be repeated several times to cover larger scale changes.

5.

This gets me to Jukka's response pointing me to VRT files with KernelFilteredSource. I hadn't seen/noticed this before, thanks for pointing it out! With the proviso that the kernel suits the scaling factor (so, it has to be computed for a desired scale change) this looks like an excellent way of dealing with the problem: The filter is conceptually attached to the source data where it belongs, and interpolation of target values operates on the filtered data. Neat. It leaves the user with having to figure out a suitable kernel - a kernel derived from a truncated (or otherwise windowed) gaussian (needs to be normalized, though) would be a good starting point. I'll try that and report back. It also fits well with my current work flow of using shell scripts and gdal.
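As a starting point for such a kernel, here is a sketch that generates normalized, truncated-gaussian coefficients in the flattened form the VRT's <Coefs> element expects. The sigma-from-scale heuristic and the truncation radius are my assumptions; the <Size>/<Coefs> element names are from GDAL's VRT documentation, but please double-check against the current spec before relying on this:

```python
import numpy as np

def gaussian_kernel_coefs(scale, sigma_per_unit=0.5, truncate=3.0):
    """Return the (odd) size and flattened coefficients of a normalized,
    truncated 2-D gaussian kernel for a VRT <KernelFilteredSource>."""
    sigma = sigma_per_unit * scale       # heuristic, see the discussion above
    radius = int(truncate * sigma)
    x = np.arange(-radius, radius + 1)
    g1 = np.exp(-0.5 * (x / sigma) ** 2)
    k = np.outer(g1, g1)                 # separable 2-D gaussian
    k /= k.sum()                         # normalize so the overall level is preserved
    return 2 * radius + 1, k.ravel()

size, coefs = gaussian_kernel_coefs(scale=2.0)
print(f"<Size>{size}</Size>")
print("<Coefs>" + " ".join(f"{c:.6g}" for c in coefs) + "</Coefs>")
```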

Also, thanks to Joaquim. The code you link to also looks like it will suit the problem, but I'll admit to a bit of overload here and will defer looking at it in more detail until later. What I found especially attractive is 'grdfft', which performs operations in the frequency domain. As I have pointed out above, that is a very clean and desirable way of treating the problem and should be ideal, but it might be difficult to use if there are no-data values, and the FFT and IFFT are certainly computationally intensive. I think the GRASS codebase also offers FFT and IFFT, so it might be possible to use GRASS to the same effect.

6.

To conclude:

- downsampling without previous low-pass filtering is normally an error.

- upsampling is a very different matter and requires no previous filtering; it's important to see the distinction.

- there is a wide spectrum of possible low-pass filters, but the filter has to be customized to the given scaling factor. There is no 'one kernel fits all' approach.

- once the low-pass filtered signal has been obtained, it can be downsampled using any of the available methods, but preferably *not* nearest neighbour. Cubic spline is good here.

Still with me? Thanks for your patience! I hope I have understood all the source code and response posts correctly, don't hesitate to correct me.

Kay
_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
