Mohammed Raad wrote:
> Can you provide some data regarding how well these metrics correlate
> with subjective measurements? I expect each will be more suitable for
> different types of content, but it would be interesting to know how
> these perform.
There are a number of studies available (one of the advantages of not
designing our own metrics).
The original PSNR-HVS-M paper [1] conducted experiments using additive
Gaussian noise and spatially-correlated Gaussian noise (in regions with
various masking properties), and got the following Spearman correlations
(Kendall correlations showed similar results):
MS-SSIM: 0.406
PSNR: 0.537
PSNR-HVS-M: 0.984
Of course, Gaussian noise is not terribly representative of compression
artifacts.
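For anyone who wants to run this kind of comparison on their own data,
here is a minimal sketch of the Spearman rank-order correlation used in
these studies: rank both sets of scores (ties get the average rank), then
take the Pearson correlation of the ranks. The function names and the
sample scores are hypothetical and are not taken from any of the cited
papers.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(metric_scores, subjective_scores):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(metric_scores),
                   average_ranks(subjective_scores))


# Hypothetical example: one objective score and one mean opinion score
# per test image (made-up numbers, for illustration only).
metric_db = [28.1, 31.4, 33.0, 35.2, 40.7]
mos = [2.1, 3.0, 2.8, 3.9, 4.6]
print(f"Spearman rho = {spearman(metric_db, mos):.3f}")
```

Because only the ranks matter, any monotonic rescaling of a metric (dB
versus linear, for example) leaves the result unchanged, which is why
these studies can compare metrics with very different output ranges.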
The original Fast SSIM paper [2] used the LIVE image database (JPEG and
JPEG2000 compression artifacts, white noise, Gaussian blur, and
fast-fading channel noise), and showed these Spearman rank-order
correlations:
PSNR: 0.8755
SSIM: 0.9244
MS-SSIM: 0.9425
Fast MS-SSIM: 0.9409
They also evaluated the metrics on the LIVE video database (MPEG-2
compression, H.264 compression, H.264 compressed bit streams under packet
loss, and H.264 compressed bit streams with bit errors):
PSNR: 0.3684
SSIM: 0.4953
MS-SSIM: 0.7593
Fast MS-SSIM: 0.6991
I expect the numbers are so much lower than in the image case primarily
because of the packet loss and bit error conditions, but the authors did
not break out the numbers by type of distortion, so it is difficult to
say. Comparing with the numbers below, it also seems clear there is a
significant temporal effect that none of these still-image metrics can
capture.
The TID 2008 database (maintained by the authors of PSNR-HVS-M) has
human rankings for a number of different distortion types, and
comparisons of various metrics against them can be found [3]:
            JPEG     JPEG 2000   JPEG            JPEG 2000
                                 transmission    transmission
                                 errors          errors
PSNR:       0.899    0.8255      0.7646          0.7769
SSIM:       0.8994   0.8888      0.8216          0.8395
MS-SSIM:    0.935    0.9706      0.8747          0.8585
The same authors have a new expanded TID 2013 database, with more
results [4]. Sadly, they no longer break out results by the individual
type of distortion, but group them into subsets. Aside from the "Full"
subset, only the "Actual" subset includes both JPEG and JPEG 2000
compression artifacts ("Simple" and "Color" also include JPEG
compression). Only "Exotic" includes transmission errors, along with
many other distortions completely unlike what would be produced by a
codec. Spearman rank-order correlations:
              Actual   Simple   Full
PSNR:         0.8246   0.9134   0.6395
SSIM:         0.7877   0.8371   0.6370
MS-SSIM:      0.8871   0.9053   0.7872
PSNR-HVS-M:   0.9175   0.9379   0.6246
The results here are by no means exhaustive. The algorithms here are
also by no means always the best ones in the studies cited, and new
studies usually come with a new algorithm that purports to be better.
There have been a number of these since 2011. But these are the metrics
we've been using and have some familiarity with.
[1] http://enpub.fulton.asu.edu/resp/vpqm/vpqm2007/papers/399.pdf
[2] http://live.ece.utexas.edu/publications/2011/chen_rtip_2011.pdf
[3] http://hdrvdp.sourceforge.net/reports/2.0/quality_tid2008/index.html
[4] http://ponomarenko.info/tid2013.htm