Mohammed Raad wrote:
Can you provide some data regarding how well these metrics correlate
with subjective measurements? I expect each will be more suitable for
different types of content, but it would be interesting to know how
these perform.

There are a number of studies available (one of the advantages of not designing our own metrics).

The original PSNR-HVS-M paper [1] conducted experiments using additive Gaussian noise and spatially-correlated Gaussian noise (in regions with various masking properties), and got the following Spearman correlations (Kendall correlations showed similar results):

MS-SSIM:    0.406
PSNR:       0.537
PSNR-HVS-M: 0.984

Of course, Gaussian noise is not terribly representative of compression artifacts.
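
For anyone who wants to reproduce this kind of comparison on their own data, here is a minimal sketch of the two rank correlations mentioned above, in plain Python with no dependencies. It is not the code used in any of the cited studies. The function names and the sample scores are made up for illustration, and the Kendall variant shown is the simple tau-a, which does not correct for ties (the papers may use a different variant).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs, no tie correction."""
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical data: one objective score and one mean opinion score per image.
metric_scores = [28.1, 30.5, 33.2, 35.0, 38.7, 41.3]
subjective_mos = [2.1, 2.0, 3.4, 3.9, 4.2, 4.8]

if __name__ == "__main__":
    print("Spearman:", spearman(metric_scores, subjective_mos))
    print("Kendall tau-a:", kendall_tau_a(metric_scores, subjective_mos))
```

Because both measures depend only on the ordering of the scores, a metric that is monotonically but non-linearly related to the subjective scores still gets a perfect correlation, which is why these studies report them rather than a plain linear correlation.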

The original Fast SSIM paper [2] used the LIVE image database (JPEG and JPEG2000 compression artifacts, white noise, Gaussian blur, and fast-fading channel noise), and showed these Spearman rank-order correlations:

PSNR:         0.8755
SSIM:         0.9244
MS-SSIM:      0.9425
Fast MS-SSIM: 0.9409

They also compared against the LIVE video database (MPEG-2 compression, H.264 compression, H.264 compressed bit streams under packet loss, and H.264 compressed bit streams with bit errors):

PSNR:         0.3684
SSIM:         0.4953
MS-SSIM:      0.7593
Fast MS-SSIM: 0.6991

I expect the numbers are so much lower than in the image case primarily because of the packet loss and bit error conditions, but the authors did not break out the numbers by type of distortion, so it is difficult to say. Comparing with the numbers below, it also seems clear there is a significant temporal effect that none of these still-image metrics can capture.

The TID 2008 database (maintained by the authors of PSNR-HVS-M) has human rankings for a number of different distortion types, and comparisons of various metrics against them can be found [3]:

          JPEG    JPEG 2000  JPEG                 JPEG 2000
                             transmission errors  transmission errors
PSNR:     0.899   0.8255     0.7646               0.7769
SSIM:     0.8994  0.8888     0.8216               0.8395
MS-SSIM:  0.935   0.9706     0.8747               0.8585

The same authors have a new expanded TID 2013 database, with more results [4]. Sadly, they no longer break out results by the individual type of distortion, but group them into subsets. Aside from the "Full" subset, only the "Actual" subset includes both JPEG and JPEG 2000 compression artifacts ("Simple" and "Color" also include JPEG compression). Only "Exotic" includes transmission errors, along with many other distortions completely unlike what would be produced by a codec. Spearman rank-order correlations:

            Actual  Simple  Full
PSNR:       0.8246  0.9134  0.6395
SSIM:       0.7877  0.8371  0.6370
MS-SSIM:    0.8871  0.9053  0.7872
PSNR-HVS-M: 0.9175  0.9379  0.6246

The results here are by no means exhaustive. The algorithms here are also by no means always the best ones in the studies cited, and new studies usually come with a new algorithm that purports to be better. There have been a number of these since 2011. But these are the metrics we've been using and have some familiarity with.


[1] http://enpub.fulton.asu.edu/resp/vpqm/vpqm2007/papers/399.pdf
[2] http://live.ece.utexas.edu/publications/2011/chen_rtip_2011.pdf
[3] http://hdrvdp.sourceforge.net/reports/2.0/quality_tid2008/index.html
[4] http://ponomarenko.info/tid2013.htm

_______________________________________________
video-codec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/video-codec