2018-01-07 15:40 GMT+01:00 Jörn Heusipp <osm...@problemloesungsmaschine.de>: > > On 01/06/2018 04:10 PM, Carl Eugen Hoyos wrote: >> >> 2018-01-06 11:07 GMT+01:00 Jörn Heusipp >> <osm...@problemloesungsmaschine.de>: > > >>> libopenmpt file header probing is tested regularly against the FATE >>> suite and other diverse file collections by libopenmpt upstream in >>> order to avoid false positives. >> >> >> You could also test tools/probetest > > > I was not aware of that tool. Thanks for the suggestion. > > It currently lists a failure related to libopenmpt: > Failure of libopenmpt probing code with score=76 type=0 p=FDC size=1024
I did not look at this closely but I suspect libopenmpt should return a smaller score in this case. A solution could be to never return a high value for the FFmpeg probe function. > Looking at tools/probetest.c, that would be a file with very few bits set. > libopenmpt detects the random data in question as M15 .MOD files (original > Amiga 15 samples .mod files), and there is sadly not much that can be done > about that. There are perfectly valid real-world M15 .MOD files with only 73 > bits set in the first 600 bytes (untouchables-station.mod, > <https://modarchive.org/index.php?request=view_by_moduleid&query=104280>). > The following up to 64*4*4*63 bytes could even be completely 0 (empty > pattern data) for valid files (even without the file being totally silent). > The generated random data that tools/probetest complains about here contains > 46 bits set to 1 in the first 600 bytes. What follows are various other > examples with up to 96 bits set to 1. Completely loading a file like that > would very likely reject it (in particular, libopenmpt can deduce the > expected file size from the sample headers and, with some slack to account > for corrupted real-world examples, reject files with non-matching size), > however, that is not possible when only probing the file header. > The libopenmpt API allows for the file header probing functions to return > false-positives, however false-negatives are not allowed. > > Performance numbers shown by tools/probetest are what I had expected > (measured on an AMD A4-5000, during normal Xorg session (i.e. not 100% > idle)): > > 1110194971 cycles, aa > 24986722468 cycles, aac > 26418545168 cycles, ac3 > 1484717267 cycles, acm > 1627888281 cycles, act > 2109884646 cycles, adp > 2316235992 cycles, ads > 1244706028 cycles, adx > 1132390431 cycles, aea > 1729241354 cycles, aiff > 1728288238 cycles, aix > 2662531158 cycles, amr > 16189546067 cycles, amrnb > 10342883200 cycles, amrwb > 1487752343 cycles, anm > 2268900502 cycles, apc > 1140814303 cycles, ape > 2181170710 cycles, apng > 18698762054 cycles, aqtitle > 2656908730 cycles, asf > 2402762967 cycles, asf_o > 18148196647 cycles, ass > 1392503829 cycles, ast > 1774264703 cycles, au > 1807159562 cycles, avi > 1745391230 cycles, avr > 1370939762 cycles, avs > 1555620708 cycles, bethsoftvid > 1459171160 cycles, bfi > 2640635742 cycles, bink > 2022320986 cycles, bit > 1664933324 cycles, bfstm > 1588023172 cycles, brstm > 1769430536 cycles, boa > 2294286860 cycles, c93 > 1022646071 cycles, caf > 9063207678 cycles, cavsvideo > 1898790300 cycles, cdxl > 1037718383 cycles, cine > 3358938768 cycles, concat > 2367399953 cycles, dcstr > 1795803759 cycles, dfa > 1454750468 cycles, dirac > 1167905836 cycles, dnxhd > 2076678208 cycles, dsf > 1226761232 cycles, dsicin > 1157816261 cycles, dss > 31466350564 cycles, dts > 1357475606 cycles, dtshd > 15626181281 cycles, dv > 12227021709 cycles, dvbsub > 1747998309 cycles, dvbtxt > 1941371107 cycles, dxa > 1988122049 cycles, ea > 1395161162 cycles, ea_cdata > 21013119067 cycles, eac3 > 1282697126 cycles, epaf > 1658521102 cycles, ffm > 2919787300 cycles, ffmetadata > 3786264941 cycles, fits > 2700385826 cycles, flac > 1840776863 cycles, flic > 1317695853 cycles, flv > 1511756606 cycles, live_flv > 1135064427 cycles, 4xm > 1830871233 cycles, frm > 3011403748 cycles, fsb > 1462985803 cycles, gdv > 1708440935 cycles, genh > 3480430744 cycles, gif > 2533542048 cycles, gsm > 2412598563 cycles, gxf > 21637989787 cycles, h261 > 22268834035 cycles, h263 > 22135718754 cycles, h264 > 13939886275 cycles, hevc > 1979375582 cycles, hls,applehttp > 1658646375 cycles, hnm > 1507634977 cycles, ico > 2534774499 cycles, idcin > 1684324336 cycles, idf > 1353664382 cycles, iff > 2978779893 cycles, ilbc > 1892353081 cycles, alias_pix > 2456259645 cycles, brender_pix > 2077466815 cycles, ingenient > 11281657144 cycles, ipmovie > 1840789384 cycles, ircam > 2455541614 cycles, iss > 1114518907 cycles, iv8 > 1750327098 cycles, ivf > 3803895407 cycles, ivr > 30510491919 cycles, jacosub > 1271391143 cycles, jv > 1504674165 cycles, lmlm4 > 28284647311 cycles, loas > 2746771768 cycles, lrc > 1630546444 cycles, lvf > 2198871369 cycles, lxf > 15210250791 cycles, m4v > 2074024051 cycles, matroska,webm > 1756348463 cycles, mgsts > 13894318111 cycles, microdvd > 15146276963 cycles, mjpeg > 13215378411 cycles, mjpeg_2000 > 21505153187 cycles, mlp > 1623684275 cycles, mlv > 2009009898 cycles, mm > 1401453493 cycles, mmf > 3614852044 cycles, mov,mp4,m4a,3gp,3g2,mj2 > 37065167696 cycles, mp3 > 2003306237 cycles, mpc > 1695842377 cycles, mpc8 > 20922947044 cycles, mpeg > 26950626806 cycles, mpegts > 12903395151 cycles, mpegvideo > 1861191163 cycles, mpjpeg > 11292546869 cycles, mpl2 > 10904909514 cycles, mpsub > 2556705558 cycles, msf > 14520727615 cycles, msnwctcp > 1513345014 cycles, mtaf > 1498181103 cycles, mtv > 2100567692 cycles, musx > 1398481833 cycles, mv > 3839928046 cycles, mxf > 1084340183 cycles, nc > 2260039804 cycles, nistsphere > 1557302811 cycles, nsp > 14077588650 cycles, nsv > 12804865958 cycles, nut > 3498085105 cycles, nuv > 2785399093 cycles, ogg > 2800628120 cycles, oma > 2241873172 cycles, paf > 11630567717 cycles, pjs > 1538360044 cycles, pmp > 1966776985 cycles, pva > 2051297210 cycles, pvf > 1464824135 cycles, qcp > 1395151376 cycles, r3d > 13872717447 cycles, realtext > 1648061451 cycles, redspark > 1881530375 cycles, rl2 > 1865198787 cycles, rm > 1848791502 cycles, roq > 3141932957 cycles, rpl > 2379252069 cycles, rsd > 31146518791 cycles, s337m > 7497815228 cycles, sami > 24830800138 cycles, sbg > 15351196732 cycles, scc > 9758760073 cycles, sdp > 2159674057 cycles, sdr2 > 1555316250 cycles, sds > 1533405328 cycles, sdx > 1681270049 cycles, film_cpk > 2303851902 cycles, shn > 1761647489 cycles, siff > 1510520120 cycles, smk > 2859907925 cycles, smjpeg > 1643498999 cycles, smush > 1545689291 cycles, sol > 1912740702 cycles, sox > 17486361594 cycles, spdif > 20080502425 cycles, srt > 2659637846 cycles, psxstr > 17633213722 cycles, stl > 8032855323 cycles, subviewer1 > 8572013351 cycles, subviewer > 2043897951 cycles, sup > 2980746200 cycles, svag > 1617398584 cycles, swf > 2842115745 cycles, tak > 5320163051 cycles, tedcaptions > 1884107745 cycles, thp > 4320119922 cycles, 3dostr > 2018755118 cycles, tiertexseq > 1714617022 cycles, tmv > 21456317423 cycles, truehd > 1050826275 cycles, tta > 2065773077 cycles, txd > 1577829281 cycles, ty > 3450802460 cycles, vag > 19179500628 cycles, vc1 > 1860036853 cycles, vc1test > 2035593194 cycles, vivo > 1518758455 cycles, vmd > 2696860615 cycles, vobsub > 2762235280 cycles, voc > 1957794567 cycles, vpk > 15280000639 cycles, vplayer > 1763355055 cycles, vqf > 1879310121 cycles, w64 > 1717961542 cycles, wav > 2095837026 cycles, wc3movie > 2960188092 cycles, webvtt > 1922356839 cycles, wsaud > 1978715237 cycles, wsd > 1468438585 cycles, wsvqa > 2668937770 cycles, wtv > 3193222838 cycles, wve > 1744694735 cycles, wv > 1677278541 cycles, xa > 1759862474 cycles, xbin > 2077217647 cycles, xmv > 2161496331 cycles, xvag > 2330794326 cycles, xwma > 1103137131 cycles, yop > 2154690280 cycles, yuv4mpegpipe > 1842301899 cycles, bmp_pipe > 2039875920 cycles, dds_pipe > 1627504710 cycles, dpx_pipe > 1463019740 cycles, exr_pipe > 1539585051 cycles, j2k_pipe > 1187861714 cycles, jpeg_pipe > 1682815484 cycles, jpegls_pipe > 1840465166 cycles, pam_pipe > 1755858395 cycles, pbm_pipe > 1211589601 cycles, pcx_pipe > 2002446954 cycles, pgmyuv_pipe > 1818965412 cycles, pgm_pipe > 1654095834 cycles, pictor_pipe > 1404252441 cycles, png_pipe > 1211120882 cycles, ppm_pipe > 1205883539 cycles, psd_pipe > 1764091290 cycles, qdraw_pipe > 1091809273 cycles, sgi_pipe > 2994663150 cycles, svg_pipe > 1348938514 cycles, sunrast_pipe > 1464347337 cycles, tiff_pipe > 1142572756 cycles, webp_pipe > 1412715104 cycles, xpm_pipe > 3550700989 cycles, libmodplug > 109589637233 cycles, libopenmpt This sadly may not be acceptable, others may want to comment. > 2672917981 libopenmpt (per module format) > > At first glance, libopenmpt looks huge here in comparison. However one > should consider that libopenmpt internally has to probe for (currently) 41 > different module file formats, going through 41 separate probing functions > internally. > > Dividing 109589637233 by 41 gives 2672917981, which is in the ballpark of > all other probing functions in ffmpeg. > >>> +#if OPENMPT_API_VERSION_AT_LEAST(0,3,0) >>> + if (p->buf && p->buf_size > 0) { >>> + probe_result = openmpt_probe_file_header_without_filesize( >>> + OPENMPT_PROBE_FILE_HEADER_FLAGS_DEFAULT, >>> + p->buf, p->buf_size, >>> + &openmpt_logfunc, NULL, NULL, NULL, NULL, >>> NULL); >>> + if (probe_result == OPENMPT_PROBE_FILE_HEADER_RESULT_FAILURE) { >>> + score = score_fail; >> >> >> What's wrong with return 0;? > > > Nothing. If preferred, I can get rid of all score_* constants and use 0 or > AVPROBE_SCORE_* directly. That sounds like a very good idea to me but please wait for others to comment. >>> + } else if (probe_result == >>> OPENMPT_PROBE_FILE_HEADER_RESULT_SUCCESS) { >>> + score = FFMAX(score, score_data); >> >> >> What does OPENMPT_PROBE_FILE_HEADER_RESULT_SUCCESS mean? > > > It is documented as "OPENMPT_PROBE_FILE_HEADER_RESULT_SUCCESS: The file will > most likely be supported by libopenmpt." (see > <https://lib.openmpt.org/doc/group__libopenmpt__c.html#ga92cdc66eb529a8a4a67987b659ed3c5e>). > An ultimately precise answer is never possible as that would require > actually trying to load the complete file in some cases: > * Not all module file formats store feature flags in the file header. > * Some module file formats provide very little file magic numbers, and/or > file magic numbers at strange offsets (like at 1080 for M.K. .MOD). > * Some formats store header-like information in the file footer, which is > not accessible during probing. > * The extreme case of M15 (original 15 samples Amiga .MOD files) provides > absolutely no true file header or magic numbers. libopenmpt implements > heuristics to reliably identify and probe even those, however there is only > so much it can do. > * Some container formats (Unreal Music .UMX, which can contain module music > files) theoretically potentially require seeking to arbitrary locations in > the file in order to determine the format. > >> Why not return MAX? > > For all the reasons listed above, even though libopenmpt tries to be as > pessimistic as possible, false positives fundamentally cannot be avoided > completely. As the libopenmpt probing logic is code outside of ffmpeg, the > effects of such a false positive could potentially cause mis-detection of > other formats supported by ffmpeg, which would not be immediately or easily > fixable by ffmpeg itself. I used the lowest possible score that makes sense > in order to reduce the risk of potential impact. > The probing result in this case is deduced from looking at the actual file > data, as opposed to just trusting a mime-type which is external to the file > and could be inconsistent/wrong, which is why I used a score higher than > AVPROBE_SCORE_MIME. > I opted for AVPROBE_SCORE_MIME+1, which seemed reasonable to me. > Should I add a comment explaining the reasoning to the code? > >>> + } else if (probe_result == >>> OPENMPT_PROBE_FILE_HEADER_RESULT_WANTMOREDATA) { >> >> > I believe this should return 0 but maybe you found that this is bad? > > > Would 0 be semantically right here? > OPENMPT_PROBE_FILE_HEADER_RESULT_WANTMOREDATA means that libopenmpt requires > more data to come to any usable conclusion, which is what I thought > AVPROBE_SCORE_RETRY would mean. > I do not see any particular problem with returning 0 in this case either, > given the probing logic in av_probe_input_format() (and it would reduce the > whole probe_result == OPENMPT_PROBE_FILE_HEADER_RESULT_WANTMOREDATA block to > a single line). However, if client code directly calls .read_probe() on > AVInputFormat ff_libopenmpt_demuxer, I think returning AVPROBE_SCORE_RETRY > (or similar) makes more sense. I agree. >>> + if (score > score_fail) { >>> + /* known file extension */ >>> + score = FFMAX(score, score_ext_retry); >>> + } else { >>> + /* unknown file extension */ >>> + if (p->buf_size >= >>> openmpt_probe_file_header_get_recommended_size()) { >>> + /* We have already received the recommended amount >>> of data >>> + * and still cannot decide. Return a rather low >>> score. >>> + */ >>> + score = FFMAX(score, score_retry); >>> + } else { >>> + /* The file extension is unknown and we have very >>> few data >>> + * bytes available. libopenmpt cannot decide >>> anything here, >>> + * and returning any score > 0 would result in >>> successfull >>> + * probing of random data. >>> + */ >>> + score = score_fail; >> >> This patch indicates that it may be a good idea to require libopenmpt 0.3, > > The amount of #ifdef needed to support 0.2 and 0.3 is rather small, I think. > > I understand that the current (and future libopenmpt 0.2) way of solely > relying on the file extension is far from optimal, but I do not see any > reason to drop libopenmpt 0.2 support right now; in particular, continuing > 0.2 support as is would be no regression. Additionally, libopenmpt 0.2 can > be built with C++03 compilers while libopenmpt 0.3 requires a C++11 > compiler, thus, libopenmpt 0.3 cannot easily be built on older platforms. > > libopenmpt 0.2 also allows for file probing, however the API and code path > is very heavy-weight (goes through the normal file loader and discards > unneeded data), which I fear would be way too heavy performance-wise for > ffmpeg. > >> when was it released, which distributions do not include it? > > The first version of libopenmpt 0.3 was released 2017-09-28. > > I am not aware of any stable, non-rolling distribution that ships libopenmpt > 0.3 as of now. Then supporting only 0.3 is currently not a good option. Carl Eugen _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel