Hi there! As part of my work on dvbcut2, I've developed a new algorithm for file format detection. The basic idea is that the beginning of a file - say, a few megabytes at most - is searched for magic byte sequences. But only once. If you scan the file again and again for every file format you support, you'll waste a lot of time - in particular when the list includes not only MPEG PS/TS (as in dvbcut) but also AVI, Matroska, Ogg, M2TS (a variant of TS used on Blu-ray Discs) and possibly others. Besides that, with sequential probing the order in which formats are tried may affect the result.
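To illustrate the "scan only once" idea, here is a minimal Python sketch. It is not the dvbcut2 code (that lives in the attachment); the format table, the names probe, MAGIC and PACKET_SIZE, and the scoring rule are all assumptions made up for this example. It walks the buffer a single time, collects evidence for several formats at each offset, and turns that evidence into a rough score per format.

```python
# Illustrative sketch only: table contents, names and scoring are assumptions,
# not the actual dvbcut2 implementation.

MAGIC = {                               # byte signatures searched for
    "matroska": b"\x1a\x45\xdf\xa3",    # EBML header ID
    "ogg": b"OggS",                     # Ogg page capture pattern
    "mpeg-ps": b"\x00\x00\x01\xba",     # MPEG pack start code
}
PACKET_SIZE = {"mpeg-ts": 188, "m2ts": 192}   # spacing of the 0x47 sync byte


def probe(buf: bytes) -> dict:
    """Scan buf once and return a crude score in [0, 1] for every format."""
    n = len(buf)
    scores = dict.fromkeys([*MAGIC, *PACKET_SIZE], 0.0)
    sync_hits = dict.fromkeys(PACKET_SIZE, 0)

    for i in range(n):                  # the only pass over the data
        for name, magic in MAGIC.items():
            if buf.startswith(magic, i):
                # assumed weighting: a signature at offset 0 counts more
                scores[name] = max(scores[name], 1.0 if i == 0 else 0.5)
        if buf[i] == 0x47:
            for name, size in PACKET_SIZE.items():
                # a sync byte followed by another one a packet later
                if i + size < n and buf[i + size] == 0x47:
                    sync_hits[name] += 1

    for name, size in PACKET_SIZE.items():
        # fraction of the buffer covered by correctly spaced sync bytes
        scores[name] = min(1.0, sync_hits[name] * size / n) if n else 0.0
    return scores


def best_format(buf: bytes):
    """Return the highest-scoring format, or None if nothing matched."""
    scores = probe(buf)
    name = max(scores, key=scores.get)
    return name if scores[name] > 0.0 else None
```

For example, best_format(open(path, "rb").read(1 << 20)) would look at the first megabyte of a file. Because all formats are scored from the same pass, the result does not depend on the order in which they are listed.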
Therefore, my algorithm performs the probing in parallel and calculates a probability for every format. The one with the highest probability wins. This can be done with containers or raw media streams, e.g., MP3 files - but the algorithm also "knows" the rest of MPEG-1/2 audio, MPEG-4 AAC, MPEG-1/2 video, MPEG-4 AVC aka H.264, AC-3, DTS/DCA, Dirac, FLAC, Meridian Lossless (MLP) and Dolby TrueHD right now, and I'm working on LPCM, Vorbis and subtitle support. LPCM may turn out to be impossible, though, because there are no magic bytes in it. You might have to perform a Fourier analysis or something like that - if it's got a noise-like spectrum, it's probably not LPCM.

Anyway, I've packed the necessary bits and bytes into the attachment and I would like to ask you to test them on any file you can get your hands on. I'm particularly interested in all misdetected and undetected files. I want the detection to work as flawlessly as possible. Oh, and if you think an important format is missing, don't hesitate to tell me.

To compile the stuff:

    # save attachment to testing.tar.gz
    gunzip testing.tar.gz
    tar xvf testing.tar
    cd testing
    make

Afterwards, you can run

    ./container <filenames>

(for container files) or

    ./media <filenames>

(for all others).

Theoretically, the code should work on Windows as well (with Cygwin or MinGW) - there's nothing special about it. You may have to remove some or even all lines from config.h to make it compile, though. The same may apply to old(er) Linux/Unix systems. If in doubt, just comment out the whole thing.

-- 
Michael "Tired" Riepe <mich...@mr511.de>
X-Tired: Each morning I get up I die a little
testing.tar.gz
Description: GNU Zip compressed data
_______________________________________________ DVBCUT-user mailing list DVBCUT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dvbcut-user