Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Mon, 3 Oct 2016 23:57:32 +0200 u-9...@aetey.se wrote: please refrain from off topic posts on this mailing list. bug reports go to trac or ffmpeg-user list. see http://ffmpeg.org/bugreports.html "Please do not report your problem on the developer mailing list:" > You are welcome to learn why the things are done in a certain way. > A good place to start might be the musl mailing list. after reading this thread, the thing i realize now is that our developers would rather do a line by line audit of another libc software project than an audit of ffmpeg code :D -compn ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Mon, Oct 03, 2016 at 09:48:47PM +0200, Nicolas George wrote: > > As bright as the people here are, they land in a foreign area, which > > accidentally leads to statements like the above. > > I am sorry, but an appeal to authority will not cut it in front of real > arguments. [Everyone, sorry for offtopic] It was not an appeal to authority but a warning, trying to reduce the number of nice and intelligent people making fools of themselves. > The real argument here is: musl makes several assumptions about the > architecture and compiler (for example: that floats and ints have the same You should have noticed that it is actually generated per architecture and has requirements on compilers. > endianness) that are not mandated by any standard. And although these Hmm isn't there such a thing called the CPU specification? > assumptions are very reasonable and widely respected, there will eventually > be an architecture or, more probably, a compiler, that will break them. You mean if you _choose_ to build it for a wrong architecture or by a wrong compiler? Then yes. :) But not otherwise. When built properly it will fully cooperate with applications which conform to standards. Any of them. Unfortunately, this apparently can not be said about ffmpeg. Replacing one standard-conforming library with another one can apparently lead to a crash. > FFmpeg does exactly the same. Alas, not really. > In this instance, we know the reason for FFmpeg: resetting the FPU state is > expensive, and this is speed-critical code. Please tell us what are musl's > reasons for using this ugly hack. If the answer is speed, I would very much Be careful before calling any code "ugly". May be you just do not understand it. And concerning why it looks this way - this is _irrelevant_ here. Don't try to blame the error of the old vp3 code on musl. > like to see a benchmark comparing this implementation to a portable but > optimized log2, especially for x86_32, where int <-> float conversions are > very expensive. You are welcome to learn why the things are done in a certain way. A good place to start might be the musl mailing list. Yours, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Mon, Oct 03, 2016 at 04:17:00PM -0400, Ronald S. Bultje wrote: > Hi, > > On Mon, Oct 3, 2016 at 4:11 PM, Clément Bœschwrote: > > > On Mon, Oct 03, 2016 at 09:48:47PM +0200, Nicolas George wrote: > > > Le duodi 12 vendémiaire, an CCXXV, u-9...@aetey.se a écrit : > > > > The author of the referred code acts in his actual area of competence > > > > (C libraries and standards). > > > > > > > > The comments here on the C library code and standard compliance come > > from > > > > developers having a different competence area (multimedia programming). > > > > > > > > As bright as the people here are, they land in a foreign area, which > > > > accidentally leads to statements like the above. > > > > > > I am sorry, but an appeal to authority will not cut it in front of real > > > arguments. > > > > > > The real argument here is: musl makes several assumptions about the > > > architecture and compiler (for example: that floats and ints have the > > same > > > endianness) that are not mandated by any standard. And although these > > > assumptions are very reasonable and widely respected, there will > > eventually > > > be an architecture or, more probably, a compiler, that will break them. > > > > > > FFmpeg does exactly the same. > > > > > > > TBH, having a set of supported architectures in a libc is much more legit > > than a set of supported libc in a multimedia framework IMO. > > > > We can mess as much as we want within our codebase wrt ABI violations, but > > I don't think it's reasonable to make random assumptions about libc (and > > others external libraries) implementations. It's way too risky. Even if we > > are able today to change musl implementation, seeing random floats usage > > in the many allocators out there doesn't look that surprising. > > > > Actually, if I was a valgrind developers, maybe I would do that on > > purpose... > > > > [...] > > > In this instance, we know the reason for FFmpeg: resetting the FPU state > > is > > > expensive, and this is speed-critical code. > > > > We have mallocs in inner loops of speed-critical? Really? > > > What code is speed critical? In an audio decoder (like a voice decoder) > where a decode call plays as many as a few tens of audio samples at best, I > would say that anything is speed critical, because once-per-frame isn't > that far off from once-per-sample in that case. And then you insert an audio filter using floats and you're doomed? > The bigger problem is that we're talking about malloc (actually, free, > according to cehoyos) here. But this could be any external/libc function. > We need emms before any libc call. That's ... Not just mallocs. Well currently the issue is definitely in the alloc/free, so maybe we should stick to that problem for now? BTW, just another allocator using floats: https://github.com/jemalloc/jemalloc/blob/dev/src/prof.c#L850 -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Mon, Oct 3, 2016 at 4:11 PM, Clément Bœschwrote: > On Mon, Oct 03, 2016 at 09:48:47PM +0200, Nicolas George wrote: > > Le duodi 12 vendémiaire, an CCXXV, u-9...@aetey.se a écrit : > > > The author of the referred code acts in his actual area of competence > > > (C libraries and standards). > > > > > > The comments here on the C library code and standard compliance come > from > > > developers having a different competence area (multimedia programming). > > > > > > As bright as the people here are, they land in a foreign area, which > > > accidentally leads to statements like the above. > > > > I am sorry, but an appeal to authority will not cut it in front of real > > arguments. > > > > The real argument here is: musl makes several assumptions about the > > architecture and compiler (for example: that floats and ints have the > same > > endianness) that are not mandated by any standard. And although these > > assumptions are very reasonable and widely respected, there will > eventually > > be an architecture or, more probably, a compiler, that will break them. > > > > FFmpeg does exactly the same. > > > > TBH, having a set of supported architectures in a libc is much more legit > than a set of supported libc in a multimedia framework IMO. > > We can mess as much as we want within our codebase wrt ABI violations, but > I don't think it's reasonable to make random assumptions about libc (and > others external libraries) implementations. It's way too risky. Even if we > are able today to change musl implementation, seeing random floats usage > in the many allocators out there doesn't look that surprising. > > Actually, if I was a valgrind developers, maybe I would do that on > purpose... > > [...] > > In this instance, we know the reason for FFmpeg: resetting the FPU state > is > > expensive, and this is speed-critical code. > > We have mallocs in inner loops of speed-critical? Really? What code is speed critical? In an audio decoder (like a voice decoder) where a decode call plays as many as a few tens of audio samples at best, I would say that anything is speed critical, because once-per-frame isn't that far off from once-per-sample in that case. The bigger problem is that we're talking about malloc (actually, free, according to cehoyos) here. But this could be any external/libc function. We need emms before any libc call. That's ... Not just mallocs. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Mon, Oct 03, 2016 at 09:48:47PM +0200, Nicolas George wrote: > Le duodi 12 vendémiaire, an CCXXV, u-9...@aetey.se a écrit : > > The author of the referred code acts in his actual area of competence > > (C libraries and standards). > > > > The comments here on the C library code and standard compliance come from > > developers having a different competence area (multimedia programming). > > > > As bright as the people here are, they land in a foreign area, which > > accidentally leads to statements like the above. > > I am sorry, but an appeal to authority will not cut it in front of real > arguments. > > The real argument here is: musl makes several assumptions about the > architecture and compiler (for example: that floats and ints have the same > endianness) that are not mandated by any standard. And although these > assumptions are very reasonable and widely respected, there will eventually > be an architecture or, more probably, a compiler, that will break them. > > FFmpeg does exactly the same. > TBH, having a set of supported architectures in a libc is much more legit than a set of supported libc in a multimedia framework IMO. We can mess as much as we want within our codebase wrt ABI violations, but I don't think it's reasonable to make random assumptions about libc (and others external libraries) implementations. It's way too risky. Even if we are able today to change musl implementation, seeing random floats usage in the many allocators out there doesn't look that surprising. Actually, if I was a valgrind developers, maybe I would do that on purpose... [...] > In this instance, we know the reason for FFmpeg: resetting the FPU state is > expensive, and this is speed-critical code. We have mallocs in inner loops of speed-critical? Really? [...] -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Le duodi 12 vendémiaire, an CCXXV, u-9...@aetey.se a écrit : > The author of the referred code acts in his actual area of competence > (C libraries and standards). > > The comments here on the C library code and standard compliance come from > developers having a different competence area (multimedia programming). > > As bright as the people here are, they land in a foreign area, which > accidentally leads to statements like the above. I am sorry, but an appeal to authority will not cut it in front of real arguments. The real argument here is: musl makes several assumptions about the architecture and compiler (for example: that floats and ints have the same endianness) that are not mandated by any standard. And although these assumptions are very reasonable and widely respected, there will eventually be an architecture or, more probably, a compiler, that will break them. FFmpeg does exactly the same. Now, when these assumptions break, who is to blame? One could say that, by principle, whoever made an assumption not guaranteed by a standard is guilty, but this is a simplistic approach. More reasonably, the one to blame is the one who has the worse reason for making the assumption or breaking it. In this instance, we know the reason for FFmpeg: resetting the FPU state is expensive, and this is speed-critical code. Please tell us what are musl's reasons for using this ugly hack. If the answer is speed, I would very much like to see a benchmark comparing this implementation to a portable but optimized log2, especially for x86_32, where int <-> float conversions are very expensive. Regards, -- Nicolas George signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Mon, 3 Oct 2016 12:45:13 +0200 u-9...@aetey.se wrote: > > > http://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c#n114 > > > > Urgh. This is even worse than I imagined. FFmpeg is using undefined > > behaviours by calling it without resetting the state, but this is > > also completely undefined behaviour on their side. > > I feel a duty to remind, in the most positive and friendly tone: > > The author of the referred code acts in his actual area of competence > (C libraries and standards). > > The comments here on the C library code and standard compliance come > from developers having a different competence area (multimedia > programming). > > As bright as the people here are, they land in a foreign area, which > accidentally leads to statements like the above. ehe, well.. do your homework before assuming things. ;) rich felker , who wrote musl, was an mplayer and ffmpeg? developer actually. he wrote musl because everyone hated glibc. i contacted both him and even mike melanson (vp3 decoder author) mikes reply: >* I didn't write any VP3 MMX (or other SIMD), so I can't be of much > immediate help, even if I did have perfect recall of the code. > However, I invite you to run 'git blame' on the vp3 code and see how > many of my non-comment lines still persist. rich felker musl reply: > [23:46] yes they're violating the x86 abi > [23:46] at function call time the x87 floating point stack > has to be empty (and thus in x87 mode, not mmx mode) > [23:47] you can't call external functions while in mmx state i dont doubt its a bug in vp3 that needs to be rewritten. but we also go to extreme lengths to blame others... ;D musl could handle it better sure, but should they really write a full c-checker into their lib? would make it as bloated as glibc! -compn ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sun, Oct 2, 2016 at 11:04 PM, compnwrote: > On Sun, 2 Oct 2016 08:29:22 -0400 > "Ronald S. Bultje" wrote: > > > I also think we could contact musl developers and see what's going on > > there. We certainly shouldn't blindly fix this bug by adding an emms > > in a random place, to me that's like opening pandora's box. > > on irc i pointed dalias to this thread. > > or you can just ask on #musl in freenode. Thanks, that's very helpful, I'll chat with them today. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
> > http://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c#n114 > > Urgh. This is even worse than I imagined. FFmpeg is using undefined > behaviours by calling it without resetting the state, but this is also > completely undefined behaviour on their side. I feel a duty to remind, in the most positive and friendly tone: The author of the referred code acts in his actual area of competence (C libraries and standards). The comments here on the C library code and standard compliance come from developers having a different competence area (multimedia programming). As bright as the people here are, they land in a foreign area, which accidentally leads to statements like the above. Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Le duodi 12 vendémiaire, an CCXXV, Carl Eugen Hoyos a écrit : > I suspect I found the responsible code: > http://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c#n114 Urgh. This is even worse than I imagined. FFmpeg is using undefined behaviours by calling it without resetting the state, but this is also completely undefined behaviour on their side. Regards, -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sun, Oct 02, 2016 at 10:59:31PM -0400, compn wrote: > vp3 is a decoder written 10+ years ago by a dev who is no longer > active in ffmpeg. > > we have many decoders and encoders (and other code) in ffmpeg that have > not been audited (to my knowledge). > > i know this isnt an excuse for not looking at the code. just letting > you know the history. Sure, I know the code is old (it is even documented as such). > please dont think carl's tone is unpleasant, maybe he is just > incredulous. Thanks for this comment. I am well aware of how hard it is to weight emotion-loaded expressions, especially when someone else looks wrong in one's expected area of competence. > carl has tested and verified thousands of bugs for ffmpeg. carl also > fixes some of them. I was quite aware of Carl's qualities, but they are surely nice to mention. > carl would like to reproduce the bug. reproducing the bug with an easy > command line, on a dev computer usually makes the bug finding and fixing > quicker. Sure. This is though (in my very humble opinion) a special kind of bug, where Carl did not have to chase the "offending code" in the C library (it is not the C library which is at fault) but directly look at libavcodec/x86/vp3dsp.asm and the places where it is used. OTOH everything seems to go very well and I am looking forward to some elegant solution, with confidence. > > To give you an example of successful code auditing, the corresponding > > UB-problem in libtheora was properly fixed without anybody at Xiph > > having to install musl. > > That's why I still believe that auditing the code is more useful than This can look misleading, sorry for that. I meant, auditing particular parts of the code, for the actual, identified problem. > > hunting once again the hard-to-pinpoint symptoms of the already known > > cause. > > do you have any suggestions for how the ffmpeg project could do > successful code audits? I would like to, but regrettably I do not have any bright ideas about this "in general". Auditing is hard. To look at the assembler routines and check for a particular UB, whether the FPU state is restored before calling C library, this has a limited scope and is what I suggested. > i really dont see much enthusiasm for line by line code auditing within > ffmpeg. in fact, i'd say people would rather re-write an entirely new > piece of code than try to clean up the old code. this isnt a knock on > anyone or any piece of code, just my observation. I can only agree with every word. Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sun, 2 Oct 2016 08:29:22 -0400 "Ronald S. Bultje"wrote: > I also think we could contact musl developers and see what's going on > there. We certainly shouldn't blindly fix this bug by adding an emms > in a random place, to me that's like opening pandora's box. on irc i pointed dalias to this thread. or you can just ask on #musl in freenode. -compn ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sun, 2 Oct 2016 23:16:02 +0200 u-h...@aetey.se wrote: > > looking at the code? Seriously?) > > Yes, I am serious. vp3 is a decoder written 10+ years ago by a dev who is no longer active in ffmpeg. we have many decoders and encoders (and other code) in ffmpeg that have not been audited (to my knowledge). i know this isnt an excuse for not looking at the code. just letting you know the history. > Frankly, your tone looks unpleasant. > It is a friendly information to make you aware. I guess you were not. please dont think carl's tone is unpleasant, maybe he is just incredulous. carl has tested and verified thousands of bugs for ffmpeg. carl also fixes some of them. carl i think was confused by your comment, because carl would like to reproduce the bug. reproducing the bug with an easy command line, on a dev computer usually makes the bug finding and fixing quicker. > To give you an example of successful code auditing, the corresponding > UB-problem in libtheora was properly fixed without anybody at Xiph > having to install musl. > That's why I still believe that auditing the code is more useful than > hunting once again the hard-to-pinpoint symptoms of the already known > cause. do you have any suggestions for how the ffmpeg project could do successful code audits? we have static code analyzers like coverity running on ffmpeg. there has also been a lot of fuzz testing done. i'm sure some of the more popular decoders like h264 have had code audits done in private. i really dont see much enthusiasm for line by line code auditing within ffmpeg. in fact, i'd say people would rather re-write an entirely new piece of code than try to clean up the old code. this isnt a knock on anyone or any piece of code, just my observation. feel free to tell me that i am wrong. -compn ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sun, Oct 2, 2016 at 6:41 PM, Carl Eugen Hoyoswrote: > 2016-10-02 14:29 GMT+02:00 Ronald S. Bultje : > > I also think we could contact musl developers and see what's > > going on there. > > I suspect I found the responsible code: > http://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c#n114 So it's like a hashing function? (Basically.) Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
2016-10-02 14:29 GMT+02:00 Ronald S. Bultje: > I also think we could contact musl developers and see what's > going on there. I suspect I found the responsible code: http://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c#n114 Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sun, Oct 2, 2016 at 4:24 PM, Carl Eugen Hoyoswrote: > 2016-10-02 16:01 GMT+02:00 Carl Eugen Hoyos : > > > PS: There are several unrelated build issues. > > The build issues were just seen here because I > had to add "-melf_i386" to the musl spec file. > After adding it, building works perfectly smooth. > > I tested with --enable-debug for musl and still get > no backtrace, valgrind cannot help. > > No issue on x86_64: The musl spec file works > fine and fate passes with asm optimizations. Floating point operations on x86-64 (which is by definition >= SSE2) are typically done using scalar SSE instructions, not using x87. Therefore, if you don't clean up the floating point state, things will usually still work fine. On x86-32, you can't do this because you can't be sure that the CPU supports SSE. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hello Carl Eugen, On Sun, Oct 02, 2016 at 04:01:21PM +0200, Carl Eugen Hoyos wrote: > > I do not expect that the ffmpeg developers take the trouble > > to install an extra build environment with a different libc, > > (If I could just understand this sentence: Is it meant as > an insult? How do you expect the issue to be fixed? By > looking at the code? Seriously?) It is unfortunate if you took this as an insult, it was not. > looking at the code? Seriously?) Yes, I am serious. Frankly, your tone looks unpleasant. It is a friendly information to make you aware. I guess you were not. To give you an example of successful code auditing, the corresponding UB-problem in libtheora was properly fixed without anybody at Xiph having to install musl. With all respect to your prefered workflow, it is not the only possible one and in certain situations may be not the most optimal. > I can confirm that there is an issue with current musl, This confirmation is certainly valuable per se. On the other hand, the underlying problem is not the interaction with musl, but the reliance of the ffmpeg code on certain UBs. That's why I still believe that auditing the code is more useful than hunting once again the hard-to-pinpoint symptoms of the already known cause. For the sake of clarity: I do not mean any part of this message as an insult. It is great that you care about ffmpeg, so do I. Best regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
2016-10-02 16:01 GMT+02:00 Carl Eugen Hoyos: > PS: There are several unrelated build issues. The build issues were just seen here because I had to add "-melf_i386" to the musl spec file. After adding it, building works perfectly smooth. I tested with --enable-debug for musl and still get no backtrace, valgrind cannot help. No issue on x86_64: The musl spec file works fine and fate passes with asm optimizations. Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
2016-10-02 9:27 GMT+02:00: > I do not expect that the ffmpeg developers take the trouble > to install an extra build environment with a different libc, (If I could just understand this sentence: Is it meant as an insult? How do you expect the issue to be fixed? By looking at the code? Seriously?) I can confirm that there is an issue with current musl, it is (for example) reproducible by calling libavutil/tests/pixelutils which crashes every time and I can confirm that adding an emms() call to av_free() fixes the issue. Carl Eugen PS: There are several unrelated build issues. Either they only happen here (which I consider unlikely) or there is an - unknown - reason why the OP decided not to tell us about those. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sun, Oct 2, 2016 at 2:17 AM,wrote: > Hello Henrik, > > On Sun, Oct 02, 2016 at 01:18:29AM +0200, Henrik Gramner wrote: > > Ensuring that emms is issued before every single libc function call is > > likely problematic. > > > > What if we simply document the requirement that C standard library > > functions are assumed to not modify the x87 FPU state unless > > specifically designated to handle floating-point numbers? > > Is it remarkably expensive to reset the FPU state? Yes. > (I am unsure > also how much of the assembler code depends on clobbering it?) > If yes, then such documentation would be a reasonable approach. I would advocate for this approach. I also think we could contact musl developers and see what's going on there. We certainly shouldn't blindly fix this bug by adding an emms in a random place, to me that's like opening pandora's box. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sun, 2 Oct 2016 02:25:30 +0200 Michael Niedermayerwrote: > On Sun, Oct 02, 2016 at 01:18:29AM +0200, Henrik Gramner wrote: > > Ensuring that emms is issued before every single libc function call is > > likely problematic. > > maybe, maybe not, iam not sure but > calling emms between init/de/reinint and optimized inner loops should > be doable, we generally shouldnt be doing malloc in highly optimized > loops as factorizing allocation out and reusing buffers is likely > a better choice +1 for not leaving the C environment and the FPU in a broken state. > beyond malloc() what else is there ? > > string functions ? > these are in fact not unlikely to use SIMD of some sort > same for memcpy/move() > > also there are callbacks like av_log() we should document any > requirements that apply to them or ensure no such requirements exist > > exact backtraces of where issues occur would be interesting to better > understand how much code is affected by this > > > > > > What if we simply document the requirement that C standard library > > functions are assumed to not modify the x87 FPU state unless > > specifically designated to handle floating-point numbers? > > thats like saying that we require undefined behavior to be defined > in a specific way. We can do this but thats like saying we support > only a subset of POSIX platforms and that subset could shrink at > any time if implementations change > > I think our first choice should be to comply to specs where its > needed in practice and doable. > when its not useful in practice and a total unreadable mess if done > i would tend suggest to ignore specs. > > If all else fails we could add a emms call behind #if in > av_malloc() and detect affected libcs but thats IMHO a ugly hack > but better than declaring "non support" > Yes, that would be a very ugly hack that should be avoided. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Thanks Michael, On Sun, Oct 02, 2016 at 02:25:30AM +0200, Michael Niedermayer wrote: > beyond malloc() what else is there ? > > string functions ? > these are in fact not unlikely to use SIMD of some sort > same for memcpy/move() > > also there are callbacks like av_log() we should document any > requirements that apply to them or ensure no such requirements exist > > exact backtraces of where issues occur would be interesting to better > understand how much code is affected by this Regarding backtraces of the ill effects, I made an effort when I earlier hit the same issue with libtheora (the fpu corruption seemed to affect consistency in picking malloc buckets). The noticeable problems happened with a delay, eventually causing accesses to quite random places in the address space, often without any troubles for a while. Hideous and out of the application code, it is the libc which must be carefully traced. As for how much of ffmpeg code is affected, I can't tell. The vp3/theora decoder is certainly in this group but there may be more. The exact place where the corruption happens is hard to point out, this depends too much on the particular situation. > > What if we simply document the requirement that C standard library > > functions are assumed to not modify the x87 FPU state unless > > specifically designated to handle floating-point numbers? > > thats like saying that we require undefined behavior to be defined > in a specific way. We can do this but thats like saying we support > only a subset of POSIX platforms and that subset could shrink at > any time if implementations change Exactly. > If all else fails we could add a emms call behind #if in > av_malloc() and detect affected libcs but thats IMHO a ugly hack > but better than declaring "non support" This would help to somewhat boost performance on those libraries and versions which e.g. are known not to touch fpu, but I wonder if this gain is worth the trouble. Thanks again. I hope you find a reasonable solution, without undue complication of the code and preserving the performance. Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hello Carl Eugen, On Sun, Oct 02, 2016 at 03:05:13AM +0200, Carl Eugen Hoyos wrote: > 2016-10-01 19:48 GMT+02:00: > > I do not expect the ffmpeg developers to try to reproduce this. > > (Would you mind explaining this sentence to me? I > do not understand it.) I do not expect that the ffmpeg developers take the trouble to install an extra build environment with a different libc, just to confirm a fact which is pretty evident when looking at the code. > > I expect somebody to look at the code (beginning with the > > MMX assembler in the vp3 decoder), which is a lower > > threshold than building a new libc. > > Either you know a lot of things that nobody on this list > knows or you have very little clue about FFmpeg > development. It is safe to assume that the knowledge varies a lot :) and indeed it has been quite some time since i added code to ffmpeg. But discussing personal qualities is not the topic of this thread. > The expected route if trac is down (but also if > you don't want to use it) is to send an email to > the user mailing list. This discussion is about ffmpeg internal details, not about its usage, so why go to the wrong forum? > > ffmpeg -i test.640x480.19seconds.theora.ogg > > -c:v libtheora -y test.out.ogg > > Missing console output and backtrace, please do > not use external libraries unless they are necessary > to reproduce the issue. I am aware of the general rules and why/when they make sense. Is there anybody on this lilst who builds ffmpeg against musl? If there were, the problem would be noticed and hopefully fixed. If there is none, nobody here will be able to make sense of my console output. I am contributing the result of the analysis: the assumptions in the mmx-related assembler code in ffmpeg are not fulfilled in a real world scenario and they do not conform to what standards say You do not have to reproduce my builds to confirm or deny this. Some replies have apparently confirmed the situation. The rest is up to the maintainers of that code. Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sat, Oct 01, 2016 at 06:25:29PM -0400, Ronald S. Bultje wrote: > > We are not talking about what you and me "would like inside a standard > > library" but whether we can rely on the specified interfaces. The > > malloc() interface was apparently misunderstood as a promise to never > > do float. There is no such promise. > > Can malloc call sleep() then? "Can malloc do XXX?" If the standard (as of https://en.wikipedia.org/wiki/C_standard_library) does not prohibit this, then certainly yes. You do not have any time guarantees on malloc(), if this is what you meant. Other implementation details aside, a call to a general purpose malloc() on a Unix-like system regularly results in a system call and then your process can pretty well be put to sleep for unspecified time, by the kernel. Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
2016-10-01 19:48 GMT+02:00: > I do not expect the ffmpeg developers to try to reproduce this. (Would you mind explaining this sentence to me? I do not understand it.) > I expect somebody to look at the code (beginning with the > MMX assembler in the vp3 decoder), which is a lower > threshold than building a new libc. Either you know a lot of things that nobody on this list knows or you have very little clue about FFmpeg development. >> > P.S. an attempt to report via trac did not succeed, >> > that's why posting to the list >> >> What was the issue there? > > Let's take one issue at a time [*] Yes, please tell us one of the issues with trac at a time (I successfully opened a ticket yesterday). > I mentioned this only as my effort to follow the > expected bug reporting route. The expected route if trac is down (but also if you don't want to use it) is to send an email to the user mailing list. > ffmpeg -i test.640x480.19seconds.theora.ogg > -c:v libtheora -y test.out.ogg Missing console output and backtrace, please do not use external libraries unless they are necessary to reproduce the issue. Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sun, Oct 02, 2016 at 01:18:29AM +0200, Henrik Gramner wrote: > Ensuring that emms is issued before every single libc function call is > likely problematic. maybe, maybe not, iam not sure but calling emms between init/de/reinint and optimized inner loops should be doable, we generally shouldnt be doing malloc in highly optimized loops as factorizing allocation out and reusing buffers is likely a better choice beyond malloc() what else is there ? string functions ? these are in fact not unlikely to use SIMD of some sort same for memcpy/move() also there are callbacks like av_log() we should document any requirements that apply to them or ensure no such requirements exist exact backtraces of where issues occur would be interesting to better understand how much code is affected by this > > What if we simply document the requirement that C standard library > functions are assumed to not modify the x87 FPU state unless > specifically designated to handle floating-point numbers? thats like saying that we require undefined behavior to be defined in a specific way. We can do this but thats like saying we support only a subset of POSIX platforms and that subset could shrink at any time if implementations change I think our first choice should be to comply to specs where its needed in practice and doable. when its not useful in practice and a total unreadable mess if done i would tend suggest to ignore specs. If all else fails we could add a emms call behind #if in av_malloc() and detect affected libcs but thats IMHO a ugly hack but better than declaring "non support" -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Republics decline into democracies and democracies degenerate into despotisms. -- Aristotle signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Ensuring that emms is issued before every single libc function call is likely problematic. What if we simply document the requirement that C standard library functions are assumed to not modify the x87 FPU state unless specifically designated to handle floating-point numbers? ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sat, Oct 1, 2016 at 4:12 PM,wrote: > Hello Ronald, > > On Sat, Oct 01, 2016 at 03:11:38PM -0400, Ronald S. Bultje wrote: > > I'm not sure we want to go down the rabbit hole of allowing any code to > be > > executed inside frame en/decoding routines. Historically, IIRC, we have > > disallowed float in user callbacks like request_frame also. > > > > Why does musl malloc require float? This is a real question. I can't > think > > of any reason why you'd want this. > > Taking the question for what it is, I have already answered - because > it make sence for the C library implementation and functioning. > > We are not talking about what you and me "would like inside a standard > library" but whether we can rely on the specified interfaces. The > malloc() interface was apparently misunderstood as a promise to never > do float. There is no such promise. Can malloc call sleep() then? Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hello Ronald, On Sat, Oct 01, 2016 at 03:11:38PM -0400, Ronald S. Bultje wrote: > I'm not sure we want to go down the rabbit hole of allowing any code to be > executed inside frame en/decoding routines. Historically, IIRC, we have > disallowed float in user callbacks like request_frame also. > > Why does musl malloc require float? This is a real question. I can't think > of any reason why you'd want this. Taking the question for what it is, I have already answered - because it make sence for the C library implementation and functioning. We are not talking about what you and me "would like inside a standard library" but whether we can rely on the specified interfaces. The malloc() interface was apparently misunderstood as a promise to never do float. There is no such promise. If ffmpeg depends on the implementation details behind the abstraction of the standard C library, then it is not portable. This would be regrettable. > Historically, IIRC, we have > disallowed float in user callbacks like request_frame also. It is fine to disallow calling any C library routines which can do floats. Then this by definition implies the majority (all?) of the C library routines, because the user has no power over its internals. We can not do anything to this fact, if we do not break the C library abstraction by postulating implementation details or redefining the interfaces with extra non-standard contraints. I would of course instead prefer keeping the assembler code harmless to the FPU and hope this is not overly expensive. If this is a real hit against efficiency (frankly, it should not be _that_ important, mmx is a technology of the past anyway) it might be worth taking up on the musl mailing list and see what its developers say. Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sat, 1 Oct 2016 20:27:44 +0200 u-h...@aetey.se wrote: > If you are curious you may wish to look at its code (it is clean > and readable). Would you really call the piece of code that uses float in malloc.c readable? ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sat, Oct 1, 2016 at 2:27 PM,wrote: > Hello, > > It looks like some general information is due: > > Musl libc is a high quality standard C library for Linux > with emphasis on "quality" and "standard" > http://www.musl-libc.org/ > > On Sat, Oct 01, 2016 at 12:31:05PM -0400, Ronald S. Bultje wrote: > > > This means when malloc()/free()/... happens to be called internally, > > > it corrupts the malloc structures, in a non-straightforward ways. > > > > That's correct. > > > > Why does your malloc implementation use floats? > > Not mine, but musl's. I am not involved in musl development. > > (The tradition of fixed-point malloc was born while floating point was > extremely expensive. It is otherwise a "myth" that memory management > can not have use for floating point operations. Musl malloc() is well > behaving, efficient and robust.) > > If you are curious you may wish to look at its code (it is clean > and readable). > > On Sat, Oct 01, 2016 at 06:38:49PM +0200, Henrik Gramner wrote: > > On Sat, Oct 1, 2016 at 5:37 PM, wrote: > > > musl libc which uses floating point in its malloc() implementation. > > > > That's honestly the real "WTF?" here. > > I hope the explanation above sheds light on the matter. > > > In that case use emms_c() before calling those functions. > > Yes, it is probably what ffmpeg should do and this is the point of my > report. If you mean that I should have submitted a patch, I really > hope somebody else can do the actual fix(es). I'm not sure we want to go down the rabbit hole of allowing any code to be executed inside frame en/decoding routines. Historically, IIRC, we have disallowed float in user callbacks like request_frame also. Why does musl malloc require float? This is a real question. I can't think of any reason why you'd want this. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hello, It looks like some general information is due: Musl libc is a high quality standard C library for Linux with emphasis on "quality" and "standard" http://www.musl-libc.org/ On Sat, Oct 01, 2016 at 12:31:05PM -0400, Ronald S. Bultje wrote: > > This means when malloc()/free()/... happens to be called internally, > > it corrupts the malloc structures, in a non-straightforward ways. > > That's correct. > > Why does your malloc implementation use floats? Not mine, but musl's. I am not involved in musl development. (The tradition of fixed-point malloc was born while floating point was extremely expensive. It is otherwise a "myth" that memory management can not have use for floating point operations. Musl malloc() is well behaving, efficient and robust.) If you are curious you may wish to look at its code (it is clean and readable). On Sat, Oct 01, 2016 at 06:38:49PM +0200, Henrik Gramner wrote: > On Sat, Oct 1, 2016 at 5:37 PM,wrote: > > musl libc which uses floating point in its malloc() implementation. > > That's honestly the real "WTF?" here. I hope the explanation above sheds light on the matter. > In that case use emms_c() before calling those functions. Yes, it is probably what ffmpeg should do and this is the point of my report. If you mean that I should have submitted a patch, I really hope somebody else can do the actual fix(es). Regards, Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hello Moritz, On Sat, Oct 01, 2016 at 06:13:46PM +0200, Moritz Barsnick wrote: > On Sat, Oct 01, 2016 at 17:37:50 +0200, u-h...@aetey.se wrote: > > My troubleshooting > [...] > > behaves erratically on certain combinations > [...] > > Most often > > You are a busy troubleshooter, but you fail to describe what you did, > and what exactly went wrong. No-one will be able to (even try to) > reproduce if you don't provide details. I do not expect the ffmpeg developers to try to reproduce this. I expect somebody to look at the code (beginning with the MMX assembler in the vp3 decoder), which is a lower threshold than building a new libc. > > P.S. an attempt to report via trac did not succeed, that's why posting > > to the list > > What was the issue there? Let's take one issue at a time [*] I mentioned this only as my effort to follow the expected bug reporting route. > When trac finally works, you are asked to provide a command line to > reproduce this (best something which doesn't use external libraries), The problem is revealed by the use of a certain libc implementation. A command line does not help much. It was among others ffmpeg -i test.640x480.19seconds.theora.ogg -c:v libtheora -y test.out.ogg But there are also many other combinations of input and output data which break. The exact outcome depends on many hardly controllable variables, both at build time and at run time. The two crucial details are building ffmpeg with assembler optimizations and using musl libc. I mentioned those. Regrettably I did not mention that the target platform is ia32. My bad. > and its full console output. In your case, you should also describe how > you built ffmpeg, as you are using a "special" libc. Well. for the formal purposes: ./configure ; make ; make install which does not say anything without a detailed (and irrelevant) description of what my build system looks like. By the way, if there is something "special" about musl libc, it is the very high level of code quality and standard compliance. The latter exposes quite a few hidden bugs in various applications. > Thanks, > Moritz Thanks for paying attention and teaching the bug reporters good manners :) This is often necessary and useful. Regards, Rune [*] Sorry but I can not contribute to troubleshooting/improvements of the bug reporting system. Hopefully it works well enough for others. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sat, Oct 1, 2016 at 5:37 PM,wrote: > musl libc which uses floating point in its malloc() implementation. That's honestly the real "WTF?" here. On Sat, Oct 1, 2016 at 5:56 PM, wrote: > On Sat, Oct 01, 2016 at 05:44:13PM +0200, wm4 wrote: >> AFAIK most MMX code in FFmpeg does not run emms (i.e. keeps the FPU >> state trashed) until returning to the API user. > > This means when malloc()/free()/... happens to be called internally, > it corrupts the malloc structures, in a non-straightforward ways. In that case use emms_c() before calling those functions. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hi, On Sat, Oct 1, 2016 at 11:56 AM,wrote: > On Sat, Oct 01, 2016 at 05:44:13PM +0200, wm4 wrote: > > AFAIK most MMX code in FFmpeg does not run emms (i.e. keeps the FPU > > state trashed) until returning to the API user. > > This means when malloc()/free()/... happens to be called internally, > it corrupts the malloc structures, in a non-straightforward ways. That's correct. Why does your malloc implementation use floats? Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sat, Oct 01, 2016 at 17:37:50 +0200, u-h...@aetey.se wrote: > My troubleshooting [...] > behaves erratically on certain combinations [...] > Most often You are a busy troubleshooter, but you fail to describe what you did, and what exactly went wrong. No-one will be able to (even try to) reproduce if you don't provide details. > P.S. an attempt to report via trac did not succeed, that's why posting > to the list What was the issue there? When trac finally works, you are asked to provide a command line to reproduce this (best something which doesn't use external libraries), and its full console output. In your case, you should also describe how you built ffmpeg, as you are using a "special" libc. Thanks, Moritz ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sat, Oct 01, 2016 at 05:44:13PM +0200, wm4 wrote: > AFAIK most MMX code in FFmpeg does not run emms (i.e. keeps the FPU > state trashed) until returning to the API user. This means when malloc()/free()/... happens to be called internally, it corrupts the malloc structures, in a non-straightforward ways. Rune ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
On Sat, 1 Oct 2016 17:37:50 +0200 u-h...@aetey.se wrote: > Hello, > > Would someone familiar with the mmx assembler parts of ffmpeg > look over and check that the fp-state is restored as needed? > > This is crucial with musl libc which uses floating point in its > malloc() implementation. > > My troubleshooting strongly suggests that it is the mmx code which is > the culprit of ffmpeg misbehavior (the same issue has been recently > fixed in libtheora). > > Ffmpeg built against musl behaves erratically on certain combinations of > inputs and outputs, which may also depend on other settings. > Most often ffmpeg hangs forever quite early while beginning the conversion > (at "frame=0 fps" or even before showing this) but sometimes it > segfaults at that point or while being manually interrupted. > > At least the internal theora decoder is affected but there may be other > misbehaving decoders or encoders. I think I observed similar breakage with > other file types than ogg/theora but did not document all the tested > combinations. > > Affected ffmpeg versions: tested 2.8.8, 3.0.2, 3.1.3, same behaviour, > also when building with various gcc versions (4.x, 5.x, 6.x). > > Disabling assembler in ffmpeg gets rid of the problem. > > Given patches/fixes I will be glad to test them and report here. > > Regards, > Rune > P.S. an attempt to report via trac did not succeed, that's why posting > to the list AFAIK most MMX code in FFmpeg does not run emms (i.e. keeps the FPU state trashed) until returning to the API user. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] lurking bugs in the mmx-related assembler code (?)
Hello, Would someone familiar with the mmx assembler parts of ffmpeg look over and check that the fp-state is restored as needed? This is crucial with musl libc which uses floating point in its malloc() implementation. My troubleshooting strongly suggests that it is the mmx code which is the culprit of ffmpeg misbehavior (the same issue has been recently fixed in libtheora). Ffmpeg built against musl behaves erratically on certain combinations of inputs and outputs, which may also depend on other settings. Most often ffmpeg hangs forever quite early while beginning the conversion (at "frame=0 fps" or even before showing this) but sometimes it segfaults at that point or while being manually interrupted. At least the internal theora decoder is affected but there may be other misbehaving decoders or encoders. I think I observed similar breakage with other file types than ogg/theora but did not document all the tested combinations. Affected ffmpeg versions: tested 2.8.8, 3.0.2, 3.1.3, same behaviour, also when building with various gcc versions (4.x, 5.x, 6.x). Disabling assembler in ffmpeg gets rid of the problem. Given patches/fixes I will be glad to test them and report here. Regards, Rune P.S. an attempt to report via trac did not succeed, that's why posting to the list ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel