I have updated #1010 to note that I think it is possibly a duplicate of #536. For #1010 I did an experiment: I changed the type of the priority field in the JSON thread structure from float to long (so that it no longer goes through fmt_fp()), and the crash went away. So I wonder if this problem is similar, in that the underlying cause might be related to how OSv handles FPU state when concurrent threads are trying to printf floating-point numbers (%f), which would explain why it happens in fmt_fp(). Alternatively, there may be a bug in this code that has since been fixed in musl.
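To make the hypothesis concrete, here is a rough sketch of the kind of workload it points at: several threads formatting a double with %.2f at the same time and checking the result. It is an illustration only and has not been run on OSv. The names (format_floats, run_fp_printf_test), the thread count, and the iteration count are all made up for this sketch, and it uses snprintf() rather than printf() to avoid flooding the console, on the assumption that both reach fmt_fp() through the same vfprintf code. Whether it actually triggers the crash is unknown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NUM_THREADS 8      /* assumption: mirrors the "-c 8" from the reproduction steps */
#define ITERATIONS  100000 /* assumption: arbitrary; raise it if nothing fails */

/* Each thread formats the same double repeatedly and counts wrong results. */
static void *format_floats(void *arg)
{
    long *mismatches = arg;
    char buf[32];

    for (int i = 0; i < ITERATIONS; i++) {
        /* A %f conversion is what sends vfprintf into fmt_fp(). */
        int n = snprintf(buf, sizeof buf, "%.2f", 1234.5);
        if (n != 7 || strcmp(buf, "1234.50") != 0)
            (*mismatches)++;
    }
    return NULL;
}

/* Returns the total number of mis-formatted results across all threads. */
long run_fp_printf_test(void)
{
    pthread_t threads[NUM_THREADS];
    long mismatches[NUM_THREADS] = {0};
    long total = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        int rc = pthread_create(&threads[t], NULL, format_floats, &mismatches[t]);
        assert(rc == 0);
        (void)rc; /* silence unused warning when NDEBUG is defined */
    }
    for (int t = 0; t < NUM_THREADS; t++) {
        pthread_join(threads[t], NULL);
        total += mismatches[t];
    }
    return total;
}
```

Calling run_fp_printf_test() from main() and printing the result would be enough to drive it. A non-zero count would suggest corrupted FPU state, while a page fault in fmt_fp() would match the traces in these issues.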
I will try to create a simple test that printfs floating-point numbers from multiple threads to see if we can narrow it down even more.

On Thursday, November 22, 2018 at 2:58:40 PM UTC-5, Waldek Kozaczuk wrote:
>
> To reproduce (may not happen every time):
>
> 1) build with ffmpeg (enable h265 in ffmpeg Makefile)
> ./scripts/build image=ffmpeg
>
> 2) In one terminal start ffmpeg to receive video on host:
> cd apps/ffmpeg/ROOTFS
> LD_LIBRARY_PATH=. ./ffmpeg.so -i tcp://0.0.0.0:12345?listen -c copy
> /home/wkozaczuk/test.mp4
>
> 3) In another terminal start ffmpeg to transcode and send video:
> ./scripts/run.py -c 8 -e '/ffmpeg.so -i
> http://clips.vorwaerts-gmbh.de/VfE_html5.mp4 -c:v libx265 -crf 28 -c:a
> aac -b:a 128k -f mpegts tcp://192.168.122.1:12345'
>
> Waldek
>
> On Thursday, November 22, 2018 at 2:55:03 PM UTC-5, Waldek Kozaczuk wrote:
>>
>> I see this crash:
>>
>> frame= 275 fps= 47 q=-0.0 size= 520kB time=00:00:1pa2ge. f2a1 ublitt
>> roauttes=i d3e4 8a.p6pklbiictast/iso ns,p eeadd=d2r200001d00000 0 0
>> [registers]
>> RIP: 0x000000000044ef5f <???+4517727>
>> RFL: 0x0000000000010202 CS: 0x0000000000000008 SS: 0x0000000000000010
>> RAX: 0x8000000000000000 RBX: 0x0000200001d00004 RCX:
>> 0x0000000000000002 RDX: 0x0000200001cff66c
>> RSI: 0xfffffffffffffffc RDI: 0x8000000000000000 RBP:
>> 0x0000200001cff7a0 R8: 0x0000000000004000
>> R9: 0x00000000ffffffe5 R10: 0x0000000000004000 R11:
>> 0x8000000000000000 R12: 0x0000000000000000
>> R13: 0x00000000ffffffe5 R14: 0x0000000000004000 R15:
>> 0x8000000000000000 RSP: 0x0000200001cfda50
>> Aborted
>>
>> [backtrace]
>> 0x0000000000346ce2 <???+3435746>
>> 0x0000000000347946 <mmu::vm_fault(unsigned long, exception_frame*)+310>
>> 0x00000000003a222b <page_fault+123>
>> 0x00000000003a10a6 <???+3805350>
>>
>> Please note that ffmpeg is constantly printing to screen (vga or serial
>> console?) some output about progress.
>>
>> Once connected to gdb I see this stacktrace:
>>
>> (gdb) bt
>> #0 0x00000000003a83d2 in processor::cli_hlt () at
>> arch/x64/processor.hh:247
>> #1 arch::halt_no_interrupts () at arch/x64/arch.hh:48
>> #2 osv::halt () at arch/x64/power.cc:24
>> #3 0x000000000023ef34 in abort (fmt=fmt@entry=0x63095b "Aborted\n") at
>> runtime.cc:132
>> #4 0x0000000000202765 in abort () at runtime.cc:98
>> #5 0x0000000000346ce3 in mmu::vm_sigsegv (addr=<optimized out>,
>> ef=0xffff800006550068) at core/mmu.cc:1316
>> #6 0x0000000000347947 in mmu::vm_fault (addr=addr@entry=35184402497536,
>> ef=ef@entry=0xffff800006550068) at core/mmu.cc:1330
>> #7 0x00000000003a222c in page_fault (ef=0xffff800006550068) at
>> arch/x64/mmu.cc:38
>> #8 <signal handler called>
>> #9 0x000000000044ef5f in fmt_fp (f=0x200001cffa50, y=0, w=0, p=2, fl=0,
>> t=102) at libc/stdio/vfprintf.c:300
>> #10 0x0000000000000000 in ?? ()
>>
>> I wonder if this is related to
>> https://github.com/cloudius-systems/osv/issues/1010 (though no
>> httpserver at all) and this
>> https://github.com/cloudius-systems/osv/issues/536.
>>
>> Please note that the common thing between all these stack traces is
>> fmt_fp() function in libc/stdio/vfprintf.c:300. Coincidence?
>>
>> Wadek
