Re: [osv-dev] how to find missing symbols if it doesn't print the name
I downloaded the wheel file, but I get this error (I am using Ubuntu 19.10):

pip3 install ./tensorflow-1.14.1-cp36-cp36m-linux_x86_64.whl
tensorflow-1.14.1-cp36-cp36m-linux_x86_64.whl is not a supported wheel on this platform.

Could you please send us the full recipe that would build the app? Ideally an "app" just like any we have under the osv-apps repo, so that I can simply run:

./scripts/build image=

and then run it.

Thanks,
Waldek

On Friday, December 13, 2019 at 5:52:03 PM UTC-5, zhiting zhu wrote:
>
> No. It's
> Num: Value Size Type Bind Vis Ndx Name
> 1: 0affa7f8 0 SECTION LOCAL DEFAULT 21
>
> It's not hidden.
>
> Here's the new link:
> https://send.firefox.com/download/9a8bf3fa2909635f/#0ZecrR7UJwspr743vNBo6A
> to the file.
>
> On Fri, Dec 13, 2019 at 4:44 PM Waldek Kozaczuk wrote:
>
>> I am not sure but this issue is similar to what I encountered when
>> dealing with dotnet:
>>
>> readelf -s libcoreclr.so | grep gCurrentThreadInfo
>> readelf: Warning: local symbol 31 found at index >= .dynsym's sh_info
>> value of 1
>> 31: 24 TLS LOCAL HIDDEN 19 gCurrentThreadInfo
>> 9799: 24 TLS LOCAL HIDDEN 19 gCurrentThreadInfo
>>
>> When you use readelf does it show it as a hidden, local TLS symbol?
>>
>> That link with compiles expired. Can you upload it somewhere again?
>>
>> Thanks
>>
>> On Friday, December 13, 2019 at 1:16:21 PM UTC-5, zhiting zhu wrote:
>>>
>>> I was reading the print wrong.
>>> I think the problem is this one:
>>>
>>> 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0
>>>
>>> The sym is 1 but in this case, .tbss is still in the same shared object
>>> but in arch_relocate_rela, it's going to the else branch that handles the
>>> case where the variable is located in a DIFFERENT shared object than the caller.
>>>
>>> On Wed, Dec 11, 2019 at 3:12 PM zhiting zhu wrote:
>>>
>>>> I got rid of __VA_OPT__ and replaced __VA_ARGS__ with ##__VA_ARGS__.
>>>> ## seems to be the extension from GNU that does similar things.
Here's the link of the TensorFlow that I compile myself: https://send.firefox.com/download/68d2a81e2cacdafb/#ui_lMMYAU9UW9e6Fd1VG7w Install it locally and build an osv vm that includes all the dependencies: "tensorflow grpc google _cffi_backend past future \ absl wrapt gast astor termcolor numpy unittest libfuturize \ keras_applications keras_preprocessing tensorflow_estimator \ tensorboard". Then just run "import tensorflow" in python shell, it should show you the error I'm looking at. On Wed, Dec 11, 2019 at 2:55 PM Waldek Kozaczuk wrote: > I wonder if that is related to a similar issue as described here - > https://groups.google.com/forum/#!topic/osv-dev/k69cHw7qvTg. > > I will try to fix and apply this debug patch to master so it makes > easier to debug it. > > Meanwhile, can you provide a reproducing test? > > On Wednesday, December 11, 2019 at 3:49:07 PM UTC-5, zhiting zhu wrote: >> >> After tracing inside arch_relocate_rela, it's failed in case >> R_X86_64_DTPMOD64. >> >> I print out the name index of that symbol and it's 0. I use readelf >> -a to check the .rela.dyn section. >> There's a line that doesn't have Sym. Value and Sym.Name + Addend is >> 0. I think I'm failing at that line. >> >> Offset Info Type Sym. ValueSym. >> Name + Addend >> *0b202f90 0010 R_X86_64_DTPMOD640* >> 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + >> 0 >> >> >> >> On Wed, Dec 11, 2019 at 10:36 AM zhiting zhu >> wrote: >> >>> Thanks for the debug patch. When I apply it, g++ complains about >>> "core/elf.cc:36:118: error: expected ‘)’ before ‘__VA_OPT__’" >>> I think __VA_OPT__ is only added to c++2a according to >>> https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html. Gcc 7.4.0 >>> on Ubuntu cannot compile that. >>> >>> On Tue, Dec 10, 2019 at 7:20 AM Waldek Kozaczuk >>> wrote: >>> You may want to try to apply this patch - https://groups.google.com/forum/#!topic/osv-dev/LbnnY2Kcmak - that should provide many useful debug printouts. 
There is another patch I have sent that fixes the versioned self-lookup problem - https://groups.google.com/forum/#!topic/osv-dev/d56plMGXi6E. I wonder if it fixes your problem (my gut tells me yours is different). On Tuesday, December 10, 2019 at 3:07:09 AM UTC-5, Nadav Har'El wrote: > > On Mon, Dec 9, 2019 at 10:51 PM zhiting zhu > wrote: > >> Hey, >> >> I'm encountering this when I'm using some
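For reference, the ##__VA_ARGS__ workaround mentioned in this thread can be sketched roughly as below. This is only an illustration and not the actual debug patch: the macro name ELF_DEBUG_FMT and the two demo functions are hypothetical, and the macro formats into a caller-supplied buffer rather than printing. It relies on the GNU extension where "##" placed between a comma and __VA_ARGS__ drops the comma when no variadic arguments are given, which is what __VA_OPT__(,) would do on newer compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical debug macro (not from the OSv patch). With the GNU
// "##__VA_ARGS__" extension the comma before the variadic arguments is
// removed when the caller passes none, so both call styles compile with
// compilers that lack __VA_OPT__ (for example GCC 7.x).
#define ELF_DEBUG_FMT(buf, size, fmt, ...) \
    std::snprintf((buf), (size), "ELF: " fmt, ##__VA_ARGS__)

// Call with a format string only: the trailing comma is swallowed.
inline std::string elf_debug_demo_no_args() {
    char buf[64];
    int n = ELF_DEBUG_FMT(buf, sizeof(buf), "relocating");
    assert(n >= 0);
    return std::string(buf);
}

// Call with extra arguments: behaves like an ordinary variadic macro.
inline std::string elf_debug_demo_with_args() {
    char buf[64];
    int n = ELF_DEBUG_FMT(buf, sizeof(buf), "sym=%u type=%u", 1u, 16u);
    assert(n >= 0);
    return std::string(buf);
}
```

Note that with -pedantic GCC may warn about this extension, so it is a stopgap for older toolchains rather than a portable replacement.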
Re: [osv-dev] Lazily allocating thread stacks WIP
On Friday, December 13, 2019 at 5:26:46 PM UTC-5, Waldek Kozaczuk wrote:
>
> Some other thoughts:
>
> 1. All the increments and decrements of the *stack_page_read_counter*
> should be symmetrical. Decrementing it in wait_for_interrupts() does not
> seem to be, but if I remember correctly when I was trying to get it all to work
> I had to put it there, maybe because we start with 1. But I am no longer
> convinced it is necessary. We need to better understand it.
>
> 2. Ideally, we should not even be trying to read the next page on kernel
> threads or even better on threads with stack mapped with mmap_stack. That
> can be accomplished by initializing the *stack_page_read_counter* to some
> higher value than 1 (10 or 11) so it never reaches 0 to trigger page read.
> This possibly can be set as a first thing in the thread_main_c method (see
> arch/x64/arch-switch.hh). It receives the thread pointer so we should somehow
> be able to determine how its stack got constructed. If not we can add a
> boolean flag to the thread class.
>

I really meant: "we should not even be trying to read the next page on kernel threads or even better on threads with stack NOT mapped with mmap_stack"

> 3. In theory, we could get by without *stack_page_read_counter* with just
> checking sched::preemptable() AND somehow read the flags register (PUSHF?)
> to directly check if interrupt flag is set or not in the read_stack_page
> method. But I have a feeling it would be more expensive.
>
> 4. Using the counter may not be that expensive given we already use a
> similar mechanism to implement preemptable().
>
> It would be nice to have Nadav weigh in on that as it seems it all
> either is related to the scheduler or affects it in some way.
>
> On Wednesday, December 11, 2019 at 5:07:58 PM UTC-5, Matthew Pabst wrote:
>>
>> Great stuff Waldek! That solved a few of the issues I was running into.
>> However, tst-vfs.so is still failing occasionally with the error I >> mentioned earlier (std::length_error), which apparently is usually the >> result of an illegal memory access, probably caused by the changes to >> thread::init_stack() or arch::read_next_stack_page(). I was trying to debug >> the test case using GDB, but I couldn't figure out how to run tst-vfs.so in >> debug mode like the wiki examples. What is the best way to do this? >> > > Are you saying that this test runs just fine without the lazy stack patch? > > The best way is to run the test in repeatable mode until it breaks like so: > > ./scripts/tests.py --name "/tests/tst-vfs.so" -r > > Though by default when it fails or aborts it powers down automatically. > There is no way to prevent it by passing an option to the test.py script, > but you can manually edit the scripts/tests/testing.py and remove > '"--power-off-on-abort" from the command line constructed by > run_command_in_guest method. This will keep qemu and OSv running after > crash and let you connect with gdb from another terminal. Best is also > limit the number of vcpus to 1 (add '-c 1' to that same line in > run_command_in_guest). You can debug using regular release version: > > gdb build/release/loader.elf > connect > osv syms > bt > > The interesting stack trace may be on other thread so use 'osv thread ' > to switch to whatever thread you need to. See > https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb > for > more details. > >> >> Another question I had was if the calls to barrier() are necessary in >> your patch. Do they ensure that the running thread gets the correct >> stack_page_read_counter? >> > As I understand barrier() is a hint to a compiler not to perform certain > optimizations that might skew what we want to achieve. I was roughly > following what we do around preempt_counter but not sure if that is > necessary for stack_page_read_counter. 
Nadav would probably be the best > person to explain that. > >> >> On Wednesday, December 11, 2019 at 9:08:11 AM UTC-6, Waldek Kozaczuk >> wrote: >> >>> It was pretty late when I was sending this email. So to recap more >>> clearly how this patch works: >>> >>> 1. The stack_ >>> page_read_counter is a thread-local variable initialized to 1 and >>> behaves similarly to the preempt_counter >>> 2. For every new thread (except the very first one) its thread_main_c >>> (see >>> https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323) >>> >>> calls irq_enable() and preempt_enable() which more importantly decrements >>> and resets the stack_page_read_counter to 0. >>> 3. From this point on any time preempt_disable() or the lock() method on >>> irq_lock_type or irq_save_lock_type is called, the read_next_stack_page is >>> called which ALWAYS increments the stack_page_read_counter counter but >>> ONLY reads one byte from the page ahead on stack if that counter is 1 (to >>> prevent nesting problem). If nested preempt_disable() or lock() on those >>> irq
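To make the barrier() question above a bit more concrete, here is a minimal sketch of what a compiler-only barrier looks like with GCC or Clang. It is an assumption that OSv's barrier() is defined along these lines; the thread does not show its definition. The names compiler_barrier, demo_counter, enter_section and leave_section are hypothetical stand-ins, not OSv code. Such a barrier emits no instructions; it only prevents the compiler from moving memory accesses across it, which is the "hint to a compiler" role described in the reply.

```cpp
#include <cassert>

// Assumed GCC/Clang-style compiler barrier: an empty asm statement with a
// "memory" clobber. It generates no machine code, but the compiler may not
// reorder or cache memory accesses across it.
inline void compiler_barrier() {
    asm volatile("" : : : "memory");
}

// Hypothetical stand-in for a per-thread counter such as preempt_counter
// or stack_page_read_counter (a plain global here to keep the sketch small).
static int demo_counter = 0;

// Increment first, then the barrier: code that follows in the protected
// section cannot be hoisted above the increment by the compiler.
inline void enter_section() {
    ++demo_counter;
    compiler_barrier();
}

// Barrier first, then decrement: code from the protected section cannot be
// sunk below the decrement by the compiler.
inline void leave_section() {
    compiler_barrier();
    assert(demo_counter > 0);
    --demo_counter;
}
```

Whether this ordering guarantee is actually needed for stack_page_read_counter is exactly the open question in the thread; the sketch only shows what the barrier does, not that it is required.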
Re: [osv-dev] how to find missing symbols if it doesn't print the name
No. It's Num:Value Size TypeBind Vis Ndx Name 1: 0affa7f8 0 SECTION LOCAL DEFAULT 21 It's not hidden. Here's the new link: https://send.firefox.com/download/9a8bf3fa2909635f/#0ZecrR7UJwspr743vNBo6A to the file. On Fri, Dec 13, 2019 at 4:44 PM Waldek Kozaczuk wrote: > I am not sure but this issue is similar to what I encountered when dealing > with dotnet: > > readelf -s libcoreclr.so | grep gCurrentThreadInfo > readelf: Warning: local symbol 31 found at index >= .dynsym's sh_info > value of 1 > 31: 24 TLS LOCAL HIDDEN19 > gCurrentThreadInfo > 9799: 24 TLS LOCAL HIDDEN19 > gCurrentThreadInfo > > When you use readelf does it show it as a hidden, local TLS symbol? > > That link with compiles expired. Can you upload it somewhere again? > > Thanks > > On Friday, December 13, 2019 at 1:16:21 PM UTC-5, zhiting zhu wrote: >> >> I was reading the print wrong. >> I think the problem is this one: >> >> 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0 >> >> The sym is 1 but in this case, .tbss is still in the same shared object >> but in arch_relocate_rela, it's going to the else branch that handles the >> variable is located in DIFFERENT shared object that the caller. >> >> On Wed, Dec 11, 2019 at 3:12 PM zhiting zhu >> wrote: >> >>> I get rid of the __VA_OPT and replace the __VA_ARGS__ with >>> ##__VA_ARGS__. ## seems to be the extension from GNU that does similar >>> things. >>> >>> Here's the link of the TensorFlow that I compile myself: >>> https://send.firefox.com/download/68d2a81e2cacdafb/#ui_lMMYAU9UW9e6Fd1VG7w >>> >>> Install it locally and build an osv vm that includes all the >>> dependencies: >>> "tensorflow grpc google _cffi_backend past future \ >>> absl wrapt gast astor termcolor numpy unittest libfuturize \ >>> keras_applications keras_preprocessing tensorflow_estimator \ >>> tensorboard". >>> >>> Then just run "import tensorflow" in python shell, it should show you >>> the error I'm looking at. 
>>> >>> On Wed, Dec 11, 2019 at 2:55 PM Waldek Kozaczuk >>> wrote: >>> I wonder if that is related to a similar issue as described here - https://groups.google.com/forum/#!topic/osv-dev/k69cHw7qvTg. I will try to fix and apply this debug patch to master so it makes easier to debug it. Meanwhile, can you provide a reproducing test? On Wednesday, December 11, 2019 at 3:49:07 PM UTC-5, zhiting zhu wrote: > > After tracing inside arch_relocate_rela, it's failed in case > R_X86_64_DTPMOD64. > > I print out the name index of that symbol and it's 0. I use readelf -a > to check the .rela.dyn section. > There's a line that doesn't have Sym. Value and Sym.Name + Addend is > 0. I think I'm failing at that line. > > Offset Info Type Sym. ValueSym. > Name + Addend > *0b202f90 0010 R_X86_64_DTPMOD640* > 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0 > > > > On Wed, Dec 11, 2019 at 10:36 AM zhiting zhu > wrote: > >> Thanks for the debug patch. When I apply it, g++ complains about >> "core/elf.cc:36:118: error: expected ‘)’ before ‘__VA_OPT__’" >> I think __VA_OPT__ is only added to c++2a according to >> https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html. Gcc 7.4.0 >> on Ubuntu cannot compile that. >> >> On Tue, Dec 10, 2019 at 7:20 AM Waldek Kozaczuk >> wrote: >> >>> You may want to try to apply this patch - >>> https://groups.google.com/forum/#!topic/osv-dev/LbnnY2Kcmak - that >>> should provide many useful debug printouts. >>> >>> There is another patch I have sent that fixes the versioned >>> self-lookup problem - >>> https://groups.google.com/forum/#!topic/osv-dev/d56plMGXi6E. I >>> wonder if it fixes your problem (my gut tells me yours is different). 
>>> >>> On Tuesday, December 10, 2019 at 3:07:09 AM UTC-5, Nadav Har'El >>> wrote: On Mon, Dec 9, 2019 at 10:51 PM zhiting zhu wrote: > Hey, > > I'm encountering this when I'm using some tensorflow functions: > > /lib/python3.6/tensorflow/python/_pywrap_tensorflow_internal.so: > failed looking up symbol > This is interesting, because the "failed looking up symbol" message is always followed by the name of the symbol looked up: core/elf.cc:abort("%s: failed looking up symbol %s\n", pathname().c_str(), demangle(name).c_str()); You can try to add printouts in object::arch_relocate_rela() to try to understand which symbol() is being called with an empty name. > [backtrace] > 0x403442a7 >
Re: [osv-dev] how to find missing symbols if it doesn't print the name
I am not sure but this issue is similar to what I encountered when dealing with dotnet: readelf -s libcoreclr.so | grep gCurrentThreadInfo readelf: Warning: local symbol 31 found at index >= .dynsym's sh_info value of 1 31: 24 TLS LOCAL HIDDEN19 gCurrentThreadInfo 9799: 24 TLS LOCAL HIDDEN19 gCurrentThreadInfo When you use readelf does it show it as a hidden, local TLS symbol? That link with compiles expired. Can you upload it somewhere again? Thanks On Friday, December 13, 2019 at 1:16:21 PM UTC-5, zhiting zhu wrote: > > I was reading the print wrong. > I think the problem is this one: > > 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0 > > The sym is 1 but in this case, .tbss is still in the same shared object > but in arch_relocate_rela, it's going to the else branch that handles the > variable is located in DIFFERENT shared object that the caller. > > On Wed, Dec 11, 2019 at 3:12 PM zhiting zhu > wrote: > >> I get rid of the __VA_OPT and replace the __VA_ARGS__ with ##__VA_ARGS__. >> ## seems to be the extension from GNU that does similar things. >> >> Here's the link of the TensorFlow that I compile myself: >> https://send.firefox.com/download/68d2a81e2cacdafb/#ui_lMMYAU9UW9e6Fd1VG7w >> >> Install it locally and build an osv vm that includes all the dependencies: >> "tensorflow grpc google _cffi_backend past future \ >> absl wrapt gast astor termcolor numpy unittest libfuturize \ >> keras_applications keras_preprocessing tensorflow_estimator \ >> tensorboard". >> >> Then just run "import tensorflow" in python shell, it should show you the >> error I'm looking at. >> >> On Wed, Dec 11, 2019 at 2:55 PM Waldek Kozaczuk > > wrote: >> >>> I wonder if that is related to a similar issue as described here - >>> https://groups.google.com/forum/#!topic/osv-dev/k69cHw7qvTg. >>> >>> I will try to fix and apply this debug patch to master so it makes >>> easier to debug it. >>> >>> Meanwhile, can you provide a reproducing test? 
>>> >>> On Wednesday, December 11, 2019 at 3:49:07 PM UTC-5, zhiting zhu wrote: After tracing inside arch_relocate_rela, it's failed in case R_X86_64_DTPMOD64. I print out the name index of that symbol and it's 0. I use readelf -a to check the .rela.dyn section. There's a line that doesn't have Sym. Value and Sym.Name + Addend is 0. I think I'm failing at that line. Offset Info Type Sym. ValueSym. Name + Addend *0b202f90 0010 R_X86_64_DTPMOD640* 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0 On Wed, Dec 11, 2019 at 10:36 AM zhiting zhu wrote: > Thanks for the debug patch. When I apply it, g++ complains about > "core/elf.cc:36:118: error: expected ‘)’ before ‘__VA_OPT__’" > I think __VA_OPT__ is only added to c++2a according to > https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html. Gcc 7.4.0 on > Ubuntu cannot compile that. > > On Tue, Dec 10, 2019 at 7:20 AM Waldek Kozaczuk > wrote: > >> You may want to try to apply this patch - >> https://groups.google.com/forum/#!topic/osv-dev/LbnnY2Kcmak - that >> should provide many useful debug printouts. >> >> There is another patch I have sent that fixes the versioned >> self-lookup problem - >> https://groups.google.com/forum/#!topic/osv-dev/d56plMGXi6E. I >> wonder if it fixes your problem (my gut tells me yours is different). 
>> >> On Tuesday, December 10, 2019 at 3:07:09 AM UTC-5, Nadav Har'El wrote: >>> >>> On Mon, Dec 9, 2019 at 10:51 PM zhiting zhu >>> wrote: >>> Hey, I'm encountering this when I'm using some tensorflow functions: /lib/python3.6/tensorflow/python/_pywrap_tensorflow_internal.so: failed looking up symbol >>> >>> This is interesting, because the "failed looking up symbol" message >>> is always followed by the name of the symbol looked up: >>> >>> core/elf.cc:abort("%s: failed looking up symbol %s\n", >>> pathname().c_str(), demangle(name).c_str()); >>> >>> You can try to add printouts in object::arch_relocate_rela() to try >>> to understand which symbol() is being called >>> with an empty name. >>> >>> [backtrace] 0x403442a7 0x40397dce >>> unsigned int, void*, long)+574> >>> >>> 0x4033eed4 0x40341d27 0x40345623 >>> std::char_traits, std::allocator >, std::vector, std::allocator >, std::allocator>>> std::char_traits, std::allocator > > >, std::vector, std::allocator > >&)+1459> 0x40345e70
Re: [osv-dev] Lazily allocating thread stacks WIP
Some other thoughts:

1. All the increments and decrements of the *stack_page_read_counter* should be symmetrical. Decrementing it in wait_for_interrupts() does not seem to be, but if I remember correctly when I was trying to get it all to work I had to put it there, maybe because we start with 1. But I am no longer convinced it is necessary. We need to better understand it.

2. Ideally, we should not even be trying to read the next page on kernel threads or even better on threads with stack mapped with mmap_stack. That can be accomplished by initializing the *stack_page_read_counter* to some higher value than 1 (10 or 11) so it never reaches 0 to trigger page read. This possibly can be set as a first thing in the thread_main_c method (see arch/x64/arch-switch.hh). It receives the thread pointer so we should somehow be able to determine how its stack got constructed. If not we can add a boolean flag to the thread class.

3. In theory, we could get by without *stack_page_read_counter* with just checking sched::preemptable() AND somehow read the flags register (PUSHF?) to directly check if interrupt flag is set or not in the read_stack_page method. But I have a feeling it would be more expensive.

4. Using the counter may not be that expensive given we already use a similar mechanism to implement preemptable().

It would be nice to have Nadav weigh in on that as it seems it all either is related to the scheduler or affects it in some way.

On Wednesday, December 11, 2019 at 5:07:58 PM UTC-5, Matthew Pabst wrote:
>
> Great stuff Waldek! That solved a few of the issues I was running into.
> However, tst-vfs.so is still failing occasionally with the error I
> mentioned earlier (std::length_error), which apparently is usually the
> result of an illegal memory access, probably caused by the changes to
> thread::init_stack() or arch::read_next_stack_page(). I was trying to debug
> the test case using GDB, but I couldn't figure out how to run tst-vfs.so in
> debug mode like the wiki examples. What is the best way to do this?
>

Are you saying that this test runs just fine without the lazy stack patch?

The best way is to run the test in repeatable mode until it breaks like so:

./scripts/tests.py --name "/tests/tst-vfs.so" -r

Though by default when it fails or aborts it powers down automatically. There is no way to prevent it by passing an option to the test.py script, but you can manually edit scripts/tests/testing.py and remove "--power-off-on-abort" from the command line constructed by the run_command_in_guest method. This will keep qemu and OSv running after the crash and let you connect with gdb from another terminal. It is also best to limit the number of vcpus to 1 (add '-c 1' to that same line in run_command_in_guest). You can debug using the regular release version:

gdb build/release/loader.elf
connect
osv syms
bt

The interesting stack trace may be on another thread, so use 'osv thread ' to switch to whatever thread you need to. See https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb for more details.

>
> Another question I had was if the calls to barrier() are necessary in your
> patch. Do they ensure that the running thread gets the correct
> stack_page_read_counter?
>

As I understand barrier() is a hint to a compiler not to perform certain optimizations that might skew what we want to achieve. I was roughly following what we do around preempt_counter but not sure if that is necessary for stack_page_read_counter. Nadav would probably be the best person to explain that.

>
> On Wednesday, December 11, 2019 at 9:08:11 AM UTC-6, Waldek Kozaczuk wrote:
>
>> It was pretty late when I was sending this email. So to recap more
>> clearly how this patch works:
>>
>> 1. The stack_page_read_counter is a thread-local variable initialized to 1 and
>> behaves similarly to the preempt_counter
>> 2. For every new thread (except the very first one) its thread_main_c
>> (see
>> https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323)
>> calls irq_enable() and preempt_enable() which more importantly decrements
>> and resets the stack_page_read_counter to 0.
>> 3. From this point on any time preempt_disable() or the lock() method on
>> irq_lock_type or irq_save_lock_type is called, the read_next_stack_page is
>> called which ALWAYS increments the stack_page_read_counter counter but
>> ONLY reads one byte from the page ahead on stack if that counter is 1 (to
>> prevent nesting problem). If nested preempt_disable() or lock() on those
>> irq locks is called it will only increment the counter but not read from
>> stack.
>> 4. Any time preempt_enable() or unlock() on irq_lock_type or
>> irq_save_lock_type is called, correspondingly the
>> stack_page_read_counter is decremented (eventually to 0).
>> 5. Lastly, any time wait_for_interrupt is called (re-enabled interrupts)
>> we also decrement the
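The counter protocol in the recap quoted above can be modelled with a small simulation. This is a sketch under stated assumptions, not the patch itself: the struct StackPrefaultSim and its members are hypothetical, the counter is a plain member rather than a real thread-local, and "reading one byte from the next stack page" is replaced by incrementing page_reads so the behavior can be observed.

```cpp
#include <cassert>

// Hypothetical model of the stack_page_read_counter protocol described in
// the recap. It only tracks when a pre-fault read WOULD happen.
struct StackPrefaultSim {
    unsigned counter = 1;     // step 1: the counter starts at 1
    unsigned page_reads = 0;  // number of simulated reads of the next stack page

    // Models read_next_stack_page(), as called from preempt_disable() or
    // lock() on the irq lock types (step 3): always increment, but touch the
    // next page only on the outermost call, i.e. when the counter becomes 1.
    void read_next_stack_page() {
        if (++counter == 1) {
            ++page_reads;  // the real code would read one byte ahead on the stack
        }
    }

    // Models the matching decrement done by preempt_enable() or unlock()
    // (step 4), and the initial decrement done via thread_main_c (step 2).
    void release() {
        assert(counter > 0);
        --counter;
    }
};
```

Usage would follow the recap: one release() at thread start brings the counter from 1 to 0, the first read_next_stack_page() after that triggers a read, and nested calls only bump the counter until the matching release() calls bring it back to 0.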
Re: [osv-dev] how to find missing symbols if it doesn't print the name
I was reading the print wrong.
I think the problem is this one:

0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0

The sym is 1, but in this case .tbss is still in the same shared object. However, in arch_relocate_rela it goes to the else branch, which handles the case where the variable is located in a DIFFERENT shared object than the caller.

On Wed, Dec 11, 2019 at 3:12 PM zhiting zhu wrote:

> I got rid of __VA_OPT__ and replaced __VA_ARGS__ with ##__VA_ARGS__.
> ## seems to be the extension from GNU that does similar things.
>
> Here's the link of the TensorFlow that I compile myself:
> https://send.firefox.com/download/68d2a81e2cacdafb/#ui_lMMYAU9UW9e6Fd1VG7w
>
> Install it locally and build an osv vm that includes all the dependencies:
> "tensorflow grpc google _cffi_backend past future \
> absl wrapt gast astor termcolor numpy unittest libfuturize \
> keras_applications keras_preprocessing tensorflow_estimator \
> tensorboard".
>
> Then just run "import tensorflow" in python shell, it should show you the
> error I'm looking at.
>
> On Wed, Dec 11, 2019 at 2:55 PM Waldek Kozaczuk wrote:
>
>> I wonder if that is related to a similar issue as described here -
>> https://groups.google.com/forum/#!topic/osv-dev/k69cHw7qvTg.
>>
>> I will try to fix and apply this debug patch to master so it makes it easier
>> to debug it.
>>
>> Meanwhile, can you provide a reproducing test?
>>
>> On Wednesday, December 11, 2019 at 3:49:07 PM UTC-5, zhiting zhu wrote:
>>>
>>> After tracing inside arch_relocate_rela, it's failed in case
>>> R_X86_64_DTPMOD64.
>>>
>>> I print out the name index of that symbol and it's 0. I use readelf -a
>>> to check the .rela.dyn section.
>>> There's a line that doesn't have Sym. Value and Sym. Name + Addend is 0.
>>> I think I'm failing at that line.
>>>
>>> Offset Info Type Sym. Value Sym. Name + Addend
>>> *0b202f90 0010 R_X86_64_DTPMOD64 0*
>>> 0b23ca60 00010010 R_X86_64_DTPMOD64 0aee9f58 .tbss + 0
>>>
>>> On Wed, Dec 11, 2019 at 10:36 AM zhiting zhu wrote:
>>>
>>>> Thanks for the debug patch. When I apply it, g++ complains about
>>>> "core/elf.cc:36:118: error: expected ‘)’ before ‘__VA_OPT__’"
>>>> I think __VA_OPT__ is only added to c++2a according to
>>>> https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html. Gcc 7.4.0 on
>>>> Ubuntu cannot compile that.
>>>>
>>>> On Tue, Dec 10, 2019 at 7:20 AM Waldek Kozaczuk wrote:
>>>>
>>>>> You may want to try to apply this patch -
>>>>> https://groups.google.com/forum/#!topic/osv-dev/LbnnY2Kcmak - that
>>>>> should provide many useful debug printouts.
>>>>>
>>>>> There is another patch I have sent that fixes the versioned
>>>>> self-lookup problem -
>>>>> https://groups.google.com/forum/#!topic/osv-dev/d56plMGXi6E. I wonder
>>>>> if it fixes your problem (my gut tells me yours is different).
>>>>>
>>>>> On Tuesday, December 10, 2019 at 3:07:09 AM UTC-5, Nadav Har'El wrote:
>>>>>>
>>>>>> On Mon, Dec 9, 2019 at 10:51 PM zhiting zhu wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> I'm encountering this when I'm using some tensorflow functions:
>>>>>>>
>>>>>>> /lib/python3.6/tensorflow/python/_pywrap_tensorflow_internal.so:
>>>>>>> failed looking up symbol
>>>>>>>
>>>>>>
>>>>>> This is interesting, because the "failed looking up symbol" message
>>>>>> is always followed by the name of the symbol looked up:
>>>>>>
>>>>>> core/elf.cc:abort("%s: failed looking up symbol %s\n",
>>>>>> pathname().c_str(), demangle(name).c_str());
>>>>>>
>>>>>> You can try to add printouts in object::arch_relocate_rela() to try
>>>>>> to understand which symbol() is being called
>>>>>> with an empty name.
>> >> >>> [backtrace] >>> 0x403442a7 >>> 0x40397dce >> unsigned int, void*, long)+574> >>> >> >> >>> 0x4033eed4 >>> 0x40341d27 >>> 0x40345623 >>> >> std::char_traits, std::allocator >, >>> std::vector, >>> std::allocator >, std::allocator>> std::char_traits, std::allocator > > >, >>> std::vector, >>> std::allocator > >&)+1459> >>> 0x40345e70 >>> >> std::char_traits, std::allocator >, >>> std::vector, >>> std::allocator >, std::allocator>> std::char_traits, std::allocator > > >, bool)+336> >>> 0x40465fd8 >>> 0x10937228 <_PyImport_FindSharedFuncptr+376> >>> 0x745f70617277796f >>> >>> It seems the name is not print out. >>> >>> If I'm using the check-libcfunc-avail.sh to check the >>> _pywrap_tensorflow_internal.so, I get the following output: >>> >>> pthread_mutex_consistent not found >>> pthread_mutexattr_setrobust not found >>> fmaf not found >>> fma not found >>> mallinfo not found >>> >> >> All of these we should eventually add, these are real Linux glibc >> functions... >> Feel free to open issues
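As a side note on the relocation entries discussed at the top of this message: the "sym" and "type" that readelf reports both come from the 64-bit r_info field of an Elf64_Rela entry, with the symbol index in the upper 32 bits and the relocation type in the lower 32 bits. The sketch below shows that split. The struct RelaInfo and the function decode_r_info are hypothetical names, and the example value 0x0000000100000010 is an assumed full-width form of the entry reported as "sym is 1" with type R_X86_64_DTPMOD64 (the Info column in the pasted output appears to have lost digits).

```cpp
#include <cassert>
#include <cstdint>

// R_X86_64_DTPMOD64 has the value 16 (0x10) in the x86-64 psABI.
// Named with a _TYPE suffix here to avoid clashing with <elf.h> macros.
constexpr std::uint32_t R_X86_64_DTPMOD64_TYPE = 16;

// Hypothetical holder for the two halves of Elf64_Rela::r_info.
struct RelaInfo {
    std::uint32_t sym;   // symbol table index (ELF64_R_SYM)
    std::uint32_t type;  // relocation type (ELF64_R_TYPE)
};

// Split r_info the same way the ELF64_R_SYM / ELF64_R_TYPE macros do.
constexpr RelaInfo decode_r_info(std::uint64_t r_info) {
    return RelaInfo{static_cast<std::uint32_t>(r_info >> 32),
                    static_cast<std::uint32_t>(r_info & 0xffffffffu)};
}

// Compile-time check with the assumed example value: symbol index 1,
// type R_X86_64_DTPMOD64.
static_assert(decode_r_info(0x0000000100000010ULL).sym == 1, "sym index");
static_assert(decode_r_info(0x0000000100000010ULL).type == R_X86_64_DTPMOD64_TYPE,
              "relocation type");
```

An entry whose r_info is just 0x10 decodes to symbol index 0, which would match the observation in the thread that the name index printed for the failing relocation was 0.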
Re: [osv-dev] [PATCH] Signed-off-by: BassMatt
On Thu, Dec 12, 2019 at 10:02 PM BassMatt wrote:
>
> Main scripts in scripts/ folder updated to use Python3
>
> I went through the scripts detailed in scripts/README and updated them to use
> Python3. I used the Python "Future" module to provide suggestions, then
> manually went through and applied the changes. The "Future" module gives
> suggestions to allow for cross-compatibility between Python2/3, but since it
> was expressed that only Python3 needed to be supported, I left all that out.
>
> The issue is detailed here:
> https://github.com/cloudius-systems/osv/issues/1056

Looks good, thanks!

Acked-by: Pekka Enberg

I assume this is you:

https://github.com/BassMatt

Please fix the sign-off to be:

Signed-off-by: Real Name

as per contributions guide:

https://github.com/cloudius-systems/osv/blob/master/CONTRIBUTING

- Pekka