On Thu, Feb 11, 2021 at 7:42 AM Waldek Kozaczuk <jwkozac...@gmail.com> wrote:
>
> #1 0x0000100000037954 in test_bsd_tcp1::tcp_server (this=0x2000006ff988)
>     at /home/wkozaczuk/projects/osv/tests/tst-bsd-tcp1-zsnd.cc:114
> 114 int bytes2 = zcopy_tx(client_s, &zm);
> (gdb) p client_s
> $1 = 5
> (gdb) p &zm
> $2 = (zmsghdr *) 0xffff800041782d40
>
> As you can see the test app calls zcopy_tx() which takes 2 arguments:
>
> ssize_t zcopy_tx(int s, struct zmsghdr *zm)
>
> the 1st one is int and has value 5 in the caller - the test app - and is
> received as such in the kernel zcopy_tx.
>
> The second one - the address of struct zmsghdr - is problematic. On the
> caller's side looks OK but when received in the kernel it is wrong - 0x1.
> Why?
>

Not being an expert on aarch64 or its function calling conventions, all I
can do is offer some wild guesses. I hope one of them is correct and that
you can figure out which, perhaps by reading the code or by trying to
reproduce the problem in new tests. For example, you could write a test
that loops in multiple threads, calling some function f() with a bunch of
parameters and printing an error if f() is ever called with the wrong
parameters.

One possibility is that our context-switch implementation forgets to save
some of the registers, so the register which holds the affected argument is
lost on a context switch.

Another possibility is that we lose this register during smaller
asynchronous events, not just context switches between threads. We have
interrupts (e.g., the timer interrupt), exceptions, and signals, all of
which can run complex OSv code in the middle of the user's function without
the function knowing that this is happening. So when we switch to these
interrupt or exception handlers, we must save and restore every register
that the OSv code may clobber.

> I saw another test crashing in a similar way when the caller (another
> test) would pass 3 arguments to kernel function and 2 of those
> (non-addresses) were passed correctly but the 3rd one - address one was not.
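The stress test suggested above could look roughly like the sketch below.
It is not existing OSv code: check_args(), worker(), run_arg_stress() and
struct payload are made-up names, and __attribute__((noinline)) assumes GCC
or Clang. Each thread records the values it is about to pass in a
thread-local variable, calls the function through a volatile function
pointer so that the compiler cannot inline or specialize the call, and the
callee compares what it received with what was recorded.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical stand-in for struct zmsghdr; only its address matters here.
struct payload {
    long unused;
};

// The argument values the calling thread expects the callee to receive.
struct expected_args {
    int s;
    payload* p;
    long n;
    const void* q;
};

static thread_local expected_args expected;
static std::atomic<unsigned long> mismatches{0};

// Kept out of line so the arguments really travel through the
// argument-passing registers (GCC/Clang attribute, an assumption).
__attribute__((noinline))
static void check_args(int s, payload* p, long n, const void* q)
{
    if (s != expected.s || p != expected.p ||
        n != expected.n || q != expected.q) {
        mismatches.fetch_add(1);
        std::fprintf(stderr, "bad args: s=%d p=%p n=%ld q=%p\n",
                     s, static_cast<void*>(p), n, const_cast<void*>(q));
    }
}

static void worker(int id, long iterations)
{
    payload local{};  // lives on this thread's stack, like zm in the test
    // Calling through a volatile pointer forces a real indirect call.
    void (*volatile call)(int, payload*, long, const void*) = check_args;
    for (long i = 0; i < iterations; i++) {
        expected = {id, &local, i, &expected};
        call(id, &local, i, &expected);
        if ((i & 0xfff) == 0) {
            std::this_thread::yield();  // encourage extra context switches
        }
    }
}

// Runs nthreads workers and returns how many calls saw wrong arguments.
unsigned long run_arg_stress(int nthreads, long iterations)
{
    mismatches = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < nthreads; t++) {
        threads.emplace_back(worker, t, iterations);
    }
    for (auto& th : threads) {
        th.join();
    }
    return mismatches.load();
}
```

A test's main() would call run_arg_stress() with a few threads and a large
iteration count and fail if the result is non-zero. Note that this only
exercises calls inside one binary. The failing case crosses from the test
app into the kernel, so a closer reproduction would probably need the
callee to live on the kernel side.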
>
> Any ideas what might be going on?
>
> Waldek

-- 
You received this message because you are subscribed to the Google Groups
"OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to osv-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/osv-dev/CANEVyjvmYpHXzXSTNgXQ4wnAg_p%2B5uobvfkx%3DYCg5URwVUPsmg%40mail.gmail.com.