RE: Assembly Syscall Question
On 31-Jul-2003 Ryan Sommers wrote:
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?
>
> Example:
>
> access.the.bsd.kernel:
>         int 80h
>         ret
>
> func:
>         mov eax, 4      ; Write
>         call access.the.bsd.kernel
>         ; End
>
> Works. However:
> func:
>         mov eax, 4      ; Write
>         int 80h
>         ; End
>
> Doesn't.
>
> Now, if you change it to:
>
> func:
>         mov eax, 4      ; Write
>         push eax
>         int 80h
>         ; End
>
> It does work. I was able to find, "By default, the FreeBSD kernel uses the C
> calling convention. Further, although the kernel is accessed using int 80h,
> it is assumed the program will call a function that issues int 80h, rather
> than issuing int 80h directly," in the developer's handbook. But I can't
> figure out why the second example doesn't work. Is the call instruction
> pushing the value onto the stack in addition to pushing the instruction
> pointer on?
>
> Thank you in advance.
> PS I'm not on the list.

First off, why are you using asm for userland stuff? Secondly, the kernel
assumes that all the other arguments besides the syscall to execute
(i.e. %eax) are passed on the user stack. Thus, it has to have a set
location relative to the user stack pointer to find the arguments. It
allows for a return IP from a call instruction to be at the top of the
stack. You can tell this by looking at syscall() in sys/i386/i386/trap.c:

        params = (caddr_t)frame.tf_esp + sizeof(int);
        code = frame.tf_eax;
        orig_tf_eflags = frame.tf_eflags;

params is a userland pointer to the function arguments. Adding the
sizeof(int) skips over the saved return address, or in your 3rd case,
the dummy %eax value.

--
John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
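[Editor's sketch] The argument lookup John describes can be modeled in plain C. This is only a simulation under stated assumptions, not kernel code: `fake_trapframe`, `syscall_params`, `first_arg_seen` and the addresses in the stack images are all made up for illustration. The three arrays stand for the user stack in the three cases from the question, assuming the arguments for write(1, buf, 12) were pushed beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the i386 trap frame fields quoted above. */
struct fake_trapframe {
    const uint32_t *tf_esp; /* user stack pointer at the time of int 80h */
    uint32_t tf_eax;        /* syscall number */
};

/*
 * Mirrors `params = (caddr_t)frame.tf_esp + sizeof(int)`: skip exactly one
 * 32-bit slot at the top of the stack, whatever it happens to hold.
 */
const uint32_t *syscall_params(const struct fake_trapframe *frame)
{
    assert(frame != NULL && frame->tf_esp != NULL);
    return frame->tf_esp + 1;
}

/* First argument the modeled kernel would see for a given stack image. */
uint32_t first_arg_seen(const uint32_t *stack_top)
{
    struct fake_trapframe frame = { stack_top, 4 /* write */ };
    return syscall_params(&frame)[0];
}

/* Stack images, top of stack first; the intended call is write(1, buf, 12). */
const uint32_t via_call[]   = { 0x08048123u /* return address */, 1, 0x0804a000u, 12 };
const uint32_t direct[]     = { 1, 0x0804a000u, 12, 0 /* unrelated data */ };
const uint32_t dummy_push[] = { 4 /* pushed eax */, 1, 0x0804a000u, 12 };
```

In this model `first_arg_seen(via_call)` and `first_arg_seen(dummy_push)` both yield the file descriptor 1, while `first_arg_seen(direct)` yields the buffer address, because the skipped slot was the real first argument.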
Re: Assembly Syscall Question
:You need 8K for this, minimally: 4K that's RO user/RW kernel and
:4K that's RW user/RW kernel. You can use it for things like zero
:system call getpid/getuid/etc..
:
:It's also worth a page doubly mapped, with the second mapping with
:the PG_G bit set on it (to make it RO visible to user space at the
:same place in all programs) to hold the timecounter information;
:the current timecounter implementation, with a scad of structures,
:is both wasteful and unnecessary, given that pointer assigns are
:atomic, so you can implement with only two, which only take a small
:part of the page. Doing this, you can use a pointer reference and
:a structure assign, and a compare-pointer-afterwards to make a zero
:system call gettimeofday() and other calls (consider the benefits
:to Apache, Squid, and other programs that have hard "logging with
:timestamp" requirements).
:
:I've also been toying with the idea of putting environp ** in a COW
:page, and dealing with changes as a "fixup" operation in the fault
:handler (really, environp needs to die, to make way for logical name
:tables; it persists because POSIX and SuS demand that it persist).
:
:So, Matt... how does the modified message based system call
:interface fare in a before-and-after with "lmbench"? 8-) 8-).
:
:-- Terry

With regard to using shared memory to optimize certain things, like the
time functions... I believe it will be possible to do this in a portable
fashion by going through the emulation layer (which is really misnamed...
basically the syscall interface layer, which will run in userspace but
whose code will be managed by the kernel). We don't want the user program
having direct knowledge of memory mapped gizmos, but it's perfectly
acceptable for the emulation layer to have this knowledge, because the
emulation layer will know how to fall back if the gizmo isn't where it's
supposed to be (i.e. you boot an older kernel with a newer userland).

The messaging code works fairly well so far performance-wise. My original
getuid() vs messaging getuid() test on a DELL 2550 (SMP build) comes out:

    getuid()        0.974s  1088300 loops = 0.895uS/loop
    getuid()        0.937s  1047500 loops = 0.894uS/loop
    getuid()        0.971s  1084700 loops = 0.895uS/loop
    getuid_msg()    1.029s   935400 loops = 1.100uS/loop
    getuid_msg()    0.970s   881600 loops = 1.100uS/loop
    getuid_msg()    0.963s   876600 loops = 1.099uS/loop

About 200ns slower, and I haven't really tried to optimize any of it yet
(and I'm not going to worry about it until a lot more work has been
accomplished anyway).

-Matt
Matthew Dillon <[EMAIL PROTECTED]>
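[Editor's sketch] The per-loop figures and the "about 200ns" gap follow directly from the totals Matt quotes. The helpers below only redo that arithmetic; `us_per_loop` and `overhead_ns` are names invented here, and this is not the benchmark program itself.

```c
#include <assert.h>

/* Microseconds per loop from a total wall time and a loop count. */
double us_per_loop(double seconds, long loops)
{
    assert(loops > 0);
    return seconds * 1e6 / (double)loops;
}

/* Extra cost of the message path over the direct path, in nanoseconds. */
double overhead_ns(double direct_us, double msg_us)
{
    return (msg_us - direct_us) * 1000.0;
}
```

Feeding in the first row of each group (0.974s over 1088300 loops, 1.029s over 935400 loops) gives roughly 0.895 and 1.100 microseconds per loop, a difference of about 205 ns.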
Re: Assembly Syscall Question
Matthew Dillon wrote:
> I think the ultimate performance solution is to have some explicitly
> shared memory between kernelland and userland and store the arguments,
> error code, and return value there. Being a fairly small package of
> memory, multi-threading would not be an issue as each thread would have
> its own slot.

You need 8K for this, minimally: 4K that's RO user/RW kernel and
4K that's RW user/RW kernel. You can use it for things like zero
system call getpid/getuid/etc..

It's also worth a page doubly mapped, with the second mapping with
the PG_G bit set on it (to make it RO visible to user space at the
same place in all programs) to hold the timecounter information;
the current timecounter implementation, with a scad of structures,
is both wasteful and unnecessary, given that pointer assigns are
atomic, so you can implement with only two, which only take a small
part of the page. Doing this, you can use a pointer reference and
a structure assign, and a compare-pointer-afterwards to make a zero
system call gettimeofday() and other calls (consider the benefits
to Apache, Squid, and other programs that have hard "logging with
timestamp" requirements).

I've also been toying with the idea of putting environp ** in a COW
page, and dealing with changes as a "fixup" operation in the fault
handler (really, environp needs to die, to make way for logical name
tables; it persists because POSIX and SuS demand that it persist).

So, Matt... how does the modified message based system call
interface fare in a before-and-after with "lmbench"? 8-) 8-).

-- Terry
Re: Assembly Syscall Question
In message: <[EMAIL PROTECTED]>
            Matthew Dillon <[EMAIL PROTECTED]> writes:
: Additionally, the mechanism can
: be extended to support chained system calls (i.e. issue several system
: calls at once), and transactional sequences.

VMS has done this for a long time. :-) All of their system calls were
asynchronous. The synchronous versions are done in userland by calling
the async version plus a wait.

Warner
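[Editor's sketch] The pattern Warner mentions, a synchronous call built in userland from an asynchronous call plus a wait, can be shown with a toy request object. None of these names come from VMS or FreeBSD; `op_start`, `op_complete`, `op_wait` and `op_sync` are hypothetical, and completion is simulated in the same thread instead of by a kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request record for one operation. */
struct request {
    int  opcode;   /* which operation to perform */
    long arg;      /* single argument, for brevity */
    long result;   /* filled in on completion */
    bool done;     /* set once the operation has finished */
};

enum { OP_DOUBLE = 1 };

/* Asynchronous entry point: mark the request in flight and return at once. */
void op_start(struct request *req)
{
    assert(req != NULL);
    req->done = false;
}

/* Completion side: in a real system this would run in the kernel. */
void op_complete(struct request *req)
{
    if (req->opcode == OP_DOUBLE)
        req->result = req->arg * 2;
    req->done = true;
}

/* Block until the request has completed. */
void op_wait(struct request *req)
{
    while (!req->done)
        op_complete(req);   /* stand-in for sleeping until completion */
}

/* Synchronous version: just the async start followed by a wait. */
long op_sync(int opcode, long arg)
{
    struct request req = { opcode, arg, 0, false };

    op_start(&req);
    op_wait(&req);
    return req.result;
}
```

A caller that wants overlap uses `op_start` and `op_wait` separately and does other work in between; a caller that does not care uses `op_sync`.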
Re: Assembly Syscall Question
It's a toss up. Linux's use of registers for syscall args is not a
panacea, because many system calls require more than just the basic
call-used registers, which means the Linux userland code has to save and
restore those registers on the stack anyway. And we won't even mention
syscalls that require more arguments than you have registers for. Oops!

I think the ultimate performance solution is to have some explicitly
shared memory between kernelland and userland and store the arguments,
error code, and return value there. Being a fairly small package of
memory, multi-threading would not be an issue as each thread would have
its own slot.

Of course, FreeBSD has other problems as well.. if you look at our
syscall entry and exit code it is simply *horrendous*. It goes through an
indirect jump and it has to check carry for error returns to set the
error code. It is a real mess.

What I am doing in DragonFly is reimplementing system calls as messages.
It turns out that the overhead is not significantly greater (and I
haven't even tried optimizing the code yet). With the system call as a
message the return code and error are stored in the message itself, which
means that there are *NO* special cases and, at least for IA32, the
kernel messaging interface has just three fixed arguments which fit
easily in the call-used registers. I've been able to get rid of
proc->p_retval[], a considerable amount of unnecessary proc/thread
pointer passing, and the mechanism is flexible enough to support both
synchronous and asynchronous operation as well as both single and
multi-threaded operation using the same exact code. Additionally, the
mechanism can be extended to support chained system calls (i.e. issue
several system calls at once), and transactional sequences.

Of course, I haven't actually done all of this yet... I have the
messaging infrastructure in place and I am working on asynchronizing I/O
and timing system calls (e.g. sleep(), read(), write(), select(), etc..),
which is going to take a lot of work, but once that is done the sky is
the limit.

-Matt

:On Fri, 2003-08-01 at 02:59, Terry Lambert wrote:
:> Personally, I like to look at the Linux register-based passing
:> mechanism in the same light that they look at the FreeBSD use
:> of the MMU hardware to assist VM, at the cost of increased
:> FreeBSD VM system complexity (i.e. they think our VM is too
:> convoluted, and we think their system calls are too convoluted).
:
:Maybe, but they also support a lot of MMU-less architectures, so it may
:have made things simpler for them to not depend on MMU. I wonder if NUMA
:had any bearing on that as well...
:
:--
:Shawn <[EMAIL PROTECTED]>
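[Editor's sketch] To make the "system call as a message" idea concrete, here is a toy dispatcher in which the return value and the error travel inside the message, so no carry flag or per-process return slot is needed. The thread does not show DragonFly's actual structures; `struct sysmsg`, the opcodes and the made-up uid below are assumptions for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical message layout. */
struct sysmsg {
    int  opcode;    /* which system call */
    long args[3];   /* call arguments */
    long result;    /* return value, stored in the message itself */
    int  error;     /* 0 on success, otherwise an errno value */
};

enum { MSG_GETUID = 1, MSG_ADD = 2 };

/* Toy "kernel": handle one message and record result and error in it. */
void sysmsg_dispatch(struct sysmsg *msg)
{
    assert(msg != NULL);
    msg->error = 0;
    switch (msg->opcode) {
    case MSG_GETUID:
        msg->result = 1001;                         /* made-up uid */
        break;
    case MSG_ADD:
        msg->result = msg->args[0] + msg->args[1];
        break;
    default:
        msg->result = -1;
        msg->error = ENOSYS;                        /* unknown call */
        break;
    }
}
```

Because each caller owns its message, the same dispatcher serves a synchronous caller (dispatch, then read `result`) and an asynchronous one (queue the message, pick up `result` and `error` later).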
Re: Assembly Syscall Question
UVM was the VM system that replaced the old Mach style VM in NetBSD and
(I believe) OpenBSD. FreeBSD has already cleaned up a lot of the problems
that UVM addresses. However, there are still some things that could be
done to make map passing easier, which I believe would make zero-copy
support cleaner if not faster.

This is the UVM home page:
http://ccrc.wustl.edu/pub/chuck/tech/uvm/

-Kip
Re: Assembly Syscall Question
On Sat, 2003-08-02 at 16:23, Kip Macy wrote:
> table layout differs from the default. The introduction to the UVM thesis has
> some good points in this regard.

UVM thesis?

(I am subscribed to list.)

--
Shawn <[EMAIL PROTECTED]>
http://drevil.warpcore.org/
Re: Assembly Syscall Question
> Maybe, but they also support a lot of MMU-less architectures, so it may
> have made things simpler for them to not depend on MMU. I wonder if NUMA
> had any bearing on that as well...

No. The initial design of their VM greatly preceded NUMA and uClinux. It
actually makes the system less portable in that it can require
interspersing of machine dependent code in the machine independent parts
when the machine's page table layout differs from the default. The
introduction to the UVM thesis has some good points in this regard.

-Kip
Re: Assembly Syscall Question
On Fri, 2003-08-01 at 02:59, Terry Lambert wrote:
> Personally, I like to look at the Linux register-based passing
> mechanism in the same light that they look at the FreeBSD use
> of the MMU hardware to assist VM, at the cost of increased
> FreeBSD VM system complexity (i.e. they think our VM is too
> convoluted, and we think their system calls are too convoluted).

Maybe, but they also support a lot of MMU-less architectures, so it may
have made things simpler for them to not depend on MMU. I wonder if NUMA
had any bearing on that as well...

--
Shawn <[EMAIL PROTECTED]>
http://drevil.warpcore.org/
Re: Assembly Syscall Question
Ryan Sommers wrote:
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?

The stack is visible in both user space and kernel space; in general,
the register space won't be, unless you are on an architecture with an
abundance of registers that doesn't do a save/restore on trap entries.
By pushing it onto the stack, you are *positive* that the value is
visible.

There is also the (small) possibility that the C compiler will take
advantage of the calling conventions to assume that a value will not
change over a system call. Short of declaring that all registers are
volatile, you can't really guarantee that the registers pushed in will
have the values after the call that they had before the call, unless
you save and restore all of them (which is more expensive than the
copyin, for system calls with 3 arguments or less -- which is most of
them; cost, of course, will vary by architecture).

Personally, I like to look at the Linux register-based passing
mechanism in the same light that they look at the FreeBSD use
of the MMU hardware to assist VM, at the cost of increased
FreeBSD VM system complexity (i.e. they think our VM is too
convoluted, and we think their system calls are too convoluted).

-- Terry
Re: Assembly Syscall Question
On Thu, Jul 31, 2003 at 04:12:27PM -0400, Ryan Sommers wrote:
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?
>
> Example:
>
> access.the.bsd.kernel:
>         int 80h
>         ret
>
> func:
>         mov eax, 4      ; Write
>         call access.the.bsd.kernel
>         ; End
>
> Works. However:
> func:
>         mov eax, 4      ; Write
>         int 80h
>         ; End
>
> Doesn't.
>
This is because in a C library, all system calls are wrapped into C
functions, so the stack looks like this when in the syscall code in
libc:

        return address to a program
        syscall args

So the kernel knows how to account for a return address to access the
actual arguments. When calling the kernel directly (not through a C
library wrapper function), we need to adjust the stack so that the
kernel believes it is being called from the syscall code in libc.

Cheers,
--
Ruslan Ermilov          Sysadmin and DBA,
[EMAIL PROTECTED]       Sunbay Software Ltd,
[EMAIL PROTECTED]       FreeBSD committer
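[Editor's sketch] From C, both the libc wrapper write() and the generic syscall(2) entry point are ordinary function calls, so a return address always sits on the stack when the kernel is entered, which is the situation Ruslan describes. The helper below exercises both paths against a pipe; `write_both_ways` is a name invented here, and the example assumes a POSIX system where `SYS_write` is defined in <sys/syscall.h>.

```c
#define _DEFAULT_SOURCE   /* exposes syscall() on glibc in strict modes */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Write `msg` into a pipe once through the libc wrapper and once through
 * syscall(2), then read both copies back into `out`.
 * Returns the number of bytes read, or -1 on failure.
 */
long write_both_ways(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    size_t len = strlen(msg);
    long n = -1;

    assert(outlen >= 2 * len);
    if (pipe(fds) != 0)
        return -1;

    long a = (long)write(fds[1], msg, len);               /* libc wrapper */
    long b = (long)syscall(SYS_write, fds[1], msg, len);  /* generic entry */
    close(fds[1]);

    if (a == (long)len && b == (long)len)
        n = (long)read(fds[0], out, outlen);
    close(fds[0]);
    return n;
}
```

Either path lets the wrapper code deal with the kernel's calling convention, so the caller never has to arrange the stack by hand.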
Re: Assembly Syscall Question
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?

You have to push something the size of an address, because the call
instruction pushes the return address onto the stack. The 1st and 3rd
examples both push 4 bytes, so they work.

_Why_ this is needed is probably that the routine that does the int 80h
can also check and process the int 80h return value and error code, and
people wanted to avoid duplicating that code; at least that is my
guess :-)