RE: Assembly Syscall Question

2003-08-04 Thread John Baldwin

On 31-Jul-2003 Ryan Sommers wrote:
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?
>
> Example:
>
> access.the.bsd.kernel:
>         int 80h
>         ret
>
> func:
>         mov eax, 4      ; Write
>         call access.the.bsd.kernel
>         ; End
>
> Works. However:
>
> func:
>         mov eax, 4      ; Write
>         int 80h
>         ; End
>
> Doesn't.
>
> Now, if you change it to:
>
> func:
>         mov eax, 4      ; Write
>         push eax
>         int 80h
>         ; End
>
> It does work. I was able to find, "By default, the FreeBSD kernel uses the C
> calling convention. Further, although the kernel is accessed using int 80h,
> it is assumed the program will call a function that issues int 80h, rather
> than issuing int 80h directly," in the developer's handbook. But I can't
> figure out why the second example doesn't work. Is the call instruction
> pushing the value onto the stack in addition to pushing the instruction
> pointer on?
>
> Thank you in advance.
> PS I'm not on the list.

First off, why are you using asm for userland stuff?  Secondly, the kernel
assumes that all the other arguments besides the syscall to execute (i.e.
%eax) are passed on the user stack.  Thus, it has to have a set location
relative to the user stack pointer to find the arguments.  It allows for
a return IP from a call instruction to be at the top of the stack.  You
can tell this by looking at syscall() in sys/i386/i386/trap.c:

params = (caddr_t)frame.tf_esp + sizeof(int);
code = frame.tf_eax;
orig_tf_eflags = frame.tf_eflags;

params is a userland pointer to the function arguments.  Adding the
sizeof(int) skips over the saved return address, or in your 3rd case,
the dummy %eax value.
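
To make the offset concrete, here is a small userland model of that lookup.
It is only a sketch: struct fake_frame and the helper names are made up for
illustration and are not the kernel's types, and the user stack is modelled
as an array of 32-bit slots. The single "+ 1" plays the role of the
"+ sizeof(int)" in the trap.c excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved user registers (not the real trapframe). */
struct fake_frame {
    const uint32_t *esp; /* user stack pointer when int 80h was issued */
    uint32_t eax;        /* syscall number */
};

/* Models: params = (caddr_t)frame.tf_esp + sizeof(int); */
static const uint32_t *syscall_params(const struct fake_frame *f)
{
    assert(f != NULL && f->esp != NULL);
    return f->esp + 1;   /* skip exactly one 32-bit slot */
}

/* Return the n-th argument as this model of the kernel would see it. */
static uint32_t syscall_arg(const struct fake_frame *f, size_t n)
{
    return syscall_params(f)[n];
}
```

With a stack of { return address, fd, buf, len } the model finds fd as
argument 0. With { fd, buf, len } and nothing on top, it skips fd and treats
buf as argument 0, which is why the second example misbehaves and why a
dummy push fixes it.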

-- 

John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Assembly Syscall Question

2003-08-03 Thread Matthew Dillon
:You need 8K for this, minimally: 4K that's RO user/RW kernel and
:4K that's RW user/RW kernel.  You can use it for things like zero
:system call getpid/getuid/etc..
:
:It's also worth a page doubly mapped, with the second mapping with
:the PG_G bit set on it (to make it RO visible to user space at the
:same place in all programs) to hold the timecounter information;
:the current timecounter implementation, with a scad of structures,
:is both wasteful and unnecessary, given that pointer assigns are
:atomic, so you can implement with only two, which only take a small
:part of the page.  Doing this, you can use a pointer reference and
:a structure assign, and a compare-pointer-afterwards to make a zero
:system call gettimeofday() and other calls (consider the benefits
:to Apache, Squid, and other programs that have hard logging with
:timestamp requirements).
:
:I've also been toying with the idea of putting environp ** in a COW
:page, and dealing with changes as a fixup operation in the fault
:handler (really, environp needs to die, to make way for logical name
:tables; it persists because POSIX and SuS demand that it persist).
:
:So, Matt... how does the modified message based system call
:interface fare in a before-and-after with lmbench?  8-) 8-).
:
:-- Terry

With regard to using shared memory to optimize certain things, like
the time functions... I believe it will be possible to do this in a
portable fashion by going through the emulation layer (which is really
misnamed... basically the syscall interface layer which will run in
userspace but whose code will be managed by the kernel).  We don't want
the user program having direct knowledge of memory mapped gizmos, but
it's perfectly acceptable for the emulation layer to have this knowledge,
because the emulation layer will know how to fall back if the gizmo
isn't where it's supposed to be (i.e. you boot an older kernel with a
newer userland).

The messaging code works fairly well so far performance-wise.   My
original getuid() vs messaging getuid() test on a DELL 2550 (SMP build)
comes out:

getuid() 0.974s 1088300 loops =  0.895uS/loop
getuid() 0.937s 1047500 loops =  0.894uS/loop
getuid() 0.971s 1084700 loops =  0.895uS/loop
getuid_msg() 1.029s 935400 loops =  1.100uS/loop
getuid_msg() 0.970s 881600 loops =  1.100uS/loop
getuid_msg() 0.963s 876600 loops =  1.099uS/loop

About 200ns slower, and I haven't really tried to optimize any of it
yet (and I'm not going to worry about it until a lot more work has
been accomplished anyway).
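
For anyone checking the arithmetic, the per-loop figures are just total time
divided by loop count. A minimal helper, with a made-up name, might look like
the following. The printed times above are rounded, so recomputing from them
can differ in the last digit.

```c
#include <assert.h>

/* Convert a total run time in seconds and a loop count to microseconds per loop. */
static double usec_per_loop(double seconds, unsigned long loops)
{
    assert(loops > 0);
    return seconds * 1e6 / (double)loops;
}
```

Feeding it the first getuid() row (0.974s, 1088300 loops) and the first
getuid_msg() row (1.029s, 935400 loops) gives roughly 0.895 and 1.100, a gap
of about 0.2 microseconds.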

-Matt
Matthew Dillon 
[EMAIL PROTECTED]


Re: Assembly Syscall Question

2003-08-02 Thread Shawn
On Fri, 2003-08-01 at 02:59, Terry Lambert wrote:
> Personally, I like to look at the Linux register-based passing
> mechanism in the same light that they look at the FreeBSD use
> of the MMU hardware to assist VM, at the cost of increased
> FreeBSD VM system complexity (i.e. they think our VM is too
> convoluted, and we think their system calls are too convoluted).

Maybe, but they also support a lot of MMU-less architectures, so it may
have made things simpler for them to not depend on MMU. I wonder if NUMA
had any bearing on that as well...

-- 
Shawn [EMAIL PROTECTED]
http://drevil.warpcore.org/



Re: Assembly Syscall Question

2003-08-02 Thread Kip Macy
> Maybe, but they also support a lot of MMU-less architectures, so it may
> have made things simpler for them to not depend on MMU. I wonder if NUMA
> had any bearing on that as well...


No. The initial design of their VM greatly preceded NUMA and uClinux. It
actually makes the system less portable in that it can require interspersing
machine-dependent code in the machine-independent parts when the machine's page
table layout differs from the default. The introduction to the UVM thesis has
some good points in this regard.


-Kip



Re: Assembly Syscall Question

2003-08-02 Thread Shawn
On Sat, 2003-08-02 at 16:23, Kip Macy wrote:
> table layout differs from the default. The introduction to the UVM thesis has
> some good points in this regard.

UVM thesis?

(I am subscribed to list.)
-- 
Shawn [EMAIL PROTECTED]
http://drevil.warpcore.org/



Re: Assembly Syscall Question

2003-08-02 Thread Kip Macy
UVM was the VM system that replaced the old Mach style VM in NetBSD and (I
believe) OpenBSD. FreeBSD has already cleaned up a lot of the problems that
UVM addresses. However, there are still some things that could be done to 
make map passing easier, which I believe would make zero-copy support cleaner 
if not faster.

This is the UVM home page:
http://ccrc.wustl.edu/pub/chuck/tech/uvm/

-Kip



Re: Assembly Syscall Question

2003-08-02 Thread Matthew Dillon
It's a toss up.   Linux's use of registers for syscall args is not 
a panacea, because many system calls require more than just the basic
call-used registers, which means the Linux userland code has to save
and restore those registers on the stack anyway.  And we won't even
mention syscalls that require more arguments than you have registers
for.  Oops!

I think the ultimate performance solution is to have some explicitly
shared memory between kernelland and userland and store the arguments,
error code, and return value there.  Being a fairly small package of
memory, multi-threading would not be an issue as each thread would have
its own slot.

Of course, FreeBSD has other problems as well... if you look at our syscall
entry and exit code it is simply *horrendous*.  It goes through an
indirect jump and it has to check carry for error returns to set the
error code.  It is a real mess.

What I am doing in DragonFly is reimplementing system calls as messages.
It turns out that the overhead is not significantly greater (and I haven't
even tried optimizing the code yet).  With the system call as a message
the return code and error are stored in the message itself which means
that there are *NO* special cases and, at least for IA32, the kernel
messaging interface has just three fixed arguments which fit easily
in the call-used registers.  I've been able to get rid of proc->p_retval[],
a considerable amount of unnecessary proc/thread pointer passing,
and the mechanism is flexible enough to support both synchronous 
and asynchronous operation as well as both single and multi-threaded
operation using the same exact code.  Additionally, the mechanism can
be extended to support chained system calls (i.e. issue several system
calls at once), and transactional sequences.  Of course, I haven't 
actually done all of this yet... I have the messaging infrastructure in
place and I am working on asynchronizing I/O and timing system calls
(e.g. sleep(), read(), write(), select(), etc..), which is going to take
a lot of work, but once that is done the sky is the limit.
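
To illustrate the idea of carrying the result and error in the message
itself, here is a toy userland sketch. None of it is the actual DragonFly
interface: struct sysmsg, the opcodes, and the function names are all
hypothetical, and the "kernel" side is an ordinary function that completes
the message immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical message layout; not DragonFly's real structure. */
struct sysmsg {
    int  opcode;   /* which call is requested */
    long args[3];  /* call arguments */
    long result;   /* return value, filled in by the handler */
    int  error;    /* 0 on success, otherwise an errno value */
    int  done;     /* set once the handler has completed the message */
};

enum { OP_GETUID = 1, OP_ADD = 2 };  /* made-up opcodes for the demo */

/* Stand-in for the kernel side. This toy handler completes at once. */
static void msg_send_async(struct sysmsg *m)
{
    assert(m != NULL);
    m->result = 0;
    m->error = 0;
    switch (m->opcode) {
    case OP_GETUID:
        m->result = 1001;                    /* made-up uid */
        break;
    case OP_ADD:
        m->result = m->args[0] + m->args[1];
        break;
    default:
        m->error = ENOSYS;                   /* unknown request */
        break;
    }
    m->done = 1;
}

/* Wait for completion. The toy handler is synchronous, so nothing blocks. */
static void msg_wait(const struct sysmsg *m)
{
    assert(m != NULL && m->done);
}

/* Synchronous call: the asynchronous send followed by a wait. */
static int msg_call(struct sysmsg *m)
{
    msg_send_async(m);
    msg_wait(m);
    return m->error;
}
```

The caller reads m->result and m->error straight out of the message, so
there is no carry-flag convention and no separate return-value array to
consult. The same send routine serves both the synchronous and the
asynchronous path.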

-Matt

:On Fri, 2003-08-01 at 02:59, Terry Lambert wrote:
: Personally, I like to look at the Linux register-based passing
: mechanism in the same light that they look at the FreeBSD use
: of the MMU hardware to assist VM, at the cost of increased
: FreeBSD VM system complexity (i.e. they think our VM is too
: convoluted, and we think their system calls are too convoluted).
:
:Maybe, but they also support a lot of MMU-less architectures, so it may
:have made things simpler for them to not depend on MMU. I wonder if NUMA
:had any bearing on that as well...
:
:-- 
:Shawn [EMAIL PROTECTED]


Re: Assembly Syscall Question

2003-08-02 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Matthew Dillon [EMAIL PROTECTED] writes:
: Additionally, the mechanism can
: be extended to support chained system calls (i.e. issue several system
: calls at once), and transactional sequences.

VMS has done this for a long time. :-) All of their system calls were
asynchronous.  The synchronous versions are done in userland by
calling the async version plus a wait.

Warner


Re: Assembly Syscall Question

2003-08-02 Thread Terry Lambert
Matthew Dillon wrote:
> I think the ultimate performance solution is to have some explicitly
> shared memory between kernelland and userland and store the arguments,
> error code, and return value there.  Being a fairly small package of
> memory, multi-threading would not be an issue as each thread would have
> its own slot.

You need 8K for this, minimally: 4K that's RO user/RW kernel and
4K that's RW user/RW kernel.  You can use it for things like zero
system call getpid/getuid/etc..

It's also worth a page doubly mapped, with the second mapping with
the PG_G bit set on it (to make it RO visible to user space at the
same place in all programs) to hold the timecounter information;
the current timecounter implementation, with a scad of structures,
is both wasteful and unnecessary, given that pointer assigns are
atomic, so you can implement with only two, which only take a small
part of the page.  Doing this, you can use a pointer reference and
a structure assign, and a compare-pointer-afterwards to make a zero
system call gettimeofday() and other calls (consider the benefits
to Apache, Squid, and other programs that have hard logging with
timestamp requirements).
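
A rough sketch of that read sequence, under stated assumptions: struct
timesnap, struct timepage, and both function names are invented for
illustration, the page is an ordinary struct rather than a real shared
mapping, and the memory barriers a real SMP implementation would need are
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot the kernel would publish in the shared page. */
struct timesnap {
    long sec;
    long usec;
};

/* Two slots plus a pointer to whichever one is currently valid. */
struct timepage {
    struct timesnap slots[2];
    struct timesnap * volatile current;
};

/* Writer side: fill the idle slot, then publish it with one pointer store. */
static void timepage_publish(struct timepage *p, long sec, long usec)
{
    struct timesnap *next;

    assert(p != NULL);
    next = (p->current == &p->slots[0]) ? &p->slots[1] : &p->slots[0];
    next->sec = sec;
    next->usec = usec;
    p->current = next;
}

/* Reader side: pointer reference, structure assign, compare pointer afterwards. */
static struct timesnap timepage_read(const struct timepage *p)
{
    struct timesnap copy;
    const struct timesnap *seen;

    assert(p != NULL && p->current != NULL);
    do {
        seen = p->current;          /* pointer reference */
        copy = *seen;               /* structure assign */
    } while (seen != p->current);   /* retry if the writer switched slots */
    return copy;
}
```

The reader never enters the kernel. It only retries when the pointer moved
while the copy was in progress. With just two slots, a writer that updates
twice during one copy would go unnoticed by the pointer compare, so a real
version would have to bound the update rate or add a generation count.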

I've also been toying with the idea of putting environp ** in a COW
page, and dealing with changes as a fixup operation in the fault
handler (really, environp needs to die, to make way for logical name
tables; it persists because POSIX and SuS demand that it persist).

So, Matt... how does the modified message based system call
interface fare in a before-and-after with lmbench?  8-) 8-).

-- Terry


Re: Assembly Syscall Question

2003-08-01 Thread Terry Lambert
Ryan Sommers wrote:
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?

The stack is visible in both user space and kernel space; in
general, the register space won't be, unless you are on an
architecture with an abundance of registers that doesn't do a
save/restore on trap entries.

By pushing it onto the stack, you are *positive* that the value
is visible.

There is also the (small) possibility that the C compiler will
take advantage of the calling conventions to assume that a
value will not change over a system call.  Short of declaring
that all registers are volatile, you can't really guarantee
that the registers pushed in will have the values after the
call that they had before the call, unless you save and restore
all of them (which is more expensive than the copyin, for system
calls with 3 arguments or fewer -- which is most of them; cost,
of course, will vary by architecture).

Personally, I like to look at the Linux register-based passing
mechanism in the same light that they look at the FreeBSD use
of the MMU hardware to assist VM, at the cost of increased
FreeBSD VM system complexity (i.e. they think our VM is too
convoluted, and we think their system calls are too convoluted).

-- Terry


Assembly Syscall Question

2003-07-31 Thread Ryan Sommers
When making a system call to the kernel why is it necessary to push the 
syscall value onto the stack when you don't call another function? 

Example: 

access.the.bsd.kernel:
        int 80h
        ret

func:
        mov eax, 4      ; Write
        call access.the.bsd.kernel
        ; End

Works. However:

func:
        mov eax, 4      ; Write
        int 80h
        ; End

Doesn't.

Now, if you change it to:

func:
        mov eax, 4      ; Write
        push eax
        int 80h
        ; End

It does work. I was able to find, "By default, the FreeBSD kernel uses the C
calling convention. Further, although the kernel is accessed using int 80h,
it is assumed the program will call a function that issues int 80h, rather
than issuing int 80h directly," in the developer's handbook. But I can't
figure out why the second example doesn't work. Is the call instruction
pushing the value onto the stack in addition to pushing the instruction
pointer on?

Thank you in advance.
PS I'm not on the list. 



--
Ryan leadZERO Sommers
Gamer's Impact President
[EMAIL PROTECTED]
ICQ: 1019590
AIM/MSN: leadZERO 

-= http://www.gamersimpact.com =- 


Re: Assembly Syscall Question

2003-07-31 Thread Marco van de Voort
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?

You have to push something the size of an address, because the kernel
expects the return address that a call instruction pushes to be on the
stack. The 1st and 3rd examples both push 4 bytes, so they work.

_Why_ this is needed is probably because the routine that does the int 80h
can also check and process the int 80h return value and error code, and people
wanted to avoid duplicating that code. At least that is my guess :-)



Re: Assembly Syscall Question

2003-07-31 Thread Ruslan Ermilov
On Thu, Jul 31, 2003 at 04:12:27PM -0400, Ryan Sommers wrote:
> When making a system call to the kernel why is it necessary to push the
> syscall value onto the stack when you don't call another function?
>
> Example:
>
> access.the.bsd.kernel:
>         int 80h
>         ret
>
> func:
>         mov eax, 4      ; Write
>         call access.the.bsd.kernel
>         ; End
>
> Works. However:
>
> func:
>         mov eax, 4      ; Write
>         int 80h
>         ; End
>
> Doesn't.
>
 
This is because in the C library, all system calls are wrapped in
C functions, so the stack looks like this when in the syscall
code in libc:

        return address to a program
        syscall args

So the kernel knows to skip over a return address to reach the
actual arguments.

So when calling the kernel directly (not through a C library
wrapper function), we need to adjust the stack to make the kernel
believe we're calling it from the syscall code in libc.


Cheers,
-- 
Ruslan Ermilov  Sysadmin and DBA,
[EMAIL PROTECTED]   Sunbay Software Ltd,
[EMAIL PROTECTED]   FreeBSD committer

