Re: Fast syscalls via sysenter

2012-10-14 Thread Daniil Cherednik

On 06/23/2012 08:58 PM, Konstantin Belousov wrote:

On Sat, Jun 23, 2012 at 02:17:53PM +0800, David Xu wrote:

On 2012/06/21 20:11, John Baldwin wrote:

On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:

Hi!

I am trying to continue the work started by David Xu on the implementation of
fast syscalls via sysenter/sysexit.
http://people.freebsd.org/~davidxu/sysenter/kernel/
I have ported it to FreeBSD 9. It looks like it works. Unfortunately I am a
beginner in the kernel, so I have some questions:

1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
/*
 * If %edx was changed, we can not use sysexit, because it
 * needs %edx to restore userland %eip.
 */
if (orig_edx != frame.tf_edx)
    td->td_pcb->pcb_flags |= PCB_FULLCTX;

What is the reason why we have to do this additional check? In
http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
we store %edx to the stack in
pushl %edx  /* ring 3 next %eip */
and we restore the register in
popl  %edx  /* ring 3 %eip */

Some system calls return two return values (pipe(2)) or return a 64-bit
off_t (lseek(2)).  Those system calls change %edx's value and need that
changed value to make it out to userland.


2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
movl  PCPU(CURPCB),%esi
call  syscall

Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just a
C function.

No clue on this one, looks like it is not needed.


[kib@ is cc'ed]
I implemented the sysenter syscall a long time ago; it can indeed reduce
system call overhead on i386. I think it might be time to implement a
Linux-like vdso syscall now, based on the work kib@ has recently done,
though I don't know how to hook it into kib's code.
I quickly googled it and found that they put some data into the aux vector:
http://www.trilithium.com/johan/2005/08/linux-gate/
http://www.takatan.net/lxr/source/arch/um/os-Linux/elf_aux.c?a=x86_64#L40

Yes, the intent is to eventually switch to a VDSO from the current situation, where
libc is aware of the shared page content. This was extensively discussed in the
flame that resulted in me writing the current gettimeofday(2) patch.
It was on arch@ several weeks ago, AFAIR.

The committed gettimeofday() code structure allows for VDSO interposing without
breaking normal symbol visibility rules.

I do not see any sense in implementing syscall or sysenter support for
the i386 kernel. On the other hand, using syscall for 32-bit binaries on amd64
looks reasonable.

Sorry, I was not able to write for some time.
So, what about implementing vdso now? I know there was a patch and a feature
request:
http://lists.freebsd.org/pipermail/freebsd-bugs/2010-April/039597.html


About sysenter: I have ported the sysenter patch to 9.0-RELEASE-p4, and it
looks fine. I made some fixes in SYS.h. The reason is (if I understand
it right) that we have to get an ELF without DT_TEXTREL in ld-elf.so.

You can find the patch here:
https://redmine.sportcomitet.org/projects/dev-freebsd/repository/revisions/master/raw/sysenter.patch
https://redmine.sportcomitet.org/projects/dev-freebsd/repository/revisions/master/raw/sys/i386/i386/sysenter.s

But now this patch breaks compatibility with the i386 Xen PV kernel. I
wanted to fix it, but without VDSO it would be a limited solution. That is
one of the reasons why I am interested in the vdso status.


So, about using it for 32-bit binaries on amd64: it is reasonable. But if we
do that, I think we have to implement vdso support in the i386 kernel too,
for compatibility, and it is better to implement sysenter too.







___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Fast syscalls via sysenter

2012-06-23 Thread David Xu

On 2012/06/21 20:11, John Baldwin wrote:

On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:

Hi!

I am trying to continue the work started by David Xu on the implementation of
fast syscalls via sysenter/sysexit.
http://people.freebsd.org/~davidxu/sysenter/kernel/
I have ported it to FreeBSD 9. It looks like it works. Unfortunately I am a
beginner in the kernel, so I have some questions:

1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
/*
 * If %edx was changed, we can not use sysexit, because it
 * needs %edx to restore userland %eip.
 */
if (orig_edx != frame.tf_edx)
    td->td_pcb->pcb_flags |= PCB_FULLCTX;

What is the reason why we have to do this additional check? In
http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
we store %edx to the stack in
pushl %edx  /* ring 3 next %eip */
and we restore the register in
popl  %edx  /* ring 3 %eip */

Some system calls return two return values (pipe(2)) or return a 64-bit
off_t (lseek(2)).  Those system calls change %edx's value and need that
changed value to make it out to userland.


2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
movl  PCPU(CURPCB),%esi
call  syscall

Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just a
C function.

No clue on this one, looks like it is not needed.


[kib@ is cc'ed]
I implemented the sysenter syscall a long time ago; it can indeed reduce
system call overhead on i386. I think it might be time to implement a
Linux-like vdso syscall now, based on the work kib@ has recently done,
though I don't know how to hook it into kib's code.
I quickly googled it and found that they put some data into the aux vector:
http://www.trilithium.com/johan/2005/08/linux-gate/
http://www.takatan.net/lxr/source/arch/um/os-Linux/elf_aux.c?a=x86_64#L40
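
For illustration, the aux-vector idea mentioned above can be sketched in C as a lookup over (type, value) pairs terminated by an AT_NULL entry. The tag numbers below are the Linux values for AT_SYSINFO and AT_SYSINFO_EHDR, redefined locally. The struct and function names (sk_auxv, sk_aux_lookup) are hypothetical, and nothing here reflects how FreeBSD would actually wire this up.

```c
/* Linux aux-vector tags, redefined locally so the sketch is self-contained. */
#define SK_AT_NULL          0   /* end of vector */
#define SK_AT_SYSINFO       32  /* kernel-provided syscall entry point */
#define SK_AT_SYSINFO_EHDR  33  /* address of the vDSO ELF header */

/* Hypothetical stand-in for an ELF auxiliary vector entry. */
struct sk_auxv {
    unsigned long type;
    unsigned long val;
};

/* Return the value stored for `type`, or 0 if the vector has no such entry. */
unsigned long
sk_aux_lookup(const struct sk_auxv *av, unsigned long type)
{
    for (; av->type != SK_AT_NULL; av++) {
        if (av->type == type)
            return (av->val);
    }
    return (0);
}
```

As I understand the linked article, a libc would perform such a lookup once at startup and then call through the returned address instead of hard-coding the trap instruction.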

Regards,
David Xu



Re: Fast syscalls via sysenter

2012-06-23 Thread Konstantin Belousov
On Sat, Jun 23, 2012 at 02:17:53PM +0800, David Xu wrote:
 On 2012/06/21 20:11, John Baldwin wrote:
 On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:
 Hi!
 
 I am trying to continue the work started by David Xu on the implementation of
 fast syscalls via sysenter/sysexit.
 http://people.freebsd.org/~davidxu/sysenter/kernel/
 I have ported it to FreeBSD 9. It looks like it works. Unfortunately I am a
 beginner in the kernel, so I have some questions:
 
 1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
 /*
  * If %edx was changed, we can not use sysexit, because it
  * needs %edx to restore userland %eip.
  */
 if (orig_edx != frame.tf_edx)
     td->td_pcb->pcb_flags |= PCB_FULLCTX;
 
 What is the reason why we have to do this additional check? In
 http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
 we store %edx to the stack in
 pushl %edx  /* ring 3 next %eip */
 and we restore the register in
 popl  %edx  /* ring 3 %eip */
 Some system calls return two return values (pipe(2)) or return a 64-bit
 off_t (lseek(2)).  Those system calls change %edx's value and need that
 changed value to make it out to userland.
 
 2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
 movl  PCPU(CURPCB),%esi
 call  syscall
 
 Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just a
 C function.
 No clue on this one, looks like it is not needed.
 
 [kib@ is cc'ed]
 I implemented the sysenter syscall a long time ago; it can indeed reduce
 system call overhead on i386. I think it might be time to implement a
 Linux-like vdso syscall now, based on the work kib@ has recently done,
 though I don't know how to hook it into kib's code.
 I quickly googled it and found that they put some data into the aux vector:
 http://www.trilithium.com/johan/2005/08/linux-gate/
 http://www.takatan.net/lxr/source/arch/um/os-Linux/elf_aux.c?a=x86_64#L40

Yes, the intent is to eventually switch to a VDSO from the current situation, where
libc is aware of the shared page content. This was extensively discussed in the
flame that resulted in me writing the current gettimeofday(2) patch.
It was on arch@ several weeks ago, AFAIR.

The committed gettimeofday() code structure allows for VDSO interposing without
breaking normal symbol visibility rules.

I do not see any sense in implementing syscall or sysenter support for
the i386 kernel. On the other hand, using syscall for 32-bit binaries on amd64
looks reasonable.




Re: Fast syscalls via sysenter

2012-06-21 Thread John Baldwin
On Monday, June 18, 2012 2:56:30 pm Daniil Cherednik wrote:
 Hi!
 
 I am trying to continue the work started by David Xu on the implementation of
 fast syscalls via sysenter/sysexit.
 http://people.freebsd.org/~davidxu/sysenter/kernel/
 I have ported it to FreeBSD 9. It looks like it works. Unfortunately I am a
 beginner in the kernel, so I have some questions:
 
 1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
 /*
  * If %edx was changed, we can not use sysexit, because it
  * needs %edx to restore userland %eip.
  */
 if (orig_edx != frame.tf_edx)
     td->td_pcb->pcb_flags |= PCB_FULLCTX;
 
 What is the reason why we have to do this additional check? In 
 http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s 
 we store %edx to the stack in
 pushl %edx  /* ring 3 next %eip */
 and we restore the register in
 popl  %edx  /* ring 3 %eip */

Some system calls return two return values (pipe(2)) or return a 64-bit
off_t (lseek(2)).  Those system calls change %edx's value and need that
changed value to make it out to userland.
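
To make the register convention concrete, here is a minimal C sketch of how a 64-bit result is split across %eax and %edx, and why a changed %edx rules out sysexit. The struct and function names (ret_regs, split_off64, join_off64, needs_full_context) are hypothetical and do not come from the patch; only the comparison mirrors the quoted check.

```c
#include <stdint.h>

/* Hypothetical model of the two i386 return registers; not kernel code. */
struct ret_regs {
    uint32_t eax;   /* low 32 bits, or first return value */
    uint32_t edx;   /* high 32 bits, or second return value */
};

/* Split a 64-bit result the way an off_t from lseek(2) is returned. */
struct ret_regs
split_off64(int64_t off)
{
    struct ret_regs r;

    r.eax = (uint32_t)((uint64_t)off & 0xffffffffu);
    r.edx = (uint32_t)((uint64_t)off >> 32);
    return (r);
}

/* Rejoin the two halves into the 64-bit value userland expects. */
int64_t
join_off64(struct ret_regs r)
{
    return ((int64_t)(((uint64_t)r.edx << 32) | r.eax));
}

/*
 * Same comparison as the quoted check: sysexit loads the user %eip from
 * %edx, so a syscall that wrote a result into %edx cannot return that way.
 */
int
needs_full_context(uint32_t orig_edx, uint32_t new_edx)
{
    return (orig_edx != new_edx);
}
```

For an offset of 0x100000005, split_off64 yields edx = 1 and eax = 5, so the saved user %eip and the new %edx differ and the check fires.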

 2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
 movl  PCPU(CURPCB),%esi
 call  syscall
 
 Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just a
 C function.

No clue on this one, looks like it is not needed.

-- 
John Baldwin


Fast syscalls via sysenter

2012-06-18 Thread Daniil Cherednik
Hi!

I am trying to continue the work started by David Xu on the implementation of
fast syscalls via sysenter/sysexit.
http://people.freebsd.org/~davidxu/sysenter/kernel/
I have ported it to FreeBSD 9. It looks like it works. Unfortunately I am a
beginner in the kernel, so I have some questions:

1. see http://people.freebsd.org/~davidxu/sysenter/kernel/kernel.patch
/*
 * If %edx was changed, we can not use sysexit, because it
 * needs %edx to restore userland %eip.
 */
if (orig_edx != frame.tf_edx)
    td->td_pcb->pcb_flags |= PCB_FULLCTX;

What is the reason why we have to do this additional check? In 
http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s 
we store %edx to the stack in
pushl %edx  /* ring 3 next %eip */
and we restore the register in
popl  %edx  /* ring 3 %eip */

2. see http://people.freebsd.org/~davidxu/sysenter/kernel/sysenter.s
movl  PCPU(CURPCB),%esi
call  syscall

Why do we movl PCPU(CURPCB),%esi before calling syscall? syscall is just a
C function.


-- 
Daniil Cherednik
