Re: + elf-loader-crash-while-zero-filling-bss.patch added to -mm tree
On Wed, Feb 13, 2008 at 12:15:06AM -0800, [EMAIL PROTECTED] wrote:
> Subject: Elf loader crash while zero-filling .bss
> From: "Abel Bernabeu" <[EMAIL PROTECTED]>
>
> I've finally found a solution for the crash in load_binary_elf I
> reported last week:
>
> http://lkml.org/lkml/2008/1/30/171
>
> The attached patch solves my problem.
>
> set_brk(start, end) allocs just page aligned regions (by "collapsing" both
> extremes to the start of the page in which they lay)... That means than
> even if both pointers are not equal there are still some chances that
> set_brk has allocated no space at all because ELF_PAGEALIGN(elf_bss) ==
> ELF_PAGEALIGN(elf_brk).
>
> So the condition was not correct.

This patch is wrong. ELF_PAGEALIGN rounds up to the end of the page, not
down to the start of the page. If elf_bss is in the middle of a page,
set_brk allocates any additional pages after the one already allocated.

elf_bss is the start of the area that needs to be zero initialized,
elf_brk is its end. So if elf_bss != elf_brk then there's garbage mapped
in BSS from the file and if you don't clear it some of your
zero-initialized variables won't be zero initialized at all.

In the linked message, set_brk is passed elf_bss so its actual arguments
are set_brk (0xa3801, 0x000a4ec8). It should map one page. 0xa3801
should be an already mapped page, and clear_user should succeed in
clearing it.

--
Daniel Jacobowitz
CodeSourcery
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

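The rounding argument above can be checked with a few lines of userland C. This is only a sketch: it assumes 4 KiB pages, and `page_align_up`, `pages_mapped` and `tail_to_clear` are names made up for the illustration, not kernel functions. It mimics the round-up behaviour described for ELF_PAGEALIGN and applies it to the addresses from the linked report.

```c
#include <assert.h>

/* Assumption: 4 KiB pages. The kernel derives the real value per architecture. */
#define PAGE_SZ 4096UL

/* Round an address up to the next page boundary, as ELF_PAGEALIGN is described above. */
unsigned long page_align_up(unsigned long v)
{
	return (v + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
}

/* Number of new pages a set_brk(start, end)-style call would map. */
unsigned long pages_mapped(unsigned long start, unsigned long end)
{
	unsigned long lo, hi;

	assert(start <= end);
	lo = page_align_up(start);
	hi = page_align_up(end);
	return hi > lo ? (hi - lo) / PAGE_SZ : 0;
}

/* Bytes from elf_bss to the end of the page it sits in; that page is
 * already mapped from the file, so these bytes still need zeroing. */
unsigned long tail_to_clear(unsigned long elf_bss)
{
	return page_align_up(elf_bss) - elf_bss;
}
```

With the reported values, `pages_mapped(0xa3801, 0xa4ec8)` covers the single page starting at 0xa4000, and `tail_to_clear(0xa3801)` is the remainder of the page that the file mapping already provides.
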
Re: PAGE_SIZE Availability Inconsistency
On Thu, Mar 08, 2007 at 04:08:52PM +, Christoph Hellwig wrote:
> No, no no. We should never export PAGE_SIZE. We might export NBPG
> as deprecated symbol for gdb if it really needs it, but that should
> happen only on a.out systems, and it it should be a true constant,
> not depending on PAGE_SIZE.
>
> I've Cc'ed the gdb list on whether they have any comments on this
> issue.

Sounds reasonable. I do not believe that GDB has any dependence on
PAGE_SIZE; bfd (i.e. both gdb and binutils) use NBPG on a large number
of systems. Looks like i386, alpha, m68k, s390, vax - but don't quote me
on that, I had to guess from the configure script.

--
Daniel Jacobowitz
CodeSourcery

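For context on why a compile-time PAGE_SIZE is not something userland needs from the kernel headers: a program can ask for the page size at run time. The sketch below is not from the thread; `runtime_page_size` is a hypothetical helper name, and it relies only on the POSIX `sysconf(_SC_PAGESIZE)` call.

```c
#include <assert.h>
#include <unistd.h>

/* Ask the running system for its page size instead of baking in a constant. */
long runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	/* sysconf() returns -1 on error; treat that as a programming error here. */
	assert(sz > 0);
	return sz;
}
```

A binary built this way keeps working on kernels configured with a different page size, which a constant copied from a header cannot guarantee.
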
Re: Contents of core dumps
On Tue, Jan 02, 2007 at 08:57:21PM -0800, David Miller wrote:
> So I'd say we should just put this change in, as-is. It fixes bugs,
> and in all the time that has passed since my initial posting there
> has not been any serious dissent.

Fine with me. In that case, I will wait until the kernel is fixed, verify
it, and then probably adjust the GDB test to pass on either patched or
unpatched kernels.

--
Daniel Jacobowitz
CodeSourcery

Contents of core dumps (was: Re: fs/binfmt_elf.c:maydump())
[Please CC, I am not subscribed to lkml.]

On Thu, Apr 06, 2006 at 10:18:07PM -0700, David S. Miller wrote:
> How about something like the following patch? If it's executable
> and not written to, skip it. This would skip the main executable
> image and all text segments of the shared libraries mapped in.

I've been going through GDB test failures (... again...) and I'm down to
a respectably small number on x86_64, but this is one of the remaining
ones. I don't suppose there's been any change since we discussed this in
April?

A refresher for those following along: there's a GDB test that mmaps a
file using MAP_PRIVATE and PROT_WRITE. It expects the contents to end up
in the core dump. Right now, they don't. I can fix the test by making
sure it writes to the mapping, but before I change the test, I want to
raise the question of what _should_ be in a core dump.

I took a peek at what Solaris includes in core dumps. They offer (not
surprisingly) a pile of configuration options. The default is just about
everything except for file-backed shared memory and some symbol table
data - it includes text segments, rodata, anonymous shared memory, file
backed mappings, et cetera. I guess that's another argument in favor of
dumping more. Then you can control it globally, per process, et cetera.

http://src.opensolaris.org/source/xref/loficc/crypto/usr/src/uts/common/sys/corectl.h

I also checked an AIX manual since there was a reference to SA_FULLDUMP
in the GDB test:

  By default, the user data, anonymously mapped regions, and vm_infox
  structures are not included in a core dump. This partial core dump
  includes the current thread stack, the thread thrdctx structures, the
  user structure, and the state of the registers at the time of the
  fault. A partial core dump contains sufficient information for a stack
  traceback. The size of a core dump can also be limited by the
  setrlimit or setrlimit64 subroutine.

  To enable a full core dump, set the SA_FULLDUMP flag in the sigaction
  subroutine for the signal that is to generate a full core dump. If
  this flag is set when the core is dumped, the user data section,
  vm_infox, and anonymously mapped region structures are included in the
  core dump.

Not really sure what that translates to, but it's less than what Solaris
dumps, I think.

Does Linux need knobs for this?

>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 537893a..9ec5c2b 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -1167,8 +1167,10 @@ static int maydump(struct vm_area_struct
>  	if (vma->vm_flags & VM_SHARED)
>  		return vma->vm_file->f_dentry->d_inode->i_nlink == 0;
>
> -	/* If it hasn't been written to, don't write it out */
> -	if (!vma->anon_vma)
> +	/* If it is executable and hasn't been written to,
> +	 * don't write it out.
> +	 */
> +	if ((vma->vm_flags & VM_EXEC) && !vma->anon_vma)
>  		return 0;
>
>  	return 1;
>
>

--
Daniel Jacobowitz
CodeSourcery

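To make the effect of the quoted hunk concrete, here is a small userland model of the two decisions. It is a sketch under loud assumptions: `struct vma_model`, the `M_*` flags and both function names are invented for the illustration, and only the checks visible in the hunk are modelled (the real maydump() has more conditions above them).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; not the kernel's VM_* definitions. */
#define M_SHARED 0x1u
#define M_EXEC   0x2u

/* Stand-in for the vm_area_struct fields the hunk looks at. */
struct vma_model {
	unsigned int flags;      /* M_SHARED and/or M_EXEC */
	bool has_anon_pages;     /* models vma->anon_vma != NULL, i.e. written to */
	unsigned int file_nlink; /* models the backing inode's i_nlink */
};

/* Before the patch: any private mapping that was never written to is skipped. */
bool maydump_before(const struct vma_model *v)
{
	assert(v != NULL);
	if (v->flags & M_SHARED)
		return v->file_nlink == 0;
	if (!v->has_anon_pages)
		return false;
	return true;
}

/* With the patch: only unwritten *executable* mappings are skipped. */
bool maydump_patched(const struct vma_model *v)
{
	assert(v != NULL);
	if (v->flags & M_SHARED)
		return v->file_nlink == 0;
	if ((v->flags & M_EXEC) && !v->has_anon_pages)
		return false;
	return true;
}
```

The GDB test case corresponds to a private, non-executable file mapping that has not been written: the model skips it before the patch and dumps it afterwards, while an untouched text segment is skipped in both versions.
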
[PATCH] Re: [Bug 7210] New: Clone flag CLONE_PARENT_TIDPTR leaves invalid results in memory.
From: Daniel Jacobowitz <[EMAIL PROTECTED]>

Do not implement CLONE_PARENT_SETTID until we know that clone will
succeed. If we do it too early NPTL's data structures temporarily
reference a non-existent TID.

Signed-off-by: Daniel Jacobowitz <[EMAIL PROTECTED]>

---

On Tue, Sep 26, 2006 at 08:59:15PM -0700, Linus Torvalds wrote:
>
>
> On Tue, 26 Sep 2006, Roland McGrath wrote:
> >
> > It can go last, right before return, after unlock.
> > Userland only cares that parent_tidptr set before parent syscall returns,
> > and child_tidptr set before child returns.
>
> Ok, as long as people are sure, I don't care. Then we have to just ignore
> the error, though, since we can't recover (we've already "exposed" the
> child on the task lists).
>
> I don't think it's a big deal. Ignoring the error just means that if you
> pass in an invalid ptr, it's as if the bit to set that value wasn't set.
> Not a problem.
>
> Especially if there is a test-program, can we just have a patch to try
> that has been verified? It _sounded_ like somebody actually had a program
> that could trigger this with some horrid code that sent signals and cloned
> all the time?

I never got back to you about this...

Refresher, if there isn't enough above: CLONE_PARENT_SETTID is currently
implemented right after a TID is assigned. There's a lot of clone left
to go at that point including a check for pending signals which can lead
to clone failing. This leaves a TID in NPTL's thread list which doesn't
correspond to a thread.

I found Sunday another place where this is a problem, besides the
process-global UID stuff in glibc. GDB tries to attach to the
nonexistent thread and gets upset. I've made it cope, but at the same
time it provides a convenient test case. Without the attached patch,
tls.exp in the GDB testsuite would intermittently report that it could
not attach to a thread - always within half an hour. With the patch it
ran for four hours without a problem.

 kernel/fork.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

Index: linux-source-2.6.18/kernel/fork.c
===
--- linux-source-2.6.18.orig/kernel/fork.c	2007-01-02 13:45:28.0 -0500
+++ linux-source-2.6.18/kernel/fork.c	2007-01-02 13:52:09.0 -0500
@@ -1012,10 +1012,6 @@ static struct task_struct *copy_process(
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
 	copy_flags(clone_flags, p);
 	p->pid = pid;
-	retval = -EFAULT;
-	if (clone_flags & CLONE_PARENT_SETTID)
-		if (put_user(p->pid, parent_tidptr))
-			goto bad_fork_cleanup_delays_binfmt;

 	INIT_LIST_HEAD(&p->children);
 	INIT_LIST_HEAD(&p->sibling);
@@ -1251,6 +1247,14 @@ static struct task_struct *copy_process(
 	total_forks++;
 	spin_unlock(&current->sighand->siglock);
 	write_unlock_irq(&tasklist_lock);
+
+	/*
+	 * Now that we know the fork has succeeded, record the new
+	 * TID. It's too late to back out if this fails.
+	 */
+	if (clone_flags & CLONE_PARENT_SETTID)
+		put_user(p->pid, parent_tidptr);
+
 	proc_fork_connector(p);
 	return p;

@@ -1281,7 +1285,6 @@ bad_fork_cleanup_policy:
 bad_fork_cleanup_cpuset:
 #endif
 	cpuset_exit(p);
-bad_fork_cleanup_delays_binfmt:
 	delayacct_tsk_free(p);
 	if (p->binfmt)
 		module_put(p->binfmt->module);

--
Daniel Jacobowitz
CodeSourcery

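The ordering problem the patch addresses can be shown with a toy model. Everything here is an assumption made for illustration: `struct clone_model`, `NO_TID` and the two functions are invented, and the "pending signal" flag merely stands in for any failure that happens after the TID has been chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_TID 0   /* sentinel: the parent_tid slot has not been written */

/* Toy state for a copy_process()-like routine. */
struct clone_model {
	int next_tid;        /* TID that would be handed out next */
	bool signal_pending; /* forces the late failure path */
};

/* Old ordering: publish the TID as soon as it is assigned. */
int clone_early_store(struct clone_model *m, int *parent_tid)
{
	int tid;

	assert(m != NULL && parent_tid != NULL);
	tid = m->next_tid++;
	*parent_tid = tid;         /* userland can already see this value */
	if (m->signal_pending)
		return -1;         /* clone fails, the stored TID is stale */
	return tid;
}

/* Patched ordering: publish only once nothing can fail any more. */
int clone_late_store(struct clone_model *m, int *parent_tid)
{
	int tid;

	assert(m != NULL && parent_tid != NULL);
	tid = m->next_tid++;
	if (m->signal_pending)
		return -1;         /* nothing was written to *parent_tid */
	*parent_tid = tid;
	return tid;
}
```

In the early-store variant a failed call leaves a TID in the slot that no thread owns, which is the state GDB tripped over; in the late-store variant the slot is untouched on failure.
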
Re: 2.6.13 SMP on AMD Athlon64 X2 + FC4: PS/2 keyboard b0rken; taskset/sched_setaffinity() saves the day!
On Tue, Sep 06, 2005 at 11:10:29PM +0200, Frank van Maarseveen wrote:
> While playing with a new AMD Athlon64 X2 3800+ (i386) the keyboard goes
> wild for 10 (20?) seconds, behaves normally for 10 (20?) seconds, and
> then goes wild again: when "wild", every keypress results in a random
> number of repeats, e.g.:
>
> $ pppsss aaxxxuuu
> bash: pppsss: command not found
> $
> $
> $
> $
> $
> $
> $
> $
>
> Upgrading Xorg to xorg-x11-6.8.2-37.FC4.45 did not help.
>
> Booting with "nosmp" seems to fix it. And this _seems_ to fix it too:
>
> taskset -p 1 `ps axo comm,pid|awk '$1=="X"{print $2}'`
>
> I haven't seen this problem on the console.

This is probably the same problem as the earlier one you reported. If
you take a look at bugzilla, you'll see that the normal manifestation is
messed up key repeat rates...

--
Daniel Jacobowitz
CodeSourcery, LLC

Re: 2.6.13 SMP on Athlon X2: nanosleep returning waay to soon, clock_gettime(CLOCK_REALTIME...) proceeding too fast
On Sun, Sep 04, 2005 at 01:39:15PM +0200, Frank van Maarseveen wrote:
> After replacing the kernel on a fresh FC4 install with a stock 2.6.13
> (using gcc 3.2) and my own config it appears that the clock is going too
> fast: it gains at least an hour every 12 hours or so. FC4 kernel (rpm:
> kernel-2.6.11-1.1369_FC4) seems ok

Mind sticking this information in bugzilla.kernel.org, bug 5105?

> annotated output:
>
>      CPU0   CPU1   Total
> ---
>  1      0 +  251 =  251
>  2      0 +  251 =  251
>  3      0 +  251 =  251
>  4      0 +  251 =  251
>  5      0 +  251 =  251
>  6     52 +  196 =  248  <== (?)
>  7    251 +    0 =  251
>  8    251 +    0 =  251
>  9    251 +    0 =  251
> 10    251 +    0 =  251
> 11    251 +    0 =  251
> 12    251 +    0 =  251
> 13    251 +    0 =  251
> 14    251 +    0 =  251
> 15    251 +    0 =  251
> 16    147 +    1 =  148  <==
> 17      0 +  252 =  252

Hmm, very interesting.

--
Daniel Jacobowitz
CodeSourcery, LLC

Fix 32-bit thread debugging on x86_64
The IA32 ptrace emulation currently returns the wrong registers for
fs/gs; it's returning what x86_64 calls gs_base. We need regs.gsindex in
order for GDB to correctly locate the TLS area. Without this patch, the
32-bit GDB testsuite bombs on a 64-bit kernel. With it, results look
about like I'd expect, although there are still a handful of
kernel-related failures (vsyscall related?).

Signed-off-by: Daniel Jacobowitz <[EMAIL PROTECTED]>

diff -r -p -u z/linux-2.6.11/arch/x86_64/ia32/ptrace32.c linux-2.6.11/arch/x86_64/ia32/ptrace32.c
--- linux-2.6.12.3.orig/arch/x86_64/ia32/ptrace32.c	2005-03-02 02:37:52.0 -0500
+++ linux-2.6.12.3/arch/x86_64/ia32/ptrace32.c	2005-07-31 15:29:48.0 -0400
@@ -43,11 +43,11 @@ static int putreg32(struct task_struct *
 	switch (regno) {
 	case offsetof(struct user32, regs.fs):
 		if (val && (val & 3) != 3) return -EIO;
-		child->thread.fs = val & 0x;
+		child->thread.fsindex = val & 0x;
 		break;
 	case offsetof(struct user32, regs.gs):
 		if (val && (val & 3) != 3) return -EIO;
-		child->thread.gs = val & 0x;
+		child->thread.gsindex = val & 0x;
 		break;
 	case offsetof(struct user32, regs.ds):
 		if (val && (val & 3) != 3) return -EIO;
@@ -138,10 +138,10 @@ static int getreg32(struct task_struct *

 	switch (regno) {
 	case offsetof(struct user32, regs.fs):
-		*val = child->thread.fs;
+		*val = child->thread.fsindex;
 		break;
 	case offsetof(struct user32, regs.gs):
-		*val = child->thread.gs;
+		*val = child->thread.gsindex;
 		break;
 	case offsetof(struct user32, regs.ds):
 		*val = child->thread.ds;

--
Daniel Jacobowitz
CodeSourcery, LLC

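The distinction between a segment base and a segment selector is easy to lose in the diff, so here is a userland sketch of it. `struct thread_model`, `selector_ok`, `put_gs` and `get_gs` are hypothetical names for this illustration. The 16-bit mask is an assumption based on x86 selectors being 16 bits wide; the constant in the archived patch is truncated and is not reproduced here as the patch's value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the thread fields the patch touches. */
struct thread_model {
	uint64_t fs;       /* segment base address */
	uint64_t gs;       /* segment base address (the "gs_base" value) */
	uint16_t fsindex;  /* segment selector, what a 32-bit task sees as %fs */
	uint16_t gsindex;  /* segment selector, what a 32-bit task sees as %gs */
};

/* Same shape as the check in putreg32(): zero, or a selector with RPL 3. */
int selector_ok(uint32_t val)
{
	return val == 0 || (val & 3) == 3;
}

/* Store a %gs selector into the index field, as the patched code does. */
int put_gs(struct thread_model *t, uint32_t val)
{
	assert(t != NULL);
	if (!selector_ok(val))
		return -1;                      /* the kernel returns -EIO here */
	t->gsindex = (uint16_t)(val & 0xffff);  /* assumed 16-bit selector mask */
	return 0;
}

/* Report the selector, not the base, to the debugger. */
uint32_t get_gs(const struct thread_model *t)
{
	assert(t != NULL);
	return t->gsindex;
}
```

A debugger that reads %gs through this path gets a selector it can resolve through the descriptor tables to find the TLS area, regardless of what the base field holds.
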
Re: [PATCH] x86-64: ptrace ia32 BP fix
On Tue, Jul 05, 2005 at 02:31:15AM -0700, Roland McGrath wrote:
>
> When the 32-bit vDSO is used to make a system call, the %ebp register for
> the 6th syscall arg has to be loaded from the user stack (where it's pushed
> by the vDSO user code). The native i386 kernel always does this before
> stopping for syscall tracing, so %ebp can be seen and modified via ptrace
> to access the 6th syscall argument. The x86-64 kernel fails to do this,
> presenting the stack address to ptrace instead. This makes the %rbp value
> seen by 64-bit ptrace of a 32-bit process, and the %ebp value seen by a
> 32-bit caller of ptrace, both differ from the native i386 behavior.
>
> This patch fixes the problem by putting the word loaded from the user stack
> into %rbp before calling syscall_trace_enter, and reloading the 6th syscall
> argument from there afterwards (so ptrace can change it). This makes the
> behavior match that of i386 kernels.

Wouldn't this botch a debugger which supported both backtracing and
PTRACE_SYSCALL, when stopped in a syscall? We have unwind information
for the VDSO and it's not going to tell us that the kernel has done
something clever to the value of %ebp.

--
Daniel Jacobowitz
CodeSourcery, LLC

Re: strange incremental patch size [2.6.12-rc2 to 2.6.12-rc3]
On Thu, Apr 21, 2005 at 12:32:59PM +0200, Maciej Soltysiak wrote:
> Hi,
>
> These are the sizes of rc2 and rc3 patches
>
> # ls -la patch-2.6.12*
> -rw-r--r-- 1 root src 18011382 Apr 4 18:50 patch-2.6.12-rc2
> -rw-r--r-- 1 root src 19979854 Apr 21 02:29 patch-2.6.12-rc3
>
> Let us make an incremental patch from rc2 to rc3
>
> # interdiff patch-2.6.12-rc2 patch-2.6.12-rc3 >x
>
> Let us see how big it is.
> # ls -ld x
> -rw-r--r-- 1 root src 37421924 Apr 21 12:28 x
>
> How come interdiff from rc2 (18MB) to rc3 (20MB) gave me
> 37MB worth of patch-code ? I would expect something about
> 2MB but 40MB ?

Try interdiff -p1?

--
Daniel Jacobowitz
CodeSourcery, LLC

Re: [PATCH x86_64] Live Patching Function on 2.6.11.7
On Mon, Apr 18, 2005 at 01:19:57PM +0900, Takashi Ikebe wrote:
> GDB based approach seems not fit to our requirements. GDB(ptrace) based
> functions are basically need to be done when target process is stopping.
> In addition to that current PTRACE_PEEK/POKE* allows us to copy only a
> *word* size...

While true, this is easily fixable. There is even an interface
precedent on OpenBSD (and possibly other platforms as well).

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: [PATCH] i386 & x86_64: Live Patching Funcion on 2.6.11.7
On Mon, Apr 18, 2005 at 10:41:23AM +0900, Takashi Ikebe wrote:
> Daniel-san,
> GDB based approach seems not fit to our requirements. GDB(ptrace) based
> functions are basically need to be done when target process is stopping.
> From our experience, sometimes patches became to dozens to hundreds at
> one patching, and in this case GDB based approach cause target process's
> availability descent.

That's right, it does require the target process be stopped. If it
isn't stopped how do you know it isn't executing the same instruction
you're currently patching?

Even with hundreds of kilobytes of patch, I have trouble imagining this
takes a substantial amount of time.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: [PATCH] i386 & x86_64: Live Patching Funcion on 2.6.11.7
On Sat, Apr 16, 2005 at 11:44:39PM -0700, David S. Miller wrote:
>
> Takashi-san, have you ever investigated using kprobes to
> implement this feature? It seems a perfect fit, and would
> allow support on several architectures other than just x86
> and x86_64.
>
> If kprobes does not meet your needs completely, it could
> be trivially extended to do so.
>
> I think implementing something like this from scratch is
> not a good idea when we have much of the needed logic and
> infrastructure already.

Takashi-san's description was not very clear, but it sounds like it's a
patching mechanism for userspace applications - not for kernel space.
So kprobes would not be a good fit.

If I'm right, I'm not sure why some of the bits of it were done
separately instead of via the existing ptrace mechanism. And GDB would
appreciate a mechanism for mmap/munmap/mprotect in a debugged process,
also.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: [RFC] FUSE permission modell (Was: fuse review bits)
On Mon, Apr 11, 2005 at 09:56:29PM +0200, Miklos Szeredi wrote:
> Well the sanity check on the "server" side is always enforced. You
> can't "trick" sftp or ftp to not check permissions. So checking on
> the "client" side too (where the fuse daemon is running) makes no
> sense, does it?

That argument doesn't make much sense to me. But we're at the end of
my useful contributions to this discussion; I'm going to be quiet now
and hope some folks who know more about filesystems have more useful
responses.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: [RFC] FUSE permission modell (Was: fuse review bits)
On Mon, Apr 11, 2005 at 09:10:46PM +0200, Miklos Szeredi wrote:
> > Root squashing is actually a much less obnoxious restriction. It means
> > that local uid 0 doesn't automatically correspond to remote uid 0.
>
> I don't agree that it's less obnoxious. Root squashing and a
> restricted directory (-rwx--) would have exactly the same effect:
> root is denied all access.

That's considerably less obnoxious, because such directories are
comparatively rare; most files, root can still read. There are still a
couple of unintuitive cases where root has less privilege than a
particular non-root user, of course. But your model normally gives root
fewer privileges than the user that mounted the FS.

> > But why does the kernel need to know anything about this? Why can't
> > the userspace library present the permissions appropriately to the
> > kernel?
>
> That is exactly what you should do if you use the default_permissions
> options. You set the file mode, and the kernel checks the permission.

So why not make default_permissions a feature of the userspace?

> > I'm going to be pretty confused if I see a mode 666 file that I
> > can't even read. So will various programs.
>
> How would you get such a file? I don't understand.

The permissions exposed by the FUSE layer apparently don't correspond
to what local users can do with them. That's the problem here. It may
be that I'm completely misunderstanding you - but from what you've
described, the userspace daemon can mark a file's permissions as 666,
and then with allow_other and allow_root off no one else will be able
to read it, despite those permissions.

> > Except for the allow_root bits, I think that having userspace handle
> > the issue entirely would cover both objections.
>
> If I want to allow unprivileged users to be able to mount their
> filesystems, then handling everything in userspace is not an option.
> For example if you could mount a filesystem in which files have
> user=root instead of your own user ID, you could probably confuse some
> applications running as root, and cause information leak. That's
> exactly why allow_root and allow_other are disabled for normal users.
>
> The only safe option that I can imagine is that the kernel will reset
> the user and group fields of the file attributes. This would again
> require a kernel option, but would be far less useful IMO.

I think we've got a boundary problem here. You are exposing some
arbitrary, user-supplied values in the permissions, and then performing
sanity checks at access time; I'm suggesting performing the sanity
checking on the other side, when the permissions are supplied to the
kernel by the daemon.

Why would it be less useful to show files that have been "created" by a
user as owned by that user? Or files that the user has requested no
other users be able to write as unwritable by group/other?

Sure, it makes your tarfs a little less mapped onto the tar file. But
that's one of the recurring objections to implementing archivers as
filesystems: the ownership in the archive is _not_ relevant to the
mounted copy.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
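One way to picture the "sanity check at the boundary" idea from this message is the sketch below. It is only an illustration under stated assumptions: `struct daemon_attr`, `sanitize_attr()` and the `allow_other` flag as a plain int are hypothetical names, not FUSE kernel interfaces. The attributes reported by the daemon are rewritten once, when they are handed over, so that what `ls -l` shows matches what other users can actually do.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical container for the attributes a userspace daemon reports. */
struct daemon_attr {
    uid_t  uid;
    gid_t  gid;
    mode_t mode;
};

/*
 * Sanitize attributes at the point where the daemon supplies them
 * (hypothetical helper, not part of FUSE):
 *  - ownership is forced to the user who performed the mount,
 *  - setuid/setgid bits are dropped,
 *  - on a private mount (allow_other == 0) the group/other bits are
 *    cleared, so the visible mode never promises access that would be
 *    refused at access time.
 */
struct daemon_attr sanitize_attr(struct daemon_attr in,
                                 uid_t mounter_uid, gid_t mounter_gid,
                                 int allow_other)
{
    struct daemon_attr out = in;

    out.uid = mounter_uid;
    out.gid = mounter_gid;
    out.mode &= ~(mode_t)(S_ISUID | S_ISGID);
    if (!allow_other)
        out.mode &= ~(mode_t)(S_IRWXG | S_IRWXO);
    return out;
}
```

With this kind of rewrite, a file the daemon reports as root-owned mode 666 would appear on a private mount as owned by the mounting user with mode 600, which is the "mode 666 file that I can't even read" case discussed above.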
Re: [RFC] FUSE permission modell (Was: fuse review bits)
On Mon, Apr 11, 2005 at 07:22:57PM +0100, Jamie Lokier wrote:
> > 1) Only allow mount over a directory for which the user has write
> > access (and is not sticky)
>
> Seems good - but why not sticky? Mounting a user filesystem in
> /tmp/user-xxx/my-mount-point seems not unreasonable - provided the
> administrator can delete the directory (which is possible with
> detachable mount points).

Because then they could mount over /tmp. "and (is not sticky || is
owned by the user)" may be more appropriate.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
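The refined rule, "writable and (not sticky || owned by the user)", can be written as a small predicate. This is a minimal sketch, not code from the FUSE patches: `may_mount_over()` is a hypothetical name, and the write-access decision is passed in as a flag because how it is computed (for example via access(2) or a kernel permission check) is outside what the thread specifies.

```c
#define _DEFAULT_SOURCE
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical check: may `user` mount over the directory described by
 * `dir`?  `user_can_write` is assumed to have been determined elsewhere.
 */
bool may_mount_over(const struct stat *dir, uid_t user, bool user_can_write)
{
    bool sticky = (dir->st_mode & S_ISVTX) != 0;  /* e.g. /tmp */
    bool owned  = (dir->st_uid == user);

    return S_ISDIR(dir->st_mode) && user_can_write && (!sticky || owned);
}
```

Under this rule a world-writable sticky /tmp owned by root is rejected, while a sticky directory the user owns, or an ordinary writable directory such as /tmp/user-xxx/my-mount-point, is accepted.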
Re: [RFC] FUSE permission modell (Was: fuse review bits)
On Mon, Apr 11, 2005 at 05:56:09PM +0200, Miklos Szeredi wrote:
> > > 3) No other user should have access to files under the mount, not
> > > even root[5]
> >
> > > [5] Obviously root cannot be restricted, but accidental access to
> > > private data is still a good idea. E.g. root squashing by NFS servers
> > > has a similar effect.
> >
> > Could you explain a little more? I don't see the point in denying
> > access to root, but I also can't tell from your explanation whether you
> > do or not.
>
> Fuse by default does. This can be disabled by one of two mount
> options: "allow_other" and "allow_root". The former implies the
> latter. These mount options are only allowed for mounting by root, but
> this can be relaxed with a configuration option.

So the behavior that Cristoph was objecting to here is in fact
configurable?

> > I don't really see the point of this restriction, anyway. Could you
> > explain why this shouldn't be a matter of policy, and kept out of the
> > kernel? Have the userspace file servers default to putting restrictive
> > permissions on mounts unless requested otherwise.
>
> That's an option. However you can't restrict root that way, and you
> need an extra directory, since permissions on the mountpoint are
> ignored after the mount.

No, you need the userspace daemon to set the permissions on the root
directory of the new mount restrictively. What am I missing?

> Restricting root is needed, so that a sysadmin won't accidentally go
> into a user's private mount (e.g. sshfs to some machine to which the
> sysadmin otherwise has no access). Root can still gain access by
> doing 'su me', but at least he will have a bad conscience. This is
> not such a stupid idea as it first sounds IMO, and by default all NFS
> servers exhibit a similar behavior (root squashing).

Root squashing is actually a much less obnoxious restriction. It means
that local uid 0 doesn't automatically correspond to remote uid 0.

> > > 4) Access should not be further restricted for the owner of the
> > > mount, even if permission bits, uid or gid would suggest
> > > otherwise
> >
> > Similar questions.
>
> This behavior can be disabled by the "default_permissions" mount
> option (which is not privileged, since it adds restrictions). A FUSE
> filesystem mounted by root (and not for private purposes) would
> normally be done with "allow_other,default_permissions".

But why does the kernel need to know anything about this? Why can't
the userspace library present the permissions appropriately to the
kernel?

I'm going to be pretty confused if I see a mode 666 file that I can't
even read. So will various programs.

Except for the allow_root bits, I think that having userspace handle
the issue entirely would cover both objections.

> Does this answer your questions?

More or less.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: [RFC] FUSE permission modell (Was: fuse review bits)
On Mon, Apr 11, 2005 at 04:43:32PM +0200, Miklos Szeredi wrote:
> 3) No other user should have access to files under the mount, not
> even root[5]

> [5] Obviously root cannot be restricted, but accidental access to
> private data is still a good idea. E.g. root squashing by NFS servers
> has a similar effect.

Could you explain a little more? I don't see the point in denying
access to root, but I also can't tell from your explanation whether you
do or not. If I mount a filesystem using ssh, I want to be able to
"sudo cp foo.txt /etc" and not get an inexplicable permissions error.

I don't really see the point of this restriction, anyway. Could you
explain why this shouldn't be a matter of policy, and kept out of the
kernel? Have the userspace file servers default to putting restrictive
permissions on mounts unless requested otherwise. I can think of
plenty of uses for this.

> 4) Access should not be further restricted for the owner of the
> mount, even if permission bits, uid or gid would suggest
> otherwise

Similar questions.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: Linux-2.6.11 can't disable CAD
On Thu, Apr 07, 2005 at 04:50:32PM -0400, Richard B. Johnson wrote:
> On Thu, 7 Apr 2005, Jan Harkes wrote:
>
> >On Thu, Apr 07, 2005 at 11:16:14AM -0400, Richard B. Johnson wrote:
> >>In the not-too distant past, one could disable Ctl-Alt-DEL.
> >>Can't do it anymore.
> >...
> >>Observe that reboot() returns 0 and `strace` understands what
> >>parameters were passed. The result is that, if I hit Ctl-Alt-Del,
> >>`init` will still execute the shutdown-order (INIT 0).
> >
> >Actually, if CAD is enabled in the kernel, it will just reboot.
> >If CAD is disabled in the kernel a SIGINT is sent to pid 1 (/sbin/init).
> >
>
> No, that's not how it ever worked. There are parameters that are
> available in the reboot-system call that define the operation that
> will occur when the 3-finger salute occurs.
>
> Execute man 2 reboot.

Take your own advice. From the man page:

  LINUX_REBOOT_CMD_CAD_ON
    (RB_ENABLE_CAD, 0x89abcdef). CAD is enabled. This means
    that the CAD keystroke will immediately cause the action
    associated with LINUX_REBOOT_CMD_RESTART.

  LINUX_REBOOT_CMD_CAD_OFF
    (RB_DISABLE_CAD, 0). CAD is disabled. This means that
    the CAD keystroke will cause a SIGINT signal to be sent
    to init (process 1), whereupon this process may decide
    upon a proper action (maybe: kill all processes, sync,
    reboot).

-- 
Daniel Jacobowitz
CodeSourcery, LLC
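The two halves of the man page excerpt can be sketched from the point of view of an init-like process: one call tells the kernel to stop rebooting on the keystroke, and a SIGINT handler receives the notification instead. This is a minimal sketch using the glibc `reboot()` wrapper and `RB_DISABLE_CAD` from `<sys/reboot.h>`; the helper names `install_cad_handler()` and `disable_cad()` and the `cad_pressed` flag are illustrative, and what init does after seeing the flag is left open, as in the man page.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/reboot.h>

/* Set from the signal handler; polled by the (hypothetical) main loop. */
static volatile sig_atomic_t cad_pressed;

static void on_sigint(int signo)
{
    (void)signo;
    cad_pressed = 1;    /* only record the event; act on it later */
}

/* Install the SIGINT handler that will see Ctrl-Alt-Del once CAD is off. */
int install_cad_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/*
 * Ask the kernel to deliver SIGINT to process 1 instead of restarting.
 * Needs CAP_SYS_BOOT; returns -1 with errno set (e.g. EPERM) otherwise.
 */
int disable_cad(void)
{
    return reboot(RB_DISABLE_CAD);
}
```

Note that the SIGINT goes to process 1 only, so the handler is meaningful in init itself; running this in an ordinary process merely exercises the calls.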
Re: [Fwd: Re: connector is missing in 2.6.12-rc2-mm1]
On Thu, Apr 07, 2005 at 11:02:22PM -0700, David S. Miller wrote:
> On Fri, 08 Apr 2005 09:19:39 +0400
> Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
>
> > > I know, the same thing holds for most architectures, including i386.
> > > However, this is not an issue for uni-processor kernels anywhere else,
> > > so what's so special about MIPS?
> >
> > Does i386 or ppc has cached and uncached memory?
>
> Yes, they do.
>
> > No, i386, ppc and others do not require sync on uncached memory access,
> > and only instruction not data cache sync on SMP.
>
> On MIPS, all the MIPS atomic operations will operate on cached memory.
> And as far as a uniprocessor cpu is concerned, updating the cache is
> all that matters.
>
> In fact, this SYNC instruction seems unnecessary even on SMP. If the
> cache is updated, it is part of the coherent memory space and thus
> MOESI main bus SMP cache coherency transactions will see the updated
> value. When another processor does a "read-to-share" or "read-to-own"
> request on the main bus, the processor which did the atomic OP will
> provide the correct data from its cache in response to that transaction.
>
> So what you have to do is show me an example where the MIPS kernel can
> do an atomic.h operation on uncached memory. I even think that is
> invalid, come to think of it. It better be...

My impression is that the MIPS story isn't so simple, because the
architecture only offers very weak coherency guarantees. Most of the
SMP implementations offer strong coherency in practice, but at least
one (RM9000) doesn't.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: Do not misuse Coverity please (Was: sound/oss/cs46xx.c: fix a check after use)
On Mon, Mar 28, 2005 at 10:23:48PM -0800, Andrew Morton wrote:
> > > -	int old=card->amplifier;
> > > +	int old;
> > > 	if(!card)
> > > 	{
> > > 		CS_DBGOUT(CS_ERROR, 2, printk(KERN_INFO
> > > 			"cs46xx: amp_hercules() called before initialized.\n"));
> > > 		return;
> > > 	}
> > > +	old = card->amplifier;
>
> No, there is a third case: the pointer can be NULL, but the compiler
> happened to move the dereference down to after the check.
>
> If the optimiser is later changed, or if someone tries to compile the code
> with -O0, it will oops.

The thing GCC is most likely to do with this code is discard the NULL
check entirely and leave only the oops; the "if (!card)" can not be
reached without passing through "card->amplifier", and a pointer which
is dereferenced can not be NULL in a valid program.

--
Daniel Jacobowitz
CodeSourcery, LLC
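A stripped-down illustration of the ordering the patch restores, check first and dereference second. `struct card` and `read_amplifier` here are hypothetical stand-ins, not the cs46xx driver's real definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's card structure. */
struct card {
    int amplifier;
};

/* Returns 0 and stores the amplifier state in *old, or -1 if card is NULL. */
int read_amplifier(const struct card *card, int *old)
{
    if (card == NULL)           /* the check comes first ... */
        return -1;

    *old = card->amplifier;     /* ... so this dereference is always valid */
    return 0;
}
```

Written this way, the compiler has no earlier dereference from which to conclude that `card` is non-NULL, so it has no grounds to drop the check.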
Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)
On Sun, Mar 13, 2005 at 07:50:09PM -0500, Trond Myklebust wrote:
> Sorry, but you should _never_ have gotten an ESTALE error if the file
> was not in use when you deleted the old copy of glibc. A fresh call to
> open() will always result in a new lookup of the filehandle.
>
> What may have happened in the case of the EIO error is that you may have
> raced: i.e. a client starts reading the file while it is being copied
> to.

It is in a separate root filesystem, currently not used by anything on
the target. It is likely to be in cache, but I can absolutely guarantee
it isn't open.

Hmm, server is x86_64 2.6.7, client is 2.6.10 MIPS. I should upgrade
them and see if that helps. Unfortunately I haven't found any smaller
testcases than installing an entire root FS.

--
Daniel Jacobowitz
CodeSourcery, LLC
Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)
On Sun, Mar 13, 2005 at 03:42:29PM -0500, Trond Myklebust wrote:
> su den 13.03.2005 Klokka 15:04 (-0500) skreiv Daniel Jacobowitz:
>
> > I can't find any documentation about this, but it seems like the same
> > problem that has been causing me headaches lately; when I replace glibc
> > from the server side of an nfsroot, the client has a couple of
> > variously wrong reads before it sees the new files. If it breaks NFS
> > so badly, why is it the default for the Linux NFS server?
>
> No, that's a very different issue: you are violating the NFS cache
> consistency rules if you are changing a file that is being held open by
> other machines.
>
> The correct way to do the above is to use GNU install with the '-b'
> option: that will rename the version of glibc that is in use, and then
> install the new glibc in a different inode.

[closed and/or irrelevant lists removed from CC:]

No, the copy of glibc in question is not in use at the time. The next
attempt to open it on the client will sometimes generate a "stale NFS
handle" message, or if the open succeeds a read will sometimes return
EIO.

But it sounds like this is a different problem than the original poster
was testing for. I'm still curious about the answer to my question
above :-)

--
Daniel Jacobowitz
CodeSourcery, LLC
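The "rename the old copy, then install into a different inode" idea described in the quoted advice can be sketched in C as below. This is an illustration of the technique only, not GNU install's actual implementation; `install_with_backup`, `inode_of` and the `~` backup suffix are assumptions made for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Move any existing file at path to "<path>~", then create path afresh.
 * Returns 0 on success, -1 on error. */
int install_with_backup(const char *path, const char *data)
{
    char backup[4096];
    size_t len = strlen(data);
    int fd;

    if (snprintf(backup, sizeof backup, "%s~", path) >= (int)sizeof backup)
        return -1;

    /* Holders of the old file keep their inode, now under the backup name. */
    if (rename(path, backup) != 0 && errno != ENOENT)
        return -1;

    /* O_EXCL guarantees a newly created file, hence a different inode. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Inode number of path, or 0 if stat() fails. */
ino_t inode_of(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? st.st_ino : 0;
}
```

Because the old inode stays alive under the backup name, the replacement cannot reuse it, so comparing `inode_of(path)` before and after a second call shows two different inode numbers.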
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 13, 2005 at 12:27:58AM -0800, Roland McGrath wrote:
> This patch further cleans up the appearance of TF in eflags when ptrace is
> involved. With this, PTRACE_SINGLESTEP will not cause TF to appear in
> eflags as seen by PTRACE_GETREGS and the like, when the instruction faulted
> for some reason other than the single-step trap.
>
> This moves the check added by Dan's patch from setup_sigcontext to
> handle_signal. This is a cosmetic difference, but I think it makes more
> sense to consolidate all the "reset registers to canonical state" work in
> the same place (i.e. put it with the syscall rollback code), separate from
> the signal handler setup. The change that matters is moving the similar
> check out of do_debug, where it only covers the case of a single-step trap.
> Instead, it goes into the ptrace_signal_deliver macro, which is called
> before the ptrace stop for whatever signal results from whatever kind of
> fault in that instruction (or asynchronous signal). With that, the
> handle_signal check is still needed only for the case of PTRACE_SINGLESTEP
> with a handled signal.
>
>
> Thanks,
> Roland

Thanks, looks right to me!

--
Daniel Jacobowitz
CodeSourcery, LLC
Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)
On Sun, Mar 13, 2005 at 12:04:27AM -0500, Trond Myklebust wrote:
> lau den 12.03.2005 Klokka 03:56 (-0800) skreiv Junfeng Yang:
> > Hi,
> >
> > We checked NFS on top of ext3 using FiSC (our file system model checker)
> > and found a case where NFS stat cache can contain inconsistent entries.
> >
> > Basically, to trigger this inconsistency, just do the following steps:
> > 1. create a file A1, write a few bytes to it, so A1 is 4 words
> > 2. create a hard link A2, pointing to A1
> > 3. stat on A2. A2's size is 4 words
> > 4. truncate A1 to a larger size, write a few bytes at the end. now it's
> >    1031 words.
> > 5. stat on A2. it's size is still 4 words, which should be 1031 words
> >
> > We have a test case to re-create this warning. You can download it at
> > http://fisc.stanford.edu/bug16/crash.c. It includes some sudo commands
> > to mount nfs partitions, which you might want to change according to your
> > local settings.
> >
> > cat /etc/exports shows:
> > /mnt/sbd0-export localhost(rw,sync)
> > /mnt/sbd1-export localhost(rw,sync)
> >
> > Let me know if you have any problems reproducing the warning. We'd
> > appreciate any confirmations/clarifications.
> >
>
> This is a known problem. Turn off the (default - grrr) subtree checking
> export option on the server, and it will all work properly. The subtree
> checking option violates the NFS standards for filehandle generation in
> so many ways, that it isn't even funny.

I can't find any documentation about this, but it seems like the same
problem that has been causing me headaches lately; when I replace glibc
from the server side of an nfsroot, the client has a couple of
variously wrong reads before it sees the new files. If it breaks NFS
so badly, why is it the default for the Linux NFS server?

--
Daniel Jacobowitz
CodeSourcery, LLC
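The five quoted steps translate into a handful of system calls. The sketch below is not the FiSC crash.c test case; it is a minimal stand-in that uses bytes where the report says "words", and the function name and file names are made up for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Runs the five steps inside dir. Returns 1 if stat() on the hard link
 * sees the grown size, 0 if it sees a stale size, -1 on a setup error. */
int link_size_consistent(const char *dir)
{
    char a1[4096], a2[4096];
    struct stat st;
    int fd, result = -1;

    snprintf(a1, sizeof a1, "%s/A1.%ld", dir, (long)getpid());
    snprintf(a2, sizeof a2, "%s/A2.%ld", dir, (long)getpid());

    fd = open(a1, O_RDWR | O_CREAT | O_EXCL, 0644);   /* step 1: create A1 */
    if (fd < 0)
        return -1;
    if (write(fd, "abcd", 4) != 4)                     /* ... size is now 4 */
        goto out;
    if (link(a1, a2) != 0)                             /* step 2: hard link A2 */
        goto out;
    if (stat(a2, &st) != 0 || st.st_size != 4)         /* step 3: stat A2 */
        goto out;
    if (ftruncate(fd, 1027) != 0 ||                    /* step 4: grow A1 ... */
        lseek(fd, 0, SEEK_END) != 1027 ||
        write(fd, "wxyz", 4) != 4)                     /* ... and append: 1031 */
        goto out;
    if (stat(a2, &st) != 0)                            /* step 5: stat A2 again */
        goto out;
    result = (st.st_size == 1031);
out:
    close(fd);
    unlink(a1);
    unlink(a2);
    return result;
}
```

On a local filesystem this should return 1. Pointing `dir` at an NFS mount exported with subtree checking is where the report says the second stat() can return the old size.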
Re: More trouble with i386 EFLAGS and ptrace
On Mon, Mar 07, 2005 at 01:29:12PM -0800, Roland McGrath wrote:
> > Is this semantically different from the patch I posted, i.e. is there
> > any case which one of them covers and not the other?
>
> Yes, the second case that I described when I said there were two cases!
> (Sheesh.)

Calm down, there were already two cases. I reread your message and
couldn't pick out the answer, or I wouldn't have asked.

> To repeat, when the process was doing PTRACE_SINGLESTEP and then
> stops on some other signal rather than because of the single-step trap
> (e.g. single-stepping an instruction that faults), ptrace will show TF set
> in its registers. With my patch, it will show TF clear.

I can reproduce this problem with the patch that Linus committed, so you
should probably update your patch for a current snapshot and nag him
about it.

> > That is an inability to set breakpoints in the vsyscall page. Andrew
> > told me (last May, wow) that he thought this worked in Fedora, but I
> > haven't seen any signs of the code. It would certainly be a Good Thing
> > if it is possible!
>
> Fedora kernels use a normal mapping (with randomized location) for the
> page, rather than the fixed high address in the vanilla kernel. The
> FIXADDR_USER_START area is globally mapped in a special way not using
> normal vma data structures, and is permanently read-only in all tasks.
> COW via ptrace works normally for Fedora's flavor, but no writing is ever
> possible to the fixmap page.

Blech. I assume that there is no way to map a normal VMA over top of
the fixed page, for a particular process? This makes debugging the
vsyscall DSO a real pain.

--
Daniel Jacobowitz
CodeSourcery, LLC
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 07:16:37PM -0800, Roland McGrath wrote:
> > I think mine is more correct; the problem doesn't occur because the
> > debugger cancelled a signal, it occurs because a bogus TF bit was saved
> > to the signal context. I like keeping solutions close to their
> > problems. But that's just aesthetic.
>
> I understand the scenario. Understanding how it comes about made me
> recognize there is another scenario that is also handled wrong.
> I didn't say the second scenario was what you are seeing.
>
> Dan's patch covers the case of PTRACE_SINGLESTEP called to deliver a signal
> that has a handler to run. That's because there TF is set after the ptrace
> stop, when it's resuming. This is a "normalize register state" operation.
> I think it would be a little clearer to do this in handle_signal where the
> similar case of tweaking register state to back up a system call is done.
>
> The patch I posted moves the resetting of TF from the trap handler to
> ptrace_signal_deliver. This is necessary to ensure that TF is not shown as
> set in the registers retrieved by the debugger when the process stops for
> something other than the single-step trap requested by PTRACE_SINGLESTEP.

Is this semantically different from the patch I posted, i.e. is there
any case which one of them covers and not the other?

> Here is a patch that does both of those things. This had no effect on any
> of the gdb testsuite cases (for good or ill) aside from sigstep.exp, and:
>
> $ grep 'FAIL.*sigstep' testsuite/gdb.sum
> KFAIL: gdb.base/sigstep.exp: finish from handleri; leave handler (could not
> set breakpoint) (PRMS: gdb/1736)
>
> I don't know what that one is about, but it was KFAIL before the change too.

That is an inability to set breakpoints in the vsyscall page. Andrew
told me (last May, wow) that he thought this worked in Fedora, but I
haven't seen any signs of the code. It would certainly be a Good Thing
if it is possible!

--
Daniel Jacobowitz
CodeSourcery, LLC
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 01:22:25PM -0800, Roland McGrath wrote:
> > I _think_ your test-case would work right if you just moved that code from
> > the special-case in do_debug(), and moved it to the top of
> > setup_sigcontext() instead. I've not tested it, though, and haven't really
> > given it any "deep thought". Maybe somebody smarter can say "yeah, that's
> > obviously the right thing to do" or "no, that won't work because.."
>
> Indeed, this is what my original changes for this did, before you started
> cleaning things up to be nice to TF users other than PTRACE_SINGLESTEP.
>
> I note, btw, that the x86_64 code is still at that prior stage. So I think
> it doesn't have this new wrinkle, but it also doesn't have the advantages
> of the more recent i386 changes. Once we're sure about the i386 state, we
> should update the x86_64 code to match.
>
> I'm not sure what kind of smart this makes me, but I'll say that your plan
> would work and no, it's obviously not the right thing to do. ;-) I haven't
> tested the following, not having tracked down the specific problem case you
> folks are talking about. But I think this is the right solution. The
> difference is that when we stop for some signal and report to the debugger,
> the debugger looking at our registers will see TF clear instead of set,
> before it decides whether to continue us with the signal or what to do.
> With the change yo suggested, (I think) if the debugger decides to eat the
> signal and resume, we would get a spurious single-step trap after executing
> the next instruction, instead of resuming normally as requested.

Roland, the sigstep.exp test in the GDB testsuite will show this problem;
if your patch monotonically improves GDB HEAD testsuite results and
removes all the FAILs for sigstep.exp, then it's probably equivalent to
the one I just posted for this testcase.

I think mine is more correct; the problem doesn't occur because the
debugger cancelled a signal, it occurs because a bogus TF bit was saved
to the signal context. I like keeping solutions close to their
problems. But that's just aesthetic.

--
Daniel Jacobowitz
CodeSourcery, LLC
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 12:03:22PM -0800, Linus Torvalds wrote:
> I _think_ your test-case would work right if you just moved that code from
> the special-case in do_debug(), and moved it to the top of
> setup_sigcontext() instead. I've not tested it, though, and haven't really
> given it any "deep thought". Maybe somebody smarter can say "yeah, that's
> obviously the right thing to do" or "no, that won't work because.."

I bought it, but the GDB testsuite didn't. Both copies seem to be
necessary; there's generally no signal handler for SIGTRAP, so moving it
disables the test in the most common case. I didn't poke at it long
enough to figure out what the failing case was, but it introduced a
different situation which could leave TF enabled.

This, however, worked:

If a debugger set the TF bit, make sure to clear it when creating a
signal context. Otherwise, TF will be incorrectly restored by
sigreturn.

Signed-off-by: Daniel Jacobowitz <[EMAIL PROTECTED]>

===== arch/i386/kernel/signal.c 1.53 vs edited =====
--- 1.53/arch/i386/kernel/signal.c	2005-01-31 01:20:14 -05:00
+++ edited/arch/i386/kernel/signal.c	2005-03-06 15:36:41 -05:00
@@ -277,6 +277,18 @@
 {
 	int tmp, err = 0;
 
+	/*
+	 * If TF is set due to a debugger (PT_DTRACE), clear the TF
+	 * flag so that register information in the sigcontext is
+	 * correct.
+	 */
+	if (unlikely(regs->eflags & TF_MASK)) {
+		if (likely(current->ptrace & PT_DTRACE)) {
+			current->ptrace &= ~PT_DTRACE;
+			regs->eflags &= ~TF_MASK;
+		}
+	}
+
 	tmp = 0;
 	__asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
 	err |= __put_user(tmp, (unsigned int __user *)&sc->gs);

--
Daniel Jacobowitz
CodeSourcery, LLC
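The effect of the added hunk can be modelled in user space as below. This is a toy model, not kernel code: `struct task_model` and `normalize_tf` are hypothetical, TF_MASK is the x86 trap flag (bit 8 of EFLAGS), and the PT_DTRACE value is an assumption chosen for the illustration.

```c
#define TF_MASK   0x00000100UL   /* x86 EFLAGS trap flag, bit 8 */
#define PT_DTRACE 0x00000002U    /* assumed value of the "debugger set TF" marker */

/* Hypothetical stand-in for the two pieces of state the patch touches. */
struct task_model {
    unsigned long eflags;   /* models regs->eflags */
    unsigned int  ptrace;   /* models current->ptrace */
};

/* Clear TF before the register state is saved to a signal context,
 * but only when the debugger (not the program itself) set it. */
void normalize_tf(struct task_model *t)
{
    if ((t->eflags & TF_MASK) && (t->ptrace & PT_DTRACE)) {
        t->ptrace &= ~PT_DTRACE;
        t->eflags &= ~TF_MASK;
    }
}
```

A task single-stepped by the debugger with eflags 0x200302 ends up with 0x200202 in its saved context, while a task that set TF on its own keeps it, which is the distinction the patch description draws.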
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 02:38:41PM -0500, Daniel Jacobowitz wrote:
> The reason this happens is that when the inferior hits a breakpoint, the
> first thing GDB will do is remove the breakpoint, single-step past it, and
> reinsert it. So GDB does a PTRACE_SINGLESTEP, and the kernel invokes the
> signal handler (without single-step - good so far). When the signal handler
> returns, we've lost track of the fact that ptrace set the single-step flag,
> however. So the single-step completes and returns SIGTRAP to GDB. GDB is
> expecting a SIGTRAP and reinserts the breakpoint. Then it resumes the
> inferior, but now the trap flag is set in $eflags. So, oops, the continue
> acts like a step instead.

Eh, I got the event sequence wrong as usual, but the basic description
is right.

  - Original SIGTRAP at breakpoint
  - user says "cont"
  - GDB tries to singlestep past the breakpoint - PTRACE_SINGLESTEP,
    no signal
  - GDB receives SIGALRM at the same PC
  - GDB tries to singlestep past the breakpoint - PTRACE_SINGLESTEP,
    SIGALRM
  - GDB receives SIGTRAP at the first instruction of the handler
  - GDB reinserts the breakpoint at line 18. This is a "step-resume"
    breakpoint - we were stepping, we were interrupted by a signal.
  - GDB issues PTRACE_CONT, no signal
  - GDB receives SIGTRAP at the sigreturn location - this is the
    step-resume breakpoint.
  - GDB removes that and issues PTRACE_SINGLESTEP, no signal - It is
    trying again to get past the breakpoint location so that it can
    honor the user's "cont" request.
  - GDB receives SIGTRAP at the instruction after the breakpoint.
  - GDB reinserts the original breakpoint and issues PTRACE_CONT.
    All of this is what's supposed to happen. The executable should be
    running free now until it hits the breakpoint again.
  - GDB receives an unexpected SIGTRAP at the next instruction (the
    second instruction after the original breakpoint).

If your compiler uses only two instructions for the loop, you might not
see this. gcc -O0 will use three by default. Just stick something else
in the loop.

> What to do? We need to know when we restore the trap bit in sigreturn
> whether it was set by ptrace or by the application (possibly including by
> the signal handler).

If I'm following this right, then the saved value of eflags in the
signal handler should not contain the trap bit at this point. It does,
though.

It's hard to see this in GDB, because the CFI does not express %eflags,
so "print $eflags" won't track up the stack. I don't think there's a
handy dwarf register number for it at the moment. But you can print out
the struct sigcontext by hand once you locate it on the stack.

--
Daniel Jacobowitz
CodeSourcery, LLC
More trouble with i386 EFLAGS and ptrace
It looks like the changes to preserve eflags when single-stepping don't
work right with signals. Take this test case:

<snip>
#include <signal.h>
#include <unistd.h>

volatile int done;

void handler (int sig)
{
  done = 1;
}

int main()
{
  while (1)
    {
      done = 0;
      signal (SIGALRM, handler);
      alarm (1);
      while (!done);
    }
}
<snip>

And this GDB session:

(gdb) b 18
Breakpoint 1 at 0x804840d: file test.c, line 18.
(gdb) r
Starting program: /home/drow/eflags/test

Breakpoint 1, main () at test.c:18
18        while (!done);
(gdb) p/x $eflags
$1 = 0x200217
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048414 in main () at test.c:18
18        while (!done);
(gdb) p/x $eflags
$2 = 0x200302

There's an implied delay before the "c" which is long enough for the
signal handler to become pending.

The reason this happens is that when the inferior hits a breakpoint, the
first thing GDB will do is remove the breakpoint, single-step past it,
and reinsert it. So GDB does a PTRACE_SINGLESTEP, and the kernel invokes
the signal handler (without single-step - good so far). When the signal
handler returns, we've lost track of the fact that ptrace set the
single-step flag, however. So the single-step completes and returns
SIGTRAP to GDB. GDB is expecting a SIGTRAP and reinserts the breakpoint.
Then it resumes the inferior, but now the trap flag is set in $eflags.
So, oops, the continue acts like a step instead.

What to do? We need to know when we restore the trap bit in sigreturn
whether it was set by ptrace or by the application (possibly including
by the signal handler).

Andrew, serious kudos for GDB's sigstep.exp, which uncovered this
problem (through a much more complicated test - I may add the smaller
one).

--
Daniel Jacobowitz
CodeSourcery, LLC
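To read the two $eflags values from the session, it helps to test the trap flag bit directly. A small sketch follows; the macro and function names are made up here, and the only fact relied on is that TF is bit 8 (0x100) of the x86 EFLAGS register.

```c
#define X86_EFLAGS_TF 0x100UL   /* trap (single-step) flag, bit 8 */

/* Returns 1 if the trap flag is set in an EFLAGS value, 0 otherwise. */
int trap_flag_set(unsigned long eflags)
{
    return (eflags & X86_EFLAGS_TF) != 0;
}
```

Applied to the session, `trap_flag_set(0x200217)` is 0 at the first stop and `trap_flag_set(0x200302)` is 1 after the "c", which is why the continue behaves like a step.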
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 02:38:41PM -0500, Daniel Jacobowitz wrote:
> The reason this happens is that when the inferior hits a breakpoint, the first thing GDB will do is remove the breakpoint, single-step past it, and reinsert it. So GDB does a PTRACE_SINGLESTEP, and the kernel invokes the signal handler (without single-step - good so far). When the signal handler returns, we've lost track of the fact that ptrace set the single-step flag, however. So the single-step completes and returns SIGTRAP to GDB. GDB is expecting a SIGTRAP and reinserts the breakpoint. Then it resumes the inferior, but now the trap flag is set in $eflags. So, oops, the continue acts like a step instead.

Eh, I got the event sequence wrong as usual, but the basic description is right.

 - Original SIGTRAP at breakpoint
 - user says "cont"
 - GDB tries to singlestep past the breakpoint - PTRACE_SINGLESTEP, no signal
 - GDB receives SIGALRM at the same PC
 - GDB tries to singlestep past the breakpoint - PTRACE_SINGLESTEP, SIGALRM
 - GDB receives SIGTRAP at the first instruction of the handler
 - GDB reinserts the breakpoint at line 18. This is a step-resume breakpoint - we were stepping, we were interrupted by a signal.
 - GDB issues PTRACE_CONT, no signal
 - GDB receives SIGTRAP at the sigreturn location - this is the step-resume breakpoint.
 - GDB removes that and issues PTRACE_SINGLESTEP, no signal - It is trying again to get past the breakpoint location so that it can honor the user's "cont" request.
 - GDB receives SIGTRAP at the instruction after the breakpoint.
 - GDB reinserts the original breakpoint and issues PTRACE_CONT. All of this is what's supposed to happen. The executable should be running free now until it hits the breakpoint again.
 - GDB receives an unexpected SIGTRAP at the next instruction (the second instruction after the original breakpoint).

If your compiler uses only two instructions for the loop, you might not see this. gcc -O0 will use three by default. Just stick something else in the loop.

> What to do? We need to know when we restore the trap bit in sigreturn whether it was set by ptrace or by the application (possibly including by the signal handler).

If I'm following this right, then the saved value of eflags in the signal handler should not contain the trap bit at this point. It does, though. It's hard to see this in GDB, because the CFI does not express %eflags, so "print $eflags" won't track up the stack. I don't think there's a handy dwarf register number for it at the moment. But you can print out the struct sigcontext by hand once you locate it on the stack.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 12:03:22PM -0800, Linus Torvalds wrote:
> I _think_ your test-case would work right if you just moved that code from the special-case in do_debug(), and moved it to the top of setup_sigcontext() instead. I've not tested it, though, and haven't really given it any deep thought. Maybe somebody smarter can say "yeah, that's obviously the right thing to do" or "no, that won't work because.."

I bought it, but the GDB testsuite didn't. Both copies seem to be necessary; there's generally no signal handler for SIGTRAP, so moving it disables the test in the most common case. I didn't poke at it long enough to figure out what the failing case was, but it introduced a different situation which could leave TF enabled.

This, however, worked:

If a debugger set the TF bit, make sure to clear it when creating a signal context. Otherwise, TF will be incorrectly restored by sigreturn.

Signed-off-by: Daniel Jacobowitz [EMAIL PROTECTED]

===== arch/i386/kernel/signal.c 1.53 vs edited =====
--- 1.53/arch/i386/kernel/signal.c	2005-01-31 01:20:14 -05:00
+++ edited/arch/i386/kernel/signal.c	2005-03-06 15:36:41 -05:00
@@ -277,6 +277,18 @@
 {
 	int tmp, err = 0;
 
+	/*
+	 * If TF is set due to a debugger (PT_DTRACE), clear the TF
+	 * flag so that register information in the sigcontext is
+	 * correct.
+	 */
+	if (unlikely(regs->eflags & TF_MASK)) {
+		if (likely(current->ptrace & PT_DTRACE)) {
+			current->ptrace &= ~PT_DTRACE;
+			regs->eflags &= ~TF_MASK;
+		}
+	}
+
 	tmp = 0;
 	__asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
 	err |= __put_user(tmp, (unsigned int __user *)&sc->gs);

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 01:22:25PM -0800, Roland McGrath wrote:
> > I _think_ your test-case would work right if you just moved that code from the special-case in do_debug(), and moved it to the top of setup_sigcontext() instead. I've not tested it, though, and haven't really given it any deep thought. Maybe somebody smarter can say "yeah, that's obviously the right thing to do" or "no, that won't work because.."
>
> Indeed, this is what my original changes for this did, before you started cleaning things up to be nice to TF users other than PTRACE_SINGLESTEP. I note, btw, that the x86_64 code is still at that prior stage. So I think it doesn't have this new wrinkle, but it also doesn't have the advantages of the more recent i386 changes. Once we're sure about the i386 state, we should update the x86_64 code to match.
>
> I'm not sure what kind of smart this makes me, but I'll say that your plan would work and no, it's obviously not the right thing to do. ;-)
>
> I haven't tested the following, not having tracked down the specific problem case you folks are talking about. But I think this is the right solution. The difference is that when we stop for some signal and report to the debugger, the debugger looking at our registers will see TF clear instead of set, before it decides whether to continue us with the signal or what to do. With the change you suggested, (I think) if the debugger decides to eat the signal and resume, we would get a spurious single-step trap after executing the next instruction, instead of resuming normally as requested.

Roland, the sigstep.exp test in the GDB testsuite will show this problem; if your patch monotonically improves GDB HEAD testsuite results and removes all the FAILs for sigstep.exp, then it's probably equivalent to the one I just posted for this testcase.

I think mine is more correct; the problem doesn't occur because the debugger cancelled a signal, it occurs because a bogus TF bit was saved to the signal context. I like keeping solutions close to their problems. But that's just aesthetic.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: More trouble with i386 EFLAGS and ptrace
On Sun, Mar 06, 2005 at 07:16:37PM -0800, Roland McGrath wrote:
> > I think mine is more correct; the problem doesn't occur because the debugger cancelled a signal, it occurs because a bogus TF bit was saved to the signal context. I like keeping solutions close to their problems. But that's just aesthetic.
>
> I understand the scenario. Understanding how it comes about made me recognize there is another scenario that is also handled wrong. I didn't say the second scenario was what you are seeing.
>
> Dan's patch covers the case of PTRACE_SINGLESTEP called to deliver a signal that has a handler to run. That's because there TF is set after the ptrace stop, when it's resuming. This is a "normalize register state" operation. I think it would be a little clearer to do this in handle_signal where the similar case of tweaking register state to back up a system call is done.
>
> The patch I posted moves the resetting of TF from the trap handler to ptrace_signal_deliver. This is necessary to ensure that TF is not shown as set in the registers retrieved by the debugger when the process stops for something other than the single-step trap requested by PTRACE_SINGLESTEP.

Is this semantically different from the patch I posted, i.e. is there any case which one of them covers and not the other?

> Here is a patch that does both of those things. This had no effect on any of the gdb testsuite cases (for good or ill) aside from sigstep.exp, and:
>
> $ grep 'FAIL.*sigstep' testsuite/gdb.sum
> KFAIL: gdb.base/sigstep.exp: finish from handleri; leave handler (could not set breakpoint) (PRMS: gdb/1736)
>
> I don't know what that one is about, but it was KFAIL before the change too.

That is an inability to set breakpoints in the vsyscall page. Andrew told me (last May, wow) that he thought this worked in Fedora, but I haven't seen any signs of the code. It would certainly be a Good Thing if it is possible!

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: ARM undefined symbols. Again.
On Fri, Feb 25, 2005 at 08:23:49PM +, Russell King wrote:
> On Fri, Feb 25, 2005 at 11:59:01AM -0800, Linus Torvalds wrote:
> > On Fri, 25 Feb 2005, Russell King wrote:
> > > So, what's happening about this?
> >
> > Btw, is there any real reason why the ARM _tools_ can't just be fixed? I don't see why this isn't a tools bug?
>
> It is a tools bug. But the issue is that *all* versions of binutils currently available which are kernel-capable (since the inclusion of the kbuild .incbin requirement on binutils) have this bug, with the exception of maybe CVS versions.
>
> We can't say "you must use the current CVS binutils to build the kernel" because that's not a sane toolchain base to build products on.
>
> I've been wanting to see a version of binutils released pretty damn quick so I can say "kernel only builds with latest toolchain" but I suspect even that's going to be seen as being unreasonable.

Not sure who you asked, but since I run the binutils releases...

I am fairly positive that this bug has been fixed in the binutils CVS:

2004-07-02  Nick Clifton  <[EMAIL PROTECTED]>

	* config/tc-arm.c (md_apply_fix3:BFD_RELOC_ARM_IMMEDIATE): Do not allow values which have come from undefined symbols. Always consider this fixup to have been processed as a reloc cannot be generated for it.

I know several ARM kernel developers who are using tools with this patch applied already. Also, I anticipate the release of binutils 2.16 including the fix in about a month.

> And yes, the toolchain peoples point of view is "fix the kernel".

Huh? Obviously the kernel isn't broken, unless you're talking about the kallsyms checks now.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: [PATCH] Consolidate compat_sys_waitid
On Tue, Feb 15, 2005 at 02:01:49PM +1100, Stephen Rothwell wrote:
> Hi all,
>
> This patch does:
> - consolidate the three implementations of compat_sys_waitid (some were called sys32_waitid).
> - adds sys_waitid syscall to ppc
> - adds sys_waitid and compat_sys_waitid syscalls to ppc64
>
> Parisc seemed to assume the existence of compat_sys_waitid. The MIPS syscall tables have me confused and may need updating. I have arbitrarily chosen the next available syscall number on ppc and ppc64, I hope this is correct.

I posted a (not-consolidated) sys32_waitid to the MIPS list on Sunday. The syscall tables should confuse you :-) N32 needs to use compat versions of most structures, but not siginfo_t. O32 needs to use compat versions of everything. Your new version can replace the sys32_waitid from my patch, but not sysn32_waitid.

Ralf, I'll let you sort it out :-)

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Blocking behavior changed for pipes in 2.6.11-rc3
This program [cribbed loosely from tst-cancel17.c in glibc] has changed behavior with the recent pipe changes. It used to block, which makes sense. It gets the maximum buffer size for the pipe (or a page if that's larger), and writes that many bytes plus two to it. It reads one back. The write "shouldn't" have room to finish.

Checking the POSIX language for _PC_PIPE_BUF I think this is OK - it doesn't say that no more bytes than that can be written at once, just that this is the maximum which are guaranteed to be written atomically. So I'm guessing this change is a feature, not a bug. Right?

[snip]
    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    void *
    tf (void *fd)
    {
      int *fds = fd;
      char mem[1];

      /* Read a single byte back from the pipe. */
      read (fds[0], mem, 1);
      return NULL;
    }

    int
    main (void)
    {
      pthread_t th;
      int len;
      int fds[2];

      if (pipe (fds) != 0)
        {
          puts ("pipe failed");
          return 1;
        }

      /* Larger of PIPE_BUF and the page size, plus two bytes. */
      size_t len2 = fpathconf (fds[1], _PC_PIPE_BUF);
      size_t page_size = sysconf (_SC_PAGESIZE);
      len2 = (len2 < page_size ? page_size : len2) + 1 + 1;
      char *mem2 = malloc (len2);

      pthread_create (&th, NULL, tf, fds);
      write (fds[1], mem2, len2);

      return 0;
    }
[/snip]

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: 2.6.11-rc3: Kylix application no longer works?
On Tue, Feb 08, 2005 at 06:10:18PM -0800, Andrew Morton wrote:
> We could just remove the printk and stick a comment over it. If the application later tries to access the not-there pages then it'll just fault.
>
> However I worry if there is some way in which we can leave unzeroed memory accessible to the application, although it's hard to see how that could happen.
>
> Daniel, Pavel cruelly chopped you off the Cc when replying. What's your diagnosis on the below?

It's asking for a lot of unwritable zeroed space. See this:

> LOAD 0x00 0x08048000 0x08048000 0xb7354 0x1b7354 R E 0x1000
> LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW 0x1000

The 0xb7354 is size to map from the file, the 0x1b7354 is size to map in memory. We're supposed to zero-fill the rest.

Now that I think about it I can see why this is a problem - the kernel probably assumes that any segment with MemSiz > FileSiz will be writable. Certainly it's a bit weird for the app to request unwritable zeroed pages.

clear_user's probably not the right way to provide the extra zeroing.

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: 2.6.11-rc3: Kylix application no longer works?
On Tue, Feb 08, 2005 at 06:51:06PM +0100, Pavel Machek wrote:
> Hi!
>
> > I wonder if reverting the patch will restore the old behaviour?
>
> This seems to be minimal fix to get Kylix application back to the working state... Maybe it is good idea for 2.6.11?

Why does clearing the BSS fail? Are the program headers bogus? (readelf -l).

-- 
Daniel Jacobowitz
CodeSourcery, LLC
Re: kernel CVS troubles with cvsps
On Tue, Jan 25, 2005 at 05:42:03PM +0100, Andrea Arcangeli wrote:
> Any help is appreciated. I'm just starting to look more seriously into this since I've some tools that depends on the cvsps to work and kernel CVS is the only fully coherent linearized source of info in open format (rest is either a proprietary format or unusable because out of synchrony because not linearized). Until now I hoped that by waiting it would automatically fixup, but it didn't yet ;).

FYI, I haven't tried using cvsps on the kernel CVS, but I used to use it on GCC - and it fell down like this on a constant basis. You might want to take a look at 'xcvs', by Jun Sun. It's much more reliable and does everything I used to use cvsps for. And generally faster too.

-- 
Daniel Jacobowitz
Re: forestalling GNU incompatibility - proposal for binary relative dynamic linking
On Mon, Jan 24, 2005 at 03:53:11PM -0800, Edward Peschko wrote:
> On Mon, Jan 24, 2005 at 03:38:49PM -0800, Richard Henderson wrote:
> > On Mon, Jan 24, 2005 at 03:16:36PM -0800, Edward Peschko wrote:
> > > cool.. any chance for some syntactic sugar so me (and other users/vendors) wouldn't need to change any of their build scripts and compilation processes?
> >
> > Uh, like what? That's about as simple as you can get.
> >
> > r~
>
> I don't understand.
>
> Which is simpler, changing an environmental variable, or adding extra CFLAGS to every single compile and recompiling?
>
> In addition, in your --rpath example, the relative pathing is hardcoded into the executable, whereas with "*" you could modify the runtime behavior of the executable at runtime. I suppose you could change this with chrpath, but why bother? What if you want to test out two versions of relative libraries side by side?

You might want to take a look at Richard's suggestion again. The string '$ORIGIN' gets hardcoded into the binary and handled by the dynamic linker.

But really, RPATH is a good solution to almost no problems.

-- 
Daniel Jacobowitz
Re: seccomp for 2.6.11-rc1-bk8
On Sun, Jan 23, 2005 at 07:34:24AM +, David Wagner wrote:
> Chris Wright wrote:
> >* David Wagner ([EMAIL PROTECTED]) wrote:
> >> There is a simple tweak to ptrace which fixes that: one could add an API to specify a set of syscalls that ptrace should not trap on. To get seccomp-like semantics, the user program could specify {read,write}, but if the user program ever wants to change its policy, it could change that set. Solaris /proc (which is what is used for tracing) has this feature. I coded up such an extension to ptrace semantics a long time ago, and it seemed to work fine for me, though of course I am not a ptrace expert.
> >
> >Hmm, yeah, that'd be nice. That only leaves the issue of tracer dying (say from that crazy oom killer ;-).
>
> Yes, I also implemented was a ptrace option which causes the child to be slaughtered if the parent dies for any reason. I could dig up the code, but I don't recall it being very hard. This was ages ago (a 2.0.x kernel) and I have no idea what might have changed. Also, am definitely not a guru on kernel internals, so it is always possible I missed something. But, at least on the surface this doesn't seem hard to implement.

Maybe it's time to resubmit both of these. OTOH, maybe it's time to do something more drastic to ptrace to untangle it from signals...

> A third thing I implemented was a option which would cause ptrace() to be inherited across forks. The way that strace does this (last I looked) is an unreliable abomination: when it sees a request to call fork(), it sets a breakpoint at the next instruction after the fork() by re-writing the code of the parent, then when that breakpoint triggers it attaches to the child, restores the parent's code, and lets them continue executing. This is icky, and I have little confidence in its security to prevent children from escaping a ptrace() jail, so I added a feature to ptrace() that remedies the situation.

This has since been done in 2.5.x; see PTRACE_EVENT_FORK. GDB even uses it nowadays. I'm not sure if strace does.

-- 
Daniel Jacobowitz
Re: New topic (PowerPC Linux PCI HELL)
On Wed, Sep 13, 2000 at 05:29:58PM -0700, Andre Hedrick wrote:
>
> Okay who can teach me how to force hooks and ram this down the PPC
>
> pci_write_config_word(dev, PCI_COMMAND, 0x05);
>
> I have all the address registered.
> My new PPC G3 (7600/132) toy is not allowing IO's on PCI cards to come
> alive. Thus I get some of the most beautiful lockups ever.
> I suspect that this needs to be handled down in the arch.
>
> ./linux/arch/ppc/kernel/{chrp_pci.c|mbx_pci.c|pmac_pci.c|prep_pci.c}
>
> Basically I can not get the IO's active, regardless of BIOS on the card.
> Yes this is the old trick that used to work of making ix86 cards run in
> non ix86-pci slots.
>
> Here is the fun part, I have a native mac/ppc Ultra-66 card that is fine
> under Mac OS, but the IO's are not enabled in linux and it crashes like a big
> dog also.

I'm going to bet you need to look at Michel Lanners' (did I spell that right this time?) PCI patches. For instance, I've always needed this hideous patch on my 7300/200 to get a Promise Ultra66 card to work:

diff -ur merging-bk/drivers/block/ide-pci.c work-bk/drivers/block/ide-pci.c
--- merging-bk/drivers/block/ide-pci.c	Tue Apr 4 22:19:16 2000
+++ work-bk/drivers/block/ide-pci.c	Thu Mar 9 15:33:25 2000
@@ -468,6 +468,15 @@
 		printk("%s: error accessing PCI regs\n", d->name);
 		return;
 	}
+#ifdef __powerpc__
+	if (!(pcicmd & PCI_COMMAND_IO)) {	/* is device disabled? */
+		pci_write_config_word(dev, PCI_COMMAND, pcicmd | PCI_COMMAND_IO);
+		if (pci_read_config_word(dev, PCI_COMMAND, &pcicmd)) {
+			printk("%s: error accessing PCI regs\n", d->name);
+			return;
+		}
+	}
+#endif
 	if (!(pcicmd & PCI_COMMAND_IO)) {	/* is device disabled? */
 		/*
 		 * PnP BIOS was *supposed* to have set this device up for us,

Dan

Daniel Jacobowitz           | SCS Class of 2002
Debian GNU/Linux Developer  | Carnegie Mellon University
[EMAIL PROTECTED]           | [EMAIL PROTECTED]
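The difference between the two approaches in this message is worth spelling out: writing the constant 0x05 overwrites the whole command register, while the patch does a read-modify-write that only turns on the I/O bit. The user-space sketch below models just that bit arithmetic. The bit values are the standard PCI command register bits (the kernel uses the same names); the helper functions are hypothetical and do not touch any hardware.

```c
#include <stdint.h>

/* Standard PCI command register bits. */
#define PCI_COMMAND_IO      0x1   /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x2   /* respond to memory space accesses */
#define PCI_COMMAND_MASTER  0x4   /* allow the device to bus-master */

/* Hypothetical helper: is I/O space decoding enabled in `pcicmd`? */
int pci_io_enabled(uint16_t pcicmd)
{
    return (pcicmd & PCI_COMMAND_IO) != 0;
}

/*
 * Hypothetical helper: the value the patch writes back. Only the I/O
 * bit is added; every bit that was already set is preserved.
 */
uint16_t pci_command_with_io(uint16_t pcicmd)
{
    return (uint16_t)(pcicmd | PCI_COMMAND_IO);
}
```

Under this reading, 0x05 is PCI_COMMAND_IO | PCI_COMMAND_MASTER, so forcing that value would clear PCI_COMMAND_MEMORY on a card whose firmware had enabled it. Starting from 0x06 (memory and bus-master enabled, I/O disabled), pci_command_with_io() yields 0x07 and leaves memory decoding on.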