RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> >  static inline void wait_for_init_deassert(atomic_t *deassert)
> >  {
> > -	while (!atomic_read(deassert));
> > +	while (!atomic_read(deassert))
> > +		cpu_relax();
> >  	return;
> >  }
>
> For less-than-brilliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.
>
> IOW: "atomic_read" name quite unambiguously means "I will read
> this variable from main memory". Which is not true and creates
> potential for confusion and bugs.

To me, "atomic_read" means a read which is synchronized with other
changes to the variable (using the atomic_XXX functions) in such
a way that I will always only see the "before" or "after"
state of the variable - never an intermediate state while a
modification is happening.  It doesn't imply that I have to
see the "after" state immediately after another thread modifies
it.

Perhaps the Linux atomic_XXX functions work like that, or used
to work like that, but it's counter-intuitive to me that "atomic"
should imply a memory read.

Later,
Kenn

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
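The distinction drawn in this message can be sketched in user space. This is only an illustration using C11 <stdatomic.h>, not the kernel's atomic_t API: wait_for_flag(), relax() and the max_spins bound are hypothetical names introduced here. The relaxed load promises an untorn value, nothing about how fresh it is; the compiler-only fence in relax() plays the role of the compiler barrier in the kernel's cpu_relax(), which is what stops the read being hoisted out of the loop.

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-in for the kernel's cpu_relax(): a compiler-only
 * barrier. The real cpu_relax() may also emit a CPU "pause" hint.
 */
static inline void relax(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Spin until *flag becomes non-zero or max_spins polls have failed.
 * Returns the number of failed polls, or -1 if the bound was reached.
 * The bound is an addition for this sketch; the kernel loop spins forever.
 */
static inline int wait_for_flag(atomic_int *flag, int max_spins)
{
    int spins = 0;

    /* Relaxed load: atomic (never torn), but with no ordering guarantee. */
    while (!atomic_load_explicit(flag, memory_order_relaxed)) {
        if (++spins >= max_spins)
            return -1;
        relax();
    }
    return spins;
}
```

With C11 atomics a compiler will not, in practice, hoist the load even without the fence; it is kept here only to mirror the shape of the patch.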
Re: [LV] start_thread question...
On Sun, May 20, 2001 at 05:24:48PM +0100, Dave Airlie wrote:
>
> I'm implementing start_thread for the VAX port and am wondering does
> start_thread have to return to load_elf_binary? I'm working on the init
> thread and what is happening is it is returning the whole way back to the
> execve caller .. which I know shouldn't happen.
>
> so I suppose what I'm looking for is the point where the user space code
> gets control... is it when the registers are set in the start_thread? if
> so how does start_thread return
>
> On the VAX we have to call a return from interrupt to get to user space
> and I'm trying to figure out where this should happen...

I haven't got time to look at this in detail, but you could probably
do it by frobbing the saved registers that will be restored by the
ret_from_syscall in entry.S.

Do you have a pt_regs *regs function argument at the right point?
If so, it should point to these saved registers.

Later,
Kenn
IP autoconfig via DHCP?
Quick question...

Back in 2.2, we could use DHCP to auto-config the IP setup.  In fact,
the choice was DHCP, BOOTP or RARP.  Now there is only BOOTP or RARP.

What happened to DHCP support?

Later,
Kenn
Re: kmalloc() alignment
On Mon, Mar 05, 2001 at 04:15:36PM -0800, H. Peter Anvin wrote:
> > So, to summarise (for 32-bit CPUs):
> >
> > o  Alan Cox & Manfred Spraul say 4-byte alignment is guaranteed.
> >
> > o  If you need larger alignment, you need to alloc a larger space,
> >    round as necessary, and keep the original pointer for kfree()
> >
> > Maybe I'll just use get_free_pages, since it's a 64KB chunk that
> > I need (and it's only a once-off).
> >
>
> It might be worth asking the question if larger blocks are more
> aligned?

OK, I'll bite...  Are larger blocks more aligned?

Later,
Kenn
Re: kmalloc() alignment
On Sun, Mar 04, 2001 at 11:41:12PM +0100, Manfred Spraul wrote:
> >
> > Does kmalloc() make any guarantees of the alignment of allocated
> > blocks?  Will the returned block always be 4-, 8- or 16-byte
> > aligned, for example?
> >
>
> 4-byte alignment is guaranteed on 32-bit cpus, 8-byte alignment on
> 64-bit cpus.

So, to summarise (for 32-bit CPUs):

o  Alan Cox & Manfred Spraul say 4-byte alignment is guaranteed.

o  If you need larger alignment, you need to alloc a larger space,
   round as necessary, and keep the original pointer for kfree()

Maybe I'll just use get_free_pages, since it's a 64KB chunk that
I need (and it's only a once-off).

Thanks for your advice.

Later,
Kenn
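The "alloc a larger space, round as necessary, keep the original pointer" recipe from the summary can be sketched as follows. This is a user-space analogue that uses malloc()/free() in place of kmalloc()/kfree(); struct aligned_block and alloc_aligned() are hypothetical names for illustration, not kernel interfaces.

```c
#include <stdint.h>
#include <stdlib.h>

struct aligned_block {
    void *raw;      /* pointer returned by malloc(); this is what gets freed */
    void *aligned;  /* first address inside the block that meets 'align' */
};

/*
 * Over-allocate by align - 1 bytes and round the start address up.
 * 'align' must be a power of two. On failure both fields are NULL.
 */
static struct aligned_block alloc_aligned(size_t size, size_t align)
{
    struct aligned_block b = { NULL, NULL };

    if (align == 0 || (align & (align - 1)) != 0)
        return b;                       /* not a power of two */
    if (size > SIZE_MAX - (align - 1))
        return b;                       /* size + align - 1 would overflow */

    b.raw = malloc(size + align - 1);
    if (b.raw != NULL) {
        uintptr_t p = (uintptr_t)b.raw;
        b.aligned = (void *)((p + align - 1) & ~(uintptr_t)(align - 1));
    }
    return b;
}
```

The caller works with b.aligned and releases the memory with free(b.raw). Freeing the rounded pointer would be a bug, which is why the original pointer has to be kept.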
kmalloc() alignment
Does kmalloc() make any guarantees of the alignment of allocated
blocks?  Will the returned block always be 4-, 8- or 16-byte
aligned, for example?

Later,
Kenn
Re: kernel_thread() & thread starting
On Sun, Feb 18, 2001 at 10:53:16PM +0000, Russell King wrote:
> Kenn Humborg writes:
> > When starting bdflush and kupdated, bdflush_init() uses a semaphore to
> > make sure that the threads have run before continuing.  Shouldn't
> > start_context_thread() do something similar?
>
> I think this would be a good idea.  Here is a patch to try.  Please report
> back if it works so that it can be forwarded to Linus.  Thanks.

Works perfectly for me.  I'll leave it up to you guys to decide
what's the right way to deal with this and pass a patch to
Linus/Alan.  Meanwhile, I'll keep Russell's patch below in
our CVS tree.

Thanks,
Kenn

> --- orig/kernel/context.c	Tue Jan 30 13:31:11 2001
> +++ linux/kernel/context.c	Sun Feb 18 22:51:56 2001
> @@ -63,7 +63,7 @@
>  	return ret;
>  }
>
> -static int context_thread(void *dummy)
> +static int context_thread(void *sem)
>  {
>  	struct task_struct *curtask = current;
>  	DECLARE_WAITQUEUE(wait, curtask);
> @@ -79,6 +79,8 @@
>  	recalc_sigpending(curtask);
>  	spin_unlock_irq(&curtask->sigmask_lock);
>
> +	up((struct semaphore *)sem);
> +
>  	/* Install a handler so SIGCLD is delivered */
>  	sa.sa.sa_handler = SIG_IGN;
>  	sa.sa.sa_flags = 0;
> @@ -148,7 +150,9 @@
>
>  int start_context_thread(void)
>  {
> -	kernel_thread(context_thread, NULL, CLONE_FS | CLONE_FILES);
> +	DECLARE_MUTEX_LOCKED(sem);
> +	kernel_thread(context_thread, &sem, CLONE_FS | CLONE_FILES);
> +	down(&sem);
>  	return 0;
>  }
>
>
> --
> Russell King ([EMAIL PROTECTED])                The developer of ARM Linux
>              http://www.arm.linux.org.uk/personal/aboutme.html
kernel_thread() & thread starting
In init/main.c, do_basic_setup() we have:

        start_context_thread();
        do_initcalls();

start_context_thread() calls kernel_thread() to start the keventd
thread.  Then do_initcalls() calls all the init functions and
finishes by calling flush_scheduled_tasks().  This function ends up
calling schedule_task() which checks if keventd is running.

With a very stripped down kernel, it seems possible that do_initcalls()
can complete without context_thread() having had a chance to run (and
set the flag that keventd is running).

Right now, in the Linux/VAX project, I'm working with a very stripped
down kernel and I'm seeing this behaviour.  Depending on what I enable
in the .config, I can get schedule_task() to fail with:

   schedule_task(): keventd has not started

When starting bdflush and kupdated, bdflush_init() uses a semaphore to
make sure that the threads have run before continuing.  Shouldn't
start_context_thread() do something similar?

Or am I missing something?

Thanks,
Kenn
Re: Third arg to switch_to()
On Mon, Oct 30, 2000 at 07:15:58PM +0000, I wrote:
>
> Can anyone point me to an explanation of the third arg to
> switch_to(prev, next, last)?
>
> It appeared in 2.2.8.
>
> What exactly is supposed to be written to it?

Mea culpa...

Further digging revealed that it's for returning prev in the new task,
to deal with the fact that the stack has changed so local variables in
schedule() don't exist anymore.

Later,
Kenn
Third arg to switch_to()
Can anyone point me to an explanation of the third arg to
switch_to(prev, next, last)?

It appeared in 2.2.8.

What exactly is supposed to be written to it?

Thanks,
Kenn
RE: 2.4 MM overview?
> > That's not the worst!  Considering the 4-byte PTE and the 40-byte
> > mem_map_t, our memory management overhead is at least 44 bytes/page
> > or 8.5%!
>
> use a logical page size of 4kb.
>
> > We are formulating cunning plans of aggregating 2, 4 or 8 pages together
> > into "bigpages", telling the arch-independent code that we've got
> > larger pages than we really have and manipulating multiple PTEs in the
> > set_pte() primitive and friends.
> >
> > We don't know how feasible this is yet..
>
> why wouldn't it be feasible ?

Because I don't know this part of the kernel well enough yet :-)

Maybe there are concurrency issues with modifying multiple PTEs when
the kernel thinks it's only modifying one.  There may be
hardware-mandated limitations on this too.  I'll have to check _very_
closely with the VAX Architecture Reference Manual.

> > > OTOH, I think mapping all physical memory makes sense with the
> > > three page table setup.
> >
> > It might and it might not.  Expanding the system page table is pretty
> > much out of the question because it needs to be physically contiguous.
>
> agreed.
>
> > So we need to allocate system PTEs for the following at boot time:
> >
> >    1. Map all physical memory pages
> >    2. Spare PTEs for mapping I/O space via ioremap().
> >    3. Spare PTEs for vmalloc()
>      4. Spare PTEs for making user process page tables virtually
>         contiguous.

Couldn't we use vmalloc() for this?

> Note that this effectively gives you a two-level page table.
> (Actually, a 3-level page table, with 2 pmds per pgd, 4K PTEs per
> 3rd-level page table, and 512 bytes per page.)
>
> So, here's what I'm proposing:

I'll need to examine this more closely when I get home later.
Too busy right now :-(

> > It seems a bit wasteful that process pages will have two PTEs, one in
> > the relevant process page table and one in the system page table.
>
> why ?  You lose 0.78 % of your physical memory compared to the more
> complicated design, which shouldn't hurt too much.

The 'scarce resource' I'm thinking about here is not memory, it's
system PTEs.

> It might make sense
> if you have tons of physical memory though so you can use all of it
> (where tons I'd guess to be about 1.8 GB, not knowing too much about
> the architecture).

Memory from 0xc0000000 to 0xffffffff is not usable in VAX, so
map-all-memory will give a maximum of just under 1GB.  I have a
feeling that there is an architectural limit of 1GB anyway
(21-bit page frame number + 9 bit PAGE_SHIFT = 30 bits = 1GB).

> > How much space tends to be vmalloc()-ed in a running system?
>
> See the discussion for alpha a week or so ago.  It tends to not be
> very much but for some applications (TUX, for example), it's expected
> to be most of physical memory.

Dammit!  Must have been just before I subscribed...  I'll do some
archive archeology later.

Later,
Kenn
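The arithmetic behind the 1GB figure, and behind the worry about system PTEs, can be checked with a few lines of C. The 21-bit page frame number and 9-bit page shift are the values quoted in this message; max_physical_bytes() and ptes_needed() are hypothetical helper names used only for this sketch.

```c
#include <stdint.h>

#define VAX_PAGE_SHIFT 9u   /* 512-byte pages, as stated in the message */
#define VAX_PFN_BITS   21u  /* page frame number width, as stated in the message */

/* Bytes addressable with a pfn_bits-wide frame number and the given page shift. */
static uint64_t max_physical_bytes(unsigned pfn_bits, unsigned page_shift)
{
    return (uint64_t)1 << (pfn_bits + page_shift);
}

/* PTEs needed to map 'bytes' of memory one page at a time, rounding up. */
static uint64_t ptes_needed(uint64_t bytes, unsigned page_shift)
{
    uint64_t page_size = (uint64_t)1 << page_shift;
    return (bytes + page_size - 1) >> page_shift;
}
```

max_physical_bytes(VAX_PFN_BITS, VAX_PAGE_SHIFT) is 2^30 bytes, and mapping all of it would take 2^21 system PTEs, which at 4 bytes each is 8 MB of physically contiguous system page table.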
RE: 2.4 MM overview?
> > We've kind of got 1.5-level page tables.  There are actually 3
> > page tables.  The system page table maps memory starting at
> > 0x80000000.  The P0 process page table maps from 0x0 up and the
> > P1 process page table maps from 0x7fffffff down.
>
> And they have to be physically contiguous I guess ?

The system page table must be physically contiguous.  The process
tables are actually referred to via virtual addresses, so they only
have to be virtually contiguous in system space.

> > This means that sparse address spaces are going to be _really_ expensive
> > on PTEs.  I don't know how much of a problem this is going to be yet,
> > but I'm sure it's going to be fun :-)
>
> 512 byte pages, 4 bytes per pte ? Ouch. Can you fill the TLB manually ?

That's not the worst!  Considering the 4-byte PTE and the 40-byte
mem_map_t, our memory management overhead is at least 44 bytes/page
or 8.5%!

We are formulating cunning plans of aggregating 2, 4 or 8 pages together
into "bigpages", telling the arch-independent code that we've got
larger pages than we really have and manipulating multiple PTEs in the
set_pte() primitive and friends.

We don't know how feasible this is yet..

> OTOH, I think mapping all physical memory makes sense with the three page
> table setup.

It might and it might not.  Expanding the system page table is pretty
much out of the question because it needs to be physically contiguous.

So we need to allocate system PTEs for the following at boot time:

   1. Map all physical memory pages
   2. Spare PTEs for mapping I/O space via ioremap().
   3. Spare PTEs for vmalloc()

It seems a bit wasteful that process pages will have two PTEs, one in
the relevant process page table and one in the system page table.
If we could get away without needing the system PTE, then this would
either provide more space for #2 and #3 above, or reduce the size of
the system page table.

How much space tends to be vmalloc()-ed in a running system?

Later,
Kenn
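The overhead figure in this message is simple to reproduce. The sketch below works in hundredths of a percent to stay in integer arithmetic; overhead_bp() is a hypothetical helper. The last case, which assumes one mem_map_t per 8-page "bigpage" plus eight hardware PTEs, is an assumption about how the bigpage plan might be accounted for, not something stated in the thread.

```c
/*
 * Per-page bookkeeping overhead in hundredths of a percent, rounded down.
 * overhead_bytes: bytes of metadata kept per page (PTE, mem_map_t, ...)
 * page_bytes:     size of the page that metadata describes
 */
static unsigned overhead_bp(unsigned overhead_bytes, unsigned page_bytes)
{
    return overhead_bytes * 10000u / page_bytes;
}
```

overhead_bp(4 + 40, 512) gives 859, i.e. about 8.59%, the "at least 8.5%" quoted here. overhead_bp(4, 512) gives 78 for a single PTE on its own. Under the bigpage assumption, overhead_bp(40 + 8 * 4, 8 * 512) gives 175, roughly 1.75%.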
Re: 2.4 MM overview?
On Sun, Oct 15, 2000 at 09:45:11PM +0100, Alan Cox wrote:
> > Well, we ain't got these luxuries/complications in VAXland...  Hell,
> > we don't even have two-level page tables :-(
>
> Really. Ugh. I always assumed Vax had at least two levels because mmap on
> 4.2 BSD used to panic on 128K+ blocks. I guess there was a different reason
> for that then

We've kind of got 1.5-level page tables.  There are actually 3 page
tables.  The system page table maps memory starting at 0x80000000.  The
P0 process page table maps from 0x0 up and the P1 process page table
maps from 0x7fffffff down.

This means that sparse address spaces are going to be _really_ expensive
on PTEs.  I don't know how much of a problem this is going to be yet,
but I'm sure it's going to be fun :-)

You can be sure that the first valid page won't be at 0x08048000...

Later,
Kenn
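To put a number on why 0x08048000 is a poor place for the first valid page, here is a small sketch of the layout described above. The region boundaries follow the VAX split on the top two address bits; the 0x40000000 boundary between P0 and P1 is not spelled out in the message and is filled in here from the architecture. vax_region_of() and p0_ptes_to_reach() are hypothetical names.

```c
#include <stdint.h>

#define VAX_PAGE_SHIFT 9u  /* 512-byte pages */

enum vax_region { VAX_P0, VAX_P1, VAX_S0, VAX_RESERVED };

/* The top two bits of a virtual address select the region. */
static enum vax_region vax_region_of(uint32_t va)
{
    switch (va >> 30) {
    case 0:  return VAX_P0;        /* 0x00000000 and up, table grows upward */
    case 1:  return VAX_P1;        /* up to 0x7fffffff, table grows downward */
    case 2:  return VAX_S0;        /* system space from 0x80000000 */
    default: return VAX_RESERVED;  /* 0xc0000000 and above */
    }
}

/*
 * The P0 table is a single linear array starting at virtual page 0, so
 * mapping one page at 'va' needs entries for every page below it too.
 */
static uint32_t p0_ptes_to_reach(uint32_t va)
{
    return (va >> VAX_PAGE_SHIFT) + 1u;
}
```

p0_ptes_to_reach(0x08048000) is 262721 entries, about 1 MB of page table at 4 bytes per PTE, just to make a single page valid at the usual x86 ELF load address.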
Re: 2.4 MM overview?
On Sun, Oct 15, 2000 at 09:22:58PM +0100, Alan Cox wrote:
> > > or you have a sane memory management model with tags/spaces then
> > > it's a non issue
> >
> > You've lost me here.  Tags/spaces?
>
> A lot of memory management hardware allows you to build page tables that
> contain more than just the addresses. Instead a tag register or the
> processor state or both are combined in the lookup. This is particularly
> important for a virtually tagged cache to avoid flushing the cache on
> task switches

[Consults VMS/Alpha Internals & Data Structure manual...]

You mean like the way the Alpha has a PTE bit that says 'this page
is valid at the same address in every process', and the address space
number (ASN) that can be used to 'uniquefy' cache entries for
the same virtual addresses in different processes?

Well, we ain't got these luxuries/complications in VAXland...  Hell,
we don't even have two-level page tables :-(

Later,
Kenn
Re: 2.4 MM overview?
On Sun, Oct 15, 2000 at 08:35:46PM +0100, Alan Cox wrote:
> > I understand that 2.4 no longer maps all physical memory as 2.2
> > and earlier used to do.
>
> It's really up to you if you choose to do that or not. If you have enough
> address space to create all your virtual and physical mappings without
> problems,

OK...

> or you have a sane memory management model with tags/spaces then it's a
> non issue

You've lost me here.  Tags/spaces?

Later,
Kenn
Re: 2.4 MM overview?
On Sun, Oct 15, 2000 at 08:07:06PM +0200, Andi Kleen wrote:
> On Sun, Oct 15, 2000 at 05:29:46PM +0100, Kenn Humborg wrote:
> >
> > Surprise!
> >
> > __pa() and __va() are still defined as addr -/+ PAGE_OFFSET.  So
> > where did I hear about 2.4 not mapping all memory?  Could it be
> > that this applies only to "high memory" in x86?
>
> It only applies to high memory. To access it you have to use kmap().
> To use __pa you need to create a bounce buffer in lowmem first as of 2.4.

Excellent!  Thanks for clearing that up.

Later,
Kenn
Re: 2.4 MM overview?
On Sun, Oct 15, 2000 at 06:03:40PM +0200, Erik Mouw wrote:
> On Sun, Oct 15, 2000 at 04:24:45PM +0100, Kenn Humborg wrote:
> > I understand that 2.4 no longer maps all physical memory as 2.2
> > and earlier used to do.
> >
> > Is there any documentation on this change and how it affects
> > arch-specific code?
> >
> > Specifically, we've been basing the VAX port on 2.2 while waiting
> > for 2.4 to stabilize.  Now we're looking at moving to 2.4.
>
> Have a look at the Linux-MM pages at:
>
>   http://www.linux.eu.org/Linux-MM/

The stuff linked to from there seems to cover the higher-level
VM aspects like balancing the VM.  Basically arch-independent stuff.

I'm looking for info on the impact the 2.4 changes will have on the
"API" between the arch-indep and arch-dep code.

For example, 2.2 assumes that you can access a physical address by
referencing that address + PAGE_OFFSET.  AFAIK, this is no longer
true in 2.4.  So what's the new mechanism for accessing physical
memory?

OK, this particular question is easily answered by reading the source...

Surprise!

__pa() and __va() are still defined as addr -/+ PAGE_OFFSET.  So
where did I hear about 2.4 not mapping all memory?  Could it be
that this applies only to "high memory" in x86?

Dazed and confused,
Kenn
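The "addr -/+ PAGE_OFFSET" relationship mentioned above amounts to a fixed linear offset between physical and kernel virtual addresses. A minimal sketch follows; EXAMPLE_PAGE_OFFSET, pa_of() and va_of() are hypothetical names, and the offset value is chosen only for illustration (the real constant is per-architecture).

```c
#include <stdint.h>

/* Hypothetical offset for illustration; each architecture defines its own. */
#define EXAMPLE_PAGE_OFFSET 0x80000000u

/* Kernel virtual address -> physical address (the role __pa() plays). */
static inline uint32_t pa_of(uint32_t va)
{
    return va - EXAMPLE_PAGE_OFFSET;
}

/* Physical address -> kernel virtual address (the role __va() plays). */
static inline uint32_t va_of(uint32_t pa)
{
    return pa + EXAMPLE_PAGE_OFFSET;
}
```

This only holds for memory that is permanently mapped at that offset. Pages outside that window have no such fixed virtual address, so a mapping has to be set up before they can be touched.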
2.4 MM overview?
I understand that 2.4 no longer maps all physical memory as 2.2
and earlier used to do.

Is there any documentation on this change and how it affects
arch-specific code?

Specifically, we've been basing the VAX port on 2.2 while waiting
for 2.4 to stabilize.  Now we're looking at moving to 2.4.

Thanks,
Kenn
Re: Calling current() from interrupt context
On Tue, Oct 10, 2000 at 12:55:33AM +0200, Andi Kleen wrote:
> On Mon, Oct 09, 2000 at 11:45:18PM +0100, Kenn Humborg wrote:
> > Simple. Each interrupt stack is, say, 8 pages. You have an array
> > of N interrupt stacks. Then you calculate
> >
> >    cpu_id = (sp & ~(INT_STACK_SIZE-1)) >> (PAGE_SHIFT + 3);
> >
> > Actually, I'd put the interrupt stack and any other per-cpu data
> > structures together in this region.
>
> So your smp_processor_id() looks like:
>
> #define smp_processor_id() \
>   (in_interrupt() ? (sp & ~(INT_STACK_SIZE-1)) >> (PAGE_SHIFT + 3) :
>   (struct task_struct *)(sp & -8192)->current_cpu)
>
> ?

Nope.

> There is just an ugly problem: in_interrupt already requires the CPU id
> to look up the table of interrupt counters.

The PSL (processor status longword) has a bit that tells you whether
you're currently on the interrupt stack or not. You can test this
in two instructions:

    movpsl r0                # get PSL
    bbs    $0x25, r0, dst    # branch if I bit set (bit 25)

Later,
Kenn
Re: Calling current() from interrupt context
On Tue, Oct 10, 2000 at 12:36:35AM +0200, Andi Kleen wrote:
> On Mon, Oct 09, 2000 at 11:30:50PM +0100, Alan Cox wrote:
> > > I think I'll go for the 'current is in a well-known register'
> > > approach and see how this goes...
> >
> > Failing that the 2.0 approach will work, current is a global in uniprocessor
> > and a #define to an array indexed by cpu id in smp
>
> The problem is where to get the cpuid from (see how smp_processor_id
> is currently defined ;) When you don't have a hidden register in the
> CPU you're screwed.
> [x86-64 has one btw]

Simple. Each interrupt stack is, say, 8 pages. You have an array
of N interrupt stacks. Then you calculate

    cpu_id = (sp & ~(INT_STACK_SIZE-1)) >> (PAGE_SHIFT + 3);

Actually, I'd put the interrupt stack and any other per-cpu data
structures together in this region.

I don't know yet how you decide which secondary processor is which
at boot time. Maybe it doesn't matter, so you can just let them
fight over the per-cpu data structures by trying to claim spinlocks
on each one in turn.

Anyway, this SMP stuff will be quite academic for a while unless
someone wants to donate a workstation-sized SMP VAX (if such a beast
exists at all :-)

Later,
Kenn
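The stack-array idea in this message can be sketched in user-space C roughly as follows. This is a guess at the shape, not code from the VAX port: the 4 KiB page size, the four-CPU limit and every sketch_* name are assumptions for illustration. The one deliberate difference from the formula in the mail is that the sketch subtracts the base address of the array, since shifting the raw stack pointer alone would only give the right index if the array started at address zero.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT     12                         /* assumed 4 KiB pages */
#define SKETCH_INT_STACK_SIZE (8u << SKETCH_PAGE_SHIFT)  /* 8 pages per CPU, as in the mail */
#define SKETCH_NR_CPUS        4                          /* arbitrary for the example */

/* One interrupt stack per CPU, laid out back to back. */
static unsigned char sketch_int_stacks[SKETCH_NR_CPUS][SKETCH_INT_STACK_SIZE];

/* Recover the CPU index from any address inside that CPU's interrupt stack. */
static inline unsigned int sketch_cpu_id_from_sp(uintptr_t sp)
{
    uintptr_t offset = sp - (uintptr_t)&sketch_int_stacks[0][0];

    /* Dividing by the stack size (8 pages) is a shift by PAGE_SHIFT + 3. */
    return (unsigned int)(offset >> (SKETCH_PAGE_SHIFT + 3));
}
```

One caveat: with a downward-growing stack, an empty stack's pointer sits exactly on the upper boundary and would index as the next CPU, so real code would have to account for that case.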
Re: Calling current() from interrupt context
On Tue, Oct 10, 2000 at 09:04:30AM +1100, Keith Owens wrote:
> On 9 Oct 2000 11:08:36 -0700,
> [EMAIL PROTECTED] (Linus Torvalds) wrote:
> >Note that there are alternative approaches. For example, you could make
> >the interrupt stack be in the same multi-page as the regular stack, and
> >switch them both at task-switch time - just allocate four pages instead
> >of two, and use "current = esp & ~16383" instead or something like that.
>
> Ouch. Too many places in the source have hard coded 8191 or 8192.
> Would you take a patch to replace all those hard coded numbers with
> #defines or is that best left for 2.4.1?

It wouldn't work anyway. There has to be _one_ interrupt stack
per CPU. When an interrupt happens (or a certain bit is set in
an exception vector), the CPU saves SP in the USP or KSP register
(user or kernel SP) and loads SP from the ISP register. USP and
KSP are considered part of process context and are saved and restored
across context switches. ISP is not.

Thinks out loud...

Or maybe we could play tricks with ISP during the context switch...
It would certainly be made simpler by Linux's all-or-nothing approach
to enabling/disabling interrupts. (In contrast to VMS which makes
extensive use of the VAX's 31 interrupt priority levels. Process
re-scheduling happens at priority 3, devices interrupt at priority
16-23, power failure interrupts at 30, for example.)

Or maybe not... What if a device interrupted at IPL 20. Another
device could interrupt at IPL 21 in the window before we block all
interrupts in the first interrupt handler. Then this second handler
triggers a resched. If we switch to a different interrupt stack
then we'll destroy the stack context of the first handler. Unless
we either copy around the stack context (ugh) or map the same physical
stack into each task_struct. Seems a bit wasteful.

I think I'll go for the 'current is in a well-known register'
approach and see how this goes...

Later,
Kenn
Re: Calling current() from interrupt context
On Mon, Oct 09, 2000 at 03:54:21AM +0200, Andi Kleen wrote:
> On Mon, Oct 09, 2000 at 02:45:54AM +0100, Kenn Humborg wrote:
> > On Mon, Oct 09, 2000 at 02:21:09AM +0100, Kenn Humborg wrote:
> > > On Mon, Oct 09, 2000 at 02:20:27AM +0200, Andi Kleen wrote:
> > > > 2.4 TCP code relies on current being valid in a softirq.
> > >
> > > And what the hell does TCP need current for anyway?
> >
> > I think the only reference is in tcp_input.c, tcp_data_queue().
> > This does:
> > [...]
>
> It is actually used in two places, in the fast path and there. It isn't
> as bad as it looks because it is only used in user context and could
> be fixed by putting a special flag into the sock for the execute
> in user context case (or just supply an argument that is passed around)
>
> The point was just that there are probably other users of current
> in interrupt context and AFAIK it works currently in all ports
> so you would need to fix these (mostly buggy) occurrences.

OK. I'm convinced. current will be valid in interrupt context.

> If you ever wanted to do a SMP VAX port you would also need to fix
> smp_processor_id().

No problem. I've already come up with a couple of ways of doing
that.

Later,
Kenn
Re: Calling current() from interrupt context
On Mon, Oct 09, 2000 at 02:21:09AM +0100, Kenn Humborg wrote:
> On Mon, Oct 09, 2000 at 02:20:27AM +0200, Andi Kleen wrote:
> > 2.4 TCP code relies on current being valid in a softirq.
>
> And what the hell does TCP need current for anyway?

I think the only reference is in tcp_input.c, tcp_data_queue().
This does:

    /* Queue data for delivery to the user.
     * Packets in sequence go to the receive queue.
     * Out of sequence packets to the out_of_order_queue.
     */
    if (TCP_SKB_CB(skb)->seq == tp->rcv_nxt) {
        /* Ok. In sequence. */
        if (tp->ucopy.task == current &&
            tp->copied_seq == tp->rcv_nxt &&
            tp->ucopy.len &&
            sk->lock.users &&
            !tp->urg_data) {
            int chunk = min(skb->len, tp->ucopy.len);

            __set_current_state(TASK_RUNNING);

Hmmm... I think I like the idea of having a different current
for interrupt context code.

It looks like it's either that, slowing down get_current() by
checking for interrupt or kernel stack, or devoting a register
to current.

Must sleep on this...

Later,
Kenn
Re: Calling current() from interrupt context
On Mon, Oct 09, 2000 at 02:20:27AM +0200, Andi Kleen wrote:
> 2.4 TCP code relies on current being valid in a softirq.

Well, then as long as Linux guarantees that there is always a
valid 'current task' on a CPU, then I can special-case the
called-from-interrupt case. The previous kernel stack pointer
is accessible from another processor register, so I can go in
there and pull it out and use it to calculate current.

Is it possible to get an interrupt during context switching,
for example? Or any other window during which there isn't a
valid current?

And what the hell does TCP need current for anyway?

Later,
Kenn
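The special case described in this message might look something like the following user-space sketch. It is not the port's actual get_current(): the struct sketch_cpu_state snapshot, its field names and sketch_get_current() are all hypothetical stand-ins for what real code would read from the PSL and the saved kernel stack pointer register, and the 8 KiB region size is taken from the stack-masking hack discussed in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_STACK_SIZE 8192u  /* task_struct + kernel stack region (assumed 8 KiB) */

struct sketch_task;  /* opaque stand-in for task_struct */

/* Hypothetical snapshot of the CPU state that matters for this decision. */
struct sketch_cpu_state {
    uintptr_t sp;             /* stack pointer currently in use */
    uintptr_t saved_ksp;      /* kernel SP saved by hardware on interrupt entry */
    bool on_interrupt_stack;  /* what the PSL interrupt-stack bit would report */
};

/* Pick the stack pointer that belongs to the task, then mask down to the region base. */
static inline struct sketch_task *sketch_get_current(const struct sketch_cpu_state *cpu)
{
    uintptr_t ksp = cpu->on_interrupt_stack ? cpu->saved_ksp : cpu->sp;

    return (struct sketch_task *)(ksp & ~(uintptr_t)(SKETCH_STACK_SIZE - 1));
}
```

The extra branch is the cost referred to elsewhere in the thread as "slowing down get_current()"; keeping current in a dedicated register avoids it.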
Re: Calling current() from interrupt context
On Mon, Oct 09, 2000 at 01:02:21AM +0200, Jamie Lokier wrote:
> [EMAIL PROTECTED] wrote:
> > BTW: there is an implicit reference to "current" in smp_processor_id.
>
> Yes I forgot about that. (Self-flagellate). However that is
> architecture specific. If it's not an SMP Vax port, no big deal. If it
> is, there's a way to arrange that smp_processor_id returns the correct
> processor id even from the interrupt stack.

Yes, that's easily done. Interrupt stacks are per-processor, so they
are part of the per-cpu data structures. So we can use a similar
trick to the task_struct/kernel stack hack.

(And still get a crash if current is used from interrupt context.)

Later,
Kenn
Re: Calling current() from interrupt context
On Mon, Oct 09, 2000 at 02:20:27AM +0200, Andi Kleen wrote:
> On Mon, Oct 09, 2000 at 12:30:17AM +0200, Jamie Lokier wrote:
> > Kenn Humborg wrote:
> > > My feeling is that interrupt code has no business calling current(),
> > > but I don't know the kernel well enough to be sure. Is there any
> > > interrupt-level code that calls current() or is it a design
> > > principle that it cannot be called?

...

> > So if you can make the machine crash utterly when calling "current" in
> > irq context, or when dereferencing the result, that would probably be a
> > good thing :-)

Easily done. Because I don't really know how big we need to make
the stacks yet, I've put a non-accessible guard page just below the
interrupt stack. I can arrange for (SP & ~8191) to hit this page.

> 2.4 TCP code relies on current being valid in a softirq.
>
> The m68k port which has an interrupt stack solves the problem by
> loading current into a global register variable on all kernel entries.
> x86-64 will likely do the same.

How do you tell GCC to stay away from that register when compiling
the kernel without also making it unusable in userland?

Later,
Kenn
Calling current() from interrupt context
I'd just like to confirm that it's illegal to call current() from
interrupt-handling code.

I'm working on the VAX port and the reason I ask is that the VAX
has separate stack pointers for user, kernel and interrupt contexts.
Therefore, the current = (SP & ~8191) hack will give completely
bogus results when handling an interrupt.

My feeling is that interrupt code has no business calling current(),
but I don't know the kernel well enough to be sure. Is there any
interrupt-level code that calls current() or is it a design
principle that it cannot be called?

Thanks,
Kenn
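To make the problem in this message concrete, here is a small user-space sketch of the stack-masking hack and of how it misfires on a separate interrupt stack. Everything named sketch_* is invented for the illustration, the 8 KiB region size is the one the thread discusses, and the layout (task struct at the base of an aligned region, kernel stack above it) is an assumption modelled on that discussion rather than copied from any kernel source.

```c
#include <stdint.h>

#define SKETCH_THREAD_SIZE 8192u  /* assumed size of the task_struct + kernel stack region */

struct sketch_task { int pid; };

/* The task struct sits at the base of the region; the kernel stack fills the rest. */
union sketch_task_union {
    struct sketch_task task;
    unsigned char stack[SKETCH_THREAD_SIZE];
};

/* Both regions are aligned to their size so that masking finds the region base. */
static _Alignas(SKETCH_THREAD_SIZE) union sketch_task_union sketch_task_region;
static _Alignas(SKETCH_THREAD_SIZE) unsigned char sketch_int_stack[SKETCH_THREAD_SIZE];

/* The hack itself: clear the low 13 bits of the stack pointer. */
static inline struct sketch_task *sketch_current_from_sp(uintptr_t sp)
{
    return (struct sketch_task *)(sp & ~(uintptr_t)(SKETCH_THREAD_SIZE - 1));
}
```

Passing an address from inside sketch_task_region.stack returns the task, while an address inside sketch_int_stack masks down to the base of the interrupt stack, which holds no task struct at all. That is the "completely bogus" result the message describes.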