Re: large memory support for x86
Petr Vandrovec writes:

> Sure it does not. Selectors point to linear addresses, before passing them
> through pagetables. You have 32+14 bits of virtual address (32 = offset,
> 14 = valid bits in selector), which are translated, together with
> offset, to 32 bit linear address. This 32bit linear address is passed
> through pagetables to 36 bit physical address. So it must go through
> 32bit linear address, and there is no easy way to overcome this limit.

...

> make complete segment non present, and on pagefault
> unmap all pages belonging to some selector + invalidate selector, and
> map something else in. You must create at least four such areas,
> as you must have mapped at least CS, SS, ES and one of DS/FS/GS to
> successfully execute MOVSB... So each area should be < 256MB.

This does it... You sort of page (segment?) the virtual space with segment
faults. Um, you want 3-level software page tables with that?

> Are you really sure that it is worth of effort? Also do not forget
> that 'sizeof(void*) > sizeof(long)' in such environment, so tons of
> code broke... And someone must translate pointers from 48bits to 32
> for kernel use...

No, you don't need that. Nobody could tolerate it anyway. You need to hack
gcc to use a 64-bit long and a 64-bit pointer. The pointer has 16 dead bits,
16 segment bits, and 32 offset bits. Hey, look, LP64 on ia32!

You might want a separate personality for this, so that programs compiled
with this feature would get their own system calls. Otherwise, there will
surely be serious monkey business in libc trying to get a pointer the kernel
will use correctly.

If you select a good segment size and cache hardware page tables that aren't
active, performance might not be... abysmal. The worst part might be that the
CPU becomes less parallel when it encounters operations that access different
segments. Of course 64-bit pointer math doesn't come free either. I once had
a 486DX-75 and I liked it, so you're doing fine if you can beat that.
At least you can use this to test software for LP64.

Hey, threads are extra cool. They don't really need to share the hardware
page tables. If you only share the software page tables, then you don't
thrash so bad.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
BTW, this fork program did appear to kill about two Sun servers here... The
Linux kernel v2.2.16 that they were running survived fine.

On Thu, 12 Oct 2000, Richard B. Johnson wrote:

> On Thu, 12 Oct 2000, Oliver Xymoron wrote:
>
> > On Wed, 11 Oct 2000, Kiril Vidimce wrote:
> >
> > > My primary concern is whether a process can allocate more than 4 GB of
> > > memory, rather than just be able to use more than 4 GB of physical
> > > memory in the system.
> >
> > Define allocate. There are tricks you can play, but userspace is still a
> > flat 32-bit address space per-process.
>
> --- per process. Which means, in principle, that one could have 100
> processes that are accessing a total of 400 GB of virtual memory.
>
> It gets to be a bit less than that, though. Process virtual address
> space doesn't start at 0:
>
> Script started on Thu Oct 12 13:25:45 2000
> cat xxx.c
> #include <stdio.h>
>
> int main(void)
> {
>     printf("main() starts at %p\n", main);
>     return 0;
> }
> # gcc -o xxx xxx.c
> # ./xxx
> main() starts at 0x8048488
> # exit
> exit
> Script done on Thu Oct 12 13:26:08 2000
>
> Cheers,
> Dick Johnson
>
> Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).
>
> "Memory is like gasoline. You use it up when you are running. Of
> course you get it all back when you reboot..."; Actual explanation
> obtained from the Micro$oft help desk.
Re: large memory support for x86
Timur Tabi wrote:

> I understand that a normal virtual address (i.e. a pointer) can only address a
> single 32-bit (4GB) memory block. My point was that by also using more than
> one 16-bit selector, you can have multiple 4GB areas. So for instance,
> 1000: can point to one physical address, and 1001: can point to a
> different physical address.
>
> Yes, this means that you need to use multiple selectors in order to access
> more than 4GB of virtual space.

If you were to try to overflow the linear address, you would either get a
fault or a wraparound. It won't work like the old DOS highmem tricks.

> According to section 3.8 of Intel's P3 manual, Volume 3, enabling the PAE
> increases the size of the page table entries to 64 bits. There are other
> changes, such as extending the 20-bit page directory base address to 27 bits.
> All this means that a virtual address (selector:offset) can point to a physical
> address larger than 32 bits.
>
> Frankly, the whole thing makes my head hurt.

Section 3.9.1, quote: "No matter which 36-bit addressing feature is used (PAE
or 36-bit PSE), the linear address space of the processor remains at 32 bits.
Applications must partition the address space of their work loads across
multiple operating system processes to take advantage of the additional
physical memory provided in the system."

--
Brian Gerst
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Alexander Viro <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 15:25:31 -0400 (EDT)
>
> > Ditto with PAE: 16:32->32->36.
> > In _all_ cases you are limited by the size of linear address. I.e. all
> > address modes are limited to 4Gb. All you can get from PAE is placing of
> > these 4Gb in 64Gb of physical memory.
>
> Then how are you supposed to access all 64GB of RAM in your machine? The
> kernel must be able to access all 64GB of RAM at once, otherwise it can't do
> proper memory management.

The kernel doesn't have to access it all at once. Most of the time it doesn't
care about the contents of most pages. It certainly needs some permanently
mapped stuff, but it's _way_ less than the full memory. The rest is mapped
and unmapped on demand. That's what kmap() and kunmap() do. Moreover, 4.4BSD
derivatives never bothered mapping the whole physical memory in kernel space,
even on the 386. It's more complex than what we used to do, but it's quite
doable. There is no need to keep the whole memory mapped by the kernel.

> I've been reading on the PAE in Intel's manuals. I admit, some of it is over
> my head. I was under the impression that it was 16:32->64->36 with PAE enabled.

Nope. 16:32->32->36. Paging is _after_ the 32-bit bottleneck. You'd have to
change the segment descriptor format to expand it.
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote:

> I understand that a normal virtual address (i.e. a pointer) can only address a
> single 32-bit (4GB) memory block. My point was that by also using more than
> one 16-bit selector, you can have multiple 4GB areas. So for instance,
> 1000: can point to one physical address, and 1001: can point to a
> different physical address.

_All_ of them are piped through the 4Gb address space. I.e. every segment is
mapped to a part of the same (for all segments) 4Gb. That address space is,
in turn, mapped to 64Gb of physical memory. At any moment you can't get more
than 2^32 different elements of physical memory accessible, even though you
have 48 bits of address in the beginning and 36 bits in the end.

Think of it that way: we have two functions:

	u32 map_segment(u48);
	u36 map_paging(u32);

and the processor does map_paging(map_segment(address)) when it calculates
physical addresses. Even though both the range and domain are larger than
2^32, the number of different values is less than or equal to it.

> Yes, this means that you need to use multiple selectors in order to access
> more than 4GB of virtual space.
>
> According to section 3.8 of Intel's P3 manual, Volume 3, enabling the PAE
> increases the size of the page table entries to 64 bits. There are other
> changes, such as extending the 20-bit page directory base address to 27 bits.
> All this means that a virtual address (selector:offset) can point to a physical
> address larger than 32 bits.

A virtual address gives a linear address. _Then_ it is translated into a
physical address. Page tables describe the latter mapping; descriptor tables,
the former. The size of the linear address is the bottleneck here, and no
changes past that bottleneck can expand the number of possible values.
Re: large memory support for x86
** Reply to message from Alexander Viro <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
15:25:31 -0400 (EDT)

> Ditto with PAE: 16:32->32->36.
> In _all_ cases you are limited by the size of linear address. I.e. all
> address modes are limited to 4Gb. All you can get from PAE is placing of
> these 4Gb in 64Gb of physical memory.

Then how are you supposed to access all 64GB of RAM in your machine? The
kernel must be able to access all 64GB of RAM at once, otherwise it can't do
proper memory management.

I've been reading up on PAE in Intel's manuals. I admit, some of it is over
my head. I was under the impression that it was 16:32->64->36 with PAE
enabled.

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
>
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
>
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory. If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

RTFM. Take any manual on x86 architecture and stare at the pictures. Ones
that describe how segments work.

	Real mode: 16:16 -> 21 or 20.
	Real mode with undocumented twists: 16:16 -> 21 -> 32 (you _can_
	get paging with real mode style of segments).
	286 protected mode: 16:16 -> 24.
	Ditto with twists: 16:16 -> 24 -> 32.
	386 protected mode: 16:32 -> 32.
	Ditto with paging: 16:32 -> 32 -> 32.
	Ditto with PAE: 16:32 -> 32 -> 36.

In _all_ cases you are limited by the size of linear address. I.e. all
address modes are limited to 4Gb. All you can get from PAE is placing of
these 4Gb in 64Gb of physical memory.
Re: large memory support for x86
** Reply to message from Brian Gerst <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
15:07:42 -0400

> You missed the point. The layering on the X86 memory management is such:
>
>	Segment
>	   |
>	Virtual Address   <- limited to 32 bits
>	   |
>	Physical Address
>
> Segmentation never directly gives you a physical address, even in real
> mode. Although in real mode the virtual address is hardwired to the
> physical address. Virtual addresses are always 32 bits on the x86. In
> real mode, in protected mode, and with PAE enabled.

I understand that a normal virtual address (i.e. a pointer) can only address
a single 32-bit (4GB) memory block. My point was that by also using more
than one 16-bit selector, you can have multiple 4GB areas. So for instance,
1000: can point to one physical address, and 1001: can point to a different
physical address.

Yes, this means that you need to use multiple selectors in order to access
more than 4GB of virtual space.

According to section 3.8 of Intel's P3 manual, Volume 3, enabling the PAE
increases the size of the page table entries to 64 bits. There are other
changes, such as extending the 20-bit page directory base address to 27
bits. All this means that a virtual address (selector:offset) can point to
a physical address larger than 32 bits.

Frankly, the whole thing makes my head hurt.

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.
RE: large memory support for x86
On Fri, 13 Oct 2000, Chris Swiedler wrote:

> Why is it that a user process can't intentionally switch segments?
> Dereferencing a 32-bit address causes the address to be calculated using the
> "current" segment descriptor, right? It seems to me that a process could set
> a new segment selector, in which case a dereference would operate on a whole
> new segment. Is there a reason why processes are limited to a single
> segment?
>
> chris

You can (although not in user mode). The problem is: "What RAM do you
reference?" You have 32 bits worth of addressing available in a machine
that has 32 bits worth of addressable RAM.

What can be done is to set one page to be 'not present'. Then, when you
page-fault, the page-fault handler can set some bit(s) in a hardware page
register to map in another bank of RAM that isn't in use yet. In principle,
this would allow access to (2^16/8 - 1) * (2^32 - 1) unique areas of RAM
(segment descriptors are 16 bits, but they are numbered in increments of 8).

So, if you now want to copy from a segment addressed by DS back to your
original DS, you have to rewrite the 'C' runtime library to use 'full
pointers', i.e., DS:ESI, ES:EDI, etc., with DS and ES being different. You
have to dereference the full pointer on every access! Caching has to be
turned off when RAM values that may be in the cache reference RAM that is
not bank-switched in. You would have a slow mess. However, in principle it
could work.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
Re: large memory support for x86
Timur Tabi wrote:

> ** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
>
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
>
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory. If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

You missed the point. The layering on the X86 memory management is such:

	Segment
	   |
	Virtual Address   <- limited to 32 bits
	   |
	Physical Address

Segmentation never directly gives you a physical address, even in real mode.
Although in real mode the virtual address is hardwired to the physical
address. Virtual addresses are always 32 bits on the x86. In real mode, in
protected mode, and with PAE enabled.

--
Brian Gerst
Re: large memory support for x86
On 13 Oct 00 at 13:42, Timur Tabi wrote:

> ** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
>
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
>
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory. If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

Sure it does not. Selectors point to linear addresses, before passing them
through the pagetables. You have 32+14 bits of virtual address (32 = offset,
14 = valid bits in the selector), which are translated, together with the
offset, to a 32-bit linear address. This 32-bit linear address is passed
through the pagetables to a 36-bit physical address. So it must go through
a 32-bit linear address, and there is no easy way to overcome this limit.

You can either (1) forget about simple pointers and dereferencing pointers,
and go the way of real-mode windows - you must GlobalLock() a memory area,
and you'll get a pointer. Then you GlobalUnlock()... And you can lock at
most 2GB (maybe 3GB if you think up a really good algorithm) of such data...

Or (2) make a complete segment not present, and on pagefault unmap all pages
belonging to some selector + invalidate the selector, and map something else
in. You must create at least four such areas, as you must have mapped at
least CS, SS, ES and one of DS/FS/GS to successfully execute MOVSB... So
each area should be < 256MB.

Are you really sure that it is worth the effort? Also do not forget that
'sizeof(void*) > sizeof(long)' in such an environment, so tons of code would
break... And someone must translate pointers from 48 bits to 32 for kernel
use...
Best regards,
Petr Vandrovec
[EMAIL PROTECTED]
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote:

> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory. If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

No. The segment base and length are confined to the 32-bit address space
mapped by the page tables.

		-ben
Re: large memory support for x86
Chris Swiedler wrote:

> > no, x86 virtual memory is 32 bits - segmentation only provides a way to
> > segment this 4GB virtual memory, but cannot extend it. Under Linux there
> > is 3GB virtual memory available to user-space processes.
> >
> > this 3GB virtual memory does not have to be mapped to the same physical
> > pages all the time - and this is nothing new. mmap()/munmap()-ing memory
> > dynamically is one way to 'extend' the amount of physical RAM controlled
> > by a single process. I doubt this would be very economical though.
> >
> > Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
> > can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
> > utilize up to 24 GB RAM at once, without having to play mmap/munmap
> > 'memory extender' tricks.
>
> Why is it that a user process can't intentionally switch segments?
> Dereferencing a 32-bit address causes the address to be calculated using the
> "current" segment descriptor, right? It seems to me that a process could set
> a new segment selector, in which case a dereference would operate on a whole
> new segment. Is there a reason why processes are limited to a single
> segment?

A segment is just a window mapped on top of virtual memory. A process can
have many segments, but it only has one virtual memory mapping. You cannot
use segments to access memory that isn't mapped into the virtual address
space, which is where the 32-bit limit exists.

It may be possible to use the segment-not-present fault to switch page
tables, but this would be very unportable and would incur a lot of extra
overhead.

--
Brian Gerst
Re: large memory support for x86
** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
20:44:19 +0200 (CEST)

> processes are not limited to a single segment, eg. Wine uses nonstandard
> segments. But as i said, using multiple segments does not let you out of
> 32 bits of virtual memory.

Sure it does, just like segments let 16-bit apps access more than 64KB of
memory. If you have two selectors, each one can point to a different physical
base address, and IIRC, the size of the physical address base can be 36 bits.
That gives you 16 physically contiguous 4GB memory blocks.

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.
RE: large memory support for x86
On Fri, 13 Oct 2000, Chris Swiedler wrote:

> Why is it that a user process can't intentionally switch segments?
> Dereferencing a 32-bit address causes the address to be calculated
> using the "current" segment descriptor, right? It seems to me that a
> process could set a new segment selector, in which case a dereference
> would operate on a whole new segment. Is there a reason why processes
> are limited to a single segment?

processes are not limited to a single segment, eg. Wine uses nonstandard
segments. But as i said, using multiple segments does not let you out of
32 bits of virtual memory.

	Ingo
RE: large memory support for x86
> no, x86 virtual memory is 32 bits - segmentation only provides a way to
> segment this 4GB virtual memory, but cannot extend it. Under Linux there
> is 3GB virtual memory available to user-space processes.
>
> this 3GB virtual memory does not have to be mapped to the same physical
> pages all the time - and this is nothing new. mmap()/munmap()-ing memory
> dynamically is one way to 'extend' the amount of physical RAM controlled
> by a single process. I doubt this would be very economical though.
>
> Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
> can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
> utilize up to 24 GB RAM at once, without having to play mmap/munmap
> 'memory extender' tricks.

Why is it that a user process can't intentionally switch segments?
Dereferencing a 32-bit address causes the address to be calculated using the
"current" segment descriptor, right? It seems to me that a process could set
a new segment selector, in which case a dereference would operate on a whole
new segment. Is there a reason why processes are limited to a single
segment?

chris
RE: large memory support for x86
On Fri, 13 Oct 2000, Chris Swiedler wrote: Why is it that a user process can't intentionally switch segments? Dereferencing a 32-bit address causes the address to be calculated using the "current" segment descriptor, right? It seems to me that a process could set a new segment selector, in which case a dereference would operate on a whole new segment. Is there a reason why processes are limited to a single segment? processes are not limited to a single segment, eg. Wine uses nonstandard segments. But as i said, using multiple segments does not let you out of 32 bits of virtual memory. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
** Reply to message from Ingo Molnar [EMAIL PROTECTED] on Fri, 13 Oct 2000 20:44:19 +0200 (CEST) processes are not limited to a single segment, eg. Wine uses nonstandard segments. But as i said, using multiple segments does not let you out of 32 bits of virtual memory. Sure it does, just like segments let 16-bit apps access more than 64KB of memory. If you have two selectors, each one can point to a different physical base address, and IIRC, the size of the physical address base can be 36 bits. That gives you 16 physically contiguous 4GB memory blocks. -- Timur Tabi - [EMAIL PROTECTED] Interactive Silicon - http://www.interactivesi.com When replying to a mailing-list message, please don't cc: me, because then I'll just get two copies of the same message. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote: Sure it does, just like segments let 16-bit apps access more than 64KB of memory. If you have two selectors, each one can point to a different physical base address, and IIRC, the size of the physical address base can be 36 bits. That gives you 16 physically contiguous 4GB memory blocks. No. The segment base and length is confined to the 32 bit address space mapped by page tables. -ben - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On 13 Oct 00 at 13:42, Timur Tabi wrote: ** Reply to message from Ingo Molnar [EMAIL PROTECTED] on Fri, 13 Oct 2000 20:44:19 +0200 (CEST) processes are not limited to a single segment, eg. Wine uses nonstandard segments. But as i said, using multiple segments does not let you out of 32 bits of virtual memory. Sure it does, just like segments let 16-bit apps access more than 64KB of memory. If you have two selectors, each one can point to a different physical base address, and IIRC, the size of the physical address base can be 36 bits. That gives you 16 physically contiguous 4GB memory blocks. Sure it does not. Selectors point to linear addresses, before passing them through pagetables. You have 32+14 bits of virtual address (32 = offset, 14 = valid bits in selector), which are translated, together with offset, to 32 bit linear address. This 32bit linear address is passed through pagetables to 36 bit physical address. So it must go through 32bit linear address, and there is no easy way to overcome this limit. You can either (1) forget about simple pointers and dereferencing pointers, and go through realmode windows way - you must GlobalLock() memory area, and you'll get pointer. Then you GlobalUnlock()... And you can lock at most 2GB (maybe 3GB if you'll thought really good algorithm) of such data... Or (2) make complete segment non present, and on pagefault unmap all pages belonging to some selector + invalidate selector, and map something else in. You must create at least four such areas, as you must have mapped at least CS, SS, ES and one of DS/FS/GS to successfully execute MOVSB... So each area should be 256MB. Are you really sure that it is worth of effort? Also do not forget that 'sizeof(void*) sizeof(long)' in such environment, so tons of code broke... And someone must translate pointers from 48bits to 32 for kernel use... 
Best regards,
    Petr Vandrovec <[EMAIL PROTECTED]>
Re: large memory support for x86
** Reply to message from Brian Gerst <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
15:07:42 -0400

> You missed the point. The layering on the x86 memory management is such:
>
>     Segment
>        |
>     Virtual Address  <- limited to 32 bits
>        |
>     Physical Address
>
> Segmentation never directly gives you a physical address, even in real
> mode, although in real mode the virtual address is hardwired to the
> physical address. Virtual addresses are always 32 bits on the x86: in
> real mode, in protected mode, and with PAE enabled.

I understand that a normal virtual address (i.e. a pointer) can only
address a single 32-bit (4GB) memory block. My point was that by also
using more than one 16-bit selector, you can have multiple 4GB areas. So
for instance, 1000: can point to one physical address, and 1001: can
point to a different physical address.

Yes, this means that you need to use multiple selectors in order to
access more than 4GB of virtual space.

According to section 3.8 of Intel's P3 manual, Volume 3, enabling PAE
increases the size of the page table entries to 64 bits. There are other
changes, such as extending the 20-bit page directory base address to 27
bits. All this means that a virtual address (selector:offset) can point
to a physical address larger than 32 bits.

Frankly, the whole thing makes my head hurt.

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote:

> I understand that a normal virtual address (i.e. a pointer) can only
> address a single 32-bit (4GB) memory block. My point was that by also
> using more than one 16-bit selector, you can have multiple 4GB areas. So
> for instance, 1000: can point to one physical address, and 1001: can
> point to a different physical address.

_All_ of them are piped through the 4Gb address space. I.e. every segment
is mapped to a part of the same (for all segments) 4Gb. That address
space is, in turn, mapped to 64Gb of physical memory. At any moment you
can't get more than 2^32 different elements of physical memory
accessible, even though you have 48 bits of address in the beginning and
36 bits in the end. Think of it that way: we have two functions:

	u32 map_segment(u48);
	u36 map_paging(u32);

and the processor does map_paging(map_segment(address)) when it
calculates physical addresses. Even though both the range and domain are
larger than 2^32, the number of different values is less than or equal to
it.

> Yes, this means that you need to use multiple selectors in order to
> access more than 4GB of virtual space.
>
> According to section 3.8 of Intel's P3 manual, Volume 3, enabling PAE
> increases the size of the page table entries to 64 bits. There are other
> changes, such as extending the 20-bit page directory base address to 27
> bits. All this means that a virtual address (selector:offset) can point
> to a physical address larger than 32 bits.

A virtual address gives a linear address. _Then_ it is translated into a
physical address. Page tables describe the latter mapping; descriptor
tables, the former. The size of the linear address is the bottleneck
here, and no changes past that bottleneck can expand the number of
possible values.
Re: large memory support for x86
On Fri, 13 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Alexander Viro <[EMAIL PROTECTED]> on Fri, 13 Oct
> 2000 15:25:31 -0400 (EDT)
>
> > Ditto with PAE: 16:32-32-36. In _all_ cases you are limited by the size
> > of linear address. I.e. all address modes are limited to 4Gb. All you
> > can get from PAE is placing of these 4Gb in 64Gb of physical memory.
>
> Then how are you supposed to access all 64GB of RAM in your machine? The
> kernel must be able to access all 64GB of RAM at once, otherwise it can't
> do proper memory management.

The kernel doesn't have to access it all at once. Most of the time it
doesn't care about the contents of most pages. It certainly needs some
permanently mapped stuff, but it's _way_ less than the full memory. The
rest is mapped and unmapped on demand. That's what kmap() and kunmap()
do. Moreover, 4.4BSD derivatives never bothered mapping the whole
physical memory in kernel space, even on 386. It's more complex than
what we used to do, but it's quite doable. There is no need to keep the
whole memory mapped by the kernel.

> I've been reading on the PAE in Intel's manuals. I admit, some of it is
> over my head. I was under the impression that it was 16:32-64-36 with
> PAE enabled.

Nope. 16:32-32-36. Paging is _after_ the 32-bit bottleneck. You'd have
to change the segment descriptor format to expand it.
Re: large memory support for x86
On Thu, Oct 12, 2000 at 07:19:32PM -0400, Dan Maas wrote:
> The memory map of a user process on x86 looks like this:
>
>   --------------------------------
>   KERNEL (always present here)
>   --------------------------------  0xC0000000
>   STACK (top at 0xBFFFFFFF)
>   --------------------------------
>   MAPPED FILES (incl. shared libs)
>   --------------------------------  0x40000000
>   HEAP (brk()/malloc())
>   EXECUTABLE CODE
>   --------------------------------  0x08048000
>
> Try examining /proc/*/maps, and also watch your programs call brk() using
> strace; you'll see all this in action...
>
> > So why does the process space start at such a high virtual
> > address (why not closer to 0x00000000)? Seems we're wasting ~128 megs
> > of RAM. Not a huge amount compared to 4G, but significant.
>
> I don't know; anyone care to comment?

Apparently to catch NULL pointer references with array indices (int *p =
NULL; p[5000]). I agree that it is a very wasteful use of precious
virtual memory.

> > Can kernel
> > pages be swapped out / faulted in just like user process pages?
>
> Linux does not swap kernel memory; the kernel is so small it's not worth
> the trouble (are there other reasons?). e.g. My Linux boxes run 1-2MB of
> kernel code; my NT machine is running >6MB at the moment...

Actually most Linux boxes do, but under the old term for swapping before
virtual memory (or overlaying, in DOS terms): they have a cron job that
expires modules with usage count 0 (or, in 2.0, kerneld that did the
same).

It is a rather dangerous thing though; module unloading tends to be one
of the most race-prone, and in addition not too well tested, places in
the kernel. I usually recommend turning it off on any production machine.
In 2.4, with the new fine-grained SMP locking, it is much, much more
dangerous, nearly impossible to solve properly.

-Andi
Re: large memory support for x86
The memory map of a user process on x86 looks like this:

    --------------------------------
    KERNEL (always present here)
    --------------------------------  0xC0000000
    STACK (top at 0xBFFFFFFF)
    --------------------------------
    MAPPED FILES (incl. shared libs)
    --------------------------------  0x40000000
    HEAP (brk()/malloc())
    EXECUTABLE CODE
    --------------------------------  0x08048000

Try examining /proc/*/maps, and also watch your programs call brk() using
strace; you'll see all this in action...

> So why does the process space start at such a high virtual
> address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
> RAM. Not a huge amount compared to 4G, but significant.

I don't know; anyone care to comment?

> Another question: how (and where in the code) do we translate virtual
> user-addresses to physical addresses?

In hardware, with the TLB and, if the TLB misses, then page tables.

> Does the MMU do it, or does it call a
> kernel handler function?

Only when an attempt is made to access an unmapped or protected page;
then you get an interrupt (page fault), which the kernel code handles.

> Why is the kernel allowed to reference physical
> addresses, while user processes go through the translation step?

Not even the kernel accesses physical memory directly. It can, however,
choose to map the physical memory into its own address space
contiguously. Linux puts it at 0xC0000000 and up. (Question for the
gurus: what happens on machines with >1GB of RAM?)

> Can kernel
> pages be swapped out / faulted in just like user process pages?

Linux does not swap kernel memory; the kernel is so small it's not worth
the trouble (are there other reasons?). e.g. My Linux boxes run 1-2MB of
kernel code; my NT machine is running >6MB at the moment...

Dan
Re: large memory support for x86
On Thu, 12 Oct 2000, Timur Tabi wrote:

> Of course, you could define a pointer to be a 48-bit value, but I
> doubt that would really work.

no, x86 virtual memory is 32 bits - segmentation only provides a way to
segment this 4GB virtual memory, but cannot extend it. Under Linux there
is 3GB virtual memory available to user-space processes. this 3GB virtual
memory does not have to be mapped to the same physical pages all the time
- and this is nothing new. mmap()/munmap()-ing memory dynamically is one
way to 'extend' the amount of physical RAM controlled by a single
process. I doubt this would be very economical though. Such big-RAM
systems are almost always SMP systems, so eg. a 4-way system can have
4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can utilize
up to 24 GB RAM at once, without having to play mmap/munmap 'memory
extender' tricks.

	Ingo
Re: large memory support for x86
Timur Tabi writes:
> ** Reply to message from Jeff Epler <[EMAIL PROTECTED]> on Thu, 12 Oct 2000
> 13:08:19 -0500
>
> > What the support for >4G of memory on x86 is about, is the "PAE", Page
> > Address Extension, supported on P6 generation of machines, as well as
> > on Athlons (I think). With these, the kernel can use >4G of memory, but
> > it still can't present a >32bit address space to user processes. But
> > you could have 8G physical RAM and run 4 ~2G or 2 ~4G processes
> > simultaneously in core.
>
> How about the kernel itself? How do I access the memory above 4GB inside
> a device driver?

It depends on what you have already. If you're given a (kernel) virtual
address, just dereference it. The unit of currency for physical pages is
the "struct page". If you want to allocate a physical page for your own
use (from anywhere in physical memory) then you do

    struct page *page = alloc_page(GFP_FOO);

If you want to read/write that page directly from kernel space then you
need to map it into kernel space:

    char *va = kmap(page);
    /* read/write the page starting at virtual address va */
    kunmap(page);

The implementations of kmap and kunmap are such that mappings are cached
(within reason), so doing kmap/kunmap is reasonably fast. If you want to
do something else with the page (like get some I/O done to/from it) then
the new (and forthcoming) kiobuf functions take struct page units and
handle all the internal mapping gubbins without you having to worry about
it.

--Malcolm

--
Malcolm Beattie <[EMAIL PROTECTED]>
Unix Systems Programmer
Oxford University Computing Services
Re: large memory support for x86
** Reply to message from "Richard B. Johnson" <[EMAIL PROTECTED]> on Thu,
12 Oct 2000 15:17:15 -0400 (EDT)

> With ix86 processors in the kernel, you can create multiple segments
> and multiple page-tables.

Does the kernel provide services for this, or will I have to hack up the
x86 page tables myself? If there are kernel services, what are they?

> If you have some 'hardware-specific' way
> of mapping in real RAM that is not used by anybody else, you can have
> your own private segment.

But is this part of the established Linux >4GB support? I don't think so.

> Page registers can map in multiple sticks
> of RAM into a single window. The page-fault handler can manipulate
> the page registers for user-mode code. It's being done in PAE.

But how does one go about doing all this?

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.
RE: large memory support for x86
> > Am I reading this correctly--the address of the main() function for a
> > process is guaranteed to be the lowest possible virtual address?
> >
> > chris
>
> It is one of the lowest. The 'C' runtime library puts section
> .text (the code) first, then .data, then .bss, then .stack. The
> .stack section is co-located with the heap which can be extended
> by setting a new break address.
>
> When a process is created, the lowest address is the entry point of
> crt0.o _init. We can see where that is by:
>
> Script started on Thu Oct 12 14:25:35 2000
> # cat xxx.c
>
> extern int _init();
> main()
> {
>     printf("_init is at %p\n", _init);
> }
>
> # gcc -o xxx xxx.c
> # ./xxx
> _init is at 0x804838c
> # exit
> exit
> Script done on Thu Oct 12 14:25:51 2000
>
> That said, remember that in Unix, the 'C' runtime library exists in the
> lower portion of the .text section. So your code's virtual address space
> starts above that address space. This is MMAPed so everybody gets
> to share the same pages. In this way, you don't all have to keep a
> private copy of the 'C' runtime library.

User-process virtual addresses have no direct relation to physical
addresses, right? So why does the process space start at such a high
virtual address (why not closer to 0x00000000)? Seems we're wasting ~128
megs of RAM. Not a huge amount compared to 4G, but significant. Is that
space used (libc can't be that big!) or reserved somehow?

Another question: how (and where in the code) do we translate virtual
user-addresses to physical addresses? Does the MMU do it, or does it call
a kernel handler function? Why is the kernel allowed to reference
physical addresses, while user processes go through the translation step?
Can kernel pages be swapped out / faulted in just like user process
pages?

Sorry to pounce on you with all of these questions. I've read up on this
stuff but can't always find answers...
thanks--
chris
Re: large memory support for x86
On Thu, 12 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Jeff Epler <[EMAIL PROTECTED]> on Thu, 12 Oct 2000
> 13:08:19 -0500
>
> > What the support for >4G of memory on x86 is about, is the "PAE", Page
> > Address Extension, supported on P6 generation of machines, as well as
> > on Athlons (I think). With these, the kernel can use >4G of memory, but
> > it still can't present a >32bit address space to user processes. But
> > you could have 8G physical RAM and run 4 ~2G or 2 ~4G processes
> > simultaneously in core.
>
> How about the kernel itself? How do I access the memory above 4GB inside
> a device driver?
>
> > There may or may not be some way to support an abomination like the old
> > "far" pointers in DOS (multiple 4G segments), but I don't think it has
> > been written yet.
>
> Yes, it's ugly, but it works and it's compatible. Well, compatible with
> 32-bit code, probably not compatible with Linux code overall.
>
> Of course, you could define a pointer to be a 48-bit value, but I doubt
> that would really work.

With ix86 processors in the kernel, you can create multiple segments
and multiple page-tables. If you have some 'hardware-specific' way
of mapping in real RAM that is not used by anybody else, you can have
your own private segment. Page registers can map in multiple sticks
of RAM into a single window. The page-fault handler can manipulate
the page registers for user-mode code. It's being done in PAE.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of course
you get it all back when you reboot..."; Actual explanation obtained from
the Micro$oft help desk.
Re: large memory support for x86
** Reply to message from Jeff Epler <[EMAIL PROTECTED]> on Thu, 12 Oct 2000
13:08:19 -0500

> What the support for >4G of memory on x86 is about, is the "PAE", Page
> Address Extension, supported on P6 generation of machines, as well as on
> Athlons (I think). With these, the kernel can use >4G of memory, but it
> still can't present a >32bit address space to user processes. But you
> could have 8G physical RAM and run 4 ~2G or 2 ~4G processes
> simultaneously in core.

How about the kernel itself? How do I access the memory above 4GB inside
a device driver?

> There may or may not be some way to support an abomination like the old
> "far" pointers in DOS (multiple 4G segments), but I don't think it has
> been written yet.

Yes, it's ugly, but it works and it's compatible. Well, compatible with
32-bit code, probably not compatible with Linux code overall.

Of course, you could define a pointer to be a 48-bit value, but I doubt
that would really work.

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.
Re: large memory support for x86
On Thu, Oct 12, 2000 at 10:36:38AM -0700, Kiril Vidimce wrote:
> Allocate = malloc(). The process needs to be able to operate on >4 GB
> chunks of memory. I understand that it's only a 32 bit address space
> which is why I was surprised when I read that Linux 2.4.x will support
> upwards of 64 GB's of memory.
>
> Thanks for all the responses.

Pointers are still 32 bits on x86, and the visible address space for any
particular process is still somewhat less than 4G. I believe that if you
select Linux on Alpha that you can have more than 4G per process, but
that may or may not be true.

What the support for >4G of memory on x86 is about, is the "PAE", Page
Address Extension, supported on P6 generation of machines, as well as on
Athlons (I think). With these, the kernel can use >4G of memory, but it
still can't present a >32bit address space to user processes. But you
could have 8G physical RAM and run 4 ~2G or 2 ~4G processes
simultaneously in core.

There may or may not be some way to support an abomination like the old
"far" pointers in DOS (multiple 4G segments), but I don't think it has
been written yet.

Jeff
Re: large memory support for x86
>On Thu, 12 Oct 2000, Oliver Xymoron wrote:
>> On Wed, 11 Oct 2000, Kiril Vidimce wrote:
>>
>> > My primary concern is whether a process can allocate more than 4 GB of
>> > memory, rather than just be able to use more than 4 GB of physical
>> > memory in the system.
>>
>> Define allocate. There are tricks you can play, but userspace is still a
>> flat 32-bit address space per-process.
>
>Allocate = malloc(). The process needs to be able to operate on >4 GB
>chunks of memory. I understand that it's only a 32 bit address space
>which is why I was surprised when I read that Linux 2.4.x will support
>upwards of 64 GB's of memory.
>
>Thanks for all the responses.
>
>KV

You can, of course, bank switch memory by using a shared segment and
cloning additional process heaps. Obviously a _single_ 32bit address
space can only access 4GB at a time.

--Jonathan--
Re: large memory support for x86
On Wed, 11 Oct 2000, Kiril Vidimce wrote:
>
> Hi there,
>
> I am trying to find out more information on large memory support (> 4 GB)
> for Linux IA32. Is there a document that elaborates on what is supported
> and what isn't and how this scheme actually works in the kernel?
>
> My primary concern is whether a process can allocate more than 4 GB of
> memory, rather than just be able to use more than 4 GB of physical
> memory in the system.
>
> Also, I see that the highmem support is just an option in the kernel.
> Does this mean that there is a significant performance penalty in using
> this extension?

Linux does support 64G of physical memory. My machine has 6G RAM and runs
absolutely nice and smooth as it should. Everything "just works" (ok, not
counting a few subtle cache-related issues and stability of PCI
subsystem, but those happened many hours ago and were quickly fixed so
they are history now and therefore long forgotten ;).

As for PAE, yes it does incur penalty of about 3-6% of performance
"overall". By overall performance I meant the unixbench numbers. I have
published the numbers comparing PAE/4G/nohighmem kernels on the same
machine some time ago... So, it only makes sense to enable PAE if you
have more than 4G of memory.

Regards,
Tigran
Re: large memory support for x86
On Thu, 12 Oct 2000, Oliver Xymoron wrote:
> On Wed, 11 Oct 2000, Kiril Vidimce wrote:
>
> > My primary concern is whether a process can allocate more than 4 GB of
> > memory, rather than just be able to use more than 4 GB of physical
> > memory in the system.
>
> Define allocate. There are tricks you can play, but userspace is still a
> flat 32-bit address space per-process.

Allocate = malloc(). The process needs to be able to operate on >4 GB
chunks of memory. I understand that it's only a 32 bit address space
which is why I was surprised when I read that Linux 2.4.x will support
upwards of 64 GB's of memory.

Thanks for all the responses.

KV
--
Studio Tools                  [EMAIL PROTECTED]
Pixar Animation Studios       http://www.pixar.com/
Re: large memory support for x86
On Thu, 12 Oct 2000, Oliver Xymoron wrote:
> On Wed, 11 Oct 2000, Kiril Vidimce wrote:
>
> > My primary concern is whether a process can allocate more than 4 GB of
> > memory, rather than just be able to use more than 4 GB of physical
> > memory in the system.
>
> Define allocate. There are tricks you can play, but userspace is still a
> flat 32-bit address space per-process.
                               ^^^^^^^^^^^

... per process. Which means, in principle, that one could have 100
processes that are accessing a total of 400 Gb of virtual memory. It gets
to be a bit less than that, though. Process virtual address space doesn't
start at 0:

Script started on Thu Oct 12 13:25:45 2000
# cat xxx.c

main()
{
    printf("main() starts at %p\n", main);
}

# gcc -o xxx xxx.c
# ./xxx
main() starts at 0x8048488
# exit
exit
Script done on Thu Oct 12 13:26:08 2000

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of course
you get it all back when you reboot..."; Actual explanation obtained from
the Micro$oft help desk.
Re: large memory support for x86
On Wed, 11 Oct 2000, Kiril Vidimce wrote:

> My primary concern is whether a process can allocate more than 4 GB of
> memory, rather than just be able to use more than 4 GB of physical
> memory in the system.

Define allocate. There are tricks you can play, but userspace is still a
flat 32-bit address space per-process.

> Also, I see that the highmem support is just an option in the kernel.
> Does this mean that there is a significant performance penalty in using
> this extension?

It doesn't come for free, but it's almost certainly a win for anyone who
has that much memory.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."
Re: large memory support for x86
On Wed, 11 Oct 2000, Kiril Vidimce wrote: My primary concern is whether a process can allocate more than 4 GB of memory, rather than just be able to use more than 4 GB of physical memory in the system. Define allocate. There are tricks you can play, but userspace is still a flat 32-bit address space per-process. Also, I see that the highmem support is just an option in the kernel. Does this mean that there is a significant performance penalty in using this extension? It doesn't come for free, but it's almost certainly a win for anyone who has that much memory. -- "Love the dolphins," she advised him. "Write by W.A.S.T.E.." - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On Thu, 12 Oct 2000, Oliver Xymoron wrote: On Wed, 11 Oct 2000, Kiril Vidimce wrote: My primary concern is whether a process can allocate more than 4 GB of memory, rather than just be able to use more than 4 GB of physical memory in the system. Define allocate. There are tricks you can play, but userspace is still a flat 32-bit address space per-process. --- per process. Which means, in principle, that one could have 100 processes that are accessing a total of 400 Gb of virtual memory. It gets to be a bit less than that, though. Process virtual address space doesn't start at 0 Script started on Thu Oct 12 13:25:45 2000 cat xxx.c main() { printf("main() starts at %p\n", main); } # gcc -o xxx xxx.c # ./xxx main() starts at 0x8048488 # exit exit Script done on Thu Oct 12 13:26:08 2000 Cheers, Dick Johnson Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips). "Memory is like gasoline. You use it up when you are running. Of course you get it all back when you reboot..."; Actual explanation obtained from the Micro$oft help desk. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On Thu, 12 Oct 2000, Oliver Xymoron wrote: On Wed, 11 Oct 2000, Kiril Vidimce wrote: My primary concern is whether a process can allocate more than 4 GB of memory, rather than just be able to use more than 4 GB of physical memory in the system. Define allocate. There are tricks you can play, but userspace is still a flat 32-bit address space per-process. Allocate = malloc(). The process needs to be able to operate on 4 GB chunks of memory. I understand that it's only a 32 bit address space which is why I was surprised when I read that Linux 2.4.x will support upwards of 64 GB's of memory. Thanks for all the responses. KV -- ___ Studio Tools[EMAIL PROTECTED] Pixar Animation Studioshttp://www.pixar.com/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On Wed, 11 Oct 2000, Kiril Vidimce wrote: Hi there, I am trying to find out more information on large memory support ( 4 GB) for Linux IA32. Is there a document that elaborates on what is supported and what isn't and how this scheme actually works in the kernel? My primary concern is whether a process can allocate more than 4 GB of memory, rather than just be able to use more than 4 GB of physical memory in the system. Also, I see that the highmem support is just an option in the kernel. Does this mean that there is a significant performance penalty in using this extension? Linux does support 64G of physical memory. My machine has 6G RAM and runs absolutely nice and smooth as it should. Everything "just works" (ok, not counting a few subtle cache-related issues and stability of PCI subsystem but those happened many hours ago and were quickly fixed so they are history now and therefore long forgotten ;). As for PAE, yes it does incur penalty of about 3-6% of performance "overall". By overall performance I meant the unixbench numbers. I have published the numbers comparing PAE/4G/nohighmem kernels on the same machine sometime ago... So, it only makes sense to enable PAE if you have more than 4G of memory. Regards, Tigran - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On Thu, 12 Oct 2000, Oliver Xymoron wrote: On Wed, 11 Oct 2000, Kiril Vidimce wrote: My primary concern is whether a process can allocate more than 4 GB of memory, rather than just be able to use more than 4 GB of physical memory in the system. Define allocate. There are tricks you can play, but userspace is still a flat 32-bit address space per-process. Allocate = malloc(). The process needs to be able to operate on 4 GB chunks of memory. I understand that it's only a 32 bit address space which is why I was surprised when I read that Linux 2.4.x will support upwards of 64 GB's of memory. Thanks for all the responses. KV You can, of course, bank switch memory by using a shared segment and cloning additional process heaps. Obviously a _single_ 32bit address space can only access 4GB at a time. --Jonathan-- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
On Thu, Oct 12, 2000 at 10:36:38AM -0700, Kiril Vidimce wrote: Allocate = malloc(). The process needs to be able to operate on 4 GB chunks of memory. I understand that it's only a 32 bit address space which is why I was surprised when I read that Linux 2.4.x will support upwards of 64 GB's of memory. Thanks for all the responses. Pointers are still 32 bits on x86, and the visible address space for any particular process is still somewhat less than 4G. I believe that if you select Linux on Alpha that you can have more than 4G per process, but that may or may not be true. What the support for 4G of memory on x86 is about, is the "PAE", Page Address Extension, supported on P6 generation of machines, as well as on Athlons (I think). With these, the kernel can use 4G of memory, but it still can't present a 32bit address space to user processes. But you could have 8G physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core. There may or may not be some way to support an abomination like the old "far" pointers in DOS (multiple 4G segments), but I don't think it has been written yet. Jeff - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: large memory support for x86
** Reply to message from Jeff Epler [EMAIL PROTECTED] on Thu, 12 Oct
2000 13:08:19 -0500

> What the support for more than 4G of memory on x86 is about is "PAE",
> Page Address Extension, supported on the P6 generation of machines, as
> well as on Athlons (I think). With it, the kernel can use more than 4G
> of memory, but it still can't present more than a 32bit address space
> to user processes. But you could have 8G of physical RAM and run 4 ~2G
> or 2 ~4G processes simultaneously in core.

How about the kernel itself? How do I access memory above 4GB inside a
device driver?

> There may or may not be some way to support an abomination like the
> old "far" pointers in DOS (multiple 4G segments), but I don't think it
> has been written yet.

Yes, it's ugly, but it works and it's compatible. Well, compatible with
32-bit code, probably not compatible with Linux code overall. Of course,
you could define a pointer to be a 48-bit value, but I doubt that would
really work.

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because
then I'll just get two copies of the same message.
Re: large memory support for x86
On Thu, 12 Oct 2000, Timur Tabi wrote:
> > What the support for more than 4G of memory on x86 is about is
> > "PAE", Page Address Extension, supported on the P6 generation of
> > machines, as well as on Athlons (I think). With it, the kernel can
> > use more than 4G of memory, but it still can't present more than a
> > 32bit address space to user processes. But you could have 8G of
> > physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in
> > core.
>
> How about the kernel itself? How do I access memory above 4GB inside a
> device driver?
>
> > There may or may not be some way to support an abomination like the
> > old "far" pointers in DOS (multiple 4G segments), but I don't think
> > it has been written yet.
>
> Yes, it's ugly, but it works and it's compatible. Well, compatible
> with 32-bit code, probably not compatible with Linux code overall. Of
> course, you could define a pointer to be a 48-bit value, but I doubt
> that would really work.

With ix86 processors, the kernel can create multiple segments and
multiple page-tables. If you have some hardware-specific way of mapping
in real RAM that is not used by anybody else, you can have your own
private segment. Page registers can map multiple sticks of RAM into a
single window. The page-fault handler can manipulate the page registers
for user-mode code. It's being done in PAE.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).
"Memory is like gasoline. You use it up when you are running. Of course
you get it all back when you reboot..."; Actual explanation obtained
from the Micro$oft help desk.
RE: large memory support for x86
> > Am I reading this correctly -- the address of the main() function
> > for a process is guaranteed to be the lowest possible virtual
> > address?
> >
> > chris
>
> It is one of the lowest. The 'C' runtime library puts section .text
> (the code) first, then .data, then .bss, then .stack. The .stack
> section is co-located with the heap, which can be extended by setting
> a new break address. When a process is created, the lowest address is
> the entry point of crt0.o _init. We can see where that is by:
>
>     Script started on Thu Oct 12 14:25:35 2000
>     # cat xxx.c
>     extern int _init();
>     main() {
>         printf("_init is at %p\n", _init);
>     }
>     # gcc -o xxx xxx.c
>     # ./xxx
>     _init is at 0x804838c
>     # exit
>     exit
>     Script done on Thu Oct 12 14:25:51 2000
>
> That said, remember that in Unix the 'C' runtime library exists in the
> lower portion of the .text section. So your code's virtual address
> space starts above that address space. This is MMAPed so everybody
> gets to share the same pages. In this way, you don't all have to keep
> a private copy of the 'C' runtime library.

User-process virtual addresses have no direct relation to physical
addresses, right? So why does the process space start at such a high
virtual address (why not closer to 0x0)? Seems we're wasting ~128 megs
of RAM. Not a huge amount compared to 4G, but significant. Is that
space used (libc can't be that big!) or reserved somehow?

Another question: how (and where in the code) do we translate virtual
user-addresses to physical addresses? Does the MMU do it, or does it
call a kernel handler function? Why is the kernel allowed to reference
physical addresses, while user processes go through the translation
step? Can kernel pages be swapped out / faulted in just like user
process pages?

Sorry to pounce on you with all of these questions. I've read up on
this stuff but can't always find answers...

thanks--
chris
Re: large memory support for x86
** Reply to message from "Richard B. Johnson" [EMAIL PROTECTED] on Thu,
12 Oct 2000 15:17:15 -0400 (EDT)

> With ix86 processors, the kernel can create multiple segments and
> multiple page-tables.

Does the kernel provide services for this, or will I have to hack up
the x86 page tables myself? If there are kernel services, what are
they?

> If you have some hardware-specific way of mapping in real RAM that is
> not used by anybody else, you can have your own private segment.

But is this part of the established Linux 4GB support? I don't think
so.

> Page registers can map multiple sticks of RAM into a single window.
> The page-fault handler can manipulate the page registers for
> user-mode code. It's being done in PAE.

But how does one go about doing all this?

--
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com
Re: large memory support for x86
Timur Tabi writes:
> How about the kernel itself? How do I access memory above 4GB inside a
> device driver?

It depends on what you have already. If you're given a (kernel) virtual
address, just dereference it. The unit of currency for physical pages
is the "struct page". If you want to allocate a physical page for your
own use (from anywhere in physical memory) then you do

    struct page *page = alloc_page(GFP_FOO);

If you want to read/write that page directly from kernel space then you
need to map it into kernel space:

    char *va = kmap(page);
    /* read/write the page starting at virtual address va */
    kunmap(page);

The implementations of kmap and kunmap are such that mappings are
cached (within reason), so doing kmap/kunmap is "reasonably" fast. If
you want to do something else with the page (like get some I/O done
to/from it) then the new (and forthcoming) kiobuf functions take struct
page units and handle all the internal mapping gubbins without you
having to worry about it.

--Malcolm

--
Malcolm Beattie [EMAIL PROTECTED]
Unix Systems Programmer
Oxford University Computing Services
Re: large memory support for x86
On Thu, 12 Oct 2000, Timur Tabi wrote:
> Of course, you could define a pointer to be a 48-bit value, but I
> doubt that would really work.

No, x86 virtual memory is 32 bits: segmentation only provides a way to
segment this 4GB virtual memory, but cannot extend it. Under Linux
there is 3GB of virtual memory available to user-space processes.

This 3GB of virtual memory does not have to be mapped to the same
physical pages all the time, and this is nothing new.
mmap()/munmap()-ing memory dynamically is one way to 'extend' the
amount of physical RAM controlled by a single process. I doubt this
would be very economical though. Such big-RAM systems are almost always
SMP systems, so e.g. a 4-way system can have 4x 3GB processes == 12 GB
RAM fully utilized. An 8-way system can utilize up to 24 GB RAM at
once, without having to play mmap/munmap 'memory extender' tricks.

Ingo
Re: large memory support for x86
The memory map of a user process on x86 looks like this:

    0xFFFFFFFF ----------------------------------
               KERNEL (always present here)
    0xC0000000 ----------------------------------
    0xBFFFFFFF STACK (grows down)
               ...
               MAPPED FILES (incl. shared libs)
    0x40000000 ----------------------------------
               ...
               HEAP (brk()/malloc())
               EXECUTABLE CODE
    0x08048000 ----------------------------------

Try examining /proc/*/maps, and also watch your programs call brk()
using strace; you'll see all this in action...

> So why does the process space start at such a high virtual address
> (why not closer to 0x0)? Seems we're wasting ~128 megs of RAM. Not a
> huge amount compared to 4G, but significant.

I don't know; anyone care to comment?

> Another question: how (and where in the code) do we translate virtual
> user-addresses to physical addresses?

In hardware, with the TLB and, if the TLB misses, the page tables.

> Does the MMU do it, or does it call a kernel handler function?

Only when an attempt is made to access an unmapped or protected page;
then you get an interrupt (page fault), which the kernel code handles.

> Why is the kernel allowed to reference physical addresses, while user
> processes go through the translation step?

Not even the kernel accesses physical memory directly. It can, however,
choose to map the physical memory into its own address space
contiguously. Linux puts it at 0xC0000000 and up. (Question for the
gurus: what happens on machines with more than 1GB of RAM?)

> Can kernel pages be swapped out / faulted in just like user process
> pages?

Linux does not swap kernel memory; the kernel is so small it's not
worth the trouble (are there other reasons?). E.g. my Linux boxes run
1-2MB of kernel code; my NT machine is running 6MB at the moment...

Dan
Re: large memory support for x86
On Thu, Oct 12, 2000 at 07:19:32PM -0400, Dan Maas wrote:
> > So why does the process space start at such a high virtual address
> > (why not closer to 0x0)? Seems we're wasting ~128 megs of RAM. Not a
> > huge amount compared to 4G, but significant.
>
> I don't know; anyone care to comment?

Apparently to catch NULL pointer references with array indices (int *p
= NULL; p[5000]). I agree that it is a very wasteful use of precious
virtual memory.

> > Can kernel pages be swapped out / faulted in just like user process
> > pages?
>
> Linux does not swap kernel memory; the kernel is so small it's not
> worth the trouble (are there other reasons?). E.g. my Linux boxes run
> 1-2MB of kernel code; my NT machine is running 6MB at the moment...

Actually most Linux boxes do, but in the old sense of swapping from
before virtual memory (or overlaying, in DOS terms): they have a cron
job that expires modules with usage count 0 (or, in 2.0, kerneld that
did the same).

It is a rather dangerous thing though; module unloading tends to be one
of the most race-prone and, in addition, not too well tested places in
the kernel. I usually recommend turning it off on any production
machine. In 2.4, with the new fine-grained SMP locking, it is much,
much more dangerous, nearly impossible to solve properly.

-Andi