Re: large memory support for x86

2000-10-25 Thread Albert D. Cahalan

Petr Vandrovec writes:

> Sure it does not. Selectors point to linear addresses, before passing them
> through pagetables. You have 32+14 bits of virtual address (32 = offset,
> 14 = valid bits in selector), which are translated, together with
> offset, to 32 bit linear address. This 32bit linear address is passed
> through pagetables to 36 bit physical address. So it must go through
> 32bit linear address, and there is no easy way to overcome this limit.
...
> make complete segment non present, and on pagefault
> unmap all pages belonging to some selector + invalidate selector, and
> map something else in. You must create at least four such areas,
> as you must have mapped at least CS, SS, ES and one of DS/FS/GS to
> successfully execute MOVSB... So each area should be < 256MB.

This does it...

You sort of page (segment?) the virtual space with segment faults.
Um, you want 3-level software page tables with that?

> Are you really sure that it is worth of effort? Also do not forget
> that 'sizeof(void*) > sizeof(long)' in such environment, so tons of
> code broke... And someone must translate pointers from 48bits to 32
> for kernel use...

No, you don't need that. Nobody could tolerate it anyway.
You need to hack gcc to use a 64-bit long, and a 64-bit pointer.
The pointer has 16 dead bits, 16 segment bits, and 32 offset bits.
Hey, look, LP64 on ia32!

You might want a separate personality for this, so that programs
compiled with this feature would get their own system calls.
Otherwise, there will surely be serious monkey business in libc
trying to get a pointer the kernel will use correctly.

If you select a good segment size and cache hardware page tables
that aren't active, performance might not be... abysmal. The worst
part might be that the CPU becomes less parallel when it encounters
operations that access different segments. Of course 64-bit pointer
math doesn't come free either. I once had a 486DX-75 and I liked it,
so you're doing fine if you can beat that. At least you can use this
to test software for LP64.

Hey, threads are extra cool. They don't really need to share the
hardware page tables. If you only share the software page tables,
then you don't thrash so bad.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-16 Thread lange92

BTW, this fork program did appear to kill about two Sun servers here...
The Linux kernel v2.2.16 that they were running survived fine.


On Thu, 12 Oct 2000, Richard B. Johnson wrote:

> On Thu, 12 Oct 2000, Oliver Xymoron wrote:
> 
> > On Wed, 11 Oct 2000, Kiril Vidimce wrote:
> > 
> > > My primary concern is whether a process can allocate more than 4 GB of 
> > > memory, rather than just be able to use more than 4 GB of physical 
> > > memory in the system.
> > 
> > Define allocate. There are tricks you can play, but userspace is still a
> > flat 32-bit address space per-process.
> >  
> 
> --- per process. Which means, in principle, that one could have 100
> processes that are accessing a total of 400 Gb of virtual memory.
> 
> It gets to be a bit less than that, though. Process virtual address
> space doesn't start at 0
> 
> Script started on Thu Oct 12 13:25:45 2000
> cat xxx.c
> main()
> {
> printf("main() starts at %p\n", main);
> }
> # gcc -o xxx xxx.c
> # ./xxx
> main() starts at 0x8048488
> # exit
> exit
> Script done on Thu Oct 12 13:26:08 2000
> 
> 
> 
> Cheers,
> Dick Johnson
> 
> Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).
> 
> "Memory is like gasoline. You use it up when you are running. Of
> course you get it all back when you reboot..."; Actual explanation
> obtained from the Micro$oft help desk.
> 
> 




Re: large memory support for x86

2000-10-13 Thread Brian Gerst

Timur Tabi wrote:
> I understand that a normal virtual address (i.e. a pointer) can only address a
> single 32-bit (4GB) memory block.  My point was that by also using more than
> one 16-bit selector, you can have multiple 4GB areas.  So for instance,
> 1000: can point to one physical address, and 1001: can point a
> different physical address.
> 
> Yes, this means that you need to use multiple selectors in order to access more
> than 4GB of virtual space.

If you were to try to overflow the linear address you would either get a
fault or a wraparound.  It won't work like the old DOS highmem tricks.

> According to section 3.8 of Intel's P3 manual, Volume 3, enabling PAE
> increases the size of the page table entries to 64 bits.  There are other
> changes, such as extending the 20-bit page directory base address to 27 bits.
> All this means that a virtual address (selector:offset) can point to a physical
> address larger than 32 bits.
> 
> Frankly, the whole thing makes my head hurt.

Section 3.9.1, quote:
"No matter which 36-bit addressing feature is used (PAE or 36-bit PSE),
the linear address space of the processor remains at 32 bits.
Applications must partition the address space of their work loads across
multiple operating system processes to take advantage of the additional
physical memory provided in the system."

--

Brian Gerst



Re: large memory support for x86

2000-10-13 Thread Alexander Viro



On Fri, 13 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Alexander Viro <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 15:25:31 -0400 (EDT)
> 
> 
> > Ditto with PAE: 16:32->32->36.
> > In _all_ cases you are limited by the size of linear address. I.e. all
> > address modes are limited to 4Gb. All you can get from PAE is placing of
> > these 4Gb in 64Gb of physical memory.
> 
> Then how are you supposed to access all 64GB of RAM in your machine?  The
> kernel must be able to access all 64GB of RAM at once, otherwise it can't do
> proper memory management.

Kernel doesn't have to access it all at once. Most of the time it doesn't
care about the contents of most pages. It certainly needs some permanently
mapped stuff, but it's _way_ less than the full memory. The rest is mapped
and unmapped on demand. That's what kmap() and kunmap() do.
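
For reference, the on-demand pattern he describes looks roughly like this with the 2.4-era kmap()/kunmap() API (a sketch; `zero_page_contents` is a made-up example, not code from this thread):

```c
#include <linux/highmem.h>
#include <linux/string.h>

/* Sketch: a highmem page has no permanent kernel virtual address,
 * so we map it into a temporary window, touch it, and unmap it. */
static void zero_page_contents(struct page *page)
{
        char *vaddr = kmap(page);       /* create a temporary mapping */
        memset(vaddr, 0, PAGE_SIZE);
        kunmap(page);                   /* give the window back */
}
```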

Moreover, 4.4BSD derivatives never bothered mapping the whole physical
memory in kernel space, even on 386. It's more complex than what we used
to do, but it's doable quite fine. There is no need to keep the whole
memory mapped by the kernel.

> I've been reading on the PAE in Intel's manuals.  I admit, some of it is over
> my head.  I was under the impression that it was 16:32->64->36 with PAE enabled.

Nope. 16:32->32->36. Paging is _after_ the 32-bit bottleneck. You'd have
to change the segment descriptor format to expand it.




Re: large memory support for x86

2000-10-13 Thread Alexander Viro



On Fri, 13 Oct 2000, Timur Tabi wrote:

> I understand that a normal virtual address (i.e. a pointer) can only address a
> single 32-bit (4GB) memory block.  My point was that by also using more than
> one 16-bit selector, you can have multiple 4GB areas.  So for instance,
> 1000: can point to one physical address, and 1001: can point a
> different physical address.

_All_ of them are piped through the 4Gb address space. I.e. every segment
is mapped to a part of the same (for all segments) 4Gb. That address space
is, in turn, mapped to 64Gb of physical memory. At any moment you can't
get more than 2^32 different elements of physical memory accessible, even
though you have 48 bits of address in the beginning and 36 bits in the
end.

Think of it this way: we have two functions:

u32 map_segment(u48);
u36 map_paging(u32);

and the processor does map_paging(map_segment(address)) when it calculates
physical addresses. Even though both the domain and the range are larger than
2^32, the number of distinct values is at most 2^32.

> Yes, this means that you need to use multiple selectors in order to access more
> than 4GB of virtual space.
> 
> According to section 3.8 of Intel's P3 manual, Volume 3, enabling the PAE
> increases the size of the page table entries to 64 bits.  There are other
> changes, such as extended the 20-bit page directory base address to 27 bits.
> All this means that a virtual address (selector:offset) can point to a physical
> address larger than 32 bits.

Virtual address gives linear address. _Then_ it is translated into
physical address. Page tables describe the latter mapping. Descriptor
tables - the former. Size of linear address is the bottleneck here and no
changes past that bottleneck can expand the number of possible values.




Re: large memory support for x86

2000-10-13 Thread Timur Tabi

** Reply to message from Alexander Viro <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
15:25:31 -0400 (EDT)


> Ditto with PAE: 16:32->32->36.
> In _all_ cases you are limited by the size of linear address. I.e. all
> address modes are limited to 4Gb. All you can get from PAE is placing of
> these 4Gb in 64Gb of physical memory.

Then how are you supposed to access all 64GB of RAM in your machine?  The
kernel must be able to access all 64GB of RAM at once, otherwise it can't do
proper memory management.

I've been reading on the PAE in Intel's manuals.  I admit, some of it is over
my head.  I was under the impression that it was 16:32->64->36 with PAE enabled.



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



Re: large memory support for x86

2000-10-13 Thread Alexander Viro



On Fri, 13 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
> 
> 
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
> 
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory.  If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

RTFM. Take any manual on x86 architecture and stare at the
pictures. Ones that describe how segments work. Real mode: 16:16 ->
21 or 20. Real mode with undocumented twists: 16:16->21->32 (you _can_ get
paging with real mode style of segments). 286 protected mode: 16:16 -> 24.
Ditto with twists: 16:16->24->32. 386 protected mode: 16:32->32.
Ditto with paging: 16:32->32->32. Ditto with PAE: 16:32->32->36.
In _all_ cases you are limited by the size of linear address. I.e. all
address modes are limited to 4Gb. All you can get from PAE is placing of
these 4Gb in 64Gb of physical memory.




Re: large memory support for x86

2000-10-13 Thread Timur Tabi

** Reply to message from Brian Gerst <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
15:07:42 -0400


> You missed the point.  The layering of the x86 memory management is:
> 
>       Segment
>          |
>   Virtual Address   <- limited to 32 bits
>          |
>   Physical Address
> 
> Segmentation never directly gives you a physical address, even in real
> mode (although in real mode the virtual address is hardwired to the
> physical address).  Virtual addresses are always 32 bits on the x86:
> in real mode, in protected mode, and with PAE enabled.

I understand that a normal virtual address (i.e. a pointer) can only address a
single 32-bit (4GB) memory block.  My point was that by also using more than
one 16-bit selector, you can have multiple 4GB areas.  So for instance,
1000: can point to one physical address, and 1001: can point a
different physical address.

Yes, this means that you need to use multiple selectors in order to access more
than 4GB of virtual space.

According to section 3.8 of Intel's P3 manual, Volume 3, enabling PAE
increases the size of the page table entries to 64 bits.  There are other
changes, such as extending the 20-bit page directory base address to 27 bits.
All this means that a virtual address (selector:offset) can point to a physical
address larger than 32 bits.

Frankly, the whole thing makes my head hurt.



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



RE: large memory support for x86

2000-10-13 Thread Richard B. Johnson

On Fri, 13 Oct 2000, Chris Swiedler wrote:
> 
> Why is it that a user process can't intentionally switch segments?
> Dereferencing a 32-bit address causes the address to be calculated using the
> "current" segment descriptor, right? It seems to me that a process could set
> a new segment selector, in which case a dereference would operate on a whole
> new segment. Is there a reason why processes are limited to a single
> segment?
> 
> chris

You can (although not in user mode). The problem is: "What RAM do you
reference?" You have 32 bits' worth of addressing available on a
machine that has 32 bits' worth of addressable RAM.

What can be done is to set one page to be 'not present'. Then, when
you page-fault, the page-fault handler can set some bit(s) in a
hardware page register to map in another bank of RAM that isn't in
use yet. In principle, this would allow access to (2^16/8 - 1) * (2^32 - 1)
unique areas of RAM (segment descriptors are 16 bits, but they are
numbered in increments of 8).

So, if you now want to copy from a segment addressed by DS back to your
original DS, you have to rewrite the 'C' runtime library to use
'full pointers', i.e., DS:ESI, ES:EDI, etc., with DS and ES being
different. You have to dereference the full pointer on every access!

Caching has to be turned off when RAM values that may be in the cache
reference RAM that is not bank-switched in. You would have a slow mess.

However, in principle it could work.


Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: large memory support for x86

2000-10-13 Thread Brian Gerst

Timur Tabi wrote:
> 
> ** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
> 
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
> 
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory.  If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

You missed the point.  The layering of the x86 memory management is:

      Segment
         |
  Virtual Address   <- limited to 32 bits
         |
  Physical Address

Segmentation never directly gives you a physical address, even in real
mode (although in real mode the virtual address is hardwired to the
physical address).  Virtual addresses are always 32 bits on the x86:
in real mode, in protected mode, and with PAE enabled.

--

Brian Gerst



Re: large memory support for x86

2000-10-13 Thread Petr Vandrovec

On 13 Oct 00 at 13:42, Timur Tabi wrote:
> ** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
> 
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory.  If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

Sure it does not. Selectors point to linear addresses, before passing them
through pagetables. You have 32+14 bits of virtual address (32 = offset,
14 = valid bits in selector), which are translated, together with
offset, to 32 bit linear address. This 32bit linear address is passed
through pagetables to 36 bit physical address. So it must go through
32bit linear address, and there is no easy way to overcome this limit.

You can either (1) forget about simple pointers and dereferencing, and
go the real-mode memory-window way - you must GlobalLock() a memory area
to get a pointer, and later GlobalUnlock() it... And you can lock
at most 2GB (maybe 3GB with a really good algorithm) of such
data... Or (2) make a complete segment non-present, and on the page fault
unmap all pages belonging to some selector + invalidate the selector, and
map something else in. You must create at least four such areas,
as you must have mapped at least CS, SS, ES and one of DS/FS/GS to
successfully execute MOVSB... So each area should be < 256MB.
Are you really sure that it is worth the effort? Also do not forget
that 'sizeof(void*) > sizeof(long)' in such an environment, so tons of
code would break... And someone must translate pointers from 48 bits to 32
for kernel use...
Best regards,
Petr Vandrovec
[EMAIL PROTECTED]




Re: large memory support for x86

2000-10-13 Thread kernel

On Fri, 13 Oct 2000, Timur Tabi wrote:

> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory.  If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

No.  The segment base and length are confined to the 32-bit address space
mapped by the page tables.

-ben




Re: large memory support for x86

2000-10-13 Thread Brian Gerst

Chris Swiedler wrote:
> 
> > no, x86 virtual memory is 32 bits - segmentation only provides a way to
> > segment this 4GB virtual memory, but cannot extend it. Under Linux there
> > is 3GB virtual memory available to user-space processes.
> >
> > this 3GB virtual memory does not have to be mapped to the same physical
> > pages all the time - and this is nothing new. mmap()/munmap()-ing memory
> > dynamically is one way to 'extend' the amount of physical RAM controlled
> > by a single process. I doubt this would be very economical though.
> >
> > Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
> > can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
> > utilize up to 24 GB RAM at once, without having to play mmap/munmap
> > 'memory extender' tricks.
> 
> Why is it that a user process can't intentionally switch segments?
> Dereferencing a 32-bit address causes the address to be calculated using the
> "current" segment descriptor, right? It seems to me that a process could set
> a new segment selector, in which case a dereference would operate on a whole
> new segment. Is there a reason why processes are limited to a single
> segment?

A segment is just a window mapped on top of virtual memory.  A process
can have many segments, but only has one virtual memory mapping.  You
cannot use segments to access memory that isn't mapped into the virtual
address space, which is where the 32-bit limit exists.  It may be
possible to use the segment-not-present fault to switch page tables, but
this would be very unportable and would incur a lot of extra overhead.

--

Brian Gerst



Re: large memory support for x86

2000-10-13 Thread Timur Tabi

** Reply to message from Ingo Molnar <[EMAIL PROTECTED]> on Fri, 13 Oct 2000
20:44:19 +0200 (CEST)


> processes are not limited to a single segment, eg. Wine uses nonstandard
> segments. But as i said, using multiple segments does not let you out of
> 32 bits of virtual memory.

Sure it does, just like segments let 16-bit apps access more than 64KB of
memory.  If you have two selectors, each one can point to a different physical
base address, and IIRC, the size of the physical address base can be 36 bits.
That gives you 16 physically contiguous 4GB memory blocks.



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



RE: large memory support for x86

2000-10-13 Thread Ingo Molnar


On Fri, 13 Oct 2000, Chris Swiedler wrote:

> Why is it that a user process can't intentionally switch segments?
> Dereferencing a 32-bit address causes the address to be calculated
> using the "current" segment descriptor, right? It seems to me that a
> process could set a new segment selector, in which case a dereference
> would operate on a whole new segment. Is there a reason why processes
> are limited to a single segment?

processes are not limited to a single segment, eg. Wine uses nonstandard
segments. But as i said, using multiple segments does not let you out of
32 bits of virtual memory.

Ingo




RE: large memory support for x86

2000-10-13 Thread Chris Swiedler

> no, x86 virtual memory is 32 bits - segmentation only provides a way to
> segment this 4GB virtual memory, but cannot extend it. Under Linux there
> is 3GB virtual memory available to user-space processes.
>
> this 3GB virtual memory does not have to be mapped to the same physical
> pages all the time - and this is nothing new. mmap()/munmap()-ing memory
> dynamically is one way to 'extend' the amount of physical RAM controlled
> by a single process. I doubt this would be very economical though.
>
> Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
> can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
> utilize up to 24 GB RAM at once, without having to play mmap/munmap
> 'memory extender' tricks.

Why is it that a user process can't intentionally switch segments?
Dereferencing a 32-bit address causes the address to be calculated using the
"current" segment descriptor, right? It seems to me that a process could set
a new segment selector, in which case a dereference would operate on a whole
new segment. Is there a reason why processes are limited to a single
segment?

chris




RE: large memory support for x86

2000-10-13 Thread Ingo Molnar


On Fri, 13 Oct 2000, Chris Swiedler wrote:

 Why is it that a user process can't intentionally switch segments?
 Dereferencing a 32-bit address causes the address to be calculated
 using the "current" segment descriptor, right? It seems to me that a
 process could set a new segment selector, in which case a dereference
 would operate on a whole new segment. Is there a reason why processes
 are limited to a single segment?

processes are not limited to a single segment, eg. Wine uses nonstandard
segments. But as i said, using multiple segments does not let you out of
32 bits of virtual memory.

Ingo

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-13 Thread Timur Tabi

** Reply to message from Ingo Molnar [EMAIL PROTECTED] on Fri, 13 Oct 2000
20:44:19 +0200 (CEST)


 processes are not limited to a single segment, eg. Wine uses nonstandard
 segments. But as i said, using multiple segments does not let you out of
 32 bits of virtual memory.

Sure it does, just like segments let 16-bit apps access more than 64KB of
memory.  If you have two selectors, each one can point to a different physical
base address, and IIRC, the size of the physical address base can be 36 bits.
That gives you 16 physically contiguous 4GB memory blocks.



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-13 Thread kernel

On Fri, 13 Oct 2000, Timur Tabi wrote:

 Sure it does, just like segments let 16-bit apps access more than 64KB of
 memory.  If you have two selectors, each one can point to a different physical
 base address, and IIRC, the size of the physical address base can be 36 bits.
 That gives you 16 physically contiguous 4GB memory blocks.

No.  The segment base and length is confined to the 32 bit address space
mapped by page tables.

-ben

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-13 Thread Petr Vandrovec

On 13 Oct 00 at 13:42, Timur Tabi wrote:
> ** Reply to message from Ingo Molnar [EMAIL PROTECTED] on Fri, 13 Oct 2000
> 20:44:19 +0200 (CEST)
> > processes are not limited to a single segment, eg. Wine uses nonstandard
> > segments. But as i said, using multiple segments does not let you out of
> > 32 bits of virtual memory.
> 
> Sure it does, just like segments let 16-bit apps access more than 64KB of
> memory.  If you have two selectors, each one can point to a different physical
> base address, and IIRC, the size of the physical address base can be 36 bits.
> That gives you 16 physically contiguous 4GB memory blocks.

Sure it does not. Selectors point into the linear address space, before
anything passes through the pagetables. You have 32+14 bits of virtual
address (32 = offset, 14 = valid bits in the selector); the selector's base
is added to the offset to form a 32-bit linear address. This 32-bit linear
address is then passed through the pagetables to a 36-bit physical address.
So everything must go through a 32-bit linear address, and there is no easy
way to overcome this limit.

You can either (1) forget about simple pointers and dereferencing pointers,
and go the real-mode-window way - you must GlobalLock() a memory area,
and you'll get a pointer. Then you GlobalUnlock()... And you can lock
at most 2GB (maybe 3GB with a really clever algorithm) of such
data... Or (2) make a complete segment non-present, and on pagefault
unmap all pages belonging to some selector + invalidate the selector, and
map something else in. You must create at least four such areas,
as you must have mapped at least CS, SS, ES and one of DS/FS/GS to
successfully execute MOVSB... So each area should be < 256MB.
Are you really sure that it is worth the effort? Also do not forget
that 'sizeof(void*) > sizeof(long)' in such an environment, so tons of
code would break... And someone must translate pointers from 48 bits to 32
for kernel use...
Best regards,
Petr Vandrovec
[EMAIL PROTECTED]




Re: large memory support for x86

2000-10-13 Thread Timur Tabi

** Reply to message from Brian Gerst [EMAIL PROTECTED] on Fri, 13 Oct 2000
15:07:42 -0400


> You missed the point.  The layering on the x86 memory management is such:
> 
>      Segment
>         |
>      Virtual Address - limited to 32 bits
>         |
>      Physical Address
> 
> Segmentation never directly gives you a physical address, even in real
> mode.  Although in real mode the virtual address is hardwired to the
> physical address.  Virtual addresses are always 32 bits on the x86: in
> real mode, in protected mode, and with PAE enabled.

I understand that a normal virtual address (i.e. a pointer) can only address a
single 32-bit (4GB) memory block.  My point was that by also using more than
one 16-bit selector, you can have multiple 4GB areas.  So for instance,
selector 1000 can point to one physical address, and selector 1001 can point
to a different physical address.

Yes, this means that you need to use multiple selectors in order to access more
than 4GB of virtual space.

According to section 3.8 of Intel's P3 manual, Volume 3, enabling PAE
increases the size of the page table entries to 64 bits.  There are other
changes, such as extending the 20-bit page directory base address to 27 bits.
All this means that a virtual address (selector:offset) can point to a physical
address larger than 32 bits.

Frankly, the whole thing makes my head hurt.



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



Re: large memory support for x86

2000-10-13 Thread Alexander Viro



On Fri, 13 Oct 2000, Timur Tabi wrote:

> I understand that a normal virtual address (i.e. a pointer) can only address a
> single 32-bit (4GB) memory block.  My point was that by also using more than
> one 16-bit selector, you can have multiple 4GB areas.  So for instance,
> selector 1000 can point to one physical address, and selector 1001 can point
> to a different physical address.

_All_ of them are piped through the 4Gb address space. I.e. every segment
is mapped to a part of the same (for all segments) 4Gb. That address space
is, in turn, mapped to 64Gb of physical memory. At any moment you can't
get more than 2^32 different elements of physical memory accessible, even
though you have 48 bits of address in the beginning and 36 bits in the
end.

Think of it that way: we have two functions:

u32 map_segment(u48);
u36 map_paging(u32);

and the processor does map_paging(map_segment(address)) when it calculates
physical addresses. Even though both the domain and the range are larger than
2^32, the number of distinct values the composition can produce is at most
2^32 - the size of the intermediate linear address space.

 Yes, this means that you need to use multiple selectors in order to access more
 than 4GB of virtual space.
 
 According to section 3.8 of Intel's P3 manual, Volume 3, enabling the PAE
 increases the size of the page table entries to 64 bits.  There are other
 changes, such as extended the 20-bit page directory base address to 27 bits.
 All this means that a virtual address (selector:offset) can point to a physical
 address larger than 32 bits.

Virtual address gives linear address. _Then_ it is translated into
physical address. Page tables describe the latter mapping. Descriptor
tables - the former. Size of linear address is the bottleneck here and no
changes past that bottleneck can expand the number of possible values.




Re: large memory support for x86

2000-10-13 Thread Alexander Viro



On Fri, 13 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Alexander Viro [EMAIL PROTECTED] on Fri, 13 Oct 2000
> 15:25:31 -0400 (EDT)
> 
> > Ditto with PAE: 16:32->32->36.
> > In _all_ cases you are limited by the size of linear address. I.e. all
> > address modes are limited to 4Gb. All you can get from PAE is placing of
> > these 4Gb in 64Gb of physical memory.
> 
> Then how are you supposed to access all 64GB of RAM in your machine?  The
> kernel must be able to access all 64GB of RAM at once, otherwise it can't do
> proper memory management.

Kernel doesn't have to access it all at once. Most of the time it doesn't
care about the contents of most pages. It certainly needs some permanently
mapped stuff, but it's _way_ less than the full memory. The rest is mapped
and unmapped on demand. That's what kmap() and kunmap() do.

Moreover, 4.4BSD derivatives never bothered mapping the whole of physical
memory into kernel space, even on the 386. It's more complex than what we
used to do, but it's quite doable. There is no need to keep the whole of
memory mapped by the kernel.

> I've been reading on the PAE in Intel's manuals.  I admit, some of it is over
> my head.  I was under the impression that it was 16:32->64->36 with PAE enabled.

Nope. 16:32->32->36. Paging is _after_ the 32-bit bottleneck. You'd have
to change the segment descriptor format to expand it.




Re: large memory support for x86

2000-10-12 Thread Andi Kleen

On Thu, Oct 12, 2000 at 07:19:32PM -0400, Dan Maas wrote:
> The memory map of a user process on x86 looks like this:
> 
> -
> KERNEL (always present here)
> 0xC0000000
> -
> 0xBFFFFFFF
> STACK
> -
> MAPPED FILES (incl. shared libs)
> 0x40000000
> -
> HEAP (brk()/malloc())
> EXECUTABLE CODE
> 0x08048000
> -
> 
> Try examining /proc/*/maps, and also watch your programs call brk() using
> strace; you'll see all this in action...
> 
> > So why does the process space start at such a high virtual
> > address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
> > RAM. Not a huge amount compared to 4G, but significant.
> 
> I don't know; anyone care to comment?

Apparently to catch NULL pointer references made via large array indices
(int *p = NULL;  p[5000]).

I agree that it is a very wasteful use of precious virtual memory.

> > Can kernel
> > pages be swapped out / faulted in just like user process pages?
> 
> Linux does not swap kernel memory; the kernel is so small it's not worth the
> trouble (are there other reasons?). e.g. My Linux boxes run 1-2MB of kernel
> code; my NT machine is running >6MB at the moment...

Actually most Linux boxes do, in the old sense of "swapping" that predates
virtual memory (overlaying, in DOS terms): they have a cronjob that
expires modules with usage count 0 (or, in 2.0, kerneld did the same).
It is a rather dangerous thing though; module unloading tends to be
one of the most race-prone and, in addition, not too well tested places in
the kernel. I usually recommend turning it off on any production machine.
In 2.4, with the new fine-grained SMP locking, it is much more dangerous,
nearly impossible to get right.



-Andi



Re: large memory support for x86

2000-10-12 Thread Dan Maas

The memory map of a user process on x86 looks like this:

-
KERNEL (always present here)
0xC0000000
-
0xBFFFFFFF
STACK
-
MAPPED FILES (incl. shared libs)
0x40000000
-
HEAP (brk()/malloc())
EXECUTABLE CODE
0x08048000
-

Try examining /proc/*/maps, and also watch your programs call brk() using
strace; you'll see all this in action...

> So why does the process space start at such a high virtual
> address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
> RAM. Not a huge amount compared to 4G, but significant.

I don't know; anyone care to comment?

> Another question: how (and where in the code) do we translate virtual
> user-addresses to physical addresses?

In hardware: by the TLB, and on a TLB miss, by a page-table walk.

> Does the MMU do it, or does it call a
> kernel handler function?

Only when an attempt is made to access an unmapped or protected page; then
you get an interrupt (page fault), which the kernel code handles.

> Why is the kernel allowed to reference physical
> addresses, while user processes go through the translation step?

Not even the kernel accesses physical memory directly. It can, however,
choose to map the physical memory into its own address space contiguously.
Linux puts it at 0xC0000000 and up. (Question for the gurus: what happens on
machines with >1GB of RAM?)

> Can kernel
> pages be swapped out / faulted in just like user process pages?

Linux does not swap kernel memory; the kernel is so small it's not worth the
trouble (are there other reasons?). e.g. My Linux boxes run 1-2MB of kernel
code; my NT machine is running >6MB at the moment...

Dan




Re: large memory support for x86

2000-10-12 Thread Ingo Molnar


On Thu, 12 Oct 2000, Timur Tabi wrote:

> Of course, you could define a pointer to be a 48-bit value, but I
> doubt that would really work.

no, x86 virtual memory is 32 bits - segmentation only provides a way to
segment this 4GB virtual memory, but cannot extend it. Under Linux there
is 3GB virtual memory available to user-space processes.

this 3GB virtual memory does not have to be mapped to the same physical
pages all the time - and this is nothing new. mmap()/munmap()-ing memory
dynamically is one way to 'extend' the amount of physical RAM controlled
by a single process. I doubt this would be very economical though.

Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
utilize up to 24 GB RAM at once, without having to play mmap/munmap
'memory extender' tricks.

Ingo




Re: large memory support for x86

2000-10-12 Thread Malcolm Beattie

Timur Tabi writes:
> ** Reply to message from Jeff Epler <[EMAIL PROTECTED]> on Thu, 12 Oct 2000
> 13:08:19 -0500
> > What the support for >4G of memory on x86 is about, is the "PAE", Page Address
> > Extension, supported on P6 generation of machines, as well as on Athlons
> > (I think).  With these, the kernel can use >4G of memory, but it still can't
> > present a >32bit address space to user processes.  But you could have 8G
> > physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.
> 
> How about the kernel itself?  How do I access the memory above 4GB inside a
> device driver?

It depends on what you have already. If you're given a (kernel)
virtual address, just dereference it. The unit of currency for
physical pages is the "struct page". If you want to allocate a
physical page for your own use (from anywhere in physical memory)
them you do

struct page *page = alloc_page(GFP_FOO);

If you want to read/write to that page directly from kernel space
then you need to map it into kernel space:

char *va = kmap(page);
/* read/write from the page starting at virtual address va */
kunmap(page);   /* note: kunmap takes the struct page, not the address */

The implementations of kmap and kunmap are such that mappings are
cached (within reason), so kmap/kunmap is reasonably fast.
If you want to do something else with the page (like get some I/O
done to/from it) then the new (and forthcoming) kiobuf functions
take struct page units and handle all the internal mapping gubbins
without you having to worry about it.

--Malcolm

-- 
Malcolm Beattie <[EMAIL PROTECTED]>
Unix Systems Programmer
Oxford University Computing Services



Re: large memory support for x86

2000-10-12 Thread Timur Tabi

** Reply to message from "Richard B. Johnson" <[EMAIL PROTECTED]> on Thu,
12 Oct 2000 15:17:15 -0400 (EDT)


> With ix86 processors in the kernel, you can create multiple segments
> and multiple page-tables.

Does the kernel provide services for this, or will I have to hack up the x86
page tables myself?  If there are kernel services, what are they?

> If you have some 'hardware-specific' way
> of mapping in real RAM that is not used by anybody else, you can have
> your own private segment.

But is this part of the established Linux >4GB support?  I don't think so.

> Page registers can map in multiple sticks
> of RAM into a single window. The page-fault handler can manipulate
> the page registers for user-mode code. It's being done in PAE.

But how does one go about doing all this?



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



RE: large memory support for x86

2000-10-12 Thread Chris Swiedler

> > Am I reading this correctly--the address of the main() function for a
> > process is guaranteed to be the lowest possible virtual address?
> >
> > chris
> >
>
> It is one of the lowest. The 'C' runtime library puts section
> .text (the code) first, then .data, then .bss, then .stack.  The
> .stack section is co-located with the heap which can be extended
> by setting a new break address.
>
> When a process is created, the lowest address is the entry point of
> crt0.o  _init. We can see where that is by:
>
> Script started on Thu Oct 12 14:25:35 2000
> # cat xxx.c
>
> extern int _init();
> main()
> {
> printf("_init is at %p\n", _init);
> }
>
> # gcc -o xxx xxx.c
> # ./xxx
> _init is at 0x804838c
> # exit
> exit
> Script done on Thu Oct 12 14:25:51 2000
>
> That said, remember that in Unix, the 'C' runtime library exists in the
> lower portion of the .text section. So your code's virtual address space
> starts above that address space. This is MMAPed so everybody gets
> to share the same pages. In this way, you don't all have to keep a
> private copy of the 'C' runtime library.

User-process virtual addresses have no direct relation to physical
addresses, right? So why does the process space start at such a high virtual
address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
RAM. Not a huge amount compared to 4G, but significant. Is that space used
(libc can't be that big!) or reserved somehow?

Another question: how (and where in the code) do we translate virtual
user-addresses to physical addresses? Does the MMU do it, or does it call a
kernel handler function? Why is the kernel allowed to reference physical
addresses, while user processes go through the translation step? Can kernel
pages be swapped out / faulted in just like user process pages?

Sorry to pounce on you with all of these questions. I've read up on this
stuff but can't always find answers...

thanks--
chris




Re: large memory support for x86

2000-10-12 Thread Richard B. Johnson

On Thu, 12 Oct 2000, Timur Tabi wrote:

> ** Reply to message from Jeff Epler <[EMAIL PROTECTED]> on Thu, 12 Oct 2000
> 13:08:19 -0500
> 
> 
> > What the support for >4G of memory on x86 is about, is the "PAE", Page Address
> > Extension, supported on P6 generation of machines, as well as on Athlons
> > (I think).  With these, the kernel can use >4G of memory, but it still can't
> > present a >32bit address space to user processes.  But you could have 8G
> > physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.
> 
> How about the kernel itself?  How do I access the memory above 4GB inside a
> device driver?
> 
> > There may or may not be some way to support an abomination like the old "far"
> > pointers in DOS (multiple 4G segments), but I don't think it has been written
> > yet.
> 
> Yes, it's ugly, but it works and it's compatible.  Well, compatible with 32-bit
> code, probably not compatible with Linux code overall.
> 
> Of course, you could define a pointer to be a 48-bit value, but I doubt that
> would really work.
> 

With ix86 processors in the kernel, you can create multiple segments
and multiple page-tables. If you have some 'hardware-specific' way
of mapping in real RAM that is not used by anybody else, you can have
your own private segment. Page registers can map in multiple sticks
of RAM into a single window. The page-fault handler can manipulate
the page registers for user-mode code. It's being done in PAE.


Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: large memory support for x86

2000-10-12 Thread Timur Tabi

** Reply to message from Jeff Epler <[EMAIL PROTECTED]> on Thu, 12 Oct 2000
13:08:19 -0500


> What the support for >4G of memory on x86 is about, is the "PAE", Page Address
> Extension, supported on P6 generation of machines, as well as on Athlons
> (I think).  With these, the kernel can use >4G of memory, but it still can't
> present a >32bit address space to user processes.  But you could have 8G
> physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.

How about the kernel itself?  How do I access the memory above 4GB inside a
device driver?

> There may or may not be some way to support an abomination like the old "far"
> pointers in DOS (multiple 4G segments), but I don't think it has been written
> yet.

Yes, it's ugly, but it works and it's compatible.  Well, compatible with 32-bit
code, probably not compatible with Linux code overall.

Of course, you could define a pointer to be a 48-bit value, but I doubt that
would really work.






-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



Re: large memory support for x86

2000-10-12 Thread Jeff Epler

On Thu, Oct 12, 2000 at 10:36:38AM -0700, Kiril Vidimce wrote:
> Allocate = malloc(). The process needs to be able to operate on >4 GB
> chunks of memory. I understand that it's only a 32 bit address space
> which is why I was surprised when I read that Linux 2.4.x will support
> upwards of 64 GB's of memory.
> 
> Thanks for all the responses.

Pointers are still 32 bits on x86, and the visible address space for
any particular process is still somewhat less than 4G.

I believe that if you select Linux on Alpha that you can have more than 4G
per process, but that may or may not be true.

What the support for >4G of memory on x86 is about, is the "PAE", Page Address
Extension, supported on P6 generation of machines, as well as on Athlons
(I think).  With these, the kernel can use >4G of memory, but it still can't
present a >32bit address space to user processes.  But you could have 8G
physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.

There may or may not be some way to support an abomination like the old "far"
pointers in DOS (multiple 4G segments), but I don't think it has been written
yet.

Jeff



Re: large memory support for x86

2000-10-12 Thread Jonathan George

>On Thu, 12 Oct 2000, Oliver Xymoron wrote: 
>> On Wed, 11 Oct 2000, Kiril Vidimce wrote: 
>> 
>> > My primary concern is whether a process can allocate more than 4 GB of 
>> > memory, rather than just be able to use more than 4 GB of physical 
>> > memory in the system. 
>> 
>> Define allocate. There are tricks you can play, but userspace is still a 
>> flat 32-bit address space per-process.
>
>
>Allocate = malloc(). The process needs to be able to operate on >4 GB 
>chunks of memory. I understand that it's only a 32 bit address space 
>which is why I was surprised when I read that Linux 2.4.x will support 
>upwards of 64 GB's of memory. 
>
>
>Thanks for all the responses. 
>
>
>KV

You can, of course, bank switch memory by using a shared segment and cloning
additional process heaps.  Obviously a _single_ 32bit address space can only
access 4GB at a time.

--Jonathan--





Re: large memory support for x86

2000-10-12 Thread Tigran Aivazian

On Wed, 11 Oct 2000, Kiril Vidimce wrote:

> 
> Hi there,
> 
> I am trying to find out more information on large memory support (> 4 GB) 
> for Linux IA32. Is there a document that elaborates on what is supported
> and what isn't and how this scheme actually works in the kernel?
> 
> My primary concern is whether a process can allocate more than 4 GB of 
> memory, rather than just be able to use more than 4 GB of physical 
> memory in the system.
> 
> Also, I see that the highmem support is just an option in the kernel. 
> Does this mean that there is a significant performance penalty in using 
> this extension?

Linux does support 64G of physical memory. My machine has 6G RAM and runs
absolutely nice and smooth as it should. Everything "just works" (ok, not
counting a few subtle cache-related issues and stability of PCI subsystem
but those happened many hours ago and were quickly fixed so they are
history now and therefore long forgotten ;).

As for PAE, yes, it does incur a performance penalty of about 3-6%
"overall". By overall performance I mean the unixbench numbers. I
published numbers comparing PAE/4G/nohighmem kernels on the same
machine some time ago...

So, it only makes sense to enable PAE if you have more than 4G of memory.

Regards,
Tigran




Re: large memory support for x86

2000-10-12 Thread Kiril Vidimce

On Thu, 12 Oct 2000, Oliver Xymoron wrote:
> On Wed, 11 Oct 2000, Kiril Vidimce wrote:
> 
> > My primary concern is whether a process can allocate more than 4 GB of 
> > memory, rather than just be able to use more than 4 GB of physical 
> > memory in the system.
> 
> Define allocate. There are tricks you can play, but userspace is still a
> flat 32-bit address space per-process.

Allocate = malloc(). The process needs to be able to operate on >4 GB
chunks of memory. I understand that it's only a 32 bit address space
which is why I was surprised when I read that Linux 2.4.x will support
upwards of 64 GB's of memory.

Thanks for all the responses.

KV
--
  ___
  Studio Tools[EMAIL PROTECTED]
  Pixar Animation Studioshttp://www.pixar.com/




Re: large memory support for x86

2000-10-12 Thread Richard B. Johnson

On Thu, 12 Oct 2000, Oliver Xymoron wrote:

> On Wed, 11 Oct 2000, Kiril Vidimce wrote:
> 
> > My primary concern is whether a process can allocate more than 4 GB of 
> > memory, rather than just be able to use more than 4 GB of physical 
> > memory in the system.
> 
> Define allocate. There are tricks you can play, but userspace is still a
> flat 32-bit address space per-process.
>  

--- per process. Which means, in principle, that one could have 100
processes accessing a total of 400 GB of virtual memory.

It gets to be a bit less than that, though. Process virtual address
space doesn't start at 0:
Script started on Thu Oct 12 13:25:45 2000
# cat xxx.c
main()
{
printf("main() starts at %p\n", main);
}
# gcc -o xxx xxx.c
# ./xxx
main() starts at 0x8048488
# exit
exit
Script done on Thu Oct 12 13:26:08 2000



Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: large memory support for x86

2000-10-12 Thread Oliver Xymoron

On Wed, 11 Oct 2000, Kiril Vidimce wrote:

> My primary concern is whether a process can allocate more than 4 GB of 
> memory, rather than just be able to use more than 4 GB of physical 
> memory in the system.

Define allocate. There are tricks you can play, but userspace is still a
flat 32-bit address space per-process.
 
> Also, I see that the highmem support is just an option in the kernel. 
> Does this mean that there is a significant performance penalty in using 
> this extension?

It doesn't come for free, but it's almost certainly a win for anyone who
has that much memory.

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.." 




Re: large memory support for x86

2000-10-12 Thread Oliver Xymoron

On Wed, 11 Oct 2000, Kiril Vidimce wrote:

 My primary concern is whether a process can allocate more than 4 GB of 
 memory, rather than just be able to use more than 4 GB of physical 
 memory in the system.

Define allocate. There are tricks you can play, but userspace is still a
flat 32-bit address space per-process.
 
 Also, I see that the highmem support is just an option in the kernel. 
 Does this mean that there is a significant performance penalty in using 
 this extension?

It doesn't come for free, but it's almost certainly a win for anyone who
has that much memory.

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.." 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-12 Thread Richard B. Johnson

On Thu, 12 Oct 2000, Oliver Xymoron wrote:

 On Wed, 11 Oct 2000, Kiril Vidimce wrote:
 
  My primary concern is whether a process can allocate more than 4 GB of 
  memory, rather than just be able to use more than 4 GB of physical 
  memory in the system.
 
 Define allocate. There are tricks you can play, but userspace is still a
 flat 32-bit address space per-process.
  

--- per process. Which means, in principle, that one could have 100
processes that are accessing a total of 400 Gb of virtual memory.

It gets to be a bit less than that, though. Process virtual address
space doesn't start at 0

Script started on Thu Oct 12 13:25:45 2000
cat xxx.c
main()
{
printf("main() starts at %p\n", main);
}
# gcc -o xxx xxx.c
# ./xxx
main() starts at 0x8048488
# exit
exit
Script done on Thu Oct 12 13:26:08 2000



Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-12 Thread Kiril Vidimce

On Thu, 12 Oct 2000, Oliver Xymoron wrote:
 On Wed, 11 Oct 2000, Kiril Vidimce wrote:
 
  My primary concern is whether a process can allocate more than 4 GB of 
  memory, rather than just be able to use more than 4 GB of physical 
  memory in the system.
 
 Define allocate. There are tricks you can play, but userspace is still a
 flat 32-bit address space per-process.

Allocate = malloc(). The process needs to be able to operate on 4 GB
chunks of memory. I understand that it's only a 32 bit address space
which is why I was surprised when I read that Linux 2.4.x will support
upwards of 64 GB's of memory.

Thanks for all the responses.

KV
--
  ___
  Studio Tools[EMAIL PROTECTED]
  Pixar Animation Studioshttp://www.pixar.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: large memory support for x86

2000-10-12 Thread Tigran Aivazian

On Wed, 11 Oct 2000, Kiril Vidimce wrote:

 
 Hi there,
 
 I am trying to find out more information on large memory support ( 4 GB) 
 for Linux IA32. Is there a document that elaborates on what is supported
 and what isn't and how this scheme actually works in the kernel?
 
 My primary concern is whether a process can allocate more than 4 GB of 
 memory, rather than just be able to use more than 4 GB of physical 
 memory in the system.
 
 Also, I see that the highmem support is just an option in the kernel. 
 Does this mean that there is a significant performance penalty in using 
 this extension?

Linux does support 64G of physical memory. My machine has 6G RAM and runs
absolutely nice and smooth as it should. Everything "just works" (ok, not
counting a few subtle cache-related issues and stability of PCI subsystem
but those happened many hours ago and were quickly fixed so they are
history now and therefore long forgotten ;).

As for PAE, yes, it does incur a performance penalty of about 3-6%
overall. By overall performance I mean the unixbench numbers. I
published numbers comparing PAE/4G/nohighmem kernels on the same
machine some time ago...

So, it only makes sense to enable PAE if you have more than 4G of memory.

Regards,
Tigran




Re: large memory support for x86

2000-10-12 Thread Jonathan George

On Thu, 12 Oct 2000, Oliver Xymoron wrote: 
 On Wed, 11 Oct 2000, Kiril Vidimce wrote: 
 
  My primary concern is whether a process can allocate more than 4 GB of 
  memory, rather than just be able to use more than 4 GB of physical 
  memory in the system. 
 
 Define allocate. There are tricks you can play, but userspace is still a 
 flat 32-bit address space per-process.


Allocate = malloc(). The process needs to be able to operate on 4 GB 
chunks of memory. I understand that it's only a 32 bit address space 
which is why I was surprised when I read that Linux 2.4.x will support 
upwards of 64 GB's of memory. 


Thanks for all the responses. 


KV

You can, of course, bank switch memory by using a shared segment and cloning
additional process heaps.  Obviously a _single_ 32bit address space can only
access 4GB at a time.

--Jonathan--





Re: large memory support for x86

2000-10-12 Thread Jeff Epler

On Thu, Oct 12, 2000 at 10:36:38AM -0700, Kiril Vidimce wrote:
 Allocate = malloc(). The process needs to be able to operate on 4 GB
 chunks of memory. I understand that it's only a 32 bit address space
 which is why I was surprised when I read that Linux 2.4.x will support
 upwards of 64 GB's of memory.
 
 Thanks for all the responses.

Pointers are still 32 bits on x86, and the visible address space for
any particular process is still somewhat less than 4G.

I believe that with Linux on Alpha you can have more than 4G per process,
but that may or may not be true.

What the support for >4G of memory on x86 is about is "PAE", Page Address
Extension, supported on the P6 generation of machines, as well as on Athlons
(I think).  With it, the kernel can use >4G of memory, but it still can't
present a >32bit address space to user processes.  But you could have 8G
physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.

There may or may not be some way to support an abomination like the old "far"
pointers in DOS (multiple 4G segments), but I don't think it has been written
yet.

Jeff



Re: large memory support for x86

2000-10-12 Thread Timur Tabi

** Reply to message from Jeff Epler [EMAIL PROTECTED] on Thu, 12 Oct 2000
13:08:19 -0500


 What the support for >4G of memory on x86 is about is "PAE", Page Address
 Extension, supported on the P6 generation of machines, as well as on Athlons
 (I think).  With it, the kernel can use >4G of memory, but it still can't
 present a >32bit address space to user processes.  But you could have 8G
 physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.

How about the kernel itself?  How do I access the memory above 4GB inside a
device driver?

 There may or may not be some way to support an abomination like the old "far"
 pointers in DOS (multiple 4G segments), but I don't think it has been written
 yet.

Yes, it's ugly, but it works and it's compatible.  Well, compatible with 32-bit
code, probably not compatible with Linux code overall.

Of course, you could define a pointer to be a 48-bit value, but I doubt that
would really work.






-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



Re: large memory support for x86

2000-10-12 Thread Richard B. Johnson

On Thu, 12 Oct 2000, Timur Tabi wrote:

 ** Reply to message from Jeff Epler [EMAIL PROTECTED] on Thu, 12 Oct 2000
 13:08:19 -0500
 
 
  What the support for >4G of memory on x86 is about is "PAE", Page Address
  Extension, supported on the P6 generation of machines, as well as on Athlons
  (I think).  With it, the kernel can use >4G of memory, but it still can't
  present a >32bit address space to user processes.  But you could have 8G
  physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.
 
 How about the kernel itself?  How do I access the memory above 4GB inside a
 device driver?
 
  There may or may not be some way to support an abomination like the old "far"
  pointers in DOS (multiple 4G segments), but I don't think it has been written
  yet.
 
 Yes, it's ugly, but it works and it's compatible.  Well, compatible with 32-bit
 code, probably not compatible with Linux code overall.
 
 Of course, you could define a pointer to be a 48-bit value, but I doubt that
 would really work.
 

On ix86 processors, the kernel can create multiple segments
and multiple page-tables. If you have some 'hardware-specific' way
of mapping in real RAM that is not used by anybody else, you can have
your own private segment. Page registers can map in multiple sticks
of RAM into a single window. The page-fault handler can manipulate
the page registers for user-mode code. It's being done in PAE.


Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





RE: large memory support for x86

2000-10-12 Thread Chris Swiedler

  Am I reading this correctly--the address of the main() function for a
  process is guaranteed to be the lowest possible virtual address?
 
  chris
 

 It is one of the lowest. The 'C' runtime library puts section
 .text (the code) first, then .data, then .bss, then .stack.  The
 .stack section is co-located with the heap which can be extended
 by setting a new break address.

 When a process is created, the lowest address is the entry point of
 crt0.o's _init. We can see where that is by:

 Script started on Thu Oct 12 14:25:35 2000
 # cat xxx.c

 extern int _init();
 main()
 {
 printf("_init is at %p\n", _init);
 }

 # gcc -o xxx xxx.c
 # ./xxx
 _init is at 0x804838c
 # exit
 exit
 Script done on Thu Oct 12 14:25:51 2000

 That said, remember that in Unix, the 'C' runtime library exists in the
 lower portion of the .text section. So your code's virtual address space
 starts above that address space. This is MMAPed so everybody gets
 to share the same pages. In this way, you don't all have to keep a
 private copy of the 'C' runtime library.

User-process virtual addresses have no direct relation to physical
addresses, right? So why does the process space start at such a high virtual
address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
RAM. Not a huge amount compared to 4G, but significant. Is that space used
(libc can't be that big!) or reserved somehow?

Another question: how (and where in the code) do we translate virtual
user-addresses to physical addresses? Does the MMU do it, or does it call a
kernel handler function? Why is the kernel allowed to reference physical
addresses, while user processes go through the translation step? Can kernel
pages be swapped out / faulted in just like user process pages?

Sorry to pounce on you with all of these questions. I've read up on this
stuff but can't always find answers...

thanks--
chris




Re: large memory support for x86

2000-10-12 Thread Timur Tabi

** Reply to message from "Richard B. Johnson" [EMAIL PROTECTED] on Thu,
12 Oct 2000 15:17:15 -0400 (EDT)


 With ix86 processors in the kernel, you can create multiple segments
 and multiple page-tables.

Does the kernel provide services for this, or will I have to hack up the x86
page tables myself?  If there are kernel services, what are they?

 If you have some 'hardware-specific' way
 of mapping in real RAM that is not used by anybody else, you can have
 your own private segment.

But is this part of the established Linux 4GB support?  I don't think so.

 Page registers can map in multiple sticks
 of RAM into a single window. The page-fault handler can manipulate
 the page registers for user-mode code. It's being done in PAE.

But how does one go about doing all this?



-- 
Timur Tabi - [EMAIL PROTECTED]
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just 
get two copies of the same message.



Re: large memory support for x86

2000-10-12 Thread Malcolm Beattie

Timur Tabi writes:
 ** Reply to message from Jeff Epler [EMAIL PROTECTED] on Thu, 12 Oct 2000
 13:08:19 -0500
  What the support for >4G of memory on x86 is about is "PAE", Page Address
  Extension, supported on the P6 generation of machines, as well as on Athlons
  (I think).  With it, the kernel can use >4G of memory, but it still can't
  present a >32bit address space to user processes.  But you could have 8G
  physical RAM and run 4 ~2G or 2 ~4G processes simultaneously in core.
 
 How about the kernel itself?  How do I access the memory above 4GB inside a
 device driver?

It depends on what you have already. If you're given a (kernel)
virtual address, just dereference it. The unit of currency for
physical pages is the "struct page". If you want to allocate a
physical page for your own use (from anywhere in physical memory)
then you do

struct page *page = alloc_page(GFP_FOO);

If you want to read/write to that page directly from kernel space
then you need to map it into kernel space:

char *va = kmap(page);
/* read/write from the page starting at virtual address va */
kunmap(page);

The implementations of kmap and kunmap are such that mappings are
cached (within reason), so kmap/kunmap are reasonably fast.
If you want to do something else with the page (like get some I/O
done to/from it) then the new (and forthcoming) kiobuf functions
take struct page units and handle all the internal mapping gubbins
without you having to worry about it.

--Malcolm

-- 
Malcolm Beattie [EMAIL PROTECTED]
Unix Systems Programmer
Oxford University Computing Services



Re: large memory support for x86

2000-10-12 Thread Ingo Molnar


On Thu, 12 Oct 2000, Timur Tabi wrote:

 Of course, you could define a pointer to be a 48-bit value, but I
 doubt that would really work.

no, x86 virtual memory is 32 bits - segmentation only provides a way to
segment this 4GB virtual memory, but cannot extend it. Under Linux there
is 3GB virtual memory available to user-space processes.

this 3GB virtual memory does not have to be mapped to the same physical
pages all the time - and this is nothing new. mmap()/munmap()-ing memory
dynamically is one way to 'extend' the amount of physical RAM controlled
by a single process. I doubt this would be very economical though.

Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
utilize up to 24 GB RAM at once, without having to play mmap/munmap
'memory extender' tricks.

Ingo




Re: large memory support for x86

2000-10-12 Thread Dan Maas

The memory map of a user process on x86 looks like this:

-
KERNEL (always present here)
0xC0000000
-
0xBFFFFFFF
STACK
-
MAPPED FILES (incl. shared libs)
0x40000000
-
HEAP (brk()/malloc())
EXECUTABLE CODE
0x08048000
-

Try examining /proc/*/maps, and also watch your programs call brk() using
strace; you'll see all this in action...

 So why does the process space start at such a high virtual
 address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
 RAM. Not a huge amount compared to 4G, but significant.

I don't know; anyone care to comment?

 Another question: how (and where in the code) do we translate virtual
 user-addresses to physical addresses?

In hardware, with the TLB and, if the TLB misses, then page tables.

 Does the MMU do it, or does it call a
 kernel handler function?

Only when an attempt is made to access an unmapped or protected page; then
you get an interrupt (page fault), which the kernel code handles.

 Why is the kernel allowed to reference physical
 addresses, while user processes go through the translation step?

Not even the kernel accesses physical memory directly. It can, however,
choose to map the physical memory into its own address space contiguously.
Linux puts it at 0xC0000000 and up. (question for the gurus- what happens on
machines with >1GB of RAM?)

 Can kernel
 pages be swapped out / faulted in just like user process pages?

Linux does not swap kernel memory; the kernel is so small it's not worth the
trouble (are there other reasons?). e.g. My Linux boxes run 1-2MB of kernel
code; my NT machine is running 6MB at the moment...

Dan




Re: large memory support for x86

2000-10-12 Thread Andi Kleen

On Thu, Oct 12, 2000 at 07:19:32PM -0400, Dan Maas wrote:
 The memory map of a user process on x86 looks like this:
 
 -
 KERNEL (always present here)
 0xC0000000
 -
 0xBFFFFFFF
 STACK
 -
 MAPPED FILES (incl. shared libs)
 0x40000000
 -
 HEAP (brk()/malloc())
 EXECUTABLE CODE
 0x08048000
 -
 
 Try examining /proc/*/maps, and also watch your programs call brk() using
 strace; you'll see all this in action...
 
  So why does the process space start at such a high virtual
  address (why not closer to 0x00000000)? Seems we're wasting ~128 megs of
  RAM. Not a huge amount compared to 4G, but significant.
 
 I don't know; anyone care to comment?

Apparently, to catch NULL pointer references made with array indices
(int *p = NULL; p[5000]).

I agree that it is a very wasteful use of precious virtual memory.

  Can kernel
  pages be swapped out / faulted in just like user process pages?
 
 Linux does not swap kernel memory; the kernel is so small it's not worth the
 trouble (are there other reasons?). e.g. My Linux boxes run 1-2MB of kernel
 code; my NT machine is running 6MB at the moment...

Actually most Linux boxes do, but in the older sense of swapping that
predates virtual memory (overlaying, in DOS terms): they run a cron job
that expires modules with a usage count of 0 (in 2.0, kerneld did the
same). It is a rather dangerous thing, though; module unloading tends to
be one of the most race-prone and least-tested parts of the kernel. I
usually recommend turning it off on any production machine. In 2.4, with
the new fine-grained SMP locking, it is much more dangerous, nearly
impossible to solve properly.



-Andi