Roland Mainz wrote:
> This could be very unfortunate, since it limits future development (at least for Solaris... other operating systems like Linux are likely not affected, right?). The decision may be acceptable today, but in twenty years it _may_ become a real problem, assuming the "optimum" page size continues to grow as it has over the last twenty years (remember that the VAX had a page size of 512 bytes, 4k/8k are very common now, and a larger default page size may become useful in the future). Fixing the default page size in the ABI at 8k could be disastrous then.
It's not baked into the ABI; I was simply pointing out that it might as well be at the moment as far as our ability to change it goes. ;)
We still have the freedom to change this down the road; it's just a matter of deciding that the benefits outweigh the loss of backward compatibility with some apps which assume protections are 8K-granular. With regard to DaveM's remarks, I agree completely that it's best to have as few constraints as possible in the design of any system, and that this is a constraint we don't like to have.
Aside from the apps that broke, the next largest drawback seen with the 64K prototype kernels was that when you have only 64K pages, touching 8K in the middle of an mmap() region means you end up writing back 64K to the backing store. If you do this often enough (which some apps do), you see a performance degradation. The same factor comes into play with paging, since you lose track of your working set; TANSTAAFL. And overall, the performance gains weren't always stellar, due to the reduction in page coloring on systems with direct-mapped or 2-way associative external caches.
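To make the write-back cost concrete, here is a minimal arithmetic sketch in C. It assumes dirty state is tracked per page, so a store dirties every page it overlaps and each dirty page is written back whole. The helper names (page_floor, page_ceil, writeback_bytes) are hypothetical and are not part of any Solaris interface.

```c
#include <assert.h>
#include <stddef.h>

/* Round an offset down to a page boundary.
 * Assumption: page_size is a nonzero power of two. */
static size_t page_floor(size_t off, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return off & ~(page_size - 1);
}

/* Round an offset up to a page boundary (no overflow handling; sketch only). */
static size_t page_ceil(size_t off, size_t page_size)
{
    return page_floor(off + page_size - 1, page_size);
}

/* Bytes written back to the backing store when [off, off + len) is dirtied,
 * assuming whole pages are the unit of dirty tracking and write-back. */
static size_t writeback_bytes(size_t off, size_t len, size_t page_size)
{
    if (len == 0)
        return 0;
    return page_ceil(off + len, page_size) - page_floor(off, page_size);
}
```

Under those assumptions, writeback_bytes(24 * 1024, 8 * 1024, 8 * 1024) gives 8192, while the same 8K touch with a 64K page size gives 65536: eight times the I/O for the same store.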
As a result of the lessons learned from the 64K project, I believe the ability to do 8K-protection-granular mappings to files at the user level is probably not a bad thing to keep in our pocket going forward, although it is in fairness an undue constraint. I believe that if we depart from the conventional wisdom of physical memory management in the kernel and make sure the physical and virtual page sizes are correctly decoupled in the VM system (a problem I've been looking at for a while now -- it ain't easy), this is not a difficult constraint to retain.
Also note that keeping support for 8K user mappings doesn't necessarily mean we need to constrain ourselves to using 8K everywhere in the VM, or even to keeping PAGESIZE at 8K (excepting some tricky issues with MAXBSIZE and filesystems); only that we should (at least for the moment) keep supporting 8K pages for user mappings. We have discussed in the past, and I am still considering, changing the basic currency of physical memory in the kernel to be considerably larger than 8K, and sucking it up when it comes to fragmentation loss, since segments on average are relatively large. The nice side effect would be that every 64K-aligned, 64K-sized span would automatically be promotable to a 64K mapping, should we choose, without the page relocation and renaming overhead that MPSS has to go through today.
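As a rough illustration of which spans that would cover, the sketch below counts the 64K-aligned, 64K-sized chunks that fit entirely inside a virtual address range. It is address arithmetic only and says nothing about how the kernel would actually track or promote mappings; promotable_spans and LARGE_PAGE are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LARGE_PAGE ((size_t)64 * 1024)   /* assumed large page size: 64K */

/* Count the LARGE_PAGE-aligned, LARGE_PAGE-sized spans that lie entirely
 * within the virtual range [start, start + len). */
static size_t promotable_spans(uintptr_t start, size_t len)
{
    uintptr_t mask = (uintptr_t)LARGE_PAGE - 1;
    uintptr_t end, first;

    assert(len <= UINTPTR_MAX - start);     /* the range must not wrap */
    end = start + len;

    if (start > UINTPTR_MAX - mask)         /* rounding up would wrap */
        return 0;
    first = (start + mask) & ~mask;         /* first 64K boundary >= start */

    if (first >= end)
        return 0;
    return (size_t)((end - first) / LARGE_PAGE);
}
```

For example, a 192K segment starting at 0x10000 contains three such spans, while the same length starting at 0x12000 contains only two, because the unaligned head and tail cannot be covered by a 64K mapping.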
> BTW: The discussion was about a _tuneable_ which could be set to a value used as the default page size (used by the kernel and returned by |getpagesize()| & co.) - the default for this tuneable should remain 8k.
The disadvantage of doing this is that it would bring back from the dead the kind of dual testing scenarios (as with 32-bit/64-bit) that caused us nightmares before we EOL'ed 32-bit SPARC kernel support. This was a problem not just for Sun; it also made life more difficult for our ISVs. If the gains were on average double-digit, we might be able to justify such a move, but the typical application gains were a lot more modest, and in most cases we can approach the maximum gains simply by instituting more aggressive automatic MPSS policies. I think a fixed page size per architecture would be better received.
> It would allow people to switch to 64k pages on demand and even allow them to return to the 8k size if something breaks (i.e. setting the tunable is not mandatory). Additionally, a shared library similar to /usr/lib/[EMAIL PROTECTED] could be provided to switch to the old page size if individual userland applications cause trouble. And there would be a way to fix all those broken applications out there. Without such a tuneable it is almost impossible to fix the applications, which makes the situation even worse (and may backfire at some point in the future).
If we get to the point where we can support sysconf(_SC_PAGESIZE) of 8K or 64K depending on which library you have in your LD_LIBRARY_PATH and manage memory underneath with 64K this would be a great thing; this is a fine long-term goal. It will take a lot of work to get the system into that kind of shape though.
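For what it's worth, an application stays out of trouble in either world if it asks for the page size at run time instead of hard-coding 8192. A minimal sketch, assuming a POSIX system where sysconf(_SC_PAGESIZE) is available; the helper names runtime_page_size and round_up_to_page are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Ask the system for the page size rather than assuming 8K. */
static size_t runtime_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);

    assert(ps > 0);                 /* sysconf returns -1 on failure */
    return (size_t)ps;
}

/* Round a length up to a whole number of pages, e.g. before handing it
 * to mmap() or mprotect(). No overflow handling; sketch only. */
static size_t round_up_to_page(size_t len)
{
    size_t ps = runtime_page_size();

    return (len + ps - 1) / ps * ps;
}
```

Code written this way gives page-multiple lengths whether the reported size is 8K or 64K, so it would not care which value a given library or kernel chose to report.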
- Eric
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org