I'm sponsoring the following case for Mike Corcoran. Time out 04/07/08.
The case introduces a new system call, mmapfd(2). This call is primarily
targeted for use by ld.so.1(1), and provides for the efficient mapping of
ELF files (and 4.x AOUT files).
Release Binding: Patch/Micro
mmapfd(2): Consolidation Private
--------------------------------------------------------------------------
1. Introduction
1.1. Project/Component Working Name:
mmapfd: mmap file descriptor
1.2. Name of Document Author/Supplier:
Michael Corcoran
1.3. Date of This Document:
03/24/08
1.4. Name of Major Document Customer(s)/Consumer(s):
1.4.1. The PAC or CPT you expect to review your project:
Solaris PAC
1.4.2. The ARC(s) you expect to review your project:
PSARC
1.4.3. The Director/VP who is "Sponsoring" this project:
William.Franklin at sun.com
1.4.4. The name of your business unit:
Software
1.5. Email Aliases:
1.5.1. Responsible Manager: Darrin.Johnson at sun.com
1.5.2. Responsible Engineer: Michael.Corcoran at sun.com
2. Project Summary
2.1. Project Description:
mmapfd is a new system call targeted for use in mapping files that
need to be interpreted. The runtime linker (ld.so.1) will make
use of this system call to map dynamic objects. Both ELF and AOUT
(4.x) file formats are supported. Under the covers, the OS can
optimize the placement of these interpreted dynamic objects.
4. Technical Description:
4.1. Details:
mmapfd is a new system call which can interpret and map ELF and AOUT
(4.x) objects. This system call allows the interpretation and mapping
of ELF and AOUT files to be carried out completely by the kernel rather
than by ld.so.1.
mmapfd can also map a whole file, without interpretation, as a single
read-only mapping.
mmapfd returns a description of the mappings that have been used to
represent the associated file. This data is used by ld.so.1 to continue
processing the file: searching for dependencies and symbols, and
performing relocations. The data allows individual operations, e.g.
munmap(2) or mprotect(2), to be carried out on each mapping.
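As a rough illustration of the per-mapping operations mentioned above, the
following sketch walks an array of results and applies mprotect(2) and
munmap(2) to each entry. The structure layout is the one given in the man
page later in this case, with caddr_t and uint_t spelled as plain C types.
The helper names (scratch_mapping, drop_write_access, release_mappings) are
hypothetical and are not part of the proposal; scratch_mapping only stands
in for a mapping that mmapfd would have created.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose MAP_ANONYMOUS in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Layout from the mmapfd(2) man page in this case. */
typedef struct {
    char        *mr_addr;    /* mapping address (caddr_t in the proposal) */
    size_t       mr_msize;   /* mapping size */
    size_t       mr_fsize;   /* file size */
    size_t       mr_offset;  /* offset into file */
    int          mr_prot;    /* the protections provided */
    unsigned int mr_flags;   /* info on the mapping (uint_t) */
} mmapfd_result_t;

/* Stand-in for one mapping returned by mmapfd: an anonymous region. */
int
scratch_mapping(mmapfd_result_t *r, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return (-1);
    r->mr_addr = p;
    r->mr_msize = size;
    r->mr_fsize = 0;
    r->mr_offset = 0;
    r->mr_prot = PROT_READ | PROT_WRITE;
    r->mr_flags = 0;
    return (0);
}

/* Remove write permission from every writable mapping; 0 or -1. */
int
drop_write_access(mmapfd_result_t *res, unsigned int n)
{
    assert(res != NULL || n == 0);
    for (unsigned int i = 0; i < n; i++) {
        int prot = res[i].mr_prot & ~PROT_WRITE;
        if (prot == res[i].mr_prot)
            continue;
        if (mprotect(res[i].mr_addr, res[i].mr_msize, prot) != 0)
            return (-1);
        res[i].mr_prot = prot;
    }
    return (0);
}

/* Unmap every mapping; returns the number of munmap(2) failures. */
int
release_mappings(const mmapfd_result_t *res, unsigned int n)
{
    int failed = 0;
    assert(res != NULL || n == 0);
    for (unsigned int i = 0; i < n; i++) {
        if (munmap(res[i].mr_addr, res[i].mr_msize) != 0)
            failed++;
    }
    return (failed);
}
```

Because each entry carries its own address, size and protections, a caller
can treat the mappings independently and needs no knowledge of the file
format that produced them.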
All of the mapping capabilities that ld.so.1 provides today, such as the
addition of padding (for dbx) and fixed address object mapping, are
available with mmapfd.
In the past, there have been many requests for ld.so.1 to handle
different platforms in different manners by passing new flags or by
trying to have ld.so.1 discover what to do for a given platform.
By creating a single mmapfd interface, these special flags and
behaviors can be removed from ld.so.1, resulting in a much cleaner
ld.so.1 and more flexibility for different platforms to perform
optimizations.
For example, one optimization is for the kernel to use large pages where
applicable. Another optimization might be to use the same virtual
address for the same object among different processes. These
optimizations can be achieved more easily because the kernel will now
interpret the program headers of the associated file, and thus can
deduce segment size and segment alignment requirements, and because the
kernel knows which files are used throughout the system.
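To make the large page example concrete, here is one way a page size could
be chosen from a segment's size and alignment. This is only a sketch of a
possible policy, not the policy the kernel will use; the function name
pick_page_size and the rule itself (largest supported size that divides the
segment alignment and does not exceed the segment size) are assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical policy: return the largest entry of `sizes` that the
 * segment alignment permits and the segment is big enough to fill.
 * `sizes` must be sorted ascending; sizes[0] is the base page size and
 * is the fallback when no larger size qualifies.
 */
size_t
pick_page_size(uint64_t p_align, uint64_t p_memsz,
    const size_t *sizes, size_t nsizes)
{
    assert(sizes != NULL && nsizes > 0);
    size_t best = sizes[0];

    for (size_t i = 1; i < nsizes; i++) {
        if (p_align != 0 && p_align % sizes[i] == 0 &&
            p_memsz >= sizes[i])
            best = sizes[i];
    }
    return (best);
}
```

With page sizes of 8K, 64K, 512K and 4M, a 5M segment aligned to 4M would
qualify for 4M pages under this rule, while the same segment aligned to
64K would only qualify for 64K pages.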
Future projects that can build on the new mmapfd system call
include:
- Having elfexec and mmapfd share a common set of routines to do
all mapping of ELF files. This will provide a centralized location
for interpreting all ELF objects which will be easier to support
and provide a consistent behavior for all applications.
- Having the kernel do hardware capability checking while mapping in
an ELF file, thus relieving ld.so.1 from having to do this work.
Once again, elfexec and mmapfd can share common interfaces so that
code is not duplicated and behavior is consistent.
- Allow DTrace better access to ELF information. DTrace engineers have
been looking to access DTrace information from within an ELF file
within critical regions. This DTrace information can be gathered as
an extension to the ELF file processing already undertaken by mmapfd.
- Allow new file types to be interpreted in the future via this
central interface.
4.2. Bug/RFE Number(s):
6502792: Same dynamic libraries should be mapped at the same
virtual addresses in different processes
6561987 data vac_conflict faults on lipthread libthread libs in s10
5. Reference Documents:
Linker and Libraries Guide
http://docs.sfbay.sun.com/app/docs/doc/819-0690
CR 6502792 Same dynamic libraries should be mapped at the same
virtual addresses in different processes
http://monaco.sfbay/detail.jsf?cr=6502792
Originally a request to add new linker support for shared
contexts. Contains extensive background on why this was desired.
CR 6561987 data vac_conflict faults on lipthread libthread libs in s10.
http://monaco.sfbay/detail.jsf?cr=6561987
mmapfd can reduce vac_conflict faults because libraries can be
mapped at the same virtual color throughout the kernel.
6. Resources and Schedule:
6.4. Product Approval Committee requested information:
6.4.1. Consolidation or Component Name:
ON
6.5. ARC review type:
FastTrack
6.6. ARC Exposure:
open
--------------------------------------------------------------------------
System Calls mmapfd(2)
NAME
mmapfd - map a file descriptor in the appropriate manner.
SYNOPSIS
#include <sys/mman.h>
int
mmapfd(int fd, uint_t flags, mmapfd_result_t *storage,
uint_t *elements, void *arg)
DESCRIPTION
The mmapfd() function establishes a set of mappings between a process's
address space and a file. By default, mmapfd maps the whole file as a
single, private, read-only mapping. The MMFD_INTERPRET flag instructs
mmapfd to attempt to interpret the file and map it according to the rules
for that file format. Currently only the following ELF and AOUT formats
are supported:
ET_EXEC and AOUT executables
Result in one or more mappings whose size, alignment and protections
are as described by the file's program header information. The address
of each mapping is explicitly defined by the file's program headers.
ET_DYN and AOUT shared objects
Result in one or more mappings whose size, alignment and protections
are as described by the file's program header information. The base
address of the initial mapping is obtained by mmapfd(). The addresses of
adjacent mappings are based on this base address as explicitly
defined by the file's program headers.
ET_REL and ET_CORE
Result in a single, read-only mapping. The base address of this
mapping is obtained by mmapfd().
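The ET_DYN placement rule above amounts to simple arithmetic: once a base
has been chosen for the first mapping, every later mapping sits at the same
distance from that base as its program header virtual address sits from the
first one. The sketch below assumes page-truncated addresses and a
power-of-two page size; the function names are hypothetical and this is not
taken from the implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of pagesz (a power of two). */
static uint64_t
page_trunc(uint64_t addr, uint64_t pagesz)
{
    return (addr & ~(pagesz - 1));
}

/*
 * Hypothetical helper: `base` is the address chosen for the first
 * loadable segment, `first_vaddr` is that segment's p_vaddr, and
 * `vaddr` is the p_vaddr of the segment being placed.
 */
uint64_t
segment_address(uint64_t base, uint64_t first_vaddr, uint64_t vaddr,
    uint64_t pagesz)
{
    assert(pagesz != 0 && (pagesz & (pagesz - 1)) == 0);
    assert(vaddr >= first_vaddr);

    /* Distance between the chosen base and the link-time address. */
    uint64_t bias = base - page_trunc(first_vaddr, pagesz);

    return (page_trunc(vaddr, pagesz) + bias);
}
```

For an ET_EXEC object the same calculation applies with a bias of zero,
since the program headers give the final addresses directly.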
mmapfd will not map over any currently used mappings within the process
except for the case of an ELF file for which a previous reservation has been
made via /dev/null.
PARAMETERS
fd The open file descriptor for the file to be mapped.
flags Modifies the default behavior of mmapfd. Available flags are
MMFD_INTERPRET and MMFD_PADDING.
storage
A pointer to the mmapfd_result_t array where the mapping
data will be copied out after a successful mapping of fd.
elements
A pointer to the number of mmapfd_result_t elements pointed to by
storage. On return, elements contains the number of mappings required
to fully map the requested object. If the original value of
elements was too small, an error will be returned, and elements
will be modified to contain the number of mappings necessary.
arg A pointer to additional information that might be associated with the
specific request. Presently, only the MMFD_PADDING request uses this
argument. In this case, arg should be a pointer to a size_t that
indicates how much padding is requested. This amount of padding is
added before the first mapping and immediately after the last mapping.
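The elements protocol described above suggests a two-step calling pattern:
try with a guessed array size, and if the call fails with E2BIG, retry with
the size reported back through elements. The sketch below shows that pattern
under stated assumptions. Since mmapfd is not available to compile against,
the call goes through a function pointer with the SYNOPSIS signature, and
fake_mmapfd is a stand-in that always needs three mappings. Both it and
map_object are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char        *mr_addr;    /* caddr_t in the proposal */
    size_t       mr_msize;
    size_t       mr_fsize;
    size_t       mr_offset;
    int          mr_prot;
    unsigned int mr_flags;   /* uint_t in the proposal */
} mmapfd_result_t;

/* Signature from the SYNOPSIS, with uint_t spelled as unsigned int. */
typedef int (*mmapfd_fn)(int, unsigned int, mmapfd_result_t *,
    unsigned int *, void *);

/* Stand-in only: every object needs three mappings. */
int
fake_mmapfd(int fd, unsigned int flags, mmapfd_result_t *storage,
    unsigned int *elements, void *arg)
{
    const unsigned int needed = 3;

    (void) fd; (void) flags; (void) arg;
    if (*elements < needed) {
        *elements = needed;      /* report the required count */
        errno = E2BIG;
        return (-1);
    }
    memset(storage, 0, needed * sizeof (*storage));
    *elements = needed;
    return (0);
}

/*
 * Call fn with `guess` slots; on E2BIG, grow to the reported size and
 * retry once. Returns a malloc'd array (caller frees) or NULL.
 */
mmapfd_result_t *
map_object(mmapfd_fn fn, int fd, unsigned int flags, void *arg,
    unsigned int guess, unsigned int *count)
{
    assert(fn != NULL && count != NULL && guess > 0);
    *count = guess;
    mmapfd_result_t *res = malloc(*count * sizeof (*res));
    if (res == NULL)
        return (NULL);
    if (fn(fd, flags, res, count, arg) == 0)
        return (res);
    if (errno != E2BIG || *count == 0) {
        int saved = errno;
        free(res);
        errno = saved;
        return (NULL);
    }
    /* *count now holds the number of mappings required. */
    mmapfd_result_t *bigger = realloc(res, *count * sizeof (*res));
    if (bigger == NULL) {
        free(res);
        errno = ENOMEM;
        return (NULL);
    }
    if (fn(fd, flags, bigger, count, arg) != 0) {
        int saved = errno;
        free(bigger);
        errno = saved;
        return (NULL);
    }
    return (bigger);
}
```

A caller that usually sees small objects can pass a small guess and pay for
the second call only when an object has more mappings than expected.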
FLAGS
MMFD_INTERPRET
Interpret the contents of the file descriptor instead of just mapping a
single image. Can only be used with ELF and AOUT files.
MMFD_PADDING
When mapping in the file descriptor, padding of the amount pointed to by
arg is requested before the lowest mapping and after the highest mapping.
TYPES USED
typedef struct {
caddr_t mr_addr; /* mapping address */
size_t mr_msize; /* mapping size */
size_t mr_fsize; /* file size */
size_t mr_offset; /* offset into file */
int mr_prot; /* the protections provided */
uint_t mr_flags; /* info on the mapping */
} mmapfd_result_t;
Values for mr_flags include:
MFD_ELF_HDR 0x1 /* the ELF header is mapped at mr_addr */
MFD_AOUT_HDR 0x2 /* the AOUT header is mapped at mr_addr */
MFD_PADDING 0x4 /* this mapping represents requested padding */
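A consumer such as ld.so.1 would typically scan the returned array using
mr_flags, for instance to find the mapping that starts with the file header
or to tell padding apart from file data. The following is a minimal sketch
of such a scan using the types and flag values listed above (caddr_t and
uint_t are spelled as plain C types). The helper names find_header and
padding_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char        *mr_addr;    /* mapping address (caddr_t) */
    size_t       mr_msize;   /* mapping size */
    size_t       mr_fsize;   /* file size */
    size_t       mr_offset;  /* offset into file */
    int          mr_prot;    /* the protections provided */
    unsigned int mr_flags;   /* info on the mapping (uint_t) */
} mmapfd_result_t;

/* mr_flags values from the man page. */
#define MFD_ELF_HDR   0x1
#define MFD_AOUT_HDR  0x2
#define MFD_PADDING   0x4

/* Return the entry whose mapping holds the ELF or AOUT header, or NULL. */
const mmapfd_result_t *
find_header(const mmapfd_result_t *res, unsigned int n)
{
    assert(res != NULL || n == 0);
    for (unsigned int i = 0; i < n; i++) {
        if (res[i].mr_flags & (MFD_ELF_HDR | MFD_AOUT_HDR))
            return (&res[i]);
    }
    return (NULL);
}

/* Total size of the entries that represent requested padding. */
size_t
padding_bytes(const mmapfd_result_t *res, unsigned int n)
{
    size_t total = 0;

    assert(res != NULL || n == 0);
    for (unsigned int i = 0; i < n; i++) {
        if (res[i].mr_flags & MFD_PADDING)
            total += res[i].mr_msize;
    }
    return (total);
}
```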
RETURN VALUES
On failure, -1 is returned and errno is set to indicate the error. No
data will be copied to storage.
On success, 0 will be returned and elements will contain how many
program headers were mapped for fd. The data for these elements will
be copied to storage such that the first <elements> members of the
storage array will contain valid mapping data.
ERROR VALUES
E2BIG elements was not large enough to hold the number of loadable
segments in fd. elements will be modified to contain the
number of segments required.
EBADF fd was not a valid open file descriptor.
EPERM fd was not open for reading.
EINVAL MMFD_INTERPRET was specified and fd is not a valid file type to
be interpreted.
MMFD_PADDING was specified and arg is NULL.
flags contains an invalid flag.
EACCES The file system containing the fd to be interpreted does not
provide for execute access.
ENOMEM Insufficient memory is available to hold the program headers.
EADDRINUSE
The mapping requirements overlap an object that is already used
by the process.
EFAULT storage or arg points to an invalid address.
ENOTSUP The current user data model does not match the fd to be
interpreted. Thus a 32-bit process that tried to use mmapfd
to interpret a 64-bit object would fail with ENOTSUP.
fd is an ELF file whose type cannot be interpreted.
SEE ALSO
ld.so.1(1), mmap(2), attributes(5)
Linker and Libraries Guide
--------------------------------------------------------------------------
--
Rod