> Date: Tue, 24 Jun 2014 15:53:20 -0700
> From: Matthew Dempsky <[email protected]>
> 
> On Tue, Jun 24, 2014 at 11:04:10AM -0700, Matthew Dempsky wrote:
> >   SIGBUS/BUS_ADRERR: Accessing a mapped page that exceeds the end of
> >   the underlying mapped file.
> 
> Generating SIGBUS for this case has proven controversial due to
> concern that this is Linux invented behavior and not compatible with
> Solaris, so I decided to collect some more background information on
> the subject.
> 
> - SunOS 4.1.3's mmap() manual specifies: "Any reference to addresses
> beyond the end of the object, however, will result in the delivery of
> a SIGBUS signal." This wording was relaxed to "SIGBUS or SIGSEGV" in
> SunOS 5.6 and remains in current manuals. (I'm not sure, but I suspect
> this may be to simply reflect that memory protection violations take
> priority over bounds checking.)

It makes sense that memory protection violations take priority over
bounds checking.

>   SunOS 4.1.3: 
> http://www.freebsd.org/cgi/man.cgi?query=mmap&sektion=2&manpath=SunOS+4.1.3
>   SunOS 5.6: 
> http://www.freebsd.org/cgi/man.cgi?query=mmap&sektion=2&manpath=SunOS+5.6
>   Solaris 11: http://docs.oracle.com/cd/E23824_01/html/821-1463/mmap-2.html
> 
> - Many other SVR-derived OSes similarly document SIGBUS in their
> mmap() manuals too:
> 
>   AIX: 
> http://www-01.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.basetechref/doc/basetrf1/mmap.htm?lang=en
>   HPUX: 
> http://h20566.www2.hp.com/portal/site/hpsc/template.BINARYPORTLET/public/kb/docDisplay/resource.process/?spf_p.tpst=kbDocDisplay_ws_BI&spf_p.rid_kbDocDisplay=docDisplayResURL&javax.portlet.begCacheTok=com.vignette.cachetoken&spf_p.rst_kbDocDisplay=wsrp-resourceState%3DdocId%253Demr_na-c02261243-2%257CdocLocale%253D&javax.portlet.endCacheTok=com.vignette.cachetoken
>   UnixWare: http://uw714doc.sco.com/en/man/html.2/mmap.2.html
> 
> - This behavior has been (awkwardly) specified for mmap() since SUSv2:
> "References within the address range starting at pa and continuing for
> len bytes to whole pages following the end of an object shall result
> in delivery of a SIGBUS signal." Later versions of POSIX have the same
> wording.
> 
>   SUSv2: http://pubs.opengroup.org/onlinepubs/007908799/xsh/mmap.html
>   POSIX.2001: 
> http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html
>   POSIX.2008: 
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html
> 
> - More generally, POSIX explains the SIGBUS/SIGSEGV distinction
> thusly: "When an object is mapped, various application accesses to the
> mapped region may result in signals. In this context, SIGBUS is used
> to indicate an error using the mapped object, and SIGSEGV is used to
> indicate a protection violation or misuse of an address." Specific
> examples are provided too:
> 
>   Memory Protection: 
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_08_03_03
>

Generating SIGBUS for access beyond the end of an object makes some
sense.  In this case there is a valid mapping; it's just that the
underlying physical memory pages aren't there.  It is not dissimilar
to having mapped a physical address that maps to say the PCI bus.  On
real hardware accessing such a mapping will lead to a failed bus
transaction for which the logical representation is a SIGBUS.  (On
PeeCee hardware you'll probably get back an all-ones bit-pattern).
From a hardware-oriented perspective, SIGSEGV is generated by the MMU
and SIGBUS is generated by the underlying hardware.

So I don't think the Sun engineers made a totally unreasonable
decision here.  Unfortunately the CSRG made a different decision when
they reimplemented mmap support in 4.3BSD-Reno.  Or perhaps things got
broken after that...

In my view, generating SIGBUS under these circumstances is a bit
unfortunate.  Currently, SIGBUS on OpenBSD is a very clear indication
of an alignment issue.  If we generated SIGBUS for access beyond the
end of an mmap'ed object, this would no longer be the case.  We'd
actually have to look at the siginfo, which isn't printed by the shell.

On the other hand, passing memory objects by fd is getting more
common.  Xorg recently modernized its shared memory interface
(MIT-SHM, aka XShm) to support mmap'ing file descriptors passed over
sockets.  And DRM is moving in the same direction to solve security
issues with access to graphics objects.  But this approach has a
downside.  A malicious client could pass an fd to the X server and
subsequently truncate it after the X server mapped it.  If the X
server accesses this mapping, it will crash.  To prevent this from
happening, the X server will install a signal handler for SIGBUS,
check if a shared memory object is being accessed and patch things up
(by mmap'ing anonymous memory on top of the mapping).  This code can
of course be extended to handle SIGSEGV as well.  But this means
more work in xenocara and ports, and we might miss some places where
this needs to be done.

Theo has some worries that changing SIGSEGV to SIGBUS in this case
will lead to problems in ports.  I'm not so worried.  For one thing,
i386 and amd64 actually generate SIGBUS in cases where we really
should generate SIGSEGV.  And Linux does implement the SIGBUS
behaviour specified in POSIX here.
