I am sponsoring the following self-reviewed case for myself.

It improves the ability of ar to handle ELF archives between 2GB
and 4GB in size, and introduces the ability for ELF archives to be
larger than 4GB. I propose to:

    - Deliver a 64-bit version of the ar command

    - Implement a 64-bit archive symbol table, capable of
      supporting archives larger than the current 32-bit
      limit of 4GB.

    - Add a new option to the ar command (-S) to force the use
      of the 64-bit symbol table format.

I believe this qualifies for self review:

    - The operational details of the proposed 64-bit ar command
      are consistent with the operation of other Solaris linker
      tools.

    - The proposed 64-bit archive symbol table format is a natural
      extension of the existing 32-bit format. It also follows
      existing practice: SGI employed the same extension in their
      IRIX64 Unix over a decade ago.

    - The 64-bit symbol table is transparent to the user. No
      new libelf functions are required, and user programs do
      not need to be rebuilt.

The following supporting documents can be found in the case materials:

    - Original and New versions of the ar(1) manpage, as well
      as diffs (ar.1.orig, ar.1.new, ar.1.pdf, ar.1.diffs).

    - Original and New versions of the ar.h(3HEAD) manpage, as well
      as diffs (ar.h.3head.orig, ar.h.3head.new, ar.h.3head.pdf,
      ar.h.3head.diffs).

    - The SGI/MIPS 64-bit ELF Object File Specification
      (Irix64ELFObjectFileSpecificationDraft2.5.pdf). The
      pages relevant to this case are 96-98.

-----

Release Binding:                                Patch/Micro

64-bit ar command:                              Committed
64-bit archive symbol table format:             Committed
ar -S option:                                   Committed

-----
/usr/bin/ar is currently delivered as a 32-bit executable.
A standard 32-bit executable (one built without large file support)
is limited to reading and writing files that are 2GB in size or less.
This limit has historically been of little concern. However, objects
continue to grow in both size and number, so archives trend steadily
larger. In 2004, a fix to ar allowed it to produce archives larger
than 2GB:

    4987898 The archive utility ar gives 'could not allocate memory'
            on Solaris x86 platform

By employing the 32-bit large file support available under Solaris,
this fix allowed the creation of archives larger than 2GB. However,
ar is still unable to read archives larger than 2GB, giving the
current situation in which ar can produce a large archive that
it cannot read:

    % ar t big.a
    ar: cannot open big.a
        Value too large for defined data type

This occurs because ar relies on libelf for reading archives, and
the 32-bit libelf is not largefile capable. Looking past this limit,
a 32-bit ar is limited to a 4GB address space, and so cannot produce
archives larger than that.
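
The two boundaries above follow directly from integer widths. As a
rough illustration only (the helper names are hypothetical and are
not part of ar or libelf), the arithmetic can be sketched as:

```python
# Illustrative arithmetic only; helper names are hypothetical.
GB = 2**30

MAX_SIGNED_32 = 2**31 - 1    # largest offset a signed 32-bit file offset holds
MAX_UNSIGNED_32 = 2**32 - 1  # largest value of a 32-bit word or address

def fits_without_largefile(size_bytes):
    """True if a file of this size fits in a signed 32-bit offset."""
    return size_bytes <= MAX_SIGNED_32

def fits_32bit_word(value):
    """True if the value can be stored in an unsigned 32-bit word."""
    return 0 <= value <= MAX_UNSIGNED_32

print(fits_without_largefile(2 * GB - 1))  # True
print(fits_without_largefile(3 * GB))      # False: needs large file support
print(fits_32bit_word(3 * GB))             # True
print(fits_32bit_word(5 * GB))             # False: beyond the 4GB boundary
```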

The fix for 4987898 bought us 6 years, but is clearly insufficient
for the long term. There is currently an open CR against this issue
for an important customer with archives that are larger than 2GB.
There are indications that the 4GB boundary will soon become
an issue as well.

The proper fix for all of the above issues is to deliver a 64-bit
version of /usr/bin/ar (/usr/bin/{amd64,sparcv9}/ar), as is done for
the majority of Solaris linker tools (ld, elfdump, etc) found under
/usr/src/cmd/sgs. As with the other tools, the 32-bit version of ar
will exec the 64-bit version when a 64-bit kernel is running.

By itself, a 64-bit ar is not sufficient to produce archives larger
than 4GB. The existing archive symbol table format, as described by
ar.h(3HEAD), is limited to 32-bit offsets and cannot be used for
archives that are larger than 4GB. It is therefore necessary to add support
for an alternative archive symbol table format that employs 64-bit
offsets. Fortunately, there is existing precedent in this area: SGI
employed a 64-bit symbol table in IRIX64 archives over a decade ago.
Due to common System V Unix roots, IRIX used the same 32-bit symbol
table format as Solaris. Their 64-bit solution was to retain the
layout and operation of the 32-bit format, employing 64-bit integer
words rather than 32-bit words. 32-bit symbol tables are identified
by the special member name "/". The 64-bit version is named "/SYM64/".

The /SYM64/ symbol table format is the obvious extension to our
32-bit archive symbol tables, and has the benefit of already being
well known and supported by other ar implementations, notably
GNU ar. A further benefit is that this solution is transparent to
the end user:

    - No changes or additions are needed to the libelf APIs

    - User code does not need to be recompiled.

For maximum portability, the ar default will be to produce the /SYM64/
symbol table only when the archive is larger than 4GB. A new ar
option (-S) will be provided to force the large format when building
smaller archives. This feature is primarily to support linker testing,
as the smaller format is the better choice where possible.
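
The default described above amounts to a simple selection rule. The
sketch below is only an illustration of that rule: the function name
and the exact threshold comparison are my assumptions, and the real
implementation may test member offsets rather than the total archive
size.

```python
SYM32_NAME = "/"          # member name of the 32-bit symbol table
SYM64_NAME = "/SYM64/"    # member name of the 64-bit symbol table
FOUR_GB = 4 * 2**30

def symtab_member_name(archive_size, force_64=False):
    """Pick the symbol table format for an archive of the given size.

    force_64 models the proposed -S option; the threshold test is an
    assumption made for illustration.
    """
    if force_64 or archive_size > FOUR_GB:
        return SYM64_NAME
    return SYM32_NAME

print(symtab_member_name(100 * 2**20))                 # /
print(symtab_member_name(100 * 2**20, force_64=True))  # /SYM64/
print(symtab_member_name(6 * 2**30))                   # /SYM64/
```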