Hi silvas,

This is a draft of the cross-compilation document, explaining all arguments,
hinting at their pitfalls, behaviour, and special meanings.

It also contains a full use case: compiling LLVM+Clang with LLVM+Clang, from
x86_64 to ARM.

I'm not sure where the best place for this document is, Clang or LLVM (since
it's related to both), but it would be good to have a link to it from one of the
core documents, since this is something that many people ask about on the list.

http://llvm-reviews.chandlerc.com/D1584

Files:
  docs/HowToCrossCompile.rst
Index: docs/HowToCrossCompile.rst
===================================================================
--- /dev/null
+++ docs/HowToCrossCompile.rst
@@ -0,0 +1,362 @@
+===================================================================
+How To Cross-Compile Clang/LLVM using Clang/LLVM
+===================================================================
+
+Introduction
+============
+
+This document contains information about building LLVM and
+Clang on a host machine while targeting another platform. It also contains
+an example where ``host/target`` is ``x86_64/ARM``.
+
+Cross compilation issues
+========================
+
+Cross-compiling is the art of building your software with a compiler
+that produces binaries for a different platform, for example, compiling
+ARM binaries on an x86_64 laptop, or Windows binaries on a Linux box.
+
+As you can imagine, this will bring all sorts of problems when interacting
+with your system toolchain. You'll have to find a compiler that can
+generate the target code, a linker that can understand the objects,
+libraries for the specific target, or even more than one target, and so on.
+
+In the GCC world, every host/target combination has its own set of binaries,
+headers, libraries, etc. So, it's usually simple to download a package
+with all the files in it, unzip it to a directory and point the build system
+to that compiler, which will know about its location and find all it needs
+when compiling your code.
+
+Clang/LLVM, on the other hand, is natively a cross-compiler, meaning that one
+set of programs can compile to all targets by setting the ``-target`` option.
+That doesn't help with finding the headers, libraries or binutils to generate
+target-specific code, though. So you'll need special options to help Clang
+understand what target you're compiling to, where your tools are, etc.
+
+Another problem is that compilers come with standard libraries only (like
+libgcc, libm, etc), so you'll have to find and make available to the build
+system every other library that you'll need to build your software,
+built specifically for your target. It's not enough to have your host's
+libraries installed.
+
+Finally, not all toolchains are the same, and consequently, not every Clang
+option will make it work magically. Some options (like ``--sysroot``) assume
+all your binaries and libraries are in the same directory, which is not
+true when your cross-compiler was installed by the distribution's package
+manager. So, for each specific case, you'll have to use more than one
+option, and in most cases, you'll end up setting include paths (``-I``) and
+library paths (``-L``) manually.
+
+To sum up, different toolchains can:
+ * be host/target specific or more flexible
+ * be in a single directory, or spread out across your system
+ * have different sets of libraries and headers by default
+ * need special options, which your build system won't be able to figure
+   out by itself
+
+General Cross-Compilation Options in Clang
+==========================================
+
+Target Triple
+-------------
+
+The basic option is to define the target architecture. For that, use
+``-target <triple>``. If you don't specify the target, CPU names won't
+match (since Clang assumes the host triple), and the compilation will
+go ahead, creating code for the host platform, which will break later
+on when assembling or linking.
+
+The triple has the general format ``<arch><sub>-<sys>-<abi>``, where:
+ * ``arch`` = x86, arm, thumb, mips, etc
+ * ``sub`` = for ARM: v5, v6m, v7a, v7m, etc
+ * ``sys`` = none, linux, darwin etc.
+ * ``abi`` = eabi, gnueabi, androideabi, gnueabihf, etc
+
+The sub-architecture options are available for their own architectures,
+of course, so "x86v7a" doesn't make sense. The system name is generally
+the OS (linux, darwin), but could be special like the bare-metal "none".
+
+When a parameter is not important, choose "unknown" and the defaults
+will be used. If you choose a parameter that Clang doesn't know, like
+"blerg", it'll ignore it and also choose the default behaviour for the
+architecture.
+
+Finally, the ABI option is something that will pick default CPU/FPU,
+define the specific behaviour of your code (PCS, extensions),
+and also choose the correct library calls, etc.
+
+CPU, FPU, ABI
+-------------
+
+Once your target is specified, it's time to pick the hardware you'll
+be compiling to. For every architecture, a default set of CPU/FPU/ABI
+will be chosen, so you'll almost always have to change it via flags.
+
+Typical flags include:
+ * ``-mcpu=<cpu-name>``, like x86-64, swift, cortex-a15
+ * ``-mfpu=<fpu-name>``, like SSE3, NEON, controlling the FP unit available
+ * ``-mfloat-abi=<fabi>``, like soft, hard, controlling which registers
+   to use for floating-point
+
+The default is normally the common denominator, so that Clang doesn't
+generate code that breaks. But that also means you won't get the best
+code for your specific hardware, which may mean code that runs orders of
+magnitude slower than you expect.
+
+For example, if your target is "arm-none-eabi", the default CPU will
+be "arm7tdmi" using soft float, which is extremely slow on modern cores,
+whereas if your triple is "armv7a-none-eabi", it'll be Cortex-A8 with
+NEON, but still using soft-float, which is much better, but still not
+great.
+
+Toolchain Options
+-----------------
+
+There are four main options to control access to your cross-compiler:
+``--sysroot``, ``-ccc-gcc-name``, ``-I`` and ``-L``. The latter two are
+well known, but they're particularly important for additional libraries
+and headers that are specific to your target. The former two are to
+be used in two different situations:
+
+#. When you have extracted your cross-compiler from a zip file into
+   a directory, you have to use ``--sysroot=<path>``. The path is the
+   root directory where you have unpacked your file, and Clang will
+   look for the directories ``bin``, ``lib``, ``include`` in there.
+
+   In this case, your setup should be pretty much done (if no
+   additional headers or libraries are needed), as Clang will find
+   all binaries it needs (assembler, linker, etc) in there.
+
+#. When you have installed via a package manager (modern Linux
+   distributions have cross-compiler packages available), use
+   ``-ccc-gcc-name <gcc>``, where "gcc" is the full name of the
+   binary, including the target triple, e.g. ``arm-linux-gnueabihf-gcc``.
+
+   In this case, Clang will find the other binaries (assembler,
+   linker), but will have no clue where the target headers and libraries
+   are. People often add system-specific clues to Clang, but as
+   things change, it's more likely that it won't find them than the
+   other way around.
+
+   So, here, you'll be a lot safer if you specify the include/library
+   directories manually (via ``-I`` and ``-L``).
+
+Pay special attention to the ``-ccc-gcc-name`` option, as currently,
+even when you specify the target triple, Clang won't guess the correct
+assembler/linker, and could either pick the wrong one (host default),
+or choose the wrong triple for the tool, setting the wrong defaults
+and possibly generating unlinkable code.
+
+If you specify ``-ccc-gcc-name`` on the command line, even when
+``--sysroot`` is defined, you will force Clang to choose the correct
+assembler/linker, and all should behave as expected.
+
+Target-Specific Libraries
+=========================
+
+All libraries that you compile as part of your build will be
+cross-compiled to your target, and your build system will probably
+find them in the right place. But all dependencies that are
+normally checked against (like libxml or libz etc) will match
+against the host platform, not the target.
+
+So, if the build system is not aware that you want to cross-compile
+your code, it will get every dependency wrong, and your compilation
+will fail during build time, not configure time.
+
+Also, finding the libraries for your target is not as easy
+as for your host machine. There aren't many cross-libraries available
+as packages for most OSs, so you'll have to either cross-compile them
+from source, or download the package for your target platform,
+extract the libraries and headers, put them in specific directories
+and add ``-I`` and ``-L`` pointing to them.
+
+Also, some libraries have different dependencies on different targets,
+so configuration tools that find dependencies on the host can get the
+list wrong for the target platform. This means that the configuration
+of your build can get things wrong when setting its own library
+paths, and you'll have to augment it via additional flags (configure,
+Make, CMake, etc).
+
+Multilibs
+---------
+
+When you want to cross-compile to more than one configuration, for
+example hard-float-ARM and soft-float-ARM, you'll have to have multiple
+copies of your libraries and (possibly) headers.
+
+Some Linux distributions have support for Multilib, which handles that
+for you in an easier way, but if you're not careful and, for instance,
+forget to specify ``-ccc-gcc-name armv7l-linux-gnueabihf-gcc`` (which
+uses hard-float), Clang will pick the ``armv7l-linux-gnueabi-ld``
+(which uses soft-float) and linker errors will happen.
+
+The same is true if you're compiling for different ABIs, like ``gnueabi``
+and ``androideabi``; the code might even link and run, but produce run-time
+errors, which are much harder to track down and fix.
+
+
+Use Case: Cross-Compiling from x86_64 to ARM
+============================================
+
+In this use case, we'll be using CMake and Ninja, on a Debian-based Linux
+system, cross-compiling from an x86_64 host (most Intel and AMD chips
+nowadays) to a hard-float ARM target (most ARM targets nowadays).
+
+The packages you'll need are:
+
+ * cmake
+ * ninja-build (from backports in Ubuntu)
+ * gcc-4.7-arm-linux-gnueabihf
+ * gcc-4.7-multilib-arm-linux-gnueabihf
+ * binutils-arm-linux-gnueabihf
+ * libgcc1-armhf-cross
+ * libsfgcc1-armhf-cross
+ * libstdc++6-armhf-cross
+ * libstdc++6-4.7-dev-armhf-cross
+
+The GCC packages are needed for the target-specific headers, libraries
+and binutils (which come as a dependency), but also to cross-check that
+your cross-compilation works on GCC, too.
+
+Configuring CMake
+-----------------
+
+For more information on how to configure CMake for LLVM/Clang,
+see :doc:`CMake`.
+
+The CMake options you need to add are:
+ * -DCMAKE_CROSSCOMPILING=True
+ * -DCMAKE_INSTALL_PREFIX=<install-dir>
+ * -DLLVM_TABLEGEN=<path-to-host-bin>/llvm-tblgen
+ * -DCLANG_TABLEGEN=<path-to-host-bin>/clang-tblgen
+ * -DLLVM_DEFAULT_TARGET_TRIPLE=arm-linux-gnueabihf
+ * -DLLVM_TARGET_ARCH=ARM
+ * -DLLVM_TARGETS_TO_BUILD=ARM
+ * -DLLVM_ENABLE_PIC=False
+ * -DCMAKE_CXX_FLAGS='-target armv7a-linux-gnueabihf -mcpu=cortex-a9
+    -I/usr/arm-linux-gnueabihf/include/c++/4.7.2/arm-linux-gnueabihf/
+    -I/usr/arm-linux-gnueabihf/include/ -mfloat-abi=hard
+    -ccc-gcc-name arm-linux-gnueabihf-gcc'
+
+The TableGen options are required so that the build uses TableGen binaries
+compiled for the host; the CXX flags define the target and the CPU (which
+defaults to fpu=VFP3 with NEON), force the hard-float ABI, and *also* set
+``-ccc-gcc-name``, to make sure Clang picks the correct linker.
+
+Most of the time, what you want is to have a native compiler for the
+platform itself, but not others. It might not even be feasible to
+produce x86 binaries from ARM targets, so there's no point in compiling
+all back-ends. For that reason, you should also set ``LLVM_TARGETS_TO_BUILD``
+to only build the ARM back-end.
+
+You must set the CMAKE_INSTALL_PREFIX, otherwise a ``ninja install``
+will copy ARM binaries to your root filesystem, which is not what you
+want.
+
+Hacks
+-----
+
+There are some bugs in current LLVM, which require some fiddling before
+running CMake:
+
+#. The LLVM ARM back-end is producing absolute relocations on
+   position-independent code (R_ARM_THM_MOVW_ABS_NC), so for now, you
+   should disable PIC, as seen above.
+
+   This should not be a problem: since Clang/LLVM libraries are statically
+   linked anyway, it shouldn't affect much.
+
+#. LibXML2's configure script doesn't report LZMA as a dependency,
+   and since newer linkers (from ld 2.21 on) require explicit
+   dependencies in the arguments, you need to add it directly to
+   ``clang/tools/c-index-test/CMakeLists.txt``:
+
+   .. code-block:: cmake
+
+    target_link_libraries(c-index-test ${LIBXML2_LIBRARIES} lzma)
+
+#. The ARM libraries won't be installed on your system, and possibly
+   not easily installable anyway, so you'll have to build/download
+   them separately. But the CMake prepare step, which checks for
+   dependencies, will check the `host` libraries, not the `target`
+   ones.
+
+   A quick way of getting the missing libraries is to download them
+   from a distribution repository, like Debian
+   (http://packages.debian.org/wheezy/). Note that the `libXXX` package
+   will have the shared objects (.so) and the `libXXX-dev` package will
+   give you the headers and the static (.a) library. Just in
+   case, download both.
+
+   The ones you need for ARM are: ``libtinfo``, ``zlib1g``,
+   ``libxml2`` and ``liblzma``. In the Debian repository you'll
+   find downloads for all architectures.
+
+   After you download and unpack all `.deb` packages, copy all
+   ``.so`` and ``.a`` files to a directory, make the appropriate
+   symbolic links (if necessary), and add the relevant ``-L``
+   and ``-I`` paths to ``-DCMAKE_CXX_FLAGS`` above.
+
+
+Running CMake and Building
+--------------------------
+
+Finally, run CMake:
+
+   .. code-block:: bash
+
+     $ CC='clang' CXX='clang++' cmake -G Ninja <source-dir> <options above>
+
+If you have clang/clang++ on the path, it should just work, and special
+Ninja files will be created in the build directory. I strongly suggest
+that you run cmake in a separate build directory, *not* inside the
+source tree.
+
+To build, simply type:
+
+   .. code-block:: bash
+
+     $ ninja
+
+It should automatically find out how many cores you have and which
+rules need building, and will build the whole thing.
+
+You can't run ``ninja check-all`` on this tree because the created
+binaries are targeted to ARM, not x86_64.
+
+Installing and Using
+--------------------
+
+After LLVM/Clang has built successfully, you should install it
+via:
+
+   .. code-block:: bash
+
+     $ ninja install
+
+which will create a sysroot in the install-dir. You can then tar and gzip
+that directory into a tarball named after the full triple (for easy
+identification), like:
+
+   .. code-block:: bash
+
+     $ ln -sf <install-dir> arm-linux-gnueabihf-clang
+     $ tar zchf arm-linux-gnueabihf-clang.tar.gz arm-linux-gnueabihf-clang
+
+If you copy that tarball to your target board, you'll be able to use
+it for running the test-suite, for example. Follow the guidelines at
+http://llvm.org/docs/lnt/quickstart.html, unpack the tarball in the
+test directory, and use the options:
+
+   .. code-block:: bash
+
+     $ ./sandbox/bin/python sandbox/bin/lnt runtest nt \
+         --sandbox sandbox \
+         --test-suite `pwd`/test-suite \
+         --cc `pwd`/arm-linux-gnueabihf-clang/bin/clang \
+         --cxx `pwd`/arm-linux-gnueabihf-clang/bin/clang++
+
+Remember to pass the ``-jN`` option to ``lnt``, with N set to the number
+of CPUs on your board. Also, the path to your clang has to be absolute, so
+you'll need the ``pwd`` trick above.
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits