On Sat, Nov 17, 2018 at 11:54:24PM +0100, Michał Górny wrote:
> On Sun, 2018-11-18 at 03:37 +0800, Jason Zaman wrote:
> > Hey all,
> > 
> > I've been using Bazel (https://bazel.build/) to build TensorFlow for a
> > while now. Here is a bazel.eclass I'd like to commit to make it easier
> > for packages that use it to build. It's basically bits that I've
> > refactored out of the TensorFlow ebuild that would be useful to other
> > packages as well. I have a bump to sci-libs/tensorflow-1.12.0 prepared
> > that uses this eclass and have tested a full install.
> > 
> > -- Jason
> > 
> > # Copyright 1999-2018 Jason Zaman
> > # Distributed under the terms of the GNU General Public License v2
> > 
> > # @ECLASS: bazel.eclass
> > # @MAINTAINER:
> > # Jason Zaman <perfin...@gentoo.org>
> > # @AUTHOR:
> > # Jason Zaman <perfin...@gentoo.org>
> > # @BLURB: Utility functions for packages using Bazel Build
> > # @DESCRIPTION:
> > # A utility eclass providing functions to run the Bazel Build system.
> > #
> > # This eclass does not export any phase functions.
> > 
> > case "${EAPI:-0}" in
> >     0|1|2|3|4|5|6)
> >             die "Unsupported EAPI=${EAPI:-0} (too old) for ${ECLASS}"
> >             ;;
> >     7)
> >             ;;
> >     *)
> >             die "Unsupported EAPI=${EAPI} (unknown) for ${ECLASS}"
> >             ;;
> > esac
> > 
> > if [[ ! ${_BAZEL_ECLASS} ]]; then
> > 
> > inherit multiprocessing toolchain-funcs
> > 
> > BDEPEND=">=dev-util/bazel-0.19"
> > 
> > # @FUNCTION: bazel_get_flags
> > # @DESCRIPTION:
> > # Obtain and print the bazel flags for target and host *FLAGS.
> > #
> > # To add more flags to this, append the flags to the
> > # appropriate variable before calling this function
> > bazel_get_flags() {
> >     local i fs=()
> >     for i in ${CFLAGS}; do
> >             fs+=( "--conlyopt=${i}" )
> >     done
> >     for i in ${BUILD_CFLAGS}; do
> >             fs+=( "--host_conlyopt=${i}" )
> >     done
> >     for i in ${CXXFLAGS}; do
> >             fs+=( "--cxxopt=${i}" )
> >     done
> >     for i in ${BUILD_CXXFLAGS}; do
> >             fs+=( "--host_cxxopt=${i}" )
> >     done
> >     for i in ${CPPFLAGS}; do
> >             fs+=( "--conlyopt=${i}" "--cxxopt=${i}" )
> >     done
> >     for i in ${BUILD_CPPFLAGS}; do
> >             fs+=( "--host_conlyopt=${i}" "--host_cxxopt=${i}" )
> >     done
> >     for i in ${LDFLAGS}; do
> >             fs+=( "--linkopt=${i}" )
> >     done
> >     for i in ${BUILD_LDFLAGS}; do
> >             fs+=( "--host_linkopt=${i}" )
> >     done
> >     echo "${fs[*]}"
> > }
> > 
> > # @FUNCTION: bazel_setup_bazelrc
> > # @DESCRIPTION:
> > # Creates the bazelrc with common options that will be passed
> > # to bazel. This will be called by ebazel automatically so
> > # does not need to be called from the ebuild.
> > bazel_setup_bazelrc() {
> >     if [[ -f "${T}/bazelrc" ]]; then
> >             return
> >     fi
> > 
> >     # F: fopen_wr
> >     # P: /proc/self/setgroups
> >     # Even with standalone enabled, the Bazel sandbox binary is run for 
> > feature test:
> >     # 
> > https://github.com/bazelbuild/bazel/blob/7b091c1397a82258e26ab5336df6c8dae1d97384/src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java#L61
> >     # 
> > https://github.com/bazelbuild/bazel/blob/76555482873ffcf1d32fb40106f89231b37f850a/src/main/tools/linux-sandbox-pid1.cc#L113
> >     addpredict /proc
> > 
> >     mkdir -p "${T}/bazel-cache" || die
> >     mkdir -p "${T}/bazel-distdir" || die
> > 
> >     cat > "${T}/bazelrc" <<-EOF || die
> >     startup --batch
> 
> Maybe indent this stuff to make it stand out from ebuild code.
> 
> > 
> >     # dont strip HOME, portage sets a temp per-package dir
> >     build --action_env HOME
> > 
> >     # make bazel respect MAKEOPTS
> >     build --jobs=$(makeopts_jobs)
> >     build --compilation_mode=opt --host_compilation_mode=opt
> > 
> >     # FLAGS
> >     build $(bazel_get_flags)
> > 
> >     # Use standalone strategy to deactivate the bazel sandbox, since it
> >     # conflicts with FEATURES=sandbox.
> >     build --spawn_strategy=standalone --genrule_strategy=standalone
> >     test --spawn_strategy=standalone --genrule_strategy=standalone
> > 
> >     build --strip=never
> >     build --verbose_failures --noshow_loading_progress
> >     test --verbose_test_summary --verbose_failures --noshow_loading_progress
> > 
> >     # make bazel only fetch distfiles from the cache
> >     fetch --repository_cache="${T}/bazel-cache/" 
> > --distdir="${T}/bazel-distdir/"
> >     build --repository_cache="${T}/bazel-cache/" 
> > --distdir="${T}/bazel-distdir/"
> > 
> >     build --define=PREFIX=${EPREFIX%/}/usr
> >     build --define=LIBDIR=\$(PREFIX)/$(get_libdir)
> > 
> >     EOF
> > 
> >     tc-is-cross-compiler || \
> >             echo "build --nodistinct_host_configuration" >> "${T}/bazelrc" 
> > || die
> 
> Don't do || chains, they are unreadable.
ok

> > }
> > 
> > # @FUNCTION: ebazel
> > # @USAGE: [<args>...]
> > # @DESCRIPTION:
> > # Run bazel with the bazelrc and output_base.
> > #
> > # If $MULTIBUILD_VARIANT is set, this will make an output_base
> > # specific to that variant.
> > # bazel_setup_bazelrc will be called and the created bazelrc
> > # will be passed to bazel.
> > #
> > # Will automatically die if bazel does not exit cleanly.
> > ebazel() {
> >     bazel_setup_bazelrc
> > 
> >     # Use different build folders for each multibuild variant.
> >     local base_suffix="${MULTIBUILD_VARIANT+-}${MULTIBUILD_VARIANT}"
> 
> Any reason not to use BUILD_DIR instead of reinventing it?

Isnt $BUILD_DIR $S by default? output_base is a bazel thing with a weird
structure and has nothing to do with the source tree. Doing it this way
meant python_foreach_impl was easy. but I might be confused and will
look into it again and see if it can be better. this way means with and
without multibuild just works tho.

> >     local output_base="${WORKDIR}/bazel-base${base_suffix}"
> >     mkdir -p "${output_base}" || die
> > 
> >     einfo Running: bazel --output_base="${output_base}" "$@"
> >     bazel --bazelrc="${T}/bazelrc" --output_base="${output_base}" $@ || die 
> > "ebazel $@"
> 
> The common practice is to echo >&2 it.  Also, you output different
> arguments than you execute which is going to confuse the hell out of
> users who'll end up having to debug this.  You can use a trick like
> the following to avoid typing args twice:
> 
>   set -- bazel --bazelrc...
>   echo "${*}" >&2
>   "${@}" || die ...

Oh, forgot that trick. yeah I'll do that.

> > }
> > 
> > # @FUNCTION: bazel_load_distfiles
> > # @USAGE: <distfiles>...
> > # @DESCRIPTION:
> > # Populate the bazel distdir to fetch from since it cannot use
> > # the network. Bazel looks in distdir but will only look for the
> > # original filename, not the possibly renamed one that portage
> > # downloaded. If the line has -> we to rename it back. This also
> > # handles use-conditionals that SRC_URI does.
> 
> Why oh why do you have to implement custom parser for the ebuild syntax?
>  That's just asking for horrible failures.

Yeah ... so thats the problem with bazel and the main reason for the
eclass. Bazel has this idea that downloading and unpacking random
tarballs during build is okay. Now bazel has an option when it needs to
fetch something it will look in the dir --distdir= points to instead and
read the file instead of trying to fetch from the (sandboxed thus
non-existent) internet. But bazel only knows the original filenames not
the names that portage renamed things to so I need to rename them back
to the original names.

The other option would be to have ebuilds maintain a second list of urls
on top of the SRC_URI one which seems way more of a maintenance burden,
especially when use-flags are involved. So all I really do here is strip
the use? and ()'s and take the list of urls and put them in the
bazel-distdir with the original filenames.

If you can think of a better way to accomplish that I'd love to hear it
tho. I'll add more comments about what it does but bazel is weird and
just utterly fails if it cant read the tarballs. And unpacking things
myself doesnt work because of the weird file structure bazel uses so
knowing where to put them is non-trivial.

> > #
> > # Example:
> > # @CODE
> > # bazel_external_uris="http://a/file-2.0.tgz
> > #     python? ( http://b/1.0.tgz -> foo-1.0.tgz )"
> > # SRC_URI="http://c/${PV}.tgz
> > #     ${bazel_external_uris}"
> > #
> > # src_unpack() {
> > #     unpack ${PV}.tgz
> > #     bazel_load_distfiles "${bazel_external_uris}"
> > # }
> > # @CODE
> > bazel_load_distfiles() {
> >     local src dst uri rename
> > 
> >     [[ "$@" ]] || die "Missing args"
> >     mkdir -p "${T}/bazel-distdir" || die
> > 
> >     while read uri rename dst; do
> >             src="${uri##*/}"
> >             [[ -z $src ]] && continue
> 
> Please use ${foo} syntax in ebuilds, consistently.
ok

> >             if [[ "$rename" != "->" ]]; then
> >                     dst="${src}"
> >             fi
> > 
> >             [[ ${A} =~ ${dst} ]] || continue
> 
> Why are you doing regex match here?  Last I checked, we didn't use
> regular expressions in SRC_URI.

It was to skip this one if its not in $A (eg disabled with a use-flag)
I'll change it to == *${dst}* instead then.

> > 
> >             if [[ "$dst" == "$src" ]]; then
> >                     einfo "Copying $dst to bazel distdir ..."
> >             else
> >                     einfo "Copying $dst to bazel distdir $src ..."
> 
> Are you using src and dst to mean the opposite?

No, this is right. its SRC_URI="src -> dst" so I need to rename dst back
so its named src.

> >             fi
> >             dst="$(readlink -f "${DISTDIR}/${dst}")"
> 
> Looks like you are hardcoding hacks for implementation details which
> indicates whatever you're doing is a very bad idea, and is going to fail
> whenever the implementation is subtly different than what you've worked
> around so far.

I can skip this line then no problem. It will just end up a link to a
link but thats fine.

> >             ln -s "${dst}" "${T}/bazel-distdir/${src}" || die
> >     done <<< "$(sed -re 's/!?[A-Za-z]+\?\s+\(\s*//g; s/\s+\)//g' <<< "$@")"
> 
> Please don't use horribly unreadable sed expressions.  This just means
> that whoever will have to touch this eclass in the future will wish you
> were never recruited.

I will split it up and add more comments. It just removes "use? (",
"!use? (", and " )". If I don't remove the useflag bits first then figuring out
if its a rename or not becomes a lot more complicated. If you have
other ideas im all ears.


> 
> > }
> > 
> > _BAZEL_ECLASS=1
> > fi
> > 
> > 
> > 
> 
> -- 
> Best regards,
> Michał Górny



Reply via email to