amosbird opened a new pull request #7712: URL: https://github.com/apache/incubator-doris/pull/7712
## Proposed changes

Prepare to generate hermetic builds using GCC 11 and Clang 13. The ideal toolchain would be the LDB toolchain generated by https://github.com/amosbird/ldb_toolchain_gen/tree/doris-ubuntu-18.04-x64

To kick off a Clang build, set `DORIS_TOOLCHAIN=clang` before running any build scripts.

## Types of changes

What types of changes does your code introduce to Doris?
_Put an `x` in the boxes that apply_

- [X] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation Update (if none of the other choices apply)
- [X] Code refactor (Modify the code structure, format the code, etc...)
- [ ] Optimization. Including functional usability improvements and performance improvements.
- [X] Dependency. Such as changes related to third-party components.
- [X] Other.

## Checklist

_Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._

- [ ] I have created an issue on (Fix #ISSUE) and described the bug/feature there in detail
- [X] Compiling and unit tests pass locally with my changes
- [X] I have added tests that prove my fix is effective or that my feature works
- [ ] If these changes need document changes, I have updated the document
- [X] Any dependent changes have been merged

## Further comments

The experience of compiling Doris can be improved in many ways (I failed to compile it several times over the last year). A non-intrusive way to do this is to provide a dedicated toolchain that builds both the thirdparty deps and the backend.
This PR contains such a proposal, along with some improvements for building the thirdparty deps in an unclean environment (some environments provide system shared libraries that should not be found or used).

Benefits:

1. Generates byte-identical binaries. Good for quality control, testing, debugging, redistributing and forking.
2. No Docker dependency (your dev environment will appreciate that).
3. Extremely friendly to new developers, potentially attracting more contributors.
4. Much easier to correctly maintain all build deps.

With the LDB toolchain, the package setup for a minimal CentOS 7 dev machine (without web-ui things) is the following:

```
yum install -y \
    byacc \
    patch \
    automake \
    libtool \
    make \
    which \
    file \
    ncurses-devel \
    gettext-devel \
    bison \
    java-11-openjdk-devel \
    maven \
    unzip \
    bzip2 \
    zip
```

Why CentOS 7? Because it is the still-maintained Linux distribution with the oldest GLIBC (2.17). https://gist.github.com/wagenet/35adca1a032cec2999d47b6c40aa45b1

It's also possible to enrich the LDB toolchain so that no external dependency is required.

It's worth mentioning that if the LDB toolchain is used, the environment variable `ASAN_SYMBOLIZER_PATH` should be set explicitly to the path of llvm-symbolizer.

With alternative toolchains, we can catch undefined behaviors like https://github.com/apache/incubator-doris/blob/master/be/src/util/simd/vstring_function.h#L93-L112

The undefined behavior of `__builtin_ctz` is described here: https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

cc @HappenLee

The new toolchain also indicates that sm3_test has a potential memory leak that comes from openssl (maybe a false positive).

Related PRs:

https://github.com/apache/incubator-doris/pull/7631
https://github.com/apache/incubator-doris/pull/7569
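For context on the `__builtin_ctz` issue mentioned above: GCC documents the result as undefined when the argument is 0. The following is a minimal sketch of one way to guard against that, not the code in `vstring_function.h`. The helper name `safe_ctz` and the choice to return 32 for a zero mask are assumptions made purely for illustration.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Doris: counts the trailing zero bits of a
// 32-bit mask. A zero mask is handled explicitly instead of being passed to
// __builtin_ctz, whose result GCC documents as undefined for 0.
inline int safe_ctz(std::uint32_t mask) {
    if (mask == 0) {
        return 32; // assumed convention: "no bit set" maps to the bit width
    }
    return __builtin_ctz(mask);
}
```

A caller that scans a comparison mask could then treat a result of 32 as "no match" before using the value as an offset. Whether that convention fits the actual call site is something only the real code can confirm.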
