Copilot commented on code in PR #11712:
URL: https://github.com/apache/gluten/pull/11712#discussion_r3399563009
##########
dev/ci-velox-buildstatic-centos-7.sh:
##########
@@ -17,6 +17,6 @@
set -e
-export NUM_THREADS=4
+cp ep/build-velox/src/socket.h /usr/include/asm-generic/socket.h
./dev/builddeps-veloxbe.sh --enable_vcpkg=ON --build_arrow=OFF
--build_tests=OFF --build_benchmarks=OFF \
--build_examples=OFF --enable_s3=ON --enable_gcs=ON
--enable_hdfs=ON --enable_abfs=ON
Review Comment:
Copying a repo file into /usr/include/asm-generic/socket.h mutates system
headers for the whole build environment and can cause hard-to-debug side
effects. It also assumes the job runs as root and will fail otherwise. Prefer
installing the correct kernel headers package (or patching the dependent
project/include paths) rather than overwriting global headers in CI.
##########
ep/build-velox/src/socket.h:
##########
@@ -0,0 +1,105 @@
+#ifndef __ASM_GENERIC_SOCKET_H
+#define __ASM_GENERIC_SOCKET_H
+
+#include <asm/sockios.h>
Review Comment:
This new header is a verbatim copy of a Linux kernel userspace header but it
has no ASF license header or any attribution/provenance. Adding third-party
header content into the repo without the correct license notice creates
compliance risk, and it’s especially problematic because this file is later
copied into /usr/include during CI.
##########
cpp/velox/utils/ConfigExtractor.cc:
##########
@@ -233,8 +233,6 @@ std::shared_ptr<facebook::velox::config::ConfigBase>
createHiveConnectorSessionC
configs[facebook::velox::connector::hive::HiveConfig::kReadTimestampUnitSession]
= std::string("6");
configs[facebook::velox::connector::hive::HiveConfig::kMaxPartitionsPerWritersSession]
=
conf->get<std::string>(kMaxPartitions, "10000");
- configs[facebook::velox::connector::hive::HiveConfig::kMaxTargetFileSize] =
- conf->get<std::string>(kMaxTargetFileSize, "0B"); // 0 means no limit on
target file size
configs[facebook::velox::connector::hive::HiveConfig::kIgnoreMissingFilesSession]
=
conf->get<bool>(kIgnoreMissingFiles, false) ? "true" : "false";
Review Comment:
The session config no longer propagates
spark.gluten.sql.columnar.backend.velox.maxTargetFileSize (kMaxTargetFileSize
is still defined in cpp/velox/config/VeloxConfig.h). This leaves a user-facing
config key silently unused. Either wire it back up to the appropriate Velox
config (if still supported) or remove/deprecate the Gluten config key so
behavior isn’t surprising.
##########
dev/vcpkg/ports/fbthrift/vcpkg.json:
##########
@@ -0,0 +1,35 @@
+{
+ "name": "fbthrift",
+ "version-string": "2026.02.23.00",
+ "description": "Facebook's branch of Apache Thrift, including a new C++
server.",
+ "homepage": "https://github.com/facebook/fbthrift",
+ "license": "Apache-2.0",
+ "supports": "x64 & static",
+ "dependencies": [
Review Comment:
This port declares it only supports x64 static, but Gluten’s vcpkg setup
also builds on arm64-linux-neon (see
dev/vcpkg/triplets/arm64-linux-neon.cmake). With fbthrift pulled in by default,
arm builds will be forced to use --allow-unsupported or will fail the support
check. If fbthrift is expected to build on arm64, update the supports
expression accordingly; otherwise make the dependency conditional by platform
in the top-level vcpkg manifest.
##########
dev/vcpkg/init.sh:
##########
@@ -92,7 +92,7 @@ if [ "${VCPKG_DYNAMIC_OPENSSL:-OFF}" = "ON" ]; then
fi
-$VCPKG install --no-print-usage \
+$VCPKG install --no-print-usage --allow-unsupported \
--triplet="${VCPKG_TRIPLET}" --host-triplet="${VCPKG_TRIPLET}"
${EXTRA_FEATURES}
Review Comment:
Using --allow-unsupported globally in vcpkg install masks real port/triplet
incompatibilities (e.g., ports that declare limited "supports" expressions) and
can cause later build/link failures that are harder to diagnose. It’s better to
make the ports’ supports/platform constraints accurate (or gate dependencies by
platform) so unsupported combinations fail fast.
##########
ep/build-velox/src/get-velox.sh:
##########
@@ -17,8 +17,8 @@
set -exu
CURRENT_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
-VELOX_REPO=https://github.com/IBM/velox.git
-VELOX_BRANCH=dft-2026_06_06
+VELOX_REPO=https://github.com/zhouyuan/velox.git
+VELOX_BRANCH=dft-2026_06_06-fbthrift
Review Comment:
The default Velox remote was changed from an organization repo to a personal
fork. This makes builds non-reproducible and can break if the fork is rebased,
force-pushed, or deleted. Keep the default pointing at the canonical
upstream/org repo and rely on the existing --velox_repo/--velox_branch args
when testing forks.
##########
dev/ci-velox-buildshared-centos-9.sh:
##########
@@ -24,5 +24,5 @@ if [ "$(uname -m)" = "aarch64" ]; then
export VCPKG_FORCE_SYSTEM_BINARIES=1;
fi
-./dev/builddeps-veloxbe.sh --run_setup_script=OFF --build_arrow=OFF
--build_tests=ON \
+./dev/builddeps-veloxbe.sh --run_setup_script=ON --build_arrow=OFF
--build_tests=ON \
--build_examples=ON --build_benchmarks=ON
Review Comment:
Switching CI builds to --run_setup_script=ON makes the build depend on
running Velox’s setup logic inside the CI container, which increases runtime
and can introduce non-determinism (package repos changing, transient network
failures). If the container image already contains the needed deps, keep this
OFF and update the image/scripts for any missing packages explicitly.
##########
.github/workflows/velox_backend_x86.yml:
##########
@@ -1178,7 +1178,7 @@ jobs:
mkdir -p /work/.ccache
cd /work
- bash dev/builddeps-veloxbe.sh --run_setup_script=OFF
--build_arrow=OFF --build_tests=ON --build_benchmarks=ON --enable_gpu=ON
+ bash dev/builddeps-veloxbe.sh --run_setup_script=ON
--build_arrow=OFF --build_tests=ON --build_benchmarks=ON --enable_gpu=ON
Review Comment:
Switching the workflow to --run_setup_script=ON makes the build depend on
package installation during the CI run (network + repo state), which reduces
determinism and can significantly increase runtime. If the Docker image is
intended to be self-contained, keep this OFF and bake any missing dependencies
into the image (or install them explicitly in the workflow).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]