Copilot commented on code in PR #11712:
URL: https://github.com/apache/gluten/pull/11712#discussion_r3399563009


##########
dev/ci-velox-buildstatic-centos-7.sh:
##########
@@ -17,6 +17,6 @@
 
 set -e
 
-export NUM_THREADS=4
+cp ep/build-velox/src/socket.h /usr/include/asm-generic/socket.h 
 ./dev/builddeps-veloxbe.sh --enable_vcpkg=ON --build_arrow=OFF 
--build_tests=OFF --build_benchmarks=OFF \
                            --build_examples=OFF --enable_s3=ON --enable_gcs=ON 
--enable_hdfs=ON --enable_abfs=ON

Review Comment:
   Copying a repo file into /usr/include/asm-generic/socket.h mutates system 
headers for the whole build environment and can cause hard-to-debug side 
effects. It also assumes the job runs as root and will fail otherwise. Prefer 
installing the correct kernel headers package (or patching the dependent 
project/include paths) rather than overwriting global headers in CI.



##########
ep/build-velox/src/socket.h:
##########
@@ -0,0 +1,105 @@
+#ifndef __ASM_GENERIC_SOCKET_H
+#define __ASM_GENERIC_SOCKET_H
+
+#include <asm/sockios.h>

Review Comment:
   This new header is a verbatim copy of a Linux kernel userspace header but it 
has no ASF license header or any attribution/provenance. Adding third-party 
header content into the repo without the correct license notice creates 
compliance risk, and it’s especially problematic because this file is later 
copied into /usr/include during CI.



##########
cpp/velox/utils/ConfigExtractor.cc:
##########
@@ -233,8 +233,6 @@ std::shared_ptr<facebook::velox::config::ConfigBase> 
createHiveConnectorSessionC
   
configs[facebook::velox::connector::hive::HiveConfig::kReadTimestampUnitSession]
 = std::string("6");
   
configs[facebook::velox::connector::hive::HiveConfig::kMaxPartitionsPerWritersSession]
 =
       conf->get<std::string>(kMaxPartitions, "10000");
-  configs[facebook::velox::connector::hive::HiveConfig::kMaxTargetFileSize] =
-      conf->get<std::string>(kMaxTargetFileSize, "0B"); // 0 means no limit on 
target file size
   
configs[facebook::velox::connector::hive::HiveConfig::kIgnoreMissingFilesSession]
 =
       conf->get<bool>(kIgnoreMissingFiles, false) ? "true" : "false";

Review Comment:
   The session config no longer propagates 
spark.gluten.sql.columnar.backend.velox.maxTargetFileSize (kMaxTargetFileSize 
is still defined in cpp/velox/config/VeloxConfig.h). This leaves a user-facing 
config key silently unused. Either wire it back up to the appropriate Velox 
config (if still supported) or remove/deprecate the Gluten config key so 
behavior isn’t surprising.



##########
dev/vcpkg/ports/fbthrift/vcpkg.json:
##########
@@ -0,0 +1,35 @@
+{
+  "name": "fbthrift",
+  "version-string": "2026.02.23.00",
+  "description": "Facebook's branch of Apache Thrift, including a new C++ 
server.",
+  "homepage": "https://github.com/facebook/fbthrift";,
+  "license": "Apache-2.0",
+  "supports": "x64 & static",
+  "dependencies": [

Review Comment:
   This port declares it only supports x64 static, but Gluten’s vcpkg setup 
also builds on arm64-linux-neon (see 
dev/vcpkg/triplets/arm64-linux-neon.cmake). With fbthrift pulled in by default, 
arm builds will be forced to use --allow-unsupported or will fail the support 
check. If fbthrift is expected to build on arm64, update the supports 
expression accordingly; otherwise make the dependency conditional by platform 
in the top-level vcpkg manifest.



##########
dev/vcpkg/init.sh:
##########
@@ -92,7 +92,7 @@ if [ "${VCPKG_DYNAMIC_OPENSSL:-OFF}" = "ON" ]; then
 fi
 
 
-$VCPKG install --no-print-usage \
+$VCPKG install --no-print-usage --allow-unsupported \
     --triplet="${VCPKG_TRIPLET}" --host-triplet="${VCPKG_TRIPLET}" 
${EXTRA_FEATURES}

Review Comment:
   Using --allow-unsupported globally in vcpkg install masks real port/triplet 
incompatibilities (e.g., ports that declare limited "supports" expressions) and 
can cause later build/link failures that are harder to diagnose. It’s better to 
make the ports’ supports/platform constraints accurate (or gate dependencies by 
platform) so unsupported combinations fail fast.



##########
ep/build-velox/src/get-velox.sh:
##########
@@ -17,8 +17,8 @@
 set -exu
 
 CURRENT_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
-VELOX_REPO=https://github.com/IBM/velox.git
-VELOX_BRANCH=dft-2026_06_06
+VELOX_REPO=https://github.com/zhouyuan/velox.git
+VELOX_BRANCH=dft-2026_06_06-fbthrift

Review Comment:
   The default Velox remote was changed from an organization repo to a personal 
fork. This makes builds non-reproducible and can break if the fork is rebased, 
force-pushed, or deleted. Keep the default pointing at the canonical 
upstream/org repo and rely on the existing --velox_repo/--velox_branch args 
when testing forks.



##########
dev/ci-velox-buildshared-centos-9.sh:
##########
@@ -24,5 +24,5 @@ if [ "$(uname -m)" = "aarch64" ]; then
     export VCPKG_FORCE_SYSTEM_BINARIES=1;
 fi
 
-./dev/builddeps-veloxbe.sh --run_setup_script=OFF --build_arrow=OFF 
--build_tests=ON \
+./dev/builddeps-veloxbe.sh --run_setup_script=ON --build_arrow=OFF 
--build_tests=ON \
     --build_examples=ON --build_benchmarks=ON

Review Comment:
   Switching CI builds to --run_setup_script=ON makes the build depend on 
running Velox’s setup logic inside the CI container, which increases runtime 
and can introduce non-determinism (package repos changing, transient network 
failures). If the container image already contains the needed deps, keep this 
OFF and update the image/scripts for any missing packages explicitly.



##########
.github/workflows/velox_backend_x86.yml:
##########
@@ -1178,7 +1178,7 @@ jobs:
           mkdir -p /work/.ccache
 
           cd /work
-          bash dev/builddeps-veloxbe.sh --run_setup_script=OFF 
--build_arrow=OFF --build_tests=ON --build_benchmarks=ON --enable_gpu=ON
+          bash dev/builddeps-veloxbe.sh --run_setup_script=ON 
--build_arrow=OFF --build_tests=ON --build_benchmarks=ON --enable_gpu=ON

Review Comment:
   Switching the workflow to --run_setup_script=ON makes the build depend on 
package installation during the CI run (network + repo state), which reduces 
determinism and can significantly increase runtime. If the Docker image is 
intended to be self-contained, keep this OFF and bake any missing dependencies 
into the image (or install them explicitly in the workflow).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to