Copilot commented on code in PR #4:
URL: https://github.com/apache/paimon-cpp/pull/4#discussion_r3287268159


##########
cmake_modules/DefineOptions.cmake:
##########
@@ -0,0 +1,310 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake
+
+macro(set_option_category name)
+    set(PAIMON_OPTION_CATEGORY ${name})
+    list(APPEND "PAIMON_OPTION_CATEGORIES" ${name})
+endmacro()

Review Comment:
   `PAIMON_OPTION_CATEGORY` values like `"Compile and link"` (contains spaces) 
are used to form variable names such as 
`PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES`. Variable indirection with 
names containing spaces is error-prone and makes later expansion/iteration much 
harder to reason about. Prefer a normalized category key (e.g., 
`COMPILE_AND_LINK`) for variable names and keep the human-readable label 
separate, or sanitize category names (replace spaces with underscores) before 
composing variable names.
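
A minimal sketch of the "normalized key" option, assuming CMake's built-in `string(MAKE_C_IDENTIFIER ...)` is an acceptable way to derive the key. `PAIMON_OPTION_CATEGORY_KEY` is a hypothetical variable name introduced here for illustration; it is not part of the PR.

```cmake
macro(set_option_category name)
    # Human-readable label, kept as-is for display (e.g. "Compile and link").
    set(PAIMON_OPTION_CATEGORY "${name}")
    # Hypothetical identifier-safe key: spaces and other non-identifier
    # characters become underscores, then the result is upper-cased,
    # e.g. "Compile and link" -> "COMPILE_AND_LINK".
    string(MAKE_C_IDENTIFIER "${name}" PAIMON_OPTION_CATEGORY_KEY)
    string(TOUPPER "${PAIMON_OPTION_CATEGORY_KEY}" PAIMON_OPTION_CATEGORY_KEY)
    list(APPEND PAIMON_OPTION_CATEGORIES "${name}")
endmacro()
```

Per-category lists would then be named from the key, for example `PAIMON_${PAIMON_OPTION_CATEGORY_KEY}_OPTION_NAMES`, while the label stays available for printing.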



##########
cmake_modules/DefineOptions.cmake:
##########
@@ -0,0 +1,310 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake
+
+macro(set_option_category name)
+    set(PAIMON_OPTION_CATEGORY ${name})
+    list(APPEND "PAIMON_OPTION_CATEGORIES" ${name})
+endmacro()
+
+function(check_description_length name description)
+    foreach(description_line ${description})
+        string(LENGTH ${description_line} line_length)
+        if(${line_length} GREATER 80)
+            message(FATAL_ERROR "description for ${name} contained a\n\
+        line ${line_length} characters long!\n\
+        (max is 80). Split it into more lines with semicolons")
+        endif()
+    endforeach()
+endfunction()
+
+function(list_join lst glue out)
+    if("${${lst}}" STREQUAL "")
+        set(${out}
+            ""
+            PARENT_SCOPE)
+        return()
+    endif()
+
+    list(GET ${lst} 0 joined)
+    list(REMOVE_AT ${lst} 0)
+    foreach(item ${${lst}})
+        set(joined "${joined}${glue}${item}")
+    endforeach()
+    set(${out}
+        ${joined}
+        PARENT_SCOPE)
+endfunction()
+
+macro(define_option name description default)
+    check_description_length(${name} ${description})
+    list_join(description "\n" multiline_description)
+
+    option(${name} "${multiline_description}" ${default})

Review Comment:
   `define_option()`/`define_option_string()` are currently broken because 
`description` is expanded unquoted and then handed to 
`list_join(description ...)` as if it were a variable name. `description` is a 
macro parameter, not a variable, so unless a variable of that name happens to 
exist in the calling scope, `${description}` is empty inside `list_join()` and 
the option gets an empty help string. The unquoted `${description}` passed to 
`check_description_length(...)` is also split on semicolons (not on spaces), so 
for a multi-line description only the first line is bound to the function's 
`description` parameter and length-checked. Fix by treating the description as 
a single string (quote it), and avoid passing it as a “variable name” (e.g., 
replace `list_join(...)` usage with a direct `string(REPLACE ";" "\n" ...)` on 
the description value, and quote `${description_line}` in 
`string(LENGTH ...)`).
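
The suggested change to `define_option()` could look roughly like the following. This is a sketch under the assumption that the rest of the macro (the bookkeeping `list(APPEND ...)` and `set(...)` calls) stays as in the PR; only the first three statements are shown.

```cmake
macro(define_option name description default)
    # Quote the expansion so a multi-line "line one;line two" description
    # reaches the function as a single argument.
    check_description_length(${name} "${description}")
    # Convert the semicolon-separated lines into newline-separated help text
    # directly on the value, without going through list_join().
    string(REPLACE ";" "\n" multiline_description "${description}")

    option(${name} "${multiline_description}" ${default})
endmacro()
```

The same two replacements would apply to `define_option_string()`, which uses the identical prologue.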



##########
cmake_modules/DefineOptions.cmake:
##########
@@ -0,0 +1,310 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake
+
+macro(set_option_category name)
+    set(PAIMON_OPTION_CATEGORY ${name})
+    list(APPEND "PAIMON_OPTION_CATEGORIES" ${name})
+endmacro()
+
+function(check_description_length name description)
+    foreach(description_line ${description})
+        string(LENGTH ${description_line} line_length)
+        if(${line_length} GREATER 80)
+            message(FATAL_ERROR "description for ${name} contained a\n\
+        line ${line_length} characters long!\n\
+        (max is 80). Split it into more lines with semicolons")
+        endif()
+    endforeach()
+endfunction()
+
+function(list_join lst glue out)
+    if("${${lst}}" STREQUAL "")
+        set(${out}
+            ""
+            PARENT_SCOPE)
+        return()
+    endif()
+
+    list(GET ${lst} 0 joined)
+    list(REMOVE_AT ${lst} 0)
+    foreach(item ${${lst}})
+        set(joined "${joined}${glue}${item}")
+    endforeach()
+    set(${out}
+        ${joined}
+        PARENT_SCOPE)
+endfunction()
+
+macro(define_option name description default)
+    check_description_length(${name} ${description})
+    list_join(description "\n" multiline_description)
+
+    option(${name} "${multiline_description}" ${default})
+
+    list(APPEND "PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES" ${name})

Review Comment:
   `PAIMON_OPTION_CATEGORY` values like `"Compile and link"` (contains spaces) 
are used to form variable names such as 
`PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES`. Variable indirection with 
names containing spaces is error-prone and makes later expansion/iteration much 
harder to reason about. Prefer a normalized category key (e.g., 
`COMPILE_AND_LINK`) for variable names and keep the human-readable label 
separate, or sanitize category names (replace spaces with underscores) before 
composing variable names.
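
If the category label is kept unchanged, the "sanitize before composing" alternative can be sketched at the point of use. `_paimon_category_key` is a hypothetical scratch variable, and the line below stands in for the `list(APPEND ...)` call inside `define_option()`.

```cmake
# Derive an identifier-safe key from the current label, e.g.
# "Compile and link" -> "COMPILE_AND_LINK", before building the variable name.
string(MAKE_C_IDENTIFIER "${PAIMON_OPTION_CATEGORY}" _paimon_category_key)
string(TOUPPER "${_paimon_category_key}" _paimon_category_key)
list(APPEND "PAIMON_${_paimon_category_key}_OPTION_NAMES" ${name})
```

Any code that later iterates over the categories would need to apply the same transformation before reading the per-category list.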



##########
cmake_modules/SetupCxxFlags.cmake:
##########
@@ -0,0 +1,434 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Check if the target architecture and compiler supports some special
+# instruction sets that would boost performance.
+include(CheckCXXCompilerFlag)
+# Get cpu architecture
+
+message(STATUS "System processor: ${CMAKE_SYSTEM_PROCESSOR}")
+
+# Support C11
+if(NOT DEFINED CMAKE_C_STANDARD)
+    set(CMAKE_C_STANDARD 11)
+endif()
+
+# This ensures that things like c++11 get passed correctly
+if(NOT DEFINED CMAKE_CXX_STANDARD)
+    set(CMAKE_CXX_STANDARD 17)
+endif()
+
+# We require a C++11 compliant compiler
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Build with -fPIC so that can static link our libraries into other people's
+# shared libraries
+set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+
+string(TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE)
+
+set(UNKNOWN_COMPILER_MESSAGE
+    "Unknown compiler: ${CMAKE_CXX_COMPILER_VERSION} ${CMAKE_CXX_COMPILER_VERSION}")
+
+# Defaults BUILD_WARNING_LEVEL to `CHECKIN`, unless CMAKE_BUILD_TYPE is
+# `RELEASE`, then it will default to `PRODUCTION`. The goal of defaulting to
+# `CHECKIN` is to avoid friction with long response time from CI.
+if(NOT BUILD_WARNING_LEVEL)
+    if("${CMAKE_BUILD_TYPE}" STREQUAL "RELEASE")
+        set(BUILD_WARNING_LEVEL PRODUCTION)
+    else()
+        set(BUILD_WARNING_LEVEL CHECKIN)
+    endif()
+endif(NOT BUILD_WARNING_LEVEL)
+string(TOUPPER ${BUILD_WARNING_LEVEL} BUILD_WARNING_LEVEL)
+
+message(STATUS "Paimon build warning level: ${BUILD_WARNING_LEVEL}")
+
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")

Review Comment:
   `-Werror` is applied unconditionally via `CXX_COMMON_FLAGS`, but the 
production-build comment states warnings are not treated as errors. Either 
remove/avoid `-Werror` for production builds (and only add it for 
CHECKIN/EVERYTHING), or update the comment/logic so behavior and documentation 
match.
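
One way to make behavior match the comment, sketched under the assumption that only the `CHECKIN` and `EVERYTHING` levels should fail on warnings, is to guard the flag on the warning level instead of appending it unconditionally:

```cmake
# Treat warnings as errors only for the stricter levels. PRODUCTION builds
# keep warnings as plain warnings, as the production-build comment states.
if("${BUILD_WARNING_LEVEL}" STREQUAL "CHECKIN"
   OR "${BUILD_WARNING_LEVEL}" STREQUAL "EVERYTHING")
    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")
endif()
```

This block would replace the unconditional `set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")` line and has to come after `BUILD_WARNING_LEVEL` has been upper-cased.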
   



##########
cmake_modules/DefineOptions.cmake:
##########
@@ -0,0 +1,310 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake
+
+macro(set_option_category name)
+    set(PAIMON_OPTION_CATEGORY ${name})
+    list(APPEND "PAIMON_OPTION_CATEGORIES" ${name})
+endmacro()
+
+function(check_description_length name description)
+    foreach(description_line ${description})
+        string(LENGTH ${description_line} line_length)
+        if(${line_length} GREATER 80)
+            message(FATAL_ERROR "description for ${name} contained a\n\
+        line ${line_length} characters long!\n\
+        (max is 80). Split it into more lines with semicolons")
+        endif()
+    endforeach()
+endfunction()

Review Comment:
   `define_option()`/`define_option_string()` are currently broken because 
`description` is expanded unquoted and then handed to 
`list_join(description ...)` as if it were a variable name. `description` is a 
macro parameter, not a variable, so unless a variable of that name happens to 
exist in the calling scope, `${description}` is empty inside `list_join()` and 
the option gets an empty help string. The unquoted `${description}` passed to 
`check_description_length(...)` is also split on semicolons (not on spaces), so 
for a multi-line description only the first line is bound to the function's 
`description` parameter and length-checked. Fix by treating the description as 
a single string (quote it), and avoid passing it as a “variable name” (e.g., 
replace `list_join(...)` usage with a direct `string(REPLACE ";" "\n" ...)` on 
the description value, and quote `${description_line}` in 
`string(LENGTH ...)`).
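
For the length check itself, a hedged sketch of the quoting fix follows. It assumes callers pass the whole description as one quoted argument (for example `check_description_length(${name} "${description}")`), so that every semicolon-separated line arrives in the `description` parameter.

```cmake
function(check_description_length name description)
    # Iterate over the list by name so each semicolon-separated line is visited.
    foreach(description_line IN LISTS description)
        # Quoting keeps string(LENGTH) well-formed even for an empty line.
        string(LENGTH "${description_line}" line_length)
        if(line_length GREATER 80)
            message(FATAL_ERROR "description for ${name} contained a line "
                                "${line_length} characters long (max is 80). "
                                "Split it into more lines with semicolons")
        endif()
    endforeach()
endfunction()
```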



##########
cmake_modules/DefineOptions.cmake:
##########
@@ -0,0 +1,310 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake
+
+macro(set_option_category name)
+    set(PAIMON_OPTION_CATEGORY ${name})
+    list(APPEND "PAIMON_OPTION_CATEGORIES" ${name})
+endmacro()
+
+function(check_description_length name description)
+    foreach(description_line ${description})
+        string(LENGTH ${description_line} line_length)
+        if(${line_length} GREATER 80)
+            message(FATAL_ERROR "description for ${name} contained a\n\
+        line ${line_length} characters long!\n\
+        (max is 80). Split it into more lines with semicolons")
+        endif()
+    endforeach()
+endfunction()
+
+function(list_join lst glue out)
+    if("${${lst}}" STREQUAL "")
+        set(${out}
+            ""
+            PARENT_SCOPE)
+        return()
+    endif()
+
+    list(GET ${lst} 0 joined)
+    list(REMOVE_AT ${lst} 0)
+    foreach(item ${${lst}})
+        set(joined "${joined}${glue}${item}")
+    endforeach()
+    set(${out}
+        ${joined}
+        PARENT_SCOPE)
+endfunction()
+
+macro(define_option name description default)
+    check_description_length(${name} ${description})
+    list_join(description "\n" multiline_description)
+
+    option(${name} "${multiline_description}" ${default})
+
+    list(APPEND "PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES" ${name})
+    set("${name}_OPTION_DESCRIPTION" ${description})
+    set("${name}_OPTION_DEFAULT" ${default})
+    set("${name}_OPTION_TYPE" "bool")
+endmacro()
+
+macro(define_option_string name description default)
+    check_description_length(${name} ${description})
+    list_join(description "\n" multiline_description)
+
+    set(${name}
+        ${default}
+        CACHE STRING "${multiline_description}")
+
+    list(APPEND "PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES" ${name})
+    set("${name}_OPTION_DESCRIPTION" ${description})
+    set("${name}_OPTION_DEFAULT" "\"${default}\"")
+    set("${name}_OPTION_TYPE" "string")
+    set("${name}_OPTION_POSSIBLE_VALUES" ${ARGN})
+
+    list_join("${name}_OPTION_POSSIBLE_VALUES" "|" "${name}_OPTION_ENUM")
+    if(NOT ("${${name}_OPTION_ENUM}" STREQUAL ""))
+        set_property(CACHE ${name} PROPERTY STRINGS "${name}_OPTION_POSSIBLE_VALUES")
+    endif()
+endmacro()
+
+# Top level cmake dir
+if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
+    #----------------------------------------------------------------------
+    set_option_category("Compile and link")

Review Comment:
   `PAIMON_OPTION_CATEGORY` values like `"Compile and link"` (contains spaces) 
are used to form variable names such as 
`PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES`. Variable indirection with 
names containing spaces is error-prone and makes later expansion/iteration much 
harder to reason about. Prefer a normalized category key (e.g., 
`COMPILE_AND_LINK`) for variable names and keep the human-readable label 
separate, or sanitize category names (replace spaces with underscores) before 
composing variable names.



##########
cmake_modules/SetupCxxFlags.cmake:
##########
@@ -0,0 +1,434 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Check if the target architecture and compiler supports some special
+# instruction sets that would boost performance.
+include(CheckCXXCompilerFlag)
+# Get cpu architecture
+
+message(STATUS "System processor: ${CMAKE_SYSTEM_PROCESSOR}")
+
+# Support C11
+if(NOT DEFINED CMAKE_C_STANDARD)
+    set(CMAKE_C_STANDARD 11)
+endif()
+
+# This ensures that things like c++11 get passed correctly
+if(NOT DEFINED CMAKE_CXX_STANDARD)
+    set(CMAKE_CXX_STANDARD 17)
+endif()
+
+# We require a C++11 compliant compiler
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Build with -fPIC so that can static link our libraries into other people's
+# shared libraries
+set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+
+string(TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE)
+
+set(UNKNOWN_COMPILER_MESSAGE
+    "Unknown compiler: ${CMAKE_CXX_COMPILER_VERSION} ${CMAKE_CXX_COMPILER_VERSION}")
+
+# Defaults BUILD_WARNING_LEVEL to `CHECKIN`, unless CMAKE_BUILD_TYPE is
+# `RELEASE`, then it will default to `PRODUCTION`. The goal of defaulting to
+# `CHECKIN` is to avoid friction with long response time from CI.
+if(NOT BUILD_WARNING_LEVEL)
+    if("${CMAKE_BUILD_TYPE}" STREQUAL "RELEASE")
+        set(BUILD_WARNING_LEVEL PRODUCTION)
+    else()
+        set(BUILD_WARNING_LEVEL CHECKIN)
+    endif()
+endif(NOT BUILD_WARNING_LEVEL)
+string(TOUPPER ${BUILD_WARNING_LEVEL} BUILD_WARNING_LEVEL)
+
+message(STATUS "Paimon build warning level: ${BUILD_WARNING_LEVEL}")
+
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")
+
+if("${BUILD_WARNING_LEVEL}" STREQUAL "CHECKIN")
+    # Pre-checkin builds
+    if(MSVC)
+        # https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warnings-by-compiler-version
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /W3")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4365")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4267")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4838")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR CMAKE_CXX_COMPILER_ID STREQUAL
+                                                          "Clang")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wextra")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wdocumentation")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wglobal-constructors")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-missing-braces")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-parameter")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unknown-warning-option")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-constant-logical-operand")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-deprecated-declarations")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-conversion")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-deprecated-declarations")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-sign-conversion")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-variable")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+elseif("${BUILD_WARNING_LEVEL}" STREQUAL "EVERYTHING")
+    # Pedantic builds for fixing warnings
+    if(MSVC)
+        string(REPLACE "/W3" "" CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS}")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /Wall")
+        # https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level
+        # /wdnnnn disables a warning where "nnnn" is a warning number
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR CMAKE_CXX_COMPILER_ID STREQUAL
+                                                          "Clang")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Weverything")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-c++98-compat")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-c++98-compat-pedantic")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wpedantic")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wextra")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-parameter")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+else()
+    # Production builds (warning are not treated as errors)
+    if(MSVC)
+        # https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level
+        # TODO: Enable /Wall and disable individual warnings until build compiles without errors
+        # /wdnnnn disables a warning where "nnnn" is a warning number
+        string(REPLACE "/W3" "" CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS}")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /W3")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang"
+           OR CMAKE_CXX_COMPILER_ID STREQUAL "Clang"
+           OR CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+endif()
+
+if(MSVC)
+    # Disable annoying "performance warning" about int-to-bool conversion
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4800")
+
+    # Disable unchecked iterator warnings, equivalent to /D_SCL_SECURE_NO_WARNINGS
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4996")
+
+    # Disable "switch statement contains 'default' but no 'case' labels" warning
+    # (required for protobuf, see https://github.com/protocolbuffers/protobuf/issues/6885)
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4065")
+elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_EQUAL "7.0" OR CMAKE_CXX_COMPILER_VERSION
+                                                         VERSION_GREATER "7.0")
+        # Without this, gcc >= 7 warns related to changes in C++17
+        set(CXX_ONLY_FLAGS
+            "${CXX_ONLY_FLAGS} -Wno-noexcept-type -Wno-stringop-overflow -Wno-free-nonheap-object"
+        )
+    endif()
+
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER "4.9")
+        # Add colors when paired with ninja
+        set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fdiagnostics-color=always")
+    endif()
+
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_LESS "6.0")
+        # Work around https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43407
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-attributes")
+    endif()
+
+    if(CMAKE_UNITY_BUILD)
+        # Work around issue similar to https://bugs.webkit.org/show_bug.cgi?id=176869
+        set(CXX_ONLY_FLAGS "${CXX_ONLY_FLAGS} -Wno-subobject-linkage")
+    endif()
+
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL "8.0"
+       AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS "9.0")
+        # When using the C++17 filesystem library (<filesystem>) with GCC 8, you do need to explicitly link stdc++fs.
+        link_libraries(stdc++fs)
+    endif()
+
+elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR CMAKE_CXX_COMPILER_ID STREQUAL
+                                                      "Clang")
+    # Clang options for all builds
+
+    # Using Clang with ccache causes a bunch of spurious warnings that are
+    # purportedly fixed in the next version of ccache. See the following for details:
+    #
+    #   http://petereisentraut.blogspot.com/2011/05/ccache-and-clang.html
+    #   http://petereisentraut.blogspot.com/2011/09/ccache-and-clang-part-2.html
+    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Qunused-arguments")
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Qunused-arguments")
+
+    # Avoid clang error when an unknown warning flag is passed
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unknown-warning-option")
+    # Add colors when paired with ninja
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fcolor-diagnostics")
+
+    # Don't complain about optimization passes that were not possible
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-pass-failed")
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-deprecated-declarations")
+
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang")
+        # Depending on the default OSX_DEPLOYMENT_TARGET (< 10.9), libstdc++ may be
+        # the default standard library which does not support C++11. libc++ is the
+        # default from 10.9 onward.
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -stdlib=libc++")
+    endif()
+endif()
+
+# if build warning flags is set, add to CXX_COMMON_FLAGS
+if(BUILD_WARNING_FLAGS)
+    # Use BUILD_WARNING_FLAGS with BUILD_WARNING_LEVEL=everything to disable
+    # warnings (use with Clang's -Weverything flag to find potential errors)
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${BUILD_WARNING_FLAGS}")
+endif(BUILD_WARNING_FLAGS)
+
+# Only enable additional instruction sets if they are supported
+if(PAIMON_CPU_FLAG STREQUAL "x86")
+    if(PAIMON_SIMD_LEVEL STREQUAL "AVX512")
+        if(NOT CXX_SUPPORTS_AVX512)
+            message(FATAL_ERROR "AVX512 required but compiler doesn't support it.")
+        endif()
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_AVX512_FLAG}")
+        add_definitions(-DPAIMON_HAVE_AVX512 -DPAIMON_HAVE_AVX2 -DPAIMON_HAVE_BMI2
+                        -DPAIMON_HAVE_SSE4_2)
+    elseif(PAIMON_SIMD_LEVEL STREQUAL "AVX2")
+        if(NOT CXX_SUPPORTS_AVX2)
+            message(FATAL_ERROR "AVX2 required but compiler doesn't support it.")
+        endif()
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_AVX2_FLAG}")
+        add_definitions(-DPAIMON_HAVE_AVX2 -DPAIMON_HAVE_BMI2 -DPAIMON_HAVE_SSE4_2)
+    elseif(PAIMON_SIMD_LEVEL STREQUAL "SSE4_2")
+        if(NOT CXX_SUPPORTS_SSE4_2)
+            message(FATAL_ERROR "SSE4.2 required but compiler doesn't support it.")
+        endif()
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_SSE4_2_FLAG}")
+        add_definitions(-DPAIMON_HAVE_SSE4_2)
+    endif()
+endif()
+
+if(PAIMON_CPU_FLAG STREQUAL "ppc")
+    if(CXX_SUPPORTS_ALTIVEC AND PAIMON_ALTIVEC)
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_ALTIVEC_FLAG}")
+    endif()
+endif()
+
+if(PAIMON_CPU_FLAG STREQUAL "armv8")
+    if(NOT CXX_SUPPORTS_ARMV8_ARCH)
+        message(FATAL_ERROR "Unsupported arch flag: ${PAIMON_ARMV8_ARCH_FLAG}.")
+    endif()
+    if(PAIMON_ARMV8_ARCH_FLAG MATCHES "native")
+        message(FATAL_ERROR "native arch not allowed, please specify arch explicitly.")
+    endif()
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_ARMV8_ARCH_FLAG}")
+
+    add_definitions(-DPAIMON_HAVE_NEON)
+
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU" AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS
+                                                "5.4")
+        message(WARNING "Disable Armv8 CRC and Crypto as compiler doesn't support them well."
+        )
+    else()
+        if(PAIMON_ARMV8_ARCH_FLAG MATCHES "\\+crypto")
+            add_definitions(-DPAIMON_HAVE_ARMV8_CRYPTO)
+        endif()
+        # armv8.1+ implies crc support
+        if(PAIMON_ARMV8_ARCH_FLAG MATCHES "armv8\\.[1-9]|\\+crc")
+            add_definitions(-DPAIMON_HAVE_ARMV8_CRC)
+        endif()
+    endif()
+endif()
+
+# ----------------------------------------------------------------------
+# Setup Gold linker, if available. Code originally from Apache Kudu
+
+# Interrogates the linker version via the C++ compiler to determine whether
+# we're using the gold linker, and if so, extracts its version.
+#
+# If the gold linker is being used, sets GOLD_VERSION in the parent scope with
+# the extracted version.
+#
+# Any additional arguments are passed verbatim into the C++ compiler invocation.
+function(GET_GOLD_VERSION)
+    # The gold linker is only for ELF binaries, which macOS doesn't use.
+    execute_process(COMMAND ${CMAKE_CXX_COMPILER} "-Wl,--version" ${ARGN}
+                    ERROR_QUIET
+                    OUTPUT_VARIABLE LINKER_OUTPUT)
+    # We're expecting LINKER_OUTPUT to look like one of these:
+    #   GNU gold (version 2.24) 1.11
+    #   GNU gold (GNU Binutils for Ubuntu 2.30) 1.15
+    if(LINKER_OUTPUT MATCHES "GNU gold")
+        string(REGEX MATCH "GNU gold \\([^\\)]*\\) (([0-9]+\\.?)+)" _ "${LINKER_OUTPUT}")
+        if(NOT CMAKE_MATCH_1)
+            message(SEND_ERROR "Could not extract GNU gold version. "
+                               "Linker version output: ${LINKER_OUTPUT}")
+        endif()
+        set(GOLD_VERSION
+            "${CMAKE_MATCH_1}"
+            PARENT_SCOPE)
+    endif()
+endfunction()
+
+# Is the compiler hard-wired to use the gold linker?
+if(NOT WIN32 AND NOT APPLE)
+    get_gold_version()
+    if(GOLD_VERSION)
+        set(MUST_USE_GOLD 1)
+    elseif(PAIMON_USE_LD_GOLD)
+        # Can the compiler optionally enable the gold linker?
+        get_gold_version("-fuse-ld=gold")
+
+        # We can't use the gold linker if it's inside devtoolset because the compiler
+        # won't find it when invoked directly from make/ninja (which is typically
+        # done outside devtoolset).
+        execute_process(COMMAND which ld.gold
+                        OUTPUT_VARIABLE GOLD_LOCATION
+                        OUTPUT_STRIP_TRAILING_WHITESPACE ERROR_QUIET)
+        if("${GOLD_LOCATION}" MATCHES "^/opt/rh/devtoolset")
+            message("Skipping optional gold linker (version ${GOLD_VERSION}) because "
+                    "it's in devtoolset")
+            set(GOLD_VERSION)
+        endif()
+    endif()
+
+    if(GOLD_VERSION)
+        # Older versions of the gold linker are vulnerable to a bug [1] which
+        # prevents weak symbols from being overridden properly. This leads to
+        # omitting of dependencies like tcmalloc (used in Kudu, where this
+        # workaround was written originally)
+        #
+        # How we handle this situation depends on other factors:
+        # - If gold is optional, we won't use it.
+        # - If gold is required, we'll either:
+        #   - Raise an error in RELEASE builds (we shouldn't release such a product), or
+        #   - Drop tcmalloc in all other builds.
+        #
+        # 1. https://sourceware.org/bugzilla/show_bug.cgi?id=16979.
+        if("${GOLD_VERSION}" VERSION_LESS "1.12")
+            set(PAIMON_BUGGY_GOLD 1)
+        endif()
+        if(MUST_USE_GOLD)
+            message("Using hard-wired gold linker (version ${GOLD_VERSION})")
+            if(PAIMON_BUGGY_GOLD)
+                if("${PAIMON_LINK}" STREQUAL "d" AND "${CMAKE_BUILD_TYPE}" STREQUAL
+                                                     "RELEASE")
+                    message(SEND_ERROR "Configured to use buggy gold with dynamic linking "
+                                       "in a RELEASE build")
+                endif()
+            endif()
+        elseif(NOT PAIMON_BUGGY_GOLD)
+            # The Gold linker must be manually enabled.
+            set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fuse-ld=gold")
+            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fuse-ld=gold")
+            message("Using optional gold linker (version ${GOLD_VERSION})")
+        else()
+            message("Optional gold linker is buggy, using ld linker instead")
+        endif()
+    else()
+        message("Using ld linker")
+    endif()
+endif()
+
+# compiler flags for different build types (run 'cmake 
-DCMAKE_BUILD_TYPE=<type> .')
+# For all builds:
+# For CMAKE_BUILD_TYPE=Debug
+#   -ggdb: Enable gdb debugging
+# For CMAKE_BUILD_TYPE=FastDebug
+#   Same as DEBUG, except with some optimizations on.
+# For CMAKE_BUILD_TYPE=Release
+#   -O3: Enable all compiler optimizations
+#   Debug symbols are stripped for reduced binary size. Add
+#   -DPAIMON_CXXFLAGS="-g" to add them
+if(NOT MSVC)
+    if(PAIMON_GGDB_DEBUG)
+        set(PAIMON_DEBUG_SYMBOL_TYPE "gdb")
+        set(C_FLAGS_DEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O0")
+        set(C_FLAGS_FASTDEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O1")
+        set(CXX_FLAGS_DEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O0")
+        set(CXX_FLAGS_FASTDEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O1")
+    else()
+        set(C_FLAGS_DEBUG "-g -O0")
+        set(C_FLAGS_FASTDEBUG "-g -O1")
+        set(CXX_FLAGS_DEBUG "-g -O0")
+        set(CXX_FLAGS_FASTDEBUG "-g -O1")
+    endif()
+
+    set(C_FLAGS_RELEASE "-O3 -DNDEBUG")
+    set(CXX_FLAGS_RELEASE "-O3 -DNDEBUG")
+endif()
+
+set(C_FLAGS_PROFILE_GEN "${CXX_FLAGS_RELEASE} -fprofile-generate")
+set(C_FLAGS_PROFILE_BUILD "${CXX_FLAGS_RELEASE} -fprofile-use")

Review Comment:
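   The C profile flags are derived from `CXX_FLAGS_RELEASE` instead of 
`C_FLAGS_RELEASE`, so C builds take their baseline from the C++ release flags. 
On non-MSVC builds both variables are currently `-O3 -DNDEBUG`, so nothing 
breaks today, but the C flags will silently drift if the two ever diverge. 
Base `C_FLAGS_PROFILE_GEN` and `C_FLAGS_PROFILE_BUILD` on `C_FLAGS_RELEASE`.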
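
   A minimal sketch of that change, assuming `C_FLAGS_RELEASE` keeps the value 
set earlier in this hunk and that nothing else depends on the current 
definitions:

   ```cmake
   # Derive the C profile flags from the C release baseline, not the C++ one.
   set(C_FLAGS_PROFILE_GEN "${C_FLAGS_RELEASE} -fprofile-generate")
   set(C_FLAGS_PROFILE_BUILD "${C_FLAGS_RELEASE} -fprofile-use")
   ```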
   



##########
cmake_modules/BuildUtils.cmake:
##########
@@ -0,0 +1,367 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/BuildUtils.cmake
+
+function(add_paimon_lib LIB_NAME)
+    set(options BUILD_SHARED BUILD_STATIC)
+    set(one_value_args SHARED_LINK_FLAGS)
+    set(multi_value_args
+        SOURCES
+        STATIC_LINK_LIBS
+        SHARED_LINK_LIBS
+        EXTRA_INCLUDES
+        PRIVATE_INCLUDES
+        DEPENDENCIES)
+    cmake_parse_arguments(ARG
+                          "${options}"
+                          "${one_value_args}"
+                          "${multi_value_args}"
+                          ${ARGN})
+    if(ARG_UNPARSED_ARGUMENTS)
+        message(SEND_ERROR "Error: unrecognized arguments: 
${ARG_UNPARSED_ARGUMENTS}")
+    endif()
+
+    # Allow overriding PAIMON_BUILD_SHARED and PAIMON_BUILD_STATIC
+    if(ARG_BUILD_SHARED)
+        set(BUILD_SHARED ${ARG_BUILD_SHARED})
+    else()
+        set(BUILD_SHARED ${PAIMON_BUILD_SHARED})
+    endif()
+    if(ARG_BUILD_STATIC)
+        set(BUILD_STATIC ${ARG_BUILD_STATIC})
+    else()
+        set(BUILD_STATIC ${PAIMON_BUILD_STATIC})
+    endif()
+
+    # Generate a single "objlib" from all C++ modules and link
+    # that "objlib" into each library kind, to avoid compiling twice
+    add_library(${LIB_NAME}_objlib OBJECT ${ARG_SOURCES})
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
+        target_compile_options(${LIB_NAME}_objlib PRIVATE 
-Wno-global-constructors)
+    endif()
+    # Necessary to make static linking into other shared libraries work 
properly
+    set_property(TARGET ${LIB_NAME}_objlib PROPERTY POSITION_INDEPENDENT_CODE 
1)
+    if(ARG_DEPENDENCIES)
+        # In static-only builds, some dependency names are still declared as
+        # *_shared. Map them to *_static when the shared target is unavailable.
+        set(_paimon_objlib_link_deps)
+        set(_paimon_objlib_deps)
+        foreach(_paimon_dep IN LISTS ARG_DEPENDENCIES)
+            set(_paimon_mapped_dep "${_paimon_dep}")
+            if(NOT TARGET ${_paimon_mapped_dep} AND _paimon_dep MATCHES 
"_shared$")
+                string(REGEX REPLACE "_shared$" "_static" _paimon_mapped_dep
+                                     "${_paimon_dep}")
+            endif()
+            if(TARGET ${_paimon_mapped_dep})
+                get_target_property(_paimon_is_internal_lib 
${_paimon_mapped_dep}
+                                    PAIMON_INTERNAL_LIBRARY)
+                list(APPEND _paimon_objlib_deps ${_paimon_mapped_dep})
+                if(NOT _paimon_is_internal_lib)
+                    list(APPEND _paimon_objlib_link_deps ${_paimon_mapped_dep})
+                endif()
+                unset(_paimon_is_internal_lib)
+            endif()
+            unset(_paimon_mapped_dep)
+        endforeach()
+        if(_paimon_objlib_deps)
+            add_dependencies(${LIB_NAME}_objlib ${_paimon_objlib_deps})
+        endif()
+        if(_paimon_objlib_link_deps)
+            target_link_libraries(${LIB_NAME}_objlib PRIVATE 
${_paimon_objlib_link_deps})
+        endif()
+        unset(_paimon_objlib_deps)
+        unset(_paimon_objlib_link_deps)
+        unset(_paimon_dep)
+    endif()
+    set(LIB_DEPS $<TARGET_OBJECTS:${LIB_NAME}_objlib>)
+    set(LIB_INCLUDES)
+    set(EXTRA_DEPS)
+
+    if(ARG_EXTRA_INCLUDES)
+        target_include_directories(${LIB_NAME}_objlib SYSTEM PUBLIC 
${ARG_EXTRA_INCLUDES})
+    endif()
+    if(ARG_PRIVATE_INCLUDES)
+        target_include_directories(${LIB_NAME}_objlib PRIVATE 
${ARG_PRIVATE_INCLUDES})
+    endif()
+
+    set(RUNTIME_INSTALL_DIR bin)
+
+    if(BUILD_SHARED)
+        add_library(${LIB_NAME}_shared SHARED ${LIB_DEPS})
+        if(EXTRA_DEPS)
+            add_dependencies(${LIB_NAME}_shared ${EXTRA_DEPS})
+        endif()
+
+        if(LIB_INCLUDES)
+            target_include_directories(${LIB_NAME}_shared SYSTEM
+                                       PUBLIC ${ARG_EXTRA_INCLUDES})
+        endif()

Review Comment:
   `LIB_INCLUDES` is initialized empty and never populated, so the 
`if(LIB_INCLUDES)` branch never runs and the include directories are only 
attached to the OBJECT library. Usage requirements from an object library do 
not automatically propagate to consumers of the shared/static libraries built 
from `$<TARGET_OBJECTS:...>`, so downstream targets may miss required include 
paths. Fix by applying `ARG_EXTRA_INCLUDES` directly to 
`${LIB_NAME}_shared`/`${LIB_NAME}_static`, for example by replacing the 
`if(LIB_INCLUDES)` gate with `if(ARG_EXTRA_INCLUDES)`.
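
   One possible shape for the fix, as a sketch only. It assumes the snippet 
sits inside `add_paimon_lib`, after the corresponding `add_library` calls, and 
that the static target is named `${LIB_NAME}_static` as in the comment above:

   ```cmake
   # $<TARGET_OBJECTS:...> carries object files but no usage requirements,
   # so attach the public include directories to each library target itself.
   if(BUILD_SHARED AND ARG_EXTRA_INCLUDES)
       target_include_directories(${LIB_NAME}_shared SYSTEM
                                  PUBLIC ${ARG_EXTRA_INCLUDES})
   endif()
   if(BUILD_STATIC AND ARG_EXTRA_INCLUDES)
       target_include_directories(${LIB_NAME}_static SYSTEM
                                  PUBLIC ${ARG_EXTRA_INCLUDES})
   endif()
   ```

   With this in place the `LIB_INCLUDES` variable is unused and could be 
dropped.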



##########
cmake_modules/BuildUtils.cmake:
##########
@@ -0,0 +1,367 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/BuildUtils.cmake
+
+function(add_paimon_lib LIB_NAME)
+    set(options BUILD_SHARED BUILD_STATIC)
+    set(one_value_args SHARED_LINK_FLAGS)
+    set(multi_value_args
+        SOURCES
+        STATIC_LINK_LIBS
+        SHARED_LINK_LIBS
+        EXTRA_INCLUDES
+        PRIVATE_INCLUDES
+        DEPENDENCIES)
+    cmake_parse_arguments(ARG
+                          "${options}"
+                          "${one_value_args}"
+                          "${multi_value_args}"
+                          ${ARGN})
+    if(ARG_UNPARSED_ARGUMENTS)
+        message(SEND_ERROR "Error: unrecognized arguments: 
${ARG_UNPARSED_ARGUMENTS}")
+    endif()
+
+    # Allow overriding PAIMON_BUILD_SHARED and PAIMON_BUILD_STATIC
+    if(ARG_BUILD_SHARED)
+        set(BUILD_SHARED ${ARG_BUILD_SHARED})
+    else()
+        set(BUILD_SHARED ${PAIMON_BUILD_SHARED})
+    endif()
+    if(ARG_BUILD_STATIC)
+        set(BUILD_STATIC ${ARG_BUILD_STATIC})
+    else()
+        set(BUILD_STATIC ${PAIMON_BUILD_STATIC})
+    endif()
+
+    # Generate a single "objlib" from all C++ modules and link
+    # that "objlib" into each library kind, to avoid compiling twice
+    add_library(${LIB_NAME}_objlib OBJECT ${ARG_SOURCES})
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
+        target_compile_options(${LIB_NAME}_objlib PRIVATE 
-Wno-global-constructors)
+    endif()
+    # Necessary to make static linking into other shared libraries work 
properly
+    set_property(TARGET ${LIB_NAME}_objlib PROPERTY POSITION_INDEPENDENT_CODE 
1)
+    if(ARG_DEPENDENCIES)
+        # In static-only builds, some dependency names are still declared as
+        # *_shared. Map them to *_static when the shared target is unavailable.
+        set(_paimon_objlib_link_deps)
+        set(_paimon_objlib_deps)
+        foreach(_paimon_dep IN LISTS ARG_DEPENDENCIES)
+            set(_paimon_mapped_dep "${_paimon_dep}")
+            if(NOT TARGET ${_paimon_mapped_dep} AND _paimon_dep MATCHES 
"_shared$")
+                string(REGEX REPLACE "_shared$" "_static" _paimon_mapped_dep
+                                     "${_paimon_dep}")
+            endif()
+            if(TARGET ${_paimon_mapped_dep})
+                get_target_property(_paimon_is_internal_lib 
${_paimon_mapped_dep}
+                                    PAIMON_INTERNAL_LIBRARY)
+                list(APPEND _paimon_objlib_deps ${_paimon_mapped_dep})
+                if(NOT _paimon_is_internal_lib)
+                    list(APPEND _paimon_objlib_link_deps ${_paimon_mapped_dep})
+                endif()
+                unset(_paimon_is_internal_lib)
+            endif()
+            unset(_paimon_mapped_dep)
+        endforeach()
+        if(_paimon_objlib_deps)
+            add_dependencies(${LIB_NAME}_objlib ${_paimon_objlib_deps})
+        endif()
+        if(_paimon_objlib_link_deps)
+            target_link_libraries(${LIB_NAME}_objlib PRIVATE 
${_paimon_objlib_link_deps})
+        endif()
+        unset(_paimon_objlib_deps)
+        unset(_paimon_objlib_link_deps)
+        unset(_paimon_dep)
+    endif()
+    set(LIB_DEPS $<TARGET_OBJECTS:${LIB_NAME}_objlib>)
+    set(LIB_INCLUDES)
+    set(EXTRA_DEPS)
+
+    if(ARG_EXTRA_INCLUDES)

Review Comment:
   `LIB_INCLUDES` is initialized empty and never populated, so the 
`if(LIB_INCLUDES)` branch never runs and the include directories are only 
attached to the OBJECT library. Usage requirements from an object library do 
not automatically propagate to consumers of the shared/static libraries built 
from `$<TARGET_OBJECTS:...>`, so downstream targets may miss required include 
paths. Fix by applying `ARG_EXTRA_INCLUDES` directly to 
`${LIB_NAME}_shared`/`${LIB_NAME}_static`, for example by replacing the 
`if(LIB_INCLUDES)` gate with `if(ARG_EXTRA_INCLUDES)`.
   



##########
cmake_modules/BuildUtils.cmake:
##########
@@ -0,0 +1,367 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/BuildUtils.cmake
+
+function(add_paimon_lib LIB_NAME)
+    set(options BUILD_SHARED BUILD_STATIC)
+    set(one_value_args SHARED_LINK_FLAGS)
+    set(multi_value_args
+        SOURCES
+        STATIC_LINK_LIBS
+        SHARED_LINK_LIBS
+        EXTRA_INCLUDES
+        PRIVATE_INCLUDES
+        DEPENDENCIES)
+    cmake_parse_arguments(ARG
+                          "${options}"
+                          "${one_value_args}"
+                          "${multi_value_args}"
+                          ${ARGN})
+    if(ARG_UNPARSED_ARGUMENTS)
+        message(SEND_ERROR "Error: unrecognized arguments: 
${ARG_UNPARSED_ARGUMENTS}")
+    endif()
+
+    # Allow overriding PAIMON_BUILD_SHARED and PAIMON_BUILD_STATIC
+    if(ARG_BUILD_SHARED)
+        set(BUILD_SHARED ${ARG_BUILD_SHARED})
+    else()
+        set(BUILD_SHARED ${PAIMON_BUILD_SHARED})
+    endif()
+    if(ARG_BUILD_STATIC)
+        set(BUILD_STATIC ${ARG_BUILD_STATIC})
+    else()
+        set(BUILD_STATIC ${PAIMON_BUILD_STATIC})
+    endif()
+
+    # Generate a single "objlib" from all C++ modules and link
+    # that "objlib" into each library kind, to avoid compiling twice
+    add_library(${LIB_NAME}_objlib OBJECT ${ARG_SOURCES})
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
+        target_compile_options(${LIB_NAME}_objlib PRIVATE 
-Wno-global-constructors)
+    endif()
+    # Necessary to make static linking into other shared libraries work 
properly
+    set_property(TARGET ${LIB_NAME}_objlib PROPERTY POSITION_INDEPENDENT_CODE 
1)
+    if(ARG_DEPENDENCIES)
+        # In static-only builds, some dependency names are still declared as
+        # *_shared. Map them to *_static when the shared target is unavailable.
+        set(_paimon_objlib_link_deps)
+        set(_paimon_objlib_deps)
+        foreach(_paimon_dep IN LISTS ARG_DEPENDENCIES)
+            set(_paimon_mapped_dep "${_paimon_dep}")
+            if(NOT TARGET ${_paimon_mapped_dep} AND _paimon_dep MATCHES 
"_shared$")
+                string(REGEX REPLACE "_shared$" "_static" _paimon_mapped_dep
+                                     "${_paimon_dep}")
+            endif()
+            if(TARGET ${_paimon_mapped_dep})
+                get_target_property(_paimon_is_internal_lib 
${_paimon_mapped_dep}
+                                    PAIMON_INTERNAL_LIBRARY)
+                list(APPEND _paimon_objlib_deps ${_paimon_mapped_dep})
+                if(NOT _paimon_is_internal_lib)
+                    list(APPEND _paimon_objlib_link_deps ${_paimon_mapped_dep})
+                endif()
+                unset(_paimon_is_internal_lib)
+            endif()
+            unset(_paimon_mapped_dep)
+        endforeach()
+        if(_paimon_objlib_deps)
+            add_dependencies(${LIB_NAME}_objlib ${_paimon_objlib_deps})
+        endif()
+        if(_paimon_objlib_link_deps)
+            target_link_libraries(${LIB_NAME}_objlib PRIVATE 
${_paimon_objlib_link_deps})
+        endif()
+        unset(_paimon_objlib_deps)
+        unset(_paimon_objlib_link_deps)
+        unset(_paimon_dep)
+    endif()
+    set(LIB_DEPS $<TARGET_OBJECTS:${LIB_NAME}_objlib>)
+    set(LIB_INCLUDES)
+    set(EXTRA_DEPS)
+
+    if(ARG_EXTRA_INCLUDES)
+        target_include_directories(${LIB_NAME}_objlib SYSTEM PUBLIC 
${ARG_EXTRA_INCLUDES})
+    endif()
+    if(ARG_PRIVATE_INCLUDES)
+        target_include_directories(${LIB_NAME}_objlib PRIVATE 
${ARG_PRIVATE_INCLUDES})
+    endif()
+
+    set(RUNTIME_INSTALL_DIR bin)
+
+    if(BUILD_SHARED)
+        add_library(${LIB_NAME}_shared SHARED ${LIB_DEPS})
+        if(EXTRA_DEPS)
+            add_dependencies(${LIB_NAME}_shared ${EXTRA_DEPS})
+        endif()
+
+        if(LIB_INCLUDES)
+            target_include_directories(${LIB_NAME}_shared SYSTEM
+                                       PUBLIC ${ARG_EXTRA_INCLUDES})
+        endif()
+
+        if(ARG_PRIVATE_INCLUDES)
+            target_include_directories(${LIB_NAME}_shared PRIVATE 
${ARG_PRIVATE_INCLUDES})
+        endif()
+
+        set_property(TARGET ${LIB_NAME}_shared PROPERTY 
PAIMON_INTERNAL_LIBRARY TRUE)
+        set_target_properties(${LIB_NAME}_shared
+                              PROPERTIES LIBRARY_OUTPUT_DIRECTORY
+                                         "${CMAKE_LIBRARY_OUTPUT_DIRECTORY}"
+                                         RUNTIME_OUTPUT_DIRECTORY
+                                         "${CMAKE_RUNTIME_OUTPUT_DIRECTORY}"
+                                         PDB_OUTPUT_DIRECTORY
+                                         "${CMAKE_LIBRARY_OUTPUT_DIRECTORY}"
+                                         LINK_FLAGS "${ARG_SHARED_LINK_FLAGS}"
+                                         OUTPUT_NAME ${LIB_NAME})
+
+        target_link_libraries(${LIB_NAME}_shared
+                              LINK_PUBLIC
+                              "$<BUILD_INTERFACE:${ARG_SHARED_LINK_LIBS}>"
+                              
"$<INSTALL_INTERFACE:${ARG_SHARED_INSTALL_INTERFACE_LIBS}>"
+                              LINK_PRIVATE
+                              "$<BUILD_INTERFACE:${ARG_STATIC_LINK_LIBS}>"
+                              ${ARG_SHARED_PRIVATE_LINK_LIBS})
+
+        target_link_libraries(${LIB_NAME}_shared
+                              PUBLIC 
"$<BUILD_INTERFACE:paimon_sanitizer_flags>")
+
+        target_link_options(${LIB_NAME}_shared
+                            PRIVATE
+                            -Wl,--exclude-libs,ALL
+                            -Wl,-Bsymbolic
+                            -Wl,-z,defs
+                            -Wl,--gc-sections)

Review Comment:
   These linker options are ELF/GNU-ld specific and will fail on MSVC and 
typically on Apple linkers as well. Guard them by platform/linker (e.g., 
`if(NOT MSVC AND NOT APPLE AND CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang")` or a 
more direct linker-id check), and consider only applying `--gc-sections` when 
corresponding compile flags (e.g., `-ffunction-sections -fdata-sections`) are 
in use.
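
   A sketch of the guard, reusing the condition suggested above. It is an 
approximation rather than a true linker check: `GNU|Clang` also matches 
`AppleClang`, which is why the `NOT APPLE` test is needed, and a GNU toolchain 
targeting Windows would still pass the condition.

   ```cmake
   # Apply the GNU-ld/ELF-specific options only where they are likely accepted.
   if(NOT MSVC AND NOT APPLE AND CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang")
       target_link_options(${LIB_NAME}_shared
                           PRIVATE
                           -Wl,--exclude-libs,ALL
                           -Wl,-Bsymbolic
                           -Wl,-z,defs
                           -Wl,--gc-sections)
   endif()
   ```

   If the project can require CMake 3.18 or newer, `check_linker_flag` from 
the `CheckLinkerFlag` module would allow probing each option individually 
instead of inferring support from the platform.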
   



##########
cmake_modules/san-config.cmake:
##########
@@ -0,0 +1,36 @@
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License. See accompanying LICENSE file.
+
+add_library(paimon_sanitizer_flags INTERFACE)
+
+if(PAIMON_USE_ASAN)

Review Comment:
   `DefineOptions.cmake` introduces `PAIMON_USE_TSAN`, but this module only 
implements ASAN/UBSAN. If TSAN is intended to be supported, add a corresponding 
`if(PAIMON_USE_TSAN)` block (and ensure incompatible combinations like 
ASAN+TSAN are handled consistently).
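
   A sketch of what such a block could look like. The existing ASAN/UBSAN 
blocks are not visible in this hunk, so attaching the flags to 
`paimon_sanitizer_flags` with `target_compile_options`/`target_link_options` 
is an assumption about how they are written:

   ```cmake
   if(PAIMON_USE_TSAN)
       # AddressSanitizer and ThreadSanitizer cannot be enabled together.
       if(PAIMON_USE_ASAN)
           message(FATAL_ERROR
                   "PAIMON_USE_ASAN and PAIMON_USE_TSAN cannot be combined")
       endif()
       target_compile_options(paimon_sanitizer_flags
                              INTERFACE -fsanitize=thread -fno-omit-frame-pointer)
       target_link_options(paimon_sanitizer_flags INTERFACE -fsanitize=thread)
   endif()
   ```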



##########
cmake_modules/SetupCxxFlags.cmake:
##########
@@ -0,0 +1,434 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Check if the target architecture and compiler supports some special
+# instruction sets that would boost performance.
+include(CheckCXXCompilerFlag)
+# Get cpu architecture
+
+message(STATUS "System processor: ${CMAKE_SYSTEM_PROCESSOR}")
+
+# Support C11
+if(NOT DEFINED CMAKE_C_STANDARD)
+    set(CMAKE_C_STANDARD 11)
+endif()
+
+# This ensures that things like c++11 get passed correctly
+if(NOT DEFINED CMAKE_CXX_STANDARD)
+    set(CMAKE_CXX_STANDARD 17)
+endif()
+
+# We require a C++11 compliant compiler
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Build with -fPIC so that can static link our libraries into other people's
+# shared libraries
+set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+
+string(TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE)
+
+set(UNKNOWN_COMPILER_MESSAGE
+    "Unknown compiler: ${CMAKE_CXX_COMPILER_VERSION} 
${CMAKE_CXX_COMPILER_VERSION}")
+
+# Defaults BUILD_WARNING_LEVEL to `CHECKIN`, unless CMAKE_BUILD_TYPE is
+# `RELEASE`, then it will default to `PRODUCTION`. The goal of defaulting to
+# `CHECKIN` is to avoid friction with long response time from CI.
+if(NOT BUILD_WARNING_LEVEL)
+    if("${CMAKE_BUILD_TYPE}" STREQUAL "RELEASE")
+        set(BUILD_WARNING_LEVEL PRODUCTION)
+    else()
+        set(BUILD_WARNING_LEVEL CHECKIN)
+    endif()
+endif(NOT BUILD_WARNING_LEVEL)
+string(TOUPPER ${BUILD_WARNING_LEVEL} BUILD_WARNING_LEVEL)
+
+message(STATUS "Paimon build warning level: ${BUILD_WARNING_LEVEL}")
+
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")
+
+if("${BUILD_WARNING_LEVEL}" STREQUAL "CHECKIN")
+    # Pre-checkin builds
+    if(MSVC)
+        # 
https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warnings-by-compiler-version
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /W3")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4365")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4267")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4838")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR 
CMAKE_CXX_COMPILER_ID STREQUAL
+                                                          "Clang")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wextra")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wdocumentation")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wglobal-constructors")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-missing-braces")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-parameter")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unknown-warning-option")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} 
-Wno-constant-logical-operand")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} 
-Wno-deprecated-declarations")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-conversion")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} 
-Wno-deprecated-declarations")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-sign-conversion")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-variable")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+elseif("${BUILD_WARNING_LEVEL}" STREQUAL "EVERYTHING")
+    # Pedantic builds for fixing warnings
+    if(MSVC)
+        string(REPLACE "/W3" "" CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS}")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /Wall")
+        # 
https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level
+        # /wdnnnn disables a warning where "nnnn" is a warning number
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR 
CMAKE_CXX_COMPILER_ID STREQUAL
+                                                          "Clang")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Weverything")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-c++98-compat")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-c++98-compat-pedantic")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wpedantic")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wextra")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-parameter")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+else()
+    # Production builds (warning are not treated as errors)
+    if(MSVC)
+        # 
https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level
+        # TODO: Enable /Wall and disable individual warnings until build 
compiles without errors
+        # /wdnnnn disables a warning where "nnnn" is a warning number
+        string(REPLACE "/W3" "" CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS}")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /W3")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang"
+           OR CMAKE_CXX_COMPILER_ID STREQUAL "Clang"
+           OR CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+endif()
+
+if(MSVC)
+    # Disable annoying "performance warning" about int-to-bool conversion
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4800")
+
+    # Disable unchecked iterator warnings, equivalent to 
/D_SCL_SECURE_NO_WARNINGS
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4996")
+
+    # Disable "switch statement contains 'default' but no 'case' labels" 
warning
+    # (required for protobuf, see 
https://github.com/protocolbuffers/protobuf/issues/6885)
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4065")
+elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_EQUAL "7.0" OR 
CMAKE_CXX_COMPILER_VERSION
+                                                         VERSION_GREATER "7.0")
+        # Without this, gcc >= 7 warns related to changes in C++17
+        set(CXX_ONLY_FLAGS
+            "${CXX_ONLY_FLAGS} -Wno-noexcept-type -Wno-stringop-overflow 
-Wno-free-nonheap-object"
+        )
+    endif()
+
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER "4.9")
+        # Add colors when paired with ninja
+        set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fdiagnostics-color=always")
+    endif()
+
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_LESS "6.0")
+        # Work around https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43407
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-attributes")
+    endif()
+
+    if(CMAKE_UNITY_BUILD)
+        # Work around issue similar to 
https://bugs.webkit.org/show_bug.cgi?id=176869
+        set(CXX_ONLY_FLAGS "${CXX_ONLY_FLAGS} -Wno-subobject-linkage")
+    endif()
+
+    if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL "8.0"
+       AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS "9.0")
+        # When using the C++17 filesystem library (<filesystem>) with GCC 8, 
you do need to explicitly link stdc++fs.
+        link_libraries(stdc++fs)
+    endif()
+
+elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR CMAKE_CXX_COMPILER_ID 
STREQUAL
+                                                      "Clang")
+    # Clang options for all builds
+
+    # Using Clang with ccache causes a bunch of spurious warnings that are
+    # purportedly fixed in the next version of ccache. See the following for 
details:
+    #
+    #   http://petereisentraut.blogspot.com/2011/05/ccache-and-clang.html
+    #   
http://petereisentraut.blogspot.com/2011/09/ccache-and-clang-part-2.html
+    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Qunused-arguments")
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Qunused-arguments")
+
+    # Avoid clang error when an unknown warning flag is passed
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unknown-warning-option")
+    # Add colors when paired with ninja
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fcolor-diagnostics")
+
+    # Don't complain about optimization passes that were not possible
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-pass-failed")
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-deprecated-declarations")
+
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang")
+        # Depending on the default OSX_DEPLOYMENT_TARGET (< 10.9), libstdc++ 
may be
+        # the default standard library which does not support C++11. libc++ is 
the
+        # default from 10.9 onward.
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -stdlib=libc++")
+    endif()
+endif()
+
+# if build warning flags is set, add to CXX_COMMON_FLAGS
+if(BUILD_WARNING_FLAGS)
+    # Use BUILD_WARNING_FLAGS with BUILD_WARNING_LEVEL=everything to disable
+    # warnings (use with Clang's -Weverything flag to find potential errors)
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${BUILD_WARNING_FLAGS}")
+endif(BUILD_WARNING_FLAGS)
+
+# Only enable additional instruction sets if they are supported
+if(PAIMON_CPU_FLAG STREQUAL "x86")
+    if(PAIMON_SIMD_LEVEL STREQUAL "AVX512")
+        if(NOT CXX_SUPPORTS_AVX512)
+            message(FATAL_ERROR "AVX512 required but compiler doesn't support 
it.")
+        endif()
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_AVX512_FLAG}")
+        add_definitions(-DPAIMON_HAVE_AVX512 -DPAIMON_HAVE_AVX2 
-DPAIMON_HAVE_BMI2
+                        -DPAIMON_HAVE_SSE4_2)
+    elseif(PAIMON_SIMD_LEVEL STREQUAL "AVX2")
+        if(NOT CXX_SUPPORTS_AVX2)
+            message(FATAL_ERROR "AVX2 required but compiler doesn't support 
it.")
+        endif()
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_AVX2_FLAG}")
+        add_definitions(-DPAIMON_HAVE_AVX2 -DPAIMON_HAVE_BMI2 
-DPAIMON_HAVE_SSE4_2)
+    elseif(PAIMON_SIMD_LEVEL STREQUAL "SSE4_2")
+        if(NOT CXX_SUPPORTS_SSE4_2)
+            message(FATAL_ERROR "SSE4.2 required but compiler doesn't support 
it.")
+        endif()
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_SSE4_2_FLAG}")
+        add_definitions(-DPAIMON_HAVE_SSE4_2)
+    endif()
+endif()
+
+if(PAIMON_CPU_FLAG STREQUAL "ppc")
+    if(CXX_SUPPORTS_ALTIVEC AND PAIMON_ALTIVEC)
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_ALTIVEC_FLAG}")
+    endif()
+endif()
+
+if(PAIMON_CPU_FLAG STREQUAL "armv8")
+    if(NOT CXX_SUPPORTS_ARMV8_ARCH)
+        message(FATAL_ERROR "Unsupported arch flag: 
${PAIMON_ARMV8_ARCH_FLAG}.")
+    endif()
+    if(PAIMON_ARMV8_ARCH_FLAG MATCHES "native")
+        message(FATAL_ERROR "native arch not allowed, please specify arch 
explicitly.")
+    endif()
+    set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${PAIMON_ARMV8_ARCH_FLAG}")
+
+    add_definitions(-DPAIMON_HAVE_NEON)
+
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU" AND CMAKE_CXX_COMPILER_VERSION 
VERSION_LESS
+                                                "5.4")
+        message(WARNING "Disable Armv8 CRC and Crypto as compiler doesn't 
support them well."
+        )
+    else()
+        if(PAIMON_ARMV8_ARCH_FLAG MATCHES "\\+crypto")
+            add_definitions(-DPAIMON_HAVE_ARMV8_CRYPTO)
+        endif()
+        # armv8.1+ implies crc support
+        if(PAIMON_ARMV8_ARCH_FLAG MATCHES "armv8\\.[1-9]|\\+crc")
+            add_definitions(-DPAIMON_HAVE_ARMV8_CRC)
+        endif()
+    endif()
+endif()
+
+# ----------------------------------------------------------------------
+# Setup Gold linker, if available. Code originally from Apache Kudu
+
+# Interrogates the linker version via the C++ compiler to determine whether
+# we're using the gold linker, and if so, extracts its version.
+#
+# If the gold linker is being used, sets GOLD_VERSION in the parent scope with
+# the extracted version.
+#
+# Any additional arguments are passed verbatim into the C++ compiler invocation.
+function(GET_GOLD_VERSION)
+    # The gold linker is only for ELF binaries, which macOS doesn't use.
+    execute_process(COMMAND ${CMAKE_CXX_COMPILER} "-Wl,--version" ${ARGN}
+                    ERROR_QUIET
+                    OUTPUT_VARIABLE LINKER_OUTPUT)
+    # We're expecting LINKER_OUTPUT to look like one of these:
+    #   GNU gold (version 2.24) 1.11
+    #   GNU gold (GNU Binutils for Ubuntu 2.30) 1.15
+    if(LINKER_OUTPUT MATCHES "GNU gold")
+        string(REGEX MATCH "GNU gold \\([^\\)]*\\) (([0-9]+\\.?)+)" _ "${LINKER_OUTPUT}")
+        if(NOT CMAKE_MATCH_1)
+            message(SEND_ERROR "Could not extract GNU gold version. "
+                               "Linker version output: ${LINKER_OUTPUT}")
+        endif()
+        set(GOLD_VERSION
+            "${CMAKE_MATCH_1}"
+            PARENT_SCOPE)
+    endif()
+endfunction()
+
+# Is the compiler hard-wired to use the gold linker?
+if(NOT WIN32 AND NOT APPLE)
+    get_gold_version()
+    if(GOLD_VERSION)
+        set(MUST_USE_GOLD 1)
+    elseif(PAIMON_USE_LD_GOLD)
+        # Can the compiler optionally enable the gold linker?
+        get_gold_version("-fuse-ld=gold")
+
+        # We can't use the gold linker if it's inside devtoolset because the compiler
+        # won't find it when invoked directly from make/ninja (which is typically
+        # done outside devtoolset).
+        execute_process(COMMAND which ld.gold
+                        OUTPUT_VARIABLE GOLD_LOCATION
+                        OUTPUT_STRIP_TRAILING_WHITESPACE ERROR_QUIET)
+        if("${GOLD_LOCATION}" MATCHES "^/opt/rh/devtoolset")
+            message("Skipping optional gold linker (version ${GOLD_VERSION}) because "
+                    "it's in devtoolset")
+            set(GOLD_VERSION)
+        endif()
+    endif()
+
+    if(GOLD_VERSION)
+        # Older versions of the gold linker are vulnerable to a bug [1] which
+        # prevents weak symbols from being overridden properly. This leads to
+        # omitting of dependencies like tcmalloc (used in Kudu, where this
+        # workaround was written originally)
+        #
+        # How we handle this situation depends on other factors:
+        # - If gold is optional, we won't use it.
+        # - If gold is required, we'll either:
+        #   - Raise an error in RELEASE builds (we shouldn't release such a product), or
+        #   - Drop tcmalloc in all other builds.
+        #
+        # 1. https://sourceware.org/bugzilla/show_bug.cgi?id=16979.
+        if("${GOLD_VERSION}" VERSION_LESS "1.12")
+            set(PAIMON_BUGGY_GOLD 1)
+        endif()
+        if(MUST_USE_GOLD)
+            message("Using hard-wired gold linker (version ${GOLD_VERSION})")
+            if(PAIMON_BUGGY_GOLD)
+                if("${PAIMON_LINK}" STREQUAL "d" AND "${CMAKE_BUILD_TYPE}" STREQUAL
+                                                     "RELEASE")
+                    message(SEND_ERROR "Configured to use buggy gold with dynamic linking "
+                                       "in a RELEASE build")
+                endif()
+            endif()
+        elseif(NOT PAIMON_BUGGY_GOLD)
+            # The Gold linker must be manually enabled.
+            set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fuse-ld=gold")
+            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fuse-ld=gold")
+            message("Using optional gold linker (version ${GOLD_VERSION})")
+        else()
+            message("Optional gold linker is buggy, using ld linker instead")
+        endif()
+    else()
+        message("Using ld linker")
+    endif()
+endif()
+
+# compiler flags for different build types (run 'cmake -DCMAKE_BUILD_TYPE=<type> .')
+# For all builds:
+# For CMAKE_BUILD_TYPE=Debug
+#   -ggdb: Enable gdb debugging
+# For CMAKE_BUILD_TYPE=FastDebug
+#   Same as DEBUG, except with some optimizations on.
+# For CMAKE_BUILD_TYPE=Release
+#   -O3: Enable all compiler optimizations
+#   Debug symbols are stripped for reduced binary size. Add
+#   -DPAIMON_CXXFLAGS="-g" to add them
+if(NOT MSVC)
+    if(PAIMON_GGDB_DEBUG)
+        set(PAIMON_DEBUG_SYMBOL_TYPE "gdb")
+        set(C_FLAGS_DEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O0")
+        set(C_FLAGS_FASTDEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O1")
+        set(CXX_FLAGS_DEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O0")
+        set(CXX_FLAGS_FASTDEBUG "-g${PAIMON_DEBUG_SYMBOL_TYPE} -O1")
+    else()
+        set(C_FLAGS_DEBUG "-g -O0")
+        set(C_FLAGS_FASTDEBUG "-g -O1")
+        set(CXX_FLAGS_DEBUG "-g -O0")
+        set(CXX_FLAGS_FASTDEBUG "-g -O1")
+    endif()
+
+    set(C_FLAGS_RELEASE "-O3 -DNDEBUG")
+    set(CXX_FLAGS_RELEASE "-O3 -DNDEBUG")
+endif()
+
+set(C_FLAGS_PROFILE_GEN "${CXX_FLAGS_RELEASE} -fprofile-generate")
+set(C_FLAGS_PROFILE_BUILD "${CXX_FLAGS_RELEASE} -fprofile-use")
+set(CXX_FLAGS_PROFILE_GEN "${CXX_FLAGS_RELEASE} -fprofile-generate")
+set(CXX_FLAGS_PROFILE_BUILD "${CXX_FLAGS_RELEASE} -fprofile-use")
+
+# Set compile flags based on the build type.
+message("Configured for ${CMAKE_BUILD_TYPE} build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})"
+)
+if("${CMAKE_BUILD_TYPE}" STREQUAL "DEBUG")
+    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${C_FLAGS_DEBUG}")
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CXX_FLAGS_DEBUG}")
+elseif("${CMAKE_BUILD_TYPE}" STREQUAL "RELWITHDEBINFO")
+
+elseif("${CMAKE_BUILD_TYPE}" STREQUAL "FASTDEBUG")

Review Comment:
   The `RELWITHDEBINFO` branch is empty, so that build type won’t receive any 
flags from this module (unlike DEBUG/FASTDEBUG/RELEASE/PROFILE_*). Add 
appropriate flags (typically optimization + debug symbols) or explicitly fall 
back to an existing configuration.
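
   A minimal sketch of one way to fill the branch, assuming the usual
   "optimization plus debug symbols" pairing. `C_FLAGS_RELWITHDEBINFO` and
   `CXX_FLAGS_RELWITHDEBINFO` are hypothetical names that mirror the existing
   `C_FLAGS_DEBUG`/`CXX_FLAGS_DEBUG` variables, and the `-O2 -g -DNDEBUG` value
   is an assumption rather than something taken from this PR:

   ```cmake
   # Hypothetical variables, named after the existing *_FLAGS_DEBUG ones.
   # "-O2 -g -DNDEBUG" is an assumed choice: optimized code with debug symbols.
   if(NOT MSVC)
       set(C_FLAGS_RELWITHDEBINFO "-O2 -g -DNDEBUG")
       set(CXX_FLAGS_RELWITHDEBINFO "-O2 -g -DNDEBUG")
   endif()

   # In the module this would be the body of the existing elseif() branch.
   if("${CMAKE_BUILD_TYPE}" STREQUAL "RELWITHDEBINFO")
       set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${C_FLAGS_RELWITHDEBINFO}")
       set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CXX_FLAGS_RELWITHDEBINFO}")
   endif()
   ```

   If the intent is instead to rely on CMake's own per-configuration flags for
   this build type, a short comment in the empty branch saying so would make
   that explicit.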



##########
cmake_modules/SetupCxxFlags.cmake:
##########
@@ -0,0 +1,434 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Check if the target architecture and compiler supports some special
+# instruction sets that would boost performance.
+include(CheckCXXCompilerFlag)
+# Get cpu architecture
+
+message(STATUS "System processor: ${CMAKE_SYSTEM_PROCESSOR}")
+
+# Support C11
+if(NOT DEFINED CMAKE_C_STANDARD)
+    set(CMAKE_C_STANDARD 11)
+endif()
+
+# This ensures that things like c++11 get passed correctly
+if(NOT DEFINED CMAKE_CXX_STANDARD)
+    set(CMAKE_CXX_STANDARD 17)
+endif()
+
+# We require a C++11 compliant compiler
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Build with -fPIC so that can static link our libraries into other people's
+# shared libraries
+set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+
+string(TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE)
+
+set(UNKNOWN_COMPILER_MESSAGE
+    "Unknown compiler: ${CMAKE_CXX_COMPILER_VERSION} ${CMAKE_CXX_COMPILER_VERSION}")

Review Comment:
   The “Unknown compiler” message repeats `${CMAKE_CXX_COMPILER_VERSION}` twice 
and omits the compiler ID, which makes the failure harder to diagnose. Consider 
including `${CMAKE_CXX_COMPILER_ID}` (and/or `${CMAKE_CXX_COMPILER}`) and only 
listing the version once.
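
   One possible wording, shown as a sketch; the exact layout of the message is
   a suggestion, and only standard CMake variables are used:

   ```cmake
   # Report the compiler ID first, the version once, and the compiler path
   # in parentheses so an unsupported toolchain is easy to identify.
   set(UNKNOWN_COMPILER_MESSAGE
       "Unknown compiler: ${CMAKE_CXX_COMPILER_ID} ${CMAKE_CXX_COMPILER_VERSION} (${CMAKE_CXX_COMPILER})"
   )
   ```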
   



##########
cmake_modules/DefineOptions.cmake:
##########
@@ -0,0 +1,310 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake
+
+macro(set_option_category name)
+    set(PAIMON_OPTION_CATEGORY ${name})
+    list(APPEND "PAIMON_OPTION_CATEGORIES" ${name})
+endmacro()
+
+function(check_description_length name description)
+    foreach(description_line ${description})
+        string(LENGTH ${description_line} line_length)
+        if(${line_length} GREATER 80)
+            message(FATAL_ERROR "description for ${name} contained a\n\
+        line ${line_length} characters long!\n\
+        (max is 80). Split it into more lines with semicolons")
+        endif()
+    endforeach()
+endfunction()
+
+function(list_join lst glue out)
+    if("${${lst}}" STREQUAL "")
+        set(${out}
+            ""
+            PARENT_SCOPE)
+        return()
+    endif()
+
+    list(GET ${lst} 0 joined)
+    list(REMOVE_AT ${lst} 0)
+    foreach(item ${${lst}})
+        set(joined "${joined}${glue}${item}")
+    endforeach()
+    set(${out}
+        ${joined}
+        PARENT_SCOPE)
+endfunction()
+
+macro(define_option name description default)
+    check_description_length(${name} ${description})
+    list_join(description "\n" multiline_description)
+
+    option(${name} "${multiline_description}" ${default})
+
+    list(APPEND "PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES" ${name})
+    set("${name}_OPTION_DESCRIPTION" ${description})
+    set("${name}_OPTION_DEFAULT" ${default})
+    set("${name}_OPTION_TYPE" "bool")
+endmacro()
+
+macro(define_option_string name description default)
+    check_description_length(${name} ${description})
+    list_join(description "\n" multiline_description)
+
+    set(${name}
+        ${default}
+        CACHE STRING "${multiline_description}")
+
+    list(APPEND "PAIMON_${PAIMON_OPTION_CATEGORY}_OPTION_NAMES" ${name})
+    set("${name}_OPTION_DESCRIPTION" ${description})
+    set("${name}_OPTION_DEFAULT" "\"${default}\"")
+    set("${name}_OPTION_TYPE" "string")
+    set("${name}_OPTION_POSSIBLE_VALUES" ${ARGN})
+
+    list_join("${name}_OPTION_POSSIBLE_VALUES" "|" "${name}_OPTION_ENUM")
+    if(NOT ("${${name}_OPTION_ENUM}" STREQUAL ""))
+        set_property(CACHE ${name} PROPERTY STRINGS "${name}_OPTION_POSSIBLE_VALUES")
+    endif()
+endmacro()
+
+# Top level cmake dir
+if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
+    #----------------------------------------------------------------------
+    set_option_category("Compile and link")
+
+    define_option_string(PAIMON_CXXFLAGS "Compiler flags to append when compiling Paimon"
+                         "")
+
+    define_option(PAIMON_BUILD_STATIC "Build static libraries" ON)
+
+    define_option(PAIMON_BUILD_SHARED "Build shared libraries" ON)
+
+    define_option(PAIMON_USE_CCACHE "Use ccache when compiling (if available)" ON)
+
+    #----------------------------------------------------------------------
+    set_option_category("Test")
+
+    define_option(PAIMON_BUILD_TESTS "Build the Paimon googletest unit tests" OFF)
+
+    if(PAIMON_BUILD_SHARED)
+        set(PAIMON_TEST_LINKAGE_DEFAULT "shared")
+    else()
+        set(PAIMON_TEST_LINKAGE_DEFAULT "static")
+    endif()
+
+    #----------------------------------------------------------------------
+    set_option_category("Lint")
+
+    define_option(PAIMON_VERBOSE_LINT
+                  "If off, 'quiet' flags will be passed to linting tools" OFF)
+
+    define_option(PAIMON_LINT_GIT_DIFF_MODE
+                  "If on, only git-diff files will be passed to linting tools" ON)
+
+    define_option_string(PAIMON_LINT_GIT_TARGET_COMMIT
+                         "target commit/branch for comparison in git diff" "origin/main")
+
+    define_option(PAIMON_GENERATE_COVERAGE "Build with C++ code coverage enabled" OFF)
+
+    #----------------------------------------------------------------------
+    set_option_category("Checks")
+
+    define_option(PAIMON_USE_ASAN "Enable Address Sanitizer checks" OFF)
+    define_option(PAIMON_USE_TSAN "Enable Thread Sanitizer checks" OFF)
+    define_option(PAIMON_USE_UBSAN "Enable Undefined Behaviour Sanitizer checks" OFF)
+
+    #----------------------------------------------------------------------
+    set_option_category("Advanced developer")
+
+    define_option(PAIMON_EXTRA_ERROR_CONTEXT
+                  "Compile with extra error context (line numbers, code)" OFF)
+
+    option(PAIMON_BUILD_CONFIG_SUMMARY_JSON
+           "Summarize build configuration in a JSON file" ON)
+
+    #----------------------------------------------------------------------
+    set_option_category("Dependencies")
+
+    define_option_string(PAIMON_DEPENDENCY_SOURCE
+                         "Default third-party dependency source"
+                         "AUTO"
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+
+    define_option_string(PAIMON_PACKAGE_PREFIX
+                         "Default prefix used to find third-party packages" "")
+
+    define_option(PAIMON_DEPENDENCY_USE_SHARED
+                  "Prefer shared libraries for system third-party packages" OFF)
+
+    define_option_string(Arrow_SOURCE
+                         "Dependency source for Apache Arrow; SYSTEM is unsupported"
+                         ""
+                         AUTO
+                         BUNDLED)
+    define_option_string(zstd_SOURCE
+                         "Dependency source for zstd"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(Snappy_SOURCE
+                         "Dependency source for Snappy"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(LZ4_SOURCE
+                         "Dependency source for LZ4"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(ZLIB_SOURCE
+                         "Dependency source for ZLIB"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(RE2_SOURCE
+                         "Dependency source for RE2"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(Protobuf_SOURCE
+                         "Dependency source for Protobuf"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(ORC_SOURCE
+                         "Dependency source for Apache ORC; SYSTEM is unsupported"
+                         ""
+                         AUTO
+                         BUNDLED)
+    define_option_string(fmt_SOURCE
+                         "Dependency source for fmt"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(RapidJSON_SOURCE
+                         "Dependency source for RapidJSON"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(TBB_SOURCE
+                         "Dependency source for TBB"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(glog_SOURCE
+                         "Dependency source for glog"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(Avro_SOURCE
+                         "Dependency source for Avro C++"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+    define_option_string(GTest_SOURCE
+                         "Dependency source for GoogleTest"
+                         ""
+                         AUTO
+                         BUNDLED
+                         SYSTEM)
+endif()
+
+macro(validate_config)
+    foreach(category ${PAIMON_OPTION_CATEGORIES})
+        set(option_names ${PAIMON_${category}_OPTION_NAMES})
+
+        foreach(name ${option_names})
+            set(possible_values ${${name}_OPTION_POSSIBLE_VALUES})
+            set(value "${${name}}")
+            if(possible_values)
+                if(NOT CMAKE_VERSION VERSION_LESS "3.3")
+                    if(NOT "${value}" IN_LIST possible_values)
+                        message(FATAL_ERROR "Configuration option ${name} got invalid value '${value}'. "
+                                            "Allowed values: ${${name}_OPTION_ENUM}.")
+                    endif()
+                endif()
+            endif()
+        endforeach()
+
+    endforeach()
+endmacro()
+
+macro(config_summary_message)
+    message(STATUS "---------------------------------------------------------------------"
+    )
+    message(STATUS "Paimon version:                                 ${PAIMON_VERSION}")
+    message(STATUS)
+    message(STATUS "Build configuration summary:")
+
+    message(STATUS "  Generator: ${CMAKE_GENERATOR}")
+    message(STATUS "  Build type: ${CMAKE_BUILD_TYPE}")
+    message(STATUS "  Source directory: ${CMAKE_CURRENT_SOURCE_DIR}")
+    message(STATUS "  Install prefix: ${CMAKE_INSTALL_PREFIX}")
+    message(STATUS "  Install libdir: ${CMAKE_INSTALL_LIBDIR}")
+
+    if(${CMAKE_EXPORT_COMPILE_COMMANDS})

Review Comment:
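   `if(${CMAKE_EXPORT_COMPILE_COMMANDS})` expands the variable before `if()`
sees it. When the variable is empty or undefined the condition collapses to 
`if()`, which evaluates to false, and a value that happens to name another 
variable is dereferenced a second time. Use 
`if(CMAKE_EXPORT_COMPILE_COMMANDS)` (or `if(DEFINED 
CMAKE_EXPORT_COMPILE_COMMANDS AND CMAKE_EXPORT_COMPILE_COMMANDS)`) instead.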
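
   A minimal sketch of the suggested form. The `message()` body is
   illustrative only, since the contents of the branch are not shown in this
   hunk:

   ```cmake
   # Pass the variable name and let if() evaluate it; an unset or empty
   # variable is simply false, and the value is not expanded a second time.
   if(CMAKE_EXPORT_COMPILE_COMMANDS)
       # Illustrative output line, not taken from the PR.
       message(STATUS "  Compile commands: ${CMAKE_BINARY_DIR}/compile_commands.json")
   endif()
   ```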
   



##########
cmake_modules/SetupCxxFlags.cmake:
##########
@@ -0,0 +1,434 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Check if the target architecture and compiler supports some special
+# instruction sets that would boost performance.
+include(CheckCXXCompilerFlag)
+# Get cpu architecture
+
+message(STATUS "System processor: ${CMAKE_SYSTEM_PROCESSOR}")
+
+# Support C11
+if(NOT DEFINED CMAKE_C_STANDARD)
+    set(CMAKE_C_STANDARD 11)
+endif()
+
+# This ensures that things like c++11 get passed correctly
+if(NOT DEFINED CMAKE_CXX_STANDARD)
+    set(CMAKE_CXX_STANDARD 17)
+endif()
+
+# We require a C++11 compliant compiler
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
+
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Build with -fPIC so that can static link our libraries into other people's
+# shared libraries
+set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+
+string(TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE)
+
+set(UNKNOWN_COMPILER_MESSAGE
+    "Unknown compiler: ${CMAKE_CXX_COMPILER_VERSION} ${CMAKE_CXX_COMPILER_VERSION}")
+
+# Defaults BUILD_WARNING_LEVEL to `CHECKIN`, unless CMAKE_BUILD_TYPE is
+# `RELEASE`, then it will default to `PRODUCTION`. The goal of defaulting to
+# `CHECKIN` is to avoid friction with long response time from CI.
+if(NOT BUILD_WARNING_LEVEL)
+    if("${CMAKE_BUILD_TYPE}" STREQUAL "RELEASE")
+        set(BUILD_WARNING_LEVEL PRODUCTION)
+    else()
+        set(BUILD_WARNING_LEVEL CHECKIN)
+    endif()
+endif(NOT BUILD_WARNING_LEVEL)
+string(TOUPPER ${BUILD_WARNING_LEVEL} BUILD_WARNING_LEVEL)
+
+message(STATUS "Paimon build warning level: ${BUILD_WARNING_LEVEL}")
+
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")
+
+if("${BUILD_WARNING_LEVEL}" STREQUAL "CHECKIN")
+    # Pre-checkin builds
+    if(MSVC)
+        # https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warnings-by-compiler-version
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /W3")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4365")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4267")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /wd4838")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR CMAKE_CXX_COMPILER_ID STREQUAL
+                                                          "Clang")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wextra")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wdocumentation")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wglobal-constructors")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-missing-braces")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-parameter")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unknown-warning-option")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-constant-logical-operand")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-deprecated-declarations")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-conversion")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-deprecated-declarations")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-sign-conversion")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-variable")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+elseif("${BUILD_WARNING_LEVEL}" STREQUAL "EVERYTHING")
+    # Pedantic builds for fixing warnings
+    if(MSVC)
+        string(REPLACE "/W3" "" CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS}")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} /Wall")
+        # https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level
+        # /wdnnnn disables a warning where "nnnn" is a warning number
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang" OR CMAKE_CXX_COMPILER_ID STREQUAL
+                                                          "Clang")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Weverything")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-c++98-compat")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-c++98-compat-pedantic")
+    elseif(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wpedantic")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wextra")
+        set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wno-unused-parameter")
+    else()
+        message(FATAL_ERROR "${UNKNOWN_COMPILER_MESSAGE}")
+    endif()
+
+else()
+    # Production builds (warning are not treated as errors)

Review Comment:
   `-Werror` is applied unconditionally via `CXX_COMMON_FLAGS`, but the 
production-build comment states warnings are not treated as errors. Either 
remove/avoid `-Werror` for production builds (and only add it for 
CHECKIN/EVERYTHING), or update the comment/logic so behavior and documentation 
match.
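
   A sketch of the first option, assuming the three levels named in this
   module (`CHECKIN`, `EVERYTHING`, `PRODUCTION`) are the only ones in use:

   ```cmake
   # Treat warnings as errors only for the stricter levels, so the
   # production branch matches its "warnings are not errors" comment.
   if("${BUILD_WARNING_LEVEL}" STREQUAL "CHECKIN"
      OR "${BUILD_WARNING_LEVEL}" STREQUAL "EVERYTHING")
       set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Werror")
   endif()
   ```

   This would replace the unconditional `set(CXX_COMMON_FLAGS
   "${CXX_COMMON_FLAGS} -Werror")` that precedes the warning-level checks.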



##########
cmake_modules/BuildUtils.cmake:
##########
@@ -0,0 +1,367 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Borrowed the file from Apache Arrow:
+# https://github.com/apache/arrow/blob/main/cpp/cmake_modules/BuildUtils.cmake
+
+function(add_paimon_lib LIB_NAME)
+    set(options BUILD_SHARED BUILD_STATIC)
+    set(one_value_args SHARED_LINK_FLAGS)
+    set(multi_value_args
+        SOURCES
+        STATIC_LINK_LIBS
+        SHARED_LINK_LIBS
+        EXTRA_INCLUDES
+        PRIVATE_INCLUDES
+        DEPENDENCIES)
+    cmake_parse_arguments(ARG
+                          "${options}"
+                          "${one_value_args}"
+                          "${multi_value_args}"
+                          ${ARGN})
+    if(ARG_UNPARSED_ARGUMENTS)
+        message(SEND_ERROR "Error: unrecognized arguments: ${ARG_UNPARSED_ARGUMENTS}")
+    endif()
+
+    # Allow overriding PAIMON_BUILD_SHARED and PAIMON_BUILD_STATIC
+    if(ARG_BUILD_SHARED)
+        set(BUILD_SHARED ${ARG_BUILD_SHARED})
+    else()
+        set(BUILD_SHARED ${PAIMON_BUILD_SHARED})
+    endif()
+    if(ARG_BUILD_STATIC)
+        set(BUILD_STATIC ${ARG_BUILD_STATIC})
+    else()
+        set(BUILD_STATIC ${PAIMON_BUILD_STATIC})
+    endif()
+
+    # Generate a single "objlib" from all C++ modules and link
+    # that "objlib" into each library kind, to avoid compiling twice
+    add_library(${LIB_NAME}_objlib OBJECT ${ARG_SOURCES})
+    if(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
+        target_compile_options(${LIB_NAME}_objlib PRIVATE -Wno-global-constructors)
+    endif()
+    # Necessary to make static linking into other shared libraries work properly
+    set_property(TARGET ${LIB_NAME}_objlib PROPERTY POSITION_INDEPENDENT_CODE 1)
+    if(ARG_DEPENDENCIES)
+        # In static-only builds, some dependency names are still declared as
+        # *_shared. Map them to *_static when the shared target is unavailable.
+        set(_paimon_objlib_link_deps)
+        set(_paimon_objlib_deps)
+        foreach(_paimon_dep IN LISTS ARG_DEPENDENCIES)
+            set(_paimon_mapped_dep "${_paimon_dep}")
+            if(NOT TARGET ${_paimon_mapped_dep} AND _paimon_dep MATCHES "_shared$")
+                string(REGEX REPLACE "_shared$" "_static" _paimon_mapped_dep
+                                     "${_paimon_dep}")
+            endif()
+            if(TARGET ${_paimon_mapped_dep})
+                get_target_property(_paimon_is_internal_lib ${_paimon_mapped_dep}
+                                    PAIMON_INTERNAL_LIBRARY)
+                list(APPEND _paimon_objlib_deps ${_paimon_mapped_dep})
+                if(NOT _paimon_is_internal_lib)
+                    list(APPEND _paimon_objlib_link_deps ${_paimon_mapped_dep})
+                endif()
+                unset(_paimon_is_internal_lib)
+            endif()
+            unset(_paimon_mapped_dep)
+        endforeach()
+        if(_paimon_objlib_deps)
+            add_dependencies(${LIB_NAME}_objlib ${_paimon_objlib_deps})
+        endif()
+        if(_paimon_objlib_link_deps)
+            target_link_libraries(${LIB_NAME}_objlib PRIVATE ${_paimon_objlib_link_deps})
+        endif()
+        unset(_paimon_objlib_deps)
+        unset(_paimon_objlib_link_deps)
+        unset(_paimon_dep)
+    endif()
+    set(LIB_DEPS $<TARGET_OBJECTS:${LIB_NAME}_objlib>)
+    set(LIB_INCLUDES)
+    set(EXTRA_DEPS)
+
+    if(ARG_EXTRA_INCLUDES)
+        target_include_directories(${LIB_NAME}_objlib SYSTEM PUBLIC ${ARG_EXTRA_INCLUDES})
+    endif()
+    if(ARG_PRIVATE_INCLUDES)
+        target_include_directories(${LIB_NAME}_objlib PRIVATE ${ARG_PRIVATE_INCLUDES})
+    endif()
+
+    set(RUNTIME_INSTALL_DIR bin)
+
+    if(BUILD_SHARED)
+        add_library(${LIB_NAME}_shared SHARED ${LIB_DEPS})
+        if(EXTRA_DEPS)
+            add_dependencies(${LIB_NAME}_shared ${EXTRA_DEPS})
+        endif()
+
+        if(LIB_INCLUDES)
+            target_include_directories(${LIB_NAME}_shared SYSTEM
+                                       PUBLIC ${ARG_EXTRA_INCLUDES})
+        endif()
+
+        if(ARG_PRIVATE_INCLUDES)
+            target_include_directories(${LIB_NAME}_shared PRIVATE ${ARG_PRIVATE_INCLUDES})
+        endif()
+
+        set_property(TARGET ${LIB_NAME}_shared PROPERTY PAIMON_INTERNAL_LIBRARY TRUE)
+        set_target_properties(${LIB_NAME}_shared
+                              PROPERTIES LIBRARY_OUTPUT_DIRECTORY
+                                         "${CMAKE_LIBRARY_OUTPUT_DIRECTORY}"
+                                         RUNTIME_OUTPUT_DIRECTORY
+                                         "${CMAKE_RUNTIME_OUTPUT_DIRECTORY}"
+                                         PDB_OUTPUT_DIRECTORY
+                                         "${CMAKE_LIBRARY_OUTPUT_DIRECTORY}"
+                                         LINK_FLAGS "${ARG_SHARED_LINK_FLAGS}"
+                                         OUTPUT_NAME ${LIB_NAME})
+
+        target_link_libraries(${LIB_NAME}_shared
+                              LINK_PUBLIC
+                              "$<BUILD_INTERFACE:${ARG_SHARED_LINK_LIBS}>"
+                              "$<INSTALL_INTERFACE:${ARG_SHARED_INSTALL_INTERFACE_LIBS}>"
+                              LINK_PRIVATE
+                              "$<BUILD_INTERFACE:${ARG_STATIC_LINK_LIBS}>"
+                              ${ARG_SHARED_PRIVATE_LINK_LIBS})
+
+        target_link_libraries(${LIB_NAME}_shared
+                              PUBLIC "$<BUILD_INTERFACE:paimon_sanitizer_flags>")
+
+        target_link_options(${LIB_NAME}_shared
+                            PRIVATE
+                            -Wl,--exclude-libs,ALL
+                            -Wl,-Bsymbolic
+                            -Wl,-z,defs
+                            -Wl,--gc-sections)
+
+        install(TARGETS ${LIB_NAME}_shared ${INSTALL_IS_OPTIONAL}
+                EXPORT ${LIB_NAME}_targets
+                RUNTIME DESTINATION ${RUNTIME_INSTALL_DIR}
+                LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
+                ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
+                INCLUDES
+                DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
+    endif()
+
+    if(BUILD_STATIC)
+        add_library(${LIB_NAME}_static STATIC ${LIB_DEPS})
+        if(EXTRA_DEPS)
+            add_dependencies(${LIB_NAME}_static ${EXTRA_DEPS})
+        endif()
+
+        if(LIB_INCLUDES)
+            target_include_directories(${LIB_NAME}_static SYSTEM
+                                       PUBLIC ${ARG_EXTRA_INCLUDES})
+        endif()

Review Comment:
   `LIB_INCLUDES` is never set to a truthy value, so the include dirs are only 
attached to the OBJECT library. Usage requirements from an object library do 
not automatically propagate to consumers of the shared/static libraries built 
from `$<TARGET_OBJECTS:...>`, so downstream targets may miss required include 
paths. Fix by applying `ARG_EXTRA_INCLUDES` directly to 
`${LIB_NAME}_shared`/`${LIB_NAME}_static` (and/or removing the `LIB_INCLUDES` 
gate and using `if(ARG_EXTRA_INCLUDES)`).
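
   A minimal sketch of the second option, assuming `ARG_EXTRA_INCLUDES` is the parsed argument that carries the public include directories (as the hunk above suggests) and that the snippet sits inside the same function, where `LIB_NAME` is in scope. The shared-library branch is assumed to need the same change with `${LIB_NAME}_shared`:

   ```cmake
   # Sketch only: gate on the argument that is actually populated instead of
   # LIB_INCLUDES, so the include directories become a usage requirement of
   # the static library itself rather than of the OBJECT library alone.
   if(ARG_EXTRA_INCLUDES)
       target_include_directories(${LIB_NAME}_static SYSTEM
                                  PUBLIC ${ARG_EXTRA_INCLUDES})
   endif()
   ```

   Because the directories are declared `PUBLIC` on the consumable target, anything that links `${LIB_NAME}_static` inherits them without depending on the object library's properties.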



##########
cmake_modules/san-config.cmake:
##########
@@ -0,0 +1,36 @@
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License. See accompanying LICENSE file.
+
+add_library(paimon_sanitizer_flags INTERFACE)
+
+if(PAIMON_USE_ASAN)
+    if(CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang")
+        target_compile_options(paimon_sanitizer_flags INTERFACE -fsanitize=address
+                                                                -fno-omit-frame-pointer)
+        target_link_options(paimon_sanitizer_flags INTERFACE -fsanitize=address)
+        message(STATUS "Address Sanitizer enabled")
+    else()
+        message(WARNING "Address Sanitizer is only supported for GCC and Clang compilers")
+    endif()
+endif()
+
+if(PAIMON_USE_UBSAN)

Review Comment:
   `DefineOptions.cmake` introduces `PAIMON_USE_TSAN`, but this module only 
implements ASAN/UBSAN. If TSAN is intended to be supported, add a corresponding 
`if(PAIMON_USE_TSAN)` block (and ensure incompatible combinations like 
ASAN+TSAN are handled consistently).
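
   One possible shape for such a block, written to mirror the ASAN branch above. This is a sketch, not the project's implementation: the choice to fail hard on ASAN+TSAN (rather than warn or silently prefer one) and the message wording are assumptions.

   ```cmake
   # Sketch only: TSAN support modeled on the existing ASAN block.
   if(PAIMON_USE_TSAN)
       # AddressSanitizer and ThreadSanitizer cannot be combined in one build.
       # Failing at configure time is an assumed policy, not taken from the PR.
       if(PAIMON_USE_ASAN)
           message(FATAL_ERROR "PAIMON_USE_ASAN and PAIMON_USE_TSAN cannot be enabled together")
       endif()

       if(CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang")
           target_compile_options(paimon_sanitizer_flags INTERFACE -fsanitize=thread
                                                                   -fno-omit-frame-pointer)
           target_link_options(paimon_sanitizer_flags INTERFACE -fsanitize=thread)
           message(STATUS "Thread Sanitizer enabled")
       else()
           message(WARNING "Thread Sanitizer is only supported for GCC and Clang compilers")
       endif()
   endif()
   ```

   Placing the compatibility check before the compiler check means a conflicting configuration is reported regardless of which compiler is in use.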



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.