This is an automated email from the ASF dual-hosted git repository.
leaves12138 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/paimon-cpp.git
The following commit(s) were added to refs/heads/main by this push:
new 3d3513d chore: Add development tooling configs, contribution
guidelines, and third-party license notices
3d3513d is described below
commit 3d3513d4e6d469a1a1e815c9f4d7bdcf617719b1
Author: lxy <[email protected]>
AuthorDate: Fri May 22 12:10:58 2026 +0800
chore: Add development tooling configs, contribution guidelines, and
third-party license notices
Add development tooling configuration, contribution documentation, issue
and PR templates, Dev Container templates, and bootstrap project metadata for
Apache Paimon C++.
---
.clang-format | 17 +-
.clang-tidy | 63 ++--
.cmake-format.py | 63 ++++
.codespell_ignore | 11 +
.devcontainer/Dockerfile.template | 61 ++++
.devcontainer/devcontainer.json.template | 50 +++
.github/ISSUE_TEMPLATE/bug_report.yaml | 70 ++++
.gitignore => .github/ISSUE_TEMPLATE/config.yaml | 23 +-
.github/ISSUE_TEMPLATE/feature.yaml | 62 ++++
.github/PULL_REQUEST_TEMPLATE.md | 29 ++
.gitignore | 48 ++-
.pre-commit-config.yaml | 88 +++++
CONTRIBUTING.md | 97 +++++
NOTICE | 2 +-
README.md | 68 +++-
docs/code-style.md | 429 +++++++++++++++++++++++
16 files changed, 1089 insertions(+), 92 deletions(-)
diff --git a/.clang-format b/.clang-format
index 49002f9..ae45c8e 100644
--- a/.clang-format
+++ b/.clang-format
@@ -17,16 +17,7 @@
---
Language: Cpp
BasedOnStyle: Google
-ColumnLimit: 90
-DerivePointerAlignment: false
-IncludeBlocks: Regroup
-IncludeCategories:
-- Regex: '^<.*/' # Like <boost/lexical_cast.hpp>
- Priority: 3
-- Regex: '^<.*\.' # Like <fcntl.h>
- Priority: 1
-- Regex: '^<' # Like <vector>
- Priority: 2
-- Regex: '^"' # Like "util/auth-util.h"
- Priority: 4
-IndentPPDirectives: AfterHash
+ColumnLimit: 100
+IndentWidth: 4
+AccessModifierOffset: -3
+AllowShortFunctionsOnASingleLine: Empty
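
As a rough illustration of the new .clang-format overrides above, the sketch below shows how a small class might be laid out: 4-space indentation, access modifiers one space in (IndentWidth 4 plus AccessModifierOffset -3), and only empty function bodies kept on a single line. The `Counter` class is hypothetical and not part of paimon-cpp, and actual clang-format output may differ in details.

```cpp
#include <cstdint>

// Hypothetical class, used only to illustrate the layout.
class Counter {
 public:  // IndentWidth (4) + AccessModifierOffset (-3) = 1 space of indent
    // An empty body may stay on one line (AllowShortFunctionsOnASingleLine: Empty).
    Counter() {}

    // A non-empty body is expanded, even though it would fit in 100 columns.
    void Add(int64_t delta) {
        total_ += delta;
    }

    int64_t Total() const {
        return total_;
    }

 private:
    int64_t total_ = 0;
};
```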
diff --git a/.clang-tidy b/.clang-tidy
index cd2a79c..4aed1bf 100644
--- a/.clang-tidy
+++ b/.clang-tidy
@@ -16,30 +16,51 @@
# under the License.
---
Checks: |
- -*,
+ bugprone-argument-comment,
+ bugprone-assert-side-effect,
+ bugprone-bool-pointer-implicit-conversion,
+ bugprone-dangling-handle,
+ bugprone-dynamic-static-initializers,
+ bugprone-forward-declaration-namespace,
+ bugprone-inaccurate-erase,
+ bugprone-redundant-branch-condition,
+ bugprone-string-constructor,
+ bugprone-string-integer-assignment,
+ bugprone-suspicious-memset-usage,
+ bugprone-suspicious-realloc-usage,
+ bugprone-terminating-continue,
+ bugprone-throwing-static-initialization,
+ bugprone-unique-ptr-array-mismatch,
+ bugprone-unused-raii,
+ bugprone-use-after-move,
+ bugprone-virtual-near-miss,
+ misc-misleading-identifier,
+ misc-homoglyph,
clang-diagnostic-*,
+ -clang-diagnostic-global-constructors,
+ -clang-diagnostic-sign-compare,
clang-analyzer-*,
+ -clang-analyzer-alpha*,
+ -clang-analyzer-cplusplus.NewDeleteLeaks,
google-*,
+ -google-default-arguments,
modernize-*,
- readability-identifier-naming,
- readability-isolate-declaration,
- -modernize-use-nodiscard,
+ -modernize-return-braced-init-list,
+ -modernize-avoid-c-arrays,
-modernize-use-trailing-return-type,
-
+ -modernize-use-nodiscard,
+ -modernize-pass-by-value,
+# produce HeaderFilterRegex from cpp/build-support/lint_exclusions.txt with:
+# echo -n '^('; sed -e 's/*/\.*/g' cpp/build-support/lint_exclusions.txt | tr '\n' '|'; echo ')$'
+HeaderFilterRegex: '^(?!.*third_party/).*'
CheckOptions:
- - key: google-readability-braces-around-statements.ShortStatementLines
- value: '1'
- - key: google-readability-function-size.StatementThreshold
- value: '800'
- - key: google-readability-namespace-comments.ShortNamespaceLines
- value: '10'
- - key: google-readability-namespace-comments.SpacesBeforeComments
- value: '2'
- - key: readability-identifier-naming.PrivateMemberSuffix
- value: '_'
- - key: readability-identifier-naming.ProtectedMemberSuffix
- value: '_'
- - key: modernize-use-scoped-lock.WarnOnSingleLocks
- value: 'false'
-
-HeaderFilterRegex: 'src/paimon'
+ - key: google-readability-braces-around-statements.ShortStatementLines
+ value: '1'
+ - key: google-readability-function-size.StatementThreshold
+ value: '800'
+ - key: google-readability-namespace-comments.ShortNamespaceLines
+ value: '10'
+ - key: google-readability-namespace-comments.SpacesBeforeComments
+ value: '2'
+ - key: modernize-use-emplace.IgnoreImplicitConstructors
+ value: 1
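
The new HeaderFilterRegex value above relies on a negative lookahead to skip anything under third_party/. The sketch below evaluates the same pattern with `std::regex` (ECMAScript grammar) to show which paths it is intended to select. It is only an illustration: `MatchesHeaderFilter` is a hypothetical helper, and whether clang-tidy's own regex engine accepts lookahead syntax should be verified separately.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true when `path` is selected by the pattern,
// i.e. when "third_party/" does not appear anywhere in it.
bool MatchesHeaderFilter(const std::string& path) {
    // Same pattern as in .clang-tidy. std::regex defaults to the ECMAScript
    // grammar, which supports the (?!...) negative lookahead.
    static const std::regex kHeaderFilter(R"(^(?!.*third_party/).*)");
    return std::regex_match(path, kHeaderFilter);
}
```

Under these assumptions, a path such as `src/paimon/table_scan.h` is selected, while `third_party/fmt/format.h` is not.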
diff --git a/.cmake-format.py b/.cmake-format.py
new file mode 100644
index 0000000..382e270
--- /dev/null
+++ b/.cmake-format.py
@@ -0,0 +1,63 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# How wide to allow formatted cmake files
+line_width = 90
+
+# How many spaces to tab for indent
+tab_size = 4
+
+# If a positional argument group contains more than this many arguments,
+# then force it to a vertical layout.
+max_pargs_hwrap = 4
+
+# If the statement spelling length (including space and parenthesis) is
+# smaller than this amount, then force reject nested layouts.
+# This value only comes into play when considering whether or not to nest
+# arguments below their parent. If the number of characters in the parent
+# is less than this value, we will not nest.
+min_prefix_chars = 32
+
+# If true, separate flow control names from their parentheses with a space
+separate_ctrl_name_with_space = False
+
+# If true, separate function names from parentheses with a space
+separate_fn_name_with_space = False
+
+# If a statement is wrapped to more than one line, then dangle the closing
+# parenthesis on its own line
+dangle_parens = False
+
+# What style line endings to use in the output.
+line_ending = "unix"
+
+# Format command names consistently as 'lower' or 'upper' case
+command_case = "lower"
+
+# Format keywords consistently as 'lower' or 'upper' case
+keyword_case = "unchanged"
+# enable comment markup parsing and reflow
+enable_markup = False
+
+# If comment markup is enabled, don't reflow the first comment block in
+# each listfile. Use this to preserve formatting of your
+# copyright/license statements.
+first_comment_is_literal = True
+
+# If comment markup is enabled, don't reflow any comment block which
+# matches this (regex) pattern. Default is `None` (disabled).
+literal_comment_pattern = None
diff --git a/.codespell_ignore b/.codespell_ignore
new file mode 100644
index 0000000..1e76b91
--- /dev/null
+++ b/.codespell_ignore
@@ -0,0 +1,11 @@
+nin
+testin
+mor
+writen
+noo
+CHECKIN
+thirdparty
+VisitIn
+NotIn
+Collet
+convertor
diff --git a/.devcontainer/Dockerfile.template b/.devcontainer/Dockerfile.template
new file mode 100644
index 0000000..c28a0e1
--- /dev/null
+++ b/.devcontainer/Dockerfile.template
@@ -0,0 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Adapted from Apache Iceberg C++
+# https://github.com/apache/iceberg-cpp/blob/main/.devcontainer/Dockerfile.template
+
+# This Dockerfile is used to build a development container for Paimon C++.
+# It is based on the Ubuntu image and installs necessary dependencies.
+
+FROM ubuntu:24.04
+
+# Install necessary packages
+RUN apt update && \
+ apt install -y \
+ bash-completion \
+ build-essential \
+ ccache \
+ cmake \
+ curl \
+ gcc \
+ g++ \
+ git \
+ htop \
+ libboost-all-dev \
+ libcurl4-openssl-dev \
+ libssl-dev \
+ libxml2-dev \
+ lsb-release \
+ meson \
+ ninja-build \
+ pkg-config \
+ python3 \
+ python3-pip \
+ vim \
+ wget \
+ sudo \
+ && rm -rf /var/lib/apt/lists/*
+
+# Add a user for development
+RUN useradd -ms /bin/bash paimon && \
+ usermod -aG sudo paimon && \
+ echo "paimon ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/paimon && \
+ chmod 0440 /etc/sudoers.d/paimon
+
+# Switch to the paimon user
+USER paimon
+WORKDIR /home/paimon
diff --git a/.devcontainer/devcontainer.json.template b/.devcontainer/devcontainer.json.template
new file mode 100644
index 0000000..856a89d
--- /dev/null
+++ b/.devcontainer/devcontainer.json.template
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+// Adapted from Apache Iceberg C++
+// https://github.com/apache/iceberg-cpp/blob/main/.devcontainer/devcontainer.json.template
+
+{
+ "name": "Paimon CPP Dev Container",
+ "build": {
+ "dockerfile": "Dockerfile"
+ },
+ "runArgs": [
+ "--ulimit=core=-1",
+ "--cap-add=SYS_ADMIN",
+ "--cap-add=SYS_PTRACE",
+ "--cap-add=PERFMON",
+ "--security-opt",
+ "seccomp=unconfined",
+ "--privileged"
+ ],
+ "mounts": [
+ "source=${localEnv:HOME}/.ssh,target=/home/paimon/.ssh,type=bind,readonly"
+ ],
+ "customizations": {
+ "vscode": {
+ "extensions": [
+ "eamodio.gitlens"
+ ],
+ "settings": {
+ "editor.formatOnSave": true
+ }
+ }
+ }
+}
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yaml b/.github/ISSUE_TEMPLATE/bug_report.yaml
new file mode 100644
index 0000000..c17bd20
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report.yaml
@@ -0,0 +1,70 @@
+################################################################################
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+################################################################################
+
+# Adapted from Apache Paimon
+# https://github.com/apache/paimon/blob/master/.github/ISSUE_TEMPLATE/bug-report.yml
+
+name: Bug report
+description: Problems with the software
+title: "[Bug] "
+labels: ["bug"]
+body:
+ - type: markdown
+ attributes:
+ value: |
+ Thank you very much for your feedback!
+ - type: checkboxes
+ attributes:
+ label: Search before asking
+ description: >
+ Please search [issues](https://github.com/apache/paimon-cpp/issues) to check if your issue has already been reported.
+ options:
+ - label: >
+ I searched in the [issues](https://github.com/apache/paimon-cpp/issues) and found nothing similar.
+ required: true
+ - type: textarea
+ attributes:
+ label: Paimon-cpp version
+ description: >
+ Please provide the version of Paimon-cpp you are using. If you are using the master branch, please provide the commit id.
+ validations:
+ required: true
+ - type: textarea
+ attributes:
+ label: Minimal reproduce step
+ description: Please try to give reproducing steps to facilitate quick location of the problem.
+ validations:
+ required: true
+ - type: textarea
+ attributes:
+ label: What doesn't meet your expectations?
+ validations:
+ required: true
+ - type: textarea
+ attributes:
+ label: Anything else?
+ - type: checkboxes
+ attributes:
+ label: Are you willing to submit a PR?
+ description: >
+ We look forward to the community of developers or users helping solve Paimon-cpp problems together. If you are willing to submit a PR to fix this problem, please check the box.
+ options:
+ - label: I'm willing to submit a PR!
+ - type: markdown
+ attributes:
+ value: "Thanks for completing our form!"
diff --git a/.gitignore b/.github/ISSUE_TEMPLATE/config.yaml
similarity index 78%
copy from .gitignore
copy to .github/ISSUE_TEMPLATE/config.yaml
index 63993dc..8085d31 100644
--- a/.gitignore
+++ b/.github/ISSUE_TEMPLATE/config.yaml
@@ -15,25 +15,4 @@
# specific language governing permissions and limitations
# under the License.
-build/
-cmake-build/
-cmake-build-debug/
-cmake-build-release/
-.DS_Store
-
-# intellij files
-.idea
-
-# vscode files
-.vscode
-.cache
-
-# AI
-AGENTS.md
-CLAUDE.md
-GEMINI.md
-.claude/
-.cursor/
-.gemini/
-.github/prompts/
-.kiro/
+blank_issues_enabled: false
diff --git a/.github/ISSUE_TEMPLATE/feature.yaml b/.github/ISSUE_TEMPLATE/feature.yaml
new file mode 100644
index 0000000..af26999
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature.yaml
@@ -0,0 +1,62 @@
+################################################################################
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+################################################################################
+
+# Adapted from Apache Paimon
+# https://github.com/apache/paimon/blob/master/.github/ISSUE_TEMPLATE/feature.yml
+
+name: Feature
+description: Add new feature, improve code, and more
+title: "[Feature] "
+labels: [ "enhancement" ]
+body:
+ - type: markdown
+ attributes:
+ value: |
+ Thank you very much for your feature proposal!
+ - type: checkboxes
+ attributes:
+ label: Search before asking
+ description: >
+ Please search [issues](https://github.com/apache/paimon-cpp/issues) to check if your issue has already been reported.
+ options:
+ - label: >
+ I searched in the [issues](https://github.com/apache/paimon-cpp/issues) and found nothing similar.
+ required: true
+ - type: textarea
+ attributes:
+ label: Motivation
+ description: Describe the motivations for this feature, like how it fixes the problem you meet.
+ validations:
+ required: true
+ - type: textarea
+ attributes:
+ label: Solution
+ description: Describe the proposed solution and add related materials like links if any.
+ - type: textarea
+ attributes:
+ label: Anything else?
+ - type: checkboxes
+ attributes:
+ label: Are you willing to submit a PR?
+ description: >
+ We look forward to the community of developers or users helping develop Paimon-cpp features together. If you are willing to submit a PR to implement the feature, please check the box.
+ options:
+ - label: I'm willing to submit a PR!
+ - type: markdown
+ attributes:
+ value: "Thanks for completing our form!"
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 0000000..1320da7
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,29 @@
+<!-- Please specify the module before the PR name: feat: ... or fix: ... -->
+
+### Purpose
+
+<!-- Linking this pull request to the issue -->
+Linked issue: close #xxx
+
+<!-- What is the purpose of the change -->
+
+### Tests
+
+<!-- List UT and IT cases to verify this change -->
+
+### API and Format
+
+<!-- Does this change affect API in include dir or storage format or protocol -->
+
+### Documentation
+
+<!-- Does this change introduce a new feature -->
+
+### Generative AI tooling
+
+<!--
+If generative AI tooling has been used in the process of authoring this patch, please include the
+phrase: 'Generated-by: ' followed by the name of the tool and its version.
+If no, write 'No'.
+Please refer to the [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for details.
+-->
diff --git a/.gitignore b/.gitignore
index 63993dc..c3e1d78 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,25 +15,39 @@
# specific language governing permissions and limitations
# under the License.
-build/
-cmake-build/
-cmake-build-debug/
-cmake-build-release/
-.DS_Store
+# Build directories
+build
+build-release
+build-debug
+output
-# intellij files
+# IDE settings
.idea
-
-# vscode files
.vscode
.cache
-# AI
-AGENTS.md
-CLAUDE.md
-GEMINI.md
-.claude/
-.cursor/
-.gemini/
-.github/prompts/
-.kiro/
+# Devcontainer configuration
+.devcontainer/*
+!.devcontainer/*.template
+
+# Temporary and backup files
+*~
+*.folded
+*.pyc
+__pycache__
+
+# Performance analysis and profiling files
+*.perf*
+perf.*
+*lcov.info
+profile.json
+FlameGraph
+
+# Shared objects and binary files
+*.so
+
+# Images
+*.svg
+
+# Third party dependencies archives
+third_party/*.tar.gz
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
new file mode 100644
index 0000000..ecc7017
--- /dev/null
+++ b/.pre-commit-config.yaml
@@ -0,0 +1,88 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Adapted from Apache Iceberg C++
+# https://github.com/apache/iceberg-cpp/blob/main/.pre-commit-config.yaml
+
+# To use this, install the python package `pre-commit` and
+# run once `pre-commit install`. This will set up a git pre-commit hook
+# that is executed on each commit and will report the linting problems.
+# To run all hooks on all files use `pre-commit run -a`
+
+repos:
+ - repo: https://github.com/pre-commit/pre-commit-hooks
+ rev: v5.0.0
+ hooks:
+ - id: trailing-whitespace
+ exclude: (^test/test_data/.*|^third_party/.*)
+ - id: end-of-file-fixer
+ exclude: (^test/test_data/.*|^third_party/.*)
+ - id: check-yaml
+ - id: check-added-large-files
+ args: ["--maxkb=5120", "--enforce-all"]
+
+ - repo: https://github.com/pre-commit/mirrors-clang-format
+ rev: v20.1.8
+ hooks:
+ - id: clang-format
+ exclude_types: [json]
+ files: \.(c|cpp|h|hpp|cc|cxx)$
+ exclude: (^test/test_data/.*|^third_party/.*)
+
+ - repo: https://github.com/cheshirekow/cmake-format-precommit
+ rev: v0.6.13
+ hooks:
+ - id: cmake-format
+ exclude: (^test/test_data/.*|^third_party/.*)
+
+ - repo: https://github.com/codespell-project/codespell
+ rev: v2.4.1
+ hooks:
+ - id: codespell
+ exclude: (^test/test_data/.*|^third_party/.*|fix_includes.py)
+ args: ["--ignore-words", ".codespell_ignore"]
+ - repo: https://github.com/sphinx-contrib/sphinx-lint
+ rev: v0.9.1
+ hooks:
+ - id: sphinx-lint
+ alias: docs
+ files: ^docs/source
+ exclude: ^docs/source/python/generated
+ args: [
+ '--enable',
+ 'all',
+ '--disable',
+ 'dangling-hyphen,line-too-long',
+ ]
+
+ - repo: https://github.com/cpplint/cpplint
+ rev: 2.0.2
+ hooks:
+ - id: cpplint
+ alias: cpp
+ name: C++ Lint
+ args:
+ - "--quiet"
+ - "--verbose=2"
+ - "--filter=-whitespace/line_length,-whitespace/parens,-whitespace/indent_namespace,-build/include_what_you_use,-build/c++11,-build/c++17,-readability/nolint,-runtime/references"
+
+ types_or:
+ - c++
+ exclude: |
+ (?x)^(
+ (third_party)/.*
+ )$
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..5aa4531
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,97 @@
+<!--
+ ~ Licensed to the Apache Software Foundation (ASF) under one
+ ~ or more contributor license agreements. See the NOTICE file
+ ~ distributed with this work for additional information
+ ~ regarding copyright ownership. The ASF licenses this file
+ ~ to you under the Apache License, Version 2.0 (the
+ ~ "License"); you may not use this file except in compliance
+ ~ with the License. You may obtain a copy of the License at
+ ~
+ ~ http://www.apache.org/licenses/LICENSE-2.0
+ ~
+ ~ Unless required by applicable law or agreed to in writing,
+ ~ software distributed under the License is distributed on an
+ ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ~ KIND, either express or implied. See the License for the
+ ~ specific language governing permissions and limitations
+ ~ under the License.
+-->
+
+# Contributing to Apache Paimon-cpp
+
+Thank you for your interest in contributing to paimon-cpp! This document explains how to get started.
+
+---
+
+## Reporting Issues
+
+If you find a bug or want to request a feature, please open an
[issue](https://github.com/apache/paimon-cpp/issues/new). Include as much
detail as possible — steps to reproduce, expected vs. actual behavior,
environment info, etc.
+
+---
+
+## Submitting Pull Requests
+
+1. **Fork** the repository and create a feature branch from `main`.
+2. Make your changes following the [Code Style Guide](docs/code-style.md).
+3. Add or update tests for the functionality you changed.
+4. Ensure all checks pass.
+5. Open a pull request against `main`. Fill in the [PR template](.github/PULL_REQUEST_TEMPLATE.md).
+
+### PR Checklist
+
+Before submitting, please verify:
+
+- [ ] Code compiles without warnings under `-Wall` (enabled by default in the build system).
+- [ ] `pre-commit run --all-files` passes.
+- [ ] New / modified public APIs are marked with `PAIMON_EXPORT`.
+- [ ] Every new file has the Apache 2.0 license header.
+- [ ] Error handling uses `Status` / `Result<T>` — no exceptions.
+- [ ] New utility code checks for existing helpers before reinventing (see the [utility reference](docs/code-style.md#reuse-existing-utilities)).
+- [ ] Tests are added or updated for the changed functionality.
+- [ ] PR description follows the [template](.github/PULL_REQUEST_TEMPLATE.md).
+
+---
+
+## Development Setup
+
+### Prerequisites
+
+- **C++17** compatible compiler (GCC recommended)
+- **CMake** ≥ 3.16
+- **Python 3** (for linting scripts and pre-commit)
+- **git-lfs** (the repository uses Git LFS for large files)
+
+---
+
+## Code Style
+
+Please read the full [Code Style Guide](docs/code-style.md) before writing code. Key highlights:
+
+- **Formatting**: Google C++ Style base, 100-column limit, 4-space indent. Let clang-format handle it.
+- **Error handling**: Use `Status` / `Result<T>` and project macros (`PAIMON_RETURN_NOT_OK`, `PAIMON_ASSIGN_OR_RAISE`). No exceptions.
+- **Naming**: PascalCase for classes and methods, snake_case for variables, trailing `_` for members.
+- **Headers**: `#pragma once`, includes ordered by category, Apache 2.0 license header on every file.
+- **Factory pattern**: Use `static Create()` + private constructor when construction can fail.
+- **Testing**: Google Test, `ASSERT_*` preferred over `EXPECT_*`, test files named `*_test.cpp` next to source.
+
+---
+
+## Dev Containers
+
+We provide Dev Container configuration file templates for VS Code:
+
+```bash
+cd .devcontainer
+cp Dockerfile.template Dockerfile
+cp devcontainer.json.template devcontainer.json
+```
+
+Then select **Dev Containers: Reopen in Container** from VS Code's Command Palette.
+
+If you make improvements that could benefit all developers, please update the template files and submit a pull request.
+
+---
+
+## License
+
+By contributing to paimon-cpp, you agree that your contributions will be licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
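
The error-handling and factory conventions listed in the CONTRIBUTING.md diff above (no exceptions, `Status` / `Result<T>`, `static Create()` with a private constructor) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `Status` and `Result<T>` classes here are simplified stand-ins written for this example, not the project's real types or macros, and `ExampleBucketer` is a hypothetical class. The lowercase `ok()` accessor is likewise an assumption of this sketch.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Simplified stand-in for the project's Status type (assumption).
class Status {
 public:
    static Status Invalid(std::string msg) {
        return Status(std::move(msg));
    }
    const std::string& message() const {
        return msg_;
    }

 private:
    explicit Status(std::string msg) : msg_(std::move(msg)) {}
    std::string msg_;
};

// Simplified stand-in for Result<T>: holds either a value or an error Status.
template <typename T>
class Result {
 public:
    Result(T value) : data_(std::move(value)) {}         // implicit on purpose
    Result(Status status) : data_(std::move(status)) {}  // implicit on purpose
    bool ok() const {
        return std::holds_alternative<T>(data_);
    }
    // Only valid when ok() is true.
    T& value() {
        return std::get<T>(data_);
    }
    // Only valid when ok() is false.
    const Status& status() const {
        return std::get<Status>(data_);
    }

 private:
    std::variant<T, Status> data_;
};

// Hypothetical class whose construction can fail, so it exposes Create().
class ExampleBucketer {
 public:
    static Result<std::unique_ptr<ExampleBucketer>> Create(int32_t num_buckets) {
        if (num_buckets <= 0) {
            return Status::Invalid("num_buckets must be positive");
        }
        // std::make_unique cannot reach the private constructor, hence `new`.
        return std::unique_ptr<ExampleBucketer>(new ExampleBucketer(num_buckets));
    }

    // Maps a non-negative hash to a bucket index in [0, num_buckets).
    int32_t Assign(uint32_t key_hash) const {
        return static_cast<int32_t>(key_hash % static_cast<uint32_t>(num_buckets_));
    }

 private:
    explicit ExampleBucketer(int32_t num_buckets) : num_buckets_(num_buckets) {}
    int32_t num_buckets_;
};
```

A caller checks `ok()` before touching the value, so an invalid argument surfaces as an error `Status` rather than as an exception or a half-constructed object.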
diff --git a/NOTICE b/NOTICE
index 94d4314..85fe26f 100644
--- a/NOTICE
+++ b/NOTICE
@@ -1,5 +1,5 @@
Apache Paimon C++
-Copyright 2025-2026 The Apache Software Foundation
+Copyright 2024-2026 The Apache Software Foundation
This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
diff --git a/README.md b/README.md
index 52daadf..44b926e 100644
--- a/README.md
+++ b/README.md
@@ -17,38 +17,70 @@
~ under the License.
-->
-# Apache Paimon C++ Client
+# Apache Paimon-cpp
-A C++ client library for [Apache Paimon](https://paimon.apache.org/) — a
streaming data lake platform that supports high-speed data ingestion, changelog
tracking, and efficient real-time analytics.
+[](https://www.apache.org/licenses/LICENSE-2.0.html)
-## Overview
+Paimon-cpp is the C++ implementation of [Apache Paimon](https://paimon.apache.org).
+It provides native, high-performance, and extensible access to the Paimon lake format for C++ engines and services without JVM dependencies.
-This project provides a native C++ SDK for reading and writing Apache Paimon tables, enabling high-performance integration for C/C++ applications without JVM dependencies.
+Background and documentation are available at [paimon.apache.org](https://paimon.apache.org).
-## Features (Planned)
+## Status
-- Read Paimon tables (primary-key tables and append-only tables)
-- Write to Paimon tables with streaming and batch modes
-- Schema evolution support
-- Snapshot management and time travel queries
-- Filesystem and object storage access (local, HDFS, S3, OSS)
+Paimon-cpp is currently undergoing repository migration. The original repository is hosted at [github.com/alibaba/paimon-cpp](https://github.com/alibaba/paimon-cpp/), and the codebase is being migrated incrementally to the Apache Paimon community repository.
-## Requirements
+## Features
-- C++17 or later
-- CMake 3.16+
+Paimon-cpp currently provides:
+
+- **Write**: append table and primary key table write support with compaction.
+- **Commit**: append table commit support for simple append-only tables.
+- **Scan**: batch and stream scan for append tables and primary key tables without changelog.
+- **Read**: append table read, primary key table read with deletion vector, and primary key table merge-on-read.
+- **Arrow integration**: batch read and write interfaces based on the [Arrow Columnar In-Memory Format](https://arrow.apache.org).
+- **File systems**: file system abstraction with built-in local and Jindo file system support.
+- **File formats**: file format abstraction with built-in ORC, Parquet, and Avro support.
+- **Runtime utilities**: memory pool and thread pool abstractions with default implementations.
+- **AI-Oriented Features**: supports RowTracking and DataEvolution mode and provides Global Index capabilities including bitmap index, B-tree index, DiskANN-based vector search with Lumina, and Lucene-based full-text search.
+- **Compatibility**: compatibility with Apache Paimon Java format and communication protocols, including commit messages, data splits, and manifests.
+
+The current implementation supports the `x86_64` architecture.
## Building
+> **Note:** The build system and source files are being migrated incrementally. The instructions below will work once the CMake build files and source code are available in this repository.
+
+If you do not have `git-lfs` installed, install it first.
+
```bash
-mkdir build && cd build
-cmake ..
-make -j$(nproc)
+git clone https://github.com/apache/paimon-cpp.git
+cd paimon-cpp
+git lfs pull
+```
+
+Build with CMake:
+
+```bash
+cmake -B build
+cmake --build build
+```
+### Dev Containers
+
+We provide Dev Container configuration file templates.
+
+To use a Dev Container as your development environment, follow the steps below, then select `Dev Containers: Reopen in Container` from VS Code's Command Palette.
+
+```
+cd .devcontainer
+cp Dockerfile.template Dockerfile
+cp devcontainer.json.template devcontainer.json
```
-## Contributing
+## Collaboration
-We welcome contributions! Please see the [Apache Paimon community page](https://paimon.apache.org/community/contribution-guide/) for guidelines.
+Paimon-cpp is an active open-source project and we welcome people who want to contribute or share good ideas!
+Before contributing, please read the [Contributing Guide](CONTRIBUTING.md) and the [Code Style Guide](docs/code-style.md). You are encouraged to check out our [documentation](https://alibaba.github.io/paimon-cpp/).
## License
diff --git a/docs/code-style.md b/docs/code-style.md
new file mode 100644
index 0000000..b413cb4
--- /dev/null
+++ b/docs/code-style.md
@@ -0,0 +1,429 @@
+<!--
+ ~ Licensed to the Apache Software Foundation (ASF) under one
+ ~ or more contributor license agreements. See the NOTICE file
+ ~ distributed with this work for additional information
+ ~ regarding copyright ownership. The ASF licenses this file
+ ~ to you under the Apache License, Version 2.0 (the
+ ~ "License"); you may not use this file except in compliance
+ ~ with the License. You may obtain a copy of the License at
+ ~
+ ~ http://www.apache.org/licenses/LICENSE-2.0
+ ~
+ ~ Unless required by applicable law or agreed to in writing,
+ ~ software distributed under the License is distributed on an
+ ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ~ KIND, either express or implied. See the License for the
+ ~ specific language governing permissions and limitations
+ ~ under the License.
+-->
+
+# Paimon C++ Code Style Guide
+
+This document defines the coding conventions for the paimon-cpp project. All pull requests are expected to follow these rules. Automated tooling (clang-format, clang-tidy, cpplint, pre-commit) enforces many of them.
+
+---
+
+## Language Standard
+
+- **C++17** is the target standard.
+- Do **not** use C++20 or later features.
+- Only the **x86_64** architecture is currently supported.
+
+---
+
+## Formatting
+
+Formatting is based on **Google C++ Style** with the following overrides (defined in `.clang-format`):
+
+| Setting | Value |
+|---------|-------|
+| `ColumnLimit` | 100 |
+| `IndentWidth` | 4 |
+| `AccessModifierOffset` | -3 |
+| `AllowShortFunctionsOnASingleLine` | Empty |
+
+---
+
+## Naming Conventions
+
+| Category | Style | Example |
+|----------|-------|---------|
+| Class / Struct | PascalCase | `TableScan`, `FileUtils` |
+| Method | PascalCase | `CreatePlan()`, `ReadFile()` |
+| Member variable | snake_case with trailing `_` | `schema_id_`,
`commit_user_` |
+| Local variable / parameter | snake_case | `table_path`, `json_str` |
+| Constant (new code) | `k` prefix + PascalCase | `kFieldVersion`,
`kMaxRetryCount` |
+| Constant (existing code) | Match surrounding style | `FIELD_VERSION` if that
is the local convention |
+| Namespace | lowercase | `paimon`, `paimon::test` |
+| File name | snake_case | `table_scan.cpp`, `file_utils.h` |
+| Test file | `*_test.cpp` next to source | `table_scan_test.cpp` |
+
+---
+
+## File Layout
+
+### License Header
+
+Every file **must** start with the Apache 2.0 license header.
+
+```cpp
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+```
+
+### Header Guard
+
+Use `#pragma once`. Do **not** use `#ifndef` / `#define` guards.
+
+### Include Order
+
+Group includes in the following order, separated by blank lines:
+
+1. **Corresponding header** — e.g. `snapshot.cpp` includes
`paimon/core/snapshot.h` first.
+2. **C standard headers** — `<cstdint>`, `<cstring>`, etc.
+3. **C++ standard headers** — `<string>`, `<vector>`, `<memory>`, etc.
+4. **Third-party headers** — `fmt/format.h`, `rapidjson/document.h`, etc.
+5. **Project headers** — `paimon/status.h`, `paimon/result.h`, etc.
+
+### Namespace
+
+All code lives inside `namespace paimon { }`. Close with a comment:
+
+```cpp
+namespace paimon {
+// ...
+} // namespace paimon
+```
+
+Test code uses `namespace paimon::test`.
+
+---
+
+## Error Handling
+
+### `Status` / `Result<T>` — No Exceptions
+
+All fallible functions return `Status` (no value) or `Result<T>` (with value).
**Exceptions are not used** for error propagation in production code.
+
+### Error Propagation Macros
+
+| Macro | Purpose |
+|-------|---------|
+| `PAIMON_RETURN_NOT_OK(status)` | Propagate a `Status` error |
+| `PAIMON_ASSIGN_OR_RAISE(lhs, rexpr)` | Extract value from `Result<T>`,
return on error |
+| `PAIMON_RETURN_NOT_OK_FROM_ARROW(arrow_status)` | Bridge Apache Arrow
`Status` |
+| `PAIMON_ASSIGN_OR_RAISE_FROM_ARROW(lhs, rexpr)` | Bridge Apache Arrow
`Result` |
+
+### `PAIMON_ASSIGN_OR_RAISE` and `PAIMON_ASSIGN_OR_RAISE_FROM_ARROW` — Always
Use Explicit Types
+
+```cpp
+// ✅ Good — explicit type
+PAIMON_ASSIGN_OR_RAISE(std::unique_ptr<TableScan> scan,
+ TableScan::Create(std::move(ctx)));
+
+// ❌ Bad — auto
+PAIMON_ASSIGN_OR_RAISE(auto scan, TableScan::Create(std::move(ctx)));
+```
+
+If you need a `shared_ptr` from a function returning `unique_ptr`, write the
target type directly — the macro already moves:
+
+```cpp
+PAIMON_ASSIGN_OR_RAISE(std::shared_ptr<T> ret, FuncReturningUniquePtr());
+```
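To make the propagation rules above concrete, here is a minimal, self-contained sketch of how such macros can work. Everything in it is a hypothetical stand-in: the `Status` and `Result<T>` classes are simplified, and `DEMO_RETURN_NOT_OK` / `DEMO_ASSIGN_OR_RAISE` only imitate the general shape of the project macros, whose real definitions may differ.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-ins for illustration only; the real Status and Result<T>
// in paimon-cpp may differ in shape and API.
class Status {
 public:
    static Status OK() { return Status(); }
    static Status Invalid(std::string msg) { return Status(std::move(msg)); }
    bool ok() const { return ok_; }
    const std::string& message() const { return message_; }

 private:
    Status() : ok_(true) {}
    explicit Status(std::string msg) : ok_(false), message_(std::move(msg)) {}
    bool ok_;
    std::string message_;
};

template <typename T>
class Result {
 public:
    Result(T value) : data_(std::move(value)) {}         // implicit on purpose
    Result(Status status) : data_(std::move(status)) {}  // implicit on purpose
    bool ok() const { return std::holds_alternative<T>(data_); }
    Status status() const {
        if (ok()) return Status::OK();
        return std::get<Status>(data_);
    }
    T&& MoveValue() { return std::move(std::get<T>(data_)); }

 private:
    std::variant<T, Status> data_;
};

// Hypothetical macro: return early when a Status is not OK.
#define DEMO_RETURN_NOT_OK(expr)                 \
    do {                                         \
        Status demo_st = (expr);                 \
        if (!demo_st.ok()) return demo_st;       \
    } while (false)

// Hypothetical macro: unwrap a Result<T> into `lhs`, or return its error.
#define DEMO_CONCAT_INNER(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT_INNER(a, b)
#define DEMO_ASSIGN_OR_RAISE_IMPL(tmp, lhs, rexpr) \
    auto tmp = (rexpr);                            \
    if (!tmp.ok()) return tmp.status();            \
    lhs = tmp.MoveValue()
#define DEMO_ASSIGN_OR_RAISE(lhs, rexpr) \
    DEMO_ASSIGN_OR_RAISE_IMPL(DEMO_CONCAT(demo_res_, __LINE__), lhs, rexpr)

// Parses a string of decimal digits, or reports an error.
Result<int> ParsePositive(const std::string& text) {
    if (text.empty()) return Status::Invalid("empty input");
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return Status::Invalid("not a number: " + text);
        value = value * 10 + (c - '0');
    }
    return value;
}

Status CheckRange(int value) {
    if (value > 100) return Status::Invalid("value out of range");
    return Status::OK();
}

// Errors from either callee propagate to the caller without exceptions.
Result<int> ParseAndDouble(const std::string& text) {
    DEMO_ASSIGN_OR_RAISE(int value, ParsePositive(text));  // explicit type, not auto
    DEMO_RETURN_NOT_OK(CheckRange(value));
    return value * 2;
}
```

In this sketch the assignment macro moves the value out of the temporary result, which is why a left-hand side of a different but convertible type can be declared directly.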
+
+---
+
+## Class Design
+
+### Factory Pattern — `static Create()` + Private Constructor
+
+When construction can fail, use a factory method:
+
+```cpp
+class TableScan {
+ public:
+ static Result<std::unique_ptr<TableScan>>
Create(std::unique_ptr<ScanContext> context);
+ ~TableScan() = default;
+
+ private:
+ explicit TableScan(std::unique_ptr<ScanContext> context);
+};
+```
+
+- `Create()` returns `Result<std::unique_ptr<T>>`.
+- The private constructor does only trivial member assignment.
+- All fallible initialization goes in `Create()`, **after** construction.
+- If construction cannot fail (POD, data-only classes), a public constructor
is fine.
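
The bullets above can be sketched as follows. This is an illustration under stated assumptions, not project code: `DemoResult` is a hypothetical, simplified stand-in for `Result<T>`, and `DemoReader` with its `Init()` step is an invented example class.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for the project's Result<T>.
template <typename T>
struct DemoResult {
    T value;
    std::string error;  // Empty when the call succeeded.
    bool ok() const { return error.empty(); }
};

// Hypothetical example class following the Create() + private constructor pattern.
class DemoReader {
 public:
    static DemoResult<std::unique_ptr<DemoReader>> Create(std::string path) {
        // Raw `new` is used here because std::make_unique cannot call a
        // private constructor.
        std::unique_ptr<DemoReader> reader(new DemoReader(std::move(path)));
        // Fallible initialization runs after the trivial construction.
        if (!reader->Init()) {
            return {nullptr, "path must not be empty"};
        }
        return {std::move(reader), ""};
    }

    const std::string& path() const { return path_; }

 private:
    // The constructor only assigns members and cannot fail.
    explicit DemoReader(std::string path) : path_(std::move(path)) {}

    // Stand-in for a real initialization step that may fail.
    bool Init() const { return !path_.empty(); }

    std::string path_;
};
```

A caller checks `ok()` on the returned value before touching the pointer, so a half-initialized object is never handed out.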
+
+### Utility Classes
+
+Static-only utility classes delete their constructors and destructors:
+
+```cpp
+class StringUtils {
+ public:
+ StringUtils() = delete;
+ ~StringUtils() = delete;
+
+ static std::vector<std::string> Split(const std::string& s, char delim);
+ // ...
+};
+```
+
+### Interface Classes
+
+Pure-virtual base classes declare a virtual, defaulted destructor:
+
+```cpp
+virtual ~ClassName() = default;
+```
+
+---
+
+## Smart Pointers & Memory
+
+- Prefer `std::unique_ptr` for sole ownership.
+- Use `std::shared_ptr` only when shared ownership is genuinely needed.
+- **No raw `new` / `delete`** except inside factory `Create()` methods that
call a private constructor.
+- Use `std::optional<T>` for optional values.
+- Use `ScopeGuard` (in `src/paimon/common/utils/`) for RAII cleanup.
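
As a rough illustration of RAII cleanup with a scope guard, the sketch below uses a hypothetical `DemoScopeGuard`. It is not the project's `ScopeGuard`, whose interface may differ; the `Release()` method and the `AppendWithRollback()` helper are assumptions made for this example.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal guard for illustration; the project's ScopeGuard may
// expose a different interface.
class DemoScopeGuard {
 public:
    explicit DemoScopeGuard(std::function<void()> on_exit) : on_exit_(std::move(on_exit)) {}
    ~DemoScopeGuard() {
        if (on_exit_) on_exit_();
    }
    DemoScopeGuard(const DemoScopeGuard&) = delete;
    DemoScopeGuard& operator=(const DemoScopeGuard&) = delete;

    // Cancel the cleanup, e.g. once the operation has fully succeeded.
    void Release() { on_exit_ = nullptr; }

 private:
    std::function<void()> on_exit_;
};

// Appends `value` to `log` and rolls the append back unless `commit` is true.
bool AppendWithRollback(std::vector<int>* log, int value, bool commit) {
    log->push_back(value);
    DemoScopeGuard guard([log]() { log->pop_back(); });
    if (!commit) {
        // The guard's destructor runs on this early return and undoes the append.
        return false;
    }
    guard.Release();
    return true;
}
```

The cleanup lives next to the action it undoes, so every early return is covered without repeating the rollback code.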
+
+---
+
+## Comments & Documentation
+
+- Use `///` (Doxygen) for classes and public methods.
+- Use `@param` and `@return` for parameters and return values.
+- Do **not** comment obvious code.
+- Inline comments go **above** the line, not at the end.
+
+```cpp
+/// Create a scan plan from the current snapshot.
+///
+/// @param snapshot_id The snapshot to scan.
+/// @return A Plan on success, or an error Status.
+Result<std::shared_ptr<Plan>> CreatePlan(int64_t snapshot_id);
+```
+
+### TODO Comments
+
+TODOs are allowed but **must** include an author tag:
+
+```cpp
+// TODO(alice): Support changelog manifest list size validation
+```
+
+---
+
+## String Formatting
+
+Use `fmt::format()` for string formatting. Do **not** build strings by
concatenating `std::to_string()` results with `+`.
+
+```cpp
+// ✅ Good
+auto msg = fmt::format("Snapshot {} not found in {}", snapshot_id, path);
+
+// ❌ Bad
+auto msg = "Snapshot " + std::to_string(snapshot_id) + " not found in " + path;
+```
+
+---
+
+## Public API Visibility
+
+Mark all public API symbols with `PAIMON_EXPORT`:
+
+```cpp
+class PAIMON_EXPORT Timestamp { ... };
+```
+
+---
+
+## Reuse Existing Utilities
+
+Before writing new helper code, check whether the project already provides
what you need. The tables below list the most commonly used utilities.
+
+### General Utilities (`src/paimon/common/utils/`)
+
+| Class | Purpose |
+|-------|---------|
+| `StringUtils` | Split, Replace, StartsWith, EndsWith, Trim, etc. |
+| `ObjectUtils` | Collection helpers (ContainsAll, etc.) |
+| `DateTimeUtils` | Date / time conversions |
+| `RapidJsonUtil` | JSON serialization / deserialization |
+| `StreamUtils` | Stream read / write helpers |
+| `OptionsUtils` | Configuration parsing |
+| `PathUtil` | Path joining and manipulation |
+| `Preconditions` | Precondition checks |
+| `ScopeGuard` | RAII cleanup guard |
+| `BloomFilter` / `BloomFilter64` | Bloom filters |
+| `MurmurHashUtils` | Hashing |
+| `CRC32C` | Checksums |
+| `BitSet` | Bit set |
+| `RoaringBitmap32` / `RoaringBitmap64` | Roaring bitmaps |
+| `SerializationUtils` | Serialization helpers |
+| `DataConverterUtils` | Data conversion |
+| `DecimalUtils` | Decimal arithmetic |
+| `FieldTypeUtils` | Field type helpers |
+| `BinPacking` | Bin-packing algorithm |
+| `ConcurrentHashMap` | Thread-safe hash map |
+| `ThreadsafeQueue` | Thread-safe queue |
+| `LinkedHashMap` | Insertion-ordered hash map |
+| `LongCounter` | Atomic counter |
+| `UUID` | UUID generation |
+
+### Arrow Utilities (`src/paimon/common/utils/arrow/`)
+
+| File | Purpose |
+|------|---------|
+| `ArrowUtils` | Arrow format helpers |
+| `MemUtils` | Arrow memory helpers |
+| `status_utils.h` | Arrow ↔ Paimon status bridge macros |
+
+### Core Utilities (`src/paimon/core/utils/`)
+
+| Class | Purpose |
+|-------|---------|
+| `FileUtils` | File listing, version file operations |
+| `FileStorePathFactory` | Path factory for data / manifest files |
+| `SnapshotManager` | Snapshot management |
+| `TagManager` | Tag management |
+| `BranchManager` | Branch management |
+| `PartitionPathUtils` | Partition path helpers |
+| `FieldMapping` | Field mapping |
+| `FieldsComparator` | Field comparison |
+| `PrimaryKeyTableUtils` | Primary-key table helpers |
+
+### Concurrency (`src/paimon/common/executor/`)
+
+| Function | Purpose |
+|----------|---------|
+| `CollectAll()` | Collect results from multiple futures |
+| `Wait()` | Wait for multiple void futures |
+
+### Testing Utilities (`src/paimon/testing/utils/`)
+
+| Class | Purpose |
+|-------|---------|
+| `TestHelper` | End-to-end helper: create tables, write & commit data, scan &
read results |
+| `DataGenerator` | Generate `RecordBatch` data, split by partition/bucket,
extract partial rows |
+| `BinaryRowGenerator` | Generate `BinaryRow`, `InternalRow`, `SimpleStats`,
and `BinaryArray` for tests |
+| `KeyValueChecker` | Assert expected vs. actual `KeyValue` results in
primary-key table tests |
+| `ReadResultCollector` | Collect `KeyValue` or Arrow `ChunkedArray` results
from a `BatchReader` |
+| `DictArrayConverter` | Convert Arrow dictionary-encoded arrays to plain
arrays |
+| `UniqueTestDirectory` | RAII helper that creates a unique temp directory and
cleans it up on destruction |
+| `TimezoneGuard` | RAII guard that sets `TZ` environment variable and
restores it on destruction |
+| `IOExceptionHelper` | Inject I/O errors via `IOHook` for fault-injection
testing |
+| `testharness.h` | Test macros: `ASSERT_OK`, `ASSERT_NOK`,
`ASSERT_NOK_WITH_MSG`, `ASSERT_RAISES` |
+
+### Mock Objects (`src/paimon/testing/mock/`)
+
+| Class | Purpose |
+|-------|---------|
+| `MockFileSystem` / `MockFileSystemFactory` | Mock file system and its
factory for unit tests |
+| `MockFileFormat` / `MockFileFormatFactory` | Mock file format and its
factory |
+| `MockFormatReaderBuilder` / `MockFormatWriterBuilder` | Mock reader/writer
builders |
+| `MockFormatWriter` / `MockFileBatchReader` | Mock format writer and batch
reader |
+| `MockIndexPathFactory` | Mock index path factory |
+| `MockKeyValueDataFileRecordReader` | Mock key-value data file reader holding
in-memory data |
+| `MockStatsExtractor` | Mock statistics extractor |
+
+---
+
+## Testing
+
+- Test files live **next to** the source file they test, named `*_test.cpp`.
+- Use **Google Test** (`gtest`).
+- Test classes go in `namespace paimon::test`.
+- Use project test macros: `ASSERT_OK`, `ASSERT_NOK`, `ASSERT_NOK_WITH_MSG`.
+- **Prefer `ASSERT_*` over `EXPECT_*`** — `ASSERT_*` stops the test
immediately on failure, preventing cascading errors.
+ - **Exception**: In helper functions with a non-void return type, use
`EXPECT_*`. On failure, `ASSERT_*` executes a bare `return;`, which does not
compile in a function that returns a value.
+
+---
+
+## Tooling
+
+### Pre-commit Hooks (Recommended)
+
+The easiest way to stay compliant is to install
[pre-commit](https://pre-commit.com/):
+
+```bash
+pip install pre-commit
+pre-commit install
+```
+
+This installs Git hooks that automatically run on every commit:
+
+| Hook | What it checks |
+|------|----------------|
+| `clang-format` | C++ formatting (`.clang-format`) |
+| `cpplint` | Google C++ lint rules |
+| `cmake-format` | CMake file formatting (`.cmake-format.py`) |
+| `codespell` | Spelling errors |
+| `trailing-whitespace` | Trailing whitespace |
+| `end-of-file-fixer` | Missing newline at end of file |
+| `check-added-large-files` | Files > 5 MB |
+
+To run all hooks on all files manually:
+
+```bash
+pre-commit run --all-files
+```
+
+### clang-tidy
+
+Static analysis is configured in `.clang-tidy`. Key enabled check groups:
+
+- `bugprone-*` — Common bug patterns
+- `google-*` — Google style checks
+- `modernize-*` — C++ modernization suggestions
+- `clang-analyzer-*` — Clang static analyzer
+
+### Building & Testing
+
+```bash
+# Configure (from project root)
+mkdir -p build && cd build
+cmake ../ \
+ -DCMAKE_BUILD_TYPE=debug \
+ -DPAIMON_BUILD_TESTS=ON
+
+# Build
+make -j$(nproc)
+
+# Run a specific test suite
+./debug/paimon-core-test --gtest_filter="CoreOptionsTest*"
+```
+
+---
+
+## Pull Request Checklist
+
+Before submitting a PR, please verify:
+
+- [ ] Code compiles without warnings under `-Wall` (enabled by default in the
build system).
+- [ ] `pre-commit run --all-files` passes (or individual tools: clang-format,
cpplint, codespell).
+- [ ] New / modified public APIs are marked with `PAIMON_EXPORT`.
+- [ ] Every new file has the Apache 2.0 license header.
+- [ ] Error handling uses `Status` / `Result<T>` — no exceptions.
+- [ ] Existing helpers were checked before adding new utility code.
+- [ ] Tests are added or updated for the changed functionality.
+- [ ] `ASSERT_*` is preferred over `EXPECT_*` in tests.
+- [ ] No raw `new` / `delete` outside factory methods.
+- [ ] PR description follows the [template](.github/PULL_REQUEST_TEMPLATE.md).