Jefffrey commented on code in PR #9975:
URL: https://github.com/apache/arrow-rs/pull/9975#discussion_r3460276998


##########
.github/workflows/codspeed-pr.yml:
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Opt-in CodSpeed benchmarking for pull requests, gated by labels and
+# sharded one job per `[[bench]]` target in each selected crate.
+#
+# Label convention (managed manually on each PR):
+#
+#   bench:all                        # every [[bench]] in the workspace
+#   bench:<crate>                    # every [[bench]] in that crate
+#   bench:<crate> bench:<crate>      # union
+#
+# Where <crate> is a workspace member name, e.g. `bench:arrow`,
+# `bench:parquet`, `bench:arrow-cast`. `bench:all` short-circuits and
+# supersedes any per-crate labels.
+#
+# Topology mirrors codspeed.yml (setup + build run in parallel; bench
+# is a matrix that downloads the build artifact and runs one bench
+# target per shard). The `setup` job additionally filters the matrix
+# by labels.
+#
+# Authorization: only users with write access to the repo can add
+# labels, so the label is itself the authorization gate.
+#
+# Baseline: native `pull_request` event → CodSpeed compares against
+# the base branch's latest CodSpeed report automatically.
+#
+# Fork PR caveat: workflows triggered by `pull_request` from fork PRs
+# do not get an OIDC token. For benches on fork PRs, push the branch
+# to this repo and label it there.

Review Comment:
   I suppose this is an important caveat, considering essentially all our PRs
come from forks 🤔
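
For reference, the label convention described in the codspeed-pr.yml header
can be sketched as a small shell function. This is only an illustration under
assumptions: the name `resolve_bench_labels` and the `ALL` sentinel are
hypothetical and not taken from the PR, whose actual filtering lives in the
`setup` job and codspeed-matrix.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the documented label convention only;
# it is not the PR's implementation.

# Print the selected crates, one per line, or "ALL" when bench:all is set.
resolve_bench_labels() {
  local label
  local -a crates=()
  for label in "$@"; do
    case "$label" in
      bench:all)
        # bench:all short-circuits and supersedes any per-crate labels.
        echo "ALL"
        return 0
        ;;
      bench:?*)
        # Strip the "bench:" prefix to get the crate name.
        crates+=("${label#bench:}")
        ;;
    esac
  done
  # Union of per-crate labels: de-duplicate. Prints nothing if no
  # bench label is present.
  if ((${#crates[@]} > 0)); then
    printf '%s\n' "${crates[@]}" | sort -u
  fi
}

# Example: unrelated labels are ignored, duplicates collapse.
resolve_bench_labels "bug" "bench:parquet" "bench:arrow-cast" "bench:parquet"
```

Non-bench labels fall through the `case` untouched, so a PR carrying no
`bench:*` label selects nothing.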



##########
.github/workflows/codspeed.yml:
##########
@@ -0,0 +1,163 @@
+# Continuous benchmarking on CodSpeed.
+#
+# Runs the full workspace bench suite on every push to main, sharded
+# one job per `[[bench]]` target. Sharding at this granularity is
+# required because both crate-level shards (e.g. parquet alone has 16
+# bench targets producing >1000 benchmarks) and the workspace as a
+# whole exceed CodSpeed's 1000-benchmark per-upload limit. Jobs in the
+# same workflow are auto-aggregated by CodSpeed into one report.
+# https://codspeed.io/docs/features/sharded-benchmarks

Review Comment:
   I do wonder how heavy it would be to run the entire benchmark suite on every
commit 🤔



##########
.github/workflows/codspeed.yml:
##########
@@ -0,0 +1,163 @@
+# Topology:
+#
+#   setup ─┐
+#          ├──→ bench (matrix, ~78 jobs)
+#   build ─┘
+#
+# `setup` discovers every `[[bench]]` target via `cargo metadata` (see
+# codspeed-matrix.sh) and emits a {crate, bench} matrix. `build` does the
+# full-workspace `cargo codspeed build` exactly once and uploads
+# `target/codspeed/<mode>/` as an artifact. `bench` shards download the
+# artifact and run a single bench target each via the CodSpeed action;
+# no rebuild per shard.
+#
+# Auth is via GitHub OIDC; the workflow's `id-token` claim is what
+# CodSpeed verifies, so no secret token is required.
+#
+# The `criterion` workspace dependency is renamed to
+# `codspeed-criterion-compat`, so existing `use criterion::*` benches
+# work unmodified; outside of CodSpeed the compat layer falls back to
+# standard criterion behavior.
+
+name: codspeed
+
+concurrency:
+  group: ${{ github.repository }}-${{ github.workflow }}-${{ github.sha }}
+  cancel-in-progress: false
+
+on:
+  push:
+    branches:
+      - main
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  id-token: write
+
+env:
+  CODSPEED_FEATURES: arrow/test_utils,arrow/csv,arrow/json,arrow/chrono-tz,arrow/prettyprint,arrow-schema/ffi,parquet/arrow,parquet/async,parquet/test_common,parquet/experimental,parquet/object_store
+
+jobs:
+  setup:
+    name: Generate bench matrix
+    runs-on: ubuntu-latest
+    outputs:
+      matrix: ${{ steps.gen.outputs.matrix }}
+    steps:
+      - uses: actions/checkout@v6
+
+      - name: Generate {crate, bench} matrix across the workspace
+        id: gen
+        # Discovery + the known-broken exclusion list live in the shared
+        # codspeed-matrix.sh (also used by codspeed-pr.yml) so they stay in
+        # one place. No args = every workspace crate.
+        run: |
+          matrix="$(bash .github/workflows/codspeed-matrix.sh)"
+          echo "matrix=$matrix" >> "$GITHUB_OUTPUT"
+          echo "::notice::Generated $(jq length <<<"$matrix") bench shards (one per bench target, known-broken targets excluded)"
+
+  build:
+    name: Build workspace benchmarks
+    runs-on: ubuntu-latest
+    timeout-minutes: 60
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          submodules: true
+
+      - name: Install protoc
+        run: sudo apt-get update && sudo apt-get install -y protobuf-compiler
+
+      - name: Setup Rust toolchain, cache and cargo-codspeed
+        uses: moonrepo/setup-rust@v1
+        with:
+          channel: stable
+          cache-target: release
+          bins: cargo-codspeed
+
+      - name: Build benchmarks
+        # The --features list enables every feature any workspace bench
+        # target gates behind `required-features`; without these those
+        # benches silently aren't built.
+        run: cargo codspeed build --workspace --features "$CODSPEED_FEATURES"
+
+      - name: Pack bench binaries into a tarball
+        # actions/upload-artifact does not preserve Unix executable
+        # bits, so downloaded bench binaries land as 644 and
+        # `cargo codspeed run` then fails with EACCES. Tar preserves
+        # mode bits, so we wrap the directory before upload.
+        run: tar -cf codspeed-binaries.tar -C target codspeed
+
+      - name: Upload built bench binaries
+        uses: actions/upload-artifact@v4
+        with:
+          name: codspeed-binaries
+          path: codspeed-binaries.tar
+          retention-days: 1
+          if-no-files-found: error
+
+  bench:
+    needs: [setup, build]
+    name: ${{ matrix.config.crate }} / ${{ matrix.config.bench }}
+    runs-on: ubuntu-latest
+    timeout-minutes: 30
+    strategy:
+      fail-fast: false
+      matrix:
+        config: ${{ fromJson(needs.setup.outputs.matrix) }}
+    steps:
+      - uses: actions/checkout@v6

Review Comment:
   Minor note: we should pin the action versions to commit hashes, but that can be done in a follow-up.
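
A rough sketch of what pinning an action to a commit hash, as suggested in
the comment above, could look like. The helper name `pin_uses` is
hypothetical, and the SHA in the example is a placeholder, not a real commit
of actions/checkout; real values could be looked up with something like
`git ls-remote --tags https://github.com/actions/checkout`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite "uses: owner/repo@ref" so that it points at
# a full commit SHA and keeps the human-readable tag as a trailing comment.

# Usage: pin_uses LINE SHA TAG
pin_uses() {
  local line=$1 sha=$2 tag=$3
  # Only accept a full 40-character lowercase hex commit SHA.
  if [[ ! $sha =~ ^[0-9a-f]{40}$ ]]; then
    echo "pin_uses: not a full commit SHA: $sha" >&2
    return 1
  fi
  # Drop everything from the last '@' onwards, then append SHA and tag.
  printf '%s\n' "${line%@*}@${sha} # ${tag}"
}

# Placeholder SHA for illustration only (NOT a real actions/checkout commit).
placeholder_sha=0123456789abcdef0123456789abcdef01234567
pin_uses "      - uses: actions/checkout@v6" "$placeholder_sha" "v6"
```

Keeping the tag in a comment leaves the pinned line readable and makes later
version bumps easier to review.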

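
The tarball workaround in the `build` job above relies on tar recording file
mode bits in the archive. A minimal local sketch of that behaviour, assuming
a scratch directory and a fake binary named `fake-bench` (both hypothetical).
The unpack command is an assumption as well, since the download side of the
workflow is not shown in this hunk.

```shell
#!/usr/bin/env bash
# Sketch: show that an executable bit survives a tar round trip.

# True if the owner-execute bit is set, judged from `ls -l` output.
has_user_exec() {
  case "$(ls -l "$1")" in
    -??x*) return 0 ;;
    *) return 1 ;;
  esac
}

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Stand-in for target/codspeed with one executable "bench binary".
mkdir -p "$work/src/codspeed" "$work/dst"
printf '#!/bin/sh\necho bench\n' > "$work/src/codspeed/fake-bench"
chmod 755 "$work/src/codspeed/fake-bench"

# Pack as in the workflow step: -C keeps archive paths relative.
tar -cf "$work/codspeed-binaries.tar" -C "$work/src" codspeed

# Unpack elsewhere (assumed shape of the bench-side step).
tar -xf "$work/codspeed-binaries.tar" -C "$work/dst"

if has_user_exec "$work/dst/codspeed/fake-bench"; then
  echo "executable bit preserved"
else
  echo "executable bit lost"
fi
```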


