On Thu, Mar 12, 2026 at 12:39 PM Xuneng Zhou <[email protected]> wrote:
>
> On Thu, Mar 12, 2026 at 11:42 AM Michael Paquier <[email protected]> wrote:
> >
> > On Thu, Mar 12, 2026 at 06:33:08AM +0900, Michael Paquier wrote:
> > > Thanks for doing that. On my side, I am going to look at the gin and
> > > hash vacuum paths first with more testing as these don't use a custom
> > > callback. I don't think that I am going to need a lot of convincing,
> > > but I'd rather produce some numbers myself before doing something.
> > > I'll tweak a mounting point with the delay trick, as well.
> >
> > While debug_io_direct has been helping a bit, the trick for the delay
> > to throttle the IO activity has helped much more with my runtime
> > numbers. I have mounted a separate partition with a delay of 5ms,
> > disabled checksums (this part did not make a real difference), and
> > evicted shared buffers for relation and indexes before the VACUUM.
> >
> > Then I got better numbers. Here is an extract:
> > - worker=3:
> > gin_vacuum (100k tuples) base= 1448.2ms patch= 572.5ms 2.53x
> > ( 60.5%) (reads=175→104, io_time=1382.70→506.64ms)
> > gin_vacuum (300k tuples) base= 3728.0ms patch= 1332.0ms 2.80x
> > ( 64.3%) (reads=486→293, io_time=3669.89→1266.27ms)
> > bloom_vacuum (100k tuples) base= 21826.8ms patch= 17220.3ms 1.27x
> > ( 21.1%) (reads=485→117, io_time=4773.33→270.56ms)
> > bloom_vacuum (300k tuples) base= 67054.0ms patch= 53164.7ms 1.26x
> > ( 20.7%) (reads=1431.5→327.5, io_time=13880.2→381.395ms)
> > - io_uring:
> > gin_vacuum (100k tuples) base= 1240.3ms patch= 360.5ms 3.44x
> > ( 70.9%) (reads=175→104, io_time=1175.35→299.75ms)
> > gin_vacuum (300k tuples) base= 2829.9ms patch= 642.0ms 4.41x
> > ( 77.3%) (reads=465.5→293, io_time=2768.46→579.04ms)
> > bloom_vacuum (100k tuples) base= 22121.7ms patch= 17532.3ms 1.26x
> > ( 20.7%) (reads=485→117, io_time=4850.46→285.28ms)
> > bloom_vacuum (300k tuples) base= 67058.0ms patch= 53118.0ms 1.26x
> > ( 20.8%) (reads=1431.5→327.5, io_time=13870.9→305.44ms)
> >
> > The higher the number of tuples, the better the performance for each
> > individual operation, but the tests take a much longer time (tens of
> > seconds vs tens of minutes). For GIN, the numbers can be quite good
> > once these reads are pushed. For bloom, the runtime is improved, and
> > the IO numbers are much better.
> >
>
> -- io_uring, medium size
>
> bloom_vacuum_medium base= 8355.2ms patch= 715.0ms 11.68x
> ( 91.4%) (reads=4732→1056, io_time=7699.47→86.52ms)
> pgstattuple_medium base= 4012.8ms patch= 213.7ms 18.78x
> ( 94.7%) (reads=2006→2006, io_time=4001.66→200.24ms)
> pgstatindex_medium base= 5490.6ms patch= 37.9ms 144.88x
> ( 99.3%) (reads=2745→173, io_time=5481.54→7.82ms)
> hash_vacuum_medium base= 34483.4ms patch= 2703.5ms 12.75x
> ( 92.2%) (reads=19166→3901, io_time=31948.33→308.05ms)
> wal_logging_medium base= 7778.6ms patch= 7814.5ms 1.00x
> ( -0.5%) (reads=2857→2845, io_time=11.84→11.45ms)
>
> -- worker, medium size
> bloom_vacuum_medium base= 8376.2ms patch= 747.7ms 11.20x
> ( 91.1%) (reads=4732→1056, io_time=7688.91→65.49ms)
> pgstattuple_medium base= 4012.7ms patch= 339.0ms 11.84x
> ( 91.6%) (reads=2006→2006, io_time=4002.23→49.99ms)
> pgstatindex_medium base= 5490.3ms patch= 38.3ms 143.23x
> ( 99.3%) (reads=2745→173, io_time=5480.60→16.24ms)
> hash_vacuum_medium base= 34638.4ms patch= 2940.2ms 11.78x
> ( 91.5%) (reads=19166→3901, io_time=31881.61→242.01ms)
> wal_logging_medium base= 7440.1ms patch= 7434.0ms 1.00x
> ( 0.1%) (reads=2861→2825, io_time=10.62→10.71ms)
>
Our io_time metric currently measures only read time and ignores write
I/O, which can be misleading. We now separate it into read_time and
write_time.
-- write-delay 2 ms
WORKROOT=/srv/pg_delayed SIZES=small REPS=3
./run_streaming_benchmark.sh --baseline --io-method worker
--io-workers 12 --test hash_vacuum --direct-io --read-delay 2
--write-delay 2
v6-0004-Streamify-hash-index-VACUUM-primary-bucket-page-r.patch
hash_vacuum_small base= 16652.8ms patch= 13493.2ms 1.23x
( 19.0%) (reads=2338→815, read_time=4136.19→884.79ms,
writes=6218→6206, write_time=12313.81→12289.58ms)
-- write-delay 0 ms
WORKROOT=/srv/pg_delayed SIZES=small REPS=3
./run_streaming_benchmark.sh --baseline --io-method worker
--io-workers 12 --test hash_vacuum --direct-io --read-delay 2
--write-delay 0
v6-0004-Streamify-hash-index-VACUUM-primary-bucket-page-r.patch
hash_vacuum_small base= 4310.2ms patch= 1146.7ms 3.76x
( 73.4%) (reads=2338→815, read_time=4002.24→833.47ms,
writes=6218→6206, write_time=186.69→140.96ms)
--
Best,
Xuneng
#!/usr/bin/env bash
set -euo pipefail
###############################################################################
# Streaming Read Patches Benchmark
#
# Usage: ./run_streaming_benchmark.sh [OPTIONS] <patch>
#
# Options:
# --clean Remove existing builds and start fresh
# --baseline Also build and test vanilla PostgreSQL for comparison
# --test TEST Run specific test (bloom_scan, bloom_vacuum, pgstattuple,
# pgstatindex, gin_vacuum, wal_logging, hash_vacuum, or "all")
# --io-method MODE I/O method: io_uring, worker, or sync (default: io_uring)
# --io-workers N Number of I/O workers for worker mode (default: 3)
# --io-concurrency N Max concurrent I/Os per process (default: 64)
# --direct-io Enable direct IO (debug_io_direct=data), bypasses OS page cache
# --read-delay MS Simulate read latency via dm_delay (requires pre-created device)
# --write-delay MS Simulate write latency via dm_delay (default: 0, requires --read-delay)
# --profile Enable perf profiling and flamegraph generation
#
# Environment:
# WORKROOT Base directory (default: $HOME/pg_bench)
# REPS Repetitions per test (default: 5)
# SIZES Table sizes to test (default: "large")
# FLAMEGRAPH_DIR Path to FlameGraph tools (default: $HOME/FlameGraph)
# DM_DELAY_DEV dm_delay device name for --read-delay (default: "delayed")
###############################################################################
# Print a highlighted progress message to stdout.
log() {
  printf '\033[1;34m==>\033[0m %s\n' "$*"
}
# Print a highlighted error message to stderr and abort the script.
die() {
  printf '\033[1;31mERROR:\033[0m %s\n' "$*" >&2
  exit 1
}
# --- CLI parsing ---
# Defaults for all options; the IO_* and DM_DELAY_DEV settings may also be
# seeded from the environment (CLI flags override them below).
CLEAN=0
BASELINE=0
DO_PROFILE=0
DIRECT_IO=0
IO_DELAY_MS=""
WRITE_DELAY_MS="0"
TEST="all"
IO_METHOD="${IO_METHOD:-io_uring}"
IO_WORKERS="${IO_WORKERS:-3}"
IO_MAX_CONCURRENCY="${IO_MAX_CONCURRENCY:-64}"
DM_DELAY_DEV="${DM_DELAY_DEV:-delayed}"
PATCH=""
while [[ $# -gt 0 ]]; do
  case "$1" in
    --clean) CLEAN=1 ;;
    --baseline) BASELINE=1 ;;
    --profile) DO_PROFILE=1 ;;
    --direct-io) DIRECT_IO=1 ;;
    # ${2:?msg} aborts with a clear message when a value-taking option is the
    # last argument; a bare "$2" would instead trip 'set -u' with a cryptic
    # "unbound variable" error.
    --read-delay) IO_DELAY_MS="${2:?--read-delay requires a value}"; shift ;;
    --write-delay) WRITE_DELAY_MS="${2:?--write-delay requires a value}"; shift ;;
    --test) TEST="${2:?--test requires a value}"; shift ;;
    --io-method) IO_METHOD="${2:?--io-method requires a value}"; shift ;;
    --io-workers) IO_WORKERS="${2:?--io-workers requires a value}"; shift ;;
    --io-concurrency) IO_MAX_CONCURRENCY="${2:?--io-concurrency requires a value}"; shift ;;
    -h|--help) sed -n '3,27p' "$0" | sed 's/^# \?//'; exit 0 ;;
    -*) die "Unknown option: $1" ;;
    *)
      # Reject a second positional argument instead of silently overwriting
      # the first patch path.
      [[ -z "$PATCH" ]] || die "Multiple patch arguments: '$PATCH' and '$1'"
      PATCH="$1"
      ;;
  esac
  shift
done
# Validate io_method
case "$IO_METHOD" in
io_uring|worker|sync) ;;
*) die "Invalid --io-method: $IO_METHOD (must be io_uring, worker, or sync)" ;;
esac
# Validate dm_delay device if --read-delay is used
if [[ -n "$IO_DELAY_MS" ]]; then
command -v dmsetup >/dev/null 2>&1 || die "--read-delay requires dmsetup (sudo apt install dmsetup)"
sudo dmsetup status "$DM_DELAY_DEV" >/dev/null 2>&1 \
|| die "dm_delay device '$DM_DELAY_DEV' not found. Create it first, e.g.:\n umount /srv && dmsetup create $DM_DELAY_DEV --table \"0 \$(blockdev --getsz /dev/DEVICE) delay /dev/DEVICE 0 $IO_DELAY_MS\" && mount /dev/mapper/$DM_DELAY_DEV /srv/"
fi
# A patch file is mandatory; resolve it to an absolute path because build_pg
# later cd's into the source tree before applying it.
[[ -z "$PATCH" ]] && die "Usage: $0 [--clean] [--baseline] [--test TEST] <patch>"
[[ ! -f "$PATCH" ]] && die "Patch not found: $PATCH"
[[ "$PATCH" != /* ]] && PATCH="$PWD/$PATCH"
# --- Profiling validation ---
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-$HOME/FlameGraph}"
PERF_SUDO="${PERF_SUDO:-sudo}"
PERF_EVENT="${PERF_EVENT:-cycles}" # cycles = user+kernel; cycles:u = user-only
# Fail early if --profile was requested but the perf/FlameGraph toolchain
# is missing, rather than after a long benchmark run.
if [[ $DO_PROFILE -eq 1 ]]; then
command -v perf >/dev/null 2>&1 || die "Need perf (sudo apt install linux-tools-$(uname -r))"
[[ -x "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" ]] || die "Missing $FLAMEGRAPH_DIR/stackcollapse-perf.pl (git clone https://github.com/brendangregg/FlameGraph)"
[[ -x "$FLAMEGRAPH_DIR/flamegraph.pl" ]] || die "Missing $FLAMEGRAPH_DIR/flamegraph.pl"
fi
# --- Configuration ---
WORKROOT="${WORKROOT:-$HOME/pg_bench}"
REPS="${REPS:-5}"
SIZES="${SIZES:-large}"
ROOT_BASE="$WORKROOT/vanilla"
# Sanitized patch basename (alnum/_/- only, 40 chars max) names the per-patch
# build directory so different patches get separate builds.
PATCH_TAG=$(basename "$PATCH" .patch | tr -dc '[:alnum:]_-' | cut -c1-40)
ROOT_PATCH="$WORKROOT/$PATCH_TAG"
# --- Helpers ---
pg() { echo "$1/pg/bin/$2"; }
# Print the first TCP port at or above ${1:-5432} with no active listener.
# Dies if no free port is found up to 60000.
pick_port() {
  local candidate
  for ((candidate = ${1:-5432}; candidate <= 60000; candidate++)); do
    if ! lsof -iTCP:"$candidate" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "$candidate"
      return
    fi
  done
  die "No free port found"
}
set_io_delay() {
# Reprogram the dm-delay device to $1 ms read delay and $WRITE_DELAY_MS ms
# write delay. No-op unless --read-delay was given on the command line.
local ms="$1"
[[ -z "$IO_DELAY_MS" ]] && return
local table size dev
# Parse size ($2) and backing device ($4) out of the current dm table line;
# assumes the device already uses a 'delay' target — TODO confirm for other
# table layouts.
table=$(sudo dmsetup table "$DM_DELAY_DEV")
size=$(echo "$table" | awk '{print $2}')
dev=$(echo "$table" | awk '{print $4}')
log "Setting dm_delay on $DM_DELAY_DEV to ${ms}ms read / ${WRITE_DELAY_MS}ms write"
# suspend → reload → resume swaps in the new table without tearing down the
# device (and therefore without unmounting the filesystem on top of it).
sudo dmsetup suspend "$DM_DELAY_DEV"
sudo dmsetup reload "$DM_DELAY_DEV" --table "0 $size delay $dev 0 $ms $dev 0 $WRITE_DELAY_MS"
sudo dmsetup resume "$DM_DELAY_DEV"
}
# --- Build PostgreSQL ---
build_pg() {
# Clone, optionally patch, configure and install PostgreSQL under $1/pg.
# $2 (optional) is an absolute patch path applied with 'git apply'.
# Reuses an existing build unless --clean was given.
local ROOT="$1" PATCH_FILE="${2:-}"
[[ $CLEAN -eq 1 ]] && rm -rf "$ROOT"
if [[ ! -x "$(pg "$ROOT" initdb)" ]]; then
log "Building PostgreSQL: $ROOT"
mkdir -p "$ROOT"
git clone --depth 1 https://github.com/postgres/postgres "$ROOT/src" 2>/dev/null
cd "$ROOT/src"
[[ -n "$PATCH_FILE" ]] && { log "Applying patch"; git apply "$PATCH_FILE"; }
# Frame pointers kept for usable perf call graphs; liburing for io_uring.
./configure --prefix="$ROOT/pg" --with-liburing \
CFLAGS='-O2 -ggdb3 -fno-omit-frame-pointer' >/dev/null 2>&1
make -j"$(nproc)" install >/dev/null 2>&1
else
log "Reusing build: $ROOT"
cd "$ROOT/src"
fi
# Always install contribs (idempotent, catches reused builds missing new extensions)
make -C contrib/bloom install >/dev/null 2>&1
make -C contrib/pgstattuple install >/dev/null 2>&1
make -C contrib/pg_buffercache install >/dev/null 2>&1
make -C contrib/pg_prewarm install >/dev/null 2>&1
}
# --- Cluster management ---
init_cluster() {
# Wipe and re-initdb the data directory under $1, write the benchmark
# postgresql.conf (port $2), start the server, and create the extensions
# needed by drop_caches/warmup_catalog.
local ROOT="$1" PORT="$2"
rm -rf "$ROOT/data"
"$(pg "$ROOT" initdb)" -D "$ROOT/data" --no-locale >/dev/null 2>&1
# autovacuum off + 1h checkpoints keep background activity out of the
# measured window; parallelism disabled for stable single-backend numbers.
cat >> "$ROOT/data/postgresql.conf" <<EOF
port = $PORT
listen_addresses = '127.0.0.1'
shared_buffers = '32GB'
effective_io_concurrency = 200
io_method = $IO_METHOD
io_workers = $IO_WORKERS
io_max_concurrency = $IO_MAX_CONCURRENCY
track_io_timing = on
track_wal_io_timing = on
synchronous_commit = on
autovacuum = off
checkpoint_timeout = 1h
max_wal_size = 10GB
max_parallel_workers_per_gather = 0
EOF
[[ $DIRECT_IO -eq 1 ]] && echo "debug_io_direct = data" >> "$ROOT/data/postgresql.conf"
"$(pg "$ROOT" pg_ctl)" -D "$ROOT/data" -l "$ROOT/server.log" start -w >/dev/null
psql_run "$ROOT" "$PORT" -c "CREATE EXTENSION IF NOT EXISTS pg_buffercache;"
psql_run "$ROOT" "$PORT" -c "CREATE EXTENSION IF NOT EXISTS pg_prewarm;"
}
# Fast-stop the cluster under build root $1; tolerate an already-down server.
stop_cluster() {
  local ROOT="$1"
  local ctl
  ctl="$(pg "$ROOT" pg_ctl)"
  "$ctl" -D "$ROOT/data" stop -m fast 2>/dev/null || true
}
drop_caches() {
# Make the next access to the given relations cold: evict them from shared
# buffers and (unless direct IO is on) drop the OS page cache.
# Usage: drop_caches ROOT PORT RELATION...
local ROOT="$1" PORT="$2"
shift 2
local rels=("$@")
# Evict target relations from shared buffers (no PG restart needed)
for rel in "${rels[@]}"; do
psql_run "$ROOT" "$PORT" -c "SELECT pg_buffercache_evict_relation('${rel}'::regclass);" >/dev/null
done
# Drop OS page cache (skip with direct IO — no page cache involved)
if [[ $DIRECT_IO -eq 0 ]]; then
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null 2>&1 || true
sleep 2
fi
}
# Run psql against the cluster at ROOT ($1) / PORT ($2); remaining arguments
# are passed straight through (-c, a heredoc on stdin, etc.).
# -Atq = unaligned, tuples-only, quiet; ON_ERROR_STOP makes SQL errors fatal.
psql_run() {
  local ROOT="$1" PORT="$2"
  shift 2
  local psql_bin
  psql_bin="$(pg "$ROOT" psql)"
  "$psql_bin" -h 127.0.0.1 -p "$PORT" -d postgres -v ON_ERROR_STOP=1 -Atq "$@"
}
# --- Timing ---
run_timed() {
# Execute $3 once via psql \timing and print the elapsed time in ms
# (3 decimal places). Returns non-zero if no timing line was produced.
local ROOT="$1" PORT="$2" SQL="$3"
local ms
# -X: ignore .psqlrc, -v ON_ERROR_STOP=1: fail on SQL errors
# Parse last Time: line, handle both "ms" and "s" units
ms=$("$(pg "$ROOT" psql)" -h 127.0.0.1 -p "$PORT" -d postgres -X -v ON_ERROR_STOP=1 -At \
-c '\timing on' -c "$SQL" 2>&1 | \
awk '
/Time:/ {
val=$2; unit=$3;
if (unit=="ms") ms=val;
else if (unit=="s") ms=val*1000;
}
END { if (ms=="") exit 1; printf "%.3f\n", ms; }
')
# Validate numeric output
[[ "$ms" =~ ^[0-9]+(\.[0-9]+)?$ ]] || { echo "ERROR: Non-numeric timing: $ms" >&2; return 1; }
echo "$ms"
}
# --- I/O Stats ---
# Run SQL and capture timing + I/O stats from pg_stat_io
# Resets stats before query, waits for flush, then reads absolute values
# Note: pg_stat_io has PGSTAT_MIN_INTERVAL=1000ms flush delay, so we wait 1.5s
# after the query to ensure stats are flushed to shared memory.
# Note: pg_stat_io counts I/O operations, not pages (with io_combine_limit=128kB,
# up to 16 pages per operation). This is expected behavior.
# Returns: ms,reads,read_time,writes,write_time
# Execute $3 once and print "ms,reads,read_time,writes,write_time", combining
# psql \timing output with pg_stat_io counters (reset before the query).
run_timed_with_io() {
  local ROOT="$1" PORT="$2" SQL="$3"
  local result
  # Reset stats, run query, wait for flush, read absolute values
  # - Filter by client backend and io worker (excludes bgwriter/checkpointer)
  # - 1.5s delay allows stats to flush (PGSTAT_MIN_INTERVAL=1000ms)
  # BUGFIX: '2>&1' must precede the heredoc redirection. It previously sat on
  # its own line after EOSQL, where the shell parses it as a separate no-op
  # command, so psql's stderr (errors) was never captured into $result.
  result=$("$(pg "$ROOT" psql)" -h 127.0.0.1 -p "$PORT" -d postgres -X -v ON_ERROR_STOP=1 2>&1 <<EOSQL
SELECT pg_stat_reset_shared('io');
\\timing on
$SQL
\\timing off
SELECT pg_sleep(1.5);
\\t on
SELECT
COALESCE(SUM(reads),0)::bigint,
COALESCE(SUM(read_time),0)::numeric(12,2),
COALESCE(SUM(writes),0)::bigint,
COALESCE(SUM(write_time),0)::numeric(12,2)
FROM pg_stat_io
WHERE object = 'relation' AND backend_type IN ('client backend', 'io worker');
EOSQL
)
  # Parse timing (last Time: line); units may be "ms" or "s".
  local ms
  ms=$(echo "$result" | awk '
/Time:/ {
val=$2; unit=$3;
if (unit=="ms") ms=val;
else if (unit=="s") ms=val*1000;
}
END { if (ms=="") exit 1; printf "%.3f\n", ms; }
')
  # Parse I/O stats (last non-empty line with pipe separator: reads|read_time|writes|write_time)
  local reads read_time writes write_time
  local io_line
  io_line=$(echo "$result" | grep '|' | tail -1)
  reads=$(echo "$io_line" | cut -d'|' -f1 | tr -d ' ')
  read_time=$(echo "$io_line" | cut -d'|' -f2 | tr -d ' ')
  writes=$(echo "$io_line" | cut -d'|' -f3 | tr -d ' ')
  write_time=$(echo "$io_line" | cut -d'|' -f4 | tr -d ' ')
  # Default to 0 if not found
  [[ "$reads" =~ ^-?[0-9]+$ ]] || reads=0
  [[ "$read_time" =~ ^-?[0-9]+(\.[0-9]+)?$ ]] || read_time=0
  [[ "$writes" =~ ^-?[0-9]+$ ]] || writes=0
  [[ "$write_time" =~ ^-?[0-9]+(\.[0-9]+)?$ ]] || write_time=0
  echo "$ms,$reads,$read_time,$writes,$write_time"
}
# --- Statistics ---
# Median of CSV column 2 (ms), skipping the header row; prints 0 for an
# empty file. Sorting is a simple O(n^2) exchange sort — n is tiny (REPS).
calc_median() {
  awk -F, '
    NR > 1 { vals[++cnt] = $2 }
    END {
      if (cnt == 0) { print 0; exit }
      for (i = 1; i <= cnt; i++)
        for (j = i + 1; j <= cnt; j++)
          if (vals[i] > vals[j]) { tmp = vals[i]; vals[i] = vals[j]; vals[j] = tmp }
      print (cnt % 2) ? vals[int(cnt / 2) + 1] : (vals[cnt / 2] + vals[cnt / 2 + 1]) / 2
    }' "$1"
}
# Median of an arbitrary CSV column ($2) of file $1, skipping the header row;
# prints 0 for an empty file.
calc_median_col() {
  local file="$1" col="$2"
  awk -F, -v col="$col" '
    NR > 1 { vals[++cnt] = $col }
    END {
      if (cnt == 0) { print 0; exit }
      for (i = 1; i <= cnt; i++)
        for (j = i + 1; j <= cnt; j++)
          if (vals[i] > vals[j]) { tmp = vals[i]; vals[i] = vals[j]; vals[j] = tmp }
      print (cnt % 2) ? vals[int(cnt / 2) + 1] : (vals[cnt / 2] + vals[cnt / 2 + 1]) / 2
    }' "$file"
}
# Summarize CSV column 2 (ms): prints "median=..ms mean=..±..ms n=..".
# Prints nothing for an empty file. Uses population standard deviation.
calc_stats() {
  local csv="$1"
  awk -F, '
    NR > 1 { vals[++cnt] = $2; total += $2 }
    END {
      if (cnt == 0) exit
      for (i = 1; i <= cnt; i++)
        for (j = i + 1; j <= cnt; j++)
          if (vals[i] > vals[j]) { tmp = vals[i]; vals[i] = vals[j]; vals[j] = tmp }
      med = (cnt % 2) ? vals[int(cnt / 2) + 1] : (vals[cnt / 2] + vals[cnt / 2 + 1]) / 2
      mean = total / cnt
      for (i = 1; i <= cnt; i++) dev += (vals[i] - mean) ^ 2
      sd = sqrt(dev / cnt)
      printf "median=%.1fms mean=%.1f±%.1fms n=%d", med, mean, sd, cnt
    }' "$csv"
}
# --- Profiling ---
# Run a SQL command under perf, attaching to the backend PID.
# Generates perf.data and flamegraph SVG.
# profile_sql ROOT PORT LABEL SQL
profile_sql() {
# Run SQL ($4) under perf attached to its backend, then build a flamegraph.
# No-op unless --profile was given. Output goes to $ROOT/profile/LABEL.*.
[[ $DO_PROFILE -ne 1 ]] && return
local ROOT="$1" PORT="$2" LABEL="$3" SQL="$4"
local PROF_DIR="$ROOT/profile"
mkdir -p "$PROF_DIR"
local PERF_DATA="$PROF_DIR/${LABEL}.perf.data"
local SVG="$PROF_DIR/${LABEL}.svg"
local psql_bin
psql_bin="$(pg "$ROOT" psql)"
# Use a unique application_name to find the backend PID
local APP="prof_${LABEL}_$$"
# Launch a psql session that will first identify itself, then run the SQL
# The pg_sleep() gives us time to find the backend PID and attach perf
PGAPPNAME="$APP" "$psql_bin" -h 127.0.0.1 -p "$PORT" -d postgres \
-X -v ON_ERROR_STOP=1 <<EOSQL >/dev/null 2>&1 &
SELECT pg_sleep(2);
$SQL
EOSQL
local QUERY_SHELL_PID=$!
# Find the backend PID via pg_stat_activity
# Poll for up to ~5s (100 × 0.05s) for the session to appear.
local BACKEND_PID=""
for ((n=0; n<100; n++)); do
BACKEND_PID=$("$psql_bin" -h 127.0.0.1 -p "$PORT" -d postgres -Atq \
-c "SELECT pid FROM pg_stat_activity WHERE application_name='${APP}' ORDER BY backend_start DESC LIMIT 1;" 2>/dev/null)
[[ -n "$BACKEND_PID" && -d "/proc/$BACKEND_PID" ]] && break
sleep 0.05
done
if [[ -z "$BACKEND_PID" || ! -d "/proc/$BACKEND_PID" ]]; then
log "WARNING: Could not find backend PID for profiling, skipping"
wait "$QUERY_SHELL_PID" 2>/dev/null || true
return
fi
log "Profiling backend PID $BACKEND_PID → $PERF_DATA"
# Attach perf to the backend; we explicitly kill -INT it after the query finishes
$PERF_SUDO perf record -g --call-graph dwarf \
-p "$BACKEND_PID" -o "$PERF_DATA" \
--event="$PERF_EVENT" 2>/dev/null &
local PERF_PID=$!
sleep 0.1
# Verify perf actually started (permissions, valid PID, etc.)
if ! kill -0 "$PERF_PID" 2>/dev/null; then
log "WARNING: perf record failed to start (permissions/config?), skipping flamegraph"
wait "$QUERY_SHELL_PID" 2>/dev/null || true
return
fi
# Wait for the query to finish
wait "$QUERY_SHELL_PID" 2>/dev/null || true
# Give perf a moment to flush, then stop it
# SIGINT makes perf write out its data file cleanly before exiting.
sleep 0.5
$PERF_SUDO kill -INT "$PERF_PID" 2>/dev/null || true; wait "$PERF_PID" 2>/dev/null || true
# Generate flamegraph
generate_flamegraph "$PERF_DATA" "$SVG" "$LABEL"
}
# Convert perf.data → flamegraph SVG
# generate_flamegraph PERF_DATA SVG_PATH TITLE
generate_flamegraph() {
  local PERF_DATA="$1" SVG="$2" TITLE="$3"
  # Nothing to do when perf produced no data file.
  [[ -f "$PERF_DATA" ]] || return
  local FOLDED="${PERF_DATA%.perf.data}.folded"
  # Collapse stacks; only render the SVG when collapsing succeeded AND
  # produced a non-empty file (perf script can silently emit nothing).
  if $PERF_SUDO perf script -i "$PERF_DATA" 2>/dev/null \
      | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" > "$FOLDED" 2>/dev/null \
      && [[ -s "$FOLDED" ]]; then
    "$FLAMEGRAPH_DIR/flamegraph.pl" --title "$TITLE" --countname samples \
      "$FOLDED" > "$SVG" 2>/dev/null
    log "Flamegraph: $SVG"
  else
    log "WARNING: Failed to generate flamegraph for $TITLE"
  fi
  # The intermediate folded-stacks file is never needed again.
  rm -f "$FOLDED"
}
# --- Benchmark runner ---
# benchmark ROOT PORT NAME SQL RELATION [RELATION...]
benchmark() {
# Run SQL ($4) REPS times with cold caches, appending per-rep timing and
# I/O stats to $ROOT/results/NAME.csv. Args after $4 are relations to evict
# before each rep. Usage: benchmark ROOT PORT NAME SQL RELATION...
local ROOT="$1" PORT="$2" NAME="$3" SQL="$4"
shift 4
local rels=("$@")
local OUT="$ROOT/results/${NAME}.csv"
mkdir -p "$ROOT/results"
echo "run,ms,reads,read_time_ms,writes,write_time_ms" > "$OUT"
for ((i=1; i<=REPS; i++)); do
drop_caches "$ROOT" "$PORT" "${rels[@]}"
local result ms reads read_time writes write_time
result=$(run_timed_with_io "$ROOT" "$PORT" "$SQL")
IFS=',' read -r ms reads read_time writes write_time <<<"$result"
echo "$i,$ms,$reads,$read_time,$writes,$write_time" >> "$OUT"
log "$NAME [$i/$REPS]: ${ms}ms (reads=$reads, read_time=${read_time}ms, writes=$writes, write_time=${write_time}ms)"
done
}
# --- Data setup functions ---
# Create and populate the bloom test table + index, sized by $3
# (small/medium/large). Any other size aborts with a clear error instead of
# leaving NROWS unset (which previously died cryptically under 'set -u').
setup_bloom() {
  local ROOT="$1" PORT="$2" SIZE="$3"
  local NROWS
  case "$SIZE" in
    small) NROWS=100000 ;;
    medium) NROWS=1000000 ;;
    large) NROWS=10000000 ;;
    *) die "Unknown size '$SIZE' (expected small, medium, or large)" ;;
  esac
  log "Creating Bloom test data ($SIZE: $NROWS rows)"
  psql_run "$ROOT" "$PORT" <<SQL
CREATE EXTENSION IF NOT EXISTS bloom;
DROP TABLE IF EXISTS bloom_test;
CREATE TABLE bloom_test (id INT, data TEXT, val1 INT, val2 INT);
INSERT INTO bloom_test SELECT i, 'data_'||i, i%1000, i%100 FROM generate_series(1,$NROWS) i;
CREATE INDEX bloom_idx ON bloom_test USING bloom (val1, val2);
VACUUM ANALYZE bloom_test;
CHECKPOINT;
SQL
}
# Create and populate the heap table for pgstattuple tests, sized by $3.
# Unknown sizes abort early instead of tripping 'set -u' on unset NROWS.
setup_pgstattuple() {
  local ROOT="$1" PORT="$2" SIZE="$3"
  local NROWS
  case "$SIZE" in
    small) NROWS=100000 ;;
    medium) NROWS=1000000 ;;
    large) NROWS=10000000 ;;
    *) die "Unknown size '$SIZE' (expected small, medium, or large)" ;;
  esac
  log "Creating pgstattuple test data ($SIZE: $NROWS rows)"
  psql_run "$ROOT" "$PORT" <<SQL
CREATE EXTENSION IF NOT EXISTS pgstattuple;
DROP TABLE IF EXISTS heap_test;
CREATE TABLE heap_test (id SERIAL PRIMARY KEY, data TEXT);
INSERT INTO heap_test (data) SELECT repeat('x',100) FROM generate_series(1,$NROWS);
VACUUM ANALYZE heap_test;
CHECKPOINT;
SQL
}
# Create and populate the btree-indexed table for pgstatindex tests, sized by
# $3. Unknown sizes abort early instead of tripping 'set -u' on unset NROWS.
setup_pgstatindex() {
  local ROOT="$1" PORT="$2" SIZE="$3"
  local NROWS
  case "$SIZE" in
    small) NROWS=100000 ;;
    medium) NROWS=1000000 ;;
    large) NROWS=10000000 ;;
    *) die "Unknown size '$SIZE' (expected small, medium, or large)" ;;
  esac
  log "Creating pgstatindex test data ($SIZE: $NROWS rows)"
  psql_run "$ROOT" "$PORT" <<SQL
CREATE EXTENSION IF NOT EXISTS pgstattuple;
DROP TABLE IF EXISTS idx_test;
CREATE TABLE idx_test (id SERIAL PRIMARY KEY, data TEXT);
INSERT INTO idx_test (data) SELECT 'data_row_' || i || '_' || repeat('x',50) FROM generate_series(1,$NROWS) i;
VACUUM ANALYZE idx_test;
CHECKPOINT;
SQL
}
# Create and populate the GIN test table + index, sized by $3. Note large is
# 5M rows here (not 10M). Unknown sizes abort early instead of tripping
# 'set -u' on unset NROWS.
setup_gin() {
  local ROOT="$1" PORT="$2" SIZE="$3"
  local NROWS
  case "$SIZE" in
    small) NROWS=100000 ;;
    medium) NROWS=1000000 ;;
    large) NROWS=5000000 ;;
    *) die "Unknown size '$SIZE' (expected small, medium, or large)" ;;
  esac
  log "Creating GIN test data ($SIZE: $NROWS rows)"
  psql_run "$ROOT" "$PORT" <<SQL
DROP TABLE IF EXISTS gin_test;
-- No PRIMARY KEY: isolate GIN index vacuum from btree overhead
CREATE TABLE gin_test (id INT, tags TEXT[]);
INSERT INTO gin_test (id, tags)
SELECT i, ARRAY(SELECT 'tag_'||(random()*100)::int FROM generate_series(1,5))
FROM generate_series(1,$NROWS) i;
CREATE INDEX gin_idx ON gin_test USING gin (tags);
VACUUM ANALYZE gin_test;
CHECKPOINT;
SQL
}
# Create and populate the hash test table + index, sized by $3.
# Unknown sizes abort early instead of tripping 'set -u' on unset NROWS.
setup_hash() {
  local ROOT="$1" PORT="$2" SIZE="$3"
  local NROWS
  case "$SIZE" in
    small) NROWS=500000 ;;
    medium) NROWS=5000000 ;;
    large) NROWS=20000000 ;;
    *) die "Unknown size '$SIZE' (expected small, medium, or large)" ;;
  esac
  log "Creating Hash test data ($SIZE: $NROWS unique values)"
  psql_run "$ROOT" "$PORT" <<SQL
DROP TABLE IF EXISTS hash_test;
-- No PRIMARY KEY: isolate hash index vacuum from btree overhead
CREATE TABLE hash_test (id INT, data TEXT);
INSERT INTO hash_test SELECT i, 'x' FROM generate_series(1,$NROWS) i;
CREATE INDEX hash_idx ON hash_test USING hash (id);
VACUUM ANALYZE hash_test;
CHECKPOINT;
SQL
}
# Create the tsvector table used by the GIN-build / log_newpage_range test,
# sized by $3. Unknown sizes abort early instead of tripping 'set -u' on
# unset NROWS.
setup_wal() {
  local ROOT="$1" PORT="$2" SIZE="$3"
  local NROWS
  case "$SIZE" in
    small) NROWS=1000000 ;;
    medium) NROWS=5000000 ;;
    large) NROWS=20000000 ;;
    *) die "Unknown size '$SIZE' (expected small, medium, or large)" ;;
  esac
  log "Creating table for GIN index build / log_newpage_range test ($SIZE: $NROWS rows)"
  psql_run "$ROOT" "$PORT" <<SQL
DROP TABLE IF EXISTS wal_test;
-- Table with tsvector column for GIN indexing (full-text search)
-- GIN index builds always call log_newpage_range() at the end of
-- ginbuild() (gininsert.c) to WAL-log all index pages.
CREATE TABLE wal_test (id INT, doc TEXT, doc_tsv TSVECTOR);
INSERT INTO wal_test
SELECT i,
'word' || (random()*10000)::int || ' term' || (random()*10000)::int
|| ' token' || (random()*5000)::int || ' phrase' || (random()*8000)::int,
to_tsvector('simple',
'word' || (random()*10000)::int || ' term' || (random()*10000)::int
|| ' token' || (random()*5000)::int || ' phrase' || (random()*8000)::int)
FROM generate_series(1,$NROWS) i;
VACUUM ANALYZE wal_test;
CHECKPOINT;
SQL
}
# --- Test functions ---
test_bloom_scan() {
# Benchmark a cold index scan through the bloom index; optionally profile.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
setup_bloom "$ROOT" "$PORT" "$SIZE"
benchmark "$ROOT" "$PORT" "${LABEL}_bloom_scan_${SIZE}" \
"SET enable_seqscan=off; SELECT COUNT(*) FROM bloom_test WHERE val1=42 AND val2=7;" \
bloom_test bloom_idx
# Profile after benchmark reps: shared_buffers memory already faulted in,
# so page-fault noise is gone; drop_caches ensures cold IO for the profile.
if [[ $DO_PROFILE -eq 1 ]]; then
drop_caches "$ROOT" "$PORT" bloom_test bloom_idx
profile_sql "$ROOT" "$PORT" "${LABEL}_bloom_scan_${SIZE}" \
"SET enable_seqscan=off; SELECT COUNT(*) FROM bloom_test WHERE val1=42 AND val2=7;"
fi
}
test_bloom_vacuum() {
# Benchmark VACUUM of the bloom index with 10% dead tuples, rebuilding the
# table before every rep so each run vacuums an identical state.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
local OUT="$ROOT/results/${LABEL}_bloom_vacuum_${SIZE}.csv"
mkdir -p "$ROOT/results"
echo "run,ms,reads,read_time_ms,writes,write_time_ms" > "$OUT"
for ((i=1; i<=REPS; i++)); do
# Fresh table each run for consistent state
setup_bloom "$ROOT" "$PORT" "$SIZE"
# Create 10% dead tuples
psql_run "$ROOT" "$PORT" -c "DELETE FROM bloom_test WHERE id % 10 = 0;"
drop_caches "$ROOT" "$PORT" bloom_test bloom_idx
local result ms reads read_time writes write_time
result=$(run_timed_with_io "$ROOT" "$PORT" "VACUUM bloom_test;")
IFS=',' read -r ms reads read_time writes write_time <<<"$result"
echo "$i,$ms,$reads,$read_time,$writes,$write_time" >> "$OUT"
log "${LABEL}_bloom_vacuum_${SIZE} [$i/$REPS]: ${ms}ms (reads=$reads, read_time=${read_time}ms, writes=$writes, write_time=${write_time}ms)"
done
if [[ $DO_PROFILE -eq 1 ]]; then
setup_bloom "$ROOT" "$PORT" "$SIZE"
psql_run "$ROOT" "$PORT" -c "DELETE FROM bloom_test WHERE id % 10 = 0;"
drop_caches "$ROOT" "$PORT" bloom_test bloom_idx
profile_sql "$ROOT" "$PORT" "${LABEL}_bloom_vacuum_${SIZE}" "VACUUM bloom_test;"
fi
}
test_pgstattuple() {
# Benchmark pgstattuple_approx over a heap whose VM all-visible bits were
# cleared by a rolled-back DELETE, forcing real page reads each rep.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
local OUT="$ROOT/results/${LABEL}_pgstattuple_${SIZE}.csv"
mkdir -p "$ROOT/results"
echo "run,ms,reads,read_time_ms,writes,write_time_ms" > "$OUT"
# Setup once — rolled-back DELETE keeps layout identical across all reps
setup_pgstattuple "$ROOT" "$PORT" "$SIZE"
# Rolled-back DELETE clears the all-visible bit in the Visibility Map so
# pgstattuple_approx must actually read those pages (it skips all-visible pages).
# Using ROLLBACK keeps the physical layout identical across all reps (no TOAST
# out-of-page updates, no dirty pages to flush from shared_buffers).
psql_run "$ROOT" "$PORT" -c "BEGIN; DELETE FROM heap_test WHERE id % 500 = 0; ROLLBACK;"
# Warmup pass: The rolled-back DELETE left every touched tuple with an xmax
# pointing to the aborted transaction but no hint bits set. On the first
# pgstattuple_approx call, HeapTupleSatisfiesVacuum → HeapTupleSatisfiesVacuumHorizon
# must resolve each such xmax: TransactionIdIsInProgress (ProcArray scan) then
# TransactionIdDidCommit (CLOG lookup) — only then can it call SetHintBits to
# stamp HEAP_XMAX_INVALID and MarkBufferDirtyHint. Without this warmup, rep 1
# pays ~1100ms extra CPU for those CLOG/ProcArray lookups. Subsequent reps hit
# the early-exit at "if (t_infomask & HEAP_XMAX_INVALID) return HEAPTUPLE_LIVE"
# and skip the expensive path entirely.
# After this pass, the dirtied hint-bit pages are flushed to disk via
# drop_caches, so all reps start from the same on-disk state.
psql_run "$ROOT" "$PORT" -c "SELECT * FROM pgstattuple_approx('heap_test');" >/dev/null
for ((i=1; i<=REPS; i++)); do
drop_caches "$ROOT" "$PORT" heap_test heap_test_pkey
local result ms reads read_time writes write_time
result=$(run_timed_with_io "$ROOT" "$PORT" "SELECT * FROM pgstattuple_approx('heap_test');")
IFS=',' read -r ms reads read_time writes write_time <<<"$result"
echo "$i,$ms,$reads,$read_time,$writes,$write_time" >> "$OUT"
log "${LABEL}_pgstattuple_${SIZE} [$i/$REPS]: ${ms}ms (reads=$reads, read_time=${read_time}ms, writes=$writes, write_time=${write_time}ms)"
done
if [[ $DO_PROFILE -eq 1 ]]; then
psql_run "$ROOT" "$PORT" -c "BEGIN; DELETE FROM heap_test WHERE id % 500 = 0; ROLLBACK;"
drop_caches "$ROOT" "$PORT" heap_test heap_test_pkey
profile_sql "$ROOT" "$PORT" "${LABEL}_pgstattuple_${SIZE}" \
"SELECT * FROM pgstattuple_approx('heap_test');"
fi
}
test_pgstatindex() {
# Benchmark pgstatindex() over a cold btree index; optionally profile.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
setup_pgstatindex "$ROOT" "$PORT" "$SIZE"
benchmark "$ROOT" "$PORT" "${LABEL}_pgstatindex_${SIZE}" \
"SELECT * FROM pgstatindex('idx_test_pkey');" \
idx_test idx_test_pkey
if [[ $DO_PROFILE -eq 1 ]]; then
drop_caches "$ROOT" "$PORT" idx_test idx_test_pkey
profile_sql "$ROOT" "$PORT" "${LABEL}_pgstatindex_${SIZE}" \
"SELECT * FROM pgstatindex('idx_test_pkey');"
fi
}
test_gin_vacuum() {
# Benchmark VACUUM ANALYZE of the GIN index, rebuilding the table before
# every rep so each run scans an identical state.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
local OUT="$ROOT/results/${LABEL}_gin_vacuum_${SIZE}.csv"
mkdir -p "$ROOT/results"
echo "run,ms,reads,read_time_ms,writes,write_time_ms" > "$OUT"
for ((i=1; i<=REPS; i++)); do
# Fresh table each run for consistent state
setup_gin "$ROOT" "$PORT" "$SIZE"
drop_caches "$ROOT" "$PORT" gin_test gin_idx
local result ms reads read_time writes write_time
# VACUUM ANALYZE forces ginvacuumcleanup() to run and scan all pages
result=$(run_timed_with_io "$ROOT" "$PORT" "VACUUM ANALYZE gin_test;")
IFS=',' read -r ms reads read_time writes write_time <<<"$result"
echo "$i,$ms,$reads,$read_time,$writes,$write_time" >> "$OUT"
log "${LABEL}_gin_vacuum_${SIZE} [$i/$REPS]: ${ms}ms (reads=$reads, read_time=${read_time}ms, writes=$writes, write_time=${write_time}ms)"
done
if [[ $DO_PROFILE -eq 1 ]]; then
setup_gin "$ROOT" "$PORT" "$SIZE"
drop_caches "$ROOT" "$PORT" gin_test gin_idx
profile_sql "$ROOT" "$PORT" "${LABEL}_gin_vacuum_${SIZE}" "VACUUM ANALYZE gin_test;"
fi
}
test_hash_vacuum() {
# Benchmark VACUUM of the hash index with 10% dead tuples, rebuilding the
# table before every rep so each run vacuums an identical state.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
local OUT="$ROOT/results/${LABEL}_hash_vacuum_${SIZE}.csv"
mkdir -p "$ROOT/results"
echo "run,ms,reads,read_time_ms,writes,write_time_ms" > "$OUT"
for ((i=1; i<=REPS; i++)); do
# Fresh table each run for consistent state
setup_hash "$ROOT" "$PORT" "$SIZE"
# Create 10% dead tuples
psql_run "$ROOT" "$PORT" -c "DELETE FROM hash_test WHERE id % 10 = 0;"
drop_caches "$ROOT" "$PORT" hash_test hash_idx
local result ms reads read_time writes write_time
result=$(run_timed_with_io "$ROOT" "$PORT" "VACUUM hash_test;")
IFS=',' read -r ms reads read_time writes write_time <<<"$result"
echo "$i,$ms,$reads,$read_time,$writes,$write_time" >> "$OUT"
log "${LABEL}_hash_vacuum_${SIZE} [$i/$REPS]: ${ms}ms (reads=$reads, read_time=${read_time}ms, writes=$writes, write_time=${write_time}ms)"
done
if [[ $DO_PROFILE -eq 1 ]]; then
setup_hash "$ROOT" "$PORT" "$SIZE"
psql_run "$ROOT" "$PORT" -c "DELETE FROM hash_test WHERE id % 10 = 0;"
drop_caches "$ROOT" "$PORT" hash_test hash_idx
profile_sql "$ROOT" "$PORT" "${LABEL}_hash_vacuum_${SIZE}" "VACUUM hash_test;"
fi
}
test_wal_logging() {
# Benchmark GIN index build (exercises log_newpage_range WAL logging).
# The source table is built once; only the index is rebuilt each rep.
local ROOT="$1" PORT="$2" LABEL="$3" SIZE="$4"
local OUT="$ROOT/results/${LABEL}_wal_logging_${SIZE}.csv"
mkdir -p "$ROOT/results"
echo "run,ms,reads,read_time_ms,writes,write_time_ms" > "$OUT"
# Build table once - only rebuild index each rep
setup_wal "$ROOT" "$PORT" "$SIZE"
local WAL_SQL="CREATE INDEX wal_test_gin_idx ON wal_test USING gin (doc_tsv);"
for ((i=1; i<=REPS; i++)); do
# Drop index from previous iteration
psql_run "$ROOT" "$PORT" -c "DROP INDEX IF EXISTS wal_test_gin_idx;"
# Drop OS caches - source table pages are COLD on disk
drop_caches "$ROOT" "$PORT" wal_test
# CREATE INDEX on GIN (tsvector_ops):
# - GIN always uses the same build path: ginbuild() populates the
#   index in memory, flushes to disk, then calls log_newpage_range()
#   to read ALL index pages and write them to WAL (gininsert.c:785-790)
local result ms reads read_time writes write_time
result=$(run_timed_with_io "$ROOT" "$PORT" "$WAL_SQL")
IFS=',' read -r ms reads read_time writes write_time <<<"$result"
echo "$i,$ms,$reads,$read_time,$writes,$write_time" >> "$OUT"
log "${LABEL}_wal_logging_${SIZE} [$i/$REPS]: ${ms}ms (reads=$reads, read_time=${read_time}ms, writes=$writes, write_time=${write_time}ms)"
done
if [[ $DO_PROFILE -eq 1 ]]; then
psql_run "$ROOT" "$PORT" -c "DROP INDEX IF EXISTS wal_test_gin_idx;"
drop_caches "$ROOT" "$PORT" wal_test
profile_sql "$ROOT" "$PORT" "${LABEL}_wal_logging_${SIZE}" "$WAL_SQL"
fi
}
# --- Run tests for a build ---
warmup_catalog() {
local ROOT="$1" PORT="$2"
# Explicitly prewarm catalog tables and their indexes into shared_buffers
# so rep 1 doesn't pay disk-read cost for catalog pages.
# pg_buffercache_evict_relation only evicts the test relation, not catalogs,
# so these stay warm across all reps.
psql_run "$ROOT" "$PORT" <<SQL >/dev/null
SELECT pg_prewarm('pg_class', 'buffer');
SELECT pg_prewarm('pg_attribute', 'buffer');
SELECT pg_prewarm('pg_namespace', 'buffer');
SELECT pg_prewarm('pg_proc', 'buffer');
SELECT pg_prewarm('pg_type', 'buffer');
SQL
}
run_tests() {
# Start a fresh cluster for build root $1 (labelled $2 in result files),
# run the selected test(s) for every size in $SIZES, then stop the cluster.
local ROOT="$1" LABEL="$2"
local PORT
PORT=$(pick_port)
log "[$LABEL] Starting cluster on port $PORT"
init_cluster "$ROOT" "$PORT"
warmup_catalog "$ROOT" "$PORT"
set_io_delay "$IO_DELAY_MS"
# Ensure the cluster is stopped even if a test dies mid-run.
trap "stop_cluster '$ROOT'" EXIT
for SIZE in $SIZES; do
case "$TEST" in
bloom_scan) test_bloom_scan "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
bloom_vacuum) test_bloom_vacuum "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
pgstattuple) test_pgstattuple "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
pgstatindex) test_pgstatindex "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
gin_vacuum) test_gin_vacuum "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
hash_vacuum) test_hash_vacuum "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
wal_logging) test_wal_logging "$ROOT" "$PORT" "$LABEL" "$SIZE" ;;
all)
# NOTE(review): "all" omits bloom_scan and gin_vacuum even though both
# are listed in --help — confirm this is intentional (runtime?).
test_bloom_vacuum "$ROOT" "$PORT" "$LABEL" "$SIZE"
test_pgstattuple "$ROOT" "$PORT" "$LABEL" "$SIZE"
test_pgstatindex "$ROOT" "$PORT" "$LABEL" "$SIZE"
test_hash_vacuum "$ROOT" "$PORT" "$LABEL" "$SIZE"
test_wal_logging "$ROOT" "$PORT" "$LABEL" "$SIZE"
;;
*) die "Unknown test: $TEST" ;;
esac
done
stop_cluster "$ROOT"
trap - EXIT
}
# --- Compare results ---
compare_results() {
  # Print one summary row comparing baseline vs patched medians for a
  # single test/size combination.
  # $1 = baseline CSV, $2 = patched CSV, $3 = row label.
  local base_csv="$1" patch_csv="$2" label="$3"
  # Nothing to report unless both result files exist.
  [[ -f "$base_csv" && -f "$patch_csv" ]] || return 0
  local base_med patch_med
  base_med=$(calc_median "$base_csv")
  patch_med=$(calc_median "$patch_csv")
  # Guard against empty or zero medians to avoid division by zero below.
  if [[ -z "$base_med" || "$base_med" == "0" ]]; then base_med="0.001"; fi
  if [[ -z "$patch_med" || "$patch_med" == "0" ]]; then patch_med="0.001"; fi
  local speedup pct
  speedup=$(awk -v b="$base_med" -v p="$patch_med" 'BEGIN { printf "%.2f", b / p }')
  pct=$(awk -v b="$base_med" -v p="$patch_med" 'BEGIN { printf "%.1f", (b - p) / b * 100 }')
  local io_info=""
  if head -1 "$base_csv" | grep -q "reads"; then
    # Standard test CSV layout: run,ms,reads,read_time_ms,writes,write_time_ms.
    # Collect the median of each I/O column for both builds, defaulting to 0
    # when a column yields nothing.
    local col v
    local -a b_io p_io
    for col in 3 4 5 6; do
      v=$(calc_median_col "$base_csv" "$col")
      b_io[col]=${v:-0}
      v=$(calc_median_col "$patch_csv" "$col")
      p_io[col]=${v:-0}
    done
    io_info=" (reads=${b_io[3]}→${p_io[3]}, read_time=${b_io[4]}→${p_io[4]}ms, writes=${b_io[5]}→${p_io[5]}, write_time=${b_io[6]}→${p_io[6]}ms)"
  fi
  printf "%-26s base=%8.1fms patch=%8.1fms %5.2fx (%5.1f%%)%s\n" \
    "$label" "$base_med" "$patch_med" "$speedup" "$pct" "$io_info"
}
print_summary() {
  # Render the final benchmark summary and point at the CSV / flamegraph
  # output locations.  Reads globals: BASELINE, TEST, SIZES, ROOT_BASE,
  # ROOT_PATCH, DO_PROFILE.
  local bar="═══════════════════════════════════════════════════════════════════════"
  echo ""
  echo "$bar"
  echo "                    STREAMING READ BENCHMARK RESULTS                   "
  echo "$bar"
  echo ""
  if [[ $BASELINE -ne 1 ]]; then
    # No baseline build to compare against: dump per-file stats only.
    echo "Results (patched only):"
    echo ""
    local f
    for f in "$ROOT_PATCH/results/"*.csv; do
      [[ -f "$f" ]] || continue
      printf "%-40s %s\n" "$(basename "$f" .csv):" "$(calc_stats "$f")"
    done
  else
    printf "%-26s %-17s %-17s %-7s %-7s %s\n" "TEST" "BASELINE" "PATCHED" "SPEEDUP" "CHANGE" "I/O TIME"
    echo "─────────────────────────────────────────────────────────────────────────────────────────────────"
    local sz t
    for sz in $SIZES; do
      for t in bloom_scan bloom_vacuum pgstattuple pgstatindex gin_vacuum hash_vacuum wal_logging; do
        # Only report the tests the user actually selected.
        if [[ "$TEST" == "all" || "$TEST" == "$t" ]]; then
          compare_results \
            "$ROOT_BASE/results/base_${t}_${sz}.csv" \
            "$ROOT_PATCH/results/patched_${t}_${sz}.csv" \
            "${t}_${sz}"
        fi
      done
    done
  fi
  echo ""
  echo "$bar"
  echo "CSV files: $ROOT_PATCH/results/"
  [[ $BASELINE -eq 1 ]] && echo "Baseline: $ROOT_BASE/results/"
  # List any flamegraphs generated during profiling runs.
  if [[ $DO_PROFILE -eq 1 ]]; then
    local svgs=() dir svg
    for dir in "$ROOT_BASE/profile" "$ROOT_PATCH/profile"; do
      [[ -d "$dir" ]] || continue
      for svg in "$dir"/*.svg; do
        [[ -f "$svg" ]] && svgs+=("$svg")
      done
    done
    if (( ${#svgs[@]} > 0 )); then
      echo ""
      echo "Flamegraphs:"
      for svg in "${svgs[@]}"; do echo " $svg"; done
    fi
  fi
  echo "$bar"
}
# --- Main ---
main() {
  # Entry point: report the configuration, build the tree(s), run the
  # benchmark(s), then print the comparison summary.
  log "Streaming Read Benchmark"
  log "Patch: $PATCH ($PATCH_TAG)"
  log "Tests: $TEST"
  log "Sizes: $SIZES"
  log "Reps: $REPS"
  log "I/O: $IO_METHOD (workers=$IO_WORKERS, concurrency=$IO_MAX_CONCURRENCY)"
  if [[ $DIRECT_IO -eq 1 ]]; then
    log "Direct IO: enabled (debug_io_direct=data)"
  fi
  if [[ -n "$IO_DELAY_MS" ]]; then
    log "I/O delay: ${IO_DELAY_MS}ms read / ${WRITE_DELAY_MS}ms write via dm_delay ($DM_DELAY_DEV)"
  fi
  if [[ $DO_PROFILE -eq 1 ]]; then
    log "Profile: enabled (flamegraphs → <root>/profile/)"
  fi
  # Build both trees; the baseline build is only needed when comparing.
  [[ $BASELINE -eq 1 ]] && build_pg "$ROOT_BASE" ""
  build_pg "$ROOT_PATCH" "$PATCH"
  # Run the baseline first so the summary can compare against it.
  if [[ $BASELINE -eq 1 ]]; then
    log "Running baseline tests"
    run_tests "$ROOT_BASE" "base"
  fi
  log "Running patched tests"
  run_tests "$ROOT_PATCH" "patched"
  print_summary
}
main