This is an automated email from the ASF dual-hosted git repository.

xuanwo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/opendal.git


The following commit(s) were added to refs/heads/main by this push:
     new 8817d47d6 RFC-7130: Route Layer (#7130)
8817d47d6 is described below

commit 8817d47d6b67457125f5ed91970e71feb8c12b74
Author: Xuanwo <[email protected]>
AuthorDate: Tue Jan 6 18:48:20 2026 +0800

    RFC-7130: Route Layer (#7130)
    
    * Add RFC for route layer
    
    * Assign number
    
    * Remove
---
 core/core/src/docs/rfcs/7130_route_layer.md | 214 ++++++++++++++++++++++++++++
 core/core/src/docs/rfcs/mod.rs              |   4 +
 2 files changed, 218 insertions(+)

diff --git a/core/core/src/docs/rfcs/7130_route_layer.md b/core/core/src/docs/rfcs/7130_route_layer.md
new file mode 100644
index 000000000..2975cff9e
--- /dev/null
+++ b/core/core/src/docs/rfcs/7130_route_layer.md
@@ -0,0 +1,214 @@
+- Proposal Name: `route_layer`
+- Start Date: 2026-01-04
+- RFC PR: [apache/opendal#7130](https://github.com/apache/opendal/pull/7130)
+- Tracking Issue: 
[apache/opendal#7131](https://github.com/apache/opendal/issues/7131)
+
+# Summary
+
+Introduce `RouteLayer`, a layer that dispatches each OpenDAL operation to one 
of multiple pre-built `Operator` stacks by matching the operation path against 
user-provided glob patterns.
+
+# Motivation
+
+In practice, OpenDAL users often want different policies for different parts 
of a namespace:
+
+- Apply caching only for frequently-read objects (e.g., `**/*.parquet`).
+- Use a tighter timeout for latency-sensitive paths (e.g., `hot/**`).
+- Attach different observability or throttling configurations per dataset.
+
+Today, users can achieve this by building multiple `Operator`s and routing 
requests in application code. However, this approach duplicates routing logic 
across projects and makes it harder to share a consistent, well-tested routing 
behavior across bindings and integrations.
+
+`RouteLayer` centralizes this routing as a reusable primitive while preserving 
OpenDAL's existing composition model: users still build independent `Operator` 
stacks for each policy, and `RouteLayer` only decides which stack should handle 
a request.
+
+# Guide-level explanation
+
+## Enable feature
+
+```toml
+opendal = { version = "*", features = ["layers-route"] }
+```
+
+## Build per-route operators
+
+Build each routed `Operator` as usual, including its own service configuration 
and layer stack.
+
+```rust
+use std::time::Duration;
+
+use opendal::layers::RouteLayer;
+use opendal::services;
+use opendal::Operator;
+use opendal::Result;
+use opendal_layer_timeout::TimeoutLayer;
+
+fn build_default() -> Result<Operator> {
+    // `finish()` returns an `Operator`, so wrap it to match the return type.
+    Ok(Operator::new(services::Memory::default())?.finish())
+}
+
+fn build_parquet_fast_path() -> Result<Operator> {
+    let op = Operator::new(services::Memory::default())?
+        .layer(TimeoutLayer::default().with_timeout(Duration::from_secs(3)))
+        .finish();
+    Ok(op)
+}
+```
+
+## Combine them with `RouteLayer`
+
+`RouteLayer` is applied to the default operator. When none of the patterns 
match, the request is handled by the default operator.
+
+Patterns are evaluated in insertion order. The first matching pattern wins.
+
+```rust
+use opendal::layers::RouteLayer;
+use opendal::Operator;
+use opendal::Result;
+
+fn build_routed_operator() -> Result<Operator> {
+    let default_op = build_default()?;
+    let parquet_op = build_parquet_fast_path()?;
+
+    let routed = default_op.layer(
+        RouteLayer::builder()
+            .route("**/*.parquet", parquet_op)
+            .build()?,
+    );
+
+    Ok(routed)
+}
+```
+
+## Pattern rules
+
+`RouteLayer` uses glob patterns (via `globset`) and does not perform any 
implicit expansion.
+
+- `*.parquet` matches `file.parquet` in the root, but does not match 
`dir/file.parquet`.
+- `**/*.parquet` matches `dir/file.parquet` and any other depth.
+
+Paths are matched against OpenDAL normalized paths as seen by the accessor:
+
+- The root is represented as `/`.
+- Non-root paths do not start with `/` (e.g., `dir/file`).
+- Directory paths end with `/` (e.g., `dir/`).
+
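The pattern rules above can be illustrated with a small, standard-library-only matcher. This is a simplified sketch, not `globset` itself: `glob_match`, `path_matches`, and `segment_matches` are hypothetical names, and only `*` (within one path segment) and a whole-segment `**` are supported. Note that `globset` lets `*` match `/` by default unless `literal_separator(true)` is set on `GlobBuilder`, so the rules above presumably assume that option is enabled.

```rust
/// Match one path segment against a pattern segment where `*` matches any
/// run of bytes (possibly empty). Segments never contain `/`.
fn segment_matches(pat: &[u8], text: &[u8]) -> bool {
    match pat.split_first() {
        None => text.is_empty(),
        // `*` may consume 0..=text.len() bytes of the current segment.
        Some((&b'*', rest)) => (0..=text.len()).any(|i| segment_matches(rest, &text[i..])),
        Some((c, rest)) => match text.split_first() {
            Some((t, text_rest)) if t == c => segment_matches(rest, text_rest),
            _ => false,
        },
    }
}

/// Match a list of path segments against pattern segments; a `**` segment
/// matches zero or more whole path segments.
fn path_matches(pat: &[&str], path: &[&str]) -> bool {
    match pat.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|i| path_matches(rest, &path[i..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((p, path_rest)) => {
                segment_matches(seg.as_bytes(), p.as_bytes()) && path_matches(rest, path_rest)
            }
            None => false,
        },
    }
}

/// Hypothetical stand-in for a compiled glob: no implicit expansion, so
/// `*.parquet` only matches entries directly under the root.
fn glob_match(pattern: &str, path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('/').collect();
    let segs: Vec<&str> = path.split('/').collect();
    path_matches(&pat, &segs)
}
```

With this sketch, `glob_match("*.parquet", "dir/file.parquet")` is false while `glob_match("**/*.parquet", "dir/file.parquet")` is true, mirroring the two bullet points above.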
+## Routing for `copy` and `rename`
+
+For `copy(from, to, ..)` and `rename(from, to, ..)`, routing is decided only 
by `from`. The `to` path is forwarded unchanged to the selected operator.
+
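A minimal sketch of the `from`-only rule, using only the standard library. The names `route_for` and `route_for_copy` are hypothetical, targets are represented by plain strings rather than accessors, and a suffix check stands in for real glob matching.

```rust
/// First-match-wins lookup. Each rule is `(suffix, target name)`; the suffix
/// check is a simplified stand-in for glob matching.
fn route_for<'a>(path: &str, rules: &[(&str, &'a str)], default: &'a str) -> &'a str {
    rules
        .iter()
        .find(|(suffix, _)| path.ends_with(*suffix))
        .map(|(_, target)| *target)
        .unwrap_or(default)
}

/// Two-path operations consult only `from`; `to` is forwarded untouched and
/// never influences which target is selected.
fn route_for_copy<'a>(
    from: &str,
    _to: &str,
    rules: &[(&str, &'a str)],
    default: &'a str,
) -> &'a str {
    route_for(from, rules, default)
}
```

Under these assumptions, copying `a/x.csv` to `b/x.parquet` is handled by the default target even though the destination looks like it belongs to the parquet route.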
+# Reference-level explanation
+
+## Public API
+
+`RouteLayer` is constructed via a fallible builder, because glob compilation 
can fail and `Layer::layer()` cannot return `Result`.
+
+```rust
+pub struct RouteLayer { /* compiled router */ }
+
+impl RouteLayer {
+    pub fn builder() -> RouteLayerBuilder;
+}
+
+pub struct RouteLayerBuilder { /* patterns + operators */ }
+
+impl RouteLayerBuilder {
+    pub fn route(self, pattern: impl AsRef<str>, op: Operator) -> Self;
+    pub fn build(self) -> Result<RouteLayer>;
+}
+```
+
+`RouteLayer` is intended to be applied to a fully-built `Operator` (dynamic 
dispatch). Routed operators are also fully-built and independent.
+
+## Internal structure
+
+`RouteLayer` holds:
+
+- `glob: globset::GlobSet`, compiled from patterns in insertion order.
+- `targets: Vec<Accessor>`, where `targets[i]` is the accessor corresponding 
to the `i`-th inserted pattern.
+
+At `build()` time:
+
+1. Compile each pattern into a `globset::Glob`.
+2. Insert each glob into a `globset::GlobSetBuilder` in insertion order.
+3. Convert each routed `Operator` into its `Accessor` and store it in 
`targets` in the same order.
+
+Any glob compilation error results in `build()` returning an 
`ErrorKind::InvalidInput` error with the pattern and the underlying source 
error attached.
+
+## Dispatch algorithm
+
+For an operation that includes a path `p`:
+
+1. Compute `matches = glob.matches(p)`, yielding a set of matching pattern 
indices.
+2. Select `idx = min(matches)` as the chosen route (first match in insertion 
order).
+3. If no match exists, dispatch to the inner accessor (the default operator 
that `RouteLayer` is applied to).
+
+This approach does not rely on any ordering guarantees from `globset` for 
returned matches.
+
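The selection step can be sketched without any dependencies. Here `matches` stands for the pattern indices reported by the matcher in arbitrary order; `select_route` and `dispatch` are hypothetical names, and targets are represented by strings instead of accessors.

```rust
/// Pick the route for one operation: the smallest matching pattern index
/// wins, which corresponds to the first matching pattern in insertion order.
/// `None` means "fall back to the default (inner) accessor".
fn select_route(matches: &[usize]) -> Option<usize> {
    matches.iter().copied().min()
}

/// Resolve the target for a set of matches, falling back to `default` when
/// no pattern matched.
fn dispatch<'a>(matches: &[usize], targets: &[&'a str], default: &'a str) -> &'a str {
    match select_route(matches) {
        Some(idx) => targets[idx],
        None => default,
    }
}
```

Taking the minimum explicitly is what makes the result independent of the order in which the matcher reports indices.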
+### Operations with paths
+
+The following operations are dispatched based on their `path` argument:
+
+- `create_dir(path, ..)`
+- `read(path, ..)`
+- `write(path, ..)`
+- `stat(path, ..)`
+- `list(path, ..)`
+- `presign(path, ..)`
+
+### Operations with two paths
+
+The following operations are dispatched based on `from` only:
+
+- `copy(from, to, ..)`
+- `rename(from, to, ..)`
+
+### Delete semantics
+
+`Access::delete()` returns a deleter that can queue multiple deletions and 
execute them in `close()`. To preserve this batching behavior while supporting 
per-path routing, `RouteLayer` returns a `RouteDeleter`:
+
+- `RouteDeleter::delete(path, ..)` routes by `path` and lazily creates (and 
caches) the underlying deleter for the selected target accessor.
+- `RouteDeleter::close()` closes all created deleters and returns an error if 
any close fails.
+
+This design batches deletions per routed operator and avoids creating deleters 
for unused routes.
+
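The lazy, per-route batching described above might look roughly like the following sketch. Everything here is an assumption made for illustration: `QueueDeleter` is a hypothetical stand-in for a real per-target deleter, the route index is passed in directly instead of being computed from the path, and `close()` keeps closing after a failure and reports the first error, a detail the RFC does not specify.

```rust
/// Hypothetical stand-in for a per-target deleter: it only queues paths and
/// reports how many were flushed on close.
#[derive(Default)]
struct QueueDeleter {
    queued: Vec<String>,
}

impl QueueDeleter {
    fn close(self) -> Result<usize, String> {
        Ok(self.queued.len())
    }
}

/// Slot 0 is the default accessor; slot `i + 1` belongs to route `i`.
struct RouteDeleterSketch {
    deleters: Vec<Option<QueueDeleter>>,
}

impl RouteDeleterSketch {
    fn new(route_count: usize) -> Self {
        Self {
            deleters: (0..=route_count).map(|_| None).collect(),
        }
    }

    /// Queue `path` on the deleter of the selected route, creating that
    /// deleter on first use and reusing it afterwards.
    fn delete(&mut self, route: Option<usize>, path: &str) {
        let slot = route.map_or(0, |idx| idx + 1);
        self.deleters[slot]
            .get_or_insert_with(QueueDeleter::default)
            .queued
            .push(path.to_string());
    }

    /// Number of underlying deleters that were actually created.
    fn created(&self) -> usize {
        self.deleters.iter().filter(|d| d.is_some()).count()
    }

    /// Close every created deleter. Assumption: continue after a failure
    /// and return the first error once all deleters have been closed.
    fn close(self) -> Result<usize, String> {
        let mut flushed = 0;
        let mut first_err = None;
        for deleter in self.deleters.into_iter().flatten() {
            match deleter.close() {
                Ok(n) => flushed += n,
                Err(e) => {
                    first_err.get_or_insert(e);
                }
            }
        }
        match first_err {
            Some(e) => Err(e),
            None => Ok(flushed),
        }
    }
}
```

Routes that never receive a deletion keep their slot at `None`, so no deleter is created for them.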
+## `info()` behavior
+
+`RouteAccessor::info()` forwards to the default (inner) accessor's 
`AccessorInfo`. This keeps `Operator::info()` stable and avoids synthesizing a 
potentially misleading merged capability across routed operators.
+
+Users that need per-route `AccessorInfo` can query the routed `Operator`s 
directly before passing them into the builder.
+
+# Drawbacks
+
+- Order-dependent routing can be error-prone when multiple patterns overlap; 
users must manage ordering carefully.
+- `Operator::info()` reflects the default operator, not a merged view across 
routes.
+- `copy`/`rename` routing based only on `from` can lead to surprising behavior 
if the destination logically belongs to a different route; this is accepted for 
simplicity.
+- Adds a new dependency (`globset`) and introduces a small per-operation 
routing cost.
+
+# Rationale and alternatives
+
+## Why glob patterns via `globset`?
+
+- Glob patterns are a well-known way to express path-based rules and cover 
common needs like `**/*.parquet`.
+- `globset` compiles patterns into an efficient matcher and provides a 
`matches` API suitable for first-match selection without scanning all patterns 
with separate matchers.
+
+## Why first-match-wins without priority?
+
+First-match-wins keeps the model explicit and deterministic without additional 
concepts. Rule ordering is visible in code review and can be reasoned about 
locally.
+
+## Alternatives considered
+
+- **User-side routing**: build multiple operators and dispatch in application 
code. Works today but duplicates logic and increases maintenance burden across 
projects.
+- **Prefix-only routing**: efficient but does not satisfy suffix-based rules 
like `**/*.parquet`.
+- **Implicit expansion (e.g., treating `*.parquet` as `**/*.parquet`)**: 
simplifies some patterns but introduces surprising behavior; this proposal 
requires explicit `**`.
+- **Routing on operation type or options**: valuable, but out of scope for 
this RFC and reserved for future work.
+
+# Prior art
+
+- `gitignore`-style globs for hierarchical path matching.
+- HTTP routers and proxy configuration models (e.g., ordered rules with 
first-match semantics).
+- Data lake engines commonly route by file suffix (e.g., `.parquet`) to 
specialized code paths.
+
+# Unresolved questions
+
+None
+
+# Future possibilities
+
+- Extend routing conditions beyond path, such as operation type 
(read/write/list) or per-operation options.
+- Support loading rules from configuration files or environment variables for 
CLI tools and integrations.
diff --git a/core/core/src/docs/rfcs/mod.rs b/core/core/src/docs/rfcs/mod.rs
index 5c5d6b82d..07095e49b 100644
--- a/core/core/src/docs/rfcs/mod.rs
+++ b/core/core/src/docs/rfcs/mod.rs
@@ -292,3 +292,7 @@ pub mod rfc_6817_checksum {}
 /// Core
 #[doc = include_str!("6828_core.md")]
 pub mod rfc_6828_core {}
+
+/// Route Layer
+#[doc = include_str!("7130_route_layer.md")]
+pub mod rfc_7130_route_layer {}
