(opendal) 01/01: rfc Simulate Layer

xuanwo Sat, 18 Oct 2025 16:32:55 -0700

This is an automated email from the ASF dual-hosted git repository.

xuanwo pushed a commit to branch 6676-start-after-on-fs
in repository https://gitbox.apache.org/repos/asf/opendal.git


commit 8a733dd555e699acd8a26639e70ac9428db63e7d
Author: Xuanwo <[email protected]>
AuthorDate: Thu Oct 16 17:03:58 2025 +0800

    rfc Simulate Layer
    
    Signed-off-by: Xuanwo <[email protected]>
---
 core/src/docs/rfcs/0000_simulate_layer.md | 491 ++++++++++++++++++++++++++++++
 1 file changed, 491 insertions(+)

diff --git a/core/src/docs/rfcs/0000_simulate_layer.md 
b/core/src/docs/rfcs/0000_simulate_layer.md
new file mode 100644
index 000000000..8526936b2
--- /dev/null
+++ b/core/src/docs/rfcs/0000_simulate_layer.md
@@ -0,0 +1,491 @@
+- Proposal Name: `simulate_layer`
+- Start Date: 2025-10-16
+- RFC PR: [apache/opendal#0000](https://github.com/apache/opendal/pull/0000)
+- Tracking Issue: 
[apache/opendal#6676](https://github.com/apache/opendal/issues/6676)
+
+# Summary
+
+Introduce a public `SimulateLayer` to replace the internal `CompleteLayer`, 
giving users explicit control over capability simulation while maintaining 
backward compatibility through a phased migration strategy.
+
+# Motivation
+
+Currently, OpenDAL automatically applies `CompleteLayer` internally to 
simulate missing capabilities for all backends. While this provides a seamless 
"batteries-included" experience, it has several limitations:
+
+1. **Lack of user control**: Users cannot opt-out of simulations they don't 
need, even in performance-sensitive scenarios where native-only operations are 
preferred.
+
+2. **Hidden complexity**: The `CompleteLayer` has grown to 385 lines handling 
multiple concerns (list recursive, stat dir, create_dir, reader/writer 
wrappers), making it difficult to maintain and understand.
+
+3. **All-or-nothing approach**: Users either get all simulations or none. 
There's no way to selectively enable only the simulations they need.
+
+4. **Binding limitations**: While Python/Java/Ruby bindings automatically 
benefit from simulations, they also cannot disable them, potentially causing 
unexpected behavior when users expect native backend semantics.
+
+5. **Missing features**: Some useful simulations (like `start_after` for fs 
backend, reported in #6676) are not implemented because adding more logic to 
`CompleteLayer` would make it even more complex.
+
+This proposal aims to:
+- Make capability simulation **explicit and user-controllable**
+- **Reduce complexity** by separating concerns
+- Provide a **clear migration path** that maintains backward compatibility
+- Enable **fine-grained control** over which simulations to apply
+
+# Guide-level explanation
+
+## For Rust users
+
+After this change, capability simulation becomes explicit through 
`SimulateLayer`:
+
+```rust
+use opendal::layers::SimulateLayer;
+use opendal::services::Fs;
+use opendal::Operator;
+
+// Default: all simulations enabled
+let op = Operator::new(Fs::default())?
+    .layer(SimulateLayer::default())
+    .finish();
+
+// Selective simulation with method chaining
+let op = Operator::new(Fs::default())?
+    .layer(
+        SimulateLayer::default()
+            .with_list_recursive(true)      // Enable recursive listing 
simulation
+            .with_list_start_after(true)    // Enable start_after simulation
+            .with_stat_dir(false)           // Disable stat dir simulation
+            .with_create_dir(false)         // Disable create_dir simulation
+    )
+    .finish();
+
+// Performance-critical: no simulation overhead
+let op = Operator::new(S3::default())?
+    // Don't add SimulateLayer - use native capabilities only
+    .finish();
+```
+
+## For binding users (Python/Java/Ruby)
+
+Binding users will see **no breaking changes** during the migration. 
Simulations will continue to work automatically:
+
+```python
+# Python - works the same way
+operator = opendal.Operator("fs", root="/tmp")
+files = list(operator.list("", start_after="file.txt"))  # Just works
+```
+
+## Migration path
+
+### Phase 1: Next version (Soft transition - 0-3 months)
+- `SimulateLayer` is introduced as a **public layer**
+- `CompleteLayer` remains internal, controlled by `auto-simulate` feature 
(enabled by default)
+- Documentation encourages explicit use of `SimulateLayer`
+- No breaking changes
+
+```toml
+# Still works with auto-simulate enabled by default
+[dependencies]
+opendal = "0.x"
+```
+
+### Phase 2: Next 2-3 versions (Transition period - 3-6 months)
+- Deprecation warnings added for implicit simulation
+- Migration guide published
+- Examples updated to use explicit `SimulateLayer`
+- Users can test with `default-features = false` to disable auto-simulate
+
+```toml
+# Opt-out of auto-simulate to prepare for migration
+[dependencies]
+opendal = { version = "0.x", default-features = false }
+```
+
+### Phase 3: Next minor version after transition (Breaking change - 6 months)
+- `auto-simulate` feature **disabled by default**
+- Users must explicitly add `SimulateLayer` or opt-in with `features = 
["auto-simulate"]`
+
+```rust
+// Required after this version
+let op = Operator::new(Fs::default())?
+    .layer(SimulateLayer::default())
+    .finish();
+```
+
+### Phase 4: Next major version (Clean slate - 12 months)
+- `CompleteLayer` and `auto-simulate` feature completely removed
+- Bindings automatically apply `SimulateLayer` internally
+- Core library requires explicit simulation
+
+# Reference-level explanation
+
+## Architecture changes
+
+### Current architecture
+```
+Backend → CompleteLayer (auto-applied) → User
+          ↓
+          • list recursive simulation
+          • stat dir simulation  
+          • create_dir simulation
+          • reader/writer wrappers
+          • (all forced, no user control)
+```
+
+### New architecture
+```
+Backend → [SimulateLayer (user-controlled)] → User
+          ↓
+          • list recursive simulation (configurable)
+          • list start_after simulation (configurable)
+          • stat dir simulation (configurable)
+          • create_dir simulation (configurable)
+          • reader/writer wrappers (kept in CompleteLayer or removed)
+```
+
+## SimulateLayer implementation
+
+```rust
+// core/src/layers/simulate.rs
+
+#[derive(Debug, Clone)]
+pub struct SimulateLayer {
+    list_recursive: bool,
+    list_start_after: bool,
+    list_limit: bool,        // Future
+    stat_dir: bool,
+    create_dir: bool,
+}
+
+impl Default for SimulateLayer {
+    fn default() -> Self {
+        Self {
+            list_recursive: true,
+            list_start_after: true,
+            list_limit: false,
+            stat_dir: true,
+            create_dir: true,
+        }
+    }
+}
+
+impl SimulateLayer {
+    /// Enable/disable recursive listing simulation (default: true)
+    pub fn with_list_recursive(mut self, enabled: bool) -> Self {
+        self.list_recursive = enabled;
+        self
+    }
+    
+    /// Enable/disable start_after simulation (default: true)
+    pub fn with_list_start_after(mut self, enabled: bool) -> Self {
+        self.list_start_after = enabled;
+        self
+    }
+    
+    /// Enable/disable stat directory simulation (default: true)
+    pub fn with_stat_dir(mut self, enabled: bool) -> Self {
+        self.stat_dir = enabled;
+        self
+    }
+    
+    /// Enable/disable create_dir simulation (default: true)
+    pub fn with_create_dir(mut self, enabled: bool) -> Self {
+        self.create_dir = enabled;
+        self
+    }
+}
+```
+
+## Feature flag for migration
+
+```rust
+// core/src/types/operator/builder.rs
+
+impl<A: Builder> Operator {
+    pub fn new(accessor: A) -> OperatorBuilder<impl Access> {
+        let builder = OperatorBuilder { accessor }
+            .layer(ErrorContextLayer);
+        
+        #[cfg(feature = "auto-simulate")]
+        let builder = builder.layer(CompleteLayer);
+        
+        builder.layer(CorrectnessCheckLayer)
+    }
+}
+
+// Cargo.toml
+[features]
+default = ["auto-simulate"]  # Next version to Phase 3
+# default = []               # Phase 3+ (breaking change)
+auto-simulate = []
+```
+
+## Capability updates
+
+Services that don't natively support certain features will have their 
`full_capability` updated when `SimulateLayer` is applied:
+
+```rust
+impl<A: Access> Layer<A> for SimulateLayer {
+    fn layer(&self, inner: A) -> Self::LayeredAccess {
+        let info = inner.info();
+        
+        info.update_full_capability(|mut cap| {
+            if self.list_start_after && cap.list {
+                cap.list_with_start_after = true;
+            }
+            if self.list_recursive && cap.list {
+                cap.list_with_recursive = true;
+            }
+            if self.create_dir && cap.list && cap.write_can_empty {
+                cap.create_dir = true;
+            }
+            cap
+        });
+        
+        SimulateAccessor { 
+            inner: Arc::new(inner), 
+            info, 
+            config: self.clone() 
+        }
+    }
+}
+```
+
+## Bindings integration
+
+Bindings will automatically apply `SimulateLayer` to maintain backward 
compatibility:
+
+```rust
+// bindings/python/src/operator.rs
+
+impl Operator {
+    pub fn new(scheme: &str, options: HashMap<String, String>) -> 
PyResult<Self> {
+        let op = ocore::Operator::new(accessor)?
+            .layer(ocore::layers::SimulateLayer::default())  // Auto-apply in 
bindings
+            .finish();
+        Ok(Self { core: op })
+    }
+}
+```
+
+## List start_after simulation
+
+As a concrete example, here's how `start_after` will be simulated:
+
+```rust
+pub struct StartAfterLister<L> {
+    inner: L,
+    start_after: String,
+    skipped: bool,
+}
+
+impl<L: oio::List> oio::List for StartAfterLister<L> {
+    async fn next(&mut self) -> Result<Option<oio::Entry>> {
+        loop {
+            let Some(entry) = self.inner.next().await? else {
+                return Ok(None);
+            };
+            
+            // Skip entries until we find one > start_after
+            if !self.skipped {
+                if entry.path() <= self.start_after.as_str() {
+                    continue;
+                }
+                self.skipped = true;
+            }
+            
+            return Ok(Some(entry));
+        }
+    }
+}
+```
+
+# Drawbacks
+
+## Breaking change risk
+
+Even with a phased migration, changing from implicit to explicit simulation is 
fundamentally a breaking change. Users who don't follow the migration guide 
will face compilation errors after Phase 3.
+
+**Mitigation**: 
+- 6-month transition period with clear deprecation warnings
+- Comprehensive migration guide with automated tools
+- `auto-simulate` feature for temporary backward compatibility
+
+## Documentation burden
+
+This change requires updating extensive documentation, examples, and tutorials 
across all bindings.
+
+**Mitigation**:
+- Automated documentation generation
+- Version-specific migration guides
+- Community support during transition
+
+## Potential performance regression for some users
+
+Users who upgrade without understanding the change might add 
`SimulateLayer::default()` everywhere, even for backends that don't need it, 
adding unnecessary overhead.
+
+**Mitigation**:
+- Clear documentation about which backends benefit from simulation
+- Performance recommendations in the migration guide
+- Compiler warnings for unnecessary simulations (future enhancement)
+
+# Rationale and alternatives
+
+## Why this design?
+
+1. **Explicit over implicit**: Users should consciously decide what 
simulations to use, leading to better understanding and fewer surprises.
+
+2. **Gradual migration**: The phased approach minimizes disruption while 
giving the community time to adapt.
+
+3. **Binding compatibility**: Automatic application in bindings ensures that 
non-Rust users aren't affected.
+
+4. **Extensibility**: The method chaining pattern makes it easy to add new 
simulations (like `list_with_limit`) without breaking changes.
+
+## Alternative 1: Keep CompleteLayer, add disable flags
+
+Instead of a new layer, add flags to disable specific simulations in 
`CompleteLayer`.
+
+**Rejected because**:
+- Doesn't address the "all-or-nothing" problem fundamentally
+- Still forces all users to pay for simulation machinery even if disabled
+- Doesn't improve code organization or maintainability
+
+## Alternative 2: Service-specific default simulation
+
+Each service declares which simulations it needs, applied automatically.
+
+**Rejected because**:
+- Less transparent - users don't know what's being simulated
+- Service maintainers must decide defaults, which may not match user needs
+- Doesn't allow per-application customization
+
+## Alternative 3: Split into multiple layers
+
+Create separate layers: `ListSimulateLayer`, `StatSimulateLayer`, 
`CreateDirSimulateLayer`.
+
+**Rejected because**:
+- Too granular for most users
+- Verbose: `.layer(ListSimulateLayer).layer(StatSimulateLayer)...`
+- Doesn't align with OpenDAL's "batteries included" philosophy
+- `SimulateLayer` with method chaining provides the same granularity when 
needed
+
+## Alternative 4: Builder pattern
+
+Use a separate builder type like `SimulateLayer::builder().with_xxx().build()`.
+
+**Rejected because**:
+- More verbose than necessary for simple use cases
+- Method chaining on the layer itself is more ergonomic
+- OpenDAL already uses method chaining in many places (e.g., `OpList`)
+
+## What if we don't do this?
+
+1. `CompleteLayer` continues to grow in complexity
+2. New simulations (like `start_after` for fs) remain unimplemented
+3. Performance-sensitive users continue to be frustrated by forced overhead
+4. OpenDAL becomes harder to maintain and understand
+
+# Prior art
+
+## Rust ecosystem
+
+- **Tower** (HTTP middleware): Uses explicit middleware with builder pattern
+- **Actix-web**: Middleware are opt-in with clear APIs
+- **Reqwest**: Features are controlled via Cargo features, not runtime
+
+## Other storage libraries
+
+- **AWS SDK**: Capabilities are documented but not abstracted
+- **GCS Client**: No simulation layer - users must handle missing features
+- **MinIO SDK**: Provides compatibility helpers as separate modules
+
+OpenDAL's approach is unique in providing transparent simulation, and this RFC 
aims to maintain that advantage while adding user control.
+
+# Unresolved questions
+
+## When to remove reader/writer wrappers?
+
+`CompleteLayer` currently includes `CompleteReader` and `CompleteWriter` that 
validate read/write sizes. Should these:
+1. Stay in a minimal `CompleteLayer` (still auto-applied)?
+2. Move to `SimulateLayer`?
+3. Move to a separate `ValidateLayer`?
+
+**Proposed resolution**: Keep in minimal `CompleteLayer` during transition, 
revisit in next major version.
+
+## Should we provide migration tooling?
+
+Should we build `cargo-opendal migrate` or similar tools to automatically 
update code?
+
+**Proposed resolution**: Defer to Phase 2, assess based on community feedback.
+
+## Binding-specific configuration?
+
+Should Python users be able to configure `SimulateLayer` through binding APIs?
+
+```python
+# Hypothetical API
+operator = opendal.Operator(
+    "fs", 
+    root="/tmp",
+    simulate={"list_start_after": True, "stat_dir": False}
+)
+```
+
+**Proposed resolution**: Not in initial implementation. Can be added in future 
based on demand.
+
+# Future possibilities
+
+## Automatic simulation detection
+
+Add compile-time or runtime warnings when `SimulateLayer` is used with 
backends that don't need it:
+
+```rust
+// Future enhancement
+let op = Operator::new(S3::default())?
+    .layer(SimulateLayer::default())  // Warning: S3 supports all features 
natively
+    .finish();
+```
+
+## Conditional simulation based on operations
+
+Allow enabling simulations only for specific operations:
+
+```rust
+// Future API
+SimulateLayer::default()
+    .simulate_only_for(&["list", "stat"])
+```
+
+## Simulation statistics
+
+Expose metrics about which simulations are being used:
+
+```rust
+// Future API
+let stats = op.simulation_stats();
+println!("start_after simulated: {} times", stats.start_after_count);
+```
+
+## List limit simulation
+
+Extend simulation to `list_with_limit` for backends that don't support it 
natively (requires client-side buffering).
+
+## Versioning simulation
+
+Extend simulation to `list_with_versions` for backends that don't support 
versioning natively (requires metadata storage).
+
+## Plugin system for custom simulations
+
+Allow users to provide custom simulation logic:
+
+```rust
+// Far future API
+SimulateLayer::default()
+    .with_custom_simulator(MyCustomSimulator::new())
+```
+
+## Zero-cost simulation layer
+
+Use const generics or compile-time feature detection to make simulation truly 
zero-cost when all features are native:
+
+```rust
+// Speculative future optimization
+let op = Operator::new(S3::default())?
+    .layer(SimulateLayer::default())  // Optimized away at compile time for S3
+    .finish();
+```

(opendal) 01/01: rfc Simulate Layer

Reply via email to