Re: [PR] RFC-6297: Cache Layer [opendal]

via GitHub Sat, 03 Jan 2026 19:37:11 -0800


flaneur2020 commented on code in PR #6297:
URL: https://github.com/apache/opendal/pull/6297#discussion_r2659297907



##########
core/core/src/docs/rfcs/6297_cache_layer.md:
##########
@@ -0,0 +1,435 @@
+- Proposal Name: `cache_layer`
+- Start Date: 2025-06-16
+- RFC PR: [apache/opendal#6297](https://github.com/apache/opendal/pull/6297)
+- Tracking Issue: 
[apache/opendal#7107](https://github.com/apache/opendal/issues/7107)
+
+# Summary
+
+This RFC proposes the addition of a Cache Layer to OpenDAL, providing 
transparent read-through and write-through caching capabilities. The Cache 
Layer allows users to improve performance by caching data from a slower storage 
service (e.g., S3, HDFS) to a faster one (e.g., Memory, Moka, Redis).
+
+# Motivation
+
+Storage access performance varies greatly across different storage services.
+Remote object stores like S3 or GCS have much higher latency than local 
storage or in-memory caches.
+In many applications, particularly those with read-heavy workloads or repeated 
access to the same data, caching can significantly improve performance.
+
+Currently, users who want to implement caching with OpenDAL must manually:
+
+1. Check if data exists in cache service
+2. If cache misses, fetch from original storage and manually populate cache
+3. Handle cache invalidation and consistency manually
+
+By introducing a dedicated Cache Layer, we can:
+
+- Provide a unified, transparent caching solution within OpenDAL
+- Eliminate boilerplate code for common caching patterns
+- Allow flexible configuration of caching policies
+- Enable performance optimization with minimal code changes
+- Leverage existing OpenDAL services as cache storage
+
+# Guide-level explanation
+
+The Cache Layer allows you to wrap any existing service with a caching 
mechanism.
+When data is accessed through this layer, it will automatically be cached in 
your specified cache service.
+The cache layer is designed to be straightforward, and delegates cache 
management policies (like TTL, eviction policy) to the underlying cache service.
+
+## Basic Usage
+
+```rust
+use opendal::{layers::CacheLayer, services::Memory, services::S3, Operator};
+
+#[tokio::main]
+async fn main() -> opendal::Result<()> {
+    // Create a memory operator to use as cache
+    let memory = Operator::new(Memory::default())?;
+
+    // Create a primary storage operator (e.g., S3)
+    let s3 = Operator::new(
+        S3::default()
+            .bucket("my-bucket")
+            .region("us-east-1")
+            .build()?
+        )?
+        .finish();
+
+    // Wrap the primary storage with the cache layer
+    let op = s3.layer(CacheLayer::new(memory)).finish();
+
+    // Use the operator as normal - caching is transparent
+    let data = op.read("path/to/file").await?;
+
+    // Later reads will be served from cache if available
+    let cached_data = op.read("path/to/file").await?;
+
+    Ok(())
+}
+```
+
+## Using Different Cache Services
+
+The Cache Layer can use any OpenDAL service as cache service:
+
+```rust
+// Using Redis as cache
+use opendal::services::Redis;
+
+let redis_cache = Operator::new(
+    Redis::default()
+        .endpoint("redis://localhost:6379")
+    )?
+    .finish();
+
+let op = s3.layer(CacheLayer::new(redis_cache)).finish();
+```
+
+```rust
+// Using Moka (in-memory cache with advanced features)
+use opendal::services::Moka;
+
+let moka_cache = Operator::new(
+        Moka::default()
+            .max_capacity(1000)
+            .time_to_live(Duration::from_secs(3600)) // TTL managed by Moka
+    )?
+    .finish();
+
+let op = s3.layer(CacheLayer::new(moka_cache)).finish();
+```
+
+## Multiple Cache Layers
+
+You can stack multiple cache layers for a multi-tier caching strategy:
+
+```rust
+// L1 cache: Fast in-memory cache
+let l1_cache = Operator::new(Memory::default())?.finish();
+
+// L2 cache: Larger but slightly slower cache (e.g., Redis)
+let l2_cache = Operator::new(
+        Redis::default().endpoint("redis://localhost:6379")
+    )?
+    .finish();
+
+// Stack the caches: L1 -> L2 -> S3
+let op = s3
+    .layer(CacheLayer::new(l2_cache))  // L2 cache
+    .layer(CacheLayer::new(l1_cache))  // L1 cache
+    .finish();
+```
+
+## Configuration Options
+
+The Cache Layer provides minimal configuration to keep it simple:
+
+```rust
+let op = s3.layer(
+    CacheLayer::new(memory)
+        .with_options(CacheOptions {
+            // Enable read-through caching (default: true)
+            read: true,
+            // Enable cache promotion during read operations (default: true)
+            read_promotion: true,
+            // Enable write-through caching (default: true)
+            write: true,

Review Comment:
   is it possible to use an enum for the write option to seperate `write 
through` or `write back`?
   
   sometimes write back could be useful, be like if the upstream s3 service 
became unavailable, having a write back cache could be benefical to keep our 
service available by buffering the writes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] RFC-6297: Cache Layer [opendal]

Reply via email to