ankrgyl commented on code in PR #542:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/542#discussion_r2511193232


##########
src/client/token.rs:
##########
@@ -33,15 +35,17 @@ pub(crate) struct TemporaryToken<T> {
 /// [`TemporaryToken`] based on its expiry
 #[derive(Debug)]
 pub(crate) struct TokenCache<T> {
-    cache: Mutex<Option<(TemporaryToken<T>, Instant)>>,
+    cache: ArcSwapOption<CacheEntry<T>>,

Review Comment:
   Sorry I just saw this before writing my other comment.
   
   > I would imagine the overhead of actually using the token (making an HTTP 
request) is pretty huge compared to getting a lock.
   
   The problem with the previous design, which may not apply to an RwLock (and 
sure, I will benchmark it and report back), is that "waiting in line" for the 
mutex would become so expensive with a high number of concurrent requests (eg. 
HEAD requests with 8ms p50 latencies), that it actually overwhelmed tokio's 
worker threads and dominated the execution time (we saw p50 "HEAD" operation 
latency spike to 700ms, and realized the mutex was the root cause).
   
   Let me run a benchmark with arc swap vs. RwLock and report back



##########
src/client/token.rs:
##########
@@ -33,15 +35,17 @@ pub(crate) struct TemporaryToken<T> {
 /// [`TemporaryToken`] based on its expiry
 #[derive(Debug)]
 pub(crate) struct TokenCache<T> {
-    cache: Mutex<Option<(TemporaryToken<T>, Instant)>>,
+    cache: ArcSwapOption<CacheEntry<T>>,

Review Comment:
   Sorry I just saw this afer writing my other comment.
   
   > I would imagine the overhead of actually using the token (making an HTTP 
request) is pretty huge compared to getting a lock.
   
   The problem with the previous design, which may not apply to an RwLock (and 
sure, I will benchmark it and report back), is that "waiting in line" for the 
mutex would become so expensive with a high number of concurrent requests (eg. 
HEAD requests with 8ms p50 latencies), that it actually overwhelmed tokio's 
worker threads and dominated the execution time (we saw p50 "HEAD" operation 
latency spike to 700ms, and realized the mutex was the root cause).
   
   Let me run a benchmark with arc swap vs. RwLock and report back



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to