[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756645#comment-13756645
 ] 

Nikola Petrov commented on HTTPCLIENT-1395:
-------------------------------------------

Hi Jon,

I agree with you on all points. Here are my notes (maybe somewhat specific to my 
use case):

* Yep, sometimes making 3 calls to the cache storage layer will be slower than 
just sending a request to the server (given that the network to the HTTP server 
is fast enough).
* {quote}
I need to refresh my memory as to why, if we have a cache miss, we re-check 
whether there are variants present before calling the backend
{quote} As far as I could see, the getCachedEntry method didn't expose that 
information: it returned null for *both* a missing variant and a missing root 
entry.
* {quote} I believe that might be the only cache lookup we can avoid (as the 
later one to check if a more recent entry exists after getting the backend 
response is necessary for proper cache behavior){quote}
In my case (a web crawler), there is another layer/component that checks whether 
the current URI has already been processed by another worker thread, so this 
check is not needed. I agree that the default should be to check whether the 
response is older than the one in the cache, but the API user should be able to 
control whether that check runs.
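
To make the last two points concrete, here is a rough sketch of what I mean. It is not the patch and it does not use the real HttpClient classes: SingleLookupSketch, LookupKind, RootEntry, CountingStorage and the checkForNewerEntry flag are all hypothetical names made up for illustration. The idea is that one storage read is enough to tell "no root entry" apart from "no matching variant", and that the second read after the backend response can be switched off by the caller. The "newer entry" test is simplified to "an entry for this variant appeared in the meantime"; the real check would compare response dates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustration only: every type here is hypothetical, not part of HttpClient. */
final class SingleLookupSketch {

    /** Outcome of a single storage read. */
    enum LookupKind { NO_ROOT_ENTRY, NO_MATCHING_VARIANT, HIT }

    /** Stand-in for a root cache entry: variant key -> cached body. */
    static final class RootEntry {
        final Map<String, String> variants = new HashMap<>();
    }

    /** Stand-in for a slow storage backend that counts its reads. */
    static final class CountingStorage {
        private final Map<String, RootEntry> entries = new HashMap<>();
        int reads = 0;

        RootEntry getEntry(String key) {
            reads++;
            return entries.get(key);
        }

        void putEntry(String key, RootEntry entry) {
            entries.put(key, entry);
        }
    }

    /** Classifies an already-fetched entry without touching storage again. */
    static LookupKind classify(RootEntry root, String variantKey) {
        if (root == null) {
            return LookupKind.NO_ROOT_ENTRY;
        }
        return root.variants.containsKey(variantKey)
                ? LookupKind.HIT
                : LookupKind.NO_MATCHING_VARIANT;
    }

    /** Serves from the cache, or calls the backend on a miss and stores the result. */
    static String fetch(CountingStorage storage, String uri, String variantKey,
                        Supplier<String> backend, boolean checkForNewerEntry) {
        RootEntry root = storage.getEntry(uri); // the only read needed to classify the miss
        if (classify(root, variantKey) == LookupKind.HIT) {
            return root.variants.get(variantKey);
        }

        String body = backend.get();

        if (checkForNewerEntry) {
            // Optional second read: another worker may have stored an entry meanwhile.
            RootEntry latest = storage.getEntry(uri);
            if (classify(latest, variantKey) == LookupKind.HIT) {
                return latest.variants.get(variantKey);
            }
            if (latest != null) {
                root = latest;
            }
        }

        RootEntry updated = (root != null) ? root : new RootEntry();
        updated.variants.put(variantKey, body);
        storage.putEntry(uri, updated);
        return body;
    }
}
```

With checkForNewerEntry set to false a complete miss costs one read; with it set to true it costs two. A crawler that already deduplicates URIs elsewhere could pass false, while the default would stay true.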

-- 
Nikola
                
> Call the storage implementation only once on a cache miss
> ---------------------------------------------------------
>
>                 Key: HTTPCLIENT-1395
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1395
>             Project: HttpComponents HttpClient
>          Issue Type: Improvement
>          Components: HttpCache
>    Affects Versions: 4.2.5
>            Reporter: Nikola Petrov
>            Priority: Minor
>             Fix For: 4.3.1
>
>         Attachments: call-storage-implementation-once-4.2-branch.patch, 
> call-storage-implementation-once.patch, 
> call-storage-implementation-once-trunk.patch
>
>
> I am trying to use the httpclient-cache component with a Cassandra backend. 
> Everything seems good, except that HttpCacheStorage#getEntry is called 
> 3 times on the first request, resulting in a performance bottleneck. There 
> might be a way to handle this in the Storage implementation by caching the 
> recently queried values, but I think that a better place is in the 
> CachingHttpClient class. The current code expects zero latency to the storage 
> backend (the current implementations are all memory based). Here is a patch 
> that fixes the problem. Some notes:
> * I am using the code from the 4.2.5 release (but can port the code to the 
> current trunk)
> * A test is provided in org.apache.http.impl.client.cache.TestCachingHttpClient
> * BasicHttpCache is patched to expose methods that check whether the key is 
> found and whether a proper variant is found. Without this there is no way to 
> tell a real cache miss from a missing variant
> * CachingHttpClient checks whether the current HttpCache implementation is 
> BasicHttpCache so it can use the new methods. I didn't want to change the 
> interface because that would introduce breaking changes to the API
> * The patch exposes the alreadyHaveNewerCacheEntry method so implementations 
> can control whether the client checks for a more recent version in the cache

