[
https://issues.apache.org/jira/browse/HTTPCLIENT-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756645#comment-13756645
]
Nikola Petrov commented on HTTPCLIENT-1395:
-------------------------------------------
Hi Jon,
I agree with you on all points. Here are my notes (maybe somewhat specific to
my use case):
* Yep, sometimes making 3 calls to the cache storage layer will be slower than
just sending a request to the server (given that the network to the HTTP
server is fast enough).
* {quote}
I need to refresh my memory as to why, if we have a cache miss, we re-check
whether there are variants present before calling the backend
{quote} As far as I could see, the method getCachedEntry didn't expose that
information and returned null in *both* cases: a missing variant and a missing
root entry.
* {quote} I believe that might be the only cache lookup we can avoid (as the
later one to check if a more recent entry exists after getting the backend
response is necessary for proper cache behavior){quote}
In my case (a web crawler), another layer/component already checks whether the
current URI has been processed by another worker thread, so this check is not
needed. I agree that the default should be to check whether the response is
older than the one in the cache, but the API user should be able to control
that check.
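
To illustrate the point about getCachedEntry returning null for both kinds of miss, here is a minimal sketch of a lookup that reports which kind of miss occurred after a single storage read. All names (CacheLookupSketch, Outcome, RootEntry, CountingStorage, lookup) are hypothetical and are not part of httpclient-cache. It also assumes, for simplicity, that variants are stored inside the root entry, which may not match how the real cache lays out its entries.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types belong to httpclient-cache. */
class CacheLookupSketch {

    /** Outcome of a lookup, so the caller does not need to ask storage again. */
    enum Outcome { HIT, VARIANT_MISS, ROOT_MISS }

    /** Stand-in for a stored root entry: maps a variant key to a cached body. */
    static final class RootEntry {
        final Map<String, String> variants = new HashMap<>();
    }

    /** Stand-in for a slow storage backend; counts how often it is read. */
    static final class CountingStorage {
        final Map<String, RootEntry> entries = new HashMap<>();
        int reads = 0;

        RootEntry getEntry(String key) {
            reads++;
            return entries.get(key);
        }
    }

    /** Result carrying the outcome and, on a hit, the cached body. */
    static final class LookupResult {
        final Outcome outcome;
        final String body; // null unless outcome == HIT

        LookupResult(Outcome outcome, String body) {
            this.outcome = outcome;
            this.body = body;
        }
    }

    /** Performs exactly one storage read and says why a miss happened. */
    static LookupResult lookup(CountingStorage storage, String uri, String variantKey) {
        RootEntry root = storage.getEntry(uri); // the only storage read
        if (root == null) {
            return new LookupResult(Outcome.ROOT_MISS, null);
        }
        String body = root.variants.get(variantKey);
        if (body == null) {
            return new LookupResult(Outcome.VARIANT_MISS, null);
        }
        return new LookupResult(Outcome.HIT, body);
    }

    public static void main(String[] args) {
        CountingStorage storage = new CountingStorage();
        RootEntry root = new RootEntry();
        root.variants.put("gzip", "compressed body");
        storage.entries.put("http://example.com/a", root);

        System.out.println(lookup(storage, "http://example.com/a", "gzip").outcome);
        System.out.println(lookup(storage, "http://example.com/a", "identity").outcome);
        System.out.println(lookup(storage, "http://example.com/b", "gzip").outcome);
        System.out.println("storage reads: " + storage.reads);
    }
}
```

With a result like this, the caller can go straight to the backend on ROOT_MISS without re-checking for variants, which is the lookup discussed above as avoidable.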
--
Nikola
> Call the storage implementation only once on a cache miss
> ---------------------------------------------------------
>
> Key: HTTPCLIENT-1395
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1395
> Project: HttpComponents HttpClient
> Issue Type: Improvement
> Components: HttpCache
> Affects Versions: 4.2.5
> Reporter: Nikola Petrov
> Priority: Minor
> Fix For: 4.3.1
>
> Attachments: call-storage-implementation-once-4.2-branch.patch,
> call-storage-implementation-once.patch,
> call-storage-implementation-once-trunk.patch
>
>
> I am trying to use the httpclient-cache component with a Cassandra backend.
> Everything seems good except that HttpCacheStorage#getEntry is called 3 times
> the first time, resulting in a performance bottleneck. There might be a way
> to handle this in the Storage implementation by caching the recently queried
> values, but I think that a better place is in the CachingHttpClient class.
> The current code expects zero latency to the storage backend (the current
> implementations are all memory based), but here is a patch that fixes the
> problem. Some notes:
> * I am using the code from the 4.2.5 release (but can port the code to the
> current trunk)
> * test is provided in org.apache.http.impl.client.cache.TestCachingHttpClient
> * BasicHttpCache is patched to expose methods that check if the key is found
> or if a proper variant is found - without this there is no way to say if
> there was a real cache miss or the specific variant is missing
> * CachingHttpClient is checking if the current HttpCache implementation is
> BasicHttpCache so it can use the new methods - I didn't want to change the
> interface because this will add breaking changes to the API
> * This exposes the alreadyHaveNewerCacheEntry method so implementations can
> control if the client should check for a more recent version in the cache
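
The storage-side alternative mentioned in the issue description (caching recently queried values inside the Storage implementation) could look roughly like the sketch below. The Storage interface here is a hypothetical, read-only stand-in and not the real HttpCacheStorage, and RecentLookups is an invented name. It is a sketch under those assumptions, not the approach taken by the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; Storage is a stand-in, not the real HttpCacheStorage. */
class MemoizingStorageSketch {

    /** Read-only slice of a storage API, assumed for illustration. */
    interface Storage {
        String getEntry(String key);
    }

    /** Remembers recent lookups, including misses, in a small LRU map. */
    static final class RecentLookups implements Storage {
        private final Storage delegate;
        private final Map<String, Optional<String>> recent;

        RecentLookups(Storage delegate, final int capacity) {
            this.delegate = delegate;
            // Access-ordered LinkedHashMap that evicts the least recently used key.
            this.recent = new LinkedHashMap<String, Optional<String>>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Optional<String>> eldest) {
                    return size() > capacity;
                }
            };
        }

        @Override
        public synchronized String getEntry(String key) {
            Optional<String> cached = recent.get(key);
            if (cached == null) {
                // Not seen recently: ask the slow backend once and remember
                // the answer, even when the answer is "no entry".
                cached = Optional.ofNullable(delegate.getEntry(key));
                recent.put(key, cached);
            }
            return cached.orElse(null);
        }
    }

    public static void main(String[] args) {
        final int[] backendCalls = {0};
        Storage slowBackend = key -> {
            backendCalls[0]++;
            return null; // simulate a cache miss on every key
        };
        Storage storage = new RecentLookups(slowBackend, 100);

        // Three lookups for the same key, as happens on a cache miss.
        storage.getEntry("http://example.com/a");
        storage.getEntry("http://example.com/a");
        storage.getEntry("http://example.com/a");
        System.out.println("backend calls: " + backendCalls[0]);
    }
}
```

A real wrapper would also have to drop the remembered value whenever an entry is put, updated, or removed, and would need some bound on how long a remembered miss stays valid. That extra bookkeeping is one reason to prefer fixing the call pattern in the client instead.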