On Jul 8, 2014, at 7:20 AM, [email protected] wrote:

> Hello,
> 
> I am looking for an alternate method to get the full content of cached objects 
> as (preferably unmapped) URLs.
> The web cache inspector has its own limitations and is not efficient for large 
> configurations (6TB of raw cache and/or hundreds of millions of objects). It 
> also seems it will be abandoned later.
> 
> As suggested by Leif, see 
> https://mail-archives.apache.org/mod_mbox/trafficserver-dev/201306.mbox/%[email protected]%3E,
>  I wrote 2 shell scripts using a custom log. One runs via crontab, adding a 
> daily summary of all unique newly cached objects; a 30-day window is kept, 
> based on the biggest TTL of the platform. The other accepts regexes and/or 
> types of documents ("image" is "*.jpg", "*.png"...) as input and then builds 
> an AWK script containing regexes to parse the windowed summary. Each matching 
> URL is then passed to curl to purge the cache.

There is a Jira ticket somewhere to expose the cache inspector features as a 
set of command-line tools, but Jira being what it is, I can't find it right 
now. In the absence of that, tools that work on the access logs are a pretty 
reasonable solution.
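
As a rough illustration of the log-based approach, here is a minimal Python 
sketch. It assumes a hypothetical custom log format of one 
"<epoch-seconds> <url>" pair per line; the function names and the log layout 
are illustrative only and are not taken from the scripts described above or 
from Traffic Server itself.

```python
import re

# 30-day window, matching the largest TTL mentioned in the quoted message.
WINDOW_SECONDS = 30 * 24 * 3600


def windowed_urls(lines, now):
    """Return {url: last_seen_epoch} for log entries inside the window.

    Assumes each line looks like "<epoch-seconds> <url>" (hypothetical format).
    """
    seen = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip malformed lines
        try:
            ts = float(parts[0])
        except ValueError:
            continue  # skip lines without a numeric timestamp
        url = parts[1].strip()
        # Keep only recent entries, remembering the newest sighting per URL.
        if now - ts <= WINDOW_SECONDS and ts > seen.get(url, float("-inf")):
            seen[url] = ts
    return seen


def select_urls(seen, pattern):
    """Return the sorted URLs that match the given regular expression."""
    rx = re.compile(pattern)
    return sorted(url for url in seen if rx.search(url))
```

Each URL returned by select_urls() could then be handed to whatever purge 
mechanism is already in place, such as the curl call described above.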

If the goal is to invalidate large portions of the cache, that's typically done 
with a small plugin that injects a static version string into the cache key. To 
invalidate a portion of the cache (usually defined by a remap rule), you just 
rev the version string.
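
To make the idea concrete, here is a toy Python model of a versioned cache 
key. It is only a sketch of the concept: a real plugin would be written 
against the Traffic Server C/C++ plugin API, and the dictionary cache, the 
host name, and the function names below are all hypothetical.

```python
def cache_key(host, path, versions):
    """Build a cache key that embeds the version string for this host.

    `versions` maps a host (standing in for a remap rule) to its current
    version string; unknown hosts fall back to "v0".
    """
    version = versions.get(host, "v0")
    return f"{version}/{host}{path}"


def lookup(cache, host, path, versions):
    """Return the cached body for the current version, or None on a miss."""
    return cache.get(cache_key(host, path, versions))


def store(cache, host, path, versions, body):
    """Store a body under the key for the current version."""
    cache[cache_key(host, path, versions)] = body


# Example: bumping the version makes every old entry for that host a miss.
versions = {"images.example.com": "v1"}
cache = {}
store(cache, "images.example.com", "/a.jpg", versions, b"old bytes")
hit_before = lookup(cache, "images.example.com", "/a.jpg", versions)
versions["images.example.com"] = "v2"  # "rev the version string"
hit_after = lookup(cache, "images.example.com", "/a.jpg", versions)
```

Nothing is deleted when the version changes; the old entries simply become 
unreachable under the new key and would age out of the cache over time.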


> Do you think this approach is a good one, or is there a more suitable 
> solution to get the valid (not expired) URLs Traffic Server stores? I was 
> thinking of querying Traffic Server for non-stale objects to retrieve all 
> valid URLs stored but it can be a big IO consumer...
> 
> I took a look at the API, but since I have not programmed in C for years 
> and C++ looks bizarre to me, I can't go this way.
> 
> Regards,
> --
> Denis
