On Jul 8, 2014, at 7:20 AM, [email protected] wrote:

> Hello,
>
> I am looking for an alternate method to get the full content of cached objects
> as (preferably unmapped) URLs.
> The web cache inspector has its own limitations and is not efficient for large
> configurations (6TB of raw cache and/or hundreds of millions of objects). Later,
> it seems, it will be abandoned.
>
> As suggested by Leif, see
> https://mail-archives.apache.org/mod_mbox/trafficserver-dev/201306.mbox/%[email protected]%3E,
> I wrote 2 shell scripts using a custom log. One runs via crontab and adds a
> daily summary of all unique objects newly cached; a 30-day window is applied,
> based on the biggest TTL of the platform. The other accepts regexes and/or
> types of documents ("image" is "*.jpg", "*.png"...) as input and then builds an
> AWK script containing regexes to parse the windowed summary. Each matching URL
> is then passed to curl to purge the cache.
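For readers following along, here is a rough Python sketch of the log-based approach described in the quoted message. It is not Denis's actual scripts: the log format (`<epoch-seconds> <url>` per line), the `TYPE_PATTERNS` table, the function names, and the proxy address are all assumptions made for illustration. The purge step sends an HTTP PURGE request through the proxy, which only works if the proxy is configured to accept that method from the client.

```python
import re
import urllib.request

WINDOW_SECONDS = 30 * 24 * 3600  # 30-day window, matching the largest TTL mentioned

# Hypothetical mapping from a document "type" to a URL regex.
TYPE_PATTERNS = {
    "image": r"\.(jpg|png)(\?.*)?$",
}


def windowed_unique_urls(log_lines, now, window=WINDOW_SECONDS):
    """Return {url: last_seen} for log entries no older than `window` seconds.

    Assumes each line is '<epoch-seconds> <url>' (a made-up custom log format).
    Malformed lines are skipped.
    """
    seen = {}
    for line in log_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        try:
            ts = float(parts[0])
        except ValueError:
            continue
        url = parts[1].strip()
        if now - ts <= window:
            seen[url] = max(ts, seen.get(url, ts))
    return seen


def select_urls(urls, patterns):
    """Return the sorted URLs matching any pattern; a pattern may be a type name."""
    compiled = [re.compile(TYPE_PATTERNS.get(p, p)) for p in patterns]
    return sorted(u for u in urls if any(c.search(u) for c in compiled))


def purge(url, proxy="http://127.0.0.1:8080"):
    """Send an HTTP PURGE for `url` through the proxy (address is an assumption).

    urllib raises HTTPError on non-2xx responses, e.g. when the object
    is not in the cache.
    """
    req = urllib.request.Request(url, method="PURGE")
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    with opener.open(req, timeout=10) as resp:
        return resp.status
```

Typical use would be to feed the summary file to `windowed_unique_urls`, pass the resulting URLs and something like `["image"]` to `select_urls`, and call `purge` on each result.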
There is a Jira ticket somewhere to expose the cache inspector features as a set of command-line tools, but Jira being what it is, I can't find it right now. In the absence of that, tools that work on the access logs are a pretty reasonable solution.

If the goal is to invalidate large portions of the cache, that's typically done with a small plugin that injects a static version string into the cache key. To invalidate a portion of the cache (usually defined by a remap rule), you just rev the version string; objects stored under the old key are then no longer matched by new requests.

> Do you think this approach is a good one, or is there a more suitable
> solution to get the valid (not expired) URLs Traffic Server stores? I was
> thinking of querying Traffic Server for non-stale objects to retrieve all
> valid URLs stored, but it can be a big IO consumer...
>
> I took a look at the API, but since I have not programmed in C for years
> and C++ looks bizarre, I can't go this way.
>
> Regards,
> --
> Denis
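
P.S. To make the version-string idea from the reply above concrete, here is a toy Python model of it. This is not Traffic Server code and not a real plugin: the `VersionedCache` class, its method names, and the use of a URL prefix to stand in for a remap rule are all invented for illustration. It only shows why changing the version invalidates everything under a rule without touching each object.

```python
import hashlib


class VersionedCache:
    """Toy cache whose key mixes a per-rule version string into the URL."""

    def __init__(self):
        self.versions = {}  # URL prefix (stand-in for a remap rule) -> version
        self.store = {}     # cache key -> cached body

    def _key(self, url):
        # Pick the longest configured prefix that matches this URL, if any.
        prefix = max(
            (p for p in self.versions if url.startswith(p)),
            key=len,
            default="",
        )
        version = self.versions.get(prefix, "")
        return hashlib.sha256(f"{version}|{url}".encode()).hexdigest()

    def rev(self, prefix, version):
        """Set or change the version string for one rule."""
        self.versions[prefix] = version

    def put(self, url, body):
        self.store[self._key(url)] = body

    def get(self, url):
        return self.store.get(self._key(url))


cache = VersionedCache()
cache.rev("http://img.example/", "v1")
cache.put("http://img.example/a.png", b"A")
cache.put("http://www.example/", b"W")

# Rev the version for the image rule only.
cache.rev("http://img.example/", "v2")
```

After the final `rev`, a lookup for `http://img.example/a.png` computes a different key and misses, while `http://www.example/` is unaffected. The old object is still physically stored; it is just unreachable, so no per-URL purge is needed.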
