Thanks a lot, Gancho. The guidelines you provided are really helpful. I will share feedback after implementation.

--
Thanks
f.

Saturday, 06 August 2016, 03:45AM +05:00 from Gancho Tenev [email protected]:
>Hello Faisal,
>
>You could do it in many different ways, using different cachekey params and
>a number of different regular expressions to match, so you would have to
>experiment a little bit.
>
>Please find a couple of quick examples below.
>
>If you would like to map all of the following URIs, ignoring the first part
>of the host name and the path up to the filename (to remove the "random"
>parts):
>
>URL: http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
>URL: http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe
>URL: http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe
>
>to the same cache key '/mydomain.com/vlc-2.2.2-win64.exe', you could do
>something like the following:
>
>$ cat remap.config
>regex_map http://(fs[0-9]+).mydomain.com \
>  http://$1.mydomain.com \
>  @plugin=cachekey.so \
>  @pparam=--capture-path=/((?:\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/$2/ \
>  @pparam=--static-prefix=mydomain.com
>
>$ curl -x 127.0.0.1:80 \
>  http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe \
>  http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe \
>  http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe
>
>$ sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key'
>[Aug 5 15:24:37.491] Server {0x7f042ace0700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
>[Aug 5 15:24:37.575] Server {0x7f042ace0700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
>[Aug 5 15:24:37.625] Server {0x7f042ace0700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
>
>If you would like to ignore only the first 2 parts of the path, you could do
>something like:
>
>  @pparam=--capture-path=/((?:\/\w+){0,2}\/)((?:\w+\/)*)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/$2$3/ \
>
>and then
>
>$ curl -x 127.0.0.1:80 \
>  http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe \
>  http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe \
>  http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe
>
>would produce:
>
>$ sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key'
>[Aug 5 15:35:43.574] Server {0x7f6784d43700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
>[Aug 5 15:35:43.660] Server {0x7f6784d43700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
>[Aug 5 15:35:43.708] Server {0x7f6784d43700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
>
>and
>
>$ curl -x 127.0.0.1:80 \
>  http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/onemore/twomore/vlc-2.2.2-win64.exe \
>  http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/onemore/twomore/vlc-2.2.2-win64.exe \
>  http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/onemore/twomore/vlc-2.2.2-win64.exe
>
>$ sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key'
>[Aug 5 15:36:41.095] Server {0x7f6784c42700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/twomore/vlc-2.2.2-win64.exe'
>[Aug 5 15:36:41.179] Server {0x7f6784c42700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/twomore/vlc-2.2.2-win64.exe'
>[Aug 5 15:36:41.232] Server {0x7f6784c42700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/twomore/vlc-2.2.2-win64.exe'
>
>Again, these are just examples, and you would have to experiment more to make
>sure things work and fit well (and to cook the final regex).
>
>You can find the latest documentation here:
>
>https://docs.trafficserver.apache.org/en/latest/admin-guide/plugins/cachekey.en.html
>
>(plus an example on how to test w/o running the server in debug mode)
>
>And play with regex on sites like this: https://regex101.com
>
>HTH!
>
>Cheers,
>-Gancho
>
>>On Aug 5, 2016, at 11:34 AM, [email protected] wrote:
>>Hi Gancho,
>>Just to follow up on my request on cachekey.
>>--
>>Thanks
>>f.
>>
>>Wednesday, 03 August 2016, 00:45AM +05:00 from Muhammad Faisal [email protected]:
>>
>>>Hi Gancho,
>>>Let me explain the CDN content challenge which I have been trying to deal
>>>with for the last couple of months. I believe the cachekey plugin in v6.2
>>>can help.
>>>
>>>URL: http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
>>>URL: http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe
>>>URL: http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe
>>>
>>>The random string in the URL ("/9546/46cfd241f1da4ae9812f512f7b36643c") and
>>>the origin server ("fs37", "fs31") keep changing, so the same object is
>>>cached again every time a user downloads the file.
>>>
>>>I want to store the object once and deliver it to clients from the cache.
>>>For this I need to normalize the cache key and eliminate the random strings
>>>from it. That way I would avoid duplicate cache objects and increase the
>>>cache hit ratio (please correct me if I'm wrong).
>>>
>>>E.g. the filehippo site has the sequence below.
>>>
>>>When I click the download button there are two requests: first a 301, whose
>>>Location header points to the requested content, and then a 200:
>>>
>>>GET /download/file/6853a2c840eaefd1d7da43d6f2c94863adc5f470927402e6518d70573a99114d/ HTTP/1.1
>>>Host: filehippo.com
>>>Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
>>>Accept-Encoding: gzip, deflate, sdch
>>>Accept-Language: en-US,en;q=0.8
>>>Cookie: FHSession=mfzdaugt4nu11q3yfxfkjyox;
>>>  FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM; __utmt_UA-5815250-1=1;
>>>  __qca=P0-1359511593-1459345103148;
>>>  __utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
>>>  __utmb=144473122.3.10.1459345119355; __utmc=144473122;
>>>  __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
>>>  __utmv=144473122.|1=AB%20Test=new-home-v1=1
>>>Referer: http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
>>>Upgrade-Insecure-Requests: 1
>>>User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
>>>
>>>HTTP/1.1 301 Moved Permanently
>>>Accept-Ranges: bytes
>>>Age: 0
>>>Cache-Control: private
>>>Connection: keep-alive
>>>Content-Length: 0
>>>Content-Type: text/html
>>>Date: Wed, 30 Mar 2016 13:38:45 GMT
>>>Location: http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
>>>Via: 1.1 varnish
>>>X-Cache: MISS
>>>X-Cache-Hits: 0
>>>x-debug-output: FHSession=mfzdaugt4nu11q3yfxfkjyox;
>>>  FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM; __utmt_UA-5815250-1=1;
>>>  __qca=P0-1359511593-1459345103148;
>>>  __utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
>>>  __utmb=144473122.3.10.1459345119355; __utmc=144473122;
>>>  __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
>>>  __utmv=144473122.|1=AB%20Test=new-home-v1=1
>>>X-Served-By: cache-lhr6334-LHR
>>>
>>>200 header: why is ATS not caching the octet stream despite having
>>>CONFIG proxy.config.http.cache.required_headers INT 1?
>>>
>>>GET /9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe HTTP/1.1
>>>Host: fs37.filehippo.com
>>>Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
>>>Accept-Encoding: gzip, deflate, sdch
>>>Accept-Language: en-US,en;q=0.8
>>>Cookie: __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148;
>>>  __utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
>>>  __utmb=144473122.3.10.1459345119355; __utmc=144473122;
>>>  __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
>>>  __utmv=144473122.|1=AB%20Test=new-home-v1=1
>>>Referer: http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
>>>Upgrade-Insecure-Requests: 1
>>>User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
>>>
>>>HTTP/1.1 200 OK
>>>Accept-Ranges: bytes
>>>Age: 739
>>>Connection: keep-alive
>>>Content-Length: 31367109
>>>Content-Type: application/octet-stream
>>>Date: Wed, 30 Mar 2016 13:26:43 GMT
>>>ETag: "81341be3a62d11:0"
>>>Last-Modified: Mon, 02 Aug 2016 06:34:21 GMT
>>>
>>>--
>>>Regards,
>>>Faisal.
>>>
>>>------ Original Message ------
>>>From: [email protected]
>>>To: [email protected]
>>>Sent: 8/2/2016 8:25:13 PM
>>>Subject: Re[2]: Dynamic content Caching challenges
>>>
>>>>Hi
>>>>I'm on my way, will catch you on IRC today.
>>>>--
>>>>Thanks
>>>>f.
>>>>
>>>>Tuesday, 02 August 2016, 03:48AM +05:00 from Gancho Tenev [email protected]:
>>>>
>>>>>Hi Faisal,
>>>>>
>>>>>I am probably missing background/details, since the configs you provided
>>>>>did not make enough sense to me.
>>>>>
>>>>>Maybe the following would work? (I have not tested it since I don't have
>>>>>enough info.)
>>>>>
>>>>>regex_map http://(fs[0-9]+).filehippo.com http://$1.filehippo.com @plugin=cachekey.so @pparam=--static-prefix=filehippo.com
>>>>>
>>>>>Could we just pick one use-case and work through it? For instance
>>>>>"filehippo.com".
>>>>>
>>>>>Since cacheurl has been deprecated, let us discuss cachekey configs.
>>>>>
>>>>>Please provide:
>>>>>- a few samples of filehippo.com URIs
>>>>>- the corresponding remap.config rule
>>>>>- and then please describe how you would like URIs to match the entries
>>>>>  in the cache
>>>>>so we can come up with the cachekey configs and take it from there.
>>>>>
>>>>>Cheers,
>>>>>-Gancho
>>>>>
>>>>>>On Apr 13, 2016, at 11:22 AM, Muhammad Faisal <[email protected]> wrote:
>>>>>>Hi,
>>>>>>I'm trying to get dynamic content cached by ATS. By dynamic I mean that
>>>>>>the URL for the actual content always changes, which wastes cache
>>>>>>storage and lowers the hit rate. As I understand it, I have two
>>>>>>challenges at the moment:
>>>>>>
>>>>>>1- Websites with a dynamic URL for the requested content (e.g. filehippo,
>>>>>>download.com etc.)
>>>>>>2- Streaming web sites where the dynamic URL has 206 (partial content)
>>>>>>
>>>>>>I tried both the cacheurl plugin and the cachekey plugin, but I could
>>>>>>not make the content cache friendly.
>>>>>>Below are the configs I have tried so far.
>>>>>>
>>>>>>The cachekey plugin config is done in the remap.config file as:
>>>>>>
>>>>>>regex_map http://(fs[0-9]+).filehippo.com http://$1.filehippo.com @plugin=cachekey.so
>>>>>>
>>>>>>CacheURL plugin config:
>>>>>>
>>>>>>http://.*[.]filehippo.com\/.*\/.*\/.*(\.exe) http://cdn.filehippo.com/$1
>>>>>>http://.*\.gear3rd.net\/.*\/.*\/(.*\.mp4) http://cdn..gear3rd.net/$1
>>>>>>http://(cw[0-9]+).gear3rd.net\/\/files\/videos\/.*\/.*\/(.*\.mp4) http://cdn.gear3rd.net/$1&$2
>>>>>>https?\:\/\/.*\/(.*\..*(mp4|3gp|flv))\?.* http://video-file.ats.internal/$1
>>>>>>
>>>>>>If someone has successfully configured the above scenario, please help
>>>>>>me out, as I don't have the programming background to deal with this
>>>>>>complexity.
>>>>>>
>>>>>>--
>>>>>>Regards,
>>>>>>Faisal.
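
Note for readers of this thread: the two --capture-path expressions quoted above can be tried offline before touching remap.config. The following is only a rough Python approximation, not the plugin's code. It assumes that the pattern is searched against the URI path without its leading slash, that the result is built only from the $N groups of the replacement, and that the final key is "/" + static prefix + "/" + captured path (this layout is inferred from the logged keys). The helper names capture_path and cache_key are made up for illustration, and Python's re module is not identical to PCRE, so treat a match here as a hint rather than proof.

```python
import re
from urllib.parse import urlsplit

# (pattern, replacement) pairs copied from the two --capture-path examples.
FILENAME_ONLY = (r"((?:\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$", "$2")
SKIP_LEADING = (
    r"((?:\/\w+){0,2}\/)((?:\w+\/)*)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$",
    "$2$3",
)

SAMPLE_URLS = [
    "http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe",
    "http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe",
    "http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe",
]


def capture_path(path, pattern, replacement):
    """Hypothetical helper: search `path`, then expand $N in `replacement`."""
    match = re.search(pattern, path)
    if match is None:
        return path  # assumption: an unmatched path is left unchanged
    # Unmatched optional groups are None; substitute an empty string for them.
    return re.sub(
        r"\$(\d)", lambda ref: match.group(int(ref.group(1))) or "", replacement
    )


def cache_key(url, rule, static_prefix="mydomain.com"):
    """Hypothetical helper: assumed key layout '/<prefix>/<captured path>'."""
    pattern, replacement = rule
    path = urlsplit(url).path.lstrip("/")  # assumption: no leading slash
    return "/" + static_prefix + "/" + capture_path(path, pattern, replacement)


if __name__ == "__main__":
    for url in SAMPLE_URLS:
        print(cache_key(url, FILENAME_ONLY), cache_key(url, SKIP_LEADING))
```

Under these assumptions the three sample URLs collapse to /mydomain.com/vlc-2.2.2-win64.exe with either rule, and the onemore/twomore variant yields /mydomain.com/twomore/vlc-2.2.2-win64.exe with the second rule, the same keys as in the logs quoted above. If the leading slash is kept instead, the second rule would keep onemore/ as well, so it is worth confirming the final key with the -T cachekey debug output.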
