Hi,
I tried the following config, but I am not sure whether the implementation is correct, because I can see the downloads are still being served from the origin. Also, when I ran sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key', nothing appeared either on the console or in diags.log.

regex_map http://(\w+[0-9]+).filehippo.com  \
          http://$1.filehippo.com \
              @plugin=cachekey.so \
              @pparam=--capture-path=/(http:\/\/)([0-9-aA-zZ]+)\.(filehippo\.com)\/(.*)\/([aA-zZ0-9-._]+)(.*)?$/$3$5/ \
              @pparam=--static-prefix=filehippo.com
PS: The origin has changed the URL pattern, which is why the regexp was modified after testing it on regex101.com.
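
A quick offline check of this capture definition might look like the sketch below. It is a hedged observation, not a confirmed diagnosis: the cachekey documentation describes --capture-path as capturing from the URI path, so the sketch assumes the regex is applied to the path only, while the pattern starts with a literal "http://". Python's re module stands in for the PCRE engine, and the host fs37.filehippo.com is taken from the Location header quoted further down the thread.

```python
import re

# Regex part of the --capture-path definition above; "$3$5" is its replacement.
PATTERN = r"(http:\/\/)([0-9-aA-zZ]+)\.(filehippo\.com)\/(.*)\/([aA-zZ0-9-._]+)(.*)?$"

FULL_URL = "http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe"
# Assumption: the plugin hands only the path to the regex, without scheme or host.
PATH_ONLY = "9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe"

ON_URL = re.search(PATTERN, FULL_URL)
ON_PATH = re.search(PATTERN, PATH_ONLY)

# Expand "$3$5" by hand for the subject that does match.
EXPANDED = ON_URL.group(3) + ON_URL.group(5) if ON_URL else None

print("full URL :", EXPANDED)
print("path only:", "match" if ON_PATH else "no match")
```

If the path-only assumption holds, the pattern matches on regex101.com when a full URL is pasted in, but would never match what the plugin actually sees.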
--
Regards,
Faisal.



------ Original Message ------
From: [email protected]
To: "Gancho Tenev" <[email protected]>
Cc: "Users" <[email protected]>
Sent: 8/6/2016 12:14:10 PM
Subject: Re[2]: Dynamic content Caching challenges

Thanks a lot, Gancho. The guidelines you provided are really helpful. I will share feedback after implementation.

--
Thanks
f.

Saturday, 06 August 2016, 03:45AM +05:00 from Gancho Tenev [email protected]:

Hello Faisal,

You could do it in many different ways, using different cachekey params, a number of different regex expressions to match, so you would have to experiment a little bit.

Please find a couple of quick examples below.

If you would like to match all of the following URIs while ignoring the first part of the host name and the path up to the filename (to remove the "random" parts):


URL: http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
URL: http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe
URL: http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe

to the same cache key '/mydomain.com/vlc-2.2.2-win64.exe', you could do something like the following:


$ cat remap.config
regex_map http://(fs[0-9]+).mydomain.com  \
          http://$1.mydomain.com \
              @plugin=cachekey.so \
              @pparam=--capture-path=/((?:\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/$2/ \
              @pparam=--static-prefix=mydomain.com

$ curl -x 127.0.0.1:80 \
    http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe \
    http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe \
    http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe

$ sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key'
[Aug 5 15:24:37.491] Server {0x7f042ace0700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
[Aug 5 15:24:37.575] Server {0x7f042ace0700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
[Aug 5 15:24:37.625] Server {0x7f042ace0700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
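
The capture above can also be checked offline before touching the server. The following is a minimal Python sketch, not the plugin's implementation: build_key is a hypothetical helper, Python's re module stands in for the PCRE engine, and the subject is assumed to be the URL path without its leading slash. The "/<static-prefix>/<capture>" layout is inferred from the logged keys above.

```python
import re

# Capture definition from the remap rule above: /<regex>/<replacement>/
PATTERN = r"((?:\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$"
REPLACEMENT = "$2"
STATIC_PREFIX = "mydomain.com"


def build_key(path):
    """Hypothetical helper: emulate the key layout seen in the debug log."""
    match = re.search(PATTERN, path)
    if match is None:
        return None  # the plugin's behaviour on a non-match is not modelled here
    # Expand each $N reference with the corresponding capture group.
    captured = re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))) or "", REPLACEMENT)
    return "/" + STATIC_PREFIX + "/" + captured


# Assumption: the regex sees the path without scheme, host, or leading slash.
PATHS = [
    "9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe",
    "9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe",
    "9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe",
]
KEYS = [build_key(p) for p in PATHS]

for path, key in zip(PATHS, KEYS):
    print(path, "->", key)
```

All three paths collapse to the same key, which matches the three identical 'finalizing cache key' lines in the log.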


If you would like to ignore only the first 2 parts of the path, you could do something like:

@pparam=--capture-path=/((?:\/\w+){0,2}\/)((?:\w+\/)*)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/$2$3/ \

and then

$ curl -x 127.0.0.1:80 \
    http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe \
    http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe \
    http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe

would produce:

$ sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key'
[Aug 5 15:35:43.574] Server {0x7f6784d43700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
[Aug 5 15:35:43.660] Server {0x7f6784d43700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'
[Aug 5 15:35:43.708] Server {0x7f6784d43700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/vlc-2.2.2-win64.exe'


and

$ curl -x 127.0.0.1:80 \
    http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/onemore/twomore/vlc-2.2.2-win64.exe \
    http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/onemore/twomore/vlc-2.2.2-win64.exe \
    http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/onemore/twomore/vlc-2.2.2-win64.exe

$ sudo ./bin/traffic_server -T cachekey 2>&1 | grep 'finalizing cache key'
[Aug 5 15:36:41.095] Server {0x7f6784c42700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/twomore/vlc-2.2.2-win64.exe'
[Aug 5 15:36:41.179] Server {0x7f6784c42700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/twomore/vlc-2.2.2-win64.exe'
[Aug 5 15:36:41.232] Server {0x7f6784c42700} DIAG: (cachekey) cachekey.cc:594:finalize() finalizing cache key '/mydomain.com/twomore/vlc-2.2.2-win64.exe'
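
The {0,2} variant can be traced offline in the same spirit. This is a rough Python sketch under stated assumptions, not the plugin's code: build_key is a hypothetical helper, Python's re module approximates PCRE, and the key layout "/<static-prefix>/<$2$3>" is inferred from the log lines above. The sketch tries the subject both without and with a leading slash, since that detail changes where the unanchored match starts.

```python
import re

# Regex part of the capture definition above; the replacement part is "$2$3".
PATTERN = r"((?:\/\w+){0,2}\/)((?:\w+\/)*)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$"
STATIC_PREFIX = "mydomain.com"


def build_key(subject):
    """Hypothetical helper: "/<static-prefix>/<$2$3>" as seen in the debug log."""
    match = re.search(PATTERN, subject)
    if match is None:
        return None  # the plugin's behaviour on a non-match is not modelled here
    return "/" + STATIC_PREFIX + "/" + match.group(2) + match.group(3)


SHORT = "9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe"
LONG = "9546/46cfd241f1da4ae9812f512f7b36643c/onemore/twomore/vlc-2.2.2-win64.exe"

SHORT_KEY = build_key(SHORT)
LONG_KEY = build_key(LONG)                   # subject without a leading slash
LONG_KEY_WITH_SLASH = build_key("/" + LONG)  # same path with a leading slash

print(SHORT_KEY)
print(LONG_KEY)
print(LONG_KEY_WITH_SLASH)
```

Only the subject without a leading slash reproduces the logged key ending in 'twomore/vlc-2.2.2-win64.exe'. In that case the match starts at the first slash, so the first group consumes the hash directory and 'onemore', and the pattern drops one directory more than "the first 2 parts" would suggest.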


Again, these are just examples, and you would have to experiment more to make sure things work and fit well (and to finalize the regex).

You can find the latest documentation here:
https://docs.trafficserver.apache.org/en/latest/admin-guide/plugins/cachekey.en.html
(plus an example on how to test w/o running the server in debug mode)

And play with regex on sites like this: https://regex101.com


HTH!

Cheers,
—Gancho

On Aug 5, 2016, at 11:34 AM, [email protected] wrote:

Hi Gancho,
Just to follow up on my request about cachekey.

--
Thanks
f.

Wednesday, 03 August 2016, 00:45AM +05:00 from Muhammad Faisal [email protected]:

Hi Gancho,
Let me explain the CDN content challenge that I have been trying to deal with for the last couple of months. I believe the cachekey plugin in v6.2 can help.

URL: http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
URL: http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe
URL: http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe

The random string in the URL ("/9546/46cfd241f1da4ae9812f512f7b36643c") and the origin server ("fs37", "fs31") keep changing. Because of this, the same object is cached again every time a user downloads the file.

I want to store the object once and deliver it to every client. For this I need to normalize the cache key and eliminate the random strings from it; that way I would avoid duplicate cache objects and increase the cache hit ratio (please correct me if I'm wrong).
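
The effect of this normalization on duplication can be illustrated with a small Python sketch. It is only an illustration of the idea: normalized_key is a hypothetical function that drops the first host label and everything in the path except the filename, and it is not how ATS computes cache keys internally.

```python
from urllib.parse import urlsplit

URLS = [
    "http://fs37.mydomain.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe",
    "http://fs35.mydomain.com/9546/56dafac241f1da4ae9812f512f756bccf/vlc-2.2.2-win64.exe",
    "http://fs31.mydomain.com/9546/436fd241f1da4ae9812f512f7b55667c/vlc-2.2.2-win64.exe",
]


def normalized_key(url):
    """Hypothetical normalization: keep the domain and the filename only."""
    parts = urlsplit(url)
    domain = parts.hostname.split(".", 1)[1]   # drop the changing "fsNN" label
    filename = parts.path.rsplit("/", 1)[-1]   # drop the random directories
    return "/" + domain + "/" + filename


RAW_KEYS = set(URLS)                                  # one entry per distinct URL
NORMALIZED_KEYS = {normalized_key(u) for u in URLS}   # one entry per file

print("entries keyed by raw URL:   ", len(RAW_KEYS))
print("entries keyed by normalized:", len(NORMALIZED_KEYS))
```

Keyed by raw URL, the three downloads occupy three cache entries; keyed by the normalized form they share one, so the second and third requests can be hits.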

E.g., the filehippo site has the sequence below:

When I click the download button there are two requests: the first returns a 301 (with a Location header for the requested content) and the second returns a 200:


GET /download/file/6853a2c840eaefd1d7da43d6f2c94863adc5f470927402e6518d70573a99114d/ HTTP/1.1
Host: filehippo.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: FHSession=mfzdaugt4nu11q3yfxfkjyox; FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM; __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148; __utma=144473122.1934842269.1459345103.1459345103.1459345103.1; __utmb=144473122.3.10.1459345119355; __utmc=144473122; __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=144473122.|1=AB%20Test=new-home-v1=1
Referer: http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36

HTTP/1.1 301 Moved Permanently
Accept-Ranges: bytes
Age: 0
Cache-Control: private
Connection: keep-alive
Content-Length: 0
Content-Type: text/html
Date: Wed, 30 Mar 2016 13:38:45 GMT
Location: http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
Via: 1.1 varnish
X-Cache: MISS
X-Cache-Hits: 0
x-debug-output: FHSession=mfzdaugt4nu11q3yfxfkjyox; FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM; __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148; __utma=144473122.1934842269.1459345103.1459345103.1459345103.1; __utmb=144473122.3.10.1459345119355; __utmc=144473122; __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=144473122.|1=AB%20Test=new-home-v1=1
X-Served-By: cache-lhr6334-LHR

200 Header: Why is ATS not caching the octet stream despite having CONFIG proxy.config.http.cache.required_headers INT 1?

GET /9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe HTTP/1.1
Host: fs37.filehippo.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148; __utma=144473122.1934842269.1459345103.1459345103.1459345103.1; __utmb=144473122.3.10.1459345119355; __utmc=144473122; __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=144473122.|1=AB%20Test=new-home-v1=1
Referer: http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 739
Connection: keep-alive
Content-Length: 31367109
Content-Type: application/octet-stream
Date: Wed, 30 Mar 2016 13:26:43 GMT
ETag: "81341be3a62d11:0"
Last-Modified: Mon, 02 Aug  2016 06:34:21 GMT



--
Regards,
Faisal.



------ Original Message ------
From: [email protected]
To: [email protected]
Sent: 8/2/2016 8:25:13 PM
Subject: Re[2]: Dynamic content Caching challenges

Hi
I'm on my way; I will catch you on IRC today.

--
Thanks
f.

Tuesday, 02 August 2016, 03:48AM +05:00 from Gancho Tenev [email protected]:

Hi Faisal,

I am probably missing background/details since the configs you provided did not make enough sense to me.

Maybe the following would work? (have not tested it since I don’t have enough info)

regex_map http://(fs[0-9]+).filehippo.com http://$1.filehippo.com/ @plugin=cachekey.so @pparam=--static-prefix=filehippo.com


Could we just pick one use-case and work through it? For instance “filehippo.com”.

Since cacheurl has been deprecated let us discuss cachekey configs.

Please provide:
- a few samples of filehippo.com URIs
- the corresponding remap.config rule
- and then please describe how you would like URIs to match the entries in the cache, so we can come up with the cachekey configs and take it from there.

Cheers,
—Gancho



On Apr 13, 2016, at 11:22 AM, Muhammad Faisal <[email protected]> wrote:

Hi,
I'm trying to get dynamic content cached by ATS. By dynamic I mean that the URL for the actual content always changes, which results in wasted cache storage and a low hit rate. As I understand it, I have two challenges at the moment:

1- Websites with a dynamic URL for the requested content (e.g. filehippo, download.com, etc.)
2- Streaming websites where the dynamic URL returns 206 (partial content)

I tried the cacheurl plugin as well as the cachekey plugin, but I couldn't make the content cache-friendly. Below are the configs I have tried so far:

The cachekey plugin config in remap.config is:

regex_map http://(fs[0-9]+).filehippo.com http://$1.filehippo.com/ @plugin=cachekey.so

CacheURL plugin config:

http://.*[.]filehippo.com\/.*\/.*\/.*(\.exe) http://cdn.filehippo.com/$1
http://.*\.gear3rd.net\/.*\/.*\/(.*\.mp4) http://cdn..gear3rd.net/$1
http://(cw[0-9]+).gear3rd.net\/\/files\/videos\/.*\/.*\/(.*\.mp4) http://cdn.gear3rd.net/$1&$2
https?\:\/\/.*\/(.*\..*(mp4|3gp|flv))\?.* http://video-file.ats.internal/$1

If someone has successfully configured the above scenario, please help me out, as I don't have the programming background to deal with this complexity.

--
Regards,
Faisal.
