I have been following this thread because I'm seeing the same results as Réjean. I also originally raised TS-1767, which Réjean mentioned earlier in the thread.
Running the same test as below, to validate that two different client URLs mapped to the same origin server cache only one object, I have found that changing proxy.config.url_remap.pristine_host_hdr affects the outcome.

If proxy.config.url_remap.pristine_host_hdr = 0, the example below works and my issues with TS-1767 go away.

With proxy.config.url_remap.pristine_host_hdr = 1, I get an object cached for each client URL, and TS-1767 is an issue.

Scott

-----Original Message-----
From: Leif Hedstrom [mailto:[email protected]]
Sent: Friday, 21 June 2013 3:42 PM
To: Réjean Bouchard
Cc: [email protected]
Subject: Re: Want to get the original URL

On 6/20/13 2:46 PM, Réjean Bouchard wrote:
> The reason why I'm looking for this is simple. TS keeps multiple
> copies based on the inbound domain. Here is a way to prove this concept:
>
> Create 2 domains, e.g. ts.mysite.com and ts2.mysite.com.
> Remap those domains to www.mysite.com
> Create test.txt file with the text "first file"
> Go to ts.mysite.com/test.txt : you will see "first file"
> Change test.txt content to "second file"
> Go to ts2.mysite.com/test.txt : you will see "second file"
> Clear browser cache

Well, that's not how it's designed to behave, and I can not reproduce this in my own tests. This is what I have in my remap.config

    map http://ts1.example.com http://localhost:82
    map http://ts2.example.com http://localhost:82

I cleared the cache ("sudo traffic_server -Cclear"), and started it up:

    $ curl -D - -H "Host: ts1.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
    HTTP/1.1 504 Not Cached

    $ curl -D - -H "Host: ts2.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
    HTTP/1.1 504 Not Cached

Neither request gives a cache hit.

Now I allow it to cache for the ts1.example.com domain:

    $ curl -D - -H "Host: ts1.example.com" http://localhost/test.txt
    HTTP/1.1 200 OK

Then the same tests as above:

    $ curl -D - -H "Host: ts1.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
    HTTP/1.1 200 OK

    $ curl -D - -H "Host: ts2.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
    HTTP/1.1 200 OK

I can also verify that both URLs give the same response. The Age: headers (a good indicator) are identical, and I do not see an origin request for more than one request.

I have no idea why you are not getting this behavior. What you are experiencing is simply not how it works. A *wild* guess is that you are maybe doing Vary: on some headers, and that causes it to create different entries for various requests (which is as it should).

-- Leif

> Change test.txt content to "third file"
> Go to ts.mysite.com/test.txt : you will see "first file"
> Go to ts2.mysite.com/test.txt : you will see "second file"
> There is only one entry in cache if you scan it from regex search
>
> So, the reason why you want to be able to see the original URL request is to
> be able to flush all the versions of test.txt.
> Let's say that you have 15,000,000 cached images that are generated by users
> and you want to purge the cache of every file that has some values in the
> URL (e.g. picture size 10X40).
> Flushing the complete cache for that purpose can be trivial. On the other
> hand, having to generate a purge request for every image in the database is
> not the optimal way and can be a pain.
> Now, having the ability to purge from a regex can be the optimal and the
> best solution.
> I'm fixing the webUI for this purpose. And since the system returns only the
> remapped URL and it's not possible to purge a remapped URL, it's not very
> useful. I tried the HTTPInfo->request_url_get() function, but it returns
> nothing, so I decided to ask here where the info was.
>
> So, what would you think if I fixed TS so that this information is available
> through the function? Do you see a reason why not?
>
>
> Réjean Bouchard
> Nexweb
>
>
> -----Original Message-----
> From: Leif Hedstrom [mailto:[email protected]]
> Sent: 20 June 2013 10:42
> To: [email protected]
> Cc: Réjean Bouchard
> Subject: Re: Want to get the original URL
>
> On 6/20/13 6:49 AM, Réjean Bouchard wrote:
>> 4 - Finally, this is the same problem when we check the checkbox and
>> try to click on the "DELETE" button.
>>
>> So can anybody tell me where I can find those original URLs?
>
> Once in the cache, you can not track it back to the "original URL" (I'm
> fairly certain at least). There's a simple reason for this: There are no
> guarantees of a 1-to-1 mapping. It's entirely possible, and sometimes
> likely, that 1,000 URLs can map to the same cache URL. Or 1,000,000
> URLs...
>
> If this is important to you, you can log both the pristine and remapped URL,
> and build up some sort of relationship in an external system.
>
> Cheers,
>
> -- Leif
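
Leif's suggestion of logging both URLs and building the relationship in an external system could be sketched roughly as follows. This is only an illustration under assumptions: it presumes a log in which each line holds the pristine URL and the remapped URL separated by whitespace (a hypothetical format, not a Traffic Server default), and the function names build_url_index and pristine_urls_matching are made up for the example.

```python
import re
from collections import defaultdict


def build_url_index(log_lines):
    """Map each remapped (cache) URL to the set of pristine URLs seen for it.

    Assumes each line is "<pristine_url> <remapped_url>"; this layout is a
    hypothetical log format chosen for the sketch.
    """
    index = defaultdict(set)
    for line in log_lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        pristine, remapped = fields
        index[remapped].add(pristine)
    return index


def pristine_urls_matching(index, pattern):
    """Return the sorted pristine URLs that match the given regex."""
    regex = re.compile(pattern)
    return sorted(
        url for urls in index.values() for url in urls if regex.search(url)
    )
```

With such an index, the regex-based purge Réjean describes reduces to selecting the pristine URLs that match (for example r"/10x40/") and issuing a purge request for each one. The many-to-one relationship Leif mentions shows up as several pristine URLs stored under one remapped key.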

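
For anyone who wants to repeat Leif's only-if-cached check from a script rather than by hand, a minimal sketch is below. It assumes the proxy listens on http://localhost and that test.txt exists on the origin, as in his example; build_probe and probe_status are hypothetical helper names, not part of any Traffic Server tooling.

```python
import urllib.error
import urllib.request


def build_probe(host, path="/test.txt", proxy="http://localhost"):
    """Build a request asking the proxy for a cache-only answer for `host`."""
    return urllib.request.Request(
        proxy + path,
        headers={"Host": host, "Cache-Control": "only-if-cached"},
    )


def probe_status(host, **kwargs):
    """Send the cache-only probe and return the HTTP status code."""
    try:
        with urllib.request.urlopen(build_probe(host, **kwargs), timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error statuses, such as the 504 seen when nothing is cached,
        # arrive as HTTPError; the code is still the answer we want.
        return err.code
```

Calling probe_status("ts1.example.com") and probe_status("ts2.example.com") before and after a normal fetch of one of the two hosts should show whether both client URLs share a single cache entry. Running the pair once with proxy.config.url_remap.pristine_host_hdr set to 0 and once with it set to 1 would make the difference Scott describes easy to compare.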