I have been following this thread because I'm seeing the same results as Réjean, 
and I also originally raised TS-1767, which Réjean mentioned earlier in the thread.

I ran the same test described below to check that two different client URLs 
mapped to the same origin server cache only one object, and found that the 
proxy.config.url_remap.pristine_host_hdr setting affects the outcome.

If proxy.config.url_remap.pristine_host_hdr = 0 then the example below works 
and my issues with TS-1767 go away.
With proxy.config.url_remap.pristine_host_hdr = 1, I get an object cached 
for each client URL, and TS-1767 is an issue.
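
To confirm which of the two modes a given box is running in, the value can be read straight out of records.config. This is only a sketch: the helper name pristine_setting is made up for illustration, and the path in the usage comment is an assumption that depends on the install prefix.

```shell
# Print the configured value of proxy.config.url_remap.pristine_host_hdr
# from a records.config file given as $1.
# pristine_setting is a hypothetical helper name, not part of Traffic Server.
pristine_setting() {
  awk '$1 == "CONFIG" && $2 == "proxy.config.url_remap.pristine_host_hdr" { print $4 }' "$1"
}

# Usage (the path is an assumption; adjust it to your install prefix):
#   pristine_setting /usr/local/etc/trafficserver/records.config
```

A value of 0 corresponds to the case where only one object is cached in my tests, and 1 to the case where I get one object per client URL.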

Scott

-----Original Message-----
From: Leif Hedstrom [mailto:[email protected]] 
Sent: Friday, 21 June 2013 3:42 PM
To: Réjean Bouchard
Cc: [email protected]
Subject: Re: Want to get the original URL

On 6/20/13 2:46 PM, Réjean Bouchard wrote:
> The reason why I'm looking for this is simple.  TS keeps multiple 
> copies based on the inbound domain.  Here is a way to prove this:
>
> Create 2 domains, e.g. ts.mysite.com and ts2.mysite.com.
> Remap those domains to www.mysite.com
> Create test.txt file with the text "first file"
> Go to ts.mysite.com/test.txt  :  you will see "first file"
> Change test.txt content to "second file"
> Go to ts2.mysite.com/test.txt  :  you will see "second file"
> Clear browser cache


Well, that's not how it's designed to behave, and I cannot reproduce this in 
my own tests. This is what I have in my remap.config:

map http://ts1.example.com  http://localhost:82
map http://ts2.example.com  http://localhost:82


I cleared the cache ("sudo traffic_server -Cclear"), and started it up:

$ curl -D - -H "Host: ts1.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
HTTP/1.1 504 Not Cached

$ curl -D - -H "Host: ts2.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
HTTP/1.1 504 Not Cached

Neither request gives a cache hit. Now I allow it to cache for the 
ts1.example.com domain:

$ curl -D - -H "Host: ts1.example.com" http://localhost/test.txt
HTTP/1.1 200 OK


Then same tests as above:

$ curl -D - -H "Host: ts1.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
HTTP/1.1 200 OK

$ curl -D - -H "Host: ts2.example.com" -H "Cache-Control: only-if-cached" http://localhost/test.txt
HTTP/1.1 200 OK


I can also verify that both URLs give the same response. The Age: 
headers (a good indicator) are identical, and I see an origin request for 
only one of the requests.
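
The Age: comparison can be scripted roughly as follows. This is a sketch, not the exact commands used in the test above: age_of is a hypothetical helper name, and the curl invocations in the comment assume the same ts1/ts2 hosts and localhost setup as in this mail.

```shell
# Read raw HTTP response headers on stdin and print the Age value, if any.
# age_of is a hypothetical helper name used only for this illustration.
age_of() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "age" { print $2; exit }'
}

# Usage (assumes the ts1/ts2 remap rules and a proxy listening on localhost):
#   a1=$(curl -s -D - -o /dev/null -H "Host: ts1.example.com" http://localhost/test.txt | age_of)
#   a2=$(curl -s -D - -o /dev/null -H "Host: ts2.example.com" http://localhost/test.txt | age_of)
#   echo "ts1 Age=$a1 ts2 Age=$a2"
```

If both hosts are served from the same cache object, the two values should match, give or take a second for the time between the two requests.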


I have no idea why you are not getting this behavior. What you are 
experiencing is simply not how it works. A *wild* guess is that you are 
maybe doing Vary: on some headers, and that causes it to create 
different entries for various requests (which is as it should be).
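
One way to test the Vary: guess is to look at what the origin actually sends. A minimal sketch, assuming the same localhost setup as above; vary_fields is a hypothetical helper name.

```shell
# Read raw HTTP response headers on stdin and print each field named in a
# Vary header, one per line, lowercased. Prints nothing if Vary is absent.
# vary_fields is a hypothetical helper name used only for this illustration.
vary_fields() {
  tr -d '\r' \
    | awk -F': *' 'tolower($1) == "vary" { print tolower($2) }' \
    | tr ',' '\n' | tr -d ' ' | awk 'NF'
}

# Usage (assumes the ts1 remap rule and a proxy listening on localhost):
#   curl -s -D - -o /dev/null -H "Host: ts1.example.com" http://localhost/test.txt | vary_fields
```

An empty result would suggest that Vary: is not the cause of the duplicate entries.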

-- Leif



> Change test.txt content to "third file"
> Go to ts.mysite.com/test.txt  :  you will see "first file"
> Go to ts2.mysite.com/test.txt  :  you will see "second file"
> There is only one entry in the cache if you scan it with the regex search
>
> So, the reason you want to be able to see the original request URL is to
> be able to flush all the versions of test.txt.
> Let's say that you have 15,000,000 cached images that are generated by users
> and you want to purge from the cache every file that has some value in the
> URL (e.g. picture size 10X40).
> Flushing the complete cache for that purpose can be trivial.  On the other
> hand, having to generate a purge request for every image in the database is
> not optimal and can be a pain.
> Now, having the ability to purge by regex would be the best
> solution.
> I'm fixing the webUI for this purpose.  Since the system returns only the
> remapped URL, and it's not possible to purge a remapped URL, it's not very
> useful.  I tried the HTTPInfo->request_url_get() function, but it returns
> nothing, so I decided to ask here where the info is.
>
> So, what would you think if I fixed TS so that this information is available
> from the function?  Do you see a reason why not?
>
>
> Réjean Bouchard
> Nexweb
>
>
> -----Original Message-----
> From: Leif Hedstrom [mailto:[email protected]]
> Sent: 20 June 2013 10:42
> To: [email protected]
> Cc: Réjean Bouchard
> Subject: Re: Want to get the original URL
>
> On 6/20/13 6:49 AM, Réjean Bouchard wrote:
>> 4 - Finally, this is the same problem when we check the checkbox and
>> try to click on the "DELETE" button.
>>
>>
>>
>> So can anybody tell me where I can find those original URLs?
>
> Once in the cache, you can not track it back to the "original URL" (I'm
> fairly certain at least). There's a simple reason for this: There are no
> guarantees of a 1-to-1 mapping. It's entirely possible, and sometimes
> likely, that 1,000 URLs can map to the same cache URL. Or 1,000,000
> URLs...
>
> If this is important to you, you can log both the pristine and remapped URL,
> and build up some sort of relationship in an external system.
>
> Cheers,
>
> -- Leif
>
>
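
The "log both URLs and build the relationship externally" suggestion quoted above could be prototyped along these lines. This is a sketch under stated assumptions: the log is assumed to have two whitespace-separated columns (pristine URL, then remapped URL), which is a made-up layout for illustration, and pristine_for is a hypothetical helper name.

```shell
# Given a regex ($1) and a log file ($2) with two whitespace-separated
# columns (pristine URL, remapped URL), print every distinct pristine URL
# whose remapped URL matches the regex.
# The two-column layout and the name pristine_for are assumptions.
pristine_for() {
  awk -v re="$1" '$2 ~ re && !seen[$1]++ { print $1 }' "$2"
}

# Usage (urls.log is a hypothetical file in the layout described above):
#   pristine_for '_10x40[.]png$' urls.log
```

Each URL it prints is a candidate for an individual purge request, which avoids both flushing the whole cache and walking the full image database.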

