Re: ATS 9.2.3 and cache bypass with parent

2024-02-06 Thread Veiko Kukk
Unfortunately, requests still go through the parent even when go_direct=true.

1707218323.388 18 127.0.0.1 TCP_MISS/400 543 GET
https://origin_server.tld/full/path/to/test/file.txt -
PARENT_HIT/x.x.x.x text/html

Where should I submit a bug report? How can I work around this?
How could I log more debug information about where the wrong decision
to send this request to the parent is made?
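
(For reference, one way to get that debug output, assuming the standard
diags settings and the parent_select debug tag in ATS 9.x, would be to
set in records.config:

CONFIG proxy.config.diags.debug.enabled INT 1
CONFIG proxy.config.diags.debug.tags STRING parent_select|http

and then watch traffic.out / diags.log for the parent-selection
decisions on that request.)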

ATS 9.2.3

Thanks in advance,
Veiko

On Thu, 21 Dec 2023 at 18:22, John Rushford () wrote:
>
> That’s correct, go_direct=true
>
> > On Dec 21, 2023, at 9:03 AM, Veiko Kukk  wrote:
> >
> >> We have a test URL located on the upstream that we query regularly
> >> to check whether the upstream storage is available and how fast it
> >> responds. This test URL is obviously excluded from caching in ATS.
> >>
> >> In cache.config:
> >> url_regex=.*/full/path/to/test/file.txt$ action=never-cache
> >
> > I had missed the relevant section from parent.config in my previous message.
> >
> > url_regex=.*/full/path/to/test/file.txt$ go_direct=true
> >
> > I'm pretty sure that with this config, requests are not supposed to go
> > through the parent...
> >
> > Veiko
>


Re: ATS 9.2.3 and cache bypass with parent

2023-12-21 Thread Veiko Kukk
> We have a test URL located on the upstream that we query regularly
> to check whether the upstream storage is available and how fast it
> responds. This test URL is obviously excluded from caching in ATS.
>
> In cache.config:
> url_regex=.*/full/path/to/test/file.txt$ action=never-cache

I had missed the relevant section from parent.config in my previous message.

url_regex=.*/full/path/to/test/file.txt$ go_direct=true

I'm pretty sure that with this config, requests are not supposed to go
through the parent...

Veiko


ATS 9.2.3 and cache bypass with parent

2023-12-19 Thread Veiko Kukk
Hi

Please help me understand how bypass should work with ATS 9. It
worked well with ATS 7, but apparently something changed that I
didn't see in the changelogs before upgrading to 9.2.

We have a test URL located on the upstream that we query regularly
to check whether the upstream storage is available and how fast it
responds. This test URL is obviously excluded from caching in ATS.

In cache.config:
url_regex=.*/full/path/to/test/file.txt$ action=never-cache

Logs for that URL show:
1702982167.173 493 127.0.0.1 TCP_MISS/200 637 GET
https://storage.hostname.tld//full/path/to/test/file.txt -
DIRECT/storage.hostname.tld text/plain

Do I understand correctly that this config and log indicate that
requests against this URL do not go through the parent?
Doesn't "DIRECT/storage.hostname.tld" in the logs indicate that ATS
itself believes it accesses the upstream directly and not through the
parent?

If yes, then why does disabling the parent reduce request time from
400-500 ms to under 100 ms? Is my config not correct for ATS 9.x.x?

What must be added to the ATS 9.2 configuration to avoid passing the
test URL through the parent?
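
(For reference, the parent.config rule that was eventually added, per
the follow-up higher up in this archive, was:

url_regex=.*/full/path/to/test/file.txt$ go_direct=true

which, per the parent.config documentation, should send matching
requests directly to the origin rather than through the parent.)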

best regards,
Veiko


Re: ATS 9.2.3 and proxy.config.http.per_server.connection.min

2023-10-25 Thread Veiko Kukk
The link does not work.

On Tue, 24 Oct 2023 at 18:18, Mo Chen () wrote:
>
> I've created a Github issue to track this: 
> https://github.pie.apple.com/ats/trafficserver/issues/672


ATS 9.2.3 and proxy.config.http.per_server.connection.min

2023-10-23 Thread Veiko Kukk
Hi

I'm upgrading from 7.1 to 9.2.3 and trying to keep a certain number
of connections open between ATS and the upstream server to avoid
connection-creation overhead.

In 7.1 this worked well:
CONFIG proxy.config.http.origin_min_keep_alive_connections INT 50

In 9.2.3, enabling
CONFIG proxy.config.http.per_server.connection.min INT 50
results in many error lines in diags.log:

[Oct 23 16:26:59.192] [ET_NET 0] ERROR: [http_ss] [188] number of
connections should be greater than or equal to zero: 4294967295
[Oct 23 16:27:15.309] [ET_NET 0] ERROR: [http_ss] [210] number of
connections should be greater than or equal to zero: 4294967295
[Oct 23 16:27:29.543] [ET_NET 1] ERROR: [http_ss] [203] number of
connections should be greater than or equal to zero: 4294967295
[Oct 23 16:27:59.336] [ET_NET 0] ERROR: [http_ss] [219] number of
connections should be greater than or equal to zero: 4294967295
[Oct 23 16:28:15.683] [ET_NET 0] ERROR: [http_ss] [241] number of
connections should be greater than or equal to zero: 4294967295
[Oct 23 16:28:31.506] [ET_NET 1] ERROR: [http_ss] [234] number of
connections should be greater than or equal to zero: 4294967295

Connections through the proxy still work even though these strange
errors are logged. (4294967295 is 2^32 - 1, i.e. an unsigned -1, which
suggests a counter underflow somewhere.)

Veiko


Re: RAM cache on NVMe

2023-04-10 Thread Veiko Kukk
Sharding based on domain won't work for us. We use OVH Swift as the
backend, and there is no option to redistribute based on size or
domain name.
I googled around before writing my first message and was happy to find
this:
https://issues.apache.org/jira/browse/TS-1728?focusedCommentId=13635926=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13635926

It's exactly what we would need. Did it not get implemented 10 years ago?

I believe our situation is not unique; the need to distribute based
on storage type and object size should be quite common, considering the
differences between SSDs and HDDs. SSDs are still too small to provide
proper CDN node cache capacity, and HDDs are too slow at seeking for
small files.
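
(A minimal sketch of Leif's sharding suggestion, quoted below, with
hypothetical hostnames; volume.config and hosting.config syntax as in
the ATS docs:

volume.config:
volume=1 scheme=http size=20%
volume=2 scheme=http size=80%

hosting.config:
hostname=small.objects.example.com volume=1
hostname=large.objects.example.com volume=2

Objects for each hostname would then be cached only on the assigned
volume.)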

Veiko


On Mon, 10 Apr 2023 at 19:23, Leif Hedstrom () wrote:
>
> I don’t think you can have such control, at least not now. I think the best 
> you could do is to shard your content (small vs large) into two (or more) 
> domains, and then you can assign volumes based on those names.
>
> — Leif
>
> From hosting.config:
>
>
> #   Primary destination specifiers are
> # domain=
> # hostname=
>
>
>
>
> > On Apr 10, 2023, at 5:51 AM, Veiko Kukk  wrote:
> >
> > Hi
> >
> > We are currently using Nginx in front of ATS to store hot content on
> > NVMes, but we would like to drop it and use only ATS. ATS uses full
> > SATA HDDs to store its content (about 150 TB per node), and the ATS
> > RAM cache is disabled entirely.
> >
> > From reading the ATS documentation, I only found how to enable the RAM
> > cache for objects smaller than x, but nothing about how to create a
> > volume for smaller files on actual storage devices. The idea here is
> > that HDDs are better suited (better performing) for larger objects
> > that are read sequentially, and SSDs for smaller files, because the
> > seek penalty isn't as high with solid-state drives as with rotating
> > media.
> >
> > How could I create a volume over NVMe drives that would only store
> > files smaller than size x?
> >
> > Thanks in advance,
> > Veiko
>


RAM cache on NVMe

2023-04-10 Thread Veiko Kukk
Hi

We are currently using Nginx in front of ATS to store hot content on
NVMes, but we would like to drop it and use only ATS. ATS uses full
SATA HDDs to store its content (about 150 TB per node), and the ATS RAM
cache is disabled entirely.

From reading the ATS documentation, I only found how to enable the RAM
cache for objects smaller than x, but nothing about how to create a
volume for smaller files on actual storage devices. The idea here is
that HDDs are better suited (better performing) for larger objects that
are read sequentially, and SSDs for smaller files, because the seek
penalty isn't as high with solid-state drives as with rotating media.
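
(The RAM-cache side of this is just two records.config knobs; a sketch
with hypothetical sizes, using settings that also appear later in this
archive:

CONFIG proxy.config.cache.ram_cache.size INT 8589934592
CONFIG proxy.config.cache.ram_cache_cutoff INT 4194304

ram_cache.size caps the RAM cache at 8 GB here, and ram_cache_cutoff
keeps objects larger than 4 MB out of it. There appears to be no
equivalent per-object-size selector for disk volumes.)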

How could I create a volume over NVMe drives that would only store
files smaller than size x?

Thanks in advance,
Veiko


Re: Understanding _total_response_bytes statistics

2020-06-25 Thread Veiko Kukk
Hi again!

I wonder if such statistics are considered normal?
Should I open an issue on GitHub instead of writing here?

Veiko

On Wed, 3 Jun 2020 at 14:48, Veiko Kukk () wrote:
>
> Hi,
>
> ATS 7.1.2, parent routing enabled. Everything goes through the parent
> except one test URL request with a very small response size (20
> bytes).
>
> To calculate ATS cache byte hit ratio, I'm using following metrics:
> "proxy.node.http.user_agent_total_response_bytes": "363046018479",
> "proxy.node.http.origin_server_total_response_bytes": "359470059035",
> "proxy.node.http.parent_proxy_total_response_bytes": "356244228124",
>
> I wonder how it is possible that (parent_proxy_total_response_bytes +
> origin_server_total_response_bytes) is bigger than
> user_agent_total_response_bytes? Everything going to the user agent
> (client) must come from either the local cache, the parent, or the
> origin. This looks like a stats bug to me. Maybe it is even fixed in
> some newer ATS version?
>
> > According to my understanding, only responses coming directly from
> > the origin should be included in
> > proxy.node.http.origin_server_total_response_bytes, and only responses
> > coming from the parent should be included in
> > proxy.node.http.parent_proxy_total_response_bytes.
>
> Best regards,
> Veiko


Understanding _total_response_bytes statistics

2020-06-03 Thread Veiko Kukk
Hi,

ATS 7.1.2, parent routing enabled. Everything goes through the parent
except one test URL request with a very small response size (20
bytes).

To calculate the ATS cache byte hit ratio, I'm using the following metrics:
"proxy.node.http.user_agent_total_response_bytes": "363046018479",
"proxy.node.http.origin_server_total_response_bytes": "359470059035",
"proxy.node.http.parent_proxy_total_response_bytes": "356244228124",

I wonder how it is possible that (parent_proxy_total_response_bytes +
origin_server_total_response_bytes) is bigger than
user_agent_total_response_bytes? Everything going to the user agent
(client) must come from either the local cache, the parent, or the
origin. This looks like a stats bug to me. Maybe it is even fixed in
some newer ATS version?

According to my understanding, only responses coming directly from the
origin should be included in
proxy.node.http.origin_server_total_response_bytes, and only responses
coming from the parent should be included in
proxy.node.http.parent_proxy_total_response_bytes.

Best regards,
Veiko


Re: CPU load at idle

2019-12-18 Thread Veiko Kukk
Hi

Before I try anything, I need to understand what it means.
What does this option do? What polling? "net" indicates it has
something to do with the network, but what does it poll on the network?
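
(The setting behind that link is presumably proxy.config.net.poll_timeout,
the timeout in milliseconds that each network thread passes to its
epoll_wait()/kevent() call; with a short timeout, idle threads wake up
constantly, which is where idle CPU load can come from. A sketch of the
suggested change in records.config:

CONFIG proxy.config.net.poll_timeout INT 60

Raising it should lower idle CPU at the cost of slightly higher wakeup
latency.)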

Veiko

On Sun, 15 Dec 2019 at 20:25, Shu Kit Chan  wrote:
>
> Perhaps try this out? -
> https://docs.trafficserver.apache.org/en/latest/admin-guide/performance/index.en.html#polling-timeout
>
> On Fri, Dec 13, 2019 at 6:07 AM Veiko Kukk  wrote:
> >
> > Hi
> >
> > ATS 7.1.2, Centos 7.7.
> > I'm seeing 15-16% CPU load while there is no traffic at all.
> > proxy.process.cache.bytes_used 23934931615744
> > proxy.process.cache.bytes_total 23935065833472
> >
> > Is that normal?
> >
> > Veiko


CPU load at idle

2019-12-13 Thread Veiko Kukk
Hi

ATS 7.1.2, Centos 7.7.
I'm seeing 15-16% CPU load while there is no traffic at all.
proxy.process.cache.bytes_used 23934931615744
proxy.process.cache.bytes_total 23935065833472

Is that normal?

Veiko


Re: [PROPOSAL] Replace LuaJIT configurations with YAML

2018-05-08 Thread Veiko Kukk
+1 from me too.

It would be nice to have a modern, easily parsable (for whatever
automation) configuration file syntax.

Veiko


2018-05-08 11:01 GMT+03:00 Bryan Call :

> +1
>
> -Bryan
>
>
> > On May 7, 2018, at 12:47 PM, Leif Hedstrom  wrote:
> >
> > Hi,
> >
> > I’d like to propose that we replace the existing LuaJIT configurations
> with a simple YAML format. This would be the first step towards a unified
> configuration format, and I think we have to admit defeat on Lua, and do
> something simpler and more normal for both developers and users.
> >
> > There are only 2 (or 3) configurations using the LuaJIT configurations:
> >
> >   logging.config
> >   sni.config
> >
> >
> > But, also see my other proposal of killing metrics.config, which is
> currently in LuaJIT as well.
> >
> > If we agree to this, we’ll remove LuaJIT as a first class citizen from
> ATS v8.0.0. For the ts-lua plugin, we’ll continue to support this (of
> course), but it will require an externally provided LuaJIT distribution. We
> already have the —with-luajit configure option, and we’ll expand on that to
> auto-detect system level LuaJIT availability.
> >
> > If there are no strong arguments against this proposal, I’d like to get
> this change done for ATS v8.0.0.
> >
> > Thanks,
> >
> > — Leif
> >
>
>


Re: Parent initially marked as down

2018-03-29 Thread Veiko Kukk
Answering myself:
URL remapping per parent does not seem to be possible.
Nor is it possible to define whether the parent should be connected to
via SSL or in plain text.

Veiko


2018-03-29 11:25 GMT+03:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> I think the suggestion you had was right. Just changing remap.config to
> not rewrite http to https made parent access work.
>
> Now I need to find a way to change remap so that if a request is sent
> to the parent it is http, and when it is sent directly to the origin it
> is https.
>
> How would this be possible?
>
> Veiko
>
>
> 2018-03-28 19:49 GMT+03:00 Jeremy Payne <jp557...@gmail.com>:
>
>> Unless things have changed, the scheme defined in the remapped URL is
>> the scheme used when polling the parent server.
>> If you enable debug, you'll see the scheme sent to the parent.
>> A packet trace will also reveal the same.
>>
>>
>>
>> On Wed, Mar 28, 2018 at 11:38 AM, Veiko Kukk <veiko.k...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > No, only http. Connections to origin (via Internet) are made with
>> https, but
>> > internally we only do http (cheaper, simpler).
>> >
>> > Why do you think the parent logs about the child connection indicate
>> > CONNECT (https)?
>> >
>> > 1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> text/htm
>> >
>> > Veiko
>> >
>> >
>> > 2018-03-28 19:01 GMT+03:00 Jeremy Payne <jp557...@gmail.com>:
>> >>
>> >> I think the issue is the child is sending a https request to the
>> parent.
>> >> Does the parent support https on port 3128 ?
>> >>
>> >> On Wed, Mar 28, 2018 at 10:45 AM, Veiko Kukk <veiko.k...@gmail.com>
>> wrote:
>> >> > Hi,
>> >> >
>> >> > I'm trying to get ATS 7.1.2 working with single parent and failover
>> to
>> >> > origin.
>> >> > "clients" make request against ATS internally with plain http, with
>> >> > remap.config we map those requests to https.
>> >> > regex_map http://storage.(.*).cloud.ovh.net
>> >> > https://storage.$1.cloud.ovh.net
>> >> > @plugin=cachekey.so @pparam=--remove-all-params=true
>> >> > @pparam=--static-prefix=cloud_ovh_net
>> >> >
>> >> > parent.config
>> >> >
>> >> > dest_domain=. parent="192.168.1.52:3128" go_direct=false
>> >> >
>> >> > I've set go_direct to false as otherwise request would go directly to
>> >> > origin.
>> >> >
>> >> > From diags.log, when starting up ATS:
>> >> > [Mar 28 15:32:21.720] Server {0x2ae732c203c0} NOTE: traffic server
>> >> > running
>> >> > [Mar 28 15:32:21.826] Server {0x2ae73751e700} NOTE: cache enabled
>> >> > [Mar 28 15:32:22.735] Server {0x2ae73751e700} NOTE: Parent initially
>> >> > marked
>> >> > as down 192.168.1.52:3128
>> >> > [Mar 28 15:32:47.695] Server {0x2ae73751e700} NOTE: Failure threshold
>> >> > met
>> >> > failcount:10 >= threshold:10, http parent proxy 192.168.1.52:3128
>> marked
>> >> > down
>> >> >
>> >> > Why?
>> >> >
>> >> > # telnet 192.168.1.52 3128
>> >> > Trying 192.168.1.52...
>> >> > Connected to 192.168.1.52.
>> >> > Escape character is '^]'.
>> >> >
>> >> > There is proper working network connection between child and parent
>> ATS.
>> >> >
>> >> > On parent access.log:
>> >> >
>> >> > 1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1522251150.833 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1522251152.789 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1522251152.790 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
>> >> > text/html
>> >> > 1

Re: Parent initially marked as down

2018-03-29 Thread Veiko Kukk
Hi,

I think the suggestion you had was right. Just changing remap.config to
not rewrite http to https made parent access work.

Now I need to find a way to change remap so that if a request is sent to
the parent it is http, and when it is sent directly to the origin it is
https.

How would this be possible?

Veiko


2018-03-28 19:49 GMT+03:00 Jeremy Payne <jp557...@gmail.com>:

> Unless things have changed, the scheme defined in the remapped URL is
> the scheme used when polling the parent server.
> If you enable debug, you'll see the scheme sent to the parent.
> A packet trace will also reveal the same.
>
>
>
> On Wed, Mar 28, 2018 at 11:38 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
> > Hi,
> >
> > No, only http. Connections to origin (via Internet) are made with https,
> but
> > internally we only do http (cheaper, simpler).
> >
> > Why do you think the parent logs about the child connection indicate
> > CONNECT (https)?
> >
> > 1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/htm
> >
> > Veiko
> >
> >
> > 2018-03-28 19:01 GMT+03:00 Jeremy Payne <jp557...@gmail.com>:
> >>
> >> I think the issue is the child is sending a https request to the parent.
> >> Does the parent support https on port 3128 ?
> >>
> >> On Wed, Mar 28, 2018 at 10:45 AM, Veiko Kukk <veiko.k...@gmail.com>
> wrote:
> >> > Hi,
> >> >
> >> > I'm trying to get ATS 7.1.2 working with single parent and failover to
> >> > origin.
> >> > "clients" make request against ATS internally with plain http, with
> >> > remap.config we map those requests to https.
> >> > regex_map http://storage.(.*).cloud.ovh.net
> >> > https://storage.$1.cloud.ovh.net
> >> > @plugin=cachekey.so @pparam=--remove-all-params=true
> >> > @pparam=--static-prefix=cloud_ovh_net
> >> >
> >> > parent.config
> >> >
> >> > dest_domain=. parent="192.168.1.52:3128" go_direct=false
> >> >
> >> > I've set go_direct to false as otherwise request would go directly to
> >> > origin.
> >> >
> >> > From diags.log, when starting up ATS:
> >> > [Mar 28 15:32:21.720] Server {0x2ae732c203c0} NOTE: traffic server
> >> > running
> >> > [Mar 28 15:32:21.826] Server {0x2ae73751e700} NOTE: cache enabled
> >> > [Mar 28 15:32:22.735] Server {0x2ae73751e700} NOTE: Parent initially
> >> > marked
> >> > as down 192.168.1.52:3128
> >> > [Mar 28 15:32:47.695] Server {0x2ae73751e700} NOTE: Failure threshold
> >> > met
> >> > failcount:10 >= threshold:10, http parent proxy 192.168.1.52:3128
> marked
> >> > down
> >> >
> >> > Why?
> >> >
> >> > # telnet 192.168.1.52 3128
> >> > Trying 192.168.1.52...
> >> > Connected to 192.168.1.52.
> >> > Escape character is '^]'.
> >> >
> >> > There is proper working network connection between child and parent
> ATS.
> >> >
> >> > On parent access.log:
> >> >
> >> > 1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251150.833 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251152.789 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251152.790 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251157.344 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251157.345 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251157.345 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251157.346 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251167.693 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251167.694 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251167.694 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/html
> >> > 1522251167.695 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> >> > text/htm
> >> >
> >> > Where does this come from??
> >> > There is and should never be anything at "/" on the parent.
> >> > I assume this is an internal health check on the child that tries
> >> > to request "/" on the parent and, since it's obviously failing,
> >> > marks the parent down and never uses it.
> >> >
> >> > How can I change how the parent is tested? I found nothing
> >> > regarding parent health checks in the documentation.
> >> >
> >> > Thanks,
> >> > Veiko
> >> >
> >
> >
>


Re: Parent initially marked as down

2018-03-28 Thread Veiko Kukk
Hi,

No, only http. Connections to origin (via Internet) are made with https,
but internally we only do http (cheaper, simpler).

Why do you think the parent logs about the child connection indicate
CONNECT (https)?

1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/htm

Veiko


2018-03-28 19:01 GMT+03:00 Jeremy Payne <jp557...@gmail.com>:

> I think the issue is the child is sending a https request to the parent.
> Does the parent support https on port 3128 ?
>
> On Wed, Mar 28, 2018 at 10:45 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
> > Hi,
> >
> > I'm trying to get ATS 7.1.2 working with single parent and failover to
> > origin.
> > "clients" make request against ATS internally with plain http, with
> > remap.config we map those requests to https.
> > regex_map http://storage.(.*).cloud.ovh.net https://storage.$
> 1.cloud.ovh.net
> > @plugin=cachekey.so @pparam=--remove-all-params=true
> > @pparam=--static-prefix=cloud_ovh_net
> >
> > parent.config
> >
> > dest_domain=. parent="192.168.1.52:3128" go_direct=false
> >
> > I've set go_direct to false as otherwise request would go directly to
> > origin.
> >
> > From diags.log, when starting up ATS:
> > [Mar 28 15:32:21.720] Server {0x2ae732c203c0} NOTE: traffic server
> running
> > [Mar 28 15:32:21.826] Server {0x2ae73751e700} NOTE: cache enabled
> > [Mar 28 15:32:22.735] Server {0x2ae73751e700} NOTE: Parent initially
> marked
> > as down 192.168.1.52:3128
> > [Mar 28 15:32:47.695] Server {0x2ae73751e700} NOTE: Failure threshold met
> > failcount:10 >= threshold:10, http parent proxy 192.168.1.52:3128 marked
> > down
> >
> > Why?
> >
> > # telnet 192.168.1.52 3128
> > Trying 192.168.1.52...
> > Connected to 192.168.1.52.
> > Escape character is '^]'.
> >
> > There is proper working network connection between child and parent ATS.
> >
> > On parent access.log:
> >
> > 1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251150.833 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251152.789 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251152.790 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251157.344 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251157.345 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251157.345 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251157.346 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251167.693 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251167.694 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251167.694 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/html
> > 1522251167.695 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/-
> text/htm
> >
> > Where does this come from??
> > There is and should never be anything at "/" on the parent.
> > I assume this is an internal health check on the child that tries to
> > request "/" on the parent and, since it's obviously failing, marks the
> > parent down and never uses it.
> >
> > How can I change how the parent is tested? I found nothing regarding
> > parent health checks in the documentation.
> >
> > Thanks,
> > Veiko
> >
>


Parent initially marked as down

2018-03-28 Thread Veiko Kukk
Hi,

I'm trying to get ATS 7.1.2 working with a single parent and failover
to the origin.
"Clients" make requests against ATS internally with plain http; with
remap.config we map those requests to https:
regex_map http://storage.(.*).cloud.ovh.net https://storage.$1.cloud.ovh.net
@plugin=cachekey.so @pparam=--remove-all-params=true
@pparam=--static-prefix=cloud_ovh_net

parent.config

dest_domain=. parent="192.168.1.52:3128" go_direct=false

I've set go_direct to false as otherwise requests would go directly to
the origin.

From diags.log, when starting up ATS:
[Mar 28 15:32:21.720] Server {0x2ae732c203c0} NOTE: traffic server running
[Mar 28 15:32:21.826] Server {0x2ae73751e700} NOTE: cache enabled
[Mar 28 15:32:22.735] Server {0x2ae73751e700} NOTE: Parent initially marked
as down 192.168.1.52:3128
[Mar 28 15:32:47.695] Server {0x2ae73751e700} NOTE: Failure threshold met
failcount:10 >= threshold:10, http parent proxy 192.168.1.52:3128 marked
down

Why?

# telnet 192.168.1.52 3128
Trying 192.168.1.52...
Connected to 192.168.1.52.
Escape character is '^]'.

There is a properly working network connection between the child and
the parent ATS.

In the parent's access.log:

1522251150.831 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251150.832 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251150.833 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251152.789 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251152.790 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251152.791 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251157.344 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251157.345 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251157.345 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251157.346 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251167.693 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251167.694 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251167.694 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/html
1522251167.695 0 192.168.1.51 ERR_INVALID_REQ/400 491 - / - NONE/- text/htm

Where does this come from??
There is and should never be anything at "/" on the parent.
I assume this is an internal health check on the child that tries to
request "/" on the parent and, since it's obviously failing, marks the
parent down and never uses it.

How can I change how the parent is tested? I found nothing regarding
parent health checks in the documentation.
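
(For what it's worth, ATS does not seem to do an active "GET /" health
check here; the down-marking appears to be driven by real request
failures, governed by these records.config settings and their defaults:

CONFIG proxy.config.http.parent_proxy.fail_threshold INT 10
CONFIG proxy.config.http.parent_proxy.retry_time INT 300

The "failcount:10 >= threshold:10" line in diags.log above matches the
default fail_threshold; retry_time controls how long the parent stays
marked down before it is retried.)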

Thanks,
Veiko


Re: RPM or a .spec file?

2018-03-28 Thread Veiko Kukk
Hi,

You can see if spec file found in my Fedora copr ATS builds suits you.
https://copr-be.cloud.fedoraproject.org/results/vkukk/apache-traffic-server-7/epel-7-x86_64/00707444-trafficserver/

Veiko


2018-03-28 5:44 GMT+03:00 John Garvin :

> Hi @users,
>
> I'm trying to compile trafficserver 7.1.2 on CentOS 7.2. I'm running into
> some issues beyond my ability to resolve (probably due to unfamiliarity
> with rpm and spec files*).
>
> Someone must have addressed the need for a trafficserver rpm before. Would
> anyone who has created a spec file or an rpm be kind enough to share it?
>
> Thanks much,
>
> --
> John Garvin
>
> *The problem - in my spec file, I build in $RPM_BUILD_ROOT/usr, and
> compilation proceeds happily until installing the bundled perl modules.
> make sees the Makefile.PL's DESTDIR as unset and tries to install to
> /usr, which isn't what I want.
>


Re: ATS slow after ERR_CONNECT_FAIL

2018-03-06 Thread Veiko Kukk
Hi again,

This list seems really quiet.
Meanwhile, I've enabled the slow log, which produces strange results
that I can't explain. Maybe somebody else can.
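
(For context, the slow log below was presumably enabled with something
like the following in records.config, with the threshold in
milliseconds:

CONFIG proxy.config.http.slow.log.threshold INT 1000

Any transaction slower than the threshold then gets a breakdown of its
milestones logged, as shown next.)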

status: 200 unique id:  redirection_tries: 0 bytes: 20 fd: 0 client state:
0 server state: 9 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done:
0.000 cache_open_read_begin: 0.006 cache_open_read_end: 0.920
dns_lookup_begin: 0.920 dns_lookup_end: 0.920 server_connect: 4.204
server_connect_end: 4.204 server_first_read: 43.955
server_read_header_done: 43.955 server_close: 43.957 ua_write: 43.957
ua_close: 43.957 sm_finish: 43.957 plugin_active: 0.000 plugin_total: 0.000

status: 200 unique id:  redirection_tries: 0 bytes: 20 fd: 0 client state:
0 server state: 9 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done:
0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000
dns_lookup_begin: 0.000 dns_lookup_end: 0.000 server_connect: 2.384
server_connect_end: 2.384 server_first_read: 76.550
server_read_header_done: 76.550 server_close: 76.553 ua_write: 76.553
ua_close: 76.553 sm_finish: 76.553 plugin_active: 0.000 plugin_total: 0.000

1) Why is there a cache read when cache.config has
url_regex=that_url_regex action=never-cache? If the URL matches a
never-cache action, why is it looked up in the cache at all?
2) What is going on between dns_lookup_end and server_connect? Where is
the time spent? Which timeout setting should apply to server_connect?
Shouldn't it be proxy.config.http.connect_attempts_timeout, which is set
to 2 seconds?
3) Which timeout applies to the time between server_connect_end and
server_first_read? Isn't it
proxy.config.http.transaction_no_activity_timeout_out, which in our case
is set to 6 seconds?

Current config timeouts:

proxy.config.http.transaction_no_activity_timeout_out: 6
proxy.config.http.connect_attempts_timeout: 2
proxy.config.lm.pserver_timeout_secs: 1
proxy.config.lm.pserver_timeout_msecs: 0
proxy.config.cluster.peer_timeout: 30
proxy.config.cluster.mc_poll_timeout: 5
proxy.config.cluster.startup_timeout: 10
proxy.config.process_manager.timeout: 5
proxy.config.vmap.down_up_timeout: 10
proxy.config.http.congestion_control.default.live_os_conn_timeout: 60
proxy.config.http.congestion_control.default.dead_os_conn_timeout: 15
proxy.config.http.parent_proxy.connect_attempts_timeout: 30
proxy.config.http.keep_alive_no_activity_timeout_in: 120
proxy.config.http.keep_alive_no_activity_timeout_out: 120
proxy.config.websocket.no_activity_timeout: 600
proxy.config.websocket.active_timeout: 3600
proxy.config.http.transaction_no_activity_timeout_in: 30
proxy.config.http.transaction_active_timeout_in: 900
proxy.config.http.transaction_active_timeout_out: 0
proxy.config.http.accept_no_activity_timeout: 120
proxy.config.http.background_fill_active_timeout: 0
proxy.config.http.post_connect_attempts_timeout: 1800
proxy.config.socks.socks_timeout: 100
proxy.config.socks.server_connect_timeout: 10
proxy.config.socks.server_retry_timeout: 300
proxy.config.net.poll_timeout: 10
proxy.config.net.default_inactivity_timeout: 86400
proxy.config.dns.lookup_timeout: 20
proxy.config.hostdb.lookup_timeout: 30
proxy.config.hostdb.timeout: 86400
proxy.config.hostdb.fail.timeout: 0
proxy.config.log.collation_host_timeout: 86390
proxy.config.log.collation_client_timeout: 86400
proxy.config.ssl.session_cache.timeout: 0
proxy.config.ssl.handshake_timeout_in: 0
proxy.config.ssl.ocsp.cache_timeout: 3600
proxy.config.ssl.ocsp.request_timeout: 10
proxy.config.http2.accept_no_activity_timeout: 120
proxy.config.http2.no_activity_timeout_in: 120
proxy.config.http2.active_timeout_in: 900

Best regards,
Veiko


2018-03-02 17:18 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> ATS 7.1.2, CentOS 7.4.
>
> We make regular requests via ATS in reverse proxy mode to gather
> response time statistics and decide whether the source is available.
> This request is excluded from caching, so under normal conditions it
> always results in TCP_MISS/200, with response times fluctuating
> depending on the distance of the ATS server from the source.
>
> Sometimes there are network errors that get logged
> as ERR_CONNECT_FAIL/502, after which there are again TCP_MISS/200 but with
> very high latency!
> This happens on all ATS servers we have deployed.
>
> For example, where response times normally fluctuate around 150 ms
> before a network error, afterwards they fluctuate around 1300 ms. After
> restarting ATS, response times are back to normal.
>
> Response times from log:
> 168
> 12979
> 2755
> 117256
> 56442
> 40104
> 1043
> 30955
> 30972
> 18418
> 26542
> 4172
> 8587
> 59259
> 4674
> 37166
> 16123
> 67019
> 41723
> 3497
> 6957
> 18684
> 17663
> 14634
> 20036
> 1305
> 14815
> 3526
> 10542
> 62519
> 22025
> 40556
> 36821
> 2342
> 2644
> 3695
> 7059
> 45581
> 1706
> 30947
> 16001
> 136383
> 1345

ATS slow after ERR_CONNECT_FAIL

2018-03-02 Thread Veiko Kukk
Hi,

ATS 7.1.2, CentOS 7.4.

We make regular requests via ATS in reverse proxy mode to gather
response time statistics and decide whether the source is available.
This request is excluded from caching, so under normal conditions it
always results in TCP_MISS/200, with response times fluctuating
depending on the distance of the ATS server from the source.

Sometimes there are network errors that get logged as ERR_CONNECT_FAIL/502,
after which there are again TCP_MISS/200 but with very high latency!
This happens on all ATS servers we have deployed.

For example, where response times normally fluctuate around 150 ms
before a network error, afterwards they fluctuate around 1300 ms. After
restarting ATS, response times are back to normal.

Response times from log:
168
12979
2755
117256
56442
40104
1043
30955
30972
18418
26542
4172
8587
59259
4674
37166
16123
67019
41723
3497
6957
18684
17663
14634
20036
1305
14815
3526
10542
62519
22025
40556
36821
2342
2644
3695
7059
45581
1706
30947
16001
136383
1345
72770
1087
1311
1074
1080
1101
1033
1045
1057
1109
1078
1104
1059
1066
1062
1143
1074
1095
1207
1059
1067
1101
1110
1069
1394
1278
1323
1319
1315
1317
1317
1343
1270
1284
1362
1446
1363
1302
1321
1330
1317
1335
1346
1312

and after restart:

593
362
129
381
361
380
394
394
117
384
341
112
88
363
397
93
385
356
371
360
411
120
338
497
449
377
131
458
161
372
112
364
87
129
340
93
114
338
377
454
424
381
381
348
129
336
116
404
108
340
114

Best regards,
Veiko


Re: very long cache_open_read_end - timeout

2018-02-26 Thread Veiko Kukk
Hi Bryan,

We decided to create 4 volumes. There was no strong reasoning behind
it: 2 seemed too few to avoid those blocking timeouts, and 4 was the
next comfortable number, set with 25% each.
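
(Presumably something like this in volume.config, per its documented
syntax:

volume=1 scheme=http size=25%
volume=2 scheme=http size=25%
volume=3 scheme=http size=25%
volume=4 scheme=http size=25%

Each volume has its own directory, so splitting the cache spreads the
open_read/open_write contention across volumes.)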

Veiko


2018-02-23 21:21 GMT+02:00 Bryan Call <bc...@apache.org>:

> Veiko,
>
> How many volumes did you create?
>
> -Bryan
>
>
> On Feb 23, 2018, at 3:14 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
> Hi, Mateusz
>
> I wonder if you might be experiencing the same issue as me.
> https://github.com/apache/trafficserver/issues/3057
>
> For me the solution, or rather workaround, was to create volumes.
>
> Unfortunately, documentation regarding storage performance
> issues/tuning is missing.
>
> Best regards,
> Veiko
>
>
> 2018-02-22 13:59 GMT+02:00 Mateusz Zajakala <zajak...@gmail.com>:
>
>> Hi,
>>
>> I have a problem - from time to time my ATS 7.2.1 has problems with
>> serving static files from origin, client transactions last over 10s and I
>> see the following diagnostic (I have enabled logging "slow requests")
>>
>>  cache_open_read_begin: 0.000 cache_open_read_end: 10.640
>>
>> Any requests to the same URL are blocked for over 10s. Restarting ATS
>> solves the problem.
>>
>> What does it mean? Does it have to do with open_read feature?  but the
>> timeouts don't add...
>>
>> Any suggestions appreciated!
>>
>> Here's my records.config with configs related to avoiding thundering
>> herd...
>>
>> CONFIG proxy.config.cache.enable_read_while_writer INT 1
>> CONFIG proxy.config.http.background_fill_active_timeout INT 0
>> CONFIG proxy.config.http.background_fill_completed_threshold FLOAT
>> 0.00
>> CONFIG proxy.config.cache.read_while_writer.max_retries INT 5
>> CONFIG proxy.config.cache.read_while_writer_retry.delay INT 200
>> CONFIG proxy.config.http.cache.max_open_read_retries INT 5
>> CONFIG proxy.config.http.cache.open_read_retry_time INT 200
>>
>> CONFIG proxy.config.http.cache.open_write_fail_action INT 2
>>
>>
>
>


Re: very long cache_open_read_end - timeout

2018-02-23 Thread Veiko Kukk
Hi, Mateusz

I wonder if you might be experiencing the same issue as me.
https://github.com/apache/trafficserver/issues/3057

For me the solution, or rather workaround, was to create volumes.

Unfortunately, documentation regarding storage performance issues/tuning
is missing.

Best regards,
Veiko


2018-02-22 13:59 GMT+02:00 Mateusz Zajakala :

> Hi,
>
> I have a problem - from time to time my ATS 7.2.1 has problems with
> serving static files from origin, client transactions last over 10s and I
> see the following diagnostic (I have enabled logging "slow requests")
>
>  cache_open_read_begin: 0.000 cache_open_read_end: 10.640
>
> Any requests to the same URL are blocked for over 10s. Restarting ATS
> solves the problem.
>
> What does it mean? Does it have to do with open_read feature?  but the
> timeouts don't add...
>
> Any suggestions appreciated!
>
> Here's my records.config with configs related to avoiding thundering
> herd...
>
> CONFIG proxy.config.cache.enable_read_while_writer INT 1
> CONFIG proxy.config.http.background_fill_active_timeout INT 0
> CONFIG proxy.config.http.background_fill_completed_threshold FLOAT
> 0.00
> CONFIG proxy.config.cache.read_while_writer.max_retries INT 5
> CONFIG proxy.config.cache.read_while_writer_retry.delay INT 200
> CONFIG proxy.config.http.cache.max_open_read_retries INT 5
> CONFIG proxy.config.http.cache.open_read_retry_time INT 200
>
> CONFIG proxy.config.http.cache.open_write_fail_action INT 2
>
>


Re: Understanding RAM cache size and limits

2018-02-18 Thread Veiko Kukk
Yes, other servers have the same issue too.

It takes some time to fill the RAM cache, and its efficiency for our use
case/setup is not good, as seen from my original post.
We are actually planning to disable the RAM cache entirely. But I don't
see why it would not go over the limit again after restarting; it's not
as if computers/programs behave differently at every startup.

Veiko


2018-02-16 23:34 GMT+02:00 Bryan Call <bc...@apache.org>:

> Are you seeing this on more than one host? Have you tried restarting it
> to see if it goes over the maximum again?
>
> Here are are the stats I am seeing from a 7.1.2 server in production:
>
> [bcall@e10 ~]$ traffic_ctl metric match ram
> proxy.process.cache.ram_cache.total_bytes 317
> proxy.process.cache.ram_cache.bytes_used 31993472640
> proxy.process.cache.volume_1.ram_cache.total_bytes 80
> proxy.process.cache.volume_1.ram_cache.bytes_used 7997441792
> proxy.process.cache.volume_1.ram_cache.hits 32096586
> proxy.process.cache.volume_1.ram_cache.misses 10833490
> proxy.process.cache.volume_2.ram_cache.total_bytes 80
> proxy.process.cache.volume_2.ram_cache.bytes_used 7998451328
> proxy.process.cache.volume_2.ram_cache.hits 34837324
> proxy.process.cache.volume_2.ram_cache.misses 10840846
> proxy.process.cache.volume_3.ram_cache.total_bytes 80
> proxy.process.cache.volume_3.ram_cache.bytes_used 7999866624
> proxy.process.cache.volume_3.ram_cache.hits 32447086
> proxy.process.cache.volume_3.ram_cache.misses 10697251
> proxy.process.cache.volume_4.ram_cache.total_bytes 77
> proxy.process.cache.volume_4.ram_cache.bytes_used 7997786880
> proxy.process.cache.volume_4.ram_cache.hits 30927240
> proxy.process.cache.volume_4.ram_cache.misses 10826102
>
> > On Feb 16, 2018, at 3:56 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
> >
> > Hi,
> >
> > We have a strange situation with ATS 7.1.2 where the RAM cache has
> > grown beyond the set limits.
> >
> > # /opt/trafficserver/bin/traffic_ctl metric match ram
> > proxy.process.cache.ram_cache.total_bytes 3112697856
> > proxy.process.cache.ram_cache.bytes_used 4493088640
> > proxy.process.cache.ram_cache.hits 1548
> > proxy.process.cache.ram_cache.misses 1635443
> > proxy.process.cache.volume_1.ram_cache.total_bytes 778174464
> > proxy.process.cache.volume_1.ram_cache.bytes_used 1067949184
> > proxy.process.cache.volume_1.ram_cache.hits 374
> > proxy.process.cache.volume_1.ram_cache.misses 403206
> > proxy.process.cache.volume_2.ram_cache.total_bytes 778174464
> > proxy.process.cache.volume_2.ram_cache.bytes_used 1129483520
> > proxy.process.cache.volume_2.ram_cache.hits 368
> > proxy.process.cache.volume_2.ram_cache.misses 410164
> > proxy.process.cache.volume_3.ram_cache.total_bytes 778174464
> > proxy.process.cache.volume_3.ram_cache.bytes_used 1084551424
> > proxy.process.cache.volume_3.ram_cache.hits 357
> > proxy.process.cache.volume_3.ram_cache.misses 408656
> > proxy.process.cache.volume_4.ram_cache.total_bytes 778174464
> > proxy.process.cache.volume_4.ram_cache.bytes_used 1211104512
> > proxy.process.cache.volume_4.ram_cache.hits 449
> > proxy.process.cache.volume_4.ram_cache.misses 413417
> >
> > Relevant config parameters:
> > # /opt/trafficserver/bin/traffic_ctl config match ram
> > proxy.config.cache.ram_cache_cutoff: 16777216
> > proxy.config.cache.ram_cache.size: -1
> > proxy.config.cache.ram_cache.algorithm: 1
> > proxy.config.cache.ram_cache.use_seen_filter: 1
> > proxy.config.cache.ram_cache.compress: 0
> > proxy.config.cache.ram_cache.compress_percent: 90
> > proxy.config.ssl.server.dhparams_file: NULL
> > proxy.config.http2.max_frame_size: 16384
> >
> > I understand that, if set to -1, ATS will determine the RAM cache
> > size automatically, which in our case would be
> > proxy.process.cache.ram_cache.total_bytes 3112697856.
> >
> > Why and how can it use more than that?
> >
> > --
> > Veiko
> >
>
>


Understanding RAM cache size and limits

2018-02-16 Thread Veiko Kukk
Hi,

We have a strange situation with ATS 7.1.2 where the RAM cache has
grown beyond the set limits.

# /opt/trafficserver/bin/traffic_ctl metric match ram
proxy.process.cache.ram_cache.total_bytes 3112697856
proxy.process.cache.ram_cache.bytes_used 4493088640
proxy.process.cache.ram_cache.hits 1548
proxy.process.cache.ram_cache.misses 1635443
proxy.process.cache.volume_1.ram_cache.total_bytes 778174464
proxy.process.cache.volume_1.ram_cache.bytes_used 1067949184
proxy.process.cache.volume_1.ram_cache.hits 374
proxy.process.cache.volume_1.ram_cache.misses 403206
proxy.process.cache.volume_2.ram_cache.total_bytes 778174464
proxy.process.cache.volume_2.ram_cache.bytes_used 1129483520
proxy.process.cache.volume_2.ram_cache.hits 368
proxy.process.cache.volume_2.ram_cache.misses 410164
proxy.process.cache.volume_3.ram_cache.total_bytes 778174464
proxy.process.cache.volume_3.ram_cache.bytes_used 1084551424
proxy.process.cache.volume_3.ram_cache.hits 357
proxy.process.cache.volume_3.ram_cache.misses 408656
proxy.process.cache.volume_4.ram_cache.total_bytes 778174464
proxy.process.cache.volume_4.ram_cache.bytes_used 1211104512
proxy.process.cache.volume_4.ram_cache.hits 449
proxy.process.cache.volume_4.ram_cache.misses 413417

Relevant config parameters:
# /opt/trafficserver/bin/traffic_ctl config match ram
proxy.config.cache.ram_cache_cutoff: 16777216
proxy.config.cache.ram_cache.size: -1
proxy.config.cache.ram_cache.algorithm: 1
proxy.config.cache.ram_cache.use_seen_filter: 1
proxy.config.cache.ram_cache.compress: 0
proxy.config.cache.ram_cache.compress_percent: 90
proxy.config.ssl.server.dhparams_file: NULL
proxy.config.http2.max_frame_size: 16384

I understand that, if set to -1, ATS will determine the RAM cache size
automatically, which in our case would be
proxy.process.cache.ram_cache.total_bytes 3112697856.

Why and how can it use more than that?

-- 
Veiko


Re: Understanding ATS memory usage

2018-01-31 Thread Veiko Kukk
Hi Bryan,

System in general is under light load. No other processes cause latency.

I've submitted issue https://github.com/apache/trafficserver/issues/3057


Veiko


2018-01-27 2:20 GMT+02:00 Bryan Call <bc...@apache.org>:

> I came across this command and it has helped track down some latency
> issues caused by other processes (ss -s).  Can you run it during the time
> you are seeing latency issues and post the results here?
>
> dstat -c --top-cpu -d --top-bio --top-latency -n
>
> -Bryan
>
> On Jan 26, 2018, at 5:48 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
> Hi again,
>
> I'd really appreciate it if somebody could point me in the right
> direction on how to solve this.
> Whatever ATS does every ~50 minutes has a strong effect on response
> times.
> ATS is used in reverse proxy mode, and we run regular tests against a
> test URL on the proxied server(s) (excluded from caching in the ATS
> config).
> This test GET is run as an HAproxy health check every ~15 seconds for
> two local HAproxy backends, which both pass requests to a single local
> ATS.
>
> It is quite a complex setup, but the point is that the tests run
> frequently and give information about ATS response times over a long
> period.
>
> Total test runs today: 6364
> Tests that took over 7s today: 50
>
> Distribution of requests; the first column is response time (ms), the
> second the number of requests under that value:
> 100 1292
> 300 4351
> 500 5194
> 700 5578
> 900 5794
> 1200 5985
> 1400 6058
> 1800 6143
>
> Here is the output of the tests log that contains all the extremely
> slow responses. The test response size is only 609 bytes. Usually the
> response time fluctuates around
>
> 2018-01-26T01:13:32.150186+00:00 12412
> 2018-01-26T01:13:32.150188+00:00 20803
> 2018-01-26T02:05:04.536931+00:00 29764
> 2018-01-26T02:05:04.536936+00:00 27271
> 2018-01-26T02:05:04.536941+00:00 10233
> 2018-01-26T02:56:26.968987+00:00 9511
> 2018-01-26T02:56:26.968989+00:00 30084
> 2018-01-26T02:56:26.968991+00:00 27337
> 2018-01-26T04:39:21.947460+00:00 24171
> 2018-01-26T04:39:21.947462+00:00 12042
> 2018-01-26T04:39:21.947464+00:00 36979
> 2018-01-26T04:39:31.954116+00:00 7369
> 2018-01-26T04:39:31.954118+00:00 32305
> 2018-01-26T04:39:31.954120+00:00 19779
> 2018-01-26T04:47:42.349748+00:00 29177
> 2018-01-26T04:47:42.349754+00:00 26212
> 2018-01-26T04:47:42.349757+00:00 21645
> 2018-01-26T04:47:42.349759+00:00 24932
> 2018-01-26T05:39:04.925435+00:00 32361
> 2018-01-26T05:39:04.925438+00:00 33587
> 2018-01-26T05:39:04.925440+00:00 8173
> 2018-01-26T05:39:04.925443+00:00 28149
> 2018-01-26T05:39:04.925445+00:00 29115
> 2018-01-26T06:30:27.643170+00:00 7423
> 2018-01-26T06:30:27.643172+00:00 32271
> 2018-01-26T06:30:27.643174+00:00 18927
> 2018-01-26T06:30:27.643179+00:00 27849
> 2018-01-26T06:30:37.644023+00:00 15160
> 2018-01-26T07:21:50.231681+00:00 19208
> 2018-01-26T07:21:50.231684+00:00 14984
> 2018-01-26T08:13:12.874501+00:00 16876
> 2018-01-26T08:13:22.885389+00:00 14007
> 2018-01-26T09:04:35.509167+00:00 9016
> 2018-01-26T09:04:35.509172+00:00 9356
> 2018-01-26T09:55:58.052277+00:00 24137
> 2018-01-26T09:55:58.052280+00:00 23709
> 2018-01-26T09:55:58.052282+00:00 19901
> 2018-01-26T09:55:58.052284+00:00 19034
> 2018-01-26T10:47:10.614261+00:00 23419
> 2018-01-26T10:47:10.614263+00:00 18967
> 2018-01-26T11:38:32.984318+00:00 14425
> 2018-01-26T11:38:32.984324+00:00 9797
> 2018-01-26T11:38:32.984326+00:00 11161
> 2018-01-26T11:38:32.984329+00:00 16228
> 2018-01-26T12:29:45.511517+00:00 15580
> 2018-01-26T12:29:45.511520+00:00 11439
> 2018-01-26T13:20:58.023816+00:00 21360
> 2018-01-26T13:20:58.023818+00:00 19488
> 2018-01-26T13:20:58.023821+00:00 14737
> 2018-01-26T13:20:58.023823+00:00 17118
>
>
> The question is: why does ATS regularly slow down? Are there internal
> management jobs that use the same single queue as incoming requests, so
> that requests just wait in the queue until the internal processes
> finish?
> It gets worse as ATS uptime increases, triggering HAproxy health check
> timeouts. After a restart it's not that bad again for a while.
>
> How do I get rid of this regular slowness?
>
> Best regards,
> Veiko
>
>
>
> 2018-01-23 13:53 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:
>
>> Hi,
>>
>> I should have noted before that, during that timeframe, there is no
>> higher disk activity than on average. No higher load, no disk latency, no
>> cpu load. Nothing abnormal except slow ATS.
>> ATS is running on CentOS 7 directly on hardware dedicated server.
>>
>> Dirty pages related config that's been always there for that server:
>>
>> vm.dirty_background_ratio = 

Re: Understanding ATS memory usage

2018-01-26 Thread Veiko Kukk
Hi again,

I'd really appreciate it if somebody could point me in the right
direction on how to solve this.
Whatever ATS does every ~50 minutes has a strong effect on response
times.
ATS is used in reverse proxy mode, and we run regular tests against a
test URL on the proxied server(s) (excluded from caching in the ATS
config).
This test GET is run as an HAproxy health check every ~15 seconds for
two local HAproxy backends, which both pass requests to a single local
ATS.

It is quite a complex setup, but the point is that the tests run
frequently and give information about ATS response times over a long
period.

Total test runs today: 6364
Tests that took over 7s today: 50

Distribution of requests; the first column is response time (ms), the
second the number of requests under that value:
100 1292
300 4351
500 5194
700 5578
900 5794
1200 5985
1400 6058
1800 6143

Here is the output of the tests log that contains all the extremely
slow responses. The test response size is only 609 bytes. Usually the
response time fluctuates around

2018-01-26T01:13:32.150186+00:00 12412
2018-01-26T01:13:32.150188+00:00 20803
2018-01-26T02:05:04.536931+00:00 29764
2018-01-26T02:05:04.536936+00:00 27271
2018-01-26T02:05:04.536941+00:00 10233
2018-01-26T02:56:26.968987+00:00 9511
2018-01-26T02:56:26.968989+00:00 30084
2018-01-26T02:56:26.968991+00:00 27337
2018-01-26T04:39:21.947460+00:00 24171
2018-01-26T04:39:21.947462+00:00 12042
2018-01-26T04:39:21.947464+00:00 36979
2018-01-26T04:39:31.954116+00:00 7369
2018-01-26T04:39:31.954118+00:00 32305
2018-01-26T04:39:31.954120+00:00 19779
2018-01-26T04:47:42.349748+00:00 29177
2018-01-26T04:47:42.349754+00:00 26212
2018-01-26T04:47:42.349757+00:00 21645
2018-01-26T04:47:42.349759+00:00 24932
2018-01-26T05:39:04.925435+00:00 32361
2018-01-26T05:39:04.925438+00:00 33587
2018-01-26T05:39:04.925440+00:00 8173
2018-01-26T05:39:04.925443+00:00 28149
2018-01-26T05:39:04.925445+00:00 29115
2018-01-26T06:30:27.643170+00:00 7423
2018-01-26T06:30:27.643172+00:00 32271
2018-01-26T06:30:27.643174+00:00 18927
2018-01-26T06:30:27.643179+00:00 27849
2018-01-26T06:30:37.644023+00:00 15160
2018-01-26T07:21:50.231681+00:00 19208
2018-01-26T07:21:50.231684+00:00 14984
2018-01-26T08:13:12.874501+00:00 16876
2018-01-26T08:13:22.885389+00:00 14007
2018-01-26T09:04:35.509167+00:00 9016
2018-01-26T09:04:35.509172+00:00 9356
2018-01-26T09:55:58.052277+00:00 24137
2018-01-26T09:55:58.052280+00:00 23709
2018-01-26T09:55:58.052282+00:00 19901
2018-01-26T09:55:58.052284+00:00 19034
2018-01-26T10:47:10.614261+00:00 23419
2018-01-26T10:47:10.614263+00:00 18967
2018-01-26T11:38:32.984318+00:00 14425
2018-01-26T11:38:32.984324+00:00 9797
2018-01-26T11:38:32.984326+00:00 11161
2018-01-26T11:38:32.984329+00:00 16228
2018-01-26T12:29:45.511517+00:00 15580
2018-01-26T12:29:45.511520+00:00 11439
2018-01-26T13:20:58.023816+00:00 21360
2018-01-26T13:20:58.023818+00:00 19488
2018-01-26T13:20:58.023821+00:00 14737
2018-01-26T13:20:58.023823+00:00 17118


The question is: why does ATS regularly slow down? Are there internal
management jobs that use the same single queue as incoming requests, so
that requests just wait in the queue until the internal processes
finish?
It gets worse as ATS uptime increases, triggering HAproxy health check
timeouts. After a restart it's not that bad again for a while.

How do I get rid of this regular slowness?

Best regards,
Veiko



2018-01-23 13:53 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> I should have noted before that, during that timeframe, there is no
> higher disk activity than on average. No higher load, no disk latency,
> no CPU load. Nothing abnormal except slow ATS.
> ATS is running on CentOS 7 directly on a dedicated hardware server.
>
> Dirty pages related config that's been always there for that server:
>
> vm.dirty_background_ratio = 5
> vm.dirty_ratio = 40
> vm.swappiness = 0
>
> # free -m
>total   used   free  shared
> buff/cache   available
> Mem: 128831   2784124364331   98554
>  95722
> Swap:  4095   04095
>
> As you can see, there is a lot of available memory.
>
> I don't see how writing dirty pages could slow down ATS when there is
> no indication of excessive load on any of the system resources.
> And there is this strange regularity: it happens every ~50 minutes, as
> if some regular (cronjob-like) task were being run inside ATS that
> delays all other tasks.
>
> ATS is using a 9 TB raw partition, if that information is relevant.
>
> Could you point me to the documentation dealing with dir entry sync
> periods of ATS?
>
>
> --
> Veiko
>
>
>
> 2018-01-23 12:12 GMT+02:00 Leif Hedstrom <zw...@apache.org>:
>
>>
>>
>> On Jan 23, 2018, at 7:36 PM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>>
>> Hi again,
>>
>> During that mysterious task t

Re: Understanding ATS memory usage

2018-01-23 Thread Veiko Kukk
Hi,

I should have noted before that, during that timeframe, there is no
higher disk activity than on average. No higher load, no disk latency,
no CPU load. Nothing abnormal except slow ATS.
ATS is running on CentOS 7 directly on a dedicated hardware server.

Dirty pages related config that's been always there for that server:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 40
vm.swappiness = 0

# free -m
   total   used   free  shared
buff/cache   available
Mem: 128831   2784124364331   98554
 95722
Swap:  4095   04095

As you can see, there is a lot of available memory.

I don't see how writing dirty pages could slow down ATS when there is
no indication of excessive load on any of the system resources.
And there is this strange regularity: it happens every ~50 minutes, as
if some regular (cronjob-like) task were being run inside ATS that
delays all other tasks.

ATS is using a 9 TB raw partition, if that information is relevant.

Could you point me to the documentation dealing with dir entry sync periods
of ATS?
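
(For scale, using Alan Carroll's numbers quoted elsewhere in this
thread - roughly 10 bytes per directory entry, and roughly one entry
per average object size - and assuming the default
proxy.config.cache.min_average_object_size of 8000 bytes:

9 TB / 8000 B  ~= 1.1e9 directory entries
1.1e9 * 10 B   ~= 11 GB of directory

which is in the same ballpark as the ~10 GB that is periodically
copied and freed when the directory is synced to disk.)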


-- 
Veiko



2018-01-23 12:12 GMT+02:00 Leif Hedstrom <zw...@apache.org>:

>
>
> On Jan 23, 2018, at 7:36 PM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
> Hi again,
>
> The mysterious task that happens every ~50-51 minutes causes
> requests/responses to slow down severely, even time out.
> Requests that usually take a few hundred milliseconds are now taking
> over 30 s and timing out. This happens only while memory consumption is
> suddenly dropped by ATS, and it happens for both bypassed URLs and for
> hits.
> ATS version is 7.1.1, and this looks like a serious bug to me.
>
>
>
> That sounds suspiciously like kernel paging activity; maybe it's
> spending that time flushing dirty pages? Maybe transparent huge pages?
> Or tweak the sysctls for dirty page ratios?
>
> The other thing to possibly look at is the dir entry sync periods of ATS.
> Whenever we sync those to disk, we consume both more memory and more disk
> I/O, and maybe you are putting too much pressure on the VM (i.e. maybe you
> need to turn down the RAM cache or tweak the amount of directory entries
> you have).
>
> — Leif
>
> E.g.
>
> https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
>
>
>


Re: Understanding ATS memory usage

2018-01-23 Thread Veiko Kukk
Hi again,

The mysterious task that happens every ~50-51 minutes causes
requests/responses to slow down severely, even time out.
Requests that usually take a few hundred milliseconds are now taking
over 30 s and timing out. This happens only while memory consumption is
suddenly dropped by ATS, and it happens for both bypassed URLs and for
hits.
ATS version is 7.1.1, and this looks like a serious bug to me.

Regards,
Veiko


2017-12-19 18:28 GMT+02:00 Alan Carroll <solidwallofc...@oath.com>:

> It's a complex subject hard to put in an email. A few notes:
>
> 1) You shouldn't let an ATS box swap. That almost always ends badly.
> Adding more ram or adjusting the configuration to avoid it is better. I
> think we set swappiness to 0.
>
> 2) The cache directory takes memory independent of the ram cache. This is
> 10 bytes per directory entry. The number of directory entries is roughly
> the cache disk size divided by the average object size as set in
> records.config.
>
> 3) ATS does not use the kernel page cache for its own cache operations.
>
> 4) A larger ram cache almost always creates better performance, but the
> yield curve can differ quite a lot. What the ram cache does is enable
> cached data to be served from ram instead of disk. However, once the ram
> cache covers most of the working set, additional ram yields marginal
> benefits. E.g. putting an object fetched once a day in ram cache is better,
> but not very much.
>
> 5) I think what you're seeing with your graph is cache directory
> synchronization to disk. To do that, ATS allocates memory for a copy of the
> cache directory, copies the directory there, then writes it out. It should
> be doing that somewhat piecemeal because a full duplicate of the cache
> directory can be very large.
>
>
> On Tue, Dec 19, 2017 at 3:04 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
>> Hi,
>>
>> Does nobody really know how ATS uses memory?
>>
>> Veiko
>>
>> 2017-12-12 14:44 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:
>>
>>> Hi,
>>>
>>> I'm confused about ATS memory configuration. I have a server with CentOS
>>> 7, ATS 7.1.1, 64GB memory and ~ 10TB disk.
>>> traffic_server process takes ~ 23GB memory with the configuration option
>>> (8GB)
>>> CONFIG proxy.config.cache.ram_cache.size INT 8589934592
>>> ATS is using raw partition on HDD.
>>>
>>> * Why does it swap when there is page cache that's basically free memory
>>> that could be used before swapping? vm.swappiness is 10; I had it set to 0
>>> too, and then the system does not swap.
>>> * Considering ATS is using O_DIRECT with raw partitions and its own
>>> memory management for the disk cache, would that mean that ATS is not
>>> using the kernel page cache at all?
>>> * Would ATS benefit from a larger RAM cache, considering it has its own
>>> disk buffer management?
>>>
>>> Also, strangest of all, there are frequent memory usage drops of the
>>> traffic_server process. After around 50 minutes, 10GB of memory is
>>> released and immediately consumed again. Attaching screenshot.
>>>
>>> Regards,
>>> Veiko
>>>
>>>
>>
>


Re: Avoiding TCP_REFRESH_HIT

2018-01-09 Thread Veiko Kukk
Just to give back to the community, here is the final solution:

* /etc/trafficserver/header_rewrite.config [1]
cond %{READ_RESPONSE_HDR_HOOK} [AND]
cond %{STATUS} = 200
rm-header Expires
set-header Cache-Control "max-age=157784630"

* /etc/trafficserver/plugin.config [2]
header_rewrite.so header_rewrite.config

* /etc/trafficserver/records.config [3]
CONFIG proxy.config.http.cache.required_headers INT 2

[1]
https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/plugins/header_rewrite.en.html
[2]
https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/files/plugin.config.en.html
[3]
https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/files/records.config.en.html#proxy-config-http-cache-required-headers

It's important to add max-age only to HTTP status code 200, and to set
required_headers so that an explicit max-age is required.
set-header overwrites any existing 'Cache-Control'.
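
A quick way to verify the rewrite sticks on a cached object (a sketch; the
URL is hypothetical):

curl -sI http://cache.internal/full/path/to/object | grep -iE '^(cache-control|expires):'

Cache-Control should show max-age=157784630 and Expires should be gone.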


Regards,
Veiko



2017-11-24 11:24 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> 1) I don't know. ATS is not in production yet; I just noticed that it
> happened during testing. Since we are just moving from Squid to ATS, I'm
> testing the latest stable ATS, that being 7.1.1 at the moment.
>
> 2) Thank you for that! That's exactly what I was looking for, some clever
> way to override ATS's staleness decision.
>
> Regards,
> Veiko
>
>
> 2017-11-23 22:33 GMT+02:00 Shu Kit Chan <chanshu...@gmail.com>:
>
>> 1) How often does TCP_REFRESH_HIT happen? Just a wild guess. If you
>> are using ATS < 6.2.0, there is a fuzz feature that may cause this to
>> happen every once in a while.
>>
>> 2) Here is an alternative approach. you can try to use ts_lua plugin
>> with the following script.
>>
>> function cache_lookup()
>>   ts.debug('cache-lookup')
>>   -- mark stale hit as fresh hit
>>   local cache_status = ts.http.get_cache_lookup_status()
>>   if cache_status == TS_LUA_CACHE_LOOKUP_HIT_STALE then
>>     ts.debug('stale hit')
>>     ts.http.set_cache_lookup_status(TS_LUA_CACHE_LOOKUP_HIT_FRESH)
>>   end
>>   return 0
>> end
>>
>> function do_global_read_request()
>>   ts.hook(TS_LUA_HOOK_CACHE_LOOKUP_COMPLETE, cache_lookup)
>>   return 0
>> end
>>
>> 3) You can also try to debug it by turning on debugging with the debug tag
>> "http.*" and seeing what kind of messages are generated in traffic.out
>> when TCP_REFRESH_HIT happens. It should give you some hints on why a
>> revalidate is needed.
>>
>> Thanks. Hopefully it helps.
>>
>> Kit
>>
>>
>>
>> On Thu, Nov 23, 2017 at 10:41 AM, Stephen Washburn
>> <step...@stephenwashburn.com> wrote:
>> > Ah… sorry about that.
>> >
>> > Stephen
>> >
>> > On Nov 23, 2017, at 10:22, Veiko Kukk <veiko.k...@gmail.com> wrote:
>> >
>> > Hi Stephen,
>> >
>> > As I wrote in my first post, I've set CONFIG
>> > proxy.config.http.cache.when_to_revalidate INT 3
>> >
>> > Veiko
>> >
>> >
>> > 2017-11-23 19:56 GMT+02:00 Stephen Washburn <
>> step...@stephenwashburn.com>:
>> >>
>> >> Apologies if I’m missing something, but doesn’t that page say that
>> there
>> >> is an option to have it treat freshness as such:
>> >>
>> >> Traffic Server considers all HTTP objects in the cache to be fresh:
>> >> Never revalidate HTTP objects in the cache with the origin server.
>> >>
>> >>
>> >> By modifying proxy.config.http.cache.when_to_revalidate
>> >>
>> >>
>> >> https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/
>> files/records.config.en.html#proxy-config-http-cache-when-to-revalidate
>> >>
>> >> Stephen
>> >>
>> >> On Nov 23, 2017, at 09:26, Veiko Kukk <veiko.k...@gmail.com> wrote:
>> >>
>> >> Now that I think about it, I might have set dest_domain to the wrong
>> >> value. The documentation is not that clear on that. If there are
>> >> x.y.z.tld and a.b.z.tld, then what has to be written to dest_domain to
>> >> capture both of those?
>> >> dest_domain=z.tld
>> >> or
>> >> dest_domain=*.*.z.tld
>> >>
>> >> Or something else?
>> >>
>> >> Veiko
>> >>
>> >>
>> >> 2017-11-23 19:20 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:
>> >>>
>> >>> Hi Alan,
>> >>>
>> >>> That is what I had already done in cache.config:
>>

Re: accepting large number of inbound TCP connections

2017-12-29 Thread Veiko Kukk
Hi Mateusz

When you run ab against your ATS to create a high enough artificial load, do
all requests succeed, and how quickly? Using a tool like Wireshark during
ab testing to dump and analyze TCP traffic could then give you a hint about
where exactly the failure happens.
To exclude any network bottlenecks, run ab locally on the server.
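
Something like this (a sketch; URL and numbers are made up, and note that
without -k, ab opens a new TCP connection per request, which matches your
no-keep-alive clients):

ab -n 10000 -c 500 http://127.0.0.1:8080/some/cached/file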

Veiko


2017-12-28 22:58 GMT+02:00 Mateusz Zajakala :

> Hi,
>
> I'm trying to optimize the throughput of ATS 6.2.0 running on 16G / 8
> cores server. ATS handles up to 7 Gbps of traffic (circa 500 requests /
> second) serving up to 80% of traffic from ram-disk-based cache.
>
> The problem I'm seeing is that from time to time my http clients can't
> connect to the server reasonably fast (I define this as 1 second to
> establish TCP conn). Unfortunately, http keep alive is not used by clients,
> so those 500 request / second are all made over new TCP connections.
> Clients connect, retrieve the file and disconnect. I do realize the
> overheads, but this is not something I can easily change (client-side)...
>
> I'm wondering what I can do to improve the performance and eliminate those
> failed connection attempts. Some ideas I have tried
> - 3 connection throttle in records.config (afaik this also sets the
> max no of open files for ATS)
> - tcp_fin_timeout is set to 1 - I'm not running out of ports because of
> sockets stuck in TIME_WAIT, I have checked. At any given time I have no
> more than 1k TCP connections open
>
> Unfortunately, I'm not sure where these incoming connections are
> dropped/stuck and I'm not sure which TCP stats would help understanding
> this. I have also not tweaked around default Centos 7 TCP settings as I
> don't feel competent enough.
>
> One thing that caught my attention is proxy.config.accept_threads value
> set to 1 (default). This seems really low given the traffic, but I read
> somewhere that it's best left at that. Can you please comment on that?
> Shouldn't this value be adjusted (e.g. 4 or more)? Or even move the accepts
> to worker threads?
>
> I'm not seeing any meaningful errors in ATS logs, but there are no debug
> tags enabled. Any suggestion on how to debug / improve is much appreciated.
>
> Thanks
> Mateusz
>
>
>


Re: How to purge all cached negative responses

2017-12-21 Thread Veiko Kukk
Thank you for your answer.

Sure I can find those failures from logs, but it's no good because, well,
they have already failed for the client by then.
I've read about cache inspector, but it does not seem to be able to filter
based on HTTP status code.
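
The script I have in mind would be roughly this (a sketch; it assumes
squid-format logs where field 4 is the code/status pair and field 7 the URL,
and that PURGE is allowed from localhost in ip_allow.config):

awk '$4 ~ /\/4[0-9][0-9]$/ { print $7 }' access.log | sort -u | \
  while read u; do curl -s -o /dev/null -w "%{http_code} $u\n" -X PURGE "$u"; done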

Veiko


2017-12-21 14:21 GMT+02:00 David Carlin <dcar...@oath.com>:

> Can you grab the list of objects from the log files? That's the only thing
> I can think of.
>
> The cache inspector exists, but I've never had any luck with it.  I think
> our cache is too large for it:
>
> https://docs.trafficserver.apache.org/en/latest/admin-
> guide/storage/index.en.html#inspecting-the-cache
>
> David
>
> On Thu, Dec 21, 2017 at 5:40 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
>> Hi,
>>
>> We had a configuration mistake that enforced Cache-Control:
>> max-age=157784630 also on negative responses, which then got cached.
>>
>> Now, after fixing the config, we need to purge all those objects from the
>> cache. It would even be good enough if I could get a list of objects with a
>> certain HTTP status code; then I could write a script that purges those objects one by one.
>>
>> How can I do a mass purge based on HTTP status code, or just get a list of
>> objects based on HTTP status code?
>>
>>
>> Veiko
>>
>>
>


How to purge all cached negative responses

2017-12-21 Thread Veiko Kukk
Hi,

We had a configuration mistake that enforced Cache-Control: max-age=157784630
also on negative responses, which then got cached.

Now, after fixing the config, we need to purge all those objects from the cache.
It would even be good enough if I could get a list of objects with a certain
HTTP status code; then I could write a script that purges those objects one by one.

How can I do a mass purge based on HTTP status code, or just get a list of
objects based on HTTP status code?


Veiko


Re: Exclude some URL-s from statistics and logging

2017-12-20 Thread Veiko Kukk
Hi,

I wonder if stats-over-http requests are included in the HIT/MISS statistics?

Veiko


2017-12-06 14:05 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Thank you. Actually, the log filter is nice, but most important would be
> a statistics filter.
>
> Veiko
>
> 2017-12-06 3:08 GMT+02:00 Shu Kit Chan <chanshu...@gmail.com>:
>
>> For log, the log filter can be used to separate logs for a particular
>> domain into separate files
>> https://docs.trafficserver.apache.org/en/latest/admin-guide/
>> files/logging.config.en.html
>>
>> Kit
>>
>> On Tue, Dec 5, 2017 at 4:35 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>> > Hi everyone,
>> >
>> > ATS 7.1.1. I'd like to exclude some URLs from cache statistics because
>> > they are for monitoring purposes only. These requests run relatively
>> > frequently, and ATS is configured to never cache them.
>> > I'd like to separate them from statistics and access logs to avoid
>> > distortion of the real content statistics.
>> >
>> > Is it possible to:
>> > * Exclude certain URLs from the internal statistics provided by
>> > https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/monitoring/statistics/accessing.en.html#stats-over-http
>> > * Log some URLs to a separate logfile.
>> >
>> > Veiko
>> >
>>
>
>


Re: Documentation page is broken

2017-12-20 Thread Veiko Kukk
Hi again,

Could somebody revert that change? It's really annoying to have to read the
documentation through the page source.

Veiko


2017-12-19 11:26 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> It worked yesterday; today I can't access parts of it. For example
> https://docs.trafficserver.apache.org/en/7.1.x/admin-
> guide/files/records.config.en.html#ram-cache
>
> Veiko
>
>
>


Documentation page is broken

2017-12-19 Thread Veiko Kukk
Hi,

It worked yesterday; today I can't access parts of it. For example
https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/files/records.config.en.html#ram-cache

Veiko


Re: Understanding ATS memory usage

2017-12-19 Thread Veiko Kukk
Hi,

Does nobody really know how ATS uses memory?

Veiko

2017-12-12 14:44 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> I'm confused about ATS memory configuration. I have a server with CentOS
> 7, ATS 7.1.1, 64GB memory and ~ 10TB disk.
> traffic_server process takes ~ 23GB memory with the configuration option
> (8GB)
> CONFIG proxy.config.cache.ram_cache.size INT 8589934592
> ATS is using raw partition on HDD.
>
> * Why does it swap when there is page cache that's basically free memory
> that could be used before swapping? vm.swappiness is 10; I had it set to 0
> too, and then the system does not swap.
> * Considering ATS is using O_DIRECT with raw partitions and its own
> memory management for the disk cache, would that mean that ATS is not using
> the kernel page cache at all?
> * Would ATS benefit from a larger RAM cache, considering it has its own
> disk buffer management?
>
> Also, strangest of all, there are frequent memory usage drops of the
> traffic_server process. After around 50 minutes, 10GB of memory is released
> and immediately consumed again. Attaching screenshot.
>
> Regards,
> Veiko
>
>


Re: Garbled log date stamp

2017-12-18 Thread Veiko Kukk
Hi Leif,

Thank you for the answer. Good to know, it's known bug.

Veiko


2017-12-18 5:35 GMT+02:00 Leif Hedstrom <zw...@apache.org>:

> Some broken log tags were fixed fairly recently, which is also going into
> 7.1.2. Look at https://github.com/apache/trafficserver/pull/2943 and see
> if this is your issue?
>
> — Leif
>
>
> On Dec 11, 2017, at 7:08 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
> Hi
>
> ATS 7.1.1.
> I'm trying to get human readable timestamps with ascii log formats.
>
> -- Squid Log Format.
> squid = format {
>   Format = '%<cqtn> %<ttms> %<chi> %<crc>/%<pssc> %<psql> %<cqhm> %<cquc> %<caun> %<phr>/%<shn> %<psct>'
> }
>
> test_url_include = filter.accept('cquc CONTAIN /testurl')
> test_url_exclude = filter.reject('cquc CONTAIN /testurl')
>
> -- Log only  test requests
> log.ascii {
> Filename = 'test_url',
> Format = squid,
> Filters = { test_url_include }
> }
>
> -- Normal client usage, excluded test url
> log.ascii {
> Filename = 'access',
> Format = squid,
> Filters = { test_url_exclude }
> }
>
> # file test_url.log
> test_url.log: data
>
> When trying to open with less, less says file is binary.
>  less test_url.log
> "test_url.log" may be a binary file.  See it anyway?
>
> With timestamp in cqtq this does not happen. Why?
>
>
>
>


Re: Cache HIT/MISS header

2017-12-14 Thread Veiko Kukk
2017-12-13 21:09 GMT+02:00 James Peach :

>
>
> I'd use the `xdebug` plugin, see
> <https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/plugins/xdebug.en.html>. If you want the
> `X-Cache` header in every response, then you can use `header_rewrite` to
> inject the appropriate `X-Debug` header on every request.
>
>
Why would you rewrite headers twice and include the unnecessary xdebug plugin
to achieve what can be done in a much simpler way? This does not seem like
the optimal way.

Veiko


Re: Cache HIT/MISS header

2017-12-13 Thread Veiko Kukk
If you also need MISS (our setup does not; we know the number of requests and
the number of HITs), you need to add another rule ending in set-header X-Cache
"MISS", as sketched below.

-- 
Veiko


2017-12-12 14:10 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> I recently had the exact same task: to include cache status in response
> headers. Here's what I did:
>
> * proxy.config.http.insert_response_via_str 2
> * Using header_rewrite plugin to create additional header with following
> config:
> cond %{SEND_RESPONSE_HDR_HOOK} [AND]
> cond %{HEADER:Via} /(\[cH|\[cR)/
> set-header X-Cache "HIT"
>
> Veiko
>
>
> 2017-12-12 2:29 GMT+02:00 Igor Cicimov <ig...@encompasscorporation.com>:
>
>> You can use the Via header:
>> ##########################################################################
>> # Via: headers. Docs:
>> # https://docs.trafficserver.apache.org/records.config#proxy-config-http-insert-response-via-str
>> ##########################################################################
>> ##
>> CONFIG proxy.config.http.insert_request_via_str INT 1
>> CONFIG proxy.config.http.insert_response_via_str INT 3
>> CONFIG proxy.config.http.response_via_str STRING ATS
>>
>> that will insert values like below that you can decode:
>>
>> # traffic_via  '[cHs f ]'
>> Via header is [cHs f ], Length is 8
>> Via Header Details:
>> *Result of Traffic Server cache lookup for URL  :in cache, fresh
>> (a cache "HIT")*
>> Response information received from origin server   :no server
>> connection needed
>> Result of document write-to-cache: :no cache write
>> performed
>>
>> for detailed stats (insert_response_via_str INT 3):
>>
>> # traffic_via 'uScHs f p eN:t cCHi p s '
>> Via header is uScHs f p eN:t cCHi p s , Length is 24
>> Via Header Details:
>> Request headers received from client   :simple request
>> (not conditional)
>> *Result of Traffic Server cache lookup for URL  :in cache, fresh
>> (a cache "HIT")*
>> Response information received from origin server   :no server
>> connection needed
>> Result of document write-to-cache: :no cache write
>> performed
>> Proxy operation result :unknown
>> Error codes (if any)   :no error
>> Tunnel info:no tunneling
>> Cache Type :cache
>> *Cache Lookup Result:cache hit*
>> ICP status :no icp
>> Parent proxy connection status :no parent proxy
>> or unknown
>> Origin server connection status:no server
>> connection needed
>>
>> but you might be already familiar with it and not exactly what you need.
>>
>>
>> On Tue, Dec 12, 2017 at 11:11 AM, Miles Libbey <mlib...@apache.org>
>> wrote:
>>
>>> Perhaps use the X-Debug header:
>>> https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/plugins/xdebug.en.html
>>> and maybe a global header_rewrite rule to add the magic header to make
>>> the debug part appear?
>>>
>>> On Mon, Dec 11, 2017 at 8:57 AM, Benjamin Morel
>>> <benjamin.mo...@gmail.com> wrote:
>>> > Sorry if this has been asked before, but I couldn't find it in the
>>> docs.
>>> >
>>> > I'm using ATS as a forward proxy. Is there a way to add a response
>>> header to
>>> > tell me if the request was a HIT or a MISS?
>>> >
>>> > Something like: X-Cache: HIT
>>> >
>>> > Thanks in advance,
>>> > Benjamin
>>>
>>
>>
>>
>


Understanding ATS memory usage

2017-12-12 Thread Veiko Kukk
Hi,

I'm confused about ATS memory configuration. I have a server with CentOS 7,
ATS 7.1.1, 64GB memory and ~ 10TB disk.
traffic_server process takes ~ 23GB memory with the configuration option
(8GB)
CONFIG proxy.config.cache.ram_cache.size INT 8589934592
ATS is using raw partition on HDD.

* Why does it swap when there is page cache that's basically free memory
that could be used before swapping? vm.swappiness is 10; I had it set to 0
too, and then the system does not swap.
* Considering ATS is using O_DIRECT with raw partitions and its own memory
management for the disk cache, would that mean that ATS is not using the
kernel page cache at all?
* Would ATS benefit from a larger RAM cache, considering it has its own
disk buffer management?

Also, strangest of all, there are frequent memory usage drops of the
traffic_server process. After around 50 minutes, 10GB of memory is released
and immediately consumed again. Attaching screenshot.

Regards,
Veiko


Re: Cache HIT/MISS header

2017-12-12 Thread Veiko Kukk
Hi,

I recently had the exact same task: to include cache status in response
headers. Here's what I did:

* proxy.config.http.insert_response_via_str 2
* Using header_rewrite plugin to create additional header with following
config:
cond %{SEND_RESPONSE_HDR_HOOK} [AND]
cond %{HEADER:Via} /(\[cH|\[cR)/
set-header X-Cache "HIT"

Veiko


2017-12-12 2:29 GMT+02:00 Igor Cicimov :

> You can use the Via header:
>
> ##########################################################################
> # Via: headers. Docs:
> # https://docs.trafficserver.apache.org/records.config#proxy-config-http-insert-response-via-str
> ##########################################################################
> CONFIG proxy.config.http.insert_request_via_str INT 1
> CONFIG proxy.config.http.insert_response_via_str INT 3
> CONFIG proxy.config.http.response_via_str STRING ATS
>
> that will insert values like below that you can decode:
>
> # traffic_via  '[cHs f ]'
> Via header is [cHs f ], Length is 8
> Via Header Details:
> *Result of Traffic Server cache lookup for URL  :in cache, fresh
> (a cache "HIT")*
> Response information received from origin server   :no server
> connection needed
> Result of document write-to-cache: :no cache write
> performed
>
> for detailed stats (insert_response_via_str INT 3):
>
> # traffic_via 'uScHs f p eN:t cCHi p s '
> Via header is uScHs f p eN:t cCHi p s , Length is 24
> Via Header Details:
> Request headers received from client   :simple request
> (not conditional)
> *Result of Traffic Server cache lookup for URL  :in cache, fresh
> (a cache "HIT")*
> Response information received from origin server   :no server
> connection needed
> Result of document write-to-cache: :no cache write
> performed
> Proxy operation result :unknown
> Error codes (if any)   :no error
> Tunnel info:no tunneling
> Cache Type :cache
> *Cache Lookup Result:cache hit*
> ICP status :no icp
> Parent proxy connection status :no parent proxy or
> unknown
> Origin server connection status:no server
> connection needed
>
> but you might be already familiar with it and not exactly what you need.
>
>
> On Tue, Dec 12, 2017 at 11:11 AM, Miles Libbey  wrote:
>
>> Perhaps use the X-Debug header:
>> https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/plugins/xdebug.en.html
>> and maybe a global header_rewrite rule to add the magic header to make
>> the debug part appear?
>>
>> On Mon, Dec 11, 2017 at 8:57 AM, Benjamin Morel
>>  wrote:
>> > Sorry if this has been asked before, but I couldn't find it in the docs.
>> >
>> > I'm using ATS as a forward proxy. Is there a way to add a response
>> header to
>> > tell me if the request was a HIT or a MISS?
>> >
>> > Something like: X-Cache: HIT
>> >
>> > Thanks in advance,
>> > Benjamin
>>
>
>
>


Garbled log date stamp

2017-12-11 Thread Veiko Kukk
Hi

ATS 7.1.1.
I'm trying to get human readable timestamps with ascii log formats.

-- Squid Log Format.
squid = format {
  Format = '%<cqtn> %<ttms> %<chi> %<crc>/%<pssc> %<psql> %<cqhm> %<cquc> %<caun> %<phr>/%<shn> %<psct>'
}

test_url_include = filter.accept('cquc CONTAIN /testurl')
test_url_exclude = filter.reject('cquc CONTAIN /testurl')

-- Log only  test requests
log.ascii {
Filename = 'test_url',
Format = squid,
Filters = { test_url_include }
}

-- Normal client usage, excluded test url
log.ascii {
Filename = 'access',
Format = squid,
Filters = { test_url_exclude }
}

# file test_url.log
test_url.log: data
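
One way to see which field emits the raw bytes (sketch):

head -3 test_url.log | cat -v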

When trying to open with less, less says file is binary.
 less test_url.log
"test_url.log" may be a binary file.  See it anyway?

With timestamp in cqtq this does not happen. Why?


Re: Exclude some URL-s from statistics and logging

2017-12-06 Thread Veiko Kukk
Thank you. Actually, the log filter is nice, but most important would be
a statistics filter.

Veiko

2017-12-06 3:08 GMT+02:00 Shu Kit Chan <chanshu...@gmail.com>:

> For log, the log filter can be used to separate logs for a particular
> domain into separate files
> https://docs.trafficserver.apache.org/en/latest/admin-
> guide/files/logging.config.en.html
>
> Kit
>
> On Tue, Dec 5, 2017 at 4:35 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
> > Hi everyone,
> >
> > ATS 7.1.1. I'd like to exclude some URLs from cache statistics because
> > they are for monitoring purposes only. These requests run relatively
> > frequently, and ATS is configured to never cache them.
> > I'd like to separate them from statistics and access logs to avoid
> > distortion of the real content statistics.
> >
> > Is it possible to:
> > * Exclude certain URLs from the internal statistics provided by
> > https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/monitoring/statistics/accessing.en.html#stats-over-http
> > * Log some URLs to a separate logfile.
> >
> > Veiko
> >
>


Exclude some URL-s from statistics and logging

2017-12-05 Thread Veiko Kukk
Hi everyone,

ATS 7.1.1. I'd like to exclude some URLs from cache statistics because
they are for monitoring purposes only. These requests run relatively
frequently, and ATS is configured to never cache them.
I'd like to separate them from statistics and access logs to avoid
distortion of the real content statistics.

Is it possible to:
* Exclude certain URLs from the internal statistics provided by
https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/monitoring/statistics/accessing.en.html#stats-over-http
* Log some URLs to a separate logfile (see the sketch below)?
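
For the log half, what I have in mind is roughly this logging.config sketch
(the /monitoring path is hypothetical, and 'squid' is a format defined
elsewhere in the file):

monitoring_only = filter.accept('cquc CONTAIN /monitoring')

log.ascii {
  Filename = 'monitoring',
  Format = squid,
  Filters = { monitoring_only }
}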

Veiko


Re: Avoiding TCP_REFRESH_HIT

2017-11-23 Thread Veiko Kukk
Hi Stephen,

As I wrote in my first post, I've set
CONFIG proxy.config.http.cache.when_to_revalidate INT 3

Veiko


2017-11-23 19:56 GMT+02:00 Stephen Washburn <step...@stephenwashburn.com>:

> Apologies if I’m missing something, but doesn’t that page say that there
> is an option to have it treat freshness as such:
>
> *Traffic Server considers all HTTP objects in the cache to be fresh:*
> Never revalidate HTTP objects in the cache with the origin server.
>
>
> By modifying proxy.config.http.cache.when_to_revalidate
>
> https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/files/records.config.en.html#proxy-config-http-cache-when-to-revalidate
>
> Stephen
>
> On Nov 23, 2017, at 09:26, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
> Now that I think about it, I might have set dest_domain to the wrong value.
> The documentation is not that clear on that. If there are x.y.z.tld and
> a.b.z.tld, then what has to be written to dest_domain to capture both of
> those?
> dest_domain=z.tld
> or
> dest_domain=*.*.z.tld
>
> Or something else?
>
> Veiko
>
>
> 2017-11-23 19:20 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:
>
>> Hi Alan,
>>
>> That is what I had already done in cache.config:
>> dest_domain=.*.source.tld ttl-in-cache=d
>>
>> Of course, source.tld is actually a real domain, and this did not avoid
>> checking the origin for object freshness; the object was still considered stale by ATS.
>>
>> https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/
>> configuration/cache-basics.en.html#ensuring-cached-object-freshness
>> describes that if Expires is present, it is used to calculate object
>> freshness, and there is no way to ignore it. I've now configured ATS to
>> remove the Expires header and set Cache-Control: max-age=157784630 with the
>> header rewrite plugin and cond %{READ_RESPONSE_HDR_HOOK}.
>> Will see if that helps.
>>
>> Veiko
>>
>>
>> 2017-11-23 18:38 GMT+02:00 Alan Carroll <solidwallofc...@oath.com>:
>>
>>> You might try fiddling with the 'cache.config' file and set a cache TTL
>>> of 10 years or so.
>>>
>>> On Thu, Nov 23, 2017 at 10:11 AM, Veiko Kukk <veiko.k...@gmail.com>
>>> wrote:
>>>
>>>> Hi David,
>>>>
>>>> Objects are not fetched from ATS via a browser; ATS is just an internal
>>>> cache. The only problem is to trick ATS into believing that an object is
>>>> always fresh, never stale.
>>>> I wonder if modifying headers before ATS (READ_RESPONSE_HDR_HOOK),
>>>> removing or changing Expires and/or adding max-age with some very big
>>>> value, might be the right way to go for me.
>>>>
>>>> Veiko
>>>>
>>>>
>>>> 2017-11-23 17:52 GMT+02:00 David Carlin <dcar...@oath.com>:
>>>>
>>>>> Have you considered adding "Cache-Control: Immutable" to these objects
>>>>> which will never require re-validation?  This will prevent the browser 
>>>>> from
>>>>> attempting an If-Modified-Since request.
>>>>>
>>>>> https://hacks.mozilla.org/2017/01/using-immutable-caching-to-speed-up-the-web/
>>>>>
>>>>> David
>>>>>
>>>>> On Thu, Nov 23, 2017 at 10:07 AM, Veiko Kukk <veiko.k...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> In addition to my previous e-mail, headers that are provided by
>>>>>> source to ATS:
>>>>>>
>>>>>> < HTTP/1.1 200 OK
>>>>>> < Content-Length: 1185954
>>>>>> < Accept-Ranges: bytes
>>>>>> < Last-Modified: Mon, 02 Nov 2015 17:56:12 GMT
>>>>>> < Etag: 92ef40097ba87bdf09efcf7e1cefd32a
>>>>>> < X-Timestamp: 1446486971.39466
>>>>>> < Content-Type: application/octet-stream
>>>>>> < Content-Disposition: attachment; 
>>>>>> filename="ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s";
>>>>>> filename*=UTF-8''ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s
>>>>>> < Expires: Thu, 23 Nov 2017 15:27:30 GMT
>>>>>> < X-Trans-Id: tx3a0af5473d5c41d38195c-005a16e30d
>>>>>> < X-Openstack-Request-Id: tx3a0af5473d5c41d38195c-005a16e30d
>>>>>> < Date: Thu, 23 Nov 2017 15:02:37 GMT
>>>>>> < X-IPLB-Instance: 12631
>>>>>>
>>>>>> I assume the Expires header is to blame here and must be

Re: Avoiding TCP_REFRESH_HIT

2017-11-23 Thread Veiko Kukk
Hi Alan,

That is what I had already done in cache.config:
dest_domain=.*.source.tld ttl-in-cache=d

Of course, source.tld is actually a real domain, and this did not avoid
checking the origin for object freshness; the object was still considered stale by ATS.

https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/configuration/cache-basics.en.html#ensuring-cached-object-freshness
describes that if Expires is present, it is used to calculate object
freshness, and there is no way to ignore it. I've now configured ATS to
remove the Expires header and set Cache-Control: max-age=157784630 with the
header rewrite plugin and cond %{READ_RESPONSE_HDR_HOOK} (rule below).
Will see if that helps.
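
Concretely, the rule now in header_rewrite.config:

cond %{READ_RESPONSE_HDR_HOOK}
rm-header Expires
set-header Cache-Control "max-age=157784630"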

Veiko


2017-11-23 18:38 GMT+02:00 Alan Carroll <solidwallofc...@oath.com>:

> You might try fiddling with the 'cache.config' file and set a cache TTL of
> 10 years or so.
>
> On Thu, Nov 23, 2017 at 10:11 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
>> Hi David,
>>
>> Objects are not fetched from ATS via a browser; ATS is just an internal
>> cache. The only problem is to trick ATS into believing that an object is
>> always fresh, never stale.
>> I wonder if modifying headers before ATS (READ_RESPONSE_HDR_HOOK),
>> removing or changing Expires and/or adding max-age with some very big
>> value, might be the right way to go for me.
>>
>> Veiko
>>
>>
>> 2017-11-23 17:52 GMT+02:00 David Carlin <dcar...@oath.com>:
>>
>>> Have you considered adding "Cache-Control: Immutable" to these objects
>>> which will never require re-validation?  This will prevent the browser from
>>> attempting an If-Modified-Since request.
>>>
>>> https://hacks.mozilla.org/2017/01/using-immutable-caching-to-speed-up-the-web/
>>>
>>> David
>>>
>>> On Thu, Nov 23, 2017 at 10:07 AM, Veiko Kukk <veiko.k...@gmail.com>
>>> wrote:
>>>
>>>> In addition to my previous e-mail, headers that are provided by source
>>>> to ATS:
>>>>
>>>> < HTTP/1.1 200 OK
>>>> < Content-Length: 1185954
>>>> < Accept-Ranges: bytes
>>>> < Last-Modified: Mon, 02 Nov 2015 17:56:12 GMT
>>>> < Etag: 92ef40097ba87bdf09efcf7e1cefd32a
>>>> < X-Timestamp: 1446486971.39466
>>>> < Content-Type: application/octet-stream
>>>> < Content-Disposition: attachment; 
>>>> filename="ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s";
>>>> filename*=UTF-8''ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s
>>>> < Expires: Thu, 23 Nov 2017 15:27:30 GMT
>>>> < X-Trans-Id: tx3a0af5473d5c41d38195c-005a16e30d
>>>> < X-Openstack-Request-Id: tx3a0af5473d5c41d38195c-005a16e30d
>>>> < Date: Thu, 23 Nov 2017 15:02:37 GMT
>>>> < X-IPLB-Instance: 12631
>>>>
>>>> I assume the Expires header is to blame here and must be overridden in
>>>> ATS config, but how? I don't have control over the source; it's OpenStack
>>>> Swift object storage.
>>>>
>>>> Veiko
>>>>
>>>>
>>>> 2017-11-23 16:35 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> Could ATS in reverse proxy mode be configured in such a way that it
>>>>> would never try to revalidate from the source? It is known that in our
>>>>> case an object never changes (and is never refetched from the source),
>>>>> and it is desirable to avoid any source validation. Revalidation adds
>>>>> significant overhead and we need to avoid it. A response to the client with
>>>>> TCP_REFRESH_HIT takes 100-200ms instead of 0-10ms for a direct local TCP_HIT.
>>>>>
>>>>> I've configured following:
>>>>> dest_domain=.*.source.tld action=ignore-no-cache
>>>>> dest_domain=.*.source.tld revalidate=d
>>>>> dest_domain=.*.source.tld ttl-in-cache=d
>>>>>
>>>>> CONFIG proxy.config.http.cache.when_to_revalidate INT 3
>>>>> CONFIG proxy.config.http.cache.required_headers INT 0
>>>>>
>>>>> But I still get TCP_REFRESH_HIT even when  days have not passed
>>>>> (obviously).
>>>>>
>>>>> NB! ATS is used as an internal cache and our 'client' never explicitly
>>>>> requests revalidation.
>>>>>
>>>>> Thanks,
>>>>> Veiko
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Avoiding TCP_REFRESH_HIT

2017-11-23 Thread Veiko Kukk
Hi David,

Objects are not fetched from ATS via a browser; ATS is just an internal cache.
The only problem is to trick ATS into believing that an object is always
fresh, never stale.
I wonder if modifying headers before ATS (READ_RESPONSE_HDR_HOOK), removing
or changing Expires and/or adding max-age with some very big value, might be
the right way to go for me.

Veiko


2017-11-23 17:52 GMT+02:00 David Carlin <dcar...@oath.com>:

> Have you considered adding "Cache-Control: Immutable" to these objects
> which will never require re-validation?  This will prevent the browser from
> attempting an If-Modified-Since request.
>
> https://hacks.mozilla.org/2017/01/using-immutable-caching-to-speed-up-the-web/
>
> David
>
> On Thu, Nov 23, 2017 at 10:07 AM, Veiko Kukk <veiko.k...@gmail.com> wrote:
>
>> In addition to my previous e-mail, headers that are provided by source to
>> ATS:
>>
>> < HTTP/1.1 200 OK
>> < Content-Length: 1185954
>> < Accept-Ranges: bytes
>> < Last-Modified: Mon, 02 Nov 2015 17:56:12 GMT
>> < Etag: 92ef40097ba87bdf09efcf7e1cefd32a
>> < X-Timestamp: 1446486971.39466
>> < Content-Type: application/octet-stream
>> < Content-Disposition: attachment; 
>> filename="ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s";
>> filename*=UTF-8''ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s
>> < Expires: Thu, 23 Nov 2017 15:27:30 GMT
>> < X-Trans-Id: tx3a0af5473d5c41d38195c-005a16e30d
>> < X-Openstack-Request-Id: tx3a0af5473d5c41d38195c-005a16e30d
>> < Date: Thu, 23 Nov 2017 15:02:37 GMT
>> < X-IPLB-Instance: 12631
>>
>> I assume the Expires header is to blame here and must be overridden in ATS
>> config, but how? I don't have control over the source; it's OpenStack Swift
>> object storage.
>>
>> Veiko
>>
>>
>> 2017-11-23 16:35 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:
>>
>>> Hi,
>>>
>>> Could ATS in reverse proxy mode be configured in such a way that it would
>>> never try to revalidate from the source? It is known that in our case an
>>> object never changes (and is never refetched from the source), and it is
>>> desirable to avoid any source validation. Revalidation adds significant
>>> overhead and we need to avoid it. A response to the client with
>>> TCP_REFRESH_HIT takes 100-200ms instead of 0-10ms for a direct local TCP_HIT.
>>>
>>> I've configured following:
>>> dest_domain=.*.source.tld action=ignore-no-cache
>>> dest_domain=.*.source.tld revalidate=d
>>> dest_domain=.*.source.tld ttl-in-cache=d
>>>
>>> CONFIG proxy.config.http.cache.when_to_revalidate INT 3
>>> CONFIG proxy.config.http.cache.required_headers INT 0
>>>
>>> But I still get TCP_REFRESH_HIT even when  days have not passed
>>> (obviously).
>>>
>>> NB! ATS is used as an internal cache and our 'client' never explicitly
>>> requests revalidation.
>>>
>>> Thanks,
>>> Veiko
>>>
>>>
>>
>


Re: Avoiding TCP_REFRESH_HIT

2017-11-23 Thread Veiko Kukk
In addition to my previous e-mail, here are the headers provided by the
source to ATS:

< HTTP/1.1 200 OK
< Content-Length: 1185954
< Accept-Ranges: bytes
< Last-Modified: Mon, 02 Nov 2015 17:56:12 GMT
< Etag: 92ef40097ba87bdf09efcf7e1cefd32a
< X-Timestamp: 1446486971.39466
< Content-Type: application/octet-stream
< Content-Disposition: attachment;
filename="ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s";
filename*=UTF-8''ABIYohNyPrJNjvFsAdgN5wc8D-8Yo4ZO.m4s
< Expires: Thu, 23 Nov 2017 15:27:30 GMT
< X-Trans-Id: tx3a0af5473d5c41d38195c-005a16e30d
< X-Openstack-Request-Id: tx3a0af5473d5c41d38195c-005a16e30d
< Date: Thu, 23 Nov 2017 15:02:37 GMT
< X-IPLB-Instance: 12631

I assume the Expires header is to blame here and must be overridden in ATS
config, but how? I don't have control over the source; it's OpenStack Swift
object storage.

Veiko


2017-11-23 16:35 GMT+02:00 Veiko Kukk <veiko.k...@gmail.com>:

> Hi,
>
> Could ATS in reverse proxy mode be configured in such a way that it would
> never try to revalidate from the source? It is known that in our case an
> object never changes (and is never refetched from the source), and it is
> desirable to avoid any source validation. Revalidation adds significant
> overhead and we need to avoid it. A response to the client with
> TCP_REFRESH_HIT takes 100-200ms instead of 0-10ms for a direct local TCP_HIT.
>
> I've configured following:
> dest_domain=.*.source.tld action=ignore-no-cache
> dest_domain=.*.source.tld revalidate=d
> dest_domain=.*.source.tld ttl-in-cache=d
>
> CONFIG proxy.config.http.cache.when_to_revalidate INT 3
> CONFIG proxy.config.http.cache.required_headers INT 0
>
> But I still get TCP_REFRESH_HIT even when  days have not passed
> (obviously).
>
> NB! ATS is used as an internal cache and our 'client' never explicitly
> requests revalidation.
>
> Thanks,
> Veiko
>
>


Avoiding TCP_REFRESH_HIT

2017-11-23 Thread Veiko Kukk
Hi,

Could ATS in reverse proxy mode be configured in such a way that it would
never try to revalidate from the source? It is known that in our case an
object never changes (and is never refetched from the source), and it is
desirable to avoid any source validation. Revalidation adds significant
overhead and we need to avoid it. A response to the client with
TCP_REFRESH_HIT takes 100-200ms instead of 0-10ms for a direct local TCP_HIT.

I've configured following:
dest_domain=.*.source.tld action=ignore-no-cache
dest_domain=.*.source.tld revalidate=d
dest_domain=.*.source.tld ttl-in-cache=d

CONFIG proxy.config.http.cache.when_to_revalidate INT 3
CONFIG proxy.config.http.cache.required_headers INT 0

But I still get TCP_REFRESH_HIT even when  days have not passed
(obviously).

NB! ATS is used as an internal cache and our 'client' never explicitly
requests revalidation.

Thanks,
Veiko


Cache sharing between siblings

2017-11-15 Thread Veiko Kukk
Hi,

I'm new to ATS; so far we've been using Squid, but due to its instability we
are moving away.

I'm interested in avoiding regional cache duplication in a CDN system.
Currently, we have several Squid servers working as siblings, only pulling
from the origin if none of them has the object in cache; if a sibling has the
object in cache, it's pulled from the sibling but not stored locally. Only
new content fetched from the origin is stored locally. Sibling lookup is done
via ICP and has been working quite well so far.

Is a similar setup possible with ATS?
I found no information in the documentation on how to configure siblings
(only parent configuration is documented). Could you point me in the right direction?

Regards,
Veiko