[ https://issues.apache.org/jira/browse/TS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheer Vinukonda updated TS-3549:
----------------------------------
    Description: 
When ATS is used as a delivery server for a live video streaming event, it's 
possible for a huge number of concurrent requests to arrive for the same 
object. Depending on the type of object being requested, the cache lookup for 
those objects can result in either a stale copy of the object (e.g. manifest 
files) or a complete cache miss (e.g. segment files). ATS currently supports 
different types of connection collapsing (e.g. the *read-while-writer* 
functionality - 
*https://docs.trafficserver.apache.org/en/latest/admin/http-proxy-caching.en.html#read-while-writer*, 
stale-while-revalidate (swr), etc.), but for *read-while-writer* (rww) to kick 
in, ATS requires that the complete response headers for the object be received 
and validated. In other words, until that happens, any number of incoming 
requests for the same object that result in a cache miss or a stale cache hit 
are forwarded to the origin. For a scenario such as a live event, this leaves 
a significant window during which hundreds of requests may be forwarded to the 
origin for the same object. It has been observed in production that this 
results in a significant increase in latency for the objects waiting in the 
read-while-writer state. 
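
For context, per the linked documentation, read-while-writer must itself be 
enabled through a set of records.config settings before any collapsing happens 
at all. A minimal sketch, assuming the settings as documented for this ATS 
version (verify against your release):

{code}
# Sketch based on the read-while-writer docs; verify for your release.
# All four settings are required for rww to take effect.
CONFIG proxy.config.cache.enable_read_while_writer INT 1
CONFIG proxy.config.http.background_fill_active_timeout INT 0
CONFIG proxy.config.http.background_fill_completed_threshold FLOAT 0.000000
CONFIG proxy.config.cache.max_doc_size INT 0
{code}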

Note that there are also a couple of settings, 
*proxy.config.http.cache.open_read_retry_time* and 
*proxy.config.http.cache.max_open_read_retries* 
(*https://docs.trafficserver.apache.org/en/latest/admin/http-proxy-caching.en.html#open-read-retry-timeout*), 
that can alleviate the thundering herd to some extent by retrying to get the 
read lock for the object as configured. With these configured, ATS retries the 
read lock for as long as configured, and if the lock is still not available 
because the write lock is held by the first request that was forwarded to the 
origin (e.g. its response headers have not been received yet), then all the 
waiting requests are simply forwarded to the origin (with caching disabled for 
each of them). 
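
For illustration, a records.config sketch of these two knobs (the values below 
are arbitrary examples, not recommendations): each waiting request retries the 
read lock up to the configured count, sleeping the configured number of 
milliseconds between attempts, before falling back to the origin:

{code}
# Example values only; tune for your traffic.
# Retry the read lock up to 5 times, 10 ms apart (~50 ms max wait),
# then give up and forward the request to the origin.
CONFIG proxy.config.http.cache.max_open_read_retries INT 5
CONFIG proxy.config.http.cache.open_read_retry_time INT 10
{code}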

It is almost impossible to tune the above settings so that they help in all 
possible situations (traffic volume, concurrent connections, network 
conditions, etc.). For this reason, a configurable workaround is proposed 
below that avoids the thundering herd completely. The patch below is mainly 
from [~jlaue] and [~psudaemon], with some additional clean-up, configuration 
control, debug headers, etc.
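
Assuming, for illustration, that the attached patch exposes the new behavior 
through a single records.config toggle (the actual setting name and value 
semantics are defined in TS-3549.diff, so treat this strictly as a sketch), 
enabling it might look like:

{code}
# Hypothetical sketch: the real name and values come from TS-3549.diff.
# A non-zero value would enable the write-lock-fail handling described
# below: serve stale on a refresh miss, return 502 on a complete miss.
CONFIG proxy.config.http.cache.open_write_fail_action INT 1
{code}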

Basically, when configured, on failing to obtain the write lock for an object 
(which means there's another ongoing parallel request for the same object that 
was already forwarded to the origin), a cache refresh miss is served a stale 
copy of the object, while a complete cache miss returns a *502* error to let 
the client (e.g. a player) reattempt. The *502* error also includes a special 
internal ATS header named {{@ats-internal-messages}} with an appropriate value 
to allow custom logging, or to let plugins take appropriate action (e.g. 
prevent a fail-over if there's a plugin that fails over on a regular 502 
error).
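
To illustrate that last point, here is a minimal C plugin sketch against the 
ATS 6.x plugin API (the plugin name {{herd-guard}} and the fail-over scenario 
are hypothetical; only the {{@ats-internal-messages}} header name comes from 
this patch) that inspects outgoing client responses and distinguishes these 
synthetic 502s from origin-generated ones:

{code}
/* herd-guard: hypothetical sketch. Detects the synthetic 502s produced
 * by the write-lock-fail path via the @ats-internal-messages header, so
 * fail-over logic can skip them. */
#include <ts/ts.h>

static int
check_synthetic_502(TSCont contp, TSEvent event, void *edata)
{
  TSHttpTxn txnp = (TSHttpTxn)edata;
  TSMBuffer bufp;
  TSMLoc hdr_loc;

  if (TSHttpTxnClientRespGet(txnp, &bufp, &hdr_loc) == TS_SUCCESS) {
    /* The '@' prefix marks an ATS-internal header: visible to plugins
     * and custom logs here, but never transmitted to the client. */
    TSMLoc field = TSMimeHdrFieldFind(bufp, hdr_loc, "@ats-internal-messages", -1);

    if (field != TS_NULL_MLOC) {
      /* This 502 came from the write-lock-fail path, not the origin;
       * a fail-over plugin would skip its usual 502 handling here. */
      TSDebug("herd-guard", "synthetic 502 (write lock busy); ignoring for fail-over");
      TSHandleMLocRelease(bufp, hdr_loc, field);
    }
    TSHandleMLocRelease(bufp, TS_NULL_MLOC, hdr_loc);
  }

  TSHttpTxnReenable(txnp, TS_EVENT_HTTP_CONTINUE);
  return 0;
}

void
TSPluginInit(int argc, const char *argv[])
{
  TSPluginRegistrationInfo info = {"herd-guard", "Apache", "dev@trafficserver.apache.org"};

  if (TSPluginRegister(&info) != TS_SUCCESS) {
    TSError("[herd-guard] plugin registration failed");
    return;
  }
  /* Examine every response just before it is sent to the client. */
  TSHttpHookAdd(TS_HTTP_SEND_RESPONSE_HDR_HOOK, TSContCreate(check_synthetic_502, NULL));
}
{code}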

> configurable option to avoid thundering herd due to concurrent requests for 
> the same object
> -------------------------------------------------------------------------------------------
>
>                 Key: TS-3549
>                 URL: https://issues.apache.org/jira/browse/TS-3549
>             Project: Traffic Server
>          Issue Type: New Feature
>          Components: HTTP
>    Affects Versions: 5.3.0
>            Reporter: Sudheer Vinukonda
>            Assignee: Sudheer Vinukonda
>             Fix For: 6.0.0
>
>         Attachments: TS-3549.diff
>


