Now that HTTP/2 priorities are no longer crashing, we ran some short experiments with them activated in production. That resulted in some customer-side latency problems, which I have been digging into over the past week. I have been concentrating on exercising ATS via a continuous streaming site, taking advantage of browser developer hooks to simulate slower network speeds.

I found a paper that gives a nice overview of browser priority strategies: https://www.researchgate.net/publication/324514529_HTTP2_Prioritization_and_its_Impact_on_Web_Performance . Chrome builds a chain of priorities (first come, first served). The session window limit seems to be hit first. Firefox uses priority holder nodes to set up 4 or 5 buckets. Safari seems to take a naive round-robin approach where all priority nodes are direct children of the root. My testing has concentrated on Firefox and Chrome.

The PR adds debug messages that dump the state of the priority tree as JSON output, which has been very useful for tracking down these issues.

This PR has fixes for two problems.

* For Chrome, I would occasionally see the priority chain broken, with a new stream ending up as a direct descendant of the root. Originally, the new direct descendant would be a shadow node. After I took out the shadow node addition unless the parent_id was in the future, the new stream itself became the direct root descendant. Since the point value of the new node was lower than the others, it would unfairly get a larger share of the bandwidth (based on the weighted fair queuing logic), starving the older nodes and violating Chrome's first come, first served policy. The problem is that the parent of the new node had finished and been removed from the priority tree before the request for the new stream id appeared. I saw this mostly with a high-bandwidth client. I addressed this issue by adding a small circular buffer of ancestors, so that if the parent is missing, we can look up where that node was originally attached. It isn't perfect since we are not keeping a complete history, but once I made this change, I no longer saw the problem. I think in practice only one or two parents will be missing before the next stream request comes.
* For Firefox, we insert nodes in the priority tree for priority frames, but we were treating them as shadow nodes.
So once their last direct descendant was deleted, the priority node was also deleted, losing the buckets for future requests on the session. This is not desirable for the Firefox strategy, since it relies on the priority nodes directly under the root as weight buckets. So rather than using the absence of an associated Http2Stream to identify a node as a shadow (which is true for both shadows and nodes created by priority frames), I added another flag to explicitly mark a node as a shadow.

I am still a bit unclear on the use of shadow nodes. I did some searching based on our original behavior of creating shadow nodes for unknown parents and found this discussion, https://github.com/http2/http2-spec/issues/764 , where Lukasa declares that the right way to do this is to create the new stream for the missing parent and do default priority processing. But their example has stream 6 depending on stream 7 (presumably a future stream). I looked into how nghttp2 does it (which is declared to be the right way), and indeed it creates a new node for the missing parent with default weight and attaches it to the root node. Based on that discussion, I am assuming that shadow nodes are only meant to represent nodes that will arrive in the future (e.g. parent id > new stream id).

We are still in the process of testing these changes, but I wanted to get more eyes on this, particularly since I was not involved in the original priority code development. I also need to augment the unit tests to model the Chrome and Firefox scenarios. I hope to test some Android apps. It would also be good to test some iOS apps, but I don't have access to an iOS device.

[ Full content available at: https://github.com/apache/trafficserver/pull/4212 ]
This message was relayed via gitbox.apache.org.
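
To make the ancestor buffer idea from the Chrome fix easier to review, here is a minimal C++ sketch of one way such a buffer could work. It is an illustration only and is not the code in the PR: the names `AncestorHistory`, `record`, `parent_of`, and `resolve_parent` are hypothetical, and the sketch assumes that stream id 0 stands for the root and that a caller-supplied predicate says whether a node is still in the tree.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper, not the ATS implementation.
// Remembers, for the last N nodes removed from the priority tree,
// which parent each one was attached to at removal time.
template <std::size_t N>
class AncestorHistory {
public:
  // Record that `stream_id` was removed while attached to `parent_id`.
  // Once the buffer is full, the oldest entry is overwritten.
  void record(uint32_t stream_id, uint32_t parent_id) {
    slots_[next_] = Entry{stream_id, parent_id};
    next_ = (next_ + 1) % N;
    if (count_ < N) {
      ++count_;
    }
  }

  // Return the parent `stream_id` had when it was removed, if still remembered.
  std::optional<uint32_t> parent_of(uint32_t stream_id) const {
    for (std::size_t i = 0; i < count_; ++i) {
      if (slots_[i].stream_id == stream_id) {
        return slots_[i].parent_id;
      }
    }
    return std::nullopt;
  }

private:
  struct Entry {
    uint32_t stream_id = 0;
    uint32_t parent_id = 0;
  };
  std::array<Entry, N> slots_{};
  std::size_t next_ = 0;   // slot that the next record() call writes
  std::size_t count_ = 0;  // number of valid entries, at most N
};

// Decide where a new stream should attach when its requested parent may
// already be gone: follow remembered parents until a live node is found,
// and fall back to the root (id 0, an assumption of this sketch) otherwise.
template <std::size_t N, typename IsLive>
uint32_t resolve_parent(const AncestorHistory<N> &history, uint32_t requested_parent, IsLive is_live) {
  uint32_t candidate = requested_parent;
  // The history holds at most N entries, so N + 1 checks are enough.
  for (std::size_t hops = 0; hops <= N; ++hops) {
    if (candidate == 0 || is_live(candidate)) {
      return candidate;
    }
    std::optional<uint32_t> previous = history.parent_of(candidate);
    if (!previous) {
      return 0; // history exhausted; attach to the root
    }
    candidate = *previous;
  }
  return 0;
}
```

In this sketch, the code that removes a finished node would call `record(id, parent_id)`, and the code that handles a new stream would call `resolve_parent` before attaching it. Because the buffer is bounded, a long run of missing parents still falls back to the root, which matches the "not a complete history" caveat above.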

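The Firefox fix comes down to changing which nodes may be pruned once they have no children. The following C++ sketch contrasts the two rules. It is an illustration under stated assumptions, not the PR's code: `Node` and both predicate names are hypothetical, and `has_stream` stands in for "an Http2Stream is associated with this node".

```cpp
#include <cstdint>

// Hypothetical, simplified view of a priority tree node.
struct Node {
  uint32_t id;
  bool has_stream;  // stand-in for "an Http2Stream is associated"
  bool is_shadow;   // explicit marker for placeholder nodes
  int child_count;
};

// Old rule: any childless node without a stream is treated as a shadow,
// so nodes created by priority frames are pruned as well.
bool prunable_by_missing_stream(const Node &n) {
  return !n.has_stream && n.child_count == 0;
}

// New rule: only nodes explicitly marked as shadows are pruned, so nodes
// created by priority frames survive after their last child is deleted.
bool prunable_by_shadow_flag(const Node &n) {
  return n.is_shadow && n.child_count == 0;
}
```

For a node created by a priority frame (`has_stream` false, `is_shadow` false, no children), the old predicate returns true and the new one returns false, which is the behavior change that keeps the Firefox weight buckets alive for later requests on the session.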