Please see the wiki page link I posted upthread, which in turn links to the issue and PR for the QPACK static table. It describes how raw data was collected from the HTTP Archive, analysed with BigQuery, and selectively tailored. Mike Bishop can probably add more detail if required.

On Fri, 16 Dec 2022, 17:55 Roy T. Fielding, <[email protected]> wrote:

> > On Dec 16, 2022, at 5:24 AM, Julian Reschke <[email protected]> wrote:
> >
> > But then, maybe the interesting question is "how did this value into the
> > table in the first place"? Anything we can learn here not to do a
> > similar mistake again?
>
> I did mention it many times during the development of h2 and quic
> that the original capture trace was inside a university cache hierarchy's
> proxy traffic that likely wasn't representative of the open Internet.
>
> The response was always that updating to a fresh capture would be too
> big of a change prior to publication. Likewise, getting a better trace
> from multiple real sources (to avoid server-specific specialization)
> wasn't something that anyone wanted to organize due to the privacy
> issues.
>
> What we should have done is reach out to a third party
> (like UCSD's group that does Internet measurements) to organize it
> as a research project, and just adopt their results at the end.
>
> ....Roy
