research and reflections on HTTP header ordering (was: Re: [PATCH] FIFO header order support in HTTP::Headers)

Mark Stosberg Tue, 26 Jan 2010 08:35:47 -0800

In 2008 there was some discussion about an option to preserve the
ordering of HTTP headers. Part of that thread is quoted below.


The idea resurfaced in another form with the release of
HTTP::Headers::Fast, which provided a method to get back the the
headers unsorted. However, the motivation was different there--
performance-- and the implementation as different as well. It returns
headers in essentially random order instead the order in which which
they were created or transmitted. 

I took an interest in the issue of HTTP header ordering and researched
what several other Perl modules do in regards to this as well as Ruby's
Rack. I published the result on my blog:

http://mark.stosberg.com/blog/2010/01/generating-http-headers-sorted-or-unsorted.html

The summary is that I support the option for unsorted headers in
HTTP::Headers. Michael Greb made a good case for it, and the
possibility for a performance improvement is attractive too.

    Mark

On Sun, 7 Sep 2008 15:53:46 +0200
"Gisle Aas" <gi...@aas.no> wrote:

> On Sun, Sep 7, 2008 at 1:49 PM, Michael Greb <mg...@linode.com> wrote:
> > On Sep 5, 2008, at 7:23 PM, Gisle Aas wrote:
> >>
> >> True; and in this case we need to define what happens when fields are
> >> modified with 'push', 'set' or 'init' and 'remove' as that's the API
> >> that modify stuff.  Let me suggest the following definition of the
> >> behaviour:
> >>
> >> - 'push' always append the field at the end of all headers.  multiple
> >> occurrences of a field name do not have to be consecutive.
> >>
> >> - 'init' either does nothing or it works like 'push'.
> >>
> >> - 'remove' will always remove all concurrences of a field.
> >>
> >> - 'set' will work like 'push' if no other occurrence of the field exists.
> >>
> >> - 'set' will update the first occurrence if the field exists (and
> >> remove all other occurrences).  if multiple field values is provided
> >> with 'set' they are basically all injected at the location of the
> >> first existing value.
> >
> >
> > On Sep 6, 2008 at 2:57 AM, Gisle Aas wrong:
> >>
> >> I think it makes sense to be able to enable them separately.
> >> Suggested interface:
> >>
> >>  $h->scan(\&cb, original_order => 1, original_case => 1);
> >>  $h->as_string(eol => "\n", original_order => 1, original_case => 1);'
> >
> > The attached patch uses the interface above and works towards the behavior
> > outlined in the first message.  Due to the headers being stored as a hash,
> > pushing does not currently preserve previous values, second and subsequent
> > pushes of the same header will overwrite the previous value.  Supporting
> > this would require a change in how the headers are stored within the module.
> >  Your thoughts?
> 
> I think it's better to just use your original approach and just keep
> the representation like used to be with the addition of an array that
> records the original field names and their order.  This should lead to
> a smaller patch as the only thing that need to change is the code that
> sets headers and the scan method.  I also like header lockups to be
> efficient and the representation compact.
> 
> > Server: Fool/1.0
> > content-encoding: gzip
> > Content-Type: text/plain; charset="UTF-8"
> > Content-Encoding: base64
> > Date: Fri Sep  5 10:24:37 CEST 2008
> >
> > Would be stored as (assuming push_header):
> 
> My suggestion would be:
> 
> bless {
>     "content-encoding" => ["\n gzip", "base64"],
>     "content-type" => "text/plain; charset=\"UTF-8\"",
>     "date" => "Fri Sep  5 10:24:37 CEST 2008",
>     "server" => "Fool/1.0",
>     "::original_fields" => [
>         "Server",
>         "content-encoding",
>         "Content-Type",
>         "Content-Encoding",
>         "Date",
>     ],
> }, "HTTP::Headers";
> 
> The invariant that needs to hold is that there is the same number of
> elements in {"::original_fields"} as there are values for all the
> others keys.
> 
> Pushing a value is trivial; only change from what we have now is
> appending the original field name to {"::original_fields"}.
> 
> The only state modification operation that becomes more complex is
> setting of a value header value.  It has to:
> 
>   - update the values in the hash as before
>   - locate the first occurence of the field name in
> {"::original_fields"}  => $idx
>   - remove all other occurrences of the field name
>   - splice(@{"::original_fields"}, $idx, 1, ($orig_field_name) x
> $numbers_of_values_set);
> 
> When 'scan' wants to iterate over the original headers it would have
> to keep an index into the values array for each field that repeat.
> 
> An more compact representation could be to store {"::original_fields"}
> as a ":"-separated string; but we can think about that optimization
> later.
> 
> --Gisle
> 


-- 
 . . . . . . . . . . . . . . . . . . . . . . . . . . . 
   Mark Stosberg            Principal Developer  
   m...@summersault.com     Summersault, LLC     
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .

research and reflections on HTTP header ordering (was: Re: [PATCH] FIFO header order support in HTTP::Headers)

Reply via email to