Portland Linux/Unix Group General Meeting Announcement
Who: Sumana Harihareswara
What: HTTP Can do THAT?!
Where: PSU, 1930 SW 4th Ave. Room FAB 86-01 (Lower Level)
When: Thursday, June 2nd, 2016 at 7pm
Why: The pursuit of technology freedom
Stream: http://pdxlinux.org/live (PSU WiFi Permitting)
Web developers who only know about GET and POST and use the most popular
headers and response codes are missing out! Underappreciated verbs,
headers, and response codes can boost your web application's
performance, flexibility, and testability, and help you better
appreciate the structure of the web.
The version of the Hypertext Transfer Protocol you will deal with most
is 1.1. As a quick refresher: Clients and servers talk to each other via
HTTP messages (requests and responses), which are clear text comprising
start-lines, headers, and bodies.
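As a rough illustration of that structure, here is a minimal Python 3 sketch that splits a hand-written request into its three parts. The function name parse_http_message, the host example.org, and the path /notes are placeholders invented for this example, and the parser is deliberately naive (it keeps only the last value of a repeated header).

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.1 message into start-line, headers, and body."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# A hand-written request; example.org and /notes are placeholders.
raw_request = (
    b"POST /notes HTTP/1.1\r\n"
    b"Host: example.org\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)

start_line, headers, body = parse_http_message(raw_request)
print(start_line)
print(headers["host"])
print(body)
```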
METHODS
GET ("gimme") and POST ("here you go") are overwhelmingly popular, the
Dave Matthews Band of methods. To illustrate their importance: you can
create an API that allows the user to POST but not GET, but that would
be a terrible idea. https://github.com/brainwane/secureapi demonstrates
this with Python 2 code using the BaseHTTPServer standard library.
Using POST to mean "Create resource", "Update resource", and "Delete
resource" is inelegant! So why do we overload POST, and what are the
alternatives? PUT, meaning "create or replace resource," is implemented
throughout the HTTP 1.1 ecology and is unambiguously great; be more
careful with DELETE, which deletes a resource (as demonstrated with
Python 2 code using the BaseHTTPServer standard library module and the
requests library).
It's also worth looking into PATCH and OPTIONS for specialized use cases.
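To make the PUT and DELETE semantics concrete, here is a small sketch using Python 3's http.server (the successor to BaseHTTPServer) and http.client. It is not the speaker's demo code; STORE, ResourceHandler, run_demo, and the path /notes/1 are all hypothetical names, and the in-memory dictionary stands in for real storage.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}  # hypothetical in-memory resource store: path -> bytes


class ResourceHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        created = self.path not in STORE
        STORE[self.path] = self.rfile.read(length)
        # 201 Created for a new resource, 204 No Content for a replacement.
        self.send_response(201 if created else 204)
        self.end_headers()

    def do_GET(self):
        if self.path not in STORE:
            self.send_error(404)
            return
        body = STORE[self.path]
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_DELETE(self):
        if STORE.pop(self.path, None) is None:
            self.send_error(404)
            return
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    """PUT twice, GET, DELETE, GET again; return the status codes seen."""
    server = HTTPServer(("127.0.0.1", 0), ResourceHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    statuses = []
    try:
        steps = [("PUT", b"v1"), ("PUT", b"v2"), ("GET", None),
                 ("DELETE", None), ("GET", None)]
        for method, body in steps:
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            conn.request(method, "/notes/1", body=body)
            response = conn.getresponse()
            response.read()
            statuses.append(response.status)
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return statuses


print(run_demo())
```

Each method maps to one operation, so the server never has to inspect a POST body to guess what the client meant.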
An exciting alternative to GET is HEAD, which requests only the metadata
about a resource; if the client really only needs to know whether it
could GET a resource, or wants a resource's size, last-modified
timestamp, or other information available in its headers, using HEAD
instead of GET can speed performance by more than 50%. I demonstrate
this using the requests library and the %timeit functionality in IPython.
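The following sketch shows the difference in what travels over the wire, using only the Python 3 standard library rather than requests and %timeit. PAYLOAD, PayloadHandler, fetch, and the path /big are assumptions made for the example, and it does not attempt to reproduce the timing figure above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"x" * 100_000  # stand-in for a large resource


class PayloadHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # metadata only, no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(method):
    """Return (status, advertised Content-Length, bytes actually received)."""
    server = HTTPServer(("127.0.0.1", 0), PayloadHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request(method, "/big")
        response = conn.getresponse()
        advertised = int(response.getheader("Content-Length"))
        received = len(response.read())
        conn.close()
        return response.status, advertised, received
    finally:
        server.shutdown()
        server.server_close()


print(fetch("GET"))
print(fetch("HEAD"))
```

Both responses advertise the same Content-Length, but only the GET response carries the body.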
Also, why am I discussing both good and bad ideas throughout this talk,
and how can you tell the difference? Sometimes bad ideas are easy ways
to understand edge cases (also, they're funny). The "horror
world-to-whiteboard scale" gives you my take on whether or not you
should try out what I'm describing.
HEADERS
Call-and-response header pairs such as Last-Modified and
If-Modified-Since/If-Unmodified-Since allow the client to conditionally
specify its preferences; you can save client-side processing time, and
test your application more thoroughly, by knowing and using the right
headers. For instance, check for cache problems by using Cache-Control
and ETag. (But not all headers are useful; for instance, the From header
is basically obsoleted by more advanced analytics and by the User-Agent
header.)
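One way to picture the Last-Modified / If-Modified-Since pairing is the decision a server makes before sending a body. This is a simplified sketch, not a full implementation of HTTP conditional requests; conditional_get_status and the sample timestamp are invented for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(last_modified: datetime, request_headers: dict) -> int:
    """Return 304 if the client's copy is still current, else 200."""
    header = request_headers.get("If-Modified-Since")
    if header is None:
        return 200
    try:
        since = parsedate_to_datetime(header)
    except (TypeError, ValueError):
        return 200  # an unparseable date is ignored
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


# Hypothetical modification time for some resource.
modified = datetime(2016, 6, 2, 19, 0, 0, tzinfo=timezone.utc)
stamp = format_datetime(modified, usegmt=True)  # value for Last-Modified

print(stamp)
print(conditional_get_status(modified, {"If-Modified-Since": stamp}))
print(conditional_get_status(modified, {}))
```

A client that echoes the Last-Modified value back gets a 304 and can reuse its cached copy; a client with no cached copy, or an older one, gets the full 200 response.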
HTTP 1.1 requires that clients send a Host header with all requests;
Host works with the path specified in the start-line, the two together
forming the full address of the resource. Sometimes the host is merely
the domain name of the server, but you can't depend on the assumption
that the host will be obvious to all the systems between the client and
the server. The client might send a request to an IP address, or to one
of several virtual hosts that act as subdomains on one host. This level
of redundancy can lead to unintended consequences; for instance, by
intentionally malforming the Host headers of GET requests, spammers can
leave links to their own sites in your access logs.
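A short sketch of how Host and the start-line path combine: the function below rebuilds the full address from a hand-written request. effective_url and the example.org hostnames are placeholders, and the sketch assumes the request target is a plain path (the common case), not an absolute URL.

```python
def effective_url(raw_request: bytes, scheme: str = "http") -> str:
    """Combine the Host header with the start-line target into a full URL."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    hosts = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            hosts.append(value.strip())
    if len(hosts) != 1:
        # An HTTP/1.1 server answers 400 Bad Request in this case.
        raise ValueError("expected exactly one Host header")
    return f"{scheme}://{hosts[0]}{target}"


# Same path, different virtual hosts (both names are placeholders).
blog = b"GET /index.html HTTP/1.1\r\nHost: blog.example.org\r\n\r\n"
wiki = b"GET /index.html HTTP/1.1\r\nHost: wiki.example.org:8080\r\n\r\n"

print(effective_url(blog))
print(effective_url(wiki))
```

The two requests have identical start-lines; only the Host header tells the server which site is wanted.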
You can define your own header when sending requests or responses, and
many organizations do this; the long-standing convention is to prepend
"X-" to bespoke headers, though RFC 6648 now discourages it for new
headers. It's easy to do this when hand-writing requests, and I'll also
demonstrate how to do this in a Python web framework.
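Here is one possible sketch of bespoke headers in both directions, using Python 3's standard http.server and http.client in place of a web framework. The header names X-Client-Note and X-Server-Note, and the names EchoHeaderHandler and send_note, are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeaderHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Header lookups are case-insensitive.
        note = self.headers.get("X-Client-Note", "none")
        body = b"ok\n"
        self.send_response(200)
        # Attach a custom header to the response.
        self.send_header("X-Server-Note", "heard: " + note)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def send_note(note):
    """Send a custom request header and return the custom response header."""
    server = HTTPServer(("127.0.0.1", 0), EchoHeaderHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", "/", headers={"X-Client-Note": note})
        response = conn.getresponse()
        reply = response.getheader("X-Server-Note")
        response.read()
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()


print(send_note("hello from PLUG"))
```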
RESPONSE CODES
Response codes (a.k.a. status codes) have well-specified semantics. For
instance, they come in five numbered classes, and the three-digit
integer should be sufficient to explain the response -- the
"reason-phrase" (the English explanation) should not be a necessary data
point for the client to use when debugging. As several responses sent by
real, working web servers demonstrate, if you don't respect this
principle, the results can be hilariously confusing.
HTTP includes useful response codes that mean more specific things than
"OK" or "nope"; 410 Gone, 304 Not Modified, and 451 Unavailable for
Legal Reasons help you and your users move faster, debug, test, and
recover from unavailable content.
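The class of a response code can be read straight off its first digit, as this small sketch shows using the standard http.HTTPStatus enum. CLASSES and describe are names invented here, and the class labels are informal summaries.

```python
from http import HTTPStatus

# The first digit of a status code identifies its class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Describe a status code by class and standard reason-phrase."""
    status_class = CLASSES[code // 100]
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unregistered"  # code not known to this Python version
    return f"{code} ({status_class}): {phrase}"


for code in (304, 410, 451):
    print(describe(code))
```

A client that does not recognize a specific code can still fall back on its class, which is why the number alone should be enough.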
I demonstrate how to alter the reason-phrases in your web application's
response codes, using the http standard library in Python 3:
https://gitlab.com/brainwane/http-can-do-that/
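As a hedged sketch of the idea (not the code in the repository above), the optional second argument to send_response in Python 3's http.server replaces the default reason-phrase. ReasonHandler, fetch_status_line, and the phrase "Look Elsewhere" are invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ReasonHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"nothing here\n"
        # The second argument replaces the default phrase ("Not Found").
        self.send_response(404, "Look Elsewhere")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_status_line():
    """Return the (status code, reason-phrase) the client actually sees."""
    server = HTTPServer(("127.0.0.1", 0), ReasonHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", "/")
        response = conn.getresponse()
        response.read()
        conn.close()
        return response.status, response.reason
    finally:
        server.shutdown()
        server.server_close()


print(fetch_status_line())
```

The client still treats the response as a 404; the phrase is only decoration, which is why it should never be the thing a client depends on.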
From "don't cache this" instructions to look-before-you-leap requests
to using the "Content-Disposition" header to tell clients that a
resource should be treated as an attachment, HTTP already contains an
embarrassment of riches. Reading up on it gives you both a feeling of
power, of increased capability, and a sense of wonder, in discovering a
new way to look at the world. What might the web have been? What might
it still be?
Calagator Page: http://calagator.org/events/1250470042
Many will head to the Lucky Lab at 1945 NW Quimby St. after the meeting.
Rideshares Available
PLUG Page with information about all PLUG events: http://pdxlinux.org/
Follow PLUG on Twitter: http://twitter.com/pdxlinux
PLUG is open to everyone and does not tolerate abusive behavior on its
mailing lists or at its meetings.
See you there!
Michael Dexter
PLUG Volunteer