<<Ever see the famous "Apache vs. Yaws" graphs
<http://www.sics.se/%7Ejoe/apachevsyaws.html> and wonder whether you,
too, should be using Yaws? The graphs show what at first seems to be an
unbelievably huge scalability advantage for Yaws, with its ability to
scale to over 80000 parallel connections while Apache keels over at only
4000. Reactions to these graphs tend to be quite polarized: skeptics say
either "there's no way these graphs are accurate" or "they must have
misconfigured Apache," while others respond with "Wow, I need to try
using Yaws!"
Regardless of whether you believe the Yaws comparison graphs or not,
Yaws <http://yaws.hyber.org/> is a solid web server for serving dynamic
content. Claes Wikström wrote Yaws - "Yet Another Web Server" - in
Erlang <http://www.erlang.org/>, a programming language created
specifically to support long-running, concurrent, highly reliable
distributed systems. (To learn more about Erlang, get a copy of the
wonderful book /Programming Erlang/
<http://www.pragprog.com/titles/jaerlang>, written by the language's
creator, Joe Armstrong.) The flexibility of Yaws combined with several
unique features of Erlang makes them a compelling combination for a
RESTful web services platform. If you're serving static pages, grab
lighttpd <http://www.lighttpd.net/> or nginx <http://nginx.net/>
instead, but if you're writing dynamic RESTful web services, then Yaws
is definitely worth exploring. In this article, I'll relate some of my
experiences with using Yaws and Erlang for web services development.
Yaws Basics
Yaws provides several ways of serving dynamic web content and supporting
RESTful web services:
*
/Embedding Erlang code within static pages/. With this approach,
you embed Erlang code within a function named |out/1| within ...
tags directly into static content. Files of this nature have a
|.yaws| extension, by which Yaws knows to process the file and
replace the ... tags with the result of executing the |out/1|
function they're expected to contain. In Erlang terms, |out/1| is
a function of arity 1, i.e., a function taking one argument. Its
argument is expected to be a Yaws |arg| record, which is a data
structure that Yaws uses to communicate details for incoming
requests to the code handling them. For example, an |arg| record
supplies information such as the request URI, incoming headers,
|POST| data, etc.
*
/Application Modules (appmods)/. The Yaws appmod facility lets
application code take control of URIs. In the approach described
above, Erlang code is embedded within static files whose URIs are
determined by their pathnames relative to the web server's
document root. With an appmod, however, the application controls
the meaning of URIs, and such URIs usually do not correspond to
any file system artifacts. Appmods are basically Erlang modules
that export an |out/1| function. Such modules are configured in
the Yaws configuration file to correspond to a URI path element.
When a request is made containing a path element associated with a
registered appmod, Yaws invokes that module's |out/1| function,
passing it an |arg| record. The appmod's |out/1| function can then
examine the rest of the URI to determine the precise resource that
is the target of the incoming request, and respond accordingly.
*
/Yaws applications (yapps)/. Unlike appmods which are usually just
single Erlang modules, Yaws yapps are full-fledged applications.
Each yapp has its own document root, and each can have its own set
of appmods. Specifically, yapps are Erlang/OTP applications. OTP,
which stands for "Open Telecom Platform," is a set of well-proven
libraries and frameworks that provide Erlang applications with
powerful capabilities. OTP encapsulates idioms and approaches for
achieving distribution, event handling, and high reliability,
among many other things. Erlang/OTP has been proven in real-world
field usage within a variety of telecom systems, for example, some
of which mark their downtime in just a few milliseconds per year.
All three of these approaches, which are detailed at the Yaws website
<http://yaws.hyber.org/>, can be usefully applied within a RESTful web
service, depending on the specific nature of the service itself.
However, in my experience, yapps and appmods work the best, because they
provide the most control to the web application.
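As a minimal sketch of the appmod approach described above, the module below exports an |out/1| function and returns a list of Yaws reply tuples. The module name |projects_appmod| and the reply body are hypothetical, and the configuration line in the comment is only the form the |appmods| directive is believed to take; check the Yaws documentation before relying on it.

```erlang
-module(projects_appmod).
-export([out/1]).

%% Assumed registration in the server section of yaws.conf (hypothetical names):
%%   appmods = <projects, projects_appmod>

%% Yaws would call out/1 with an #arg{} record for each matching request.
%% This sketch ignores the argument and returns a fixed plain-text reply.
out(_Arg) ->
    [{status, 200},
     {content, "text/plain", "hello from an appmod\n"}].
```

Because the sketch never inspects its argument, it compiles without the Yaws header files; a real appmod would include |yaws_api.hrl| to read fields of the |arg| record.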
RESTful Design
Since we want to develop RESTful web services, let's look at some
details of REST, which stands for "Representational State Transfer." Roy
T. Fielding coined the term "REST" in his doctoral thesis
<http://www.ics.uci.edu/%7Efielding/pubs/dissertation/top.htm> to
describe an architectural style suitable for large-scale distributed
systems like the web. HTTP is essentially an implementation of REST. The
term "representational state transfer" refers to the fact that RESTful
systems operate via the exchange of representations of resource state in
requests and replies. For example, the typical web page retrieved with
an HTTP GET is an HTML representation of the web resource identified by
the URI targeted by the GET.
When developing a RESTful web service, these are the key areas to pay
attention to:
*
Resources and resource identifiers
*
Methods supported by each resource
*
Formats of data interchanged between client and server
*
Status codes
*
Applicable HTTP headers for each request and response
Let's consider each of these areas in the context of Yaws and Erlang.
Resource Identifiers
Designing a RESTful web service requires you to think about what
resources comprise your service, how to best identify them, and how they
relate to one another. RESTful resources are identified by URIs.
Normally, related resources have URIs that are themselves related,
sharing common path elements. For example, in a web-based bug tracking
system, all the bugs for imaginary project "Phoenix" might be found
under the URI |http://www.example.com/projects/Phoenix/bugs/|, whereas
the specific bug numbered 12345 might be under
|http://www.example.com/projects/Phoenix/bugs/12345/|. RESTful resources
also tend to provide URIs for other resources within their own state
representations. This allows clients retrieving a particular resource's
state to use the URIs returned within the state representation to
navigate to other portions of the overall web application.
In Yaws, the |arg| record indicates the request URI, and the |yaws_api|
module provides the |request_url| function to easily retrieve it:
out(Arg) ->
    Url = yaws_api:request_url(Arg),
    Path = string:tokens(Url#url.path, "/"),
Once you have the request URI, I've found that it's handy to tokenize
the request path as shown above, by splitting it on its forward slashes.
The result is a list of path elements that begin at the URI point where
you've tied your appmod. For example, let's assume we've tied an appmod
onto the "projects" path element in the URI
|http://www.example.com/projects/|. If a request is made on any URI
containing this URI as its prefix, the appmod's |out/1| function will
wind up with a list of separated path elements indicating the target
resource of the request. For example, a request for URI
|http://www.example.com/projects/Phoenix/bugs/| will result in the
following Erlang list of path elements in the |Path| variable after
executing the code shown above:
["projects", "Phoenix", "bugs"]
The utility of splitting the URI is that it makes further dispatching
quite simple, thanks to Erlang's pattern matching. For example, we can
write a separate function, let's call it |out/2|, to handle this
specific URI by defining the function head like this:
out(Arg, ["projects", Project, "bugs"]) ->
    % code to handle this URI goes here.
This |out/2| function will handle all requests for bug lists for all
projects we know about, with the variable |Project|, which is available
to the function body, being set to the specific project name being
requested. Supporting additional URIs is equally simple: just add
more variants of the |out/2| function. You can also feel free to use a
name other than |out| for these functions if you wish, since they are
not invoked directly by the Yaws framework.
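The tokenize-then-match technique can be tried outside Yaws. The sketch below assumes a hypothetical module |path_dispatch| whose |route/1| function stands in for the |out/2| variants: each clause matches one shape of path and returns a term naming the target resource instead of a real reply.

```erlang
-module(path_dispatch).
-export([split_path/1, route/1]).

%% Split a request path on "/" the same way the snippet above does.
%% Leading and trailing slashes produce no empty elements.
split_path(Path) ->
    string:tokens(Path, "/").

%% Hypothetical routing clauses: one function head per URI shape.
route(["projects"]) ->
    project_list;
route(["projects", Project, "bugs"]) ->
    {bug_list, Project};
route(["projects", Project, "bugs", BugId]) ->
    {bug, Project, BugId};
route(_Other) ->
    not_found.
```

Adding a resource means adding a clause; the final catch-all clause keeps unknown paths from crashing the handler.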
Note that properly defining your resource URIs yields significant
benefits. With appmods and yapps, having a rich URI space is quite
simple because of the simplicity of tying different appmods onto
different URI path elements, and the ease of dispatching. Erlang pattern
matching makes handling requests for different URIs trivial. Contrast
this with the poor style traditionally used for defining non-RESTful
services, where all services are given the same URI. This URI typically
points to a script that uses information provided within the request
body or through URI query strings to determine where to actually
dispatch the request. The URIs that result from the Erlang/Yaws
dispatching technique shown above are far cleaner than the overloaded
URIs with seemingly endless parameter lists that result from the
traditional approach.
Resource Methods
The methods that web clients can invoke on a web resource are defined by
HTTP's verbs, primarily |GET|, |PUT|, |POST|, and |DELETE|. However,
individual resources tend to support only a subset of those verbs. When
you design your web service, you need to determine what methods each of
your resources supports, bearing in mind the semantics expected for each
HTTP verb as defined in RFC 2616 <http://www.ietf.org/rfc/rfc2616.txt>.
In Yaws, the request method is found in the |http_request| record,
accessible via the |arg| record:
Method = (Arg#arg.req)#http_request.method
This returns an Erlang atom representing the request method, which can
then be added into our pattern-matching dispatching approach. We can add
a new parameter to our |out| function, turning it into |out/3|, to
include the request method:
out(Arg, 'GET', ["projects", Project, "bugs"]) ->
    % code to handle GET for this URI goes here.
This variant of the out function handles only |GET| requests for bug
lists for each of our projects. Another variant might handle a |POST|,
presumably to add a new bug to the list. To allow only |GET| and |POST|
but disallow all other verbs, you'd simply write a catch-all function
for the same URI:
out(Arg, 'GET', ["projects", Project, "bugs"]) ->
    % code to handle GET for this URI goes here;
out(Arg, 'POST', ["projects", Project, "bugs"]) ->
    % code to handle POST for this URI goes here;
out(Arg, _Method, ["projects", _Project, "bugs"]) ->
    [{status, 405}].
Here, methods other than |GET| and |POST| will match the third variant,
which returns HTTP status 405, which means "method not allowed." The
leading underscores on the |Method| and |Project| variables quiet
compiler warnings about them being unused.
Just as with URI dispatching, Erlang pattern matching makes dispatching
to separate functions to handle separate HTTP verbs trivial.
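Here is a compilable sketch of method-plus-path dispatching, assuming a hypothetical |method_dispatch| module. The handler bodies and the 201 and 404 replies are placeholders chosen for illustration, not something the article prescribes; in a real appmod, |out/1| would extract the method and path from the |arg| record and call |handle/2|.

```erlang
-module(method_dispatch).
-export([handle/2]).

%% Method is the atom Yaws supplies for the request method;
%% Path is the tokenized request path.
handle('GET', ["projects", Project, "bugs"]) ->
    [{status, 200},
     {content, "text/plain", "bugs for " ++ Project}];
handle('POST', ["projects", _Project, "bugs"]) ->
    %% Placeholder: a real handler would store the new bug first.
    [{status, 201}];
handle(_Method, ["projects", _Project, "bugs"]) ->
    %% Known resource, unsupported verb.
    [{status, 405}];
handle(_Method, _Path) ->
    %% Unknown resource.
    [{status, 404}].
```

Clause order matters: the 405 clause must follow the specific verbs for the same path, and the 404 catch-all must come last.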
Representation Formats
When designing a RESTful web service, you need to consider what
representation(s) each resource supports. Web service resources often
support XML or JSON representations, for example. Erlang supplies the
xmerl library <http://www.erlang.org/doc/apps/xmerl/index.html> for
creating and reading XML, and Yaws provides a straightforward JSON
module. Both work quite well.
You can access an incoming request's |Accept| header to determine what
representation(s) the client prefers. This header is available in a
|headers| record, also available through the |arg| record:
Accept_hdr = (Arg#arg.headers)#headers.accept
If your resource supports multiple representations, you can check this
header to see if the client indicated which representation it prefers.
If the client did not send an |Accept| header, the |Accept_hdr| variable
shown above will be set to the atom |undefined|, and your resource can
supply whatever representation it deems best. Otherwise, your service
can parse the |Accept_hdr| value to determine which representation to
send. If the client requests representations that your resource cannot
fulfill, it can return HTTP status 406, which means "not acceptable,"
along with a body indicating what formats are acceptable:
case Accept_hdr of
    undefined ->
        % return default representation;
    "application/xml" ->
        % return XML representation;
    "application/json" ->
        % return JSON representation;
    _Other ->
        Msg = "Accept: application/xml, application/json",
        Error = "Error 406",
        [{status, 406},
         {header, {content_type, "text/html"}},
         {ehtml,
          [{head, [], [{title, [], Error}]},
           {body, [],
            [{h1, [], Error},
             {p, [], Msg}]}]}]
end.
The Erlang code above checks the |Accept_hdr| value to see if it's
either |application/xml| or |application/json|. If it's either of those,
the resource returns a suitable representation, but if not, the code
returns an HTTP status 406 along with an HTML document indicating the
representations the resource is willing to provide.
Another way of handling the desired representation is - you guessed it -
adding it as another parameter to our |out| handler function. This way,
Erlang pattern matching ensures that our request gets dispatched to the
right handler for the requested URI/method/representation combination.
This avoids cluttering handlers with case statements like the one above.
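Real |Accept| headers often list several media types with parameters, such as |text/html, application/json;q=0.9|, which an exact string match would miss. The sketch below, using a hypothetical |conneg| module, reduces the header to a single chosen representation that could then be passed to the handler as an extra parameter. It is deliberately simplified: it ignores |q| values and picks the first supported type in the order the client listed them, and treating XML as the default is an assumption. It uses |string:split/2| and |string:trim/1|, which require Erlang/OTP 20 or later.

```erlang
-module(conneg).
-export([media_types/1, negotiate/1]).

%% Representations this hypothetical resource can produce; the first is the default.
supported() -> ["application/xml", "application/json"].

%% Turn an Accept header value into a list of bare media types,
%% dropping parameters such as ";q=0.9" and surrounding whitespace.
media_types(undefined) -> [];
media_types(AcceptHdr) ->
    [begin
         [Type | _Params] = string:split(Part, ";"),
         string:trim(Type)
     end || Part <- string:tokens(AcceptHdr, ",")].

%% Choose a representation, or report that none is acceptable (HTTP 406).
negotiate(undefined) -> {ok, hd(supported())};
negotiate(AcceptHdr) -> pick(media_types(AcceptHdr)).

pick([]) -> not_acceptable;
pick(["*/*" | _]) -> {ok, hd(supported())};
pick([Type | Rest]) ->
    case lists:member(Type, supported()) of
        true -> {ok, Type};
        false -> pick(Rest)
    end.
```

A handler head could then match on the result, for example |out(Arg, 'GET', Path, "application/json")|, with a separate clause turning |not_acceptable| into the 406 reply.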
By the way, this example also shows the Yaws |ehtml| type, which is a
way of representing HTML as a series of Erlang terms. I find |ehtml|
quite intuitive to write because it directly follows the structure of
HTML, but is far more compact and eliminates the tedium and errors of
matching tags that you face when writing literal HTML.
Status Codes
RESTful web services must return proper HTTP status codes, as indicated
by RFC 2616. Returning the right status is easy with Yaws: simply
include a |status| tuple in the result of your |out/1| function. See the
case statement above for an example of returning the appropriate status
code. If your code does not explicitly set a status, Yaws will set a
status 200 for you, indicating success.
HTTP Headers
Retrieving request headers and setting reply headers with Yaws is
straightforward, too. We've already seen an example of retrieving the
|Accept| header from the headers record; other request headers can be
retrieved in the same fashion. Setting reply headers simply requires
putting a |header| tuple in the outgoing reply, like this:
{header, {content_type, "text/html"}}
This sets the |Content-Type| header to "text/html," for example.
Similarly, in our previous example where we returned status 405 to
indicate a "method not allowed" error, we should have also included the
following header:
{header, {"Allow", "GET, POST"}}
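To keep the |Allow| header in step with the verbs a resource actually supports, the 405 reply can be built from a list of allowed methods. This is a small sketch with a hypothetical helper, |method_not_allowed/1|; it only assembles the same |status| and |header| tuples shown above.

```erlang
-module(allow_reply).
-export([method_not_allowed/1]).

%% Allowed is a list of method atoms, e.g. ['GET', 'POST'].
%% Returns a reply with status 405 and a matching Allow header.
method_not_allowed(Allowed) ->
    Names = [atom_to_list(Method) || Method <- Allowed],
    AllowValue = lists:flatten(lists:join(", ", Names)),
    [{status, 405},
     {header, {"Allow", AllowValue}}].
```

The catch-all handler clause for a resource could then return |allow_reply:method_not_allowed(['GET', 'POST'])| instead of a bare status.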
Appmods or Yapps?
So far we've seen how Yaws and Erlang make it almost trivial to handle
many of the most important concerns for RESTful web services. One
remaining question is about choosing appmods vs. yapps, and the answer
depends on what your services do. If you're writing web services that
have to interact with other back-end services, then yapps are probably
your best bet. Since they're full-blown Erlang/OTP applications, they
typically have initialization and termination functions where
connections to the back end can be created and shut down. If your yapp
is an Erlang/OTP |gen_server|, for example, your |init/1| function can
establish state that the |gen_server| framework will provide to you, and
allow you to modify, every time it calls you back due to an incoming
call to your server. Besides, using yapps also means you can use appmods
as well, so it's not really a matter of choosing one over the other.
Finally, yapps can participate in Erlang/OTP supervision trees, where
supervisor processes can monitor your yapps and restart them if they
should fail. Supervision trees play a significant role in the
reliability of long-running Erlang systems.
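The |gen_server| state handling mentioned above can be sketched with plain OTP, independent of Yaws. The module name |backend_conn| and its request counter are hypothetical stand-ins for a real back-end connection: |init/1| establishes the state, each callback receives it and may return a modified copy, and |terminate/2| is where a real connection would be shut down. How such a server is wired into a yapp is not shown here.

```erlang
-module(backend_conn).
-behaviour(gen_server).

-export([start_link/0, request_count/0, stop/0]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

%% Client API
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Increment the counter and return its new value.
request_count() ->
    gen_server:call(?MODULE, bump).

stop() ->
    gen_server:stop(?MODULE).

%% Callbacks
%% init/1 establishes the initial state; a real server might open
%% its back-end connection here.
init([]) ->
    {ok, #{requests => 0}}.

%% The framework hands the current state to each callback, which
%% returns the (possibly modified) state for the next call.
handle_call(bump, _From, State = #{requests := N}) ->
    {reply, N + 1, State#{requests := N + 1}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% A real server would close its back-end connection here.
terminate(_Reason, _State) ->
    ok.
```

Started under a supervisor, a server like this would be restarted with a fresh |init/1| state if it crashed.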
This article is geared toward RESTful web services based on back ends
other than relational databases. If you're writing a traditional web
server on top of a relational database, you should check out Erlyweb
<http://erlyweb.org/>, a framework for such web services, which is also
based on Yaws and Erlang.
Conclusion
A significant aspect of writing RESTful web services is choosing the
right programming language. We've seen numerous service frameworks in a
variety of programming languages come and go over the years, and most
were failures simply because they were a poor match to the problem. Yaws
and Erlang do not specifically provide a RESTful web services framework,
yet the facilities they provide are a better match for RESTful
development than many other language frameworks that were built
specifically for that purpose.
While an article of this nature necessarily can't dive deeply into the
details of Yaws, Erlang, and RESTful web services, it has hopefully
touched on the important topics and provided, through its minimal code
examples, an idea of how to address them. In my experience, building
RESTful web applications with Yaws and Erlang is very straightforward,
and the resulting code is easy to read, easy to maintain, and easy to
extend.>>
*You can read this at:
*
*http://www.infoq.com/articles/vinoski-erlang-rest;jsessionid=DE23AB2982FB4FF34989ADE0A1F536AD
*
*Gervas*