[Ledger-smb-devel] [DESIGN] Webservice design second draft

Erik Huelsmann Tue, 28 Jul 2015 13:03:51 -0700

Based on the discussion between John and myself elsethread, I've
refined my earlier proposal into what I now have below.



Please review and send your questions, comments and concerns!


JUSTIFICATION
============
To continue developing the browser pages, we need to decide on the
services structure sooner rather than later (in fact, since I'm
building new pages: now).


GOAL
====

The goal of having a webservice API is to allow development of a
broader ecosystem of service-using applications. In addition, we're
increasingly moving to a rich browser-based web application ourselves;
rich-client web application frameworks expect to integrate with a
backend service through web services.


VERSIONING
==========

Even more than the Perl API and the database stored procedures, the
webservice API will be public, in the sense that services and
applications outside the LedgerSMB project's control will bind to it.
For that reason, API versioning (and version detection) is of the
highest importance. I'd like to put the API under the rules of
[semantic versioning][1].
Chris has argued that our software is in flux and as such the API
can't really be stable. My counterargument is that the Subversion
project built a new working copy library, using a completely new
paradigm, all within the rules of the v1 API. On the other hand,
numbers are cheap and Chrome is at version 43 now. Who cares about
which number it is? It's only a sequence specifier, I'd say.
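
A minimal sketch of the compatibility check a client could run under
semantic versioning [1], assuming the API reports a full
MAJOR.MINOR.PATCH string (the URL examples in this proposal only show
a major version such as v1). Python is used for illustration only, and
the function names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(client_needs: str, server_offers: str) -> bool:
    """True when a client written against `client_needs` can use `server_offers`.

    Under semantic versioning a MAJOR bump signals incompatible changes, while
    MINOR and PATCH bumps only add features or fixes. (MAJOR version 0 is
    exempt from these guarantees; this sketch does not special-case it.)
    """
    need = parse_version(client_needs)
    have = parse_version(server_offers)
    return have[0] == need[0] and have[1:] >= need[1:]
```

With this rule a client written against 1.2.0 keeps working against
1.4.1, but must refuse to talk to 2.0.0 or 1.1.9.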


VERSION DETECTION AND SELF-DOCUMENTATION
========================================

The API must document itself through its response to an OPTIONS
request on the base URL (/api), to allow the user to discover the
version, options and features of the API.

In addition, every service must provide self-documentation through its
response to an OPTIONS request, as a (minimal) means of documentation.
The response must be machine processable.

PROPOSAL: Use http://www.restdoc.org/spec.html as the spec for our
self-documentation.


URL STRUCTURE
==============

All URLs in the document are assumed to be relative to some base URL.
E.g. assuming LedgerSMB to be hosted under the following URL:
https://example.com/path/to/ledgersmb/ , the URL /api in this document
in fact means https://example.com/path/to/ledgersmb/api .

The proposed URL structure is (as can be found in many existing web
service schemas):

 1. /api/[version]/[resource]/[?query parameters]
 2. /api/[version]/[resource]/id
 3. /api/[version]/[resource]/id?perform=[action]

The above is mostly inspired by the PayPal API which - I think -
drives a system much like ours, in the sense that their system manages
workflows that produce transactions.

In our case, I think the "id" specifier in the resource may be
multiple path segments long; e.g. for currency rates:
/api/v1/exchangerate/EUR/1/2015-12-12 where "EUR/1/2015-12-12" is the
identifier, made up of the currency identifier, the rate type and the
(start) date of the rate.
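
A minimal sketch of how the three URL forms, including multi-segment
identifiers, could be split into their parts. Python is used for
illustration only; the names ApiTarget and parse_api_url are
hypothetical, and the URL is assumed to be relative to the LedgerSMB
base URL already.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit


@dataclass
class ApiTarget:
    version: str
    resource: str
    instance_id: Tuple[str, ...]  # empty for the collection URL (form 1)
    action: Optional[str]         # value of ?perform=, if present (form 3)


def parse_api_url(url: str) -> ApiTarget:
    """Split /api/[version]/[resource]/[id...]?perform=[action] into parts."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) < 3 or segments[0] != "api":
        raise ValueError("expected /api/[version]/[resource]/...")
    action = parse_qs(parts.query).get("perform", [None])[0]
    # Everything after the resource name belongs to the identifier.
    return ApiTarget(segments[1], segments[2], tuple(segments[3:]), action)
```

For /api/v1/exchangerate/EUR/1/2015-12-12 this yields the identifier
("EUR", "1", "2015-12-12"); for /api/v1/exchangerate/ the identifier is
empty, which marks the collection URL.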

Form (1) will be used for creating (POST) and listing (GET) resource
instances. Dojo proposes to use the 'Range:' HTTP header in the request
to limit the results. I think that makes more sense than using query
parameters for it.
Form (2) will be used for retrieving (GET) an individual resource instance.
Form (3) will be used for modifying (POST) the state of an individual
resource instance by executing [action] on the specified resource.
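
A minimal sketch of how the server could turn such a 'Range:' header
into an offset and a limit for a listing query. It assumes the
"items=first-last" form (inclusive, zero-based), which is what Dojo's
JsonRest store sends as far as I know; the function name is
hypothetical and Python is used for illustration only.

```python
import re
from typing import Optional, Tuple

_ITEMS_RANGE = re.compile(r"^items=(\d+)-(\d+)$")


def parse_items_range(header: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (offset, limit) for 'items=first-last', or None without a header."""
    if header is None:
        return None  # no restriction requested: list everything
    match = _ITEMS_RANGE.match(header.strip())
    if match is None:
        raise ValueError(f"unsupported Range header: {header!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError("range end precedes range start")
    return first, last - first + 1  # the range is inclusive on both ends
```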


MEANING OF REQUEST TYPES
=========================

Note that the API doesn't attach meaning to the HTTP request types
PUT and PATCH (which the PayPal API *does* do). I could see value
in supporting a PATCH request for resources which require secondary
approval and have not yet been approved (this is where PayPal uses it
too).

GET
------
 Retrieves an object or collection of objects, potentially restricted
by query parameters or HTTP headers.

POST
--------
 Creates an object or collection of objects when executed on a
resource URL; when executed on a resource-instance URL, a required
?perform=[action] query string is to be added to the URL to specify
which state transition is to be executed.

Each POST request in the API carries a payload; the consuming
service should support at least one of the following formats (as
indicated by the OPTIONS response):

 1. application/json
 2. application/xml
 3. multipart/form-data
 4. application/x-www-form-urlencoded
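
A minimal sketch of selecting a payload parser from the request's
Content-Type header. Only two of the listed formats are covered, to
keep the example short; the function name decode_payload is
hypothetical and Python is used for illustration only.

```python
import json
from urllib.parse import parse_qs


def decode_payload(content_type: str, body: bytes) -> dict:
    """Parse a POST payload into a plain dictionary based on its media type."""
    # Drop parameters such as '; charset=utf-8' before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    text = body.decode("utf-8")
    if media_type == "application/json":
        return json.loads(text)
    if media_type == "application/x-www-form-urlencoded":
        # parse_qs returns a list per key; unwrap single values.
        return {key: values[0] if len(values) == 1 else values
                for key, values in parse_qs(text).items()}
    raise ValueError(f"unsupported payload format: {media_type}")
```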

OPTIONS
--------------
  Requests metadata about the endpoint the URL points to. At a
minimum, the following endpoints provide metadata:
    /api
    /api/[version]
    /api/[version]/[resource]
 Whether metadata can be requested for individual resource instances
is to be specified in the OPTIONS response of the resource collection
URL.
The requirements for OPTIONS responses should be documented separately
to make the responses meaningful and machine processable. Typical
items to include in the OPTIONS response are the DTD for the POST XML
payload and response, and the JSON field specification.

DELETE
-----------
  Most objects can't be removed from the system (although e.g. GL
accounts can be marked 'obsolete'), but some (notably sessions)
*should* be removed from the system after they have served their
purpose. Running DELETE on anything other than an individual resource
instance isn't supported. In case the resource supports deletion, the
resource instance is deleted.
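
The rules in this section can be summarized in a small decision
function. This is a sketch only: the function name and the returned
labels are hypothetical, Python is used for illustration, and OPTIONS
on a resource instance is accepted unconditionally here although the
text leaves that to the collection's metadata.

```python
from typing import Optional


def classify_request(method: str, has_id: bool, action: Optional[str]) -> str:
    """Map an HTTP method plus URL form to the operation it requests."""
    method = method.upper()
    if method == "OPTIONS":
        return "metadata"
    if method == "GET":
        return "retrieve" if has_id else "list"
    if method == "POST":
        if not has_id:
            return "create"
        if action is None:
            raise ValueError("POST on a resource instance requires ?perform=[action]")
        return f"perform:{action}"
    if method == "DELETE":
        if not has_id:
            raise ValueError("DELETE is only supported on individual resource instances")
        return "delete"
    # PUT and PATCH carry no meaning in this version of the design.
    raise ValueError(f"{method} has no meaning in this API")
```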


REQUEST HANDLING SEQUENCE
=========================

Below is a list of objects that have a role in request handling, in
the order they are executed.

 * API controller+router
   Receives the initial request, attaches the desired request decoder
and deserializer and determines which service handler to send the
request to
 * Request decoder
   Decodes the request: parses the request payload into a Perl
structure (e.g. multipart/related, multipart/form-data, etc)
 * Request deserializer
   Translates the parsed request structure into a Perl object with the
structure the service handler expects
 * Service handler
   Processor of the actual application logic; receives a request
object argument, returns a response object handed off to the response
serializer and encoder
 * Response serializer
   Generates a response format which best matches the Accept request header
 * Response encoder
   Encodes the response to best match the Accept-Encoding request header

Each of the roles must be pluggable.
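
A minimal sketch of the sequence above with every role as a
replaceable callable. Python is used for illustration only (the real
implementation would be Perl), all names are hypothetical, and the
controller+router is reduced to "pick a Pipeline and call run()".

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pipeline:
    decoder: Callable[[bytes], Any]     # raw payload -> parsed structure
    deserializer: Callable[[Any], Any]  # parsed structure -> request object
    handler: Callable[[Any], Any]       # application logic
    serializer: Callable[[Any], str]    # response object -> response format
    encoder: Callable[[str], bytes]     # response format -> bytes on the wire

    def run(self, raw: bytes) -> bytes:
        """Execute the roles in the order listed above."""
        structure = self.decoder(raw)
        request = self.deserializer(structure)
        response = self.handler(request)
        return self.encoder(self.serializer(response))


# One possible wiring: JSON in, JSON out, no content encoding.
json_pipeline = Pipeline(
    decoder=lambda raw: json.loads(raw.decode("utf-8")),
    deserializer=lambda data: data["curr"],
    handler=lambda curr: {"curr": curr, "ok": True},
    serializer=lambda response: json.dumps(response, sort_keys=True),
    encoder=lambda text: text.encode("utf-8"),
)
```

Swapping a single role, for example replacing the encoder with one
that gzips the output when Accept-Encoding asks for it, leaves the
other roles untouched; that is the point of making them pluggable.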

@@CHRIS: Which part of this work can we delegate to Dancer?

ATOMICITY
=========

When an API call affects multiple resources and the call returns an
error, *none* of those resources may be modified.

In this version of the design, we're explicitly NOT addressing the
issue of providing transactions through the web api, although it
sounds like a nice addition to be able to perform a series of API
calls and have those committed by a final call, or completely
reversed.
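
A minimal sketch of the single-call rule (all or nothing within one
API call), using a database transaction. SQLite and Python are used
for illustration only; the table layout and the function name are
hypothetical and do not reflect the LedgerSMB schema.

```python
import sqlite3
from typing import Iterable, Tuple


def apply_rates(conn: sqlite3.Connection, rates: Iterable[Tuple[str, float]]) -> None:
    """Store every rate in `rates`, or none of them if any one is invalid."""
    # The connection context manager commits when the block succeeds and
    # rolls back when it raises, so a late failure undoes earlier inserts.
    with conn:
        for curr, rate in rates:
            if rate <= 0:
                raise ValueError(f"invalid rate for {curr}: {rate}")
            conn.execute(
                "INSERT INTO exchangerate (curr, rate) VALUES (?, ?)",
                (curr, rate),
            )
```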

SESSION MANAGEMENT
====================

The API user logs in by creating a new session through the
/api/[version]/session/ API. Each application login (including API
logins) is attached to an application user. The webservice caller
thereby identifies itself as an application user/employee. Currently,
credentials will be provided through basic auth on the first *and* all
following requests. Session replay attacks are prevented by sending
cookies back and forth, just as they are now. Each request should
provide the cookies created during the session, possibly updated by
the response to the last request -- basic cookie management.
At the end of a session, the session is to be removed by issuing a
DELETE request on the session resource instance.

Regardless of whether the response generated by the server is a
failure or a success, the session cookies should be updated on each
request. The client must respect cookie updates regardless of the type
of response.
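
A minimal sketch of the client-side bookkeeping this implies: basic
auth on every request, and cookies replaced by whatever the last
response set, whether that response was a success or a failure. The
class and method names are hypothetical, no network code is included,
and Python is used for illustration only.

```python
import base64
from http.cookies import SimpleCookie
from typing import Dict, Iterable


class ApiSession:
    def __init__(self, user: str, password: str) -> None:
        token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
        self._auth = "Basic " + token.decode("ascii")
        self._cookies: Dict[str, str] = {}

    def request_headers(self) -> Dict[str, str]:
        """Headers to send with the next request."""
        headers = {"Authorization": self._auth}
        if self._cookies:
            headers["Cookie"] = "; ".join(
                f"{name}={value}" for name, value in self._cookies.items()
            )
        return headers

    def absorb_response(self, set_cookie_values: Iterable[str]) -> None:
        """Apply the Set-Cookie headers of a response, regardless of its status."""
        for raw in set_cookie_values:
            jar = SimpleCookie()
            jar.load(raw)
            for name, morsel in jar.items():
                self._cookies[name] = morsel.value
```

The caller is expected to invoke absorb_response() after every
response, including error responses, before building the next request.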

Alternatively, API calls can be invoked from sessions originally
authenticated against /login.pl?action=authenticate (with the same
further requirements as above).


ENCODING OF VALUES
==================

Each of the supported formats needs its own design document which
specifies how to encode specific values. While this has been mostly
handled for JSON, there's a missing data point with respect to
encoding dates. Dojo handles encoding dates from the client to the
server, but I've been unable to find out if/how Dojo's JSON support
can deserialize dates coming from the server.

Each of the input (and output) formats will have separate
documentation detailing the mapping of (nested) fields to the output
format.


RESPONSE FORMAT
===============

@@Question: which of the items below do we want to support in the
first implementation?

 * application/json
 * application/xml
 * text/yaml
 * text/csv
 * text/html
 * application/x-latex
 * application/pdf


VALUE OF METADATA SPECIFICATION
===============================

One purpose of requiring the server to specify metadata, including a
description of the response objects, is to serve a generic response
parser on the client: one which can parse responses into the correct
objects on the client (e.g. parse dates into dates, even if dates are
transferred as JSON strings) -- without the need to build that
knowledge into the client in advance.
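
A minimal sketch of such a generic parser. It assumes the metadata
boils down to a mapping from field name to a type name and that dates
travel as ISO 8601 strings (YYYY-MM-DD); both are assumptions, since
the metadata format has not been decided. The names are hypothetical
and Python is used for illustration only.

```python
import json
from datetime import date
from decimal import Decimal
from typing import Any, Dict

# Converters keyed by the (hypothetical) type names used in the metadata.
CONVERTERS = {
    "date": date.fromisoformat,
    "decimal": Decimal,
}


def parse_response(body: str, field_types: Dict[str, str]) -> Dict[str, Any]:
    """Parse a JSON object, converting fields as described by the metadata."""
    parsed = {}
    for key, value in json.loads(body).items():
        convert = CONVERTERS.get(field_types.get(key))
        # Fields without a known type, and null values, pass through as-is.
        parsed[key] = convert(value) if convert and value is not None else value
    return parsed
```

The client only needs the type table; which fields hold dates comes
from the server's metadata rather than from client code.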


NESTING OF RESOURCES
=====================

When a resource is obtained from the server, the serving webservice
may embed in its response the objects that the resource refers to;
e.g. the server may decide to include address data in a response to a
query for a customer. The server isn't required to include more than
just the key by which the referenced resource can be queried out of
its resource collection.

Nested resources in the URL space (such as the GitLab example with
team members in a project [2]) are an open question.
*** Nested resources like the GitLab example pollute the namespace,
because the correspondence goes two ways: users-in-project and
projects-in-user. *** How do we handle this in the way that creates
the least complexity? *** Presumably, we want things to be layered,
building complex resources on simple ones; so it's problematic in the
GitLab example to make the user aware of the projects... ***



TRANSITIONING TO THE TARGET URL SPACE
====================================

Since we have no infrastructure in place (yet) to create all of the
above, I'm thinking of starting out with a new script in the toplevel:
/api.pl. api.pl accepts all the query parameters described in the
proposal above, and in addition it accepts a 'path' query parameter
which holds the API path in the future URL space, like this:

 /api.pl?path=/api/v1/exchangerate/EUR/1/2015-12-12&perform=approve

Which maps to:

 /api/v1/exchangerate/EUR/1/2015-12-12?perform=approve

in the target namespace.
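
A minimal sketch of that mapping: the 'path' parameter becomes the
URL path and all other query parameters are carried over unchanged.
The function name is hypothetical and Python is used for illustration
only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def to_target_url(transitional_url: str) -> str:
    """Map /api.pl?path=[api path]&[params] to [api path]?[params]."""
    params = parse_qsl(urlsplit(transitional_url).query, keep_blank_values=True)
    paths = [value for key, value in params if key == "path"]
    if len(paths) != 1:
        raise ValueError("exactly one 'path' parameter is required")
    # Keep the remaining parameters in their original order.
    query = urlencode([(key, value) for key, value in params if key != "path"])
    return paths[0] + ("?" + query if query else "")
```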


INITIAL IMPLEMENTATION
=====================

The initial implementation should implement 3 resources:

 * sessions
 * exchangerate types
 * exchangerates



So, this being the initial draft, there's probably a lot wrong with it
:-) Let's hear your feedback!



[1] http://semver.org/
[2] http://doc.gitlab.com/ce/api/projects.html#get-project-team-member


-- 
Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.
