On Mon, Jul 27, 2015 at 7:03 PM, John Locke <m...@freelock.com> wrote:

>  Hi, Erik,
>
> Nice start. Some quick comments (not a lot of time...):
>
> On 07/27/2015 05:51 AM, Erik Huelsmann wrote:
>
>
> VERSION DETECTION
> =================
>
>  The user of the API should run an OPTIONS request on the base URL (/api)
> to discover version, options and features of the API.
>
>
> This is starting to sound like a WSDL? I think that could be a huge
> benefit to do, as long as it's not required...
>
>
What I was talking about was this discussed in this blog post:
http://zacstewart.com/2012/04/14/http-options-method.html
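
To make it concrete, the kind of probe I have in mind looks roughly like
this; Python with the requests library is used purely as an example client,
and the response fields shown are assumptions rather than a spec:

    # Sketch only: probe the API root for versions/features. The JSON
    # fields in the response are assumptions, not a finished spec.
    import requests

    resp = requests.options("https://example.com/path/to/ledgersmb/api",
                            auth=("api_user", "secret"))
    print(resp.headers.get("Allow"))  # e.g. "GET, OPTIONS"
    print(resp.json())                # e.g. {"versions": ["v1"], ...}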

As to what you mean by "required", I don't know. If you mean "required to
read before using the API", then, no, it's not required. If you mean
"required when implementing a new service", then I think the answer should
be "yes, it's required". Every service requires documentation and if this
is the only documentation, then I'm quite happy to accept the service. And
if you're creating documentation anyway...

>
>  URL STRUCTURE
> ==============
>
>  All URLs in the document are assumed to be relative to some base URL.
> E.g. assuming LedgerSMB to be hosted under the following URL:
> https://example.com/path/to/ledgersmb/ , the URL /api in this document in
> fact means https://example.com/path/to/ledgersmb/api .
>
>  The proposed URL structure is (as can be found in many existing web
> service schemas):
>
>   (a) /api/<version>/<resource>/[?query parameters]
>  (b) /api/<version>/<resource>/id
>   (c) /api/<version>/<resource>/id?perform=<action>
>
>  The above is mostly inspired by the PayPal API which - I think - drives
> a system much like ours in the sense that their system manages workflow
> producing transactions.
>
>  In our case, I think the "id" specifier in the resource may be multiple
> path segments long; e.g. for currency rates:
> /api/v1/exchangerate/EUR/1/2015-12-12 where "EUR/1/2015-12-12" is the
> identifier, consisting of the currency identifier, rate type and (start)
> date of the rate.
>
>  Form (a) will be used for creating (POST) and listing (GET) resource
> instances. Dojo proposes to use the 'Range:' HTTP header to limit results
> in the request. I think that makes more sense than to use query parameters
> for it.
>
>
> For development/debugging, I really like having an API observe query
> parameters in addition to Range headers. I would suggest we support both,
> and pick one to win...
>

Hmm. I'm not really in favor of having duplicate functionality. People seem
to be using cURL to test their services; it should be pretty easy to add
the Range header (or others, for that matter) to a cURL request. What
method do you use?
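
For reference, the request I have in mind looks roughly like this (the
resource name and the 'items' range unit follow the Dojo JsonRest
convention; nothing here is final):

    # Sketch: page through a listing with the Range header rather than
    # query parameters; resource name and range unit are assumptions.
    import requests

    resp = requests.get(
        "https://example.com/path/to/ledgersmb/api/v1/customer/",
        headers={"Range": "items=0-24"},
        auth=("api_user", "secret"))
    # The server would reply "206 Partial Content" plus a Content-Range
    # header such as "items 0-24/117".
    print(resp.status_code, resp.headers.get("Content-Range"))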

>   Form (b) will be used for retrieving (GET) an individual resource
> instance.
>
> Should we add support for PUT here?
>

Well, the entire API doesn't list PUT, because PUT was mapped to "U" of
CRUD. And the design explicitly delegates updates of resources to RPC
calls. The reasoning behind that is: old code looks the way it does because
it *tries* to derive based on a (partial) "before" state and an "after"
state what the user might have done in the web application. Most of the
time it guesses correctly, but there are lots of cases where it simply
can't tell the user's intent from the combined states. I don't want to
propagate that mess to another level of the application when I'm able to
eliminate it somewhere (by building a different UI).

>   Form (c) will be used for (POST) modifying state of individual resource
> instances by executing <action> on the specified resource.
>
>
>  MEANING OF REQUEST TYPES
> =========================
>
>  (Note that the API doesn't attach meaning to the HTTP request types PUT
> and PATCH (which the PayPal API *does* do)) -- I could see value in
> supporting a PATCH request for resources which require secondary approval
> and have not yet been approved (this is where PayPal uses it too).
>
>  GET
> ------
>  Retrieves an object or collection of objects, potentially restricted by
> query parameters or HTTP headers.
>
>  POST
> --------
>  Creates an object or collection of objects when executed on a resource
> URL; when executed on a resource-instance URL, a required ?perform=<action>
> query string is to be added to the URL to specify which state transition is
> to be executed.
>
>  Each POST request in the API carries a payload; the consuming service
> should support at least one of the following payload formats (as indicated
> by the OPTIONS response):
>
>   (i) application/json
>  (ii) application/xml
>  (iii) multipart/form-data
>  (iv) application/x-www-form-urlencoded
>
>
> The API itself should be responsible for doing this conversion -- and
> should allow the consuming client to send whichever of these it wants. The
> API can then convert to a Perl data object of some kind to pass off to the
> internal code.
>

Ok. You're saying there's *always* going to exist a mapping from
multipart/form-data to application/json? I can imagine such a mapping for
non-nested structures, but what about nested objects in arrays? For example,
in the new multi-currency branch we have form fields named debit_1 and
debit_fx_1; how do those map to a JSON/Javascript object?

Don't get me wrong; I'd like to delegate this to the request consumer too.
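
For concreteness, this is the kind of mismatch I mean; the field names come
from the multi-currency branch, but the nested JSON shape is only one
possible guess:

    # Flat form fields as the old UI posts them...
    flat = {
        "debit_1": "100.00", "debit_fx_1": "92.50",
        "debit_2": "50.00",  "debit_fx_2": "46.25",
    }
    # ...versus one *possible* nested JSON equivalent (shape is a guess):
    nested = {
        "lines": [
            {"debit": "100.00", "debit_fx": "92.50"},
            {"debit": "50.00",  "debit_fx": "46.25"},
        ],
    }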

> Is PUT to be added to this list? I would expect PUT to update the values of
> an existing object, and it would need to contain all new values for the object.
> Obviously since we're doing financial transactions, this probably can only
> modify drafts and not anything posted (in a financial sense). But for
> drafts, reconciliation, batches, etc. this seems useful.
>

> POST or PATCH can be used for modifying just a field on an object, or
> handling things like payments on an object?
>

Ah. Good point. POST (with an RPC endpoint) would be for adding a payment to
an open item. PATCH would be to change the values of an existing object
which is still editable.
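
Roughly like this, then; a sketch in which the URLs, the 'payment' action
name and the field names are all assumptions:

    import requests

    base = "https://example.com/path/to/ledgersmb/api/v1"
    auth = ("api_user", "secret")

    # RPC-style POST: add a payment to an open item (action name assumed).
    requests.post(base + "/invoice/42?perform=payment",
                  json={"amount": "100.00", "date": "2015-07-27"},
                  auth=auth)

    # PATCH: change a single field on an object that is still editable.
    requests.patch(base + "/invoice/42",
                   json={"description": "corrected description"},
                   auth=auth)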

>  OPTIONS
> --------------
>   In general, OPTIONS requests metadata about the endpoint the URL points to.
> Minimal endpoints that provide metadata are:
>     /api
>     /api/<version>
>     /api/<version>/<resource>
>  Whether or not metadata can be requested for individual resource
> instances is to be specified in the OPTIONS response of the resource
> collection URL.
> Requirements for return values of the OPTIONS request should be separately
> documented to make them meaningful and machine-processable. Typical items
> to be included in the OPTIONS response are the DTD for the POST XML payload
> and response, and the JSON field specification.
>
>  DELETE
> -----------
>   Most objects can't be removed from the system (although e.g. GL accounts
> can be marked 'obsolete'), but some (notably sessions) *should* be removed
> from the system after they have served their purpose. Running DELETE on
> anything other than an individual resource instance isn't supported. In
> case the resource supports deletion, the resource instance is deleted.
>
>
>  ATOMICITY
> =========
>
>  When an API call affects multiple resources and the call returns an
> error, *none* of the resources involved are to be modified.
>
> Do we want the API to support a transaction, allow a bunch of operations
> to get batched with atomicity? e.g. failure after a series of web service
> calls rolls back the whole batch; if there are no errors, the entire batch
> gets committed?
>
> If we can support that, that seems like another big win...
>

Hmm. My initial reaction is that we can support most of this by attaching
transactions to an (unapproved) batch and then offering the user the option
to either approve or remove the entire batch. Would that work for the
use-case you envision?

Actually, thinking about it, I can see how to put it all into one
transaction. However, whether that works depends on what you expect from
subsequent calls within the same batch. Do you expect any queries to return
the new values while they have not been fully committed to the database? Or
do you just expect to send loads of modifications? Do you expect to be
returned new IDs?

My basic idea was to batch up all RPC calls and delay them until the final
"COMMIT" comes in, and then execute all the batched commands inside a single
database transaction.
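
In client terms I'm picturing something like the sketch below; the /batch
resource, the ?perform=commit action and the field names are assumptions,
not a committed design:

    import requests

    base = "https://example.com/path/to/ledgersmb/api/v1"
    s = requests.Session()
    s.auth = ("api_user", "secret")

    # Open a batch; assume the server returns its id in the response body.
    batch = s.post(base + "/batch/",
                   json={"description": "import run"}).json()

    # Queue modifications against the batch; the server merely records them.
    s.post(base + "/gl/?batch_id=%s" % batch["id"],
           json={"reference": "import-1", "lines": []})

    # Only the final commit executes the queued commands inside a single
    # database transaction; an error before this point leaves nothing behind.
    s.post(base + "/batch/%s?perform=commit" % batch["id"])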

>  SESSION MANAGEMENT
> ====================
>
>  The API user logs in by creating a new session through the
> /api/<version>/session/ API. Each application login (including API logins)
> is attached to an application user. The webservice caller thereby
> identifies itself as an application user/employee. Currently, credentials
> will be provided through basic auth on the first *and* all following
> requests. Session replay attacks are prevented by sending cookies back and
> forth, just as they are now. Each request should provide the cookies
> created during the session; possibly updated by the response of the last
> request -- basic cookie management.
>
>   At the end of a session, the session is to be removed by issuing a
> DELETE request on the session resource instance.
>
>  Regardless of whether the response generated by the server is a failure
> or a success, the session cookies should be updated on each request. The
> client must respect cookie updates regardless of the type of response.
>
>
>
> Hmm. What if the same client is running multiple, parallel transactions?
> How would we handle race conditions here? Is it possible for the same
> session to have multiple sequences?
>

Good point, but it seems to work for PHP, RoR, ... I'll look around and try
to find out how others solve it. Maybe by opening a second session?
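
For reference, the single-session flow I had in mind looks roughly like
this; the /session resource is the proposal above, and the company field
and the session instance id are assumptions:

    import requests

    base = "https://example.com/path/to/ledgersmb/api/v1"
    s = requests.Session()             # keeps and resends the session cookies
    s.auth = ("api_user", "secret")    # basic auth on the first and all
                                       # following requests

    s.post(base + "/session/", json={"company": "example_company"})
    s.get(base + "/customer/", headers={"Range": "items=0-9"})
    s.delete(base + "/session/current")  # instance id "current" is assumed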

>
>  Alternatively, API calls can be invoked from sessions originally
> authenticated against /login.pl?action=authenticate (with the same
> further requirements as above).
>
>
>  ENCODING OF VALUES
> ==================
>
>  Each of the supported formats needs to have its own design document
> which specifies how to encode specific values. While this has been mostly
> handled for JSON, there's a missing data point with respect to encoding
> dates. Dojo handles encoding dates from the client to the server, but I've
> been unable to find if/how Dojo's JSON can deserialize dates coming from
> the server.
>
> Yes it can... the dojo/date functionality works both ways -- I would
> suggest we deserialize to a Javascript object in the store functions
> themselves; this works pretty well.
>

Well, agreed that at least it *used* to do it: in the dojo/data docs
there's mention of *serializing* (but I couldn't find any mention of
deserializing). In the new dojo/(d)store, there's nothing in the
documentation that I could find. But, indeed, the only correct place to
deserialize dates into date objects does seem to be in the stores.


> I suggest we explicitly state the format for particular fields that the
> API expects... for date fields, this needs to include timezone handling.
> ISO 8601 with UTC really seems like the only reasonable standard to use for
> this. I don't know of any language or toolkit that does not support
> conversion to/from that.
>
> Otherwise I think we should just pick UTF-8 for character encoding, and
> use the escaping standards for the transport chosen -- XML, Json,
> form-data, etc.
>
> Currency might be the other thing to specify -- do we require a currency
> type as a separate field? Parse out a currency code/symbol? I'm sure
> there's an obvious standard here, but I don't know what it is ;-)
>

Ok. I'll modify my original document and include encodings there. Basically,
we need different levels of encoding: character encoding at the base request
level [UTF-8], encoding of field values (currency indicators, etc.) and
encoding of objects (date values and other complex objects).
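
As a strawman, a payload that keeps those three levels apart could look like
this (field names purely illustrative):

    # Character encoding: UTF-8 at the request level.
    # Field values: currency as a separate ISO 4217 code, amounts as strings.
    # Objects: dates as ISO 8601 strings in UTC.
    payload = {
        "description": "Überweisung",
        "transdate": "2015-07-27T00:00:00Z",
        "amount": {"value": "123.45", "currency": "EUR"},
    }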

>  VALUE OF METADATA SPECIFICATION
> ===============================
>
>  The purpose of requiring the server to specify metadata, and to include
> in that metadata a description of the response objects, is among other
> things to enable a generic response parser on the client which can parse
> responses into the correct objects (e.g. parse dates into dates, even if
> dates are transferred as JSON strings) -- without the need to build
> knowledge into the client in advance.
>
>
> +1 this is very nice to have in an API, and makes using the API so much
> nicer. Although I prefer this to be more documentation than enforcement --
> it seems like one of the design goals of SOAP, and I shudder every time I
> have to use SOAP.
>

You mean you're *creating* SOAP services? It's not my intention to put any
requirements on API users other than reading the service documentation
available through the OPTIONS request before using the service. However, when
*creating* a service, ... someone has to do the dirty work... Maybe we can
generate some of the docs from other structures in the application though?
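
What I picture on the client side is roughly the sketch below; the metadata
shape and field names are invented, just to show the idea:

    import json
    from datetime import datetime

    # Metadata as (hypothetically) returned by OPTIONS on the resource.
    metadata = {"fields": {"transdate": "date", "amount": "decimal"}}

    def parse_response(metadata, body):
        """Generic parser: turn JSON strings into typed values using the
        field descriptions from the metadata, without per-resource code."""
        obj = json.loads(body)
        for name, ftype in metadata["fields"].items():
            if ftype == "date" and name in obj:
                obj[name] = datetime.strptime(obj[name], "%Y-%m-%d").date()
        return obj

    parse_response(metadata, '{"transdate": "2015-12-12", "amount": "10.00"}')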

> So I'm all in favor of a self-documenting API, as long as it doesn't force
> you to go through a bunch of extra steps to establish a connection/do
> anything useful.
>

That's definitely not the intent, no.

>  NESTING OF RESOURCES
> =====================
>
>  When obtaining a resource from the server, the serving webservice may
> embed in its response the objects that it refers to; e.g. the server may
> decide to include address data in a response to a query for a customer.
> The server isn't required to include more than just the key by which the
> referenced resource can be queried from its resource collection.
>
>  Nested resources may also appear in the URL space (such as the GitLab
> example with team members in a project [2]).
> *** Nested resources like the GitLab example pollute the namespace,
> because there's a two way correspondence: users-in-project and
> projects-in-user. *** How to handle this in the way that creates the least
> complexity??? *** Presumably, we want things to be layered, building
> complex resources on simple ones; so it's problematic in the gitlab example
> to make the user aware of the projects... ***
>
> We should support and default to "obvious" nested resources, e.g. line
> items on an invoice, payment lines, etc.
>

Do you mean that these nested resources should be made available at the URL
level? Or simply *always* be embedded in the response object? Basically, I
wasn't thinking of the journal lines as individual resources. I think the
*journal* is the individual resource, with a number of lines "inside" it.
Would it make more sense to you to make the individual lines into resources
too? [I can see reasoning for that too, because it allows running queries
on the journal-line resource and filtering out all lines on e.g. a single
account...]
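
To spell out the two options for a GL journal entry (both shapes invented):

    # Option 1: lines only exist embedded inside the journal resource.
    embedded = {
        "id": 42,
        "reference": "2015-001",
        "lines": [
            {"account": "1060", "debit": "100.00"},
            {"account": "4010", "credit": "100.00"},
        ],
    }
    # Option 2: lines are addressable resources of their own, so they can
    # be queried and filtered independently (e.g. all lines on one account).
    referenced = {
        "id": 42,
        "reference": "2015-001",
        "lines": ["/api/v1/journal-line/42-1",
                  "/api/v1/journal-line/42-2"],
    }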


> I do think we should plan to allow the client to request what data to
> nest, perhaps via either a custom header or a parameter (or both)? This
> would be one area that needs to be self-documenting: what resources can be
> excluded/included/expanded in which requests, and what is included by
> default.
>
I like that. I'll think about how we can model this.
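
One possible shape, just to make the idea concrete (the parameter name and
the expanded resources are invented):

    import requests

    # Ask the server to embed selected related resources in the response.
    requests.get(
        "https://example.com/path/to/ledgersmb/api/v1/customer/1357",
        params={"expand": "addresses,contacts"},
        auth=("api_user", "secret"))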


Thanks for your response! With a few more rounds of back-and-forth, we can
probably arrive at something we can start working with and build our
experience on in the context of *this* application.


-- 
Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.