Re: [Ledger-smb-devel] [DESIGN] Proposed structure for LedgerSMB web services

Erik Huelsmann Tue, 28 Jul 2015 13:01:04 -0700

Hi John,

I posted a cleaned-up, expanded v2 draft to the list to start a clean
discussion about what still needs adding to the document. Responses to your
comments, concerns and ideas are below.




> Very cool! Following some links from there led me here:
> http://www.restdoc.org/

Incorporated that reference in the design doc.



>
>   As to what you mean by "required", I don't know. If you mean "required
> to read before using the API", then, no, it's not required. If you mean
> "required when implementing a new service", then I think the answer should
> be "yes, it's required". Every service requires documentation and if this
> is the only documentation, then I'm quite happy to accept the service.
>
>
> Agreed, sounds good.

Ok. Changed the wording to make it clearer that it's a requirement on the
provider, not the consumer.



> For development/debugging, I really like having an API observe query
>> parameters in addition to Range headers. I would suggest we support both,
>> and pick one to win...
>>
>
>  Hmm. I'm not really in favor of having duplicate functionality. People
> seem to be using cURL to test their services; it should be pretty easy to
> add the Range header (or others, for that matter) to a cURL request. What
> method do you use?
>
>
> Ha. For a while I was using a home-grown Dojo single-page app for testing
> out APIs, have played around with quite a bit, but it's been a while since
> I've done a major API project. I've seen some decent browser extensions for
> some of these kinds of things...
>
> The other thing I'm thinking of here is for more light-weight, reporting
> types of uses. I'm not sure how much control you can get over headers when
> doing a cross-domain request from a browser -- I'm thinking a lightweight
> JS app that might want to grab the last 10 sales invoices for a dashboard,
> or something like that -- with an iframe, for example, you can't
> necessarily set browser headers but you can easily add a GET parameter.
>
> Not a big deal these days, there's so many decent tools for doing it right
> with a toolkit that we may not need the "lightweight" GET-only approach,
> but I do think there may be scenarios where it might prove useful...
>

Ok. Would really love to have a single API access method; let's help people
find their way to development tools available today and see if that limits
us too much and as soon as it does, extend the api with multiple options.
(My experience with multiple options is that we have to support all options
endlessly, but people use only one option anyway...)
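
A minimal sketch of what the single access method could look like on the
server side, assuming paging is requested with a header of the form
"Range: items=0-9". The "items" unit, the function name parse_range and the
default page size are assumptions for illustration, not something the design
doc defines.

```python
import re
from typing import Optional, Tuple

# Hypothetical header format: "items=<first>-<last>", both inclusive.
RANGE_RE = re.compile(r"^items=(\d+)-(\d+)$")


def parse_range(header: Optional[str], default_limit: int = 25) -> Tuple[int, int]:
    """Translate a Range header value into an (offset, limit) pair."""
    if header is None:
        # No Range header: return the first page.
        return 0, default_limit
    match = RANGE_RE.match(header.strip())
    if match is None:
        raise ValueError("unsupported Range header: %r" % header)
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError("Range end lies before Range start")
    return first, last - first + 1
```

With cURL such a header can be added with the -H option, e.g.
curl -H "Range: items=0-9" followed by the URL of the collection.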


>
>>  The API itself should be responsible for doing this conversion -- and
>> should allow the consuming client to send whichever of these it wants. The
>> API can then convert to a Perl data object of some kind to pass off to the
>> internal code.
>>
>
>  Ok. You're saying there's *always* going to exist a mapping from
> application/form-data to application/json? I mean, I can imagine that a
> mapping like that for non-nested structures, but what about nested objects
> in arrays? I mean, in the new multi currency branch we have form fields
> named debit_1 and debit_fx_1; how do those map to a JSON/Javascript object?
>
>  Don't get me wrong; I'd like to delegate this to the request consumer
> too.
>
>
> I think we just define a convention, and describe it. Perhaps make it
> simply mirror the Json structure with _ separated parts? e.g.
> debit_1_value=234&debit_1_fx=222&debit_2_value=444&debit_2_fx=400 maps to
> json as: (intentionally swapping the index to the 2nd position)
> [{
>   "debit":{
>     "value":234,
>     "fx":222
>   }
> },
> {
>   "debit":{
>     "value":444,
>     "fx":400
>   }
> }]
>
> ... I mean we already do this for form posts now, we need to convert it to
> some sort of data object internally anyway, why not build a library that
> does this for us, regardless of what format it receives in the request?
> Might need to change some of the current form field names...
>

Ok. Added a note to the doc that each format should document such a
key/fieldname mapping.
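
To make the kind of mapping such a note would describe concrete, here is a
small sketch of the convention proposed above: field names of the form
<group>_<index>_<key> are folded into a list of nested objects. The function
name form_to_records is hypothetical, and the sketch assumes every value is
an integer, which will not hold for real amounts.

```python
from urllib.parse import parse_qsl


def form_to_records(query: str) -> list:
    """Fold flat <group>_<index>_<key> fields into a list of nested dicts."""
    rows = {}
    for name, value in parse_qsl(query):
        # "debit_1_value" -> group "debit", index "1", key "value"
        group, index, key = name.split("_", 2)
        # Assumption: values are integers (as in the example in this thread).
        rows.setdefault(int(index), {}).setdefault(group, {})[key] = int(value)
    # One object per index, ordered by index.
    return [rows[i] for i in sorted(rows)]


records = form_to_records(
    "debit_1_value=234&debit_1_fx=222&debit_2_value=444&debit_2_fx=400"
)
```

For the query string from the example, records holds the same two "debit"
objects as the JSON shown above.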


> This and the previous note does bring up something missing here: response
> format. Like the Range header, there's the "Accept" header the client can
> send, and I've also found it useful for very quick browser debugging to
> allow overriding that with a GET parameter.
>
> So we should discuss the formats we support for the response:
>
> application/json
> application/xml
> text/yaml
> text/csv
> text/html
> application/x-latex
> application/pdf
>
> ... and of course how we handle these. Json, XML, CSV are pretty
> straightforward (hey, are there any industry-specific XML formats we should
> leverage/offer?) -- for nested data in CSV I've typically seen Json used...
>
> For those last 3, clearly there's a need for templates for each kind of
> object...
>

Added the list (and the question which ones we need to support initially);
my thinking is that initially we simply need application/json just to be
able to get started with the exchangerate upload/download.
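
As a rough sketch of starting with application/json only while leaving room
for the other types, a registry keyed on media type could look like the
following. The names FORMATTERS, formatter and render are made up for
illustration, and the Accept handling is deliberately naive: it ignores
q-values and simply takes the first listed type that has a formatter.

```python
import json

# Hypothetical registry: media type -> function turning data into a body.
FORMATTERS = {}


def formatter(media_type):
    """Decorator that registers a response formatter for one media type."""
    def register(func):
        FORMATTERS[media_type] = func
        return func
    return register


@formatter("application/json")
def to_json(data):
    return json.dumps(data, sort_keys=True)


def render(data, accept="application/json"):
    """Pick the first media type in the Accept value that we can produce."""
    for part in accept.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in FORMATTERS:
            return media_type, FORMATTERS[media_type](data)
    # Nothing matched: a real service would answer 406 Not Acceptable.
    raise LookupError("no formatter for: %s" % accept)
```

Adding text/csv or text/yaml later would then mean registering one more
function, without touching render.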


> If we've done a good job on the API, we should be able to plug in request
> formatters and response formatters easily -- so we could add text/yaml by
> writing a new plugin for both response and request handling...

Right. This and the response types made me put out the question in the
document about how much we need to get started with Dancer *right now* (and
how much we can delegate to later versions).

If the answer is that we do need Dancer, then the question also becomes how
much of the current URL space needs to be rewritten *right now*...

> Ah, yes, and that's exactly why I think we need to support a GET parameter
> in addition to Range: header -- then you can simply generate a URL to get a
> CSV or HTML report of the most recent 10 payments from client X.
>
>    Is PUT to be added to this list? I would expect PUT to update values
>> of an existing object, and needs to contain all new values for the object.
>> Obviously since we're doing financial transactions, this probably can only
>> modify drafts and not anything posted (in a financial sense). But for
>> drafts, reconciliation, batches, etc. this seems useful.
>>
>
>> POST or PATCH can be used for modifying just a field on an object, or
>> handling things like payments on an object?
>>
>
>  Ah. Good point. POST(with an rpc endpoint) would be for adding a payment
> to an open item. PATCH would be to change the values of an existing object
> which is still editable.
>
>
> Ok. That all sounds fine to me...
>
>
>

[ snip batches ]

>   Actually, thinking about it, I can see how to put it all into one
> transaction. However, if that works, it depends on what you expect on
> subsequent calls within the same batch. Do you expect any queries to return
> the new values while they have not been fully committed to the database? Or
> do you just expect to send loads of modifications? Do you expect to be
> returned new IDs?
>
>    Good questions, and this gets beyond my experience -- I haven't
> actually done that much transactional programming to know the best
> practices here...
>
> I would think we would expect subsequent calls to have the new values, and
> I do know that Dojo stores have supported "placeholder" ids that can be
> replaced with permanent ones after the data is committed, so I would tend
> to think that pattern should work, a "placeholder" that is returned while
> the batch is "open", and when the batch is "approved" a set of replacements
> get returned so the client can update with final IDs.
>
> Should we be considering UUIDs here?
>
>   My basic idea was to batch up all RPC calls and delay them until the
> final "COMMIT" comes in and executing all the batched commands inside a
> single database transaction.
>
>
> Hmm. Goes against REST, but then we are talking about financial systems,
> practically the definition of transactional logic. It feels like we are
> reinventing SOAP!

Ok. As per your suggestion below, putting this on hold for now. We don't
want to reinvent SOAP and we want things to stay simple for now so we
finally get the API off the ground...
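
For when this gets picked up again, a sketch of the idea discussed above:
queue the calls, hand out placeholder ids, and run everything inside a
single database transaction on "COMMIT", returning the placeholder-to-id
replacements. This uses sqlite3 purely as a stand-in for the real database;
the Batch class, the draft table and the "tmp-N" placeholder format are all
assumptions for illustration.

```python
import sqlite3


class Batch:
    """Collects inserts and applies them in one transaction on commit()."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []  # list of (placeholder, description)

    def add_draft(self, description):
        # Nothing touches the database yet; the client gets a placeholder id.
        placeholder = "tmp-%d" % (len(self.pending) + 1)
        self.pending.append((placeholder, description))
        return placeholder

    def commit(self):
        replacements = {}
        # The connection context manager commits on success and rolls
        # back if any statement raises, so the batch is all-or-nothing.
        with self.conn:
            for placeholder, description in self.pending:
                cur = self.conn.execute(
                    "INSERT INTO draft (description) VALUES (?)",
                    (description,),
                )
                replacements[placeholder] = cur.lastrowid
        self.pending.clear()
        return replacements


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE draft (id INTEGER PRIMARY KEY, description TEXT)")
batch = Batch(conn)
first = batch.add_draft("invoice draft")
second = batch.add_draft("payment draft")
final_ids = batch.commit()  # maps each placeholder to its permanent id
```

Note that in this model queries made while the batch is open do not see the
pending changes, which is one of the open questions raised above.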


> I'm thinking about the scenarios here, and the one that comes to mind is
> "shipping" some products on a sales order. We use this all the time
> --skipping the shipping screen, we just put in a value in the "ship" box
> and "Create Invoice." The current LSMB adjusts the sales order line
> items/totals, and commits that, and then takes you to a create invoice form
> that is completely open, unsaved, and in my opinion really should be in a
> transaction -- the sales order qtys shouldn't get updated until the invoice
> is posted (or at least saved as a draft).
>
> That's a scenario I think the current app should do in a transaction, and
> doesn't.
>

Right. But that does sound like a bug/problem in the current application,
not something we should fix by moving to "web transactions" driven from a
"heavy client" on the client side.


> I am also thinking about how you do transactions in a database, that you
> generally have to start a transaction with a "BEGIN" and otherwise it's not
> in a transaction. I'm thinking we just model the API the same way, that
> it's not in a transaction unless it's explicitly called for.
>
> I also think this entire transaction functionality can be deferred until a
> later version, as long as you're thinking about it with the current version
> so it's something that can be added later...
>

Right. Added that to the doc.


>>  Regardless of whether the response generated by the server is a failure
>> or a success, the session cookies should be updated on each request. The
>> client must respect cookie updates regardless of the type of response.
>>
>>
>>
>>  Hmm. What if the same client is running multiple, parallel transactions?
>> How would we handle race conditions here? Is it possible for the same
>> session to have multiple sequences?
>>
>
>  Good point, but it seems to work for PHP, RoR, ... I'll look around and
> try to find how others solve it. Maybe by opening a second session?
>
>
> I think the general approach is a token sent in the body, not a cookie.
> The browser will send all cookies in any session... You can probably go to
> some extra steps to isolate sessions with curl, but I mostly just use the
> "cookiejar" in curl that makes it act like a browser here...
>

Actually, what I'm thinking of is this:
http://guides.rubyonrails.org/security.html#replay-attacks-for-cookiestore-sessions
; the nonce there may prevent session replay. The token you're talking
about seems to be one to prevent session hijacking. GitLab allows sending
the token as a header or as a query parameter. I like that approach,
because it's separate from the payload.
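
A small sketch of accepting the token either way while keeping it out of
the payload. The header name X-Session-Token and the parameter name
session_token are hypothetical placeholders, not names used by GitLab or
decided for LedgerSMB; the header wins when both are present.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical names, chosen only for this example.
TOKEN_HEADER = "X-Session-Token"
TOKEN_PARAM = "session_token"


def extract_token(url, headers):
    """Return the token from the header if present, else from the query."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == TOKEN_HEADER.lower():
            return value
    values = parse_qs(urlsplit(url).query).get(TOKEN_PARAM)
    return values[0] if values else None
```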


> IIRC, the Drupal Form API sets a "form-build-id" and a "form-token". The
> build-id essentially is the session/form sent to the browser, and the token
> is used to validate and detect replays.
>

Ok. But in a webservice situation, there's no form that's being sent from
the application first.


[ snip ]

>   Yes it can... the dojo/date functionality works both ways -- I would
>> suggest we deserialize to a Javascript object in the store functions
>> themselves, this works pretty well.
>>
>
>  Well, agreed that at least it *used* to do it: in the dojo/data docs
> there's mention of *serializing* (but I couldn't find any mention of
> deserializing). In the new dojo/(d)store, there's nothing in the
> documentation that I could find. But, indeed, the only correct place to
> deserialize dates into date objects does seem to be in the stores.
>
>
>
> http://dojotoolkit.org/reference-guide/1.10/dojo/date/locale/parse.html
>
>
Right. My point is not that it can parse, but that the stores
*automatically* instantiate Date objects from their serialized form. It's
that functionality that I'm looking for. If we need to write our own JSON
store which takes advantage of the request return schema description to
find the date fields and parse them into date objects, that's great and
completely fine by me. I had hoped it already exists though.
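
The store logic I have in mind is roughly the following, sketched here in
Python rather than as an actual Dojo store. It assumes the schema
description is a mapping from field name to a dict with a "type" entry and
that dates are serialized as ISO 8601 strings (YYYY-MM-DD); both the schema
shape and the name revive_dates are assumptions.

```python
from datetime import date


def revive_dates(record: dict, schema: dict) -> dict:
    """Return a copy of record with schema-declared date fields parsed."""
    revived = dict(record)
    for field, spec in schema.items():
        # Only touch fields the schema marks as dates and that have a value.
        if spec.get("type") == "date" and revived.get(field) is not None:
            revived[field] = date.fromisoformat(revived[field])
    return revived


schema = {"transdate": {"type": "date"}, "amount": {"type": "number"}}
row = revive_dates({"transdate": "2015-07-28", "amount": 10}, schema)
```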

>>  NESTING OF RESOURCES
>> =====================
>>
>>  When obtaining a resource from the server, the serving webservice may
>> include embedded in its response objects that it refers to; e.g. the server
>> may decide to include address data included in a response to a query for a
>> customer. The server isn't required to include more than just the key by
>> which the resource can be queried out of the resource collection.
>>
>>  Nested resources in the URL space (such as the GitLab example with team
>> members in a project [2]).
>> *** Nested resources like the GitLab example pollute the namespace,
>> because there's a two way correspondence: users-in-project and
>> projects-in-user. *** How to handle this in the way that creates the least
>> complexity??? *** Presumably, we want things to be layered, building
>> complex resources on simple ones; so it's problematic in the gitlab example
>> to make the user aware of the projects... ***
>>
>>    We should support and default to "obvious" nested resources. e.g.
>> line items on an invoice, payment lines, etc.
>>
>
>  Do you mean that these nested resources should be made available at the
> URL level? Or simply *always* be embedded in the response object?
> Basically, I wasn't thinking of the journal lines as individual resources.
> I think the *journal* is the individual resource, with a number of lines
> "inside" it. Would it make more sense to you to make the individual lines
> into resources too? [I can see reasoning for that too, because it allows
> running queries on the journal-line resource and filter out all lines on
> e.g. a single account...]
>
>
> Well... yes. I think this boils down to a question of "document database"
> or "relational database". Obviously, we're built on a relational database,
> and I've never truly warmed up to pure document/object storage, the "NoSQL"
> movement... At the same time, the structure of an invoice in LSMB is pretty
> well-defined, and doesn't vary much, so we can present the entire thing as
> a "document", even though the lines are themselves first-class objects.
>
> Maybe this is just force of habit for me, and there may well not be any
> actual need for it, but I would think that pretty much anything that can be
> a line on a report or an invoice should be directly addressable. But maybe
> that's overkill?
>

I can see how that's useful for GL lines. I can see how that's useful for
inventory movements. But I can't really see (yet) how that's useful for
invoice lines which are already part of the GL lines information as far as
the accounting impact is concerned and already part of the inventory
movement information as far as the inventory impact is concerned. However,
maybe it's better to just disclose all the information along the same
design and not have a lot of variance so it's easy to write the services for
them.



> I've built one very complex system from scratch, and with that one I just
> made each level of the hierarchy extend the base data object class, and so
> I did essentially get the basic CRUD APIs for this for free, once I mapped
> my API layer to the data object -- about the only thing that needed
> attention at each level were the fields available for index queries -- and
> then the nesting issues we're discussing. I guess I didn't think that much
> about whether we *needed* that level of access (though it certainly helped
> when debugging).
>

Ok. That strategy works for me, especially if it reduces coding in the
overall application.

>  I do think we should plan to allow the client to request what data to
>> nest, perhaps either a custom header or a parameter (or both)? This would
>> be one area that needs to be self-documenting, what resources can be
>> excluded/included/expanded in which requests, and what is included by
>> default.
>>
> I like that. I'll think about how we can model this.
>
> As I think about it, I really only see two levels here: expanded, or
> condensed. Expanded, for an invoice, the response would include the
> customer record, each line item detail, each payment line. Condensed, it
> would only contain references to these other records, which would have to
> be retrieved separately if they don't exist.
>
> How much deeper is useful to go?
>
> Would we ever want to load the product from the line item? Perhaps, and
> then need to look up a pricegroup for a customer for the product... not
> exactly sure how this is currently modeled. But that really seems as
> complicated as this system gets. Oh, I guess there's entity/eca/contact
> method.
>

Right. So, if we talk about condensed versus expanded, we'd be talking
about the first-level expansion only. I think that should work for most
use-cases. If we need more, then we should probably be looking to solve
that on a specific case-by-case basis. (Allowing N-level expansion would
probably mean building a graph of interdependencies and teaching the service
code to walk them... That sounds like an interesting problem, but not a way
to achieve a solid webservice any time soon...)
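
A sketch of what first-level-only expansion means in practice, assuming
references are stored as keys such as "customers/3" and resolved against
some lookup table. The key format, the store contents and the function name
expand_first_level are invented for the example.

```python
def expand_first_level(resource, reference_fields, store):
    """Replace references in the named fields by the referenced records.

    Only one level is expanded: references inside the embedded records
    are left as plain keys.
    """
    expanded = dict(resource)
    for field in reference_fields:
        key = resource.get(field)
        if key in store:
            expanded[field] = dict(store[key])
    return expanded


# Hypothetical data: an invoice refers to a customer, which in turn
# refers to a pricegroup.
store = {
    "customers/3": {"name": "ACME", "pricegroup": "pricegroups/1"},
    "pricegroups/1": {"name": "Wholesale"},
}
condensed = {"id": 7, "customer": "customers/3"}
expanded = expand_first_level(condensed, ["customer"], store)
```

In expanded, the customer record is embedded, but its pricegroup is still
just the key "pricegroups/1", to be fetched separately if needed.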

-- 
Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.

------------------------------------------------------------------------------

_______________________________________________
Ledger-smb-devel mailing list
Ledger-smb-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ledger-smb-devel
