On 18/12/2019 19:37, Andy Seaborne wrote:
I'm looking at the remote access APIs and HTTP usage.

Unless there are good reasons why not, I think using the JDK Java 11 HTTP code, java.net.http, is good: fewer dependencies, lots of info on the web. It has both sync and async support, and also HTTP/2. It is complicated, and there is value in having packaged ways for common use cases that have the RDF handling baked in (base URI anyone?!).
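
As an illustration of the sync and async styles mentioned above, here is a minimal sketch using only java.net.http. The class name SyncAsyncSketch, the helper names, and the choice of a Turtle Accept header are assumptions for the example, not part of jena-http.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper class, for illustration only.
class SyncAsyncSketch {
    // Build a GET request asking for Turtle; the Accept value is an assumption.
    static HttpRequest turtleGet(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "text/turtle")
                .GET()
                .build();
    }

    // Synchronous: blocks the calling thread until the whole body has arrived.
    static String fetchSync(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpResponse<String> response = client.send(turtleGet(url), BodyHandlers.ofString());
        return response.body();
    }

    // Asynchronous: returns immediately; the future completes with the body.
    static CompletableFuture<String> fetchAsync(HttpClient client, String url) {
        return client.sendAsync(turtleGet(url), BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }
}
```

The same HttpRequest is used for both calls; only the send method differs, which is what makes a shared set of RDF-aware helpers plausible for both styles.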

So far ...

https://github.com/afs/jena-http/

has mostly complete basic HTTP operations, query, update, and GSP.

java.net.http works well, and it uses Flow to deliver data. There is a zero-copy InputStream to access the data from an HTTP body. I haven't looked at higher-level Subscribers (readers) that produce Java objects directly; the code only uses the InputStream to pass to the existing parsers. I'm not convinced that there are any gains, and there are certainly costs, in parsing from fragmented ByteBuffers: tokens split across buffer boundaries are quite nasty to handle and would need new tokenizers, whereas the InputStream does that work.
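
A minimal sketch of the InputStream route described above: the response body is requested with BodyHandlers.ofInputStream() and handed to a stream-based parser. The countLines method is only a stand-in for a real RDF parser, and the class name, the N-Triples Accept header, and the status check are assumptions for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class StreamBodySketch {
    // Stand-in for an existing stream parser: counts non-blank lines.
    // A real parser would tokenize from the same InputStream.
    static long countLines(InputStream in) {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines().filter(line -> !line.isBlank()).count();
    }

    // Fetch the URL and pass the body stream straight to the "parser".
    static long fetchAndParse(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/n-triples")
                .build();
        HttpResponse<InputStream> response = client.send(request, BodyHandlers.ofInputStream());
        try (InputStream body = response.body()) {
            if (response.statusCode() != 200) {
                throw new IOException("HTTP " + response.statusCode() + " for " + url);
            }
            return countLines(body);
        }
    }
}
```

The parser never sees buffer boundaries here; the InputStream presents the fragmented data as one continuous byte sequence.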

What is missing is suitable authentication. Basic auth is supported in response to a failed HTTP operation, which is the user-centric case of a dialog popping up. A slight downside is that if the auth is wrong, it tries 3 times before giving up (i.e. to allow 3 user attempts). The number "3" is set by a system-wide system property.
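
For reference, the challenge-driven Basic auth described above is wired in through java.net.Authenticator. The sketch below assumes fixed credentials instead of a dialog; the class and method names are hypothetical. The retry limit is, as far as I know, read from the jdk.httpclient.auth.retrylimit system property, but treat that name as something to verify against the JDK in use.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.http.HttpClient;

// Hypothetical helper class, for illustration only.
class ChallengeAuthSketch {
    // An Authenticator that always answers a challenge with the same credentials.
    static Authenticator fixedCredentials(String user, char[] password) {
        return new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(user, password);
            }
        };
    }

    // The client only consults the Authenticator after the server has
    // rejected a request with a 401 challenge.
    static HttpClient clientWithAuth(String user, char[] password) {
        return HttpClient.newBuilder()
                .authenticator(fixedCredentials(user, password))
                .build();
    }
}
```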

Challenge-response authentication can be fiddly to handle for requests where the data is not replayable. The first request fails ... and can't be resent unless the data is replayable.

What does work is directly setting the Basic auth HTTP header, and it is probably worth adding some custom helper support (cf. SERVICE keyword). There is no challenge round of HTTP requests. That's OK if the connection is HTTPS.
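
A minimal sketch of what such a helper might look like, setting the Authorization header up front so no challenge round is needed. The class and method names are hypothetical; the header format is standard Basic auth (base64 of "user:password").

```java
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class BasicAuthSketch {
    // Value of the Authorization header for Basic auth.
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Add the header to a request under construction. The credentials travel
    // with every request, so this is only reasonable over HTTPS.
    static HttpRequest.Builder withBasicAuth(HttpRequest.Builder builder,
                                             String user, String password) {
        return builder.header("Authorization", basicAuthHeader(user, password));
    }
}
```

Because the header is sent on the first request, this also sidesteps the replay problem for request bodies that can only be read once.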

    Andy


Some thoughts:

RDFConnection:
* The name is a bit long!
* New RDFConn (other name?) same operations as RDFConnection but at the Graph/Node level.
* RDFConnection is an adapter to Resource/Model.

SPARQL Query:
* Convert HttpQuery to use java.net.http.
* Keep the "default global setup" style, and share this with other network-related code.
* Builder pattern for the per object settings.
* This may use HttpOp or directly use the java.net.http code.
   Worth doing it the best way for the long term.

RDF-centric:
* Library of functions and RDF-centric BodyHandlers/BodyPublishers:
   deal with compression on the input stream, convert the response to RDF.
* Could be useful for sync and async.
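
A minimal sketch of the "deal with compression on input stream" piece, assuming the sync case where the body has been requested as an InputStream. The class and method names are hypothetical, and only gzip and deflate are handled; anything else is passed through unchanged.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.http.HttpResponse;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, for illustration only.
class CompressionSketch {
    // Wrap the raw stream according to a Content-Encoding value (may be null).
    static InputStream decode(String contentEncoding, InputStream raw) {
        if (contentEncoding == null) {
            return raw;
        }
        try {
            switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
                case "gzip":
                    return new GZIPInputStream(raw);
                case "deflate":
                    // HTTP "deflate" is zlib-wrapped data, the Inflater default.
                    return new InflaterInputStream(raw);
                default:
                    return raw;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decompressed view of a response body, ready to hand to a parser.
    static InputStream body(HttpResponse<InputStream> response) {
        String encoding = response.headers().firstValue("Content-Encoding").orElse(null);
        return decode(encoding, response.body());
    }
}
```

java.net.http does not decompress response bodies itself, so a step like this has to live somewhere in the RDF-centric layer.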

HttpRDF
* RDF operations, e.g.
     Graph x = HttpRDF.getGraph(url)
     Graph x = HttpRDF.getGraph(httpClient, url)
* GSP naming is in RDFConnectionRemote.

My play area is:
https://github.com/afs/jena-http/blob/master/src/main/java/org/seaborne/http/HttpRDF.java

HttpOp
* This can be smaller and focused on common use cases; less coverage, easier to use (and still support use for tests).
* Common cases are sync usage of HTTP. If you are writing a spider with async requests, you'll want control of the HttpClient.
* For each operation have httpGet(args) and httpGet(HttpClient, args) versions.
* Retain the idea of one default "system wide" HttpClient so common use cases "Just work". Share with QueryHTTP. Put this in one place "HttpEnv".
* No "HttpResponseHandler" variants, no "httpPostForm"
       That's about 50% of the execs.
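
A minimal sketch of the paired-method idea with a shared default client. HttpEnv and the httpGet naming come from the list above, but the method names, the client settings, and the error handling here are assumptions for illustration, not the jena-http code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
import java.time.Duration;
import java.util.Objects;

// One place for the "system wide" default client (settings are assumptions).
final class HttpEnv {
    private static volatile HttpClient defaultClient = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NORMAL)
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    static HttpClient getDefaultHttpClient() { return defaultClient; }

    static void setDefaultHttpClient(HttpClient client) {
        defaultClient = Objects.requireNonNull(client);
    }
}

// Each operation comes as a pair: default client, or caller-supplied client.
final class HttpOpSketch {
    static String httpGetString(String url) {
        return httpGetString(HttpEnv.getDefaultHttpClient(), url);
    }

    static String httpGetString(HttpClient client, String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        try {
            HttpResponse<String> response = client.send(request, BodyHandlers.ofString());
            if (response.statusCode() != 200) {
                throw new IllegalStateException("HTTP " + response.statusCode() + " for " + url);
            }
            return response.body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during GET " + url, e);
        }
    }
}
```

The one-argument form is the "Just works" path; code that needs its own executor, authenticator, or async control passes its own HttpClient to the two-argument form.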

     Andy
