[ 
https://issues.apache.org/jira/browse/JENA-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052284#comment-17052284
 ] 

Andy Seaborne commented on JENA-1854:
-------------------------------------

The issue is that on a parse error the response is sent as soon as possible, 
before the input body has been completely read from the HTTP connection. This 
can break reused connections.

A reusable connection will have a {{Content-Length}} header. The content needs 
to be drained so the connection is reusable for another HTTP request.
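
As a minimal sketch of what "draining" means here (not the Fuseki code; the class and method names are hypothetical and the request body is simulated with an in-memory stream), the unread remainder of the body is read and discarded after the parser stops early:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only; not part of Jena or Fuseki.
final class DrainExample {

    /** Reads and discards everything left in the stream; returns the bytes discarded. */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long discarded = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            discarded += n;
        }
        return discarded;
    }

    /**
     * Simulates a parser that stops after `consumed` bytes of a `total`-byte
     * request body (for example, at a parse error), then drains the rest.
     */
    static long drainAfterEarlyStop(int total, int consumed) {
        try (InputStream body = new ByteArrayInputStream(new byte[total])) {
            body.skip(consumed);   // bytes the parser read before failing
            return drain(body);    // remainder that must still be consumed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a server the stream would be the request's input stream, and the error response would be sent only after {{drain}} returns.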

A connection carrying a request with no {{Content-Length}} is not reusable 
anyway; such requests can be used for large data transfers.

{{curl -v -X POST -H "Content-type: text/turtle" --data-binary}} sets the 
{{Content-Length}}.

----

Places this affects:

Receiving in Fuseki (these need checking; where the body is not fully read, 
drain the remaining bytes from the connection). All the places where parsing 
happens:

* {{Upload.incomingData}} (GSP and GSP Quads)
* {{ActionLib.parse}}
* SPARQL Queries {{SPARQLQueryProcessor.executeBody}} – this is OK; the whole 
string is read before parsing.
* SPARQL Update requests {{SPARQL_Update.executeBody}} – may need to read the 
whole string. Ideally, switch depending on the presence of {{Content-Length}} 
so that large updates stream.
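
One possible shape for that switch, as a sketch only: buffer the body when a length is declared and stream it otherwise. All names are hypothetical and the two parse functions stand in for the real update parser. A negative length is taken to mean "no {{Content-Length}}", which is how the servlet API reports a missing header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical sketch; not the SPARQL_Update implementation.
final class UpdateBodyStrategy {

    /** Buffers the body when a length is declared; streams it otherwise. */
    static <R> R handle(long contentLength, InputStream body,
                        Function<String, R> parseBuffered,
                        Function<InputStream, R> parseStreaming) {
        if (contentLength < 0) {
            // No Content-Length: the connection is not reused, so stream.
            return parseStreaming.apply(body);
        }
        // Content-Length present: consume the whole body before parsing,
        // so a parse error cannot leave unread bytes on the connection.
        return parseBuffered.apply(readFully(body));
    }

    /** Reads the whole stream into a UTF-8 string. */
    static String readFully(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demonstration with an in-memory body and stand-in parse functions. */
    static String demo(long contentLength) {
        InputStream body = new ByteArrayInputStream(
                "INSERT DATA {}".getBytes(StandardCharsets.UTF_8));
        return handle(contentLength, body,
                text -> "buffered:" + text.length(),
                in -> "streaming");
    }
}
```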

Library code: check {{Content-Length}} usage. This mainly matters where 
connections are cached.

{{RDFConnectionRemote}} uses {{QueryEngineHTTP}}.

[https://github.com/afs/jena-http] – code not yet in Jena that uses the 
post-Java 8 {{java.net.http}}.
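
For the client side, a small illustration using only the JDK's {{java.net.http}} (it does not use anything from the repository above; the class and method names are hypothetical): a body publisher built from a string knows its length, while one built from an input stream reports -1. As far as I know, over HTTP/1.1 the former is sent with {{Content-Length}} and the latter with chunked transfer encoding.

```java
import java.io.ByteArrayInputStream;
import java.net.http.HttpRequest.BodyPublisher;
import java.net.http.HttpRequest.BodyPublishers;
import java.nio.charset.StandardCharsets;

// Illustration only; no request is actually sent.
final class BodyLengthExample {

    /** Length reported for a string body (UTF-8 encoded by default). */
    static long knownLength(String turtle) {
        BodyPublisher publisher = BodyPublishers.ofString(turtle);
        return publisher.contentLength();
    }

    /** Length reported for a stream body: unknown, so -1. */
    static long streamedLength(String turtle) {
        byte[] bytes = turtle.getBytes(StandardCharsets.UTF_8);
        BodyPublisher publisher =
                BodyPublishers.ofInputStream(() -> new ByteArrayInputStream(bytes));
        return publisher.contentLength();
    }
}
```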

> 502 Bad Gateway with reverse proxy on Fuseki
> --------------------------------------------
>
>                 Key: JENA-1854
>                 URL: https://issues.apache.org/jira/browse/JENA-1854
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Fuseki
>    Affects Versions: Jena 3.11.0, Jena 3.14.0
>            Reporter: Mikael
>            Priority: Major
>
> When posting 40k of Turtle data that results in the error
> {noformat}
> Error 400: Parse error: [line: 1, col: 34] Undefined prefix: ebucore
> {noformat}
> the response is sent before the entire input data is received. That confuses 
> the proxy and, instead of the above Fuseki error, the client receives a proxy error
> {noformat}
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
>  <html><head>
>  <title>502 Bad Gateway</title>
>  </head><body>
>  <h1>Bad Gateway</h1>
>  <p>The proxy server received an invalid
>  response from an upstream server.<br />
>  </p>
>  <hr>
>  <address>Apache/2.4.18 (Ubuntu) Server at xxx.fi Port 443</address>
>  </body></html>
> {noformat}
>  
> Posting data:
> {noformat}
> curl -v -X POST -H "Content-type: text/turtle" --data-binary 
> @/tmp/subtitle.ttl 
> [https://xxx.fi/fuseki/ds?graph=https://resource.lingsoft.fi/graph/demo]
> {noformat}
>  
> Proxy:
> [https://xxx.fi/fuseki/ds] is proxy address pointing to 
> [http://127.0.0.1:3030/ds|http://127.0.0.1:3030/]
>  
> Discussion:
> [https://mail-archives.apache.org/mod_mbox/jena-users/202003.mbox/%[email protected]%3e]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
