Hello,

I have been agonizing over a problem with a service I'm trying to call from
RESTXQ. The service (www.tei-c.org/oxgarage/) only accepts
multipart/form-data submissions. It provides a front-end client for
uploading files from the browser, which calls a back-end RESTful service to
do document conversion.

I've installed a local instance of the service
(https://github.com/TEIC/oxgarage) and have it running on Tomcat 7, along
with BaseX.

The problem arises when I try to submit documents with non-ASCII characters
from RESTXQ. Looking at the network traffic, I can see that if the document
contains only ASCII characters, the multipart submission body is not base64
encoded. For example:

Encapsulated multipart part:  (text/xml)
        Content-Disposition: form-data; name='fileToConvert'; filename='homework.xml'\r\n
        Content-Type: text/xml\r\n\r\n
        eXtensible Markup Language
            <html
                xmlns="http://www.w3.org/1999/xhtml">
                <head>
                    <meta/>
                    <title>
                        Test
                        </title>
                    </head>
                <body>
                    <math
                        xmlns="http://www.w3.org/1998/Math/MathML">
                        <msub>
                            <mi/>
                            <mn>
                                2
                                </mn>
                            </msub>
                        </math>
                    </body>
                </html>

However, if the document does contain non-ASCII characters (such as β),
BaseX sets the Content-Transfer-Encoding to "base64". This causes the
OxGarage service to fail because it thinks it is receiving an image file
rather than a textual document. For example:

Content-Type: text/xml\r\n
        Content-Transfer-Encoding: base64\r\n\r\n
        eXtensible Markup Language
            [truncated]
PGh0bWwgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzE5OTkveGh0bWwiPjxoZWFkPjxtZXRhLz48\r\ndGl0bGU+VGVzdDwvdGl0bGU+PC9oZWFkPjxib2R5PjxtYXRoIHhtbG5zPSJodHRwOi8vd3d3Lncz\r\nLm9yZy8xOTk4L01hdGgvTWF0aE1MIj48bXN1Yj48bWk+zrI8L21pPjxtbj5Ud288L21
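
To confirm that the base64 body is the same XML document and not something else, the captured text can be decoded by hand. The following is a minimal Python sketch, not part of the attached test case. It takes the third base64 line from the capture above, trimmed to a multiple of four characters because the capture is truncated mid-group, and decodes it as UTF-8. The variable names are only illustrative.

```python
import base64

# Third base64 line from the capture above. The last three characters of the
# captured text ("L21") are dropped because the capture is truncated and they
# do not form a complete four-character base64 group.
captured_line = (
    "Lm9yZy8xOTk4L01hdGgvTWF0aE1MIj48bXN1Yj48bWk+zrI8L21pPjxtbj5Ud288"
)

# Decode to bytes, then interpret those bytes as UTF-8 text.
decoded = base64.b64decode(captured_line).decode("utf-8")
print(decoded)
```

The decoded fragment contains `<mi>β</mi>`, so the payload is still the XML text. Only the transfer encoding differs from the ASCII-only case.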

Attached here is a basic test case to replicate the problem: an HTML page
with a form and the RESTXQ function that it calls.

I've tried setting a new header to specify Content-Transfer-Encoding as
"binary" instead of "base64", but it doesn't replace the default header. Is
there any way that the encoding could be controlled from RESTXQ?
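
For reference, here is a sketch of the request body I would like to end up with: a single multipart/form-data part whose content is raw UTF-8 bytes, with no Content-Transfer-Encoding header at all. It is written in Python only to show the target byte layout and is not a BaseX or RESTXQ solution. The function name `build_multipart` and the fixed boundary are hypothetical. The field name and filename are the ones from the capture above.

```python
import uuid


def build_multipart(field_name, filename, xml_text, boundary=None):
    """Build a one-part multipart/form-data body carrying raw UTF-8 bytes.

    Hypothetical helper for illustration: no Content-Transfer-Encoding
    header is emitted, so the part body is sent as-is.
    """
    boundary = boundary or uuid.uuid4().hex
    payload = xml_text.encode("utf-8")  # non-ASCII characters stay as UTF-8 bytes
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: text/xml\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, head + payload + tail


# Example with a fragment containing the problematic character.
content_type, body = build_multipart(
    "fileToConvert", "homework.xml", "<mi>β</mi>", boundary="testboundary"
)
print(content_type)
print(body.decode("utf-8"))
```

Whether OxGarage accepts exactly this layout is an assumption on my part, based on the fact that the ASCII-only submission without base64 encoding works.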

Thanks in advance!

Tim

--
Tim A. Thompson
Metadata Librarian (Spanish/Portuguese Specialty)
Princeton University Library

www.linkedin.com/in/timathompson
[email protected]
Title: Test multipart-post

Click to test multipart-post from RESTXQ:

A small HTML doc containing the following MathML markup will be submitted by the server to an instance of the OxGarage service.

β 2

Attachment: multipart-test.xqm
Description: Binary data
