Danny,

Thanks for the suggestion. I had never spotted the options to xdmp:quote before, but unfortunately they did not help. I tried "ASCII" and "ISO-8859-1", which are both valid values for the output-encoding option, but neither had any effect on the error message I am getting.

Neil.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: 03 December 2009 19:25
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] Upload Data via Form - Invalid UTF-8 Escape Sequence

I am not sure if this will work, but you can try using the <output-encoding> option to xdmp:quote. Something like:

    text {
      xdmp:quote(
        xdmp:get-request-field("upload"),
        <options xmlns="xdmp:quote">
          <output-encoding>ASCII</output-encoding>
        </options>
      )
    }

-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Neil Bradley
Sent: Thursday, December 03, 2009 1:30 AM
To: [email protected]
Subject: [MarkLogic Dev General] Upload Data via Form - Invalid UTF-8 Escape Sequence

Hi,

I have a requirement to import data from spreadsheets and databases, using tab-separated text format, which I convert to XML. The problem I am having occurs when the source data comes from Excel and contains a pound symbol (or, I suspect, any character with a code above 127).

Initially, the problem was that the text file was not recognised by the browser as text, so it came in as "application/octet-stream" instead of "text/plain", but I solved that using the following technique:

    text { xdmp:quote( xdmp:get-request-field("upload") ) }

That solved the problem when the pound symbol was not in the data (and it also works when the data arrives in "text/plain" format, so it covers both scenarios). But when the pound symbol was present, I got the following error:

    XDMP-UTF8SEQ: xdmp:quote(binary{"46756e64204e616d650944617465094e65742041737365742056616c75650944..."}) -- Invalid UTF-8 escape sequence
    in /test/UploadData.xqy, on line 61 [1.0-ml]

Now, I have opened the file I am uploading in TextPad, which tells me it is a PC format ANSI text file, so I guess that might explain the UTF-8 error. The document is NOT in UTF-8, so I think it needs converting from ANSI to UTF-8. Any idea how to do that in this form-upload scenario?

Thanks

Neil.

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
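
On the closing question of the original message (converting the ANSI upload to UTF-8), one possible approach is to decode the binary node explicitly rather than passing it to xdmp:quote. This is only a sketch and has not been run against the setup described in the thread. It assumes that the server release in use provides xdmp:binary-decode (older releases may not), that the "ANSI" file can be treated as ISO-8859-1 (the pound symbol is byte A3 in both ISO-8859-1 and Windows-1252), and that the field arrives as a binary node only when the browser sends it as application/octet-stream.

```xquery
xquery version "1.0-ml";

(: Assumption: the uploaded file is single-byte "ANSI" text,
   treated here as ISO-8859-1. :)
let $upload := xdmp:get-request-field("upload")
return
  if ($upload instance of binary())
  then
    (: application/octet-stream case: decode the raw bytes from the
       assumed source encoding into a Unicode string :)
    xdmp:binary-decode($upload, "ISO-8859-1")
  else
    (: text/plain case: the field already arrives as a string :)
    $upload
```

If the source files may contain characters that exist only in Windows-1252 (curly quotes or the euro sign, for example), the encoding name passed to the decode call would need to change accordingly, provided the server accepts that name.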
