Hi Geert,

You can specify the encoding with the <encoding> option to xdmp:document-get or xdmp:document-load. You do have to know the encoding in advance, though: neither function picks up the encoding named in the document's XML declaration on its own, and both default to UTF-8 when no encoding option is given.
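
As a rough sketch of the <encoding> option mentioned above: the file path and the ISO-8859-1 value here are hypothetical placeholders, so substitute whatever you know to be true of your own file.

```xquery
xquery version "1.0-ml";

(: Hypothetical values for illustration only; use the real path and the
   encoding you already know the file is in. :)
let $path := "/tmp/latin1-sample.xml"
let $encoding := "ISO-8859-1"
return
  (: The options element lives in the "xdmp:document-get" namespace.
     Without <encoding>, the file is read as UTF-8. :)
  xdmp:document-get(
    $path,
    <options xmlns="xdmp:document-get">
      <encoding>{$encoding}</encoding>
    </options>)
```

For xdmp:document-load the same <encoding> element goes inside an options element in the "xdmp:document-load" namespace instead.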
-Danny

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Geert Josten
Sent: Wednesday, March 25, 2009 6:07 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Importing xml with unpredictable encoding

Hi,

Is it correct that the MarkLogic built-in functions xdmp:document-load and xdmp:document-get do not respect the encoding specified in the XML declaration? They expect UTF-8 by default and otherwise try to read the file with the encoding specified in the options.

Is there a way to pick up the encoding from the XML declaration automatically? I tried reading the file with something like xdmp:filesystem-file and then (rather ugly) parsing the string with string functions, but it chokes with the message that the string contains a bad codepoint (SVC-BAD: ... -- Bad CodepointIterator::_next).

Any ideas?

Kind regards,
Geert

Drs. G.P.H. Josten
Consultant
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
http://www.daidalos.nl/
KvK 27164984

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
