Hi Geert,

You can specify the encoding with the <encoding> option to
xdmp:document-get or xdmp:document-load.  You do have to know the
encoding in advance, though: neither function will pick up an encoding
from the header of the document on its own, and both default to UTF-8
when no encoding is given.
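
For illustration, here is a minimal sketch of passing the option to
xdmp:document-get.  The file path and the ISO-8859-1 value are made-up
placeholders, so substitute whatever your files actually use:

```xquery
xquery version "1.0-ml";

(: Hypothetical path and encoding, for illustration only. :)
xdmp:document-get(
  "/tmp/example.xml",
  <options xmlns="xdmp:document-get">
    <encoding>ISO-8859-1</encoding>
  </options>
)
```

xdmp:document-load should accept the same <encoding> element, with the
options node in the "xdmp:document-load" namespace instead.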

-Danny

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Geert
Josten
Sent: Wednesday, March 25, 2009 6:07 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Importing xml with unpredictable
encoding

Hi,

Is it correct that the MarkLogic built-in functions xdmp:document-load
and xdmp:document-get do not respect the encoding specification in the
XML declaration? They expect UTF-8 by default and otherwise try to
consume the file with the encoding specified in the options. Is there a
way to have them honor the encoding given in the XML declaration?

I tried using something like xdmp:filesystem-file and (rather ugly) try
parsing the string with string functions, but it chokes with the message
that the string contains a bad codepoint (SVC-BAD: ... -- Bad
CodepointIterator::_next).

Any ideas?

Kind regards,
Geert


Drs. G.P.H. Josten
Consultant


http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
http://www.daidalos.nl/
KvK 27164984
The information sent in or with this email message originates from
Daidalos BV and is intended solely for the addressee. If you have
received this message unintentionally, we ask that you delete it.
No rights can be derived from this message.



_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general