Hi Folks,
I am using xdmp:document-load to insert content into MarkLogic. Until recently I had only been loading UTF-8 XML into the database, but I have started encountering some ISO-8859-1 encoded content. I was able to adjust the xdmp:document-load options to accommodate ISO-8859-1, and for the most part it has been working.

However, the ISO-8859-1 content occasionally includes HTML character entities such as &sim;, which appear to cause the load to fail. When the error is not trapped with a try-catch block, the failure produces an XDMP-DOCUNEOF error message; when it is trapped with a try-catch block, it produces an XDMP-DOCENTITYREF error message.

Is there a simple way to add a list of character entity mappings to get this to work? For example, I've read that &sim; maps to the Unicode character U+223C <http://www.fileformat.info/info/unicode/char/223c/index.htm> (http://code.google.com/p/doctype/wiki/SimCharacterEntity).

Thanks ahead of time for any help with this!

Tim Meagher
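
For illustration, the kind of entity mapping asked about here could be sketched as a preprocessing step run on the text before it is loaded. This is only a minimal Python sketch, not MarkLogic functionality: the function name named_to_numeric is hypothetical, and it assumes the HTML 4 entity table in Python's standard html.entities module covers the entities in the content (it does include sim, U+223C). Named entities are rewritten as numeric character references, which an XML parser accepts without a DTD.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself must be left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named entity references such as &sim; or &nbsp;
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def named_to_numeric(text):
    """Rewrite HTML named entities as numeric character references.

    Hypothetical helper: unknown names and the XML predefined
    entities are returned unchanged.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#x{:X};".format(name2codepoint[name])
    return ENTITY_RE.sub(replace, text)

if __name__ == "__main__":
    print(named_to_numeric("<p>a &sim; b &amp; c</p>"))
```

Note that this simple pattern also rewrites entity-like text inside CDATA sections and comments, so it is only suitable where that is acceptable.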
