I'm trying to load some documents that come to me with the following error:

<AuthEnty affiliation="&quot;Carleton University. Data Centre&quot; ">

I know that entities are not allowed in attributes but I can't change the program that generates these files.

I want to load them into MarkLogic and have ML take care of repairing them. When I use:

xdmp:document-load ("/odesi/slid_75M0010_E_2003ke.xml",
<options xmlns="xdmp:document-load">
      <repair>full</repair>
      <permissions>{xdmp:default-permissions()}</permissions>
</options>)

the documents load and the entities are removed.  Great.

But I'd like to use recordloader for this task -- particularly in conjunction with the AutoLoader program.

When running recordloader I set the XML_REPAIR_LEVEL=FULL property, as in:

java -cp recordloader.jar:xcc.jar:xpp3-1.1.4c.jar -DXML_REPAIR_LEVEL=FULL

but when recordloader tries to load the documents with these errors, instead of correcting them it says:

SEVERE: exception
com.marklogic.xcc.exceptions.XQueryException: XDMP-STARTTAGCHAR: Unexpected character "U" in start tag at cipo_912_1_E_1989-12/481 line 12
in /insert

Line 12 of that document has this in it:

<AuthEnty affiliation="&quot;Carleton University. Data Centre&quot; ">
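
For what it's worth, the position of the error (the "U" of "University") is what I would expect if the quotes reached the parser as literal " characters instead of &quot; references. I have not confirmed that this is what happens inside recordloader; it is only an assumption. Below is a minimal Python sketch of the difference, using the standard library parser rather than anything from MarkLogic. The `decoded` string and the `is_well_formed` helper are hypothetical, made up for illustration:

```python
import xml.etree.ElementTree as ET

# Attribute value written with &quot; references, as it appears in the post
# (self-closed here so the fragment is a complete document).
escaped = '<AuthEnty affiliation="&quot;Carleton University. Data Centre&quot; "/>'

# Hypothetical variant: the same element after the references have been
# decoded to literal quotes without re-escaping them.
decoded = '<AuthEnty affiliation=""Carleton University. Data Centre" "/>'


def is_well_formed(text):
    """Return True if the standard library XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print("escaped:", is_well_formed(escaped))
print("decoded:", is_well_formed(decoded))
```

With the standard library parser the first form is accepted and the second is rejected, which is why I suspect something on the recordloader side is seeing the decoded form.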


Any ideas or help would be appreciated.  Thanks in advance.

Alan

