Hi,
I am getting a weird encoding issue when importing documents.
While my local copy of MarkLogic has NO problem loading a document, when I
try to load the same document on a remote server the following error is
raised:
[1.0-ml] XDMP-DOCROOTTEXT: xdmp:document-load("C:\TestFile.html", (),
<options xmlns="xdmp:eval"><database>16453038828028925603</database><modules>0</modules><de...</options>)
-- Invalid root text "" at C:\TestFile.html line 1
The document loads OK in IE, which reports that its encoding is
"Unicode-UTF-8", and it opens OK in Oxygen too. When I open the document in
Notepad I do not see any unusual characters on line one, but I did not
really expect to. I recall that the bytes FE and FF are used in Unicode to
indicate whether the text is stored low byte or high byte first (but is
that only in UTF-16?).
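Since Notepad hides a byte order mark, one way to check is to look at the raw
leading bytes of the file. The following is a minimal Python sketch, assuming
a BOM is what the loader is tripping over (the error message alone does not
confirm that). The names detect_bom and detect_bom_in_file are made up for
illustration. For reference, the UTF-8 BOM is EF BB BF, while FF FE and FE FF
are the little-endian and big-endian UTF-16 BOMs.

```python
import codecs

# Known byte order marks. UTF-32 LE is listed before UTF-16 LE because its
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def detect_bom_in_file(path):
    """Read only the first four bytes of the file and check them for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

Calling detect_bom_in_file(r"C:\TestFile.html") would then show whether the
file starts with a BOM and, if so, which encoding it signals.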
I tried adding <encoding> options to xdmp:document-load(), but none of the
values I tried helped.
My local configuration is:
Architecture: i686
Platform: winnt
Host: neil-pc
MarkLogic Product Edition: Standard
MarkLogic Product Version: 4.2-5
The server configuration is:
Architecture: amd64
Platform: winnt
Host: dgdbsrv1.dg.local
MarkLogic Product Edition: Enterprise
MarkLogic Product Version: 4.0-3
So the obvious question is whether differences between 4.0 and 4.2 account
for the error appearing on the server but not locally. If so, is upgrading
the server the only solution? Or is there an easy way to convert the
documents into a format that WILL load into the earlier version of ML?
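If the cause does turn out to be a leading UTF-8 byte order mark (an
assumption, not something the error message confirms), one possible
conversion is to rewrite each file without it before loading. A minimal
Python sketch follows; strip_utf8_bom and rewrite_without_bom are
hypothetical helper names, and the approach only covers UTF-8 files.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present."""
    with open(src_path, "rb") as src:
        cleaned = strip_utf8_bom(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
```

The rest of the file is copied byte for byte, so the content stays valid
UTF-8. Whether the 4.0 server then accepts the document would still need to
be tested.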
Regards,
Neil.