Hi Dave,

All content in MarkLogic Server is stored in UTF-8.  There are mechanisms in 
the server for transcoding other encodings to UTF-8.  These are accessible via 
several built-in XQuery loading functions, such as xdmp:document-load (see the 
<encoding> option):

http://developer.marklogic.com/pubs/4.0/apidocs/UpdateBuiltins.html#xdmp:document-load

You can also specify the encoding in XCC, which is the Java (or .NET) interface 
to MarkLogic Server.  In general, the developer (or some program) must tell 
MarkLogic Server the source encoding; the content is then translated to UTF-8 
on the way into the server.
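
As a rough illustration of what "tell it the encoding, then translate to UTF-8" 
amounts to, here is a minimal sketch in plain JDK Java.  It is not the XCC API 
and not how the server does it internally; the class name TranscodeSketch and 
the method toUtf8 are made up for this example.  The caller declares the source 
encoding, the bytes are decoded with it, and the text is re-encoded as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeSketch {

    // Hypothetical helper: decode the stream using the encoding the caller
    // declares, then re-encode the resulting text as UTF-8.
    public static byte[] toUtf8(InputStream in, String declaredEncoding)
            throws IOException {
        // Throws an unchecked exception if the encoding name is unknown.
        Charset source = Charset.forName(declaredEncoding);
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, source)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                text.append(buf, 0, n);
            }
        }
        return text.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "caf" followed by e-acute (0xE9) as ISO-8859-1 bytes.
        byte[] latin1 = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};
        byte[] utf8 = toUtf8(new ByteArrayInputStream(latin1), "ISO-8859-1");
        // The single Latin-1 byte 0xE9 becomes the two UTF-8 bytes 0xC3 0xA9.
        System.out.println("input bytes: " + latin1.length
                + ", UTF-8 bytes: " + utf8.length);
    }
}
```

The point of the sketch is only that the source encoding has to come from 
somewhere: if the wrong one is declared, the bytes still decode, but to the 
wrong characters.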

You will also find more information about this in the Developer's Guide:

http://developer.marklogic.com/pubs/4.0/books/dev_guide.pdf

See the Chapter "Encodings and Collations" on page 260.

Note that RecordLoader is not part of MarkLogic Server.  It is an open-source 
project designed to help developers, and it is supported by the community 
(this list, for example).  I think it does support non-UTF-8 encodings, but 
unfortunately I do not know much about RecordLoader.  Someone else on the 
list probably does.

-Danny


 
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dave Pawson
Sent: Wednesday, December 03, 2008 7:23 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Re: XDBC creation / access

2008/12/1 Dave Pawson <[EMAIL PROTECTED]>:
> I note in Loader.java
>
>  try {
>            xpp = config.getXppFactory().newPullParser();
>            xpp.setInput(new InputStreamReader(input, decoder));
>            // TODO feature isn't supported by xpp3 - look at xpp5?
>            // xpp.setFeature(XmlPullParser.FEATURE_DETECT_ENCODING, true);
>            // TODO feature isn't supported by xpp3 - look at xpp5?
>            // xpp.setFeature(XmlPullParser.FEATURE_PROCESS_DOCDECL, true);
>            xpp
>                    .setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES,
>                            true);
>        } catch (XmlPullParserException e) {
>            throw new FatalException(e);
>        }
>
>
> Does that mean this code only supports utf-8 encodings?

I note from http://www.xmlpull.org/ that the recommended implementation
(http://www.extreme.indiana.edu/dist/java-repository/xpp3/distributions/?M=A)
also has no mention of encoding.

http://www.xmlpull.org/v1/doc/api/org/xmlpull/v1/XmlPullParserFactory.html#setFeature(java.lang.String,%20boolean)
has setFeature ... but
http://www.extreme.indiana.edu/viewcvs/~checkout~/XPP3/java/src/java/api/org/xmlpull/v1/XmlPullParser.java
this has no mention of encoding.

The whole emphasis seems to be on speed rather than completeness.

Is the implication that utf-8 is the only encoding usable via this
interface... or MarkLogic?

regards


-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general