Hi Chris,

Yes, it looks like this approach may be more appropriate for my needs. After
posting my request I found out that we need to support "custom" character
encodings (during import and export) for characters that fall into the Unicode
Private Use Area; otherwise I would try to use the xdmp:output options.
That certainly makes it interesting!

You're 2 for 2 in describing viable solutions to some of my recent custom
requirements.

Thanks!

Tim M.

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Christopher Hamlin
Sent: Wednesday, September 10, 2014 5:15 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to export XML to ASCII text with ISO
encodings?

Hi Tim,

Maybe simplistic, but you could do it with a map:map and analyze-string.  Search
for entities on the way in and look up the names.  Search for the Unicode char
ranges that you want to convert on the way out, and look them up in the
inversion of the map:map.  Something like:

xquery version "1.0-ml";

declare namespace s = "http://www.w3.org/2005/xpath-functions";

declare function local:replace-hits-from-map ($in, $regex, $map) {
      fn:string-join ((
          let $checked := fn:analyze-string ($in, $regex)
          for $bit in $checked/*
          let $text := $bit/text()
          return
              if ($bit/self::s:non-match) then $text else map:get ($map, $text)
      ), '')
};

let $conf := map:entry ('&amp;rarrow;', '→')
let $in := 'foo &amp;rarrow; bar'
let $imported := local:replace-hits-from-map ($in, '&amp;[^;]+;', $conf)
let $exported := local:replace-hits-from-map ($imported, '[ÿ-]', -$conf)
return ($in, $imported, $exported)

You could serialize the map somewhere and deserialize it when needed.
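
A minimal sketch of that round trip, assuming the usual MarkLogic behavior that
a map:map placed inside an element constructor is written out in its XML form
and that map:map() accepts that element back. The wrapper element name is just
a placeholder. To persist the map you would presumably store the map:map
element with xdmp:document-insert and read it back with fn:doc.

```xquery
xquery version "1.0-ml";

let $conf := map:entry ('&amp;rarrow;', '→')
(: Embedding the map in an element constructor yields its XML representation. :)
let $stored := <wrapper>{$conf}</wrapper>/map:map
(: map:map() rebuilds an in-memory map from that element. :)
let $restored := map:map ($stored)
return map:get ($restored, '&amp;rarrow;')
```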

Of course it could get huge depending on how much of Unicode you need.
When I had this it was just a case of people adding things as they needed them,
and translating short strings. So I was OK just having obvious error checking
and putting in the new mappings when they arose.
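
One way that error checking might look, as a sketch only: a variant of the
function above that raises an error instead of silently dropping a match with
no map entry. The function name local:replace-hits-strict and the error name
local:UNMAPPED are made up for illustration.

```xquery
xquery version "1.0-ml";

declare namespace s = "http://www.w3.org/2005/xpath-functions";

(: Same idea as replace-hits-from-map, but unmapped matches raise an error. :)
declare function local:replace-hits-strict ($in, $regex, $map) {
      fn:string-join ((
          for $bit in fn:analyze-string ($in, $regex)/*
          let $text := fn:string ($bit)
          return
              if ($bit/self::s:non-match) then $text
              else
                  let $hit := map:get ($map, $text)
                  return
                      if (fn:exists ($hit)) then $hit
                      else fn:error (xs:QName ('local:UNMAPPED'),
                                     fn:concat ('No mapping for ', $text))
      ), '')
};

local:replace-hits-strict ('foo &amp;rarrow; bar', '&amp;[^;]+;',
                           map:entry ('&amp;rarrow;', '→'))
```

The error message names the offending string, so a new mapping can be added
when one turns up.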

- Chris
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general

