Thanks, Indy. How is that meant to flow? Something like this?
declare namespace xhtml = "http://www.w3.org/1999/xhtml";
declare function do:makeXMLsafe( $Str as xs:string ) {
  let $Str := fn:escape-html-uri($Str)
  let $Str := xdmp:tidy($Str,
    <options xmlns="xdmp:tidy"><output-xhtml>yes</output-xhtml></options>
  )[2]/xhtml:html/xhtml:body/node()
  return $Str
};
From: <[email protected]> on behalf of Indrajeet Verma
<[email protected]>
Reply-To: MarkLogic <[email protected]>
Date: Wednesday, February 8, 2017 at 10:28 AM
To: MarkLogic <[email protected]>
Subject: Re: [MarkLogic Dev General] Is xdml:unquote appropriate for handling
accent characters?
See if this works for you.
declare namespace xhtml = "http://www.w3.org/1999/xhtml";
xdmp:tidy($Str,
  <options xmlns="xdmp:tidy"><output-xhtml>yes</output-xhtml></options>
)[2]/xhtml:html/xhtml:body/node()
Regards,
Indy
On Wed, Feb 8, 2017 at 11:40 PM, Kari Cowan
<[email protected]> wrote:
I guess I can make it palatable with the function I added below, then have
them unfurl it on the front end. When I pulled the actual doc source, even
though ‘Pokémon’ displayed in Qconsole, it was actually encoded as è.
declare function do:makeXMLsafe( $Str as xs:string ) {
  let $Str := fn:escape-html-uri($Str)
  return $Str
};
>> changes ‘Pokémon’ to ‘Pok%C3%A9mon’
Is there any better way to deal with it?
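
For reference, the percent-encoding shown above is reversible, so whoever consumes the value can recover the original text. A minimal sketch, assuming the decoding happens in MarkLogic via xdmp:url-decode (a browser front end would use its own URI decoder instead); the variable names are illustrative only and are not from the original code:

```xquery
xquery version "1.0-ml";

(: Sketch only: round-trip a non-ASCII string through percent-encoding. :)
(: "&#233;" is a character reference for é (U+00E9), used so the example :)
(: does not depend on the encoding of this source text.                  :)
let $original := "Pok&#233;mon"

(: fn:escape-html-uri percent-encodes the UTF-8 bytes of non-ASCII       :)
(: characters; the thread reports "Pok%C3%A9mon" for this input.         :)
let $escaped := fn:escape-html-uri($original)

(: xdmp:url-decode reverses the percent-encoding. :)
let $decoded := xdmp:url-decode($escaped)

(: Return the escaped form and whether the round trip preserved the text. :)
return ($escaped, $decoded eq $original)
```

Note that this only changes how the string is transported; it does not explain why the unescaped character came out as unknown in the first place.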
From: <[email protected]> on behalf of Kari Cowan <[email protected]>
Reply-To: MarkLogic <[email protected]>
Date: Tuesday, February 7, 2017 at 2:34 PM
To: MarkLogic <[email protected]>
Subject: Re: [MarkLogic Dev General] Is xdml:unquote appropriate for handling
accent characters?
(Note: Outlook stripped out the unknown character below; in the <title> node
it was “Pok?mon”.)
From: Kari Cowan <[email protected]>
Date: Tuesday, February 7, 2017 at 2:31 PM
To: MarkLogic <[email protected]>
Subject: Is xdml:unquote appropriate for handling accent characters?
The doc contains a node with text including an acute accent (é), for example:
<HEADLINE>VOIR DIRE: Pokémon Drive?</HEADLINE>
I tried to handle it with:
let $theTitle:=xdmp:unquote($theTitle, "", ("repair-full"))
But I still get output with an unknown character in the XML:
<title>VOIR DIRE: Pokmon Drive?</title>
>> XML Parsing Error: not well-formed
Anyone have a tip they can share on how to handle it?
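
One way to check what is actually stored, rather than what the console or mail client renders, is to list the non-ASCII codepoints in the string. A minimal sketch using only standard XQuery functions; the literal title below is a stand-in for the real $theTitle value, which is an assumption for illustration:

```xquery
xquery version "1.0-ml";

(: Sketch only: report each non-ASCII character in a string together     :)
(: with its decimal Unicode codepoint.                                   :)
(: "&#233;" is a character reference for é (U+00E9).                     :)
let $title := "VOIR DIRE: Pok&#233;mon Drive?"
for $cp in fn:string-to-codepoints($title)
where $cp gt 127
return fn:concat($cp, " ", fn:codepoints-to-string($cp))
```

For an é (U+00E9) this lists codepoint 233; an è would show up as 232. If the stored codepoint is the expected one, the problem is more likely in how the output is encoded or displayed than in the document itself.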
_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general