Hi Jim,
You might have had an easier time running the text w/ character entities
through xdmp:unquote rather than xdmp:tidy.
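
A minimal, untested sketch of that idea, written in the same old-style syntax as the function quoted below. It assumes xdmp:unquote resolves the named HTML entities when it parses the string, and that the field value contains no bare "<" or "&" characters (those would make the parse fail). The function name unquoteText and the wrapper element x are made up for illustration.

```xquery
(: Hypothetical helper: wrap the raw field text in a throwaway element,
 : parse it with xdmp:unquote, and take the string value of the result.
 : Assumption: the parser resolves entities such as &uml; itself. :)
define function unquoteText($text as xs:string) as xs:string
{
  string(xdmp:unquote(concat("<x>", $text, "</x>"))/x)
}
```

Called on each text() node in place of the xdmp:tidy expression, this would skip the html/body wrapper that tidy adds around its output.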
-jh-
James A. Robinson wrote:
So many years ago, when we were an SGML-only shop, somebody somewhere
decided it would be a Great Idea to stick SGML entities, e.g., &uml;,
into the "ascii" fields of our databases. For example, an author table
might have a field fname_ascii to indicate a first name, and when one
queries that one finds a mixture of what are indeed ASCII characters --
which happen to require running through an SGML/XML entity resolver to
be usable!
Of course, the idea at the time was to write to browsers, and not worry
about the contents, and so nobody bothered to construct a mapping of
the entities they used; they just used HTML entities. Bah.
So now it's our turn to clean up the mess. I was using mlsql, which was
very neat. Naturally it treats the various fields of the DB as text
and slurps up '&uml;' as '&amp;uml;'.
So I wanted to see if xdmp:tidy could deal with it. My first attempts
at processing the entire sql:response were not productive: if I pass in
an element() it translates &uml; into an NCR but strips all elements,
and if I pass in the result of xdmp:quote($sql_result) it leaves the
&amp;uml; declarations unmolested.
So I ended up writing this, but I was wondering if anyone has done
something similar in perhaps a more efficient manner?
import module namespace sql = "http://xqdev.com/sql"
    at "/modules/mlsql/sql.xqy"

declare namespace html="http://www.w3.org/1999/xhtml"

(:~
 : Run the text() nodes of an element through xdmp:tidy (useful for
 : translating escaped HTML entities into NCRs).
 : @param $input element to clean.
 : @return $input with any text() nodes passed via xdmp:tidy.
 :)
define function tidyText($input as element())
  as element()
{
  element {node-name($input)} {
    for $node in $input/node()
    return
      if ($node instance of element())
      then tidyText($node)
      else if ($node instance of text())
      then normalize-space(xdmp:tidy($node)/html:html/html:body/text())
      else $node
  }
}
So:
tidyText(sql:execute($query, $mlsqlserver, ()))
will return the sql:result with its text passed via tidy.
Jim
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
James A. Robinson [EMAIL PROTECTED]
Stanford University HighWire Press http://highwire.stanford.edu/
+1 650 7237294 (Work) +1 650 7259335 (Fax)
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general