I noticed that Matt Turner and Kelly's XMP trick (http://xquery.typepad.com/xquery/2006/09/the_xmp_trick.html) needs some work under 4.0-x and 1.0-ml. The problem seems to be that string functions no longer ignore invalid UTF-8 byte sequences, so the substring-*(xdmp:quote($binary)) idiom now fails.

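Here's my update, which sidesteps the problem by searching the hex form of the binary for the hex-encoded start and end tags: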

xquery version "1.0-ml";

declare namespace dc="http://purl.org/dc/elements/1.1/";
declare namespace tiff="http://ns.adobe.com/tiff/1.0/";
declare namespace rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
declare namespace x="adobe:ns:meta/";

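(: Hex-encode a string so it can be compared with the uppercase string
   form of xs:hexBinary. Intended for ASCII input such as the tag names
   below: codepoints are emitted as-is, with no UTF-8 encoding step. :)
declare function local:string-to-hex($str as xs:string)
 as xs:string
{
  upper-case(string-join(
    for $c in string-to-codepoints($str)
    return xdmp:integer-to-hex($c), ''))
};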

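(: Hex-encode the tags so they can be matched against the hex string. :)
let $start-tag-hex := local:string-to-hex("<x:xmpmeta")
let $end-tag-hex := local:string-to-hex("</x:xmpmeta>")
(: Work on the hex form of the binary, which is always a valid string. :)
let $xmp := string(xs:hexBinary(doc(
  "http://xquery.typepad.com/photos/uncategorized/maineboat.jpg")))
(: Cut at the end tag first, then drop everything up to the start tag. :)
let $xmp := substring-before($xmp, $end-tag-hex)
let $xmp := substring-after($xmp, $start-tag-hex)
(: Restore the tags that substring-before/after stripped. :)
let $xmp := concat($start-tag-hex, $xmp, $end-tag-hex)
(: Decode the hex back into a binary, quote it to text, and parse as XML. :)
return xdmp:unquote(xdmp:quote(binary { xs:hexBinary($xmp) }))/*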

In my tests, the XMP packet tends to appear early in the binary, so performance is better when calling substring-before() first: that leaves only a short prefix for substring-after() to work on.
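
To make the hex matching concrete, here is a minimal sketch that runs the same steps on a small in-memory string instead of a stored JPEG. The sample packet and the surrounding bytes (FFD8FFE1 and FFD9) are made up for illustration, and the variable names are hypothetical; only local:string-to-hex is taken from the query above.

```xquery
xquery version "1.0-ml";

(: Same helper as in the query above; ASCII input only. :)
declare function local:string-to-hex($str as xs:string)
 as xs:string
{
  upper-case(string-join(
    for $c in string-to-codepoints($str)
    return xdmp:integer-to-hex($c), ''))
};

(: Hypothetical stand-in for string(xs:hexBinary($binary)): a few
   non-text bytes around a tiny XMP-like packet. :)
let $packet := "<x:xmpmeta>hello</x:xmpmeta>"
let $hex := concat("FFD8FFE1", local:string-to-hex($packet), "FFD9")

let $start := local:string-to-hex("<x:xmpmeta")
let $end := local:string-to-hex("</x:xmpmeta>")

(: Cut at the end tag first, then drop everything up to the start tag. :)
let $body := substring-after(substring-before($hex, $end), $start)

return (
  (: hex form of the start tag: 3C783A786D706D657461 :)
  $start,
  (: true when the re-assembled hex equals the hex of the packet :)
  concat($start, $body, $end) eq local:string-to-hex($packet)
)
```

Because both the data and the tags are plain hex digits at this point, the string functions never see the raw bytes of the image, which is why the invalid UTF-8 no longer matters.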

-- Mike
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
