Hi

I am trying to select a value from the XML that is produced by the JHOVE project (http://hul.harvard.edu/jhove/). The XSD describing the XML that results from a jhove command is http://hul.harvard.edu/ois/xml/ns/jhove.

The XSD has another, nested namespace, http://www.loc.gov/mix/, and I am trying to select information at this nested level (namely mix:ImageWidth).

Here is some relevant XML:

<?xml version="1.0" encoding="UTF-8"?>
<jhove xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://hul.harvard.edu/ois/xml/ns/jhove" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/jhove http://hul.harvard.edu/ois/xml/xsd/jhove/1.3/jhove.xsd" name="Jhove" release="1.0" date="2005-05-26">
 <date>2007-12-12T13:16:55-08:00</date>
 <repInfo uri="/opt/itms_import/Labels/FilterUS/packages/10000000018106-20071212131633.itmsp/cover.tif">
  <reportingModule release="1.3" date="2005-05-05">TIFF-hul</reportingModule>
  <lastModified>2007-12-12T13:16:37-08:00</lastModified>
  <size>1088278</size>
...
<mix:mix xmlns:mix="http://www.loc.gov/mix/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/mix/ http://www.loc.gov/mix/mix.xsd">
        <mix:BasicImageParameters>
...
        </mix:BasicImageParameters>
        <mix:ImageCreation>
...
        </mix:ScanningSystemCapture>
        </mix:ImageCreation>
         <mix:ImagingPerformanceAssessment>
          <mix:SpatialMetrics>
           <mix:SamplingFrequencyUnit>2</mix:SamplingFrequencyUnit>
           <mix:XSamplingFrequency>300</mix:XSamplingFrequency>
           <mix:YSamplingFrequency>300</mix:YSamplingFrequency>
           <mix:ImageWidth>600</mix:ImageWidth>
           <mix:ImageLength>600</mix:ImageLength>
         </mix:SpatialMetrics>
         <mix:Energetics>
...
  </properties>
 </repInfo>
</jhove>

I get an XmlObject for the mix section like so:

    private XmlObject mix() {
        XmlObject result = null;
        String namespace = "declare namespace jhove='http://hul.harvard.edu/ois/xml/ns/jhove';";
        String path = "//*[namespace-uri()='http://www.loc.gov/mix/' and local-name()='mix']";
        String query = namespace + path;

        XmlObject[] mixes = document.selectPath(query);
        if ((mixes != null) && (mixes.length > 0))
            result = mixes[0];

        return result;
    }

and it returns this:

<xml-fragment xsi:schemaLocation="http://www.loc.gov/mix/ http://www.loc.gov/mix/mix.xsd" xmlns:mix="http://www.loc.gov/mix/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <mix:BasicImageParameters>
    <mix:Format>
      <mix:MIMEType>image/jpeg</mix:MIMEType>
      <mix:ByteOrder>big-endian</mix:ByteOrder>
      <mix:Compression>
        <mix:CompressionScheme>6</mix:CompressionScheme>
      </mix:Compression>
 ...


I have this code to try and get the width:

    public Double width() {
        Double width = null;
        XmlObject mix = mix();
        String mixNamespace = "declare namespace mix='http://www.loc.gov/mix/';";
        String path = "./mix:ImagingPerformanceAssessment/mix:SpatialMetrics/mix:ImageWidth";
        String query = mixNamespace + path;
        XmlCursor widthCursor = mix.newCursor().execQuery(query);

        if (widthCursor != null) {
            String widthString = widthCursor.getTextValue();
            try {
                width = Double.valueOf(widthString);
            }
            catch (NumberFormatException e) {
                log.info("Unexpected string in the jhove result file for mix:ImageWidth");
            }
        }

        return width;
    }

After execQuery(), though, widthCursor.currentTokenType() is STARTDOC. Am I doing something fundamentally wrong with this approach? BTW, the path /mix:mix/mix:ImagingPerformanceAssessment/mix:SpatialMetrics/mix:ImageWidth is correct when executed in oXygen.
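
For what it is worth, the path itself can also be cross-checked outside of both oXygen and XMLBeans, using only the JDK's built-in javax.xml.xpath API. The following is just a sketch under assumptions: the class name MixWidthCheck, the method imageWidth and the SAMPLE string are made up for illustration, and SAMPLE is a heavily trimmed stand-in for the real JHOVE output (the elided parts are simply left out). It does not use XmlCursor at all, so it only confirms that the namespace binding and the path are sound, not why execQuery() ends up at STARTDOC.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class MixWidthCheck {

    // Hypothetical, trimmed-down stand-in for the JHOVE result file.
    static final String SAMPLE =
          "<jhove xmlns='http://hul.harvard.edu/ois/xml/ns/jhove'>"
        + "<repInfo><properties>"
        + "<mix:mix xmlns:mix='http://www.loc.gov/mix/'>"
        + "<mix:ImagingPerformanceAssessment><mix:SpatialMetrics>"
        + "<mix:ImageWidth>600</mix:ImageWidth>"
        + "<mix:ImageLength>600</mix:ImageLength>"
        + "</mix:SpatialMetrics></mix:ImagingPerformanceAssessment>"
        + "</mix:mix>"
        + "</properties></repInfo>"
        + "</jhove>";

    // Returns the mix:ImageWidth value, or null if the path selects nothing.
    static Double imageWidth(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed for prefixed paths to match
        Document doc = dbf.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the "mix" prefix used in the path to the MIX namespace URI.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "mix".equals(prefix)
                        ? "http://www.loc.gov/mix/"
                        : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String uri) {
                return null; // not needed for evaluating the path
            }
            @Override
            public Iterator<String> getPrefixes(String uri) {
                return Collections.<String>emptyIterator();
            }
        });

        String text = xpath.evaluate(
            "//mix:mix/mix:ImagingPerformanceAssessment"
                + "/mix:SpatialMetrics/mix:ImageWidth",
            doc);
        // evaluate() yields an empty string when no node is selected.
        return text.trim().isEmpty() ? null : Double.valueOf(text.trim());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(imageWidth(SAMPLE));
    }
}
```

If this kind of check finds the value, the remaining suspect would be how the cursor returned by execQuery() is positioned rather than the path or the namespace declaration.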

Thanks for any help.

Donald




