Godmar, let me find some (good) time to address many of your questions. This is really just a lack of time currently ...
Werner

Godmar Back wrote:
> PS:
>
> On Sat, Apr 5, 2008 at 9:31 AM, Godmar Back <[EMAIL PROTECTED]> wrote:
>> On Sat, Apr 5, 2008 at 5:16 AM, Werner Guttmann <[EMAIL PROTECTED]> wrote:
>> >
>> > Given that I would like to know what your question really is about, let's
>> > see what your reply is.
>> >
>>
>> Let me give a concrete, practical example.
>>
>> I'd like to process the XML that comes from this URL:
>>
>> http://www.worldcat.org/webservices/registry/search/Institutions?version=1.1&operation=searchRetrieve&recordSchema=info%3Arfa%2FrfaRegistry%2FschemaInfos%2FadminData&maximumRecords=10&startRecord=1&resultSetTTL=300&recordPacking=xml&query=local.oclcAccountName+all+%22Virginia+Polytechnic+Institute+and+State+University%22+or+local.institutionAlias+all+%22Virginia+Polytechnic+Institute+and+State+University%22+or+local.institutionName+all+%22Virginia+Polytechnic+Institute+and+State+University%22
>>
>> This is an XML format that bundles a set of search results. Let's focus
>> on the first record retrieved. I extracted it and placed it here:
>> http://top.cs.vt.edu/~gback/srw-record.xml
>> The important excerpt looks like this:
>>
>> <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
>> ..... elements elided ....
>>   <recordData>
>>     <adminData:adminData>
>>
>>       <adminData:resourceID>info:rfa/localhost/Institutions/3064</adminData:resourceID>
>>       <adminData:briefLabel>VIRGINIA POLYTECHNIC
>>         INSTITUTE AND
>>         STATE UNIVERSITY
>>       </adminData:briefLabel>
>>       ... details elided ....
>>     </adminData:adminData>
>>   </recordData>
>> ....
>>
>> The Schema that describes this XML is here:
>> http://www.loc.gov/standards/sru/sru1-1archive/xml-files/srw-types.xsd
>>
>> It describes <recordData> as:
>>
>> <xsd:element name="recordData" type="stringOrXmlFragment" nillable="false"/>
>>
>> <xsd:complexType name="stringOrXmlFragment" mixed="true">
>>   <xsd:sequence>
>>     <xsd:any namespace="##any" processContents="lax" minOccurs="0"
>>              maxOccurs="unbounded"/>
>>   </xsd:sequence>
>> </xsd:complexType>
>>
>> My code to process this XML starts out clean (I'm giving an
>> abbreviated version here:)
>>
>> SearchRetrieveResponse srr =
>>     (SearchRetrieveResponse)SearchRetrieveResponse.unmarshal(
>>         new InputStreamReader(urlconn.getInputStream()));
>> for (RecordType r : srr.getRecords().getRecord()) {
>>     RecordData rdata = r.getRecordData();
>>
>> and now the ugliness starts:
>>
>>     for (Object o : rdata.getAnyObject()) {
>>         AnyNode node = (AnyNode)o;
>>
>> and subsequently, I need to work with "AnyNode", getNamespacePrefix(),
>> getLocalName(), getFirstChild(), getNextSibling(), getNodeType() etc.
>> I needed to write a recursive function in order to retrieve, for
>> instance, the value of the "adminData:briefLabel" child. It's a lot
>> more complicated, btw, than if I used XOM and XPath.
>>
>
> I've since found the schema that expresses the subtree, and I can
> marshal the 'node' element retrieved above into a String, then
> unmarshal from that String, as in:
>
> for (Object o : rdata.getAnyObject()) {
>     AnyNode node = (AnyNode)o;
>
>     Writer w = new StringWriter();
>     Marshaller.marshal(node, w);
>     AdminDataType adData = AdminData.unmarshal(new
>         StringReader(w.toString()));
>     System.out.println(">>>> " + adData.getBriefLabel());
>
> This is, obviously, inefficient. Is there a better way of doing that?
> Can I tell Castor in some way that the XML that "AnyNode" represents
> can really be unmarshaled using type "AdminData"?
>
> - Godmar
>
>> Obviously, what I would like is to be able to specify a schema that
>> describes the structure of the XML fragment within
>> <recordData></recordData>, then tell Castor that this XML can occur
>> there, and generate code for the entire thing and call, for instance
>> rdata.getAdminData().getBriefData()
>>
>> Is it possible to combine schemas in this way?
>>
>> Note that this should be a pretty common challenge for XML processing
>> - there are many formats where a schema describes the structure of an
>> outer carrier format but allows for XML on the inside which is
>> described separately elsewhere. (OAI-PMH and Atom are two other
>> examples.)
>>
>> - Godmar
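
For comparison with the XPath route mentioned in the quoted message, here is a minimal sketch that pulls the briefLabel text out of a record using only the JDK's built-in DOM and XPath support (not XOM, and not Castor). It does not answer the Castor question. The class name, the sample document, and the namespace URI bound to the "adminData" prefix are all placeholders, since the excerpt does not show the real namespace declaration. Matching on local-name() sidesteps the need to know that URI.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class BriefLabelSketch {

    // Hypothetical stand-in for one record from the quoted excerpt.
    // "urn:example:placeholder" is NOT the real adminData namespace URI;
    // the excerpt does not show the actual declaration.
    static final String SAMPLE =
        "<searchRetrieveResponse xmlns=\"http://www.loc.gov/zing/srw/\">"
        + "<recordData>"
        + "<adminData:adminData xmlns:adminData=\"urn:example:placeholder\">"
        + "<adminData:resourceID>info:rfa/localhost/Institutions/3064</adminData:resourceID>"
        + "<adminData:briefLabel>VIRGINIA POLYTECHNIC\n INSTITUTE AND\n STATE UNIVERSITY\n"
        + "</adminData:briefLabel>"
        + "</adminData:adminData>"
        + "</recordData>"
        + "</searchRetrieveResponse>";

    // Returns the whitespace-normalized text of the first briefLabel element
    // found below a recordData element, or "" if there is none.
    static String briefLabel(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for local-name() to ignore prefixes
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
            // Match on local names only, so no namespace context is needed.
            String expr = "normalize-space("
                + "//*[local-name()='recordData']//*[local-name()='briefLabel'])";
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(briefLabel(SAMPLE));
    }
}
```

Matching on local names is a shortcut for a sketch. With the real namespace URI in hand, registering a NamespaceContext on the XPath object would be the stricter choice, because it cannot accidentally match a same-named element from another vocabulary.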

