On Sat, Apr 5, 2008 at 5:16 AM, Werner Guttmann <[EMAIL PROTECTED]> wrote:
>
> Given that I would like to know what your question really is about, let's
> see what your reply is.
>

Let me give a concrete, practical example. I'd like to process the XML
that comes from this URL:

http://www.worldcat.org/webservices/registry/search/Institutions?version=1.1&operation=searchRetrieve&recordSchema=info%3Arfa%2FrfaRegistry%2FschemaInfos%2FadminData&maximumRecords=10&startRecord=1&resultSetTTL=300&recordPacking=xml&query=local.oclcAccountName+all+%22Virginia+Polytechnic+Institute+and+State+University%22+or+local.institutionAlias+all+%22Virginia+Polytechnic+Institute+and+State+University%22+or+local.institutionName+all+%22Virginia+Polytechnic+Institute+and+State+University%22

This is an XML format that bundles a set of search results. Let's focus
on the first record retrieved. I extracted it and placed it here:

http://top.cs.vt.edu/~gback/srw-record.xml

The important excerpt looks like this:

    <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
    ..... elements elided ....
    <recordData>
      <adminData:adminData>
        <adminData:resourceID>info:rfa/localhost/Institutions/3064</adminData:resourceID>
        <adminData:briefLabel>VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY </adminData:briefLabel>
        ... details elided ....
      </adminData:adminData>
    </recordData>
    ....

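As a point of reference for what follows, here is a minimal sketch of pulling the brief label out of a fragment like this one using only the JDK's DOM and XPath classes (neither Castor nor XOM). The class name, the cut-down sample string, and the adminData namespace URI are assumptions made for illustration; the excerpt above does not show the real namespace declaration, so the XPath matches on local names only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class BriefLabelSketch {
    // Cut-down copy of the excerpt. The adminData namespace URI below is a
    // placeholder (assumption); the real declaration is not shown in the excerpt.
    static final String SAMPLE =
        "<searchRetrieveResponse xmlns=\"http://www.loc.gov/zing/srw/\">"
      + "<recordData>"
      + "<adminData:adminData xmlns:adminData=\"urn:example:adminData\">"
      + "<adminData:resourceID>info:rfa/localhost/Institutions/3064</adminData:resourceID>"
      + "<adminData:briefLabel>VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY </adminData:briefLabel>"
      + "</adminData:adminData>"
      + "</recordData>"
      + "</searchRetrieveResponse>";

    // Returns the trimmed text of the first briefLabel element under recordData,
    // or an empty string if there is none.
    static String briefLabel(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // local-name() avoids having to bind prefixes to namespace URIs.
        String expr =
            "string(//*[local-name()='recordData']//*[local-name()='briefLabel'])";
        return xpath.evaluate(expr, doc).trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(briefLabel(SAMPLE));
    }
}
```

Matching on local names keeps the sketch independent of the actual namespace URI; code that knows the URI would normally register a NamespaceContext and use prefixed names instead.
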
The schema that describes this XML is here:
http://www.loc.gov/standards/sru/sru1-1archive/xml-files/srw-types.xsd

It describes <recordData> as:

    <xsd:element name="recordData" type="stringOrXmlFragment" nillable="false"/>
    <xsd:complexType name="stringOrXmlFragment" mixed="true">
      <xsd:sequence>
        <xsd:any namespace="##any" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>

My code to process this XML starts out clean (I'm giving an abbreviated
version here):

    SearchRetrieveResponse srr = (SearchRetrieveResponse)SearchRetrieveResponse.unmarshal(
        new InputStreamReader(urlconn.getInputStream()));
    for (RecordType r : srr.getRecords().getRecord()) {
        RecordData rdata = r.getRecordData();

and now the ugliness starts:

        for (Object o : rdata.getAnyObject()) {
            AnyNode node = (AnyNode)o;

and subsequently, I need to work with "AnyNode", getNamespacePrefix(),
getLocalName(), getFirstChild(), getNextSibling(), getNodeType() etc.
I needed to write a recursive function in order to retrieve, for
instance, the value of the "adminData:briefLabel" child. It's a lot more
complicated, btw, than if I used XOM and XPath.

Obviously, what I would like is to be able to specify a schema that
describes the structure of the XML fragment within
<recordData></recordData>, then tell Castor that this XML can occur
there, and generate code for the entire thing and call, for instance,

    rdata.getAdminData().getBriefData()

Is it possible to combine schemas in this way?

Note that this should be a pretty common challenge for XML processing:
there are many formats where a schema describes the structure of an
outer carrier format but allows for XML on the inside which is described
separately elsewhere. (OAI-PMH and Atom are two other examples.)

 - Godmar

