No problem. Here are some more observations on this to consider when you have time. In one particular application (SRU as used by OCLC), the schema that describes the inlined record data is even referenced in the XML being returned, for instance, like so:

<rootelement>
....
<ad:aboutData xmlns:ad="info:rfa/rfaRegistry/xmlSchemas/Institutions/aboutData"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="info:rfa/rfaRegistry/xmlSchemas/Institutions/aboutData http://worldcat.org/xsd/collections/Institutions/aboutData.xsd">
....

The "schemaLocation" attribute identifies the schema for this particular subtree.

Now, unfortunately, "xsi:schemaLocation" isn't always present. In other situations, the format may use the following approach:

<adminData:adminData xmlns:adminData="info:rfa/rfaRegistry/xmlSchemas/adminData"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

In this case, the schema is referred to by the "info:rfa/...." URI. This particular schema is defined here:
http://worldcat.org/registry/xsd/adminData.xsd
as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="info:rfa/rfaRegistry/xmlSchemas/adminData"
    xmlns="info:rfa/rfaRegistry/xmlSchemas/adminData"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified">

where you'll notice that the 'targetNamespace' attribute matches the 'adminData' namespace where it's used.

Considering that I can download adminData.xsd and aboutData.xsd, and considering that they both contain targetNamespace declarations, it seems that Castor would have all the pieces available to create a single, uniform binding. My question is whether that's possible.

 - Godmar

On Wed, Apr 16, 2008 at 3:37 AM, Werner Guttmann <[EMAIL PROTECTED]> wrote:
> Godmar,
>
> let me find some (good) time to address many of your questions. This is
> really just a lack of time currently ...
>
> Werner
>
> Godmar Back wrote:
> > PS:
> >
> > On Sat, Apr 5, 2008 at 9:31 AM, Godmar Back <[EMAIL PROTECTED]> wrote:
> >> On Sat, Apr 5, 2008 at 5:16 AM, Werner Guttmann <[EMAIL PROTECTED]> wrote:
> >> >
> >> > Given that I would like to know what your question really is about, let's
> >> > see what your reply is.
> >> >
> >>
> >> Let me give a concrete, practical example.
> >>
> >> I'd like to process the XML that comes from this URL:
> >>
> >> http://www.worldcat.org/webservices/registry/search/Institutions?version=1.1&operation=searchRetrieve&recordSchema=info%3Arfa%2FrfaRegistry%2FschemaInfos%2FadminData&maximumRecords=10&startRecord=1&resultSetTTL=300&recordPacking=xml&query=local.oclcAccountName+all+%22Virginia+Polytechnic+Institute+and+State+University%22+or+local.institutionAlias+all+%22Virginia+Polytechnic+Institute+and+State+University%22+or+local.institutionName+all+%22Virginia+Polytechnic+Institute+and+State+University%22
> >>
> >> This is an XML format that bundles a set of search results. Let's focus
> >> on the first record retrieved. I extracted it and placed it here:
> >> http://top.cs.vt.edu/~gback/srw-record.xml
> >> The important excerpt looks like this:
> >>
> >> <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
> >> ..... elements elided ....
> >>   <recordData>
> >>     <adminData:adminData>
> >>       <adminData:resourceID>info:rfa/localhost/Institutions/3064</adminData:resourceID>
> >>       <adminData:briefLabel>VIRGINIA POLYTECHNIC
> >>         INSTITUTE AND
> >>         STATE UNIVERSITY
> >>       </adminData:briefLabel>
> >>       ... details elided ....
> >>     </adminData:adminData>
> >>   </recordData>
> >> ....
> >>
> >> The Schema that describes this XML is here:
> >> http://www.loc.gov/standards/sru/sru1-1archive/xml-files/srw-types.xsd
> >>
> >> It describes <recordData> as:
> >>
> >> <xsd:element name="recordData" type="stringOrXmlFragment" nillable="false"/>
> >>
> >> <xsd:complexType name="stringOrXmlFragment" mixed="true">
> >>   <xsd:sequence>
> >>     <xsd:any namespace="##any" processContents="lax" minOccurs="0"
> >>       maxOccurs="unbounded"/>
> >>   </xsd:sequence>
> >> </xsd:complexType>
> >>
> >> My code to process this XML starts out clean (I'm giving an
> >> abbreviated version here):
> >>
> >>   SearchRetrieveResponse srr =
> >>       (SearchRetrieveResponse)SearchRetrieveResponse.unmarshal(
> >>           new InputStreamReader(urlconn.getInputStream()));
> >>   for (RecordType r : srr.getRecords().getRecord()) {
> >>       RecordData rdata = r.getRecordData();
> >>
> >> and now the ugliness starts:
> >>
> >>       for (Object o : rdata.getAnyObject()) {
> >>           AnyNode node = (AnyNode)o;
> >>
> >> and subsequently, I need to work with "AnyNode", getNamespacePrefix(),
> >> getLocalName(), getFirstChild(), getNextSibling(), getNodeType() etc.
> >> I needed to write a recursive function in order to retrieve, for
> >> instance, the value of the "adminData:briefLabel" child. It's a lot
> >> more complicated, btw, than if I used XOM and XPath.
> >>
> >
> > I've since found the schema that expresses the subtree, and I can
> > marshal the 'node' element retrieved above into a String, then
> > unmarshal from that String, as in:
> >
> >       for (Object o : rdata.getAnyObject()) {
> >           AnyNode node = (AnyNode)o;
> >
> >           Writer w = new StringWriter();
> >           Marshaller.marshal(node, w);
> >           AdminDataType adData = AdminData.unmarshal(
> >               new StringReader(w.toString()));
> >           System.out.println(">>>> " + adData.getBriefLabel());
> >
> > This is, obviously, inefficient. Is there a better way of doing that?
> > Can I tell Castor in some way that the XML that "AnyNode" represents
> > can really be unmarshaled using type "AdminData"?
> >
> >  - Godmar
> >
> >> Obviously, what I would like is to be able to specify a schema that
> >> describes the structure of the XML fragment within
> >> <recordData></recordData>, then tell Castor that this XML can occur
> >> there, and generate code for the entire thing and call, for instance
> >> rdata.getAdminData().getBriefLabel()
> >>
> >> Is it possible to combine schemas in this way?
> >>
> >> Note that this should be a pretty common challenge for XML processing
> >> - there are many formats where a schema describes the structure of an
> >> outer carrier format but allows for XML on the inside which is
> >> described separately elsewhere. (OAI-PMH and Atom are two other
> >> examples.)
> >>
> >>  - Godmar
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe from this list, please visit:
> >
> >     http://xircles.codehaus.org/manage_email
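
To make the namespace observation from the top of this message concrete, here is a minimal sketch of choosing a schema by the namespace URI of the element found inside <recordData>. It is not Castor code and makes no claim about what Castor supports; it uses only the JDK's DOM parser. The class name RecordDataDispatch, the SCHEMA_BY_NAMESPACE map, and the abbreviated SAMPLE document are assumptions made for illustration. The two schema URLs are the ones quoted in the thread.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RecordDataDispatch {
    static final String SRW_NS = "http://www.loc.gov/zing/srw/";
    static final String ADMIN_NS = "info:rfa/rfaRegistry/xmlSchemas/adminData";
    static final String ABOUT_NS = "info:rfa/rfaRegistry/xmlSchemas/Institutions/aboutData";

    // Hypothetical hand-built registry: targetNamespace -> schema location.
    static final Map<String, String> SCHEMA_BY_NAMESPACE = Map.of(
        ADMIN_NS, "http://worldcat.org/registry/xsd/adminData.xsd",
        ABOUT_NS, "http://worldcat.org/xsd/collections/Institutions/aboutData.xsd");

    // Abbreviated stand-in for the SRU response; the real record has more elements.
    static final String SAMPLE =
        "<searchRetrieveResponse xmlns=\"" + SRW_NS + "\">"
      + "<records><record><recordData>"
      + "<adminData:adminData xmlns:adminData=\"" + ADMIN_NS + "\">"
      + "<adminData:briefLabel>VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY"
      + "</adminData:briefLabel>"
      + "</adminData:adminData>"
      + "</recordData></record></records>"
      + "</searchRetrieveResponse>";

    // Returns the first element child of the first <recordData>, or null if there is none.
    static Element firstPayload(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getNamespaceURI() to work
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList recordData = doc.getElementsByTagNameNS(SRW_NS, "recordData");
        if (recordData.getLength() == 0) {
            return null;
        }
        for (Node n = recordData.item(0).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Element payload = firstPayload(SAMPLE);
        String ns = payload.getNamespaceURI();
        // The namespace URI alone is enough to pick the schema, even without xsi:schemaLocation.
        System.out.println("namespace: " + ns);
        System.out.println("schema:    " + SCHEMA_BY_NAMESPACE.get(ns));
        String label = payload.getElementsByTagNameNS(ADMIN_NS, "briefLabel")
            .item(0).getTextContent().trim();
        System.out.println("label:     " + label);
    }
}
```

A binding layer could use a lookup of this kind to decide which generated class to unmarshal the fragment into, which is the step the thread asks whether Castor can perform on its own.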

