On Fri, 26 Oct 2012 08:11:37 -0700, Fernandes, Nivaldo <[email protected]> wrote:
> Point 1
> -------
> As for the addition of defaulted attributes, does this also apply to
> *optional* ones?
>
> Here is an example:
> Original doc in file system:
> <ISBN>some_isbn_value</ISBN>
> Doc after ingestion in MarkLogic:
> <ISBN Type="Set">some_isbn_value</ISBN>
> So, this is clearly not what we want. Is MarkLogic perhaps assuming that
> a Fixed value in an attribute makes it a required attribute?
> Here is the schema definition for the attribute in question:
> <xs:attribute fixed="Set" name="Type" type="xs:string"/>
> So, it is OPTIONAL, otherwise its definition would have been:
> <xs:attribute fixed="Set" name="Type" type="xs:string" use="required"/>
>
> So, I believe we need some clarification here.

Yes, attributes declared as fixed="xxx" or as default="xxx" will be
added into the data model as "defaulted attributes", optional ones
included.

    declare option xdmp:output "default-attributes=yes";

(or the default-attributes option in xdmp:quote or xdmp:save) causes
them to be included in the serialized output.

    declare option xdmp:output "default-attributes=no";

causes them to be suppressed.

There is also a global trace event for this: Output Default Attributes.
If it is enabled, they are included.

> Point 2
> -------
> As for the control during serialization, this has implications for the
> ticket I mentioned. And, sorry, my bad, this ticket (#10661) is no
> longer open, but according to my co-worker, your statement regarding
> control is significant...it was not clear to him that this was possible
> from the ticket responses.
> Here is his observation:
> "We found that whitespace-only text nodes were being added to our docs,
> in a few specific places, at serialization. The docs in question were
> not namespaced, and we eventually determined that some unrelated schemas
> in our Schemas db were interacting with these docs, apparently adding
> the unexpected whitespace upon serialization. At the time, as recorded
> in the ticket (10661), we solved the problem by circumventing the
> Schemas db (pointing the main db to itself for its 'Schemas database'
> setting). We did not understand that there would be another way to
> disable the whitespace behavior while still leaving the Schemas setup as
> it was, and would be interested to know more about that. There is some
> discussion in the ticket of setting 'output indent' and 'boundary-space'
> options, but it appears that these did not address the problem."

There are a couple of relevant serialization options:

    declare option xdmp:output "indent-untyped=yes";

means indent/pretty-print content that is untyped and is not
demonstrably mixed content (i.e. it only has element children).

    declare option xdmp:output "indent-untyped=no";

turns that off.

Again, there is a global trace event for this: enabling Output Indent
Untyped is the same as turning this option on.

    declare option xdmp:output "indent=yes";

means to indent/pretty-print content: this will cause elements declared
as element-only to be indented and, depending on the state of
indent-untyped, may cause untyped content to be indented as well.

    declare option xdmp:output "indent=no";

turns all pretty-printing off.

That said, if the XML was constructed not by parsing an existing
document but by direct construction in XQuery, the XQuery boundary
whitespace rules do come into play: we won't discard whitespace that was
put into the data model in the first place.

In MarkLogic 6, the output options can be set in the appserver config,
so you don't have to put them in every query prolog. There are a number
of other output options as well, controlling things such as which
elements get CDATA sections and whether a !DOCTYPE header is emitted.

//Mary
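
For reference, the options discussed above could be combined roughly as follows. This is only a sketch, assuming MarkLogic 6 or later: the document URI /example/book.xml is hypothetical, and the element names inside the xdmp:quote options node are assumed to mirror the prolog option names (default-attributes, indent, indent-untyped).

```xquery
xquery version "1.0-ml";

(: Prolog-level settings. These govern how nodes returned by this
   request are serialized; they would apply if $doc were returned
   directly instead of being passed through xdmp:quote below. :)
declare option xdmp:output "default-attributes=no";
declare option xdmp:output "indent=no";
declare option xdmp:output "indent-untyped=no";

(: Hypothetical URI, used only for illustration. :)
let $doc := fn:doc("/example/book.xml")
return
  (: Per-call alternative: pass the same settings to xdmp:quote.
     Option element names are assumed to match the prolog options. :)
  xdmp:quote(
    $doc,
    <options xmlns="xdmp:quote">
      <default-attributes>no</default-attributes>
      <indent>no</indent>
      <indent-untyped>no</indent-untyped>
    </options>
  )
```

With all three turned off, the quoted string should contain neither the defaulted Type="Set" attribute nor any whitespace added by pretty-printing, while the Schemas database setting stays as it is.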
