On Thu, 25 Oct 2012 11:40:33 -0700, Fernandes, Nivaldo <[email protected]> 
wrote:

> I would like to extend this thread just a bit more...I have concerns
> about the background checking against schemas. I understand the
> technical reasons for type checking but there seem to be some wrinkles
> in the practical outcome.
>
> But to make sure that I understand the gist of your responses to this
> thread: if the Schemas db has a schema document with the same namespace
> as the xml documents in a db that points to said Schemas db, then
> operations on the xml documents that require type checking (e.g.
> fn:data, and many others) will cause MarkLogic to do an IMPLICIT
> verification/check against the schema document.

It isn't doing an implicit validation; it is doing an implicit type
assignment of the specific data you are trying to perform a typed
operation on. This will entail a certain amount of propagation of
that assessment up the document tree if you have local elements.
It doesn't process the whole tree and it doesn't check complex
type validity: it assumes complex type validity.
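
A rough sketch of the difference, in case it helps. The namespace,
document URI, and element names are made up for illustration, and I am
assuming a schema for that namespace is (or is not) in the Schemas db:

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace, document URI, and element names. :)
declare namespace ex = "http://example.com/orders";

let $doc := fn:doc("/orders/1.xml")
return (
  (: Typed operation: a type is assigned only to ex:quantity (plus
     whatever ancestors are needed to resolve a local declaration).
     The rest of the tree is not checked. :)
  fn:data($doc/ex:order/ex:quantity),

  (: Explicit validation: the whole tree is checked against the
     schema, and any problems are reported. :)
  xdmp:validate($doc/ex:order, "strict")
)
```

With no schema in scope for the namespace, the first expression should
simply give back an untyped value; with a schema that declares
ex:quantity as, say, xs:integer, it would come back typed. Only the
second expression is validation in the sense you mean.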

> So, if for some reason (see below) I do not want this background
> checking done when operating on the xml documents, is my only choice
> then NOT to have any schema documents (with same namespace as xml doc)
> in the Schemas db? Sounds fair BUT what if I still want to be able to do
> EXPLICIT validation against the schema??? [BTW, this is how I understood
> things to be with MarkLogic, especially since it claims to be quite
> functional in a Schema-agnostic world.]
>
> In summary, my understanding was that schema validation was totally
> under my control via an EXPLICIT call to validate.

Schema validation, yes. Type assessment, less so.

You can't disable automatic type assessment, but you can make it
essentially a no-op by making sure you explicitly refer to a dummy
schema for that namespace.

I wouldn't recommend this, however. It is tricky to get right, and
there really is not a good reason to not want to use the correct types
for typed operations.

> So, what is my reason for not wanting the implicit validation? Well,
> during a high stress period in my organization, when we reload all our
> databases, I found myself staring at documents being ingested in
> MarkLogic (4.1-7.1) that were mysteriously having an attribute being
> added to them upon ingestion, even though I made sure that nowhere in
> the loading this was explicitly happening. After cracking my head for a
> while, I had the realization to look at the schema in the Schemas
> database being pointed to in the db config, and saw that the attribute
> being added was an *optional* attribute in the schema with a Fixed value
> (i.e. this attribute may not occur but when it does, it always has the
> same value). My next step was to remove the schema document from the
> schema database in order to eliminate the remote possibility that
> MarkLogic was doing some background schema validation (WHICH NOW I KNOW
> IT DOES). To my surprise (and dismay) at the time, the problem was
> solved by removing the schema document: the attribute was no longer
> being incorrectly added to the elements in the xml documents. And by
> golly, no longer was I going to put any schema documents in the Schemas
> database and go through some similar bad experience.
>
> NOTE: similar unwanted interactions between schema and xml documents
> have been experienced by other developers in my organization (ML 4.2-9)
> (with current tickets opened yet still unresolved).

I am not aware of any open tickets or bugs in this area. That doesn't
mean there aren't any, mind you, but I couldn't find them.

But I think there is a bit of a misperception about what is happening here.
Yes, when we parse XML documents, we will add defaulted attributes to the
data model. You can control whether those show up in the serialization or
not, because there are customers who want it one way or the other.

We need the defaulted attributes in the internal data model because many
things would otherwise not work correctly, such as the processing of the
XML Schema documents themselves, or of XSLT stylesheets.

> So, what can be done here? Should MarkLogic perhaps offer a switch in
> its db config page that allows us to NOT want background schema
> validation and avoid its bad side effects? Or?
>
> Please advise.

//Mary
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
