On Tue, 23 Oct 2012 08:47:36 -0700, Tim Meagher <[email protected]> wrote:

> Hi Folks,
>
> To follow up, I'd like to get a clear picture of when the schema database
> is accessed.  It has always been my understanding that the schemas database
> is only accessed when an explicit validation is performed, but from
> experience we're wondering if the schemas database is being accessed during
> ingestion or output.  For example, is the schemas database changing content
> during serialization?
>
> If so, and if the schemas database does not point to itself, could such
> unexpected access to the schemas database get referred to the secondary
> schemas database even if both schema databases are empty?
>
> Thanks!
>
> :) Tim
>


The schemas database is accessed every time an uncached schema
is required for some purpose. That purpose may be explicit validation,
but more often the schema is needed to determine the typed value of a
node, either through an explicit call to fn:data or through implicit
atomization of a value passed to a function or used with an operator.
The schema is also used to determine whitespace handling rules during
parsing and serialization.

Once a schema is accessed, the schema itself is assembled into an
internal data structure, which is cached. Type information on specific
data model instances is also cached.

Even if a schemas database is empty, there will still be a query run
against that database to locate a schema, whether by location or by
namespace URI.
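
To make the lookup-then-cache behaviour concrete, here is a toy model in
Python. It is only a sketch of the idea described above, not how the
server is implemented: SchemaCache, store_queries and the dict standing
in for the schemas database are all made-up names. The thread does not
say whether a failed lookup is remembered; this sketch assumes it is not.

```python
from typing import Callable, Dict, Optional


class SchemaCache:
    """Toy model: query the schema store only when the schema is not cached."""

    def __init__(self, lookup: Callable[[str], Optional[str]]) -> None:
        # 'lookup' stands in for a query against the schemas database.
        self._lookup = lookup
        self._cache: Dict[str, str] = {}
        # Counts how many times the store was actually queried.
        self.store_queries = 0

    def get(self, namespace_uri: str) -> Optional[str]:
        if namespace_uri in self._cache:
            return self._cache[namespace_uri]
        # Cache miss: the store is queried even if it turns out to be empty.
        self.store_queries += 1
        schema = self._lookup(namespace_uri)
        if schema is not None:
            # Assumption: only successful lookups are cached.
            self._cache[namespace_uri] = schema
        return schema


# An empty store still costs one query for the first lookup.
empty = SchemaCache({}.get)
empty.get("http://example.com/ns")
print(empty.store_queries)  # 1

# A populated store is queried once; the second call is served from cache.
full = SchemaCache({"http://example.com/ns": "assembled schema"}.get)
full.get("http://example.com/ns")
full.get("http://example.com/ns")
print(full.store_queries)  # 1
```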

> Just to follow up on that question, is it more advantageous to have a
> schema defined for content and how much impact does that have on whether
> or not the content has a namespace?

There is quite a bit of overhead to processing schemas, so I wouldn't
bother unless you have specific needs regarding whitespace or type
information.  Namespace vs non-namespace is a bit of a wash with
the important caveat that having multiple schema (documents) for the
same namespace (or non-namespace) essentially randomizes the
automatic type assignment unless you are very deliberate and
careful in how you set up your application: you have to make sure
that the correct schema is the one that is chosen for every action
you perform on your content. In practice, it is much easier to use
namespaces where only one root schema document is relevant for
any given namespace, unless you do the "poor man's namespaces"
and make sure that all your local names are distinct.
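
One way to catch the multiple-schema problem early is to list which
namespaces have more than one schema document. The Python sketch below
assumes you have already collected (document URI, target namespace)
pairs by some means of your own; ambiguous_namespaces and the sample
URIs are hypothetical, the empty string stands for "no namespace", and
every document is treated as a root schema document, which may not hold
if you use includes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def ambiguous_namespaces(
    schema_docs: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Return namespaces that are claimed by more than one schema document."""
    by_namespace: Dict[str, List[str]] = defaultdict(list)
    for doc_uri, target_namespace in schema_docs:
        by_namespace[target_namespace].append(doc_uri)
    # Keep only the namespaces where the choice of schema is ambiguous.
    return {
        ns: sorted(uris) for ns, uris in by_namespace.items() if len(uris) > 1
    }


docs = [
    ("/schemas/a.xsd", "http://example.com/a"),
    ("/schemas/b1.xsd", ""),  # no target namespace
    ("/schemas/b2.xsd", ""),  # second no-namespace schema: ambiguous
]
print(ambiguous_namespaces(docs))
# {'': ['/schemas/b1.xsd', '/schemas/b2.xsd']}
```

An empty result means each namespace maps to exactly one schema
document, which is the easy case described above.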

//Mary
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
