Re: [MarkLogic Dev General] Creating a new schemas database

Michael Blakeley Thu, 25 Oct 2012 12:45:49 -0700

You may be conflating two different parts of XQuery that use schemas in 
different ways. Explicit schema validation is structure-aware and type-aware, 
while atomization operations are only type-aware - mostly anyhow. With 
MarkLogic the only way to get automatic validation is to install the CPF schema 
validation pipeline, and that only validates updates or inserts.
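
For contrast, explicit validation is something you ask for in your own code. A minimal sketch using the standard XQuery validate expression, assuming a hypothetical document URI and a matching schema already loaded in the schemas database:

```xquery
(: Hypothetical URI: substitute a document that exists in your database. :)
let $doc := fn:doc("/example/doc.xml")
return
  (: Checks structure and types against the in-scope schema. :)
  (: Raises an error if the document is invalid or no declaration is found. :)
  validate strict { $doc }
```

Nothing like this runs unless you call it.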


Atomization happens all the time: it is implicit when you write '<a>1</a> eq 
<b>1</b>' and get back true(), or explicit when you code 'data(<a>1</a>)' and 
get back... whatever the schema says it should return. Atomization always uses 
XML schema data types, whether the content has a schema or not: sometimes it 
uses xs:untypedAtomic. Atomization pays some attention to complex types, 
because they cannot be atomized. But atomization does not validate XML 
structure.
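
Both forms of atomization can be tried directly in a query console. This is only a sketch, and it assumes no schema is in scope for the no-namespace elements a and b:

```xquery
(: Implicit atomization: eq atomizes both operands before comparing them. :)
<a>1</a> eq <b>1</b>,

(: Explicit atomization: with no schema in scope (an assumption here), :)
(: the atomized value should be an xs:untypedAtomic. :)
fn:data(<a>1</a>) instance of xs:untypedAtomic
```

If a schema did declare a as, say, xs:integer, the second expression would be expected to see an integer instead.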

Sometimes atomization can expose problems in an application. For example, if a 
schema says that a node is an integer, and its value is 'fubar', atomization 
will fail with an error. In that case the developer can fix the schema, 
write code that avoids type-aware atomization, or choose to remove the schema 
entirely.
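
Code that avoids type-aware atomization might look something like the sketch below. The namespace, the element name, and the helper local:safe-integer are all hypothetical, made up for illustration. The idea is to read the string value with fn:string(), which should not depend on the schema type, and cast only when the cast is known to succeed:

```xquery
declare namespace ex = "http://example.com/ns";  (: hypothetical namespace :)

(: Hypothetical helper: returns the node's value as an integer, :)
(: or the empty sequence if the text is not a valid integer. :)
declare function local:safe-integer($node as node()) as xs:integer?
{
  (: fn:string() returns the string value rather than the typed value. :)
  let $s := fn:normalize-space(fn:string($node))
  return
    if ($s castable as xs:integer) then xs:integer($s) else ()
};

local:safe-integer(<ex:count>fubar</ex:count>),
local:safe-integer(<ex:count>42</ex:count>)
```

The trade-off is that bad values are silently skipped rather than reported, which may or may not be what the application wants.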

With all this in mind, I'm not sure what caused the problem you describe. I 
might be missing something, but I struggle to explain the mysterious appearance 
of a new node in the document. Still, without a reproducible test case it's all 
guesswork.

Perhaps you could put together a test case and send it to support? It may be a 
bug: if so, I'm sure they would like to know about it.

-- Mike

On 25 Oct 2012, at 11:40 , "Fernandes, Nivaldo" <[email protected]> wrote:

> Hi Mary,
> 
> I would like to extend this thread just a bit more...I have concerns
> about the background checking against schemas. I understand the
> technical reasons for type checking but there seem to be some wrinkles
> in the practical outcome. 
> 
> But to make sure that I understand the gist of your responses to this
> thread: if the Schemas db has a schema document with the same namespace
> as the xml documents in a db that points to said Schemas db, then
> operations on the xml documents that require type checking (e.g.
> fn:data, and many others) will cause MarkLogic to do an IMPLICIT
> verification/check against the schema document. 
> 
> So, if for some reason (see below) I do not want this background
> checking done when operating on the xml documents, is my only choice
> then NOT to have any schema documents (with same namespace as xml doc)
> in the Schemas db? Sounds fair BUT what if I still want to be able to do
> EXPLICIT validation against the schema??? [BTW, this is how I understood
> things to be with MarkLogic, especially since it claims to be quite
> functional in a Schema-agnostic world.]
> 
> In summary, my understanding was that schema validation was totally
> under my control via an EXPLICIT call to validate. 
> 
> So, what is my reason for not wanting the implicit validation? Well,
> during a high stress period in my organization, when we reload all our
> databases, I found myself staring at documents being ingested in
> MarkLogic (4.1-7.1) that were mysteriously having an attribute being
> added to them upon ingestion, even though I made sure that nowhere in
> the loading this was explicitly happening. After cracking my head for a
> while, I had the realization to look at the schema in the Schemas
> database being pointed to in the db config, and saw that the attribute
> being added was an *optional* attribute in the schema with a Fixed value
> (i.e. this attribute may not occur but when it does, it always has the
> same value). My next step was to remove the schema document from the
> schema database in order to eliminate the remote possibility that
> MarkLogic was doing some background schema validation (WHICH NOW I KNOW
> IT DOES). To my surprise (and dismay) at the time, the problem was
> solved by removing the schema document...no longer the attribute was
> being incorrectly added to the elements in the xml documents. And by
> golly, no longer was I going to put any schema documents in the Schemas
> database and go through some similar bad experience.
> 
> NOTE: similar unwanted interactions between schema and xml documents
> have been experienced by other developers in my organization (ML 4.2-9)
> (with current tickets opened yet still unresolved).
> 
> So, what can be done here? Should MarkLogic perhaps offer a switch in
> its db config page that allows us to NOT want background schema
> validation and avoid its bad side effects? Or?
> 
> Please advise.
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Mary
> Holstege
> Sent: Tuesday, October 23, 2012 1:22 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Creating a new schemas database
> 
> On Tue, 23 Oct 2012 10:17:11 -0700, David Lee <[email protected]> wrote:
> 
>> One more question unanswered (please correct me if I am wrong).
>> 
>> On data ingest ... if range indexes are created and the document has a
>> schema then the schema is used to help type coercion of the values.
>> I found this in particular for list values where "1 2 3" could only
>> make sense as a list of integers with a schema, so that implies that
>> schema is read on (some?) cases of range index population.
>> Since you wouldn't know without looking in the schema, I am presuming
>> the schema DB is searched for ALL ingested data documents (well most
>> likely the cached schema document table)
> 
> Yes. Schemas are also used to determine whether a range index is a list
> type.
> 
> //Mary
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
