This is a very useful summary. I must admit that I have been somewhat shy about using schema validation in MarkLogic ever since I came across this: http://developer.marklogic.com/pipermail/general/2012-October/011576.html
In summary, attributes were inadvertently being added to our data on ingestion... but perhaps this has changed.

From: Ellis Pritchard <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Thursday, February 6, 2014 4:05 AM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Validation against schema issue

Hi Lanz,

Schema validation is probably a neglected feature for most devs using MarkLogic, and unlike most of the rest of ML, there are several 'gotchas' (and even a defect: 19722!) which can make working with schemas a bit of a pain:

1/ A schema split over several files having the same namespace will need Group configuration to point to the root document for the namespace, else ML will pick up a random document from the set and you may get an unexpected type error.

2/ By default, databases share the Schemas database; this is generally a bad idea, and you should probably set a separate schema database for each content database.

3/ If you are using no-namespace schemas, you are very vulnerable to the types conflicting with each other, especially if sharing schema databases.

4/ Due to Bug #19722, ML doesn't automatically pick up changes to schemas; even worse, it can get confused about them when they are re-loaded.

However, if you've got a decently typed schema, it sure saves a lot of casting, and makes data integrity easier to maintain, especially with a pre-commit validation trigger as suggested by Geert.

Ellis.

On 15 Jan 2014, at 09:56, Jakob Fix <[email protected]> wrote:

hi,

thanks for this. a couple of follow-up questions:

- will there be support for xml schema 1.1 at some stage?
- i have the impression that very few people talk about validation of documents on this list. is that because people don't validate? or because it's so easy that it's not worth mentioning?
i'd be interested in patterns related to validation people are using. validation outside of the database? what about validation when a document is updated in the database: how do you ensure the document is still valid? xdmp:validate, schema validation? other options?

On Jan 14, 2014 7:28 PM, "Mary Holstege" <[email protected]> wrote:

I think the problem here is that you are using XSD 1.1 and relying on one of its features. MarkLogic currently doesn't support XSD 1.1. Technically we ought not to even attempt the validation when you have an xs:all extended by an xs:all, but in general MarkLogic doesn't do a great job of schema checking in that way; it mostly just assumes the schemas are OK.

//Mary

On Tue, 14 Jan 2014 09:43:44 -0800, Lanz <[email protected]> wrote:

> Hi all,
>
> Here is the context:
> We use MarkLogic 7.0-1.
> We have a schema database containing our schemas; this db is referenced in
> our doc db as the schema db.
> These schemas (version 1.1) define a base type and 2 extension types (i.e.
> a basic publication as the base type, and a 'summary' and an 'indicator' as
> extension types). The extension types have their own elements in addition
> to the ones from the base type. Some elements can be optional or mandatory;
> they are 'unordered' (using xs:all). All these schemas use the same
> namespace.
> Because the root element is the same for the 2 extension types ('work'), we
> set the 'schemaLocation' attribute on the 'work' root element to be sure ML
> uses the right schema during validation.
> The documents have been validated against their schemas in Oxygen without
> issue.
>
> Here is the issue!
> When we try to validate a document before inserting it in MarkLogic with
> xdmp:validate using either "strict", "lax", or "type" (with its own type),
> it fails.
> The error message mentions the right schema but does not take into account
> the optional elements.
>
> Please find the mentioned (simplified) schema, XML sample and error message
> here: https://gist.github.com/anonymous/8422411
>
> Any help is welcome, many thanks
> Lanz

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
