Thanks, Kelly. It's probably worth exploring why the error is uncatchable.

I believe it's uncatchable because the range index work happens in the commit 
phase, not the eval phase. By then the XQuery has already been evaluated, so 
there is no try-catch left to handle the error and it has to be thrown. The 
approach in my last email tests the same conditions during the eval phase, 
where errors can be caught. Kelly's idea nests an update within a read-only 
query, to similar effect.
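
One way to probe that belief is to put a log call after the insert, inside the 
try body. This is only a sketch, untested here, and the log messages are my 
own labels. If the range index check really is deferred to commit, I'd expect 
the first message to show up in the log and the second not to, because the 
whole body evaluates before the error is thrown.

```xquery
xquery version "1.0-ml";

try {
  xdmp:document-insert(
    'test.xml',
    element test {
      element { QName('http://marklogic.com/xdmp/dls', 'created') } {
        'fubar' }}),
  (: still in the eval phase: the insert is queued, not yet committed :)
  xdmp:log('eval phase finished')
}
catch ($ex) { xdmp:log('caught in the eval phase') }
```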

Which one is better? Setting up a nested evaluation has a fixed cost, whereas 
the cost of data() grows linearly, O(n), with the number of calls. So I think 
the nested eval (or better yet invoke) would be more efficient if you have many 
nodes to test, and the data() call would be more efficient for a smaller number 
of tests. On my laptop, the breakeven point appears to be around 40 data() 
calls, but YMMV. The profile when using 'validate' might be different, too.
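
For the invoke variant, a minimal sketch might look like the following. The 
module path '/insert-test.xqy' and the external variable names URI and NEW are 
hypothetical, and I've spelled out the isolation option just to make the 
separate transaction explicit. The calling query does no updates itself, so 
the error from the nested transaction should surface where the try-catch can 
see it.

```xquery
xquery version "1.0-ml";

(: Hypothetical module /insert-test.xqy, stored under the modules root:

   xquery version "1.0-ml";
   declare variable $URI as xs:string external;
   declare variable $NEW as element() external;
   xdmp:document-insert($URI, $NEW)
:)

let $new := element test {
  element { QName('http://marklogic.com/xdmp/dls', 'created') } { 'fubar' } }
return
  try {
    (: the update runs in its own transaction, nested in this read-only query :)
    xdmp:invoke(
      '/insert-test.xqy',
      (xs:QName('URI'), 'test.xml', xs:QName('NEW'), $new),
      <options xmlns="xdmp:eval">
        <isolation>different-transaction</isolation>
      </options>)
  }
  catch ($ex) { xdmp:log($ex) }
```

Compared with eval, the invoked module is stored once instead of being built 
as a string on every call, which also avoids the nested quoting.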

-- Mike

On 28 Nov 2011, at 08:02 , Kelly Stirman wrote:

> It doesn't really help the conversation, but you could catch that error, I 
> believe, if you were to wrap it in an eval or an invoke.
> 
> try { xdmp:eval("
>  xdmp:document-insert(
>    'test.xml',
>    element test {
>      element { QName('http://marklogic.com/xdmp/dls', 'created') } {
>        'fubar' }})")
> }
> catch ($ex) { xdmp:log($ex) }
> 
> Most errors can be caught, but errors from updates are a little different 
> because of some server performance optimizations. There are a few other 
> exceptions that escape me at this time.
> 
> Kelly
> 
> Message: 6
> Date: Mon, 28 Nov 2011 07:48:40 -0800
> From: Michael Blakeley <[email protected]>
> Subject: Re: [MarkLogic Dev General] CORB processing continuation
>       query.
> To: General MarkLogic Developer Discussion
>       <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=us-ascii
> 
> First, Corb halts on errors as a design choice. The idea is that you are 
> probably running Corb to make sure your content meets some minimum standard, 
> and if 1% of the documents throw an error then you might miss them until much 
> later. But if you want to open an issue and request an enhancement at 
> https://github.com/marklogic/corb/issues you are welcome to do so. Or you 
> could create a patch and a pull request, but it's probably better to talk 
> first about how to implement it. The Java try-catch part would probably be 
> the easy part: the trickier part is that Corb has a bare minimum of 
> configurability, by design, and a change like this would have to extend it.
> 
> But let's see if we can catch this in XQuery. I'll use the built-in 
> {http://marklogic.com/xdmp/dls}created index to test.
> 
> xdmp:document-insert(
>  'test.xml',
>  element test {
>    element { QName('http://marklogic.com/xdmp/dls', 'created') } {
>      current-dateTime() }})
> => ()
> 
> That works, as expected.
> 
> xdmp:document-insert(
>  'test.xml',
>  element test {
>    element { QName('http://marklogic.com/xdmp/dls', 'created') } {
>      'fubar' }})
> => XDMP-RANGEINDEX
> 
> So far so good. Let's try to catch it.
> 
> try {
>  xdmp:document-insert(
>    'test.xml',
>    element test {
>      element { QName('http://marklogic.com/xdmp/dls', 'created') } {
>        'fubar' }})
> }
> catch ($ex) { xdmp:log($ex) }
> => XDMP-RANGEINDEX
> 
> So that error is not catchable in XQuery: some errors aren't. Let's try a 
> different approach: it will be more work in XQuery, but won't require changes 
> to Corb.
> 
> try {
>  let $new := element test {
>      element { QName('http://marklogic.com/xdmp/dls', 'created') } {
>        'fubar' }}
>  let $assert := data($new//*[not(*)])
>  return xdmp:document-insert('test.xml', $new)
> }
> catch ($ex) { xdmp:log($ex) }
> 
> That avoids the error, and in the log I see XDMP-LEXVAL with details of the 
> problem. The trick here is that we used data() to atomize all the leaf 
> elements in $new, and data() will also throw an exception for lexical values 
> that don't match up with your range index configuration.
> 
> Now, you probably aren't calling xdmp:document-insert in your Corb module, 
> and you probably don't need to call data() on every leaf element either. Most 
> likely you are using xdmp:node-replace or one of the xdmp:node-insert-* 
> functions. That's fine: simply call data() on the node you are about to 
> insert, and wrap that in a try-catch. Or if the error is in a node that you 
> aren't updating, you can use data() to check that too. If you have an XML 
> schema, you might also consider a 'validate' expression.
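> 
> As a rough sketch of the node-replace case, reusing the same data() pattern 
> as above: the path /test/dls:created and the variable names are made up for 
> illustration, and it assumes 'test.xml' already holds a valid dls:created 
> element.
> 
> ```xquery
> xquery version "1.0-ml";
> declare namespace dls = 'http://marklogic.com/xdmp/dls';
> 
> (: hypothetical target node and replacement value :)
> let $old := doc('test.xml')/test/dls:created
> let $new := element dls:created { 'fubar' }
> return
>   try {
>     (: atomize the new node first, so a bad lexical value is thrown here :)
>     let $assert := data($new)
>     return xdmp:node-replace($old, $new)
>   }
>   catch ($ex) { xdmp:log($ex) }
> ```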
> 
> -- Mike
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
> 
