Paul,

That may not be intractable, depending on the response time you need. E.g. this runs over 1M values in 10 seconds on my laptop:

    let $m1 := map:map()
    let $add :=
      for $i in 1 to 1000000
      return map:put($m1, xs:string(xdmp:random(1000000)), true())
    let $m2 := map:map()
    let $add :=
      for $i in 1 to 10000
      return map:put($m2, xs:string(xdmp:random(1000000)), true())
    return (
      count(map:keys($m1 - $m2)),
      xdmp:elapsed-time()
    )

Yours,
Damon

From: Paul M [mailto:[email protected]]
Sent: Friday, February 08, 2013 10:54 AM
To: Damon Feldman; MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] finding an id that does not exist

Hi Damon,

The number of uniqueIds is somewhat high, so the element-values result will be rather large (1 million+). The control seq ids will be in the 1k-10k range, "non-sequential". The missing ids from the control seq will likely be in the 100-1000 range. But I'll check and see.

Thanks..

________________________________
From: Damon Feldman <[email protected]<mailto:[email protected]>>
To: Paul M <[email protected]<mailto:[email protected]>>; MarkLogic Developer Discussion <[email protected]<mailto:[email protected]>>
Sent: Friday, February 8, 2013 9:49 AM
Subject: RE: [MarkLogic Dev General] finding an id that does not exist

Paul,

I believe you can range-index the uniqueId, element or attribute, then call cts:element-values() with the option to return data as a map. You can put your other sequence into a map also and "subtract" maps via the "-" operator to get a fast set difference.

Yours,
Damon

--
Damon Feldman
Sr. Principal Consultant, MarkLogic

From: [email protected]<mailto:[email protected]> [mailto:[email protected]] On Behalf Of Paul M
Sent: Friday, February 08, 2013 9:19 AM
To: [email protected]<mailto:[email protected]>
Subject: [MarkLogic Dev General] finding an id that does not exist

4 documents: docA, docB, docC, docD. Each has a unique id field, with values 111, 222, 333, 555 respectively.
I have a sequence 111,222,333,444. 444 does not exist in the document set docA, docB, docC, docD. Is there a faster way of finding this information?

I have looked at a few cts functions, but I keep coming back to iterating over the sequence 111,222,333,444 and running xdmp:estimate on a cts:search with a cts:element-value-query for each value. Fast, but still takes time. Maybe co-occurrence, if the data has multiple id fields? 111-aaa,222-bbb,333-ccc,555-eee

thanks
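
A minimal sketch of the range-index approach Damon describes, applied to the four-document example from the original question. It is an illustration under assumptions, not code from the thread: the element name uniqueId (no namespace) and the existence of a string-typed element range index on it are hypothetical, and the control sequence is hard-coded.

```xquery
xquery version "1.0-ml";

(: Assumption: an element range index of type xs:string exists on a
   hypothetical element named uniqueId. The "map" option asks
   cts:element-values() to return the indexed values as one map:map
   instead of a sequence. :)
let $existing := cts:element-values(xs:QName("uniqueId"), (), "map")

(: Load the control sequence into a second map. Map keys are strings,
   so the ids are written as strings here. :)
let $control := map:map()
let $_ :=
  for $id in ("111", "222", "333", "444")
  return map:put($control, $id, true())

(: Map subtraction keeps the keys of $control that are absent from
   $existing, i.e. the ids with no matching document. :)
return map:keys($control - $existing)
```

Note the direction of the subtraction: the control map is on the left, so the result holds control ids missing from the database. With ids 111, 222, 333 and 555 indexed, this would be expected to return 444.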
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
