I can't tell if you want a list of matching documents per value, or for all values. I could imagine a "report" like the following:
S0016-5085(68)70198-0
  -abc.xml
  -def.xml
  -ghi.xml
S0016-5085(68)70199-2
  -123.xml
  -456.xml
  -789.xml
S0016-5085(68)70200-6
  -a12.xml
Etc.

Or do you just want a list of URIs for documents that match any of the terms?

I would use cts:uris to get the URIs instead of search:search. You can filter cts:uris with cts:element-value-query, and you can pass in many values. 100K may be too many, depending on your hardware.

Another approach would be to use reverse-query. You can make each of your documents an or-query of multiple element-value-query queries. Then you can pass in your document with the 100K PII values using reverse-query, and it will return the matching queries, which in this case map to your documents and their URIs. This may be overkill for what you're trying to do.

Kelly

Message: 2
Date: Wed, 20 Jul 2011 13:28:05 +0530
From: Vijayasekar Padmanaban <[email protected]>
Subject: Re: [MarkLogic Dev General] Search using 100k terms
To: General MarkLogic Developer Discussion <[email protected]>
Message-ID: <66586dccf3922145b01ca0b975f3a5670a32da1...@chnshlmbx03.ad.infosys.com>
Content-Type: text/plain; charset="us-ascii"

Hi Jason,

Sorry for the confusion. Please find below a snippet of the XML we have in the DB. (The DB has 10 million XML documents.)

<ja:item-info>
  <ja:jid>YMSG</ja:jid>
  <ja:aid>0103883</ja:aid>
  <ce:pii>S0011-3840(01)70009-3</ce:pii>
  <ce:doi>10.1016/S0011-3840(01)70009-3</ce:doi>
  <ce:copyright type="other" year="2001"/>
</ja:item-info>

The file we upload will contain the PIIs (which I referred to as terms in my earlier email), as shown below. (There could be 100k PIIs in the file.)

S0016-5085(68)70198-0
S0016-5085(68)70199-2
S0016-5085(68)70200-6
S0016-5085(68)70201-8
S0016-5085(68)70202-X
S0016-5085(68)70203-1
S0016-5085(68)70204-3
.....
.....

I need to identify the documents that match the PIIs in the file. Currently we are using the search:search() API in our application.
Hence I tried using the additional-query option of the Search API, as shown below (where $uploadedPIIs is an xs:string* sequence holding the uploaded PIIs):

cts:element-value-query(xs:QName("ce:pii"), $uploadedPIIs)

But this additional-query option is taking a lot of time to yield results. Is there a better way to do this? Please suggest.

Regards,
Vijay

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
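
For readers following the thread, the cts:uris suggestion in Kelly's reply might look roughly like the sketch below. It is not code posted by anyone on the list. The namespace URI bound to ce: is a placeholder, the three PIIs stand in for the uploaded file, and it assumes the URI lexicon is enabled on the database, which cts:uris requires.

```xquery
xquery version "1.0-ml";

(: Assumption: placeholder namespace URI; use whatever your documents declare for ce:. :)
declare namespace ce = "http://example.com/ns/ce";

(: Hypothetical input: in practice this sequence would be parsed from the uploaded file. :)
let $uploaded-piis as xs:string* := (
  "S0016-5085(68)70198-0",
  "S0016-5085(68)70199-2",
  "S0016-5085(68)70200-6"
)

(: A single value query with many values matches a document
   whose ce:pii element equals any one of them. :)
let $query := cts:element-value-query(xs:QName("ce:pii"), $uploaded-piis)

(: cts:uris resolves the query against the indexes and returns only URIs,
   without building search:search result snippets. :)
return cts:uris((), (), $query)
```

This returns one flat list of URIs for documents matching any of the values. For the per-value "report" shape, the same query could be run once per PII, at the cost of one lexicon call per value. As noted in the reply, 100K values in a single query may be too many for some hardware, so the list may need to be split into smaller batches.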
