Hi,
We are facing a performance issue while fetching documents from MarkLogic.
Please let us know your suggestions. We are using Java and MarkLogic. The
detailed scenario is described below.
- We have a large volume of XML documents in a directory, say /TEST/EMPLOYEE.
- Any two XML documents are said to be related if they share a common doc
number. For example:
xml1:
<test>
<doc number="1200">
....
</test>
xml2:
<test>
<doc number="1200">
.....
</test>
- Our requirement is to insert a node in both documents indicating that
each document links to the other. For example:
xml1:
<test>
<doc name="refer" number="1200">
<link node="xml2">
....
</test>
xml2:
<test>
<doc name="refer" number="1200">
<link node="xml1">
.....
</test>
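To make the relation above concrete, here is a minimal Java sketch of how the link targets could be derived once each document's doc number is known. It is only an illustration: the class name RelatedDocs, the method names, and the idea of holding a URI-to-number map in memory are assumptions, not our actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: derives link targets from a URI -> doc number map.
class RelatedDocs {

    // Groups document URIs by their doc number.
    static Map<String, List<String>> groupByNumber(Map<String, String> numberByUri) {
        Map<String, List<String>> urisByNumber = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : numberByUri.entrySet()) {
            urisByNumber.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                        .add(e.getKey());
        }
        return urisByNumber;
    }

    // For each URI, lists the other URIs that share its doc number.
    // A document with no related document maps to an empty list.
    static Map<String, List<String>> linksFor(Map<String, String> numberByUri) {
        Map<String, List<String>> urisByNumber = groupByNumber(numberByUri);
        Map<String, List<String>> links = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : numberByUri.entrySet()) {
            List<String> others = new ArrayList<>(urisByNumber.get(e.getValue()));
            others.remove(e.getKey()); // a document does not link to itself
            links.put(e.getKey(), others);
        }
        return links;
    }
}
```

With two documents that both carry number 1200, linksFor maps each one to a list containing only the other, which corresponds to the <link> nodes shown in the example.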
Approach
- In the directory /TEST/EMPLOYEE, loop through all the documents and
collect those that have the node <doc number="xxx">, since this node might
not be present in some documents. This list is then sent back to Java.
for $doc in xdmp:directory($dir, "1")
return
  if (doc(xdmp:node-uri($doc)) $type) then (
    let $searchResponse :=
      search:search("",
        <options xmlns="http://marklogic.com/appservices/search">
          <additional-query>{
            cts:and-query((
              cts:directory-query(($dir), "1"),
              cts:element-attribute-value-query(
                xs:QName('doc'), xs:QName('name'), 'refer')
            ))
          }</additional-query>
        </options>)//search:result/@uri
    let $linkedDocs :=
      (: exists() is used because "ne" fails on more than one result :)
      if (exists($searchResponse)) then (
        for $lk in $searchResponse
        return concat($document, "==>", $lk)
      )
      else (concat((xdmp:node-uri($doc)), "==>nolink"))
    return $linkedDocs
  )
  else ()
- This list of documents is then iterated in a loop, and search:search is
used to determine the ID of the linked document.
- A new node containing the name of the linked document is created in the
corresponding XML document.
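For reference, the strings returned by the query have the form "uri==>linkedUri" or "uri==>nolink". A minimal sketch of how such a list could be read on the Java side is given below. The class name LinkListParser and its method are hypothetical and only illustrate the format; this is not our production code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the "source==>target" strings built by concat() in the query.
class LinkListParser {
    static final String SEPARATOR = "==>";
    static final String NO_LINK = "nolink";

    // Returns source URI -> list of linked URIs (empty when the entry is "nolink").
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> linksByUri = new LinkedHashMap<>();
        for (String line : lines) {
            int idx = line.indexOf(SEPARATOR);
            if (idx < 0) {
                continue; // skip lines that are not in the expected format
            }
            String source = line.substring(0, idx);
            String target = line.substring(idx + SEPARATOR.length());
            List<String> targets =
                linksByUri.computeIfAbsent(source, k -> new ArrayList<>());
            if (!NO_LINK.equals(target)) {
                targets.add(target);
            }
        }
        return linksByUri;
    }
}
```

Each key of the resulting map is a document that needs updating, and each value lists the documents to reference in its new link nodes.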
Problem
Because the document volume is high, the first step by itself takes too
long to execute and results in XDMP-EXTIME. We pass the list from MarkLogic
to Java in order to log the events. Range indexes are already configured on
these elements.
Regards,
Shaik Ummer Faruk D.
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general