Re: [MarkLogic Dev General] alternative ways to access documents

Danny Sokolsky Wed, 09 May 2012 15:23:00 -0700

My guess is that, since you actually want to return the document, they will 
be pretty similar in speed.  If you did not need to return the document, I 
would expect the lexicon approach to be faster, as the XPath approach would 
need to fetch the document anyway.
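
A minimal sketch of the "no document needed" case, assuming the URI lexicon is 
enabled on the database: the same cts:uri-match call without the surrounding 
fn:doc returns only the matching URI(s) from the lexicon.  The namespace URI 
bound to the dt prefix is a placeholder, since the thread never shows the real 
one.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI; substitute the one your documents use. :)
declare namespace dt = "http://example.com/dt";

declare variable $id as xs:string := "budget-10-5km8xx3mp60n";

(: Resolve the URI from the URI lexicon only; no document is fetched. :)
cts:uri-match(
  fn:concat("/content/*/", $id, ".xml"),
  (),
  cts:and-query((
    cts:collection-query("metadata"),
    cts:element-value-query(xs:QName("dt:identifier"), $id)
  ))
)
```

Wrapping the result in fn:doc(...), as in the query quoted below, then fetches 
the document itself.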


-Danny

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Jakob Fix
Sent: Wednesday, May 09, 2012 3:12 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] alternative ways to access documents

Thanks Mike and Danny,

both approaches do work, and they are both fast. Is there any
advantage of one over the other?

I already have a URI lexicon, so Danny's suggestion doesn't need
something that isn't already there, but it's a bit more involved.

The trace log seems to indicate that both are efficient, and the
profiler shows timings in the millisecond range, which is completely
acceptable.

Flip a coin?


xdmp:query-trace(true()),
collection("metadata")//dt:identifier[. = "budget-10-5km8xx3mp60n"]/root(),

xdmp:log("======================="),

fn:doc(
 cts:uri-match(fn:concat("/content/*/", "budget-10-5km8xx3mp60n", ".xml"), (),
   cts:and-query((
      cts:collection-query("metadata"),
      cts:element-value-query(xs:QName("dt:identifier"),
        "budget-10-5km8xx3mp60n") )) ) )


2012-05-10 00:04:09.537 Info: App-Services: at 14:24:
xdmp:eval("xquery version &quot;1.0-ml&quot;;&#10;declare namespace
html = ...", (), <options
xmlns="xdmp:eval"><database>4261191992707022248</database><root>/Users/jakob/Proje...</options>)
2012-05-10 00:04:09.537 Info: App-Services: at 14:24: Analyzing path:
fn:collection("metadata")/descendant::dt:identifier[. =
"budget-10-5km8xx3mp60n"]/fn:root(.)
2012-05-10 00:04:09.537 Info: App-Services: at 14:24: Step 1 is
searchable: fn:collection("metadata")
2012-05-10 00:04:09.537 Info: App-Services: at 14:24: Step 2 is
searchable: descendant::dt:identifier[. = "budget-10-5km8xx3mp60n"]
2012-05-10 00:04:09.542 Info: App-Services: at 14:24: Step 3 is
unsearchable: fn:root(.)
2012-05-10 00:04:09.542 Info: App-Services: at 14:24: First 2 steps of
path are searchable:
fn:collection("metadata")/descendant::dt:identifier[. =
"budget-10-5km8xx3mp60n"]
2012-05-10 00:04:09.542 Info: App-Services: at 14:24: Gathering constraints.
2012-05-10 00:04:09.542 Info: App-Services: at 14:0: Step 1
contributed 1 constraint: fn:collection("metadata")
2012-05-10 00:04:09.551 Info: App-Services: at 14:38: Comparison
contributed hash value constraint: dt:identifier =
"budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Step 2 predicate
1 contributed 1 constraint: . = "budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:38: Comparison
contributed hash value constraint: dt:identifier =
"budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Step 2 predicate
1 contributed 1 constraint: . = "budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Step 2
contributed 2 constraints: descendant::dt:identifier[. =
"budget-10-5km8xx3mp60n"]
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Executing search.
2012-05-10 00:04:09.553 Info: App-Services: at 14:24: Selected 1
fragment to filter
2012-05-10 00:04:09.553 Info: App-Services: =======================
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: xdmp:eval("xquery
version &quot;1.0-ml&quot;;&#10;declare namespace html = ...", (),
<options 
xmlns="xdmp:eval"><database>4261191992707022248</database><root>/Users/jakob/Proje...</options>)
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Analyzing path:
fn:doc("/content/article/budget-10-5km8xx3mp60n.xml")
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Step 1 is
searchable: fn:doc("/content/article/budget-10-5km8xx3mp60n.xml")
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Path is fully searchable.
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Gathering constraints.
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Step 1
contributed 1 constraint:
fn:doc("/content/article/budget-10-5km8xx3mp60n.xml")
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Executing search.
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Selected 1
fragment to filter


cheers,
Jakob.


On Wed, May 9, 2012 at 11:13 PM, Danny Sokolsky
<[email protected]> wrote:
> As an alternate, I wonder if you can do this as a URI lexicon query?  For 
> example:
>
> fn:doc(
>  cts:uri-match(fn:concat("/content/*/", $id, ".xml"), (),
>    cts:and-query((
>       cts:collection-query("metadata"),
>       cts:element-value-query(xs:QName("dt:identifier"), $id) )) ) )
>
> -Danny
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Michael Blakeley
> Sent: Wednesday, May 09, 2012 1:56 PM
> To: MarkLogic Developer Discussion
> Cc: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] alternative ways to access documents
>
> You are fetching the document twice, aren't you? Try this:
>
>    collection("metadata")//dt:identifier[. = $id]/root()
>
> I don't really like using // but in this case it may be the best option.
>
> -- Mike
>
> On May 9, 2012, at 13:37, Jakob Fix <[email protected]> wrote:
>
>> Hi,
>>
>> So far I've been successfully using
>> document("/content/[type]/[id].xml") to efficiently access a document.
>> This worked because I had both the [type] and the [id] values that
>> make up the path and the filename.
>>
>> Now my scenario has changed and I no longer know the [type] bit of the
>> path.  For convenience I still want to store all documents belonging
>> to a given [type] in the corresponding subdirectory (although this may
>> be negotiable).
>>
>> The only way that I've found so far, which is horribly inefficient,
>> is to look inside the document for the [id] value, like so:
>>
>> let $doc as node() :=
>> document(xdmp:node-uri(collection("metadata")/*[dt:identifier = $id]))
>>
>> A document has a type-specific root element (such as Book, Article,
>> ...), and as one of its children the <dt:identifier> element. This
>> takes consistently longer than a second to execute.
>>
>> I've considered and rejected creating a specific collection for each
>> identifier, i.e. I would end up with one collection per document, which
>> seems to run counter to the concept of collections, which are intended
>> to group documents with common properties.
>>
>> I appreciate your input.
>>
>> cheers,
>> Jakob.
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
>>
