My guess is that, since you actually want to return the document, the two 
approaches will be pretty similar in speed.  If you did not need to return the 
document, I would expect the lexicon approach to be faster, because the XPath 
approach has to fetch the document anyway.
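
A minimal sketch of the URI-only case, assuming $id is bound by the caller and 
the dt prefix is declared for your documents (the namespace URI below is a 
placeholder, not taken from this thread). It is the same cts:uri-match call 
without the fn:doc() wrapper, so it should only need the URI lexicon and the 
indexes:

```xquery
xquery version "1.0-ml";

(: Placeholder namespace URI: substitute the one your documents use. :)
declare namespace dt = "http://example.com/dt";

(: Hypothetical external variable holding the identifier to look up. :)
declare variable $id as xs:string external;

(: Returns the matching URI(s) as strings. There is no fn:doc() call,
   so the document itself is not fetched. :)
cts:uri-match(
  fn:concat("/content/*/", $id, ".xml"),
  (),
  cts:and-query((
    cts:collection-query("metadata"),
    cts:element-value-query(xs:QName("dt:identifier"), $id)
  ))
)
```

Wrapping the result in fn:doc() gives back the document, at which point the 
two approaches do roughly the same work.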

-Danny

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Jakob Fix
Sent: Wednesday, May 09, 2012 3:12 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] alternative ways to access documents

Thanks Mike and Danny,

both approaches do work, and they are both fast. Is there any
advantage of one over the other?

I already have a URI lexicon, so Danny's suggestion doesn't need
something that isn't already there, but it's a bit more involved.

The trace log seems to indicate that both are efficient, and the
profiler shows timings in the milliseconds range which is completely
acceptable.

Flip a coin?


xdmp:query-trace(true()),
collection("metadata")//dt:identifier[. = "budget-10-5km8xx3mp60n"]/root(),

xdmp:log("======================="),

fn:doc(
 cts:uri-match(fn:concat("/content/*/", "budget-10-5km8xx3mp60n", ".xml"), (),
   cts:and-query((
      cts:collection-query("metadata"),
      cts:element-value-query(xs:QName("dt:identifier"), "budget-10-5km8xx3mp60n") )) ) )


2012-05-10 00:04:09.537 Info: App-Services: at 14:24:
xdmp:eval("xquery version "1.0-ml";
declare namespace
html = ...", (), <options
xmlns="xdmp:eval"><database>4261191992707022248</database><root>/Users/jakob/Proje...</options>)
2012-05-10 00:04:09.537 Info: App-Services: at 14:24: Analyzing path:
fn:collection("metadata")/descendant::dt:identifier[. =
"budget-10-5km8xx3mp60n"]/fn:root(.)
2012-05-10 00:04:09.537 Info: App-Services: at 14:24: Step 1 is
searchable: fn:collection("metadata")
2012-05-10 00:04:09.537 Info: App-Services: at 14:24: Step 2 is
searchable: descendant::dt:identifier[. = "budget-10-5km8xx3mp60n"]
2012-05-10 00:04:09.542 Info: App-Services: at 14:24: Step 3 is
unsearchable: fn:root(.)
2012-05-10 00:04:09.542 Info: App-Services: at 14:24: First 2 steps of
path are searchable:
fn:collection("metadata")/descendant::dt:identifier[. =
"budget-10-5km8xx3mp60n"]
2012-05-10 00:04:09.542 Info: App-Services: at 14:24: Gathering constraints.
2012-05-10 00:04:09.542 Info: App-Services: at 14:0: Step 1
contributed 1 constraint: fn:collection("metadata")
2012-05-10 00:04:09.551 Info: App-Services: at 14:38: Comparison
contributed hash value constraint: dt:identifier =
"budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Step 2 predicate
1 contributed 1 constraint: . = "budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:38: Comparison
contributed hash value constraint: dt:identifier =
"budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Step 2 predicate
1 contributed 1 constraint: . = "budget-10-5km8xx3mp60n"
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Step 2
contributed 2 constraints: descendant::dt:identifier[. =
"budget-10-5km8xx3mp60n"]
2012-05-10 00:04:09.551 Info: App-Services: at 14:24: Executing search.
2012-05-10 00:04:09.553 Info: App-Services: at 14:24: Selected 1
fragment to filter
2012-05-10 00:04:09.553 Info: App-Services: =======================
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: xdmp:eval("xquery
version &quot;1.0-ml&quot;;&#10;declare namespace html = ...", (),
<options 
xmlns="xdmp:eval"><database>4261191992707022248</database><root>/Users/jakob/Proje...</options>)
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Analyzing path:
fn:doc("/content/article/budget-10-5km8xx3mp60n.xml")
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Step 1 is
searchable: fn:doc("/content/article/budget-10-5km8xx3mp60n.xml")
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Path is fully searchable.
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Gathering constraints.
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Step 1
contributed 1 constraint:
fn:doc("/content/article/budget-10-5km8xx3mp60n.xml")
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Executing search.
2012-05-10 00:04:09.553 Info: App-Services: at 18:0: Selected 1
fragment to filter


cheers,
Jakob.


On Wed, May 9, 2012 at 11:13 PM, Danny Sokolsky
<[email protected]> wrote:
> As an alternate, I wonder if you can do this as a URI lexicon query?  For 
> example:
>
> fn:doc(
>  cts:uri-match(fn:concat("/content/*/", $id, ".xml"), (),
>    cts:and-query((
>       cts:collection-query("metadata"),
>       cts:element-value-query(xs:QName("dt:identifier"), $id) )) ) )
>
> -Danny
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Michael Blakeley
> Sent: Wednesday, May 09, 2012 1:56 PM
> To: MarkLogic Developer Discussion
> Cc: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] alternative ways to access documents
>
> You are fetching the document twice, aren't you? Try this:
>
>    collection("metadata")//dt:identifier[. = $id]/root()
>
> I don't really like using // but in this case it may be the best option.
>
> -- Mike
>
> On May 9, 2012, at 13:37, Jakob Fix <[email protected]> wrote:
>
>> Hi,
>>
>> So far I've been successfully using
>> document("/content/[type]/[id].xml") to efficiently access a document.
>> This worked because I had both the [type] and the [id] values that
>> make up the path and the filename.
>>
>> Now my scenario has changed and I no longer know the [type] bit of the
>> path.  For convenience I still want to store all documents belonging
>> to a given [type] in the corresponding subdirectory (although this may
>> be negotiable).
>>
>> The only way that I've found so far, and which is horribly inefficient,
>> is to look inside the document for the [id] value, like so:
>>
>> let $doc as node() :=
>> document(xdmp:node-uri(collection("metadata")/*[dt:identifier = $id]))
>>
>> A document has a type specific root element (such as Book, Article,
>> ...), and as one of its children the <dt:identifier> element. This
>> takes consistently longer than a second to execute.
>>
>> I've considered and rejected creating a specific collection for each
>> identifier, i.e. I would end up with one collection per element, which
>> seems to run counter to the concept of collections, which is intended to
>> group documents with common properties.
>>
>> I appreciate your input.
>>
>> cheers,
>> Jakob.
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
>>
