I think the defaults are tripping you up.
MarkLogic tries to be smart about default query options such as wildcarded and 
case-sensitive, choosing them based on the actual query text and on the 
database settings.  The defaults work well for a typical text query, but for 
data-like queries such as these, they sometimes resolve to something 
inconvenient. 

To avoid this confusion, particularly when searching for values with numbers, 
caps, and punctuation, it is good to state the query options explicitly in the 
query.  That way there are no surprises, and your query behavior will not 
change with your data and configuration.

For example, consider the following:

let $x := <testNode insertDate="10JUN2009" />
let $y := <testNode>
            <insertDate>10JUN2009</insertDate>
        </testNode>
return
(
(: attribute value, default options :)
cts:contains($x, cts:element-attribute-value-query(
     xs:QName("testNode"), xs:QName("insertDate"),
     "10JUN2009*")),
(: attribute value, explicit options :)
cts:contains($x, cts:element-attribute-value-query(
     xs:QName("testNode"), xs:QName("insertDate"),
     "10JUN2009*", ("wildcarded", "case-sensitive",
      "punctuation-sensitive"))),
(: element value, default options :)
cts:contains($y, cts:element-value-query(
     xs:QName("insertDate"),
     "10JUN2009*")),
(: element value, explicit options :)
cts:contains($y, cts:element-value-query(
     xs:QName("insertDate"),
     "10JUN2009*", ("wildcarded", "case-sensitive",
      "punctuation-sensitive")))
)

returns:

false
true
false
true

So make sure you are running the query you think you are running.

Also, you might consider creating range indexes on those values and then using 
either the lexicon APIs to get the values or cts:element-range-query and 
cts:element-attribute-range-query.  These might prove to be faster.
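
As a rough sketch of that idea, and not something I have run against your 
data: the query below assumes that string range indexes have already been 
configured on the insertDate element and on the insertDate attribute of 
testNode (both calls raise an error without them), and that the default 
collation is acceptable for these values.  It uses the lexicon to expand the 
pattern into actual values and then searches with range queries on those 
values.

```xquery
xquery version "1.0-ml";

(: ASSUMPTION: string range indexes exist on element insertDate and on
   attribute testNode/@insertDate. They are not part of a default setup. :)

(: Expand the pattern into the matching values stored in each lexicon. :)
let $element-values :=
  cts:element-value-match(xs:QName("insertDate"), "10JUN2009*")
let $attribute-values :=
  cts:element-attribute-value-match(
    xs:QName("testNode"), xs:QName("insertDate"), "10JUN2009*")
return
  (: Match documents holding any of those values, in either form.
     Only add a range query when its lexicon returned something. :)
  cts:search(/testNode,
    cts:or-query((
      if (fn:exists($element-values))
      then cts:element-range-query(
             xs:QName("insertDate"), "=", $element-values)
      else (),
      if (fn:exists($attribute-values))
      then cts:element-attribute-range-query(
             xs:QName("testNode"), xs:QName("insertDate"),
             "=", $attribute-values)
      else ()
    )))
```

Whether this beats the value queries depends on how many distinct values the 
pattern expands to, so it is worth timing both approaches on your own data.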

Does that help?

-Danny

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Geert Josten
Sent: Wednesday, June 10, 2009 8:59 AM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] element-value-query 
vs.element-attribute-value-query behavior difference

Hi,

I would have expected that doing a wildcard-enabled search for "10JUN2009*" 
would return two results, but I could be wrong. Can you tell which wildcard 
indexes are enabled? Is trailing wildcard index enabled?

Wildcards can also behave in unexpected ways: they tend not to match across 
word boundaries. I am not sure how a minus sign is treated in a value search. 
One of the documents on the MarkLogic site should contain some details about 
this, but I forgot which one.

Just my quick two cents. HTH!

Kind regards,
Geert

Drs. G.P.H. Josten
Consultant


http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984
The information sent in or with this email message originates from 
Daidalos BV and is intended solely for the addressee. If you have received 
this message unintentionally, we ask you to delete it. No rights can be 
derived from this message.


> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Glidden, Douglass A
> Sent: Wednesday, June 10, 2009 15:27
> To: [email protected]
> Subject: [MarkLogic Dev General] element-value-query vs.
> element-attribute-value-query behavior difference
>
> Hi,
>
> I've run into an odd difference in the behavior of
> cts:element-attribute-value-query() as opposed to
> cts:element-value-query().  (To tell the truth, neither one
> of them behaves exactly like I would prefer, but for now I'm
> just trying to figure out what is making the difference
> between them.)  Assume the following four sample documents
> are in the database:
>
> *     <testNode insertDate="10JUN2009" />
> *     <testNode insertDate="10JUN2009-0849" />
> *     <testNode>
>     <insertDate>10JUN2009</insertDate>
> </testNode>
> *     <testNode>
>     <insertDate>10JUN2009-0849</insertDate>
> </testNode>
>
> Wildcard searches are enabled, but pretty much all other
> settings for the Documents database are default.
>
> Executing this query:
>
>       cts:search(/testNode,
> cts:element-value-query(xs:QName("insertDate"), "10JUN2009*"))
>
> returns a single result, as expected:
>
>       <testNode>
>           <insertDate>10JUN2009</insertDate>
>       </testNode>
>
> In contrast, executing this query:
>
>       cts:search(/testNode,
> cts:element-attribute-value-query(xs:QName("testNode"),
> xs:QName("insertDate"), "10JUN2009*"))
>
> returns an empty sequence, which is inconsistent with the
> behavior of the element-value-query.  Further complicating
> matters, executing an otherwise identical, case-insensitive
> version of the above query:
>
>       cts:search(/testNode,
> cts:element-attribute-value-query(xs:QName("testNode"),
> xs:QName("insertDate"), "10jun2009*"))
>
> returns a single result:
>
>       <testNode insertDate="10JUN2009" />
>
> which is what I expected the first query to return!
>
> Note that if I remove the trailing wildcard, the
> case-sensitive and case-insensitive versions both return the
> same result, so the discrepancy seems to exist only for an
> element-attribute-value-query (or
> element-attribute-word-query) that combines case-sensitivity
> with a wildcard.
>
> I'd appreciate any ideas you can throw out.
>
> Thanks,
>
> Doug Glidden
> Software Engineer
> The Boeing Company
> [email protected]
>
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
>

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general