There are a couple of things going on here:
(1) Queries match per fragment, so if you do an and-query of two value 
queries or range queries, nothing constrains the matches to come from the 
same relationship element instance within the fragment.

(2) Wrapping a cts:element-query on relationship around the and-query 
constrains the matches to occur within the same instance:

cts:element-query(xs:QName("relationship"),
  cts:and-query((
    cts:element-attribute-value-query(
      xs:QName("relationship"), xs:QName("id"), "one", "exact"),
    cts:element-attribute-value-query(
      xs:QName("relationship"), xs:QName("inherited"), "true", "exact"))))
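To try this out, the query can be handed to cts:search. The following is only a sketch: searching fn:doc() (every document in the database) and binding the query to a variable named $query are assumptions made for illustration, not something from the original question.

```xquery
xquery version "1.0-ml";

(: The element-scoped query from above, bound to a variable. :)
let $query :=
  cts:element-query(xs:QName("relationship"),
    cts:and-query((
      cts:element-attribute-value-query(
        xs:QName("relationship"), xs:QName("id"), "one", "exact"),
      cts:element-attribute-value-query(
        xs:QName("relationship"), xs:QName("inherited"), "true", "exact"))))
return
  (: "filtered" is the default for cts:search; it is spelled out here
     to make clear that the filter, not only the indexes, decides
     the final result. :)
  cts:search(fn:doc(), $query, "filtered")
```

Swapping "filtered" for "unfiltered" gives the index-only answer, which is a quick way to see whether the indexes alone can tell the instances apart.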

If you have the right positions enabled, the indexes can resolve this without 
the filter (although if you have a lot of relationship instances in a document 
this can be quite expensive)... except:

(3) Empty element positions are problematic. Positions are word positions: 
the position of an element runs from the word position of the first word 
where the element starts to the word position of the first word after the 
element ends. Positions of attributes are the positions of their element. If 
everything is an empty element, there are no words and everything has the 
same position, 0 to 1. Positions then cannot discriminate between what is 
happening in one instance of relationship and the next, and you have to rely 
on filtered search to get the answer, at which point expressing this as an 
XPath is a lot less verbose. You can force positions to work by making sure 
there is always at least one word, somehow, between relationship elements.
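As a rough illustration of the XPath route for the original question (id equal to "one" and inherited not "true"), something along these lines could be used. It is a sketch: starting from fn:collection() is an assumption, and the element and attribute names are taken from the sample documents below.

```xquery
xquery version "1.0-ml";

(: Both predicates apply to the same relationship element, so the
   id test and the inherited test cannot be satisfied by two
   different instances. fn:not(@inherited = "true") holds both when
   the attribute is "false" and when it is absent. :)
fn:collection()/test[
  relationships/relationship[@id = "one"][fn:not(@inherited = "true")]
]
```

Against the three sample documents this selects docs 1 and 2. Because it relies on filtering, how well it performs across a very large database is a separate question.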

//Mary

On 07/15/2015 06:37 AM, Dan Meyers wrote:
I feel sure it should be, and I’m just failing to make proper use of the 
system, but I’d appreciate more eyes on this.

Consider the following set of (contrived, simple) documents:

Doc 1:
<test uuid="X" >
<relationships>
<relationship id="one" />
<relationship id="two" inherited="true" />
</relationships>
</test>

Doc 2:
<test uuid="Y">
<relationships>
<relationship id="one" inherited="false" />
<relationship id="three" />
</relationships>
</test>

Doc 3:
<test uuid="Z">
<relationships>
<relationship id="one" inherited="true" />
<relationship id="four" />
</relationships>
</test>

I want to be able to return all those documents which contain at least one 
relationship with id equal to “one” and inherited not equal to “true” (i.e. 
false or not present). So in the above example I’d expect docs 1 and 2 to be 
returned. In the real world I’ll be searching across over 100 million 
documents, so I need to do this via indexes and XQuery, not by looping over 
all available documents and examining their content.

With a cts:and-not-query looking for the presence of id as “one” and not having 
inherited as “true”, only doc 2 is returned. As I understand it, this is 
because the and-not-query matches inherited equal to “true” in the second 
relationship of doc 1 and discards the entire doc, even though that 
relationship is not for the desired id.

Is there any kind of query (with any necessary indexes) I can construct that 
will do what I want and only pay attention to the inherited field when it is 
within the same relationship element as id?

An alternative we have considered is to create 2 different element types within 
the relationships parent, so that we have relationship, and relationship_direct 
(or whatever), and the query that doesn’t want to count inherited relationships 
looks only at relationship_direct elements, but that seems like a hacky method 
of doing this. We’ve also considered separate documents, but because of all the 
other data held within them and not shown in this basic example that would be a 
massive headache. If it would help, we could ensure the inherited attribute was 
always present, and set to true or false as necessary, rather than normally 
being either true or not present.

How would other people go about doing this? Any ideas would be great. In case 
it matters, and there’s some new MarkLogic 8 function that does this, we’re 
currently on MarkLogic 7 in our live environment. We will be upgrading 
eventually, but not soon enough for us to be able to wait for that before we 
make this query work.

Thanks

Dan






_______________________________________________
General mailing list
[email protected]<mailto:[email protected]>
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general

