There are a few things going on here:
(1) Queries match per fragment, so if you do an and-query of two value
queries or range queries, there is no constraint that both match the same
relationship element instance in the fragment.
(2) Wrapping a cts:element-query on relationship around the and-query will
constrain the matches to occur within the same instance:
cts:element-query(xs:QName("relationship"),
  cts:and-query((
    cts:element-attribute-value-query(
      xs:QName("relationship"), xs:QName("id"), "one", "exact"),
    cts:element-attribute-value-query(
      xs:QName("relationship"), xs:QName("inherited"), "true", "exact"))))
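
As a rough sketch of how the query above might be handed to cts:search, the following compares the default filtered search with an "unfiltered" (index-only) one. Searching via fn:collection() and the variable name $same-instance are assumptions for illustration, not something taken from the original setup.

```xquery
xquery version "1.0-ml";

(: Assumption: all documents are reachable through fn:collection();
   $same-instance is an illustrative variable name. :)
let $same-instance :=
  cts:element-query(xs:QName("relationship"),
    cts:and-query((
      cts:element-attribute-value-query(
        xs:QName("relationship"), xs:QName("id"), "one", "exact"),
      cts:element-attribute-value-query(
        xs:QName("relationship"), xs:QName("inherited"), "true", "exact"))))
return (
  (: Default (filtered): indexes select candidates, then each one is checked :)
  fn:count(cts:search(fn:collection(), $same-instance)),
  (: Unfiltered: the result comes from index resolution alone :)
  fn:count(cts:search(fn:collection(), $same-instance, "unfiltered"))
)
```

If the two counts differ, the indexes alone are not resolving the same-instance constraint and the filter is doing the work.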
If you have the right positions enabled, the indexes can resolve this without
the filter (although if you have a lot of relationship instances in a document
this can be quite expensive)... except:
(3) Empty element positions are problematic. Positions are word positions: the
position of an element runs from the word position of the first word where the
element starts to the word position of the first word after the element ends.
Positions of attributes are the positions of their element. If everything is an
empty element, you have no words and everything has the same position, 0 to 1.
Positions then cannot discriminate between what is happening in one instance
of relationship and the next, so you have to rely on filtered search to get
the answer, at which point expressing this as an XPath is a lot less
verbose. You can force positions to work by making sure there is always at
least one word between relationship elements.
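
A minimal sketch of the XPath form, written against the sample documents quoted below. It assumes the documents are reachable through fn:collection() and reads "inherited not equal to true" as "attribute absent or any value other than true".

```xquery
xquery version "1.0-ml";

(: Select test elements with at least one relationship whose id is "one"
   and whose inherited attribute is absent or not "true". Both predicates
   apply to the same relationship element, so the same-instance
   requirement holds by construction. :)
fn:collection()/test[
  relationships/relationship[@id = "one"][fn:not(@inherited = "true")]
]
```

Against the three sample documents this selects docs 1 and 2: doc 1 has id "one" with no inherited attribute, doc 2 has inherited="false", and doc 3 is excluded because its id "one" relationship has inherited="true". How much of the path gets resolved from indexes before filtering depends on the database configuration.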
//Mary
On 07/15/2015 06:37 AM, Dan Meyers wrote:
I feel sure it should be, and I’m just failing to make proper use of the
system, but I’d appreciate more eyes on this.
Consider the following set of (contrived, simple) documents:
Doc 1:
<test uuid="X">
  <relationships>
    <relationship id="one" />
    <relationship id="two" inherited="true" />
  </relationships>
</test>
Doc 2:
<test uuid="Y">
  <relationships>
    <relationship id="one" inherited="false" />
    <relationship id="three" />
  </relationships>
</test>
Doc 3:
<test uuid="Z">
  <relationships>
    <relationship id="one" inherited="true" />
    <relationship id="four" />
  </relationships>
</test>
I want to be able to return all those documents which contain at least one
relationship with id equal to “one” and inherited not equal to “true” (i.e.
false or not present). So in the above example I’d expect to be returned docs 1
and 2. In the real world I’ll be searching across over 100 million documents,
so I need to be able to do this via indexes and XQuery, not by looping over all
available documents and examining their content.
With a cts:and-not-query looking for the presence of id as “one” and not having
inherited as “true” I only get doc 2 returned. As I understand it, this is
because the and-not-query is matching inherited equal to “true” in the second
relationship of doc1 and discarding the entire doc, even though that
relationship is not for the desired id.
Is there any kind of query (with any necessary indexes) I can construct that
will do what I want and only pay attention to the inherited field when it is
within the same relationship element as id?
An alternative we have considered is to create 2 different element types within
the relationships parent, so that we have relationship, and relationship_direct
(or whatever), and the query that doesn’t want to count inherited relationships
looks only at relationship_direct elements, but that seems like a hacky method
of doing this. We’ve also considered separate documents, but because of all the
other data held within them and not shown in this basic example that would be a
massive headache. If it would help, we could ensure the inherited attribute was
always present, and set to true or false as necessary, rather than normally
being either true or not present.
How would other people go about doing this? Any ideas would be great. In case
it matters, and there’s some new MarkLogic 8 function that does this, we’re
currently on MarkLogic 7 in our live environment. We will be upgrading
eventually, but not soon enough for us to be able to wait for that before we
make this query work.
Thanks
Dan
_______________________________________________
General mailing list
[email protected]<mailto:[email protected]>
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general