That's correct. The values are from a range index that stores only
autocomplete data, which is only a fraction (~12MB) of the total test
database (~1GB).
count(cts:element-values(xs:QName("element"))) => 97676
So very similar to your 100k document test. We ran this both with all the
values stored in a single document and with one document per value, and
the test results were the same. However, our test user role has only
"read" permission on the documents, not "update".
-Will
On 3/26/13 12:19 PM, "Michael Blakeley" <[email protected]> wrote:
>So your database size is *not* proportional to the number of values?
>
>How many values do you have?
>
>-- Mike
>
>On 26 Mar 2013, at 11:27 , Will Thompson <[email protected]>
>wrote:
>
>> After further testing, the latency increase for the non-admin user
>> appears to be proportional to database size (note: NOT range index
>> size). We bootstrapped a fresh database, and with only some test
>> documents loaded the query speeds were virtually identical. However, as
>> content is loaded into the db, query time for the non-admin user
>> essentially doubles every time the database size doubles.
>>
>> -Will
>>
>>
>> On 3/25/13 5:12 PM, "Michael Blakeley" <[email protected]> wrote:
>>
>>> For reference, here's what I tried. I only created 100,000 documents
>>> and they are very small.
>>>
>>> (: setup :)
>>> (1 to 100 * 1000) ! (
>>>   xdmp:document-insert(
>>>     '/test/'||.,
>>>     element test {
>>>       attribute id { . },
>>>       element a { xdmp:integer-to-hex(xdmp:random()) } },
>>>     ('read', 'update') ! xdmp:permission('test', .))),
>>> xdmp:elapsed-time()
>>>
>>> That takes about 30 seconds on my laptop.
>>>
>>> (: test - admin :)
>>> (for $i in 1 to 1000
>>>  return cts:element-value-match(
>>>    xs:QName("a"), "e*", "limit=1"))[last()],
>>> xdmp:elapsed-time()
>>> =>
>>> e0003170ed3130a4
>>> PT0.084827S
>>>
>>> (: test - user :)
>>> xdmp:eval('
>>>   (for $i in 1 to 1000
>>>    return cts:element-value-match(
>>>      xs:QName("a"), "e*", "limit=1"))[last()]',
>>>   (),
>>>   <options xmlns="xdmp:eval">
>>>     <user-id>test</user-id>
>>>   </options>),
>>> xdmp:elapsed-time()
>>> =>
>>> e0003170ed3130a4
>>> PT0.07878S
>>>
>>> In this particular run the non-admin user was faster - but that is
>>> probably a caching effect, and anyway the difference was not
>>> significant. I'm using 6.0-2.1 on OS X, running the queries in cq.
>>>
>>> There are about 6700 values that match 'e*'. According to the profiler,
>>> about 50% of the elapsed time is spent in the cts:element-value-match
>>> call. The rest is split between the FLWOR and the predicate on last().
>>>
>>> -- Mike
>>>
>>> On 25 Mar 2013, at 15:15 , Will Thompson <[email protected]>
>>> wrote:
>>>
>>>> I've tested this on 6.0-2 (OSX) and 6.0-2.2 (Windows), and both have
>>>> the same issue. The xdmp:plan output is the same under both users.
>>>> Maybe I should try creating a more isolated test case...
>>>>
>>>> -Will
>>>>
>>>>
>>>> On 3/25/13 2:53 PM, "Michael Blakeley" <[email protected]> wrote:
>>>>
>>>>> An amp shouldn't really be necessary, but it's puzzling that you see
>>>>> such a large difference. I tried to set up a similar test with some
>>>>> data I had handy, and saw a difference of less than 5% between admin
>>>>> and non-admin users.
>>>>>
>>>>> Which release are you using?
>>>>>
>>>>> -- Mike
>>>>>
>>>>> On 25 Mar 2013, at 14:20 , Will Thompson <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Mike - It seems to ignore the query-trace (inside or outside eval),
>>>>>> but I suspect you're right. Unfortunately this is dramatic enough to
>>>>>> be the difference between a usable and unusable autocomplete
>>>>>> solution, in which we're squeezing as much as we can into a limited
>>>>>> time budget. We will need to run the query as case- and
>>>>>> diacritic-insensitive.
>>>>>>
>>>>>> Will I need to amp this operation to run under the admin role to be
>>>>>> performant?
>>>>>>
>>>>>> -W
>>>>>>
>>>>>>
>>>>>> On 3/25/13 1:54 PM, "Michael Blakeley" <[email protected]> wrote:
>>>>>>
>>>>>>> The non-admin user should be checking extra query terms, to enforce
>>>>>>> the read permissions it has through its roles. That might be enough
>>>>>>> to explain the difference. I think the extra terms will show up in
>>>>>>> an xdmp:query-trace, if you want to verify that.
>>>>>>>
>>>>>>> You might also try the 'diacritic-sensitive' and 'case-sensitive'
>>>>>>> options. That should speed up the value-matching a bit.
>>>>>>>
>>>>>>> -- Mike
>>>>>>>
>>>>>>> On 25 Mar 2013, at 13:40 , Will Thompson <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I ran this in a loop 100 times for the limited user and for admin,
>>>>>>>> and the limited user was roughly 50X slower than admin:
>>>>>>>>
>>>>>>>> xdmp:eval(concat(
>>>>>>>>   'xquery version "1.0-ml";',
>>>>>>>>   'cts:element-value-match(xs:QName("element"), "value*", "limit=1")'),
>>>>>>>>   (),
>>>>>>>>   <options xmlns="xdmp:eval">
>>>>>>>>     <user-id>{ xdmp:user("limited") }</user-id>
>>>>>>>>   </options>)
>>>>>>>>
>>>>>>>> The limited user has a role with read permissions on the documents
>>>>>>>> containing those values (obviously, since it returns non-empty
>>>>>>>> results), and also has the app-user role. Otherwise, this user has
>>>>>>>> no other roles. With log level = debug, nothing really jumps out at
>>>>>>>> me. I only see occasional "InMemoryStand", "OnDiskStand", and
>>>>>>>> "Saving" messages, and they appear regardless of the user running
>>>>>>>> the query.
>>>>>>>>
>>>>>>>> -Will
>>>>>>>> _______________________________________________
>>>>>>>> General mailing list
>>>>>>>> [email protected]
>>>>>>>> http://developer.marklogic.com/mailman/listinfo/general