That's correct. The values are from a range index that stores only
autocomplete data, which is only a fraction (~12MB) of the total test
database (~1GB). 

count(cts:element-values(xs:QName("element"))) => 97676

So this is very similar to your 100k document test. We ran it both with all
the values stored in a single document and with one document per value, and
the test results were the same. However, our test user role has only
"read", not "update", permission on the documents.

-Will


On 3/26/13 12:19 PM, "Michael Blakeley" <[email protected]> wrote:

>So your database size is *not* proportional to the number of values?
>
>How many values do you have?
>
>-- Mike
>
>On 26 Mar 2013, at 11:27 , Will Thompson <[email protected]>
>wrote:
>
>> After further testing, it appears that the latency increase for the
>> non-admin user is proportional to database size (note: NOT range index
>> size). We
>> bootstrapped a fresh database, and with only some test documents loaded
>> the query speeds were virtually identical. However, as content is loaded
>> into the db, query time for the non-admin user essentially doubles every
>> time the database size doubles.
>> 
>> -Will
>> 
>> 
>> On 3/25/13 5:12 PM, "Michael Blakeley" <[email protected]> wrote:
>> 
>>> For reference, here's what I tried. I only created 100,000 documents,
>>> and they are very small.
>>> 
>>> (: setup :)
>>> (1 to 100 * 1000) ! (
>>> xdmp:document-insert(
>>>   '/test/'||.,
>>>   element test {
>>>     attribute id { . },
>>>     element a { xdmp:integer-to-hex(xdmp:random()) } },
>>>   ('read', 'update') ! xdmp:permission('test', .))),
>>> xdmp:elapsed-time()
>>> 
>>> That takes about 30 seconds on my laptop.
>>> 
>>> (: test - admin :)
>>> (for $i in 1 to 1000
>>> return cts:element-value-match(
>>>   xs:QName("a"), "e*", "limit=1"))[last()],
>>> xdmp:elapsed-time()
>>> =>
>>> e0003170ed3130a4
>>> PT0.084827S
>>> 
>>> (: test - user :)
>>> xdmp:eval('
>>> (for $i in 1 to 1000
>>>  return cts:element-value-match(
>>>    xs:QName("a"), "e*", "limit=1"))[last()]',
>>> (),
>>> <options xmlns="xdmp:eval">
>>>   <user-id>{ xdmp:user("test") }</user-id>
>>> </options>),
>>> xdmp:elapsed-time()
>>> =>
>>> e0003170ed3130a4
>>> PT0.07878S
>>> 
>>> In this particular run the non-admin user was faster, but that is
>>> probably a caching effect, and anyway the difference was not significant.
>>> I'm using 6.0-2.1 on OS X, running the queries in cq.
>>> 
>>> There are about 6700 values that match 'e*'. According to the profiler,
>>> about 50% of the elapsed time is spent in the cts:element-value-match
>>> call. The rest is split between the FLWOR and the predicate on last().
>>> 
>>> -- Mike
>>> 
>>> On 25 Mar 2013, at 15:15 , Will Thompson <[email protected]>
>>> wrote:
>>> 
>>>> I've tested this on 6.0-2 (OSX) and 6.0-2.2 (Windows), and both have
>>>> the same issue. The xdmp:plan output is the same under both users.
>>>> Maybe I should try creating a more isolated test case...
>>>> 
>>>> -Will
>>>> 
>>>> 
>>>> On 3/25/13 2:53 PM, "Michael Blakeley" <[email protected]> wrote:
>>>> 
>>>>> An amp shouldn't really be necessary, but it's puzzling that you see
>>>>> such a large difference. I tried to set up a similar test with some
>>>>> data I had handy, and saw a difference of less than 5% between admin
>>>>> and non-admin users.
>>>>> 
>>>>> Which release are you using?
>>>>> 
>>>>> -- Mike
>>>>> 
>>>>> On 25 Mar 2013, at 14:20 , Will Thompson <[email protected]>
>>>>> wrote:
>>>>> 
>>>>>> Mike - It seems to ignore the query-trace (inside or outside eval),
>>>>>> but I suspect you're right. Unfortunately this is dramatic enough to
>>>>>> be the difference between a usable and unusable autocomplete solution,
>>>>>> in which we're squeezing as much as we can into a limited time budget.
>>>>>> We will need to run the query as case- and diacritic-insensitive.
>>>>>> 
>>>>>> Will I need to amp this operation to run under the admin role for it
>>>>>> to be performant?
>>>>>> 
>>>>>> -W
>>>>>> 
>>>>>> 
>>>>>> On 3/25/13 1:54 PM, "Michael Blakeley" <[email protected]> wrote:
>>>>>> 
>>>>>>> The non-admin user should be checking extra query terms, to enforce
>>>>>>> the read permissions it has through its roles. That might be enough to
>>>>>>> explain the difference. I think the extra terms will show up in an
>>>>>>> xdmp:query-trace, if you want to verify that.
>>>>>>> 
>>>>>>> You might also try the 'diacritic-sensitive' and 'case-sensitive'
>>>>>>> options. That should speed up the value-matching a bit.
>>>>>>> 
>>>>>>> -- Mike
>>>>>>> 
>>>>>>> On 25 Mar 2013, at 13:40 , Will Thompson <[email protected]>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> I ran this in a loop 100 times for the limited user and for admin,
>>>>>>>> and the limited user was roughly 50X slower than admin:
>>>>>>>> 
>>>>>>>> xdmp:eval(concat(
>>>>>>>> 'xquery version "1.0-ml";',
>>>>>>>> 'cts:element-value-match(',
>>>>>>>> '  xs:QName("element"), "value*", "limit=1")'),
>>>>>>>> (),
>>>>>>>> <options xmlns="xdmp:eval">
>>>>>>>>  <user-id>{ xdmp:user("limited") }</user-id>
>>>>>>>> </options>)
>>>>>>>> 
>>>>>>>> The limited user has a role with read permissions on the documents
>>>>>>>> containing those values (obviously, since it returns non-empty
>>>>>>>> results), and also has the app-user role. Otherwise, this user has no
>>>>>>>> other roles. With log level = debug, nothing really jumps out at me.
>>>>>>>> I only see occasional "InMemoryStand", "OnDiskStand", and "Saving"
>>>>>>>> messages, and they appear regardless of the user running the query.
>>>>>>>> 
>>>>>>>> -Will
>>>>>>>> _______________________________________________
>>>>>>>> General mailing list
>>>>>>>> [email protected]
>>>>>>>> http://developer.marklogic.com/mailman/listinfo/general
