Re: [basex-talk] Is there any documentation on the narrow limits of XQuery index optimization in BaseX?

2017-09-18 Thread Christian Grün
> The thing I miss most is a function
> like xquery:eval that accepts a function as an argument but also takes a
> context and does that runtime optimization.

I assume you are looking for something like the following query?

  xquery:eval-func(
    function($db, $name) { db:open($db)//person[name = $name] },
    [ 'persons', 'john' ]
  )

This sounds like an enticing idea. It is hard to realize, though, as
we would need to recompile code that has already been rewritten to an
internal representation that can be evaluated by our XQuery processor.
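
Until something like that exists, the closest approximation is probably xquery:eval with a query string and the database bound as context item, since the string is only compiled at runtime, when the database is already known. A minimal sketch, assuming a database named 'persons' with a text index; local:persons-by-name is a hypothetical helper and not part of BaseX, and whether the index is actually applied depends on the database:

```xquery
(: Hypothetical helper: the query string is compiled by xquery:eval at
   runtime, with the opened database bound as context item (key '')
   and $name bound as an external variable. :)
declare function local:persons-by-name(
  $db   as xs:string,
  $name as xs:string
) as element(person)* {
  xquery:eval(
    "declare variable $name external; //person[name = $name]",
    map { '': db:open($db), 'name': $name }
  )
};

local:persons-by-name('persons', 'john')
```

The price is that the query body has to be maintained as a string, which is exactly what a function-based variant would avoid.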

> Or a way to convert a function
> to a string.

Same here: It would require a lot of work to create a bug-free XQuery
string representation of the internal code.






Re: [basex-talk] Is there any documentation on the narrow limits of XQuery index optimization in BaseX?

2017-09-18 Thread Omar Siam

Hi!

Interesting ideas. I don't like the pragma idea that much because there
is already something like that with xquery:eval. The thing I miss most is a
function like xquery:eval that accepts a function as an argument but
also takes a context and does that runtime optimization. Or a way to
convert a function to a string. Is there already something like this? I thought
it might be xquery:invoke, but that seems to do something else.


Best regards

Omar





Re: [basex-talk] Is there any documentation on the narrow limits of XQuery index optimization in BaseX?

2017-09-18 Thread Christian Grün
Hi Omar,

Our current XQuery optimizer opens the addressed database in order to
find out if it has the required index structures, and if these are
up-to-date. Moreover, the cheapest index lookup will be selected if
there are several index candidates. For example, in the following
query, the second predicate will most likely be rewritten
for index access:

  db:open('persons')//person[country = 'Italy'][@id = 'id124']

If the addressed database is not statically known, these checks cannot
be performed that easily. Further implications and in-depth
information can be found in »Storing and Querying Large XML Instances«
[1].
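
For comparison, the index access that the optimizer chooses here can also be written out by hand with the explicit index functions. A sketch, assuming the 'persons' database has an attribute index; if the index is missing, db:attribute is expected to raise an error rather than fall back to a scan:

```xquery
(: Manual counterpart of the rewritten query: look up @id via the
   attribute index, then test the remaining predicate on the parent. :)
db:attribute('persons', 'id124', 'id')/parent::person[country = 'Italy']
```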

Here are two ideas how this could be tackled:

• We could add an XQuery pragma to enforce specific index rewritings. Examples:

  for $n in 1 to 10
  for $db in ('persons' || $n)
  for $person in db:open($db)//person
  where (# basex:index #) { $person/country = 'Italy' }
  where $person/@id = 'id124'
  return $person

  (1 to 10) ! db:open('persons' || .)//person
    [(# basex:index #) { country = 'Italy' }]
    [@id = 'id124']

• We could create multiple query plans at compile time (with and
without index, one rewriting for each index candidate) and choose the
one that is expected to be the cheapest at evaluation time. This would
definitely be the most flexible option (but the number of query plans
increases exponentially if you have nested FLWOR expressions and
queries with numerous predicates or where clauses).
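
As long as neither idea is implemented, explicit index functions such as db:attribute remain a practical fallback for database names that are only known at runtime, because they take the name as an ordinary argument. A sketch, assuming databases 'persons1' to 'persons10' exist and each has an attribute index:

```xquery
(: The database name is computed per iteration; the index lookup is
   spelled out explicitly instead of being left to the optimizer. :)
for $n in 1 to 10
let $db := 'persons' || $n
return db:attribute($db, 'id124', 'id')/parent::person[country = 'Italy']
```

The trade-off is that the query hard-codes which predicate goes through the index, so the cost-based choice between several index candidates is lost.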

Cheers,
Christian

[1] http://basex.org/about-us/publications/



On Wed, Sep 6, 2017 at 4:02 PM, Omar Siam wrote:
> Hi list!
>
> Recently I started to wonder why functions in my XQuery modules make no use
> of indexes unless I force them to by using the respective function like
> db:text(). Now I just made some (I think) minimal changes to the example for
> the text index at: http://docs.basex.org/wiki/Indexes#Text_Index.
>
> If I just change it as shown below, I lose almost all index optimization.
>
> The only optimizations left are:
>
> * Using db:open or collection with a string, but not with one in a variable of a
> FLWOR expression or the simple map operator.
>
> * Using xquery:eval and passing a context with db:open or collection
>
> Is there any chance that this will change any time soon? Is this a
> fundamental restriction?
>
> Best regards
>
> Omar
>
> xquery version "3.1";
>
> declare namespace _ = "urn:local:namespace:_";
> import module namespace functx = "http://www.functx.com";
>
> declare function _:_1st_example($ctx as document-node()) {
>   $ctx//*[text() = 'Germany']
> };
>
> declare function _:_2nd_example($file as xs:string) {
>   doc($file)//name[. = 'Germany']
> };
>
> declare function _:_3rd_example($dbname as xs:string+) {
>   (
>   for $c in ($dbname!collection(.))//country
>   where $c//city/name = 'Hanoi'
>   return $c/name,
>   $dbname!xquery:eval("//*[text() = 'Vietnam']", map {'': db:open(.)}))
> };
>
> (
> (: 1st example :)
> _:_1st_example(.),
> (: 2nd example :)
> _:_2nd_example('factbook.xml'),
> (: 3rd example :)
> _:_3rd_example(('factbook', 'factbook')),
> xs:string(_:_1st_example)
> )
>
> Optimized Query:
> (let $ctx_314 := . return $ctx_314/descendant::*[(text() = "Germany")],
> db:open-pre("factbook",0)/descendant::name[(. = "Germany")], (for $c_317 in
> (("factbook", "factbook") !
> collection(.))/descendant::country[(descendant::city/name = "Hanoi")] return
> $c_317/name, (("factbook", "factbook") ! xquery:eval("//*[text() =
> 'Vietnam']", map { "":db:open(.) }))), _:_1st_example cast as xs:string?)
>
> Compiling:
>
> [...]
>
> - RUNTIME: pre-evaluate root() to document-node()
> - RUNTIME: rewrite descendant-or-self step(s)
> - RUNTIME: apply text index for "Vietnam"
> - RUNTIME: pre-evaluate root() to document-node()
> - RUNTIME: rewrite descendant-or-self step(s)
> - RUNTIME: apply text index for "Vietnam"
>