Re: [basex-talk] Is there any documentation on the narrow limits of XQuery index optimization in BaseX?
> The thing I miss most is a function like xquery:eval that accepts a
> function as an argument but also takes a context and does that runtime
> optimization.

I assume you are looking for something like the following query?

  xquery:eval-func(
    function($db, $name) { db:open($db)//person[name = $name] },
    [ 'persons', 'john' ]
  )

This sounds like an enticing idea. It is hard to realize, though, as we
would need to recompile code that has already been rewritten to an
internal representation that can be evaluated by our XQuery processor.

> Or a way to convert a function to string.

Same here: It would require a lot of work to create a bug-free XQuery
string representation of the internal code.

> On 18.09.2017 at 15:59, Christian Grün wrote:
>>
>> Hi Omar,
>>
>> Our current XQuery optimizer opens the addressed database in order to
>> find out if it has the required index structures, and if these are
>> up-to-date. Moreover, the cheapest index lookup will be selected if
>> there are several index candidates. For example, in the following
>> query, it will be likely that the second predicate will be rewritten
>> for index access:
>>
>>   db:open('persons')//person[country = 'Italy'][@id = 'id124']
>>
>> If the addressed database is not statically known, these checks cannot
>> be performed that easily. Further implications and in-depth
>> information can be found in »Storing and Querying Large XML Instances«
>> [1].
>>
>> Here are two ideas how this could be tackled:
>>
>> • We could add an XQuery pragma to enforce specific index rewritings.
>> Examples:
>>
>>   for $n in 1 to 10
>>   for $db in ('persons' || $n)
>>   for $person in db:open($db)//person
>>   where (# basex:index #) { $person/country = 'Italy' }
>>   where $person/@id = 'id124'
>>   return $person
>>
>>   (1 to 10) ! db:open('persons' || .)//person
>>     [(# basex:index #) { country = 'Italy' }]
>>     [@id = 'id124']
>>
>> • We could create multiple query plans at compile time (with and
>> without index, one rewriting for each index candidate) and choose the
>> one that is expected to be the cheapest at evaluation time. This would
>> definitely be the most flexible option (but the number of query plans
>> increases exponentially if you have nested FLWOR expressions and
>> queries with numerous predicates or where clauses).
>>
>> Cheers,
>> Christian
>>
>> [1] http://basex.org/about-us/publications/
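
Until something like the function-based variant exists, the string-based xquery:eval mentioned in this thread can carry both a context and a parameter. The following is only a minimal sketch: it assumes the 'persons' database and the person/name structure from the example above, and the values of $db and $name are hypothetical. Whether an index is actually applied should be checked in the query info, as it depends on the database having the required index.

```xquery
(: Hypothetical inputs; the database name and structure follow the thread's example. :)
let $db   := 'persons'
let $name := 'john'
return xquery:eval(
  (: The query string is compiled at runtime, after the context is bound. :)
  "declare variable $name external; //person[name = $name]",
  (: The empty key binds the context; 'name' binds the external variable. :)
  map { '': db:open($db), 'name': $name }
)
```

Because the database is already bound as context when the string is compiled, the optimizer can inspect it, which corresponds to the "RUNTIME: apply text index" entries reported in the original question.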
Re: [basex-talk] Is there any documentation on the narrow limits of XQuery index optimization in BaseX?
Hi!

Interesting ideas. I don't like the pragma idea that much because there is
already something like that with xquery:eval.

The thing I miss most is a function like xquery:eval that accepts a function
as an argument but also takes a context and does that runtime optimization.
Or a way to convert a function to a string. Is there already something like
this? I thought it might be xquery:invoke, but that seems to do something
else.

Best regards
Omar
Re: [basex-talk] Is there any documentation on the narrow limits of XQuery index optimization in BaseX?
Hi Omar,

Our current XQuery optimizer opens the addressed database in order to
find out if it has the required index structures, and if these are
up-to-date. Moreover, the cheapest index lookup will be selected if
there are several index candidates. For example, in the following
query, it will be likely that the second predicate will be rewritten
for index access:

  db:open('persons')//person[country = 'Italy'][@id = 'id124']

If the addressed database is not statically known, these checks cannot
be performed that easily. Further implications and in-depth
information can be found in »Storing and Querying Large XML Instances«
[1].

Here are two ideas how this could be tackled:

• We could add an XQuery pragma to enforce specific index rewritings.
Examples:

  for $n in 1 to 10
  for $db in ('persons' || $n)
  for $person in db:open($db)//person
  where (# basex:index #) { $person/country = 'Italy' }
  where $person/@id = 'id124'
  return $person

  (1 to 10) ! db:open('persons' || .)//person
    [(# basex:index #) { country = 'Italy' }]
    [@id = 'id124']

• We could create multiple query plans at compile time (with and
without index, one rewriting for each index candidate) and choose the
one that is expected to be the cheapest at evaluation time. This would
definitely be the most flexible option (but the number of query plans
increases exponentially if you have nested FLWOR expressions and
queries with numerous predicates or where clauses).

Cheers,
Christian

[1] http://basex.org/about-us/publications/

On Wed, Sep 6, 2017 at 4:02 PM, Omar Siam wrote:
> Hi list!
>
> Recently I started to wonder why functions in my XQuery modules make no
> use of indexes unless I force them to by using the respective function
> like db:text(). I just made what I think are minimal changes to the
> example for the text index at:
> http://docs.basex.org/wiki/Indexes#Text_Index.
>
> If I just change it like below, I lose almost all index optimization.
>
> The only optimizations left are:
>
> * Using db:open or collection with a string, but not one in a variable
>   of a FLWOR expression or the simple map operator.
>
> * Using xquery:eval and passing a context with db:open or collection.
>
> Is there any chance that this will change any time soon? Is this a
> fundamental restriction?
>
> Best regards
>
> Omar
>
> xquery version "3.1";
>
> declare namespace _ = "urn:local:namespace:_";
> import module namespace functx = "http://www.functx.com";
>
> declare function _:_1st_example($ctx as document-node()) {
>   $ctx//*[text() = 'Germany']
> };
>
> declare function _:_2nd_example($file as xs:string) {
>   doc($file)//name[. = 'Germany']
> };
>
> declare function _:_3rd_example($dbname as xs:string+) {
>   (
>     for $c in ($dbname!collection(.))//country
>     where $c//city/name = 'Hanoi'
>     return $c/name,
>     $dbname!xquery:eval("//*[text() = 'Vietnam']", map {'': db:open(.)}))
> };
>
> (
>   (: 1st example :)
>   _:_1st_example(.),
>   (: 2nd example :)
>   _:_2nd_example('factbook.xml'),
>   (: 3rd example :)
>   _:_3rd_example(('factbook', 'factbook')),
>   xs:string(_:_1st_example)
> )
>
> Optimized Query:
> (let $ctx_314 := . return $ctx_314/descendant::*[(text() = "Germany")],
> db:open-pre("factbook",0)/descendant::name[(. = "Germany")],
> (for $c_317 in (("factbook", "factbook") !
> collection(.))/descendant::country[(descendant::city/name = "Hanoi")]
> return $c_317/name,
> (("factbook", "factbook") ! xquery:eval("//*[text() = 'Vietnam']",
> map { "":db:open(.) }))),
> _:_1st_example cast as xs:string?)
>
> Compiling:
>
> [...]
>
> - RUNTIME: pre-evaluate root() to document-node()
> - RUNTIME: rewrite descendant-or-self step(s)
> - RUNTIME: apply text index for "Vietnam"
> - RUNTIME: pre-evaluate root() to document-node()
> - RUNTIME: rewrite descendant-or-self step(s)
> - RUNTIME: apply text index for "Vietnam"
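
For database names that are only known at runtime, the explicit index function mentioned in the question remains an option. The following is a minimal sketch rather than a recommendation: it assumes that each named database exists and has a text index, and it reuses the 'factbook' names and the 'Vietnam' lookup from the example above.

```xquery
(: Explicit text index access; no compile-time rewriting is needed. :)
for $db in ('factbook', 'factbook')
(: db:text returns the matching text nodes; the parent step selects
   the elements that //*[text() = 'Vietnam'] would return. :)
return db:text($db, 'Vietnam')/parent::*
```

The trade-off is that the query no longer falls back to a sequential scan if the index is missing, so this form only suits databases whose index structures are known to exist.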