Re: [basex-talk] Curious query optimization

2018-07-27 Thread Sebastian Zimmer

Thanks for the fix. Works smoothly now.

Best,
Sebastian


On 27.07.2018 at 12:56, Christian Grün wrote:

A new snapshot is available: http://files.basex.org/releases/latest.
Thanks for reporting it,
Christian


On Tue, Jul 24, 2018 at 12:37 PM Christian Grün <christian.gr...@gmail.com> wrote:


Thanks, good catch. I have opened an issue for that [1]. – Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/1597


On Tue, Jul 24, 2018 at 11:51 AM Sebastian Zimmer <sebastian.zim...@uni-koeln.de> wrote:

Hi Christian,

this is the simplest query I could come up with:

xquery version "3.1";
let $fuzzy := false()

return (
   collection('BIBL')/*:TEI[
     if ($fuzzy)
     then ()
     else (.[descendant::text() contains text {"string"} using fuzzy])
   ]
)

is optimized to:

(db:open-pre("BIBL", 0), ...)/*:TEI[descendant::text() contains text 
"string" using fuzzy using language 'English']


should be optimized to:

ft:search("BIBL", "string")/ancestor::*:TEI

Thanks,
Sebastian


On 24.07.2018 at 11:23, Christian Grün wrote:

Hi Sebastian,

I guess the query was only rewritten for index access in
BaseX 8 because the two branches of the if/then/else
expression were simplified incorrectly and replaced with one
of the branches. To simplify debugging, could you please
reduce your query even further and drop all superfluous
expressions that do not relate to this bug?

Thanks in advance,
Christian



On Tue, Jul 24, 2018 at 11:16 AM Sebastian Zimmer <sebastian.zim...@uni-koeln.de> wrote:

Hi again,

I have compared the output of the latest BaseX 9.1 with
8.6.7 and it looks like a regression.

Whereas in 8.6.7 the query is optimized twice with
ft:search, in 9.1 this happens only once.

See attached for both outputs.

Best,
Sebastian


On 23.07.2018 at 11:55, Sebastian Zimmer wrote:


The full-text index of the database is enabled and the
compiling info section of the query states

- apply full-text index for { $string_0 } using language
'English'

The optimized query looks to me as if the index is
applied only once via ft:search, but not in both cases.

Best,
Sebastian


On 23.07.2018 at 11:47, Christian Grün wrote:

Hi Sebastian,

Did you check in the Info View panel if the index is applied? If not,
you might try something as follows:

   if ($fuzzy) then (
     collection('ZK')/tei:TEI[... using fuzzy]
   ) else (
     collection('ZK')/tei:TEI[...]
   )

Usually, if full-text options are dynamic, I tend to use ft:search 
[1].

Best,
Christian

[1]http://docs.basex.org/wiki/Full-Text_Module#ft:search




On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:

Hi Christian,

thanks for the fix, the result is correct now.

But this query now takes about 18 seconds (!) to execute, instead of
less than 1 second as before. Do you think this could be sped up?

See attached for the complete console output.

Best,
Sebastian


On 12.07.2018 at 13:03, Christian Grün wrote:

Hi Sebastian,

This has been fixed. The background: In one of the optimizations of
the "if" expression, identical branches are merged:

   if(..expensive query..) then 1 else 1
   → Optimized Query: 1

The full-text options were ignored in the equality check.
A new snapshot is online.

Best,
Christian
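
A minimal sketch of the kind of expression this affects, assuming the BaseX-specific "using fuzzy" full-text option; the misspelled token "databse" is purely illustrative and does not come from the thread. The two branches differ in nothing but the full-text option, so an equality check that ignores full-text options would treat them as identical and merge them:

```xquery
xquery version "3.1";
(: The two branches differ only in the full-text option "using fuzzy". :)
let $fuzzy := false()
return
  if ($fuzzy)
  then "databse" contains text "database" using fuzzy
  else "databse" contains text "database"
```

With $fuzzy set to false(), only the exact comparison in the else branch should be evaluated. If the branches were merged into the fuzzy one, the condition would be dropped and the fuzzy comparison would run instead.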



On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:

Hi,

I have a query which is optimized in a curious way in BaseX 9.0.2 
(yesterday's snapshot).

This is the original query:

xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $string := "string"
let $fuzzy := false()

return (
   collection('ZK')/tei:TEI[
 if (false())
   then (.[descendant::text() contains text {$string} using fuzzy])
   else (.[descendant::text() contains text {$string}])
   ],
   collection('ZK')/tei:TEI[
 if ($fuzzy)
 then (.

Re: [basex-talk] Curious query optimization

2018-07-27 Thread Christian Grün
A new snapshot is available: http://files.basex.org/releases/latest.
Thanks for reporting it,
Christian


On Tue, Jul 24, 2018 at 12:37 PM Christian Grün wrote:

> Thanks, good catch. I have opened an issue for that [1]. – Best, Christian
>
> [1] https://github.com/BaseXdb/basex/issues/1597
>
>
> On Tue, Jul 24, 2018 at 11:51 AM Sebastian Zimmer <
> sebastian.zim...@uni-koeln.de> wrote:
>
>> Hi Christian,
>>
>> this is the simplest query I could come up with:
>>
>> xquery version "3.1";
>> let $fuzzy := false()
>>
>> return (
>>   collection('BIBL')/*:TEI[
>> if ($fuzzy)
>> then ()
>> else (.[descendant::text() contains text {"string"} using fuzzy])
>>   ]
>> )
>>
>> is optimized to:
>>
>> (db:open-pre("BIBL", 0), ...)/*:TEI[descendant::text() contains text 
>> "string" using fuzzy using language 'English']
>>
>>
>> should be optimized to:
>>
>> ft:search("BIBL", "string")/ancestor::*:TEI
>>
>> Thanks,
>> Sebastian
>>
>> On 24.07.2018 at 11:23, Christian Grün wrote:
>>
>> Hi Sebastian,
>>
>> I guess the query was only rewritten for index access in BaseX 8 because
>> the two branches of the if/then/else expression were simplified incorrectly
>> and replaced with one of the branches. To simplify debugging, could you
>> please reduce your query even further and drop all superfluous expressions
>> that do not relate to this bug?
>>
>> Thanks in advance,
>> Christian
>>
>>
>>
>> On Tue, Jul 24, 2018 at 11:16 AM Sebastian Zimmer <
>> sebastian.zim...@uni-koeln.de> wrote:
>>
>>> Hi again,
>>>
>>> I have compared the output of the latest BaseX 9.1 with 8.6.7 and it
>>> looks like a regression.
>>>
>>> Whereas in 8.6.7 the query is optimized twice with ft:search, in
>>> 9.1 this happens only once.
>>>
>>> See attached for both outputs.
>>>
>>> Best,
>>> Sebastian
>>>
>>> On 23.07.2018 at 11:55, Sebastian Zimmer wrote:
>>>
>>> The full-text index of the database is enabled and the compiling info
>>> section of the query states
>>>
>>> - apply full-text index for { $string_0 } using language 'English'
>>>
>>> The optimized query looks to me as if the index is applied only once via
>>> ft:search, but not in both cases.
>>>
>>> Best,
>>> Sebastian
>>>
>>> On 23.07.2018 at 11:47, Christian Grün wrote:
>>>
>>> Hi Sebastian,
>>>
>>> Did you check in the Info View panel if the index is applied? If not,
>>> you might try something as follows:
>>>
>>>   if ($fuzzy) then (
>>>     collection('ZK')/tei:TEI[... using fuzzy]
>>>   ) else (
>>>     collection('ZK')/tei:TEI[...]
>>>   )
>>>
>>> Usually, if full-text options are dynamic, I tend to use ft:search [1].
>>>
>>> Best,
>>> Christian
>>>
>>> [1] http://docs.basex.org/wiki/Full-Text_Module#ft:search
>>>
>>>
>>>
>>>
>>> On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:
>>>
>>> Hi Christian,
>>>
>>> thanks for the fix, the result is correct now.
>>>
>>> But this query now takes about 18 seconds (!) to execute, instead of
>>> less than 1 second as before. Do you think this could be sped up?
>>>
>>> See attached for the complete console output.
>>>
>>> Best,
>>> Sebastian
>>>
>>>
>>> On 12.07.2018 at 13:03, Christian Grün wrote:
>>>
>>> Hi Sebastian,
>>>
>>> This has been fixed. The background: In one of the optimizations of
>>> the "if" expression, identical branches are merged:
>>>
>>>   if(..expensive query..) then 1 else 1
>>>   → Optimized Query: 1
>>>
>>> The full-text options were ignored in the equality check.
>>> A new snapshot is online.
>>>
>>> Best,
>>> Christian
>>>
>>>
>>>
>>> On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:
>>>
>>> Hi,
>>>
>>> I have a query which is optimized in a curious way in BaseX 9.0.2 
>>> (yesterday's snapshot).
>>>
>>> This is the original query:
>>>
>>> xquery version "3.1";
>>> declare namespace tei = "http://www.tei-c.org/ns/1.0";
>>> let $string := "string"
>>> let $fuzzy := false()
>>>
>>> return (
>>>   collection('ZK')/tei:TEI[
>>> if (false())
>>>   then (.[descendant::text() contains text {$string} using fuzzy])
>>>   else (.[descendant::text() contains text {$string}])
>>>   ],
>>>   collection('ZK')/tei:TEI[
>>> if ($fuzzy)
>>> then (.[descendant::text() contains text {$string} using fuzzy])
>>> else (.[descendant::text() contains text {$string}])
>>>   ]
>>> )
>>>
>>> And this is the optimized one (newlines inserted by me for better 
>>> readability):
>>>
>>> (
>>> ft:search("ZK", "string" using language 
>>> 'English')/ancestor::tei:TEI[parent::document-node()],
>>> ft:search("ZK", "string" using fuzzy using language 
>>> 'English')/ancestor::tei:TEI[parent::document-node()]
>>> )
>>>
>>> I'm curious why the second search is using fuzzy, even though the variable 
>>> $fuzzy is false. I presume that query optimization is independent of the 
>>> data, so you won't need the data to reproduce. But if you do, I can provide 
>>> it. A database with enabled full-text index is required obviously.
>>>
>>> Best regards,
>>> Sebastian Z

Re: [basex-talk] Curious query optimization

2018-07-24 Thread Christian Grün
Thanks, good catch. I have opened an issue for that [1]. – Best, Christian

[1] https://github.com/BaseXdb/basex/issues/1597


On Tue, Jul 24, 2018 at 11:51 AM Sebastian Zimmer <
sebastian.zim...@uni-koeln.de> wrote:

> Hi Christian,
>
> this is the simplest query I could come up with:
>
> xquery version "3.1";
> let $fuzzy := false()
>
> return (
>   collection('BIBL')/*:TEI[
> if ($fuzzy)
> then ()
> else (.[descendant::text() contains text {"string"} using fuzzy])
>   ]
> )
>
> is optimized to:
>
> (db:open-pre("BIBL", 0), ...)/*:TEI[descendant::text() contains text "string" 
> using fuzzy using language 'English']
>
>
> should be optimized to:
>
> ft:search("BIBL", "string")/ancestor::*:TEI
>
> Thanks,
> Sebastian
>
> On 24.07.2018 at 11:23, Christian Grün wrote:
>
> Hi Sebastian,
>
> I guess the query was only rewritten for index access in BaseX 8 because
> the two branches of the if/then/else expression were simplified incorrectly
> and replaced with one of the branches. To simplify debugging, could you
> please reduce your query even further and drop all superfluous expressions
> that do not relate to this bug?
>
> Thanks in advance,
> Christian
>
>
>
> On Tue, Jul 24, 2018 at 11:16 AM Sebastian Zimmer <
> sebastian.zim...@uni-koeln.de> wrote:
>
>> Hi again,
>>
>> I have compared the output of the latest BaseX 9.1 with 8.6.7 and it
>> looks like a regression.
>>
>> Whereas in 8.6.7 the query is optimized twice with ft:search, in
>> 9.1 this happens only once.
>>
>> See attached for both outputs.
>>
>> Best,
>> Sebastian
>>
>> On 23.07.2018 at 11:55, Sebastian Zimmer wrote:
>>
>> The full-text index of the database is enabled and the compiling info
>> section of the query states
>>
>> - apply full-text index for { $string_0 } using language 'English'
>>
>> The optimized query looks to me as if the index is applied only once via
>> ft:search, but not in both cases.
>>
>> Best,
>> Sebastian
>>
>> On 23.07.2018 at 11:47, Christian Grün wrote:
>>
>> Hi Sebastian,
>>
>> Did you check in the Info View panel if the index is applied? If not,
>> you might try something as follows:
>>
>>   if ($fuzzy) then (
>>     collection('ZK')/tei:TEI[... using fuzzy]
>>   ) else (
>>     collection('ZK')/tei:TEI[...]
>>   )
>>
>> Usually, if full-text options are dynamic, I tend to use ft:search [1].
>>
>> Best,
>> Christian
>>
>> [1] http://docs.basex.org/wiki/Full-Text_Module#ft:search
>>
>>
>>
>>
>> On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:
>>
>> Hi Christian,
>>
>> thanks for the fix, the result is correct now.
>>
>> But this query now takes about 18 seconds (!) to execute, instead of
>> less than 1 second as before. Do you think this could be sped up?
>>
>> See attached for the complete console output.
>>
>> Best,
>> Sebastian
>>
>>
>> On 12.07.2018 at 13:03, Christian Grün wrote:
>>
>> Hi Sebastian,
>>
>> This has been fixed. The background: In one of the optimizations of
>> the "if" expression, identical branches are merged:
>>
>>   if(..expensive query..) then 1 else 1
>>   → Optimized Query: 1
>>
>> The full-text options were ignored in the equality check.
>> A new snapshot is online.
>>
>> Best,
>> Christian
>>
>>
>>
>> On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:
>>
>> Hi,
>>
>> I have a query which is optimized in a curious way in BaseX 9.0.2 
>> (yesterday's snapshot).
>>
>> This is the original query:
>>
>> xquery version "3.1";
>> declare namespace tei = "http://www.tei-c.org/ns/1.0";
>> let $string := "string"
>> let $fuzzy := false()
>>
>> return (
>>   collection('ZK')/tei:TEI[
>> if (false())
>>   then (.[descendant::text() contains text {$string} using fuzzy])
>>   else (.[descendant::text() contains text {$string}])
>>   ],
>>   collection('ZK')/tei:TEI[
>> if ($fuzzy)
>> then (.[descendant::text() contains text {$string} using fuzzy])
>> else (.[descendant::text() contains text {$string}])
>>   ]
>> )
>>
>> And this is the optimized one (newlines inserted by me for better 
>> readability):
>>
>> (
>> ft:search("ZK", "string" using language 
>> 'English')/ancestor::tei:TEI[parent::document-node()],
>> ft:search("ZK", "string" using fuzzy using language 
>> 'English')/ancestor::tei:TEI[parent::document-node()]
>> )
>>
>> I'm curious why the second search is using fuzzy, even though the variable 
>> $fuzzy is false. I presume that query optimization is independent of the 
>> data, so you won't need the data to reproduce. But if you do, I can provide 
>> it. A database with enabled full-text index is required obviously.
>>
>> Best regards,
>> Sebastian Zimmer
>>
>> --
>> Sebastian Zimmer
>> sebastian.zim...@uni-koeln.de
>>
>> Cologne Center for eHumanities
>> DH Center at the University of Cologne
>> @CCeHum
>>
>>

Re: [basex-talk] Curious query optimization

2018-07-24 Thread Sebastian Zimmer

Hi Christian,

this is the simplest query I could come up with:

xquery version "3.1";
let $fuzzy := false()

return (
  collection('BIBL')/*:TEI[
    if ($fuzzy)
    then ()
    else (.[descendant::text() contains text {"string"} using fuzzy])
  ]
)

is optimized to:

(db:open-pre("BIBL", 0), ...)/*:TEI[descendant::text() contains text "string" 
using fuzzy using language 'English']


should be optimized to:

ft:search("BIBL", "string")/ancestor::*:TEI

Thanks,
Sebastian


On 24.07.2018 at 11:23, Christian Grün wrote:

Hi Sebastian,

I guess the query was only rewritten for index access in BaseX 8
because the two branches of the if/then/else expression were simplified
incorrectly and replaced with one of the branches. To simplify
debugging, could you please reduce your query even further and drop all
superfluous expressions that do not relate to this bug?


Thanks in advance,
Christian



On Tue, Jul 24, 2018 at 11:16 AM Sebastian Zimmer <sebastian.zim...@uni-koeln.de> wrote:


Hi again,

I have compared the output of the latest BaseX 9.1 with 8.6.7 and
it looks like a regression.

Whereas in 8.6.7 the query is optimized twice with
ft:search, in 9.1 this happens only once.

See attached for both outputs.

Best,
Sebastian


On 23.07.2018 at 11:55, Sebastian Zimmer wrote:


The full-text index of the database is enabled and the compiling
info section of the query states

- apply full-text index for { $string_0 } using language 'English'

The optimized query looks to me as if the index is applied only
once via ft:search, but not in both cases.

Best,
Sebastian


On 23.07.2018 at 11:47, Christian Grün wrote:

Hi Sebastian,

Did you check in the Info View panel if the index is applied? If not,
you might try something as follows:

   if ($fuzzy) then (
     collection('ZK')/tei:TEI[... using fuzzy]
   ) else (
     collection('ZK')/tei:TEI[...]
   )

Usually, if full-text options are dynamic, I tend to use ft:search [1].

Best,
Christian

[1]http://docs.basex.org/wiki/Full-Text_Module#ft:search




On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:

Hi Christian,

thanks for the fix, the result is correct now.

But this query now takes about 18 seconds (!) to execute, instead of
less than 1 second as before. Do you think this could be sped up?

See attached for the complete console output.

Best,
Sebastian


On 12.07.2018 at 13:03, Christian Grün wrote:

Hi Sebastian,

This has been fixed. The background: In one of the optimizations of
the "if" expression, identical branches are merged:

   if(..expensive query..) then 1 else 1
   → Optimized Query: 1

The full-text options were ignored in the equality check.
A new snapshot is online.

Best,
Christian



On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:

Hi,

I have a query which is optimized in a curious way in BaseX 9.0.2 
(yesterday's snapshot).

This is the original query:

xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $string := "string"
let $fuzzy := false()

return (
   collection('ZK')/tei:TEI[
 if (false())
   then (.[descendant::text() contains text {$string} using fuzzy])
   else (.[descendant::text() contains text {$string}])
   ],
   collection('ZK')/tei:TEI[
 if ($fuzzy)
 then (.[descendant::text() contains text {$string} using fuzzy])
 else (.[descendant::text() contains text {$string}])
   ]
)

And this is the optimized one (newlines inserted by me for better 
readability):

(
ft:search("ZK", "string" using language 
'English')/ancestor::tei:TEI[parent::document-node()],
ft:search("ZK", "string" using fuzzy using language 
'English')/ancestor::tei:TEI[parent::document-node()]
)

I'm curious why the second search is using fuzzy, even though the variable 
$fuzzy is false. I presume that query optimization is independent of the data, 
so you won't need the data to reproduce. But if you do, I can provide it. A 
database with enabled full-text index is required obviously.

Best regards,
Sebastian Zimmer

--
Sebastian Zimmer
sebastian.zim...@uni-koeln.de


Cologne Center for eHumanities
DH Center at the University of Cologne
@CCeHum



Re: [basex-talk] Curious query optimization

2018-07-24 Thread Christian Grün
Hi Sebastian,

I guess the query was only rewritten for index access in BaseX 8 because
the two branches of the if/then/else expression were simplified incorrectly
and replaced with one of the branches. To simplify debugging, could you
please reduce your query even further and drop all superfluous expressions
that do not relate to this bug?

Thanks in advance,
Christian



On Tue, Jul 24, 2018 at 11:16 AM Sebastian Zimmer <
sebastian.zim...@uni-koeln.de> wrote:

> Hi again,
>
> I have compared the output of the latest BaseX 9.1 with 8.6.7 and it looks
> like a regression.
>
> Whereas in 8.6.7 the query is optimized twice with ft:search, in 9.1
> this happens only once.
>
> See attached for both outputs.
>
> Best,
> Sebastian
>
> On 23.07.2018 at 11:55, Sebastian Zimmer wrote:
>
> The full-text index of the database is enabled and the compiling info
> section of the query states
>
> - apply full-text index for { $string_0 } using language 'English'
>
> The optimized query looks to me as if the index is applied only once via
> ft:search, but not in both cases.
>
> Best,
> Sebastian
>
> On 23.07.2018 at 11:47, Christian Grün wrote:
>
> Hi Sebastian,
>
> Did you check in the Info View panel if the index is applied? If not,
> you might try something as follows:
>
>   if ($fuzzy) then (
>     collection('ZK')/tei:TEI[... using fuzzy]
>   ) else (
>     collection('ZK')/tei:TEI[...]
>   )
>
> Usually, if full-text options are dynamic, I tend to use ft:search [1].
>
> Best,
> Christian
>
> [1] http://docs.basex.org/wiki/Full-Text_Module#ft:search
>
>
>
>
> On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:
>
> Hi Christian,
>
> thanks for the fix, the result is correct now.
>
> But this query now takes about 18 seconds (!) to execute, instead of
> less than 1 second as before. Do you think this could be sped up?
>
> See attached for the complete console output.
>
> Best,
> Sebastian
>
>
> On 12.07.2018 at 13:03, Christian Grün wrote:
>
> Hi Sebastian,
>
> This has been fixed. The background: In one of the optimizations of
> the "if" expression, identical branches are merged:
>
>   if(..expensive query..) then 1 else 1
>   → Optimized Query: 1
>
> The full-text options were ignored in the equality check.
> A new snapshot is online.
>
> Best,
> Christian
>
>
>
> On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:
>
> Hi,
>
> I have a query which is optimized in a curious way in BaseX 9.0.2 
> (yesterday's snapshot).
>
> This is the original query:
>
> xquery version "3.1";
> declare namespace tei = "http://www.tei-c.org/ns/1.0";
> let $string := "string"
> let $fuzzy := false()
>
> return (
>   collection('ZK')/tei:TEI[
> if (false())
>   then (.[descendant::text() contains text {$string} using fuzzy])
>   else (.[descendant::text() contains text {$string}])
>   ],
>   collection('ZK')/tei:TEI[
> if ($fuzzy)
> then (.[descendant::text() contains text {$string} using fuzzy])
> else (.[descendant::text() contains text {$string}])
>   ]
> )
>
> And this is the optimized one (newlines inserted by me for better 
> readability):
>
> (
> ft:search("ZK", "string" using language 
> 'English')/ancestor::tei:TEI[parent::document-node()],
> ft:search("ZK", "string" using fuzzy using language 
> 'English')/ancestor::tei:TEI[parent::document-node()]
> )
>
> I'm curious why the second search is using fuzzy, even though the variable 
> $fuzzy is false. I presume that query optimization is independent of the 
> data, so you won't need the data to reproduce. But if you do, I can provide 
> it. A database with enabled full-text index is required obviously.
>
> Best regards,
> Sebastian Zimmer
>
> --
> Sebastian Zimmer
> sebastian.zim...@uni-koeln.de
>
> Cologne Center for eHumanities
> DH Center at the University of Cologne
> @CCeHum
>
>


Re: [basex-talk] Curious query optimization

2018-07-24 Thread Sebastian Zimmer

Hi again,

I have compared the output of the latest BaseX 9.1 with 8.6.7 and it 
looks like a regression.


Whereas in 8.6.7 the query is optimized twice with ft:search, in
9.1 this happens only once.


See attached for both outputs.

Best,
Sebastian


On 23.07.2018 at 11:55, Sebastian Zimmer wrote:


The full-text index of the database is enabled and the compiling info 
section of the query states


- apply full-text index for { $string_0 } using language 'English'

The optimized query looks to me as if the index is applied only once 
via ft:search, but not in both cases.


Best,
Sebastian


On 23.07.2018 at 11:47, Christian Grün wrote:

Hi Sebastian,

Did you check in the Info View panel if the index is applied? If not,
you might try something as follows:

   if ($fuzzy) then (
     collection('ZK')/tei:TEI[... using fuzzy]
   ) else (
     collection('ZK')/tei:TEI[...]
   )

Usually, if full-text options are dynamic, I tend to use ft:search [1].

Best,
Christian

[1]http://docs.basex.org/wiki/Full-Text_Module#ft:search




On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:

Hi Christian,

thanks for the fix, the result is correct now.

But this query now takes about 18 seconds (!) to execute, instead of less than
1 second as before. Do you think this could be sped up?

See attached for the complete console output.

Best,
Sebastian


On 12.07.2018 at 13:03, Christian Grün wrote:

Hi Sebastian,

This has been fixed. The background: In one of the optimizations of
the "if" expression, identical branches are merged:

   if(..expensive query..) then 1 else 1
   → Optimized Query: 1

The full-text options were ignored in the equality check.
A new snapshot is online.

Best,
Christian



On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:

Hi,

I have a query which is optimized in a curious way in BaseX 9.0.2 (yesterday's 
snapshot).

This is the original query:

xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $string := "string"
let $fuzzy := false()

return (
   collection('ZK')/tei:TEI[
 if (false())
   then (.[descendant::text() contains text {$string} using fuzzy])
   else (.[descendant::text() contains text {$string}])
   ],
   collection('ZK')/tei:TEI[
 if ($fuzzy)
 then (.[descendant::text() contains text {$string} using fuzzy])
 else (.[descendant::text() contains text {$string}])
   ]
)

And this is the optimized one (newlines inserted by me for better readability):

(
ft:search("ZK", "string" using language 
'English')/ancestor::tei:TEI[parent::document-node()],
ft:search("ZK", "string" using fuzzy using language 
'English')/ancestor::tei:TEI[parent::document-node()]
)

I'm curious why the second search is using fuzzy, even though the variable 
$fuzzy is false. I presume that query optimization is independent of the data, 
so you won't need the data to reproduce. But if you do, I can provide it. A 
database with enabled full-text index is required obviously.

Best regards,
Sebastian Zimmer

--
Sebastian Zimmer
sebastian.zim...@uni-koeln.de

Cologne Center for eHumanities
DH Center at the University of Cologne
@CCeHum



szimmer1@luhmann1:/var/local/basex9.1/webapp$ ../bin/basex -V 
"./predicate_test.xql"   

Query:
xquery version "3.1"; declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $string := "string" let $fuzzy := false() return ( 
collection('BIBL')/tei:TEI[ if (false()) then (.[descendant::text() contains 
text {$string} using fuzzy]) else (.[descendant::text() contains text 
{$string}]) ], collection('BIBL')/tei:TEI[ if ($fuzzy) then 
(.[descendant::text() contains text {$string} using fuzzy]) else 
(.[descendant::text() contains text {$string}]) ] )

Compiling:
- pre-evaluate fn:collection([uri]) to document-node() sequence: 
collection("BIBL") -> (db:open-pre("BIBL", 0), ...)
- rewrite if to iter filter: if(false()) then (.)[descendant::text() ... -> 
(.)[descendant::text() contains text { $...
- rewrite iter filter to ftcontains: (.)[descendant::text() contains text { 
$... -> descendant::text() contains text { $stri...
- rewrite iter filter to ftcontains: (.)[descendant::text() contains text { 
$... -> descendant::text() contains text { $stri...
- apply fu

Re: [basex-talk] Curious query optimization

2018-07-23 Thread Sebastian Zimmer
The full-text index of the database is enabled and the compiling info 
section of the query states


- apply full-text index for { $string_0 } using language 'English'

The optimized query looks to me as if the index is applied only once via 
ft:search, but not in both cases.


Best,
Sebastian


On 23.07.2018 at 11:47, Christian Grün wrote:

Hi Sebastian,

Did you check in the Info View panel if the index is applied? If not,
you might try something as follows:

   if ($fuzzy) then (
     collection('ZK')/tei:TEI[... using fuzzy]
   ) else (
     collection('ZK')/tei:TEI[...]
   )

Usually, if full-text options are dynamic, I tend to use ft:search [1].

Best,
Christian

[1] http://docs.basex.org/wiki/Full-Text_Module#ft:search




On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:

Hi Christian,

thanks for the fix, the result is correct now.

But this query now takes about 18 seconds (!) to execute, instead of less than
1 second as before. Do you think this could be sped up?

See attached for the complete console output.

Best,
Sebastian


On 12.07.2018 at 13:03, Christian Grün wrote:

Hi Sebastian,

This has been fixed. The background: In one of the optimizations of
the "if" expression, identical branches are merged:

   if(..expensive query..) then 1 else 1
   → Optimized Query: 1

The full-text options were ignored in the equality check.
A new snapshot is online.

Best,
Christian



On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:

Hi,

I have a query which is optimized in a curious way in BaseX 9.0.2 (yesterday's 
snapshot).

This is the original query:

xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $string := "string"
let $fuzzy := false()

return (
   collection('ZK')/tei:TEI[
 if (false())
   then (.[descendant::text() contains text {$string} using fuzzy])
   else (.[descendant::text() contains text {$string}])
   ],
   collection('ZK')/tei:TEI[
 if ($fuzzy)
 then (.[descendant::text() contains text {$string} using fuzzy])
 else (.[descendant::text() contains text {$string}])
   ]
)

And this is the optimized one (newlines inserted by me for better readability):

(
ft:search("ZK", "string" using language 
'English')/ancestor::tei:TEI[parent::document-node()],
ft:search("ZK", "string" using fuzzy using language 
'English')/ancestor::tei:TEI[parent::document-node()]
)

I'm curious why the second search is using fuzzy, even though the variable 
$fuzzy is false. I presume that query optimization is independent of the data, 
so you won't need the data to reproduce. But if you do, I can provide it. A 
database with enabled full-text index is required obviously.

Best regards,
Sebastian Zimmer

--
Sebastian Zimmer
sebastian.zim...@uni-koeln.de

Cologne Center for eHumanities
DH Center at the University of Cologne
@CCeHum



Re: [basex-talk] Curious query optimization

2018-07-23 Thread Christian Grün
Hi Sebastian,

Did you check in the Info View panel if the index is applied? If not,
you might try something as follows:

  if ($fuzzy) then (
    collection('ZK')/tei:TEI[... using fuzzy]
  ) else (
    collection('ZK')/tei:TEI[...]
  )

Usually, if full-text options are dynamic, I tend to use ft:search [1].

Best,
Christian

[1] http://docs.basex.org/wiki/Full-Text_Module#ft:search
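
A minimal sketch of the ft:search variant, assuming the options map described in the Full-Text Module documentation [1], in which 'fuzzy' takes a boolean. The database name 'ZK', the tei namespace and the variables are taken from the query quoted below; please check the option names against your BaseX version:

```xquery
xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

let $string := "string"
let $fuzzy := false()
(: the full-text option is passed as data, so no if/then/else is needed :)
let $options := map { 'fuzzy': $fuzzy }
return ft:search('ZK', $string, $options)
  /ancestor::tei:TEI[parent::document-node()]
```

Since ft:search operates on the full-text index directly, this form does not rely on the optimizer rewriting a "contains text" predicate for index access.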




On Mon, Jul 23, 2018 at 11:41 AM Sebastian Zimmer wrote:
>
> Hi Christian,
>
> thanks for the fix, the result is correct now.
>
> But this query now takes about 18 seconds (!) to execute, instead of
> less than 1 second as before. Do you think this could be sped up?
>
> See attached for the complete console output.
>
> Best,
> Sebastian
>
>
> On 12.07.2018 at 13:03, Christian Grün wrote:
>
> Hi Sebastian,
>
> This has been fixed. The background: In one of the optimizations of
> the "if" expression, identical branches are merged:
>
>   if(..expensive query..) then 1 else 1
>   → Optimized Query: 1
>
> The full-text options were ignored in the equality check.
> A new snapshot is online.
>
> Best,
> Christian
>
>
>
> On Wed, Jul 11, 2018 at 1:22 PM Sebastian Zimmer wrote:
>
> Hi,
>
> I have a query which is optimized in a curious way in BaseX 9.0.2 
> (yesterday's snapshot).
>
> This is the original query:
>
> xquery version "3.1";
> declare namespace tei = "http://www.tei-c.org/ns/1.0";;
> let $string := "string"
> let $fuzzy := false()
>
> return (
>   collection('ZK')/tei:TEI[
> if (false())
>   then (.[descendant::text() contains text {$string} using fuzzy])
>   else (.[descendant::text() contains text {$string}])
>   ],
>   collection('ZK')/tei:TEI[
> if ($fuzzy)
> then (.[descendant::text() contains text {$string} using fuzzy])
> else (.[descendant::text() contains text {$string}])
>   ]
> )
>
> And this is the optimized one (newlines inserted by me for better 
> readability):
>
> (
> ft:search("ZK", "string" using language 
> 'English')/ancestor::tei:TEI[parent::document-node()],
> ft:search("ZK", "string" using fuzzy using language 
> 'English')/ancestor::tei:TEI[parent::document-node()]
> )
>
> I'm curious why the second search is using fuzzy, even though the variable 
> $fuzzy is false. I presume that query optimization is independent of the 
> data, so you won't need the data to reproduce. But if you do, I can provide 
> it. A database with enabled full-text index is required obviously.
>
> Best regards,
> Sebastian Zimmer
>
> --
> Sebastian Zimmer
> sebastian.zim...@uni-koeln.de
>
> Cologne Center for eHumanities
> DH Center at the University of Cologne
> @CCeHum
>
>
> --
> Sebastian Zimmer
> sebastian.zim...@uni-koeln.de
>
> Cologne Center for eHumanities
> DH Center at the University of Cologne
> @CCeHum


Re: [basex-talk] Curious query optimization

2018-07-23 Thread Sebastian Zimmer

Hi Christian,

Thanks for the fix; the result is correct now.

But this query now takes about 18 seconds (!) to execute, instead of less
than 1 second as before. Do you think this could be sped up?


See attached for the complete console output.

Best,
Sebastian



--
Sebastian Zimmer
sebastian.zim...@uni-koeln.de

Cologne Center for eHumanities
DH Center at the University of Cologne
@CCeHum



szimmer1@luhmann1:/var/local/basex/webapp$ ../bin/basex -V "./predicate_test.xql"

Query:
xquery version "3.1"; declare namespace tei = "http://www.tei-c.org/ns/1.0"; 
let $string := "string" let $fuzzy := false() return ( 
collection('ZK')/tei:TEI[ if (false()) then (.[descendant::text() contains text 
{$string} using fuzzy]) else (.[descendant::text() contains text {$string}]) ], 
collection('ZK')/tei:TEI[ if ($fuzzy) then (.[descendant::text() contains text 
{$string} using fuzzy]) else (.[descendant::text() contains text {$string}]) ] )

Compiling:
- pre-evaluate fn:collection([uri]) to document-node() sequence: 
collection("ZK") -> (db:open-pre("ZK", 0), ...)
- rewrite if to iter filter: if(false()) then (.)[descendant::text() ... -> 
(.)[descendant::text() contains text { $...
- rewrite iter filter to ftcontains: (.)[descendant::text() contains text { 
$... -> descendant::text() contains text { $stri...
- rewrite iter filter to ftcontains: (.)[descendant::text() contains text { 
$... -> descendant::text() contains text { $stri...
- apply full-text index for { $string_0 } using language 'English'
- pre-evaluate fn:collection([uri]) to document-node() sequence: 
collection("ZK") -> (db:open-pre("ZK", 0), ...)
- inline $string_0
- inline $fuzzy_1
- rewrite if to iter filter: if(false()) then (.)[descendant::text() ... -> 
(.)[descendant::text() contains text "st...
- rewrite iter filter to ftcontains: (.)[descendant::text() contains text 
"st... -> descendant::text() contains text "string...
- rewrite iter filter to ftcontains: (.)[descendant::text() contains text 
"st... -> descendant::text() contains text "string...
- simplify gflwor

Optimized Query:
(ft:search("ZK", "string" using language 
'English')/ancestor::tei:TEI[parent::document-node()], (db:open-pre("ZK", 0), 
...)/tei:TEI[descendant::text() contains text "string" using language 
'English'])

Parsing: 321.15 ms
Compiling: 114.32 ms
Evaluating: 6.49 ms
Printing: 17467.63 ms
Total Time: 17909.59 ms

Hit(s): 0 Items
Updated: 0 Items
Printed: 0 b
Read Locking: ZK
Write Locking: (none)

Query "predicate_test.xql" executed in 17909.59 ms.
szimmer1@luhmann1:/var/local/basex/webapp$


Re: [basex-talk] Curious query optimization

2018-07-12 Thread Christian Grün
Hi Sebastian,

This has been fixed. The background: In one of the optimizations of
the "if" expression, identical branches are merged:

  if(..expensive query..) then 1 else 1
  → Optimized Query: 1

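The full-text options were ignored in the equality check, so the two
branches of your query, which differ only in "using fuzzy", were
treated as identical and merged.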
A new snapshot is online.
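
To illustrate why two such branches must not be merged, here is a small sketch that needs no database. The sample strings are made up for illustration, and "using fuzzy" is the BaseX-specific match option from the original query:

```xquery
xquery version "3.1";

(: Both expressions are identical except for the full-text option,
   yet they can evaluate to different results :)
let $text := 'strings'
return (
  $text contains text 'string',
  $text contains text 'string' using fuzzy
)
```

An equality check that compares only the search terms and the input would consider both expressions equal, which is what led to the wrong rewrite.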

Best,
Christian





[basex-talk] Curious query optimization

2018-07-11 Thread Sebastian Zimmer

Hi,

I have a query which is optimized in a curious way in BaseX 9.0.2 
(yesterday's snapshot).


This is the original query:

xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
let $string := "string"
let $fuzzy := false()

return (
  collection('ZK')/tei:TEI[
    if (false())
      then (.[descendant::text() contains text {$string} using fuzzy])
      else (.[descendant::text() contains text {$string}])
  ],
  collection('ZK')/tei:TEI[
    if ($fuzzy)
    then (.[descendant::text() contains text {$string} using fuzzy])
    else (.[descendant::text() contains text {$string}])
  ]
)

And this is the optimized one (newlines inserted by me for better 
readability):


(
  ft:search("ZK", "string" using language 'English')/ancestor::tei:TEI[parent::document-node()],
  ft:search("ZK", "string" using fuzzy using language 'English')/ancestor::tei:TEI[parent::document-node()]
)

I'm curious why the second search uses fuzzy matching even though the
variable $fuzzy is false. I presume that query optimization is
independent of the data, so you won't need the data to reproduce this;
if you do, I can provide it. A database with the full-text index
enabled is obviously required.


Best regards,
Sebastian Zimmer

--
Sebastian Zimmer
sebastian.zim...@uni-koeln.de 

Cologne Center for eHumanities 
DH Center at the University of Cologne
@CCeHum