Re: [basex-talk] Performance of ft:search function

2022-04-29 Thread Christian Grün
Exactly: The longer you run a BaseX instance, the faster it gets. That’s
particularly noticeable when using the client/server or HTTP architecture.

There are various reasons for that: BaseX caches, OS & main-memory caching,
JIT optimizations, …
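
A rough way to watch this warm-up effect from within a single query, sketched under assumptions: the database name 'DatabaseName' and the search term 'term' are placeholders, the database needs a full-text index, and prof:time is the BaseX Profiling Module function that writes the measured time to the Info View (GUI) or to standard error.

```xquery
(: Hypothetical database name and search term; requires a full-text index. :)
for $run in 1 to 5
return prof:time(
  (: count the hits so that each run only returns a single number :)
  count(ft:search('DatabaseName', 'term')),
  'run ' || $run || ': '
)
```

The first runs would be expected to be the slowest, but the optimizer may move work around, so the numbers are indicative only.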





Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Christian Grün
That's a good hint: Some optimizations still need to be tweaked to support
namespaces [1]. You’ll be safe if you include the explicit text step.

[1] https://github.com/BaseXdb/basex/issues/1763
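
For the reproduction case from this thread, the variant with the explicit text step might look as follows. This is only a sketch that reuses the database name and paths quoted in the thread.

```xquery
(: Same query as in the report, with text() added to the comparison.
   Per the advice above, this should allow the text index to be used
   even though the document declares a namespace. :)
for $x in db:open('ElementsTest')/data/element
where $x/id/text() = '1'
return $x/id
```

The optimized query in the Info View can be used to check whether the rewrite actually took place.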






Re: [basex-talk] Performance of ft:search function

2022-04-29 Thread Tim Thompson
Oh, I see--thanks for the tip; I wasn't aware of the SET RUNS feature,
which is really helpful! With 1000 runs, the average execution time is more
in line with expectations: 38.96ms for expression #1 and 12.44ms for #2.
But I notice that with successive executions, #1 gets faster: 38.96ms,
17.73ms, 12.82ms. Is this a result of caching?

Best,
Tim


-- 
Tim A. Thompson (he, him)
Librarian for Applied Metadata Research
Yale University Library



On Wed, Apr 27, 2022 at 5:09 PM Christian Grün 
wrote:

>> 2. Direct lookup against subindex
>> Time: 3.3ms
>> Expression: ft:search($index, $text)/../..
>>
>> 3. Lookup against subindex file with reference to large index
>> Time: 2.9ms
>> Expression:
>> let $s :=
>>   ft:search($index, $text)/../..
>> return db:open-id($db, $s/id)/../..
>>
>> My question is: why would the third expression be slightly faster (or at
>> least not slower) than the second one, if it involves additional
>> computation?
>>
>
> I assume it's due to slight variations during your measurements. How many
> items will be returned by ft:search? Do you get the same runtime if you run
> the code 100 or 1000 times?
>
> In the GUI, you can type and execute SET RUNS 100 in the top input bar (in
> command mode). Your query will then be executed multiple times, and you
> will get shown the average runtime in the Info View.
>
>
>
>
>


Re: [basex-talk] xsl:transform-report message truncation

2022-04-29 Thread Andy Bunce
Ah, yes, I see now. I never noticed this before. Looks good now that I realise
what is happening.
/Andy



Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Matthew Dziuban
While trying to come up with a simple example to reproduce it, I
rediscovered that the top-level <data> element specifies an XML namespace.
Apologies, I failed to mention that initially. Would that affect whether the
index is used or not?

I'm able to reproduce by loading this data into a new database named
ElementsTest:

<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <element><id>1</id></element>
</data>

And then running this query:

for $x in db:open('ElementsTest')/data/element
where $x/id = '1'
return $x/id

The GUI shows the following as the optimized query:

db:open-pre("ElementsTest", 0)/data/element[(id = "1")]/id


Re: [basex-talk] xsl:transform-report message truncation

2022-04-29 Thread Christian Grün
Hi Andy,

It’s the BaseX standard serializer that truncates maps and arrays.
Some more examples:

[ string-join(1 to 1000) ],
map { 1: string-join(1 to 1000) }

You can get the full string by attaching a ?* lookup step to your query.
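
A minimal sketch of that lookup step, applied to the two examples above. It uses only standard XQuery 3.1; the variable names are illustrative.

```xquery
(: The two examples from above, with a ?* lookup step attached.
   ?* returns the members of an array or the values of a map,
   so the long string is serialized as a plain item. :)
let $array := [ string-join(1 to 1000) ]
let $map := map { 1: string-join(1 to 1000) }
return (
  $array?*,
  $map?*
)
```

For the query in the original post, the step would presumably be appended after the existing lookup, as in `?messages?*`. This assumes that the messages entry is an array, which the bracketed output suggests.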

Maybe we can remove the truncation of values in function items; I’ll
give that some more thought.

Thanks,
Christian


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Christian Grün
> Thanks for the quick response! That query returns the following:

Interesting; all elements seem to have a single text node. Hm. Can you provide
us with a self-contained example?

> Out of curiosity, is there a way to see index utilization through the DBA
> web app or via the ClientSession java class [1] instead of the GUI? I'm
> using the client/server architecture so mainly run queries these ways.

With the ClientSession class, you should be able to get the query info by
enabling the QUERYINFO option [1]; you can then request it via the info()
method. In the DBA, there’s currently no such option.

[1] https://docs.basex.org/wiki/Options#QUERYINFO


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Matthew Dziuban
Hi Christian,

Thanks for the quick response! That query returns the following:


  


Out of curiosity, is there a way to see index utilization through the DBA
web app or via the ClientSession java class [1] instead of the GUI? I'm
using the client/server architecture so mainly run queries these ways.

Best,
Matt



Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Christian Grün
Hi Matthew,

If you run your query on the following document …

<data>
  <element><id>123</id></element>
  <element><id>456</id></element>
</data>

… and if you look into the Info View in the GUI, you will notice that
the index will be utilized:

Optimized Query:
db:text("data", "DatabaseName")/parent::id/parent::element

The query optimizer detects that all “data/element/id” elements are
leaf elements (i.e., have a single text child node), and the query
will be rewritten for index access.
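
What such a rewrite amounts to can also be written by hand. The following sketch assumes the database name 'DatabaseName' and the search string '123' from the original post, and calls the text index explicitly via db:text:

```xquery
(: Explicit text-index lookup: db:text returns the matching text nodes,
   from which we walk up to the enclosing id and element nodes. :)
for $text in db:text('DatabaseName', '123')
return $text/parent::id/parent::element
```

Unlike the original query, this form relies on the text index being present and does not wait for the optimizer to detect leaf elements.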

Maybe there are “id” elements in your document that are not leaf
elements? Could you share the result of the following query with us?

index:facets('data')/*/element[@name='data']/element[@name='element']/element[@name='id']

Best,
Christian


[basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Matthew Dziuban
Hi all,

I was recently debugging the performance of a query with an exact string
comparison and discovered that the query seems to be rewritten to use
the text index [1] only if I explicitly add `/text()` to the path I am
comparing.

My data looks like this:

<data>
  <element><id>123</id></element>
</data>

And my original query was:

for $el in db:open('DatabaseName')/data/element
where $el/id = '123'
return $el

With 3 million <element> nodes in the database, this query took about 4
seconds, which made me question whether the text index was being used. I
then changed the query to add `/text()` to the `where` clause, like so:

for $el in db:open('DatabaseName')/data/element
where $el/id/text() = '123'
return $el

With this change, the query only takes 0.4 seconds. Is it expected that
`/text()` is required to get the text index to kick in?

Thanks in advance,
Matt

[1] https://docs.basex.org/wiki/Indexes#Text_Index


[basex-talk] xsl:transform-report message truncation

2022-04-29 Thread Andy Bunce
Hi,
Using 9.7.1
(: test transform :)
let $xslt:=http://www.w3.org/1999/XSL/Transform;
   version="3.0">

I want to see all of the very long message
a
a bbb  

gg  gg
aaa important
bit


return xslt:transform-report(,$xslt)?messages

Returns
["I want to see all of the very long message a
 a bbb  
gg  gg
aaa..."]

Is it BaseX truncating this? Can it be turned off for this case?

/Andy


Re: [basex-talk] Date picture and xslt:transform()

2022-04-29 Thread Zimmel, Daniel
I get "Saxon HE", but that's what I already know ;-)

Yes, it works with Saxon EE; I only wonder why it does not with Saxon HE.
I will ask over at Saxonica. I already found a bug report that might be
related, and there is nothing in the documentation that says it should not work
with HE as well.

Thanks!



Re: [basex-talk] Date picture and xslt:transform()

2022-04-29 Thread Christian Grün
Hi Daniel,

What do you get if you invoke xslt:processor() ?

If it’s "Saxon EE", you should get "29. März 2022" as
result of your query (at least that’s what I get). If it’s something
else, it indicates that Saxon EE has not correctly been embedded in
your Java classpath (see [1] for further information).

If it’s only about formatting a date, you can also run your function
call within BaseX …

format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())

… but I guess that’s what you already know.

Best,
Christian

[1] https://docs.basex.org/wiki/XSLT_Module
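
Both checks can be combined in one query. This is a small sketch that uses only the two function calls mentioned above:

```xquery
(: Name of the XSLT processor that BaseX has picked up, followed by
   the date formatted natively by BaseX, without any XSLT processor. :)
xslt:processor(),
format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())
```

If the first item is not the expected Saxon edition, the classpath setup is the first thing to look at.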





Re: [basex-talk] Date picture and xslt:transform()

2022-04-29 Thread Zimmel, Daniel
... for clarification, I can reproduce the behavior with stand-alone Saxon HE 
10.6 (but not in PE and EE).
Perhaps this needs to be addressed to Saxon then, since the documentation says 
"available in all editions"... still happy if anybody does have some helpful 
insights here.

Daniel




[basex-talk] Date picture and xslt:transform()

2022-04-29 Thread Zimmel, Daniel
Hi,

why do I get different results with the following two queries?
xslt:transform() does not respect my date picture.

Expected result: 

29. März 2022

Query 1:

{format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())}

Result: 
29. März 2022

Query 2:

declare namespace xsl = 'http://www.w3.org/1999/XSL/Transform';
let $xslt := http://www.w3.org/2001/XMLSchema; 
  exclude-result-prefixes="xs">
  

  



let $xml := 

return
  for $xml in $xml
  return
    $xml => xslt:transform($xslt)

Result: 
[Language: en]29. March 2022


Running the XSLT with Saxon EE (not in BaseX via xslt:transform) returns 
(correctly): 

29. März 2022

Using BaseX 9.5

?

Daniel
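
The stylesheet markup of Query 2 did not survive in the archive. A self-contained sketch of what such a test might look like is given below; the version attribute, the template, the <date> wrapper and the <dummy/> input are all assumptions and not the original code, and the query needs a Saxon processor on the classpath.

```xquery
(: Hypothetical reconstruction: only the namespaces, the
   exclude-result-prefixes attribute and the format-date call
   are taken from the post; everything else is assumed. :)
let $xslt :=
  <xsl:stylesheet version="3.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="xs">
    <xsl:template match="/">
      <date>
        <xsl:value-of select="format-date(xs:date('2022-03-29'),
          '[D]. [MNn] [Y]', 'de', (), ())"/>
      </date>
    </xsl:template>
  </xsl:stylesheet>
let $xml := <dummy/>
return $xml => xslt:transform($xslt)
```

Running it once per Saxon edition and comparing the output with the native format-date call of Query 1 would show whether the language argument is honoured.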