Re: [basex-talk] Performance of ft:search function
Exactly: The longer you run a BaseX instance, the faster it gets. That’s particularly noticeable when using the client/server or HTTP architecture. There are various reasons for that: BaseX caches, OS & main-memory caching, JIT optimizations, …

Tim Thompson wrote on Fri, 29 Apr 2022, 22:40:

> Oh, I see--thanks for the tip; I wasn't aware of the SET RUNS feature,
> which is really helpful! With 1000 runs, the average execution time is more
> in line with expectations: 38.96ms for expression #1 and 12.44ms for #2.
> But I notice that with successive executions, #1 gets faster: 38.96ms,
> 17.73ms, 12.82ms. Is this a result of caching?
>
> Best,
> Tim
>
> --
> Tim A. Thompson (he, him)
> Librarian for Applied Metadata Research
> Yale University Library
>
> On Wed, Apr 27, 2022 at 5:09 PM Christian Grün wrote:
>
>>> 2. Direct lookup against subindex
>>> Time: 3.3ms
>>> Expression: ft:search($index, $text)/../..
>>>
>>> 3. Lookup against subindex file with reference to large index
>>> Time: 2.9ms
>>> Expression:
>>> let $s :=
>>> ft:search($index, $text)/../..
>>> return db:open-id($db, $s/id)/../..
>>>
>>> My question is: why would the third expression be slightly faster (or at
>>> least not slower) than the second one, if it involves additional
>>> computation?
>>
>> I assume it's due to slight variations during your measurements. How many
>> items will be returned by ft:search? Do you get the same runtime if you run
>> the code 100 or 1000 times?
>>
>> In the GUI, you can type and execute SET RUNS 100 in the top input bar
>> (in command mode). Your query will then be executed multiple times, and you
>> will get shown the average runtime in the Info View.
Re: [basex-talk] Text index requires `/text()` in query
That's a good hint: Some optimizations still need to be tweaked to support namespaces [1]. You’ll be safe if you include the explicit text step.

[1] https://github.com/BaseXdb/basex/issues/1763

Matthew Dziuban wrote on Fri, 29 Apr 2022, 21:00:

> As I was trying to come up with a simple example to reproduce it I
> rediscovered that the top-level element specifies an XML namespace,
> apologies I failed to mention that initially. Would that affect whether the
> index is used or not?
>
> I'm able to reproduce by loading this data into a new database named
> ElementsTest:
>
> http://www.w3.org/2001/XMLSchema-instance;>
> 1
>
> And then running this query:
>
> for $x in db:open('ElementsTest')/data/element
> where $x/id = '1'
> return $x/id
>
> The GUI shows the following as the optimized query:
>
> db:open-pre("ElementsTest", 0)/data/element[(id = "1")]/id
Re: [basex-talk] Performance of ft:search function
Oh, I see--thanks for the tip; I wasn't aware of the SET RUNS feature, which is really helpful! With 1000 runs, the average execution time is more in line with expectations: 38.96ms for expression #1 and 12.44ms for #2. But I notice that with successive executions, #1 gets faster: 38.96ms, 17.73ms, 12.82ms. Is this a result of caching?

Best,
Tim

--
Tim A. Thompson (he, him)
Librarian for Applied Metadata Research
Yale University Library

On Wed, Apr 27, 2022 at 5:09 PM Christian Grün wrote:

>> 2. Direct lookup against subindex
>> Time: 3.3ms
>> Expression: ft:search($index, $text)/../..
>>
>> 3. Lookup against subindex file with reference to large index
>> Time: 2.9ms
>> Expression:
>> let $s :=
>> ft:search($index, $text)/../..
>> return db:open-id($db, $s/id)/../..
>>
>> My question is: why would the third expression be slightly faster (or at
>> least not slower) than the second one, if it involves additional
>> computation?
>
> I assume it's due to slight variations during your measurements. How many
> items will be returned by ft:search? Do you get the same runtime if you run
> the code 100 or 1000 times?
>
> In the GUI, you can type and execute SET RUNS 100 in the top input bar (in
> command mode). Your query will then be executed multiple times, and you
> will get shown the average runtime in the Info View.
Re: [basex-talk] xsl:transform-report message truncation
Ah, yes I see now. I never noticed this before. Looks good now I realise what is happening.

/Andy

On Fri, 29 Apr 2022 at 19:11, Christian Grün wrote:

> Hi Andy,
>
> It’s the BaseX standard serializer that truncates maps and arrays.
> Some more examples:
>
> [ string-join(1 to 1000) ],
> map { 1: string-join(1 to 1000) }
>
> You can get the full string by attaching a ?* lookup step to your query.
>
> Maybe we can remove the truncation of values in function items; I’ll
> have some more thoughts on that.
>
> Thanks,
> Christian
Re: [basex-talk] Text index requires `/text()` in query
As I was trying to come up with a simple example to reproduce it I rediscovered that the top-level element specifies an XML namespace, apologies I failed to mention that initially. Would that affect whether the index is used or not?

I'm able to reproduce by loading this data into a new database named ElementsTest:

http://www.w3.org/2001/XMLSchema-instance;>
1

And then running this query:

for $x in db:open('ElementsTest')/data/element
where $x/id = '1'
return $x/id

The GUI shows the following as the optimized query:

db:open-pre("ElementsTest", 0)/data/element[(id = "1")]/id
Re: [basex-talk] xsl:transform-report message truncation
Hi Andy,

It’s the BaseX standard serializer that truncates maps and arrays. Some more examples:

[ string-join(1 to 1000) ],
map { 1: string-join(1 to 1000) }

You can get the full string by attaching a ?* lookup step to your query.

Maybe we can remove the truncation of values in function items; I’ll have some more thoughts on that.

Thanks,
Christian
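To illustrate the lookup step, here is a minimal sketch that reuses the two examples from this message. Appending ?* returns the array members or the map values, so the serializer receives plain strings instead of an array or map; that the strings are then printed in full is taken from the explanation above and has not been checked against every BaseX version.

```xquery
(: array: ?* returns all members, here a single long string :)
[ string-join(1 to 1000) ]?*,
(: map: ?* returns the values of all entries :)
map { 1: string-join(1 to 1000) }?*
```

Applied to the query from the original question, the same step would presumably be attached after the messages lookup, i.e. ?messages?*.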
Re: [basex-talk] Text index requires `/text()` in query
> Thanks for the quick response! That query returns the following:

Interesting; all elements seem to have a single text node. Hm. Can you provide us with a self-contained example?

> Out of curiosity, is there a way to see index utilization through the DBA web app or via the ClientSession java class [1] instead of the GUI? I'm using the client/server architecture so mainly run queries these ways.

With the ClientSession class, it should be possible to retrieve the query info by enabling the QUERYINFO option [1]; you can then request it via the info() method. In the DBA, there’s currently no such option.

[1] https://docs.basex.org/wiki/Options#QUERYINFO
Re: [basex-talk] Text index requires `/text()` in query
Hi Christian,

Thanks for the quick response! That query returns the following:

Out of curiosity, is there a way to see index utilization through the DBA web app or via the ClientSession java class [1] instead of the GUI? I'm using the client/server architecture so mainly run queries these ways.

Best,
Matt

On Fri, Apr 29, 2022 at 1:52 PM Christian Grün wrote:

> Hi Matthew,
>
> If you run your query on the following document …
>
> 123
> 456
>
> … and if you look into the Info View in the GUI, you will notice that
> the index will be utilized:
>
> Optimized Query:
> db:text("data", "DatabaseName")/parent::id/parent::element
>
> The query optimizer detects that all “data/element/id” elements are
> leaf elements (i.e., have a single text child node), and the resulting
> query will be rewritten for index.
>
> Maybe there are “id” elements in your document that are no leaf
> elements? Could you share the result of the following query with us?
>
> index:facets('data')/*/element[@name='data']/element[@name='element']/element[@name='id']
>
> Best,
> Christian
Re: [basex-talk] Text index requires `/text()` in query
Hi Matthew,

If you run your query on the following document …

123
456

… and if you look into the Info View in the GUI, you will notice that the index will be utilized:

Optimized Query:
db:text("data", "DatabaseName")/parent::id/parent::element

The query optimizer detects that all “data/element/id” elements are leaf elements (i.e., have a single text child node), and the query will be rewritten for index access.

Maybe there are “id” elements in your document that are not leaf elements? Could you share the result of the following query with us?

index:facets('data')/*/element[@name='data']/element[@name='element']/element[@name='id']

Best,
Christian
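The rewritten form can also be written by hand. This is a minimal sketch, assuming a database named DatabaseName with an available text index, plus the element names and the search value '123' from the original question:

```xquery
(: explicit text index lookup: find text nodes equal to '123',
   then step up from the text node to its id and element ancestors :)
db:text('DatabaseName', '123')/parent::id/parent::element
```

The explicit call depends on the text index being available, so the path-based query remains the more portable form; the sketch is mainly useful for checking what the index itself returns.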
[basex-talk] Text index requires `/text()` in query
Hi all,

I was recently debugging performance of a query with an exact string comparison and discovered that it seems the query was only rewritten to use the text index [1] if I explicitly added `/text()` to the path I was comparing. My data looks like this:

123

And my original query was:

for $el in db:open('DatabaseName')/data/element
where $el/id = '123'
return $el

With 3 million nodes in the database, this query took about 4 seconds, which made me question whether the text index was being used. I then changed the query to add `/text()` to the `where` clause, like so:

for $el in db:open('DatabaseName')/data/element
where $el/id/text() = '123'
return $el

With this change, the query only takes 0.4 seconds. Is it expected that `/text()` is required to get the text index to kick in?

Thanks in advance,
Matt

[1] https://docs.basex.org/wiki/Indexes#Text_Index
[basex-talk] xsl:transform-report message truncation
Hi,

Using 9.7.1

(: test transform :)
let $xslt:=http://www.w3.org/1999/XSL/Transform; version="3.0">
I want to see all of the very long message a a bbb gg gg aaa important bit
return xslt:transform-report(,$xslt)?messages

Returns

["I want to see all of the very long message a a bbb gg gg aaa..."]

Is it BaseX truncating this? Can it be turned off for this case?

/Andy
Re: [basex-talk] Date picture and xslt:transform()
I get "Saxon HE", but that's what I already know ;-)

Yes it works with Saxon EE, I only wonder why it does not with Saxon HE. I will ask over at Saxonica, already found a bug report that might be related... there is nothing in the documentation that says it should not work with HE as well.

Thanks!

-----Original Message-----
From: Christian Grün
Sent: Friday, 29 April 2022 10:29
To: Zimmel, Daniel
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Date picture and xslt:transform()

Hi Daniel,

What do you get if you invoke xslt:processor() ?

If it’s "Saxon EE", you should get "29. März 2022" as result of your query (at least that’s what I get). If it’s something else, it indicates that Saxon EE has not correctly been embedded in your Java classpath (see [1] for further information).

If it’s only about formatting date, you can also run your function call within BaseX …

format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())

… but I guess that’s what you already know.

Best,
Christian

[1] https://docs.basex.org/wiki/XSLT_Module

On Fri, Apr 29, 2022 at 10:05 AM Zimmel, Daniel wrote:
>
> Hi,
>
> why do I get different results with the following two queries?
> xslt:transform() does not respect my date picture.
>
> Expected result:
>
> 29. März 2022
>
> Query 1:
>
> {format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]',
> 'de', (), ())}
>
> Result:
> 29. März 2022
>
> Query 2:
>
> declare namespace xsl = 'http://www.w3.org/1999/XSL/Transform';
> let $xslt :=xmlns:xs="http://www.w3.org/2001/XMLSchema;
> exclude-result-prefixes="xs">
> let $xml :=
> return
> for $xml in $xml
> return
> $xml => xslt:transform($xslt)
>
> Result:
> [Language: en]29. March 2022
>
> Running the XSLT with Saxon EE (not in BaseX via xslt:transform) returns
> (correctly):
>
> 29. März 2022
>
> Using BaseX 9.5
>
> ?
>
> Daniel
Re: [basex-talk] Date picture and xslt:transform()
Hi Daniel,

What do you get if you invoke xslt:processor() ?

If it’s "Saxon EE", you should get "29. März 2022" as result of your query (at least that’s what I get). If it’s something else, it indicates that Saxon EE has not correctly been embedded in your Java classpath (see [1] for further information).

If it’s only about formatting date, you can also run your function call within BaseX …

format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())

… but I guess that’s what you already know.

Best,
Christian

[1] https://docs.basex.org/wiki/XSLT_Module

On Fri, Apr 29, 2022 at 10:05 AM Zimmel, Daniel wrote:
>
> Hi,
>
> why do I get different results with the following two queries?
> xslt:transform() does not respect my date picture.
>
> Expected result:
>
> 29. März 2022
>
> Query 1:
>
> {format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (),
> ())}
>
> Result:
> 29. März 2022
>
> Query 2:
>
> declare namespace xsl = 'http://www.w3.org/1999/XSL/Transform';
> let $xslt :=xmlns:xs="http://www.w3.org/2001/XMLSchema;
> exclude-result-prefixes="xs">
> let $xml :=
> return
> for $xml in $xml
> return
> $xml => xslt:transform($xslt)
>
> Result:
> [Language: en]29. March 2022
>
> Running the XSLT with Saxon EE (not in BaseX via xslt:transform) returns
> (correctly):
>
> 29. März 2022
>
> Using BaseX 9.5
>
> ?
>
> Daniel
Re: [basex-talk] Date picture and xslt:transform()
... for clarification, I can reproduce the behavior with stand-alone Saxon HE 10.6 (but not in PE and EE). Perhaps this needs to be addressed to Saxon then, since the documentation says "available in all editions"... still happy if anybody does have some helpful insights here.

Daniel

-----Original Message-----
From: Zimmel, Daniel <>
Sent: Friday, 29 April 2022 10:05
To: basex-talk@mailman.uni-konstanz.de
Subject: Date picture and xslt:transform()

Hi,

why do I get different results with the following two queries? xslt:transform() does not respect my date picture.

Expected result:

29. März 2022

Query 1:

{format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())}

Result:
29. März 2022

Query 2:

declare namespace xsl = 'http://www.w3.org/1999/XSL/Transform';
let $xslt := http://www.w3.org/2001/XMLSchema; exclude-result-prefixes="xs">
let $xml :=
return
for $xml in $xml
return
$xml => xslt:transform($xslt)

Result:
[Language: en]29. March 2022

Running the XSLT with Saxon EE (not in BaseX via xslt:transform) returns (correctly):

29. März 2022

Using BaseX 9.5

?

Daniel
[basex-talk] Date picture and xslt:transform()
Hi,

why do I get different results with the following two queries? xslt:transform() does not respect my date picture.

Expected result:

29. März 2022

Query 1:

{format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())}

Result:
29. März 2022

Query 2:

declare namespace xsl = 'http://www.w3.org/1999/XSL/Transform';
let $xslt := http://www.w3.org/2001/XMLSchema; exclude-result-prefixes="xs">
let $xml :=
return
for $xml in $xml
return
$xml => xslt:transform($xslt)

Result:
[Language: en]29. March 2022

Running the XSLT with Saxon EE (not in BaseX via xslt:transform) returns (correctly):

29. März 2022

Using BaseX 9.5

?

Daniel
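The stylesheet markup of Query 2 did not survive in the archive. The following is a hypothetical minimal query of the same kind, not the original: the template, the result element name and the dummy input element are assumptions made for illustration. It calls format-date with the language argument 'de' inside XSLT and also reports the active processor, and it assumes that an XSLT 3.0 capable processor such as Saxon is on the classpath.

```xquery
declare namespace xsl = 'http://www.w3.org/1999/XSL/Transform';

(: hypothetical stylesheet: formats a fixed date with language 'de' :)
let $xslt :=
  <xsl:stylesheet version="3.0"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="xs">
    <xsl:template match="/">
      <result>
        <xsl:value-of select="format-date(xs:date('2022-03-29'), '[D]. [MNn] [Y]', 'de', (), ())"/>
      </result>
    </xsl:template>
  </xsl:stylesheet>
(: hypothetical dummy input; the template ignores its content :)
let $xml := <dummy/>
return (
  (: name of the processor BaseX delegates to :)
  xslt:processor(),
  $xml => xslt:transform($xslt)
)
```

Running the same query against different Saxon editions would show whether the language argument is honoured by the processor that xslt:processor() reports.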