Re: [basex-talk] Text index requires `/text()` in query

2022-05-02 Thread Matthew Dziuban
Good to know -- thanks for the help, Christian!


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Christian Grün
That's a good hint: Some optimizations still need to be tweaked to support
namespaces [1]. You’ll be safe if you include the explicit text step.

[1] https://github.com/BaseXdb/basex/issues/1763




Matthew Dziuban  schrieb am Fr., 29. Apr. 2022, 21:00:

> As I was trying to come up with a simple example to reproduce it I
> rediscovered that the top-level  element specifies an XML namespace,
> apologies I failed to mention that initially. Would that affect whether the
> index is used or not?
>
> I'm able to reproduce by loading this data into a new database named
> ElementsTest:
>
> http://www.w3.org/2001/XMLSchema-instance;>
>   1
> 
>
> And then running this query:
>
> for $x in db:open('ElementsTest')/data/element
> where $x/id = '1'
> return $x/id
>
> The GUI shows the following as the optimized query:
>
> db:open-pre("ElementsTest", 0)/data/element[(id = "1")]/id
>


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Matthew Dziuban
As I was trying to come up with a simple example to reproduce it I
rediscovered that the top-level  element specifies an XML namespace,
apologies I failed to mention that initially. Would that affect whether the
index is used or not?

I'm able to reproduce by loading this data into a new database named
ElementsTest:

http://www.w3.org/2001/XMLSchema-instance;>
  1


And then running this query:

for $x in db:open('ElementsTest')/data/element
where $x/id = '1'
return $x/id

The GUI shows the following as the optimized query:

db:open-pre("ElementsTest", 0)/data/element[(id = "1")]/id


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Christian Grün
> Thanks for the quick response! That query returns the following:

Interesting; all elements seem to have a single text node. Hm. Can you provide
us with a self-contained example?

> Out of curiosity, is there a way to see index utilization through the DBA
web app or via the ClientSession java class [1] instead of the GUI? I'm
using the client/server architecture so mainly run queries these ways.

With the ClientSession class, it should be possible to enable the query
info by enabling the QUERYINFO option [1] (which you can then request via
the info() method). In the DBA, there’s currently no such option.

[1] https://docs.basex.org/wiki/Options#QUERYINFO


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Matthew Dziuban
Hi Christian,

Thanks for the quick response! That query returns the following:


  


Out of curiosity, is there a way to see index utilization through the DBA
web app or via the ClientSession java class [1] instead of the GUI? I'm
using the client/server architecture so mainly run queries these ways.

Best,
Matt

On Fri, Apr 29, 2022 at 1:52 PM Christian Grün 
wrote:

> Hi Matthew,
>
> If you run your query on the following document …
>
> 
>   123
>   456
> 
>
> … and if you look into the Info View in the GUI, you will notice that
> the index will be utilized:
>
> Optimized Query:
> db:text("data", "DatabaseName")/parent::id/parent::element
>
> The query optimizer detects that all “data/element/id” elements are
> leaf elements (i.e., have a single text child node), and the resulting
> query will be rewritten for index.
>
> Maybe there are “id” elements in your document that are no leaf
> elements? Could you share the result of the following query with us?
>
>
> index:facets('data')/*/element[@name='data']/element[@name='element']/element[@name='id']
>
> Best,
> Christian
>


Re: [basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Christian Grün
Hi Matthew,

If you run your query on the following document …


  123
  456


… and if you look into the Info View in the GUI, you will notice that
the index will be utilized:

Optimized Query:
db:text("data", "DatabaseName")/parent::id/parent::element

The query optimizer detects that all “data/element/id” elements are
leaf elements (i.e., have a single text child node), and the resulting
query will be rewritten for index.

Maybe there are “id” elements in your document that are no leaf
elements? Could you share the result of the following query with us?

index:facets('data')/*/element[@name='data']/element[@name='element']/element[@name='id']

Best,
Christian


[basex-talk] Text index requires `/text()` in query

2022-04-29 Thread Matthew Dziuban
Hi all,

I was recently debugging performance of a query with an exact string
comparison and discovered that it seems the query was only rewritten to use
the text index [1] if I explicitly added `/text()` to the path I was
comparing.

My data looks like this:


  123


And my original query was:

for $el in db:open('DatabaseName')/data/element
where $el/id = '123'
return $el

With 3 million  nodes in the database, this query took about 4
seconds, which made me question whether the text index was being used. I
then changed the query to add `/text()` to the `where` clause, like so:

for $el in db:open('DatabaseName')/data/element
where $el/id/text() = '123'
return $el

With this change, the query only takes 0.4 seconds. Is it expected that
`/text()` is required to get the text index to kick in?

Thanks in advance,
Matt

[1] https://docs.basex.org/wiki/Indexes#Text_Index