[basex-talk] Executing optimized XQuery in RestXQ without having to deal with global lock situations

2022-05-02 Thread Omar Siam
What I like about BaseX is that it is very good at optimizing 
self-contained queries about the size a user can read and understand [1] 
[2] and that it has a DB locking system for transaction management [3] 
that is robust and easy to understand.


What I don’t like so much about BaseX is that these two mechanisms don’t 
work very well with complex code that is split into various modules. I 
use modules for code that may be shared among projects or just as a 
means of grouping common concerns in one module.
That I don’t like this behaviour does not mean I know (or have any hope) 
that this can be solved in a better way without making unpleasant 
sacrifices elsewhere. It is just the setting I have to deal with.


When BaseX can no longer determine which DBs are used in a query and 
which are not, it falls back to assuming there are no indexes, so 
automatic optimization in this regard stops, and it assumes that all 
DBs known to BaseX are used in that query, so it acquires a global 
lock. [4]


For read-only queries this is not much of a problem. Using indexes in 
queries can be forced with functions or with the db:enforceindex 
pragma [5].
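
A minimal sketch of both ways of forcing index use, assuming two hypothetical databases named dict_a and dict_b that each have an up-to-date attribute index, and entries identified by @xml:id. The names and the ID value are placeholders, not taken from the actual application. db:open is the BaseX 9 function name.

```xquery
(
  (: Pragma: tells the optimizer to assume that indexes exist, even though
     the database name is only known once the loop runs. :)
  (# db:enforceindex #) {
    for $db in ('dict_a', 'dict_b')  (: hypothetical database names :)
    return db:open($db)//*[@xml:id = 'entry_001']
  },

  (: Function: look the value up in the attribute index directly, keep only
     xml:id attributes, and step up to the elements that carry them. :)
  for $db in ('dict_a', 'dict_b')
  return db:attribute($db, 'entry_001')[name() = 'xml:id']/..
)
```

Both parts should return the same elements. With the pragma the query stays readable, but results will be wrong or incomplete if an index is missing or outdated, so this only makes sense when the indexes are known to be maintained.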


Problems start showing when trying to implement a CRUD RestXQ 
application. Create, update and delete can be implemented using the 
XQuery Update standard, but this gets slow and cumbersome when, for 
many read operations, it cannot be determined which DBs they use and so 
a global read lock is held. That of course means that no global write 
lock can be acquired until all read operations are finished on all DBs 
known to a BaseX instance.


This is especially problematic when one instance of BaseX with a RestXQ 
application is used to serve data from independent databases. Say one 
instance of BaseX has a RestXQ API that serves a lot of different 
dictionaries for different natural languages. This is my use case. 
Although the content of dictionary entries is different, the parts in 
the TEI/XML I try to manipulate, that are created, read, updated or 
deleted, are the same. So, a common API should handle many independent 
dictionaries, edited by many users, using one instance of BaseX.


Also, when working with my biggest XML database of several GB I ran into 
problems when reindexing after an update. Reindexing all those GB of 
data takes too long and makes small updates in there impossible.


Why not multiple instances of BaseX? For better or worse, BaseX runs in 
a JVM, and even after I tried to minimize the memory footprint, an idle 
BaseX still takes a little less than 300 MB. We run a lot of services 
here on shared servers, so RAM usage matters. RAM usage is also part of 
the cost when using commercial cloud services. Of course, not running 
BaseX at all when it is not used is best if you pay per minute. Also, as 
recently discussed on the list, BaseX, like any Java program, gets 
optimized by the JVM while running. Those optimizations, as well as 
caching, benefit all the data hosted in one instance but would, I 
assume, be less efficient with multiple instances.


So how do I achieve four goals:
* Keep the XQuery short and concise because that is what the optimizer 
can handle best?
* Keep the code separated into modules that deal with one particular aspect?
* Use RestXQ and not another technique to actually implement the RESTful 
API?
* All this while being able to split GB of XML data into portions that 
can be reindexed in a reasonable amount of time?


The two things that help here a lot are eval functions like xquery:eval 
[6] and string constructors [7].
Say, I want to run a query but on different collections (databases). I 
can do this by having a list of collections and executing the actual 
query in a for loop with the concrete collection as a variable.
If I just write the XQuery code down like this the problem is that the 
optimizer would need to evaluate the query to find out which databases 
to lock and what indexes can be used. BaseX is not built to do this 
(yet). It does not mock run the query. So, it decides that a global lock 
needs to be used. Depending on the use of XQuery Update either a global 
write lock or a global read lock is acquired. Easy to understand but 
does not help with performance here.
If I want to make the situation worse for the optimizer I can use 
xquery:eval. That of course makes the XQuery code totally opaque to the 
optimizer. A global lock is guaranteed.
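
The two situations described above can be sketched as follows. The database names and the ID value are hypothetical, and in a real RestXQ function the list would typically come from a request parameter or configuration rather than a literal. Whether a global lock is actually taken may depend on the BaseX version and on how much the optimizer can inline, so this only illustrates the shape of the code, not a guaranteed locking outcome.

```xquery
let $dbs := ('dict_a', 'dict_b')  (: hypothetical database names :)
return (
  (: Variant 1: the database name is a variable inside a loop, so it is
     not visible as a fixed name when the query is analysed. :)
  for $db in $dbs
  return db:open($db)//*[@xml:id = 'entry_001'],

  (: Variant 2: the inner query is built with a string constructor and run
     with xquery:eval, so the outer query only sees an opaque string. :)
  for $db in $dbs
  return xquery:eval(``[
    db:open("`{ $db }`")//*[@xml:id = "entry_001"]
  ]``)
)
```

In variant 2 the string passed to xquery:eval contains the concrete database name, for example db:open("dict_a"), which is what makes the inner query self-contained.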


Still, another eval function is a solution here: jobs:eval from the 
Jobs Module [8].
If I break up my code into jobs, only these jobs hold locks, and only 
for as long as they run. This can be a much shorter period of time than 
what it takes to run a whole RestXQ request. It is also possible to find 
a place that needs to be changed in a number of databases and then write 
lock only one of them to change something.
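
A minimal sketch of this approach, assuming BaseX 9.x (module prefix jobs, function db:open), hypothetical database names dict_a and dict_b, and a hypothetical entry ID. Each job gets a query string that names exactly one database, so the job's own locking can be limited to that database. The calling query is assumed not to hold any lock that the jobs need while it waits for them.

```xquery
let $dbs := ('dict_a', 'dict_b')  (: hypothetical database names :)
let $id  := 'entry_001'           (: hypothetical entry ID :)
for $db in $dbs
(: The database name is written into the query text with a string
   constructor; the ID is passed as an external variable instead. :)
let $query := ``[
  declare variable $id external;
  db:open("`{ $db }`")//*[@xml:id = $id]
]``
(: 'cache' keeps the result so that it can be fetched with jobs:result. :)
let $job := jobs:eval($query, map { 'id': $id }, map { 'cache': true() })
return (
  jobs:wait($job),   (: block until this job has finished :)
  jobs:result($job)  (: fetch the cached result of the job :)
)
```

Because the database name becomes part of the query text, it should be checked against a list of known databases before it is interpolated, in particular if it originates from a request.
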
So, if my data is stored in not one but several database files I can 
make them look like 

Re: [basex-talk] [basex-announce] BaseX 9.7.1: Tweaks, Fixes, Features

2022-05-02 Thread Wiemer, Sebastian
Hi Christian,
to recreate the difference:
* download the Camunda BPMN engine (https://camunda.com/download/) for your platform
* unzip it (this creates the folder "camunda")
* enter the directory "camunda" and run "start.bat" or "start.sh" with no 
parameters
* wait for Camunda to start
* execute the XQuery code below in 9.6.4 and 9.7.1

best regards and どういたしまして!

Sebastian

Example code:
http:send-request(
http://localhost:8080/engine-rest/deployment/create;
  username="demo"
  password="demo">
 


{file:read-text("test.bpmn")=>parse-xml()}


  )
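
For orientation, a multipart POST of this kind could be written roughly as follows with the HTTP Client Module. This is a sketch under assumptions, not the code that was originally posted: the form field name "upload", the file name in the Content-Disposition header and the media type of the body part are hypothetical, and only the URL, the credentials, the POST method and the file:read-text call are taken from the message.

```xquery
http:send-request(
  <http:request method="POST"
                href="http://localhost:8080/engine-rest/deployment/create"
                username="demo" password="demo">
    <http:multipart media-type="multipart/form-data">
      <!-- One form part; field name and file name are assumptions. -->
      <http:header name="Content-Disposition"
                   value='form-data; name="upload"; filename="test.bpmn"'/>
      <http:body media-type="application/xml">{
        file:read-text("test.bpmn") => parse-xml()
      }</http:body>
    </http:multipart>
  </http:request>
)
```

The call returns the http:response element followed by the parsed response body, which is the shape of the two results shown for 9.6.4 and 9.7.1.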


Version 9.6.4 result:
http://expath.org/ns/http-client; status="200" message="">
<_ type="object">
  GET
  http://localhost:8080/engine-rest/deployment/fbd9b0d2-ca1e-11ec-adec-c2a34077f81f
  self
  fbd9b0d2-ca1e-11ec-adec-c2a34077f81f
  2022-05-02T15:51:36.482+0200


Version 9.7.1 result:
http://expath.org/ns/http-client; status="415" message="">
  NotSupportedException
  HTTP 415 Unsupported Media Type




From: Christian Grün
Sent: Monday, 2 May 2022 09:41
To: Wiemer, Sebastian
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-announce] BaseX 9.7.1: Tweaks, Fixes, Features

Hi Sebastian,

Thanks for your feedback. Could you provide us with a little self-contained 
example to reproduce the issue?

どうもありがとう,
Christian


On Mon, May 2, 2022 at 8:00 AM Wiemer, Sebastian 
<sebastian.wie...@adesso.de> wrote:
Hi,
thank you for your continuing effort to make BaseX better!
I eagerly await every new version 

While trying out BaseX 9.7.1, I experienced a different behavior than in BaseX 
9.6.4.

A POST request sent to any camunda bpm-engine rest endpoint 
(http:send-request(...) with a POST method)
works fine with 9.6.4 but yields a "wrong media-type" 415 error with 9.7.1
The media-type is "multipart/formdata"

I tried changing the media-type of the request, the body, the form fields, ... 
but always got the 415 error.

Did something change between 9.6.4 and 9.7.1?
Is there a new way to set the media-type?

Best regards,
 Sebastian



From: BaseX-Announce <basex-announce-boun...@mailman.uni-konstanz.de>
 on behalf of Christian Grün <christian.gr...@gmail.com>
Sent: Tuesday, 26 April 2022 11:17
To: BaseX <basex-talk@mailman.uni-konstanz.de>;
 BaseX <basex-annou...@mailman.uni-konstanz.de>
Subject: [basex-announce] BaseX 9.7.1: Tweaks, Fixes, Features

Dear all,

We provide you with a new version of BaseX, our open source XML framework, 
database system and XQuery 3.1 processor. Apart from performance tweaks and bug 
fixes, it comes with the following enhancements [1,2]:

• Backups: support for comments added
• RESTXQ: improved caching for unmodified modules
• GUI, editor: list opened files (Ctrl-F6)
• GUI: improved support for middle mouse button
• XQuery: inspect:functions: parse modules only once
• XQuery: db:delete: Faster deletion of binary resources
• XQuery: jobs:eval: handling of large job numbers revised
• XQuery optimizations: rewrite value to general comparisons

All the best to everyone,
Christian

[1] https://basex.org/2022/04/26/basex-9.7.1/
[2] https://github.com/BaseXdb/basex/commits/master




---
 >>> business. people. technology. <<<
---

adesso SE, registered office in Dortmund
Executive Board: Michael Kenfenheuer (Chair), Dirk Pothen, Andreas Prenneis, Stefan 
Riedel, Jörg Schroeder, Torsten Wegener
Chairman of the Supervisory Board: Prof. Dr. Volker Gruhn
Dortmund Local Court (Amtsgericht Dortmund) HRB 20663


test.bpmn
Description: test.bpmn


Re: [basex-talk] Text index requires `/text()` in query

2022-05-02 Thread Matthew Dziuban
Good to know -- thanks for the help, Christian!


Re: [basex-talk] [basex-announce] BaseX 9.7.1: Tweaks, Fixes, Features

2022-05-02 Thread Christian Grün
Hi Sebastian,

Thanks for your feedback. Could you provide us with a little self-contained
example to reproduce the issue?

どうもありがとう,
Christian



Re: [basex-talk] [basex-announce] BaseX 9.7.1: Tweaks, Fixes, Features

2022-05-02 Thread Wiemer, Sebastian
Hi,
thank you for your continuing effort to make BaseX better!
I eagerly await every new version 

While trying out BaseX 9.7.1, I experienced a different behavior than in BaseX 
9.6.4.

A POST request sent to any camunda bpm-engine rest endpoint 
(http:send-request(...) with a POST method)
works fine with 9.6.4 but yields a "wrong media-type" 415 error with 9.7.1
The media-type is "multipart/formdata"

I tried changing the media-type of the request, the body, the form fields, ... 
but always got the 415 error.

Did something change between 9.6.4 and 9.7.1?
Is there a new way to set the media-type?

Best regards,
 Sebastian


