I did some more work to capture the relevant information for the two crashes.

As you recall, I build a container image on top of the official (but old) BaseX 
one. It copies the database into the right place in the container.

I added the -c switch to the basexhttp command that runs when the container 
starts. Even so, no data context is set when the first operation involving the 
database happens. If I run a query against the opened database:

<query>
   <text>count(/ada)</text>
</query>

I get:

Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
    at org.basex.data.Data.defaultNs(Data.java: 270)
    at org.basex.query.expr.path.NameTest.noMatches(NameTest.java: 60)
    at org.basex.query.expr.path.Step.optimize(Step.java: 162)
    at org.basex.query.expr.path.Step.optimize(Step.java: 134)
    at org.basex.query.expr.Preds.compile(Preds.java: 59)
    at org.basex.query.expr.path.Path.lambda$compile$0(Path.java: 139)
    at org.basex.query.CompileContext.get(CompileContext.java: 165)
    at org.basex.query.expr.path.Path.compile(Path.java: 134)
    at org.basex.query.expr.Arr.compile(Arr.java: 47)
    at org.basex.query.scope.MainModule.comp(MainModule.java: 81)
    at org.basex.query.QueryCompiler.compile(QueryCompiler.java: 119)
    at org.basex.query.QueryCompiler.compile(QueryCompiler.java: 106)
    at org.basex.query.QueryContext.compile(QueryContext.java: 306)
    at org.basex.query.QueryProcessor.compile(QueryProcessor.java: 79)
    at org.basex.core.cmd.AQuery.query(AQuery.java: 91)
    at org.basex.core.cmd.XQuery.run(XQuery.java: 22)
    at org.basex.core.Command.run(Command.java: 257)
    at org.basex.http.rest.RESTCmd.run(RESTCmd.java: 105)
    at org.basex.http.rest.RESTQuery.query(RESTQuery.java: 69)
    at org.basex.http.rest.RESTQuery.run0(RESTQuery.java: 37)
    at org.basex.http.rest.RESTCmd.run(RESTCmd.java: 70)
    at org.basex.core.Command.run(Command.java: 257)
    at org.basex.core.Command.execute(Command.java: 93)
    at org.basex.core.Command.execute(Command.java: 116)
    at org.basex.http.rest.RESTServlet.run(RESTServlet.java: 32)
    at org.basex.http.BaseXServlet.service(BaseXServlet.java: 65)

If I remove the -c flag and just let the container start (still with the 
database copied into place in the container), I get this trace when I try to do 
anything related to the database:

Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
        at org.basex.data.DiskData.write(DiskData.java:146)
        at org.basex.data.DiskData.close(DiskData.java:160)
        at org.basex.core.Datas.unpin(Datas.java:52)
        at org.basex.core.cmd.Close.close(Close.java:45)
        at org.basex.query.QueryResources.close(QueryResources.java:92)
        at org.basex.query.QueryContext.close(QueryContext.java:515)
        at org.basex.query.QueryProcessor.close(QueryProcessor.java:251)
        at org.basex.core.cmd.AQuery.query(AQuery.java:132)
        at org.basex.core.cmd.XQuery.run(XQuery.java:22)
        at org.basex.core.Command.run(Command.java:257)
        at org.basex.core.Command.execute(Command.java:93)
        at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
        at org.basex.api.client.Session.execute(Session.java:36)
        at org.basex.core.CLI.execute(CLI.java:92)
        at org.basex.core.CLI.execute(CLI.java:76)
        at org.basex.BaseX.console(BaseX.java:177)
        at org.basex.BaseX.<init>(BaseX.java:152)
        at org.basex.BaseX.main(BaseX.java:43)

I hope this is useful. Right now I am blocked.

Best Regards

Peter Villadsen.


From: Peter Villadsen
Sent: Tuesday, March 4, 2025 12:43 PM
To: Christian Grün <christian.gr...@gmail.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: RE: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very 
slow...

Christian,

Yes, I have. Thank you for following up - I should have come back earlier.

I have been experimenting with this for a while now, and both container images 
(the official one and the newer quodatum 10.3 one) crash when I try to use 
them, over both HTTP and TCP. I am still looking into it. If I do not manage to 
find out what the issue is, I will upload the stack traces.

In both cases, I built my own container to include the database, so I can avoid 
the volumes and have the container be completely self-contained. The database 
is 19GB, so the container gets pretty big.

Here is how I start the container:

docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023

In my humble opinion it is unfortunate that the official container image has 
not been updated for at least 3 years. It would be nice to have the newest bits 
there, supported by BaseX.

Here is the dockerfile I use:

# escape=`
# Use the BaseX 10.3 image as the base image
FROM basex/basexhttp

# Copy the Windows database directory into the container so it is available
# when the container starts, without providing a --volume parameter.
# This is fine since the database is essentially read-only.

WORKDIR /srv/basex/data
COPY --chown=basex:basex Rainier05042023 "Rainier05042023/"

# The older versions of BaseX just use admin/admin.
# RUN echo "admin" | /srv/basex/bin/basex -cPASSWORD

# Modify the CMD command so that the Rainier05042023 database is opened
CMD /usr/local/bin/basexhttp -c "open Rainier05042023"

LABEL description="Legacy BaseX with Rainier05042023 database"

# Here is a build command that builds the container with the name
# Rainier05042023:
#
# cd to the directory containing this Dockerfile and run the command:
# docker build -t rainier05042023 .
#
# When the docker container has been built it can be run with the name
# provided in the build command i.e. rainier05042023. It can be saved
# to a file with the command:
#
# docker save -o Rainier05042023.tar rainier05042023
#
# and loaded with the command:
#
# docker load -i Rainier05042023.tar
#
# The container can be run with the command:
# docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
# The database can be accessed at http://localhost:8080/dba/
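
A container carrying a database of this size may need a moment before its ports 
accept connections, so a small readiness check can help separate "not started 
yet" from an actual crash. The following is only a sketch: the helper name 
wait_for_port and its default timings are assumptions for illustration, not 
part of BaseX or Docker.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening,
            # not that the database has been opened.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait a little and retry.
            time.sleep(interval)
    return False
```

For the run command above, wait_for_port("localhost", 8984) would be called 
before sending the first REST request.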

Best Regards

Peter Villadsen

From: Christian Grün <christian.gr...@gmail.com>
Sent: Tuesday, March 4, 2025 6:05 AM
To: Peter Villadsen <peter.villad...@microsoft.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...

Hi Peter,

To be sure, could you confirm that you have received my mails?

Best regards,
Christian


On Sat, Feb 22, 2025 at 12:41 PM Christian Grün <christian.gr...@gmail.com> 
wrote:
Hi Peter,

> This leads me to believe that a lot of the time (>7 seconds) may be spent 
> opening the database each time a POST is done? Is there a way to tweak the 
> HTTP server to “remember” the connection with the current database for a 
> little while? This may be against the REST principles, of course. The 
> database is guaranteed to be read-only in my case.

One option is to open the database with the initial basexhttp call. It will be 
kept open until the server is shut down:

  basexhttp -c"open name-of-db"

Best,
Christian


On Sat, Feb 15, 2025 at 8:53 PM Peter Villadsen via BaseX-Talk 
<basex-talk@mailman.uni-konstanz.de> wrote:
All,

I have been using BaseX for a while, connecting to the TCP endpoint. I know the 
performance I typically get, and it is impressive! However, now I wanted to use 
the HTTP endpoint, and it seems the performance is at least 2 orders of 
magnitude worse!

Here is the query that I am POSTing to 
http://localhost:8984/rest/RainFnd_6.0.10.0

<query xmlns="http://basex.org/rest">
  <text>/Class[@Package='ApplicationPlatform']/@Name</text>
</query>

This simple query will generate around 1500 results from the 13GB database 
(RainFnd_6.0.10.0). It takes just over 7 seconds to do this. If I do this in 
the self-contained BaseX GUI, it takes around 20ms.

However, it seems that the time spent executing the query against the database 
is negligible. Please consider this query:

<query xmlns="http://basex.org/rest">
  <text>1 + 2</text>
</query>

Here there is obviously no database access, yet it takes almost the same 
amount of time as the query that accesses the database. 7 seconds to calculate 
1 + 2 is too long.

If I post the 1 + 2 query to the endpoint without specifying the database on 
the URL:

http://localhost:8984/rest

it takes around 7 milliseconds, close to what I expected, certainly within 
expectations for the time spent sending the query over the wire and serializing 
etc.
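
For anyone who wants to reproduce the comparison, here is a minimal sketch of 
how the two POSTs could be timed from Python using only the standard library. 
The function names (build_rest_query, time_post, compare) are made up for this 
example, and the admin/admin credentials sent via HTTP Basic authentication are 
an assumption that must be adjusted to the actual server setup.

```python
import base64
import time
import urllib.request
from xml.sax.saxutils import escape

REST_NS = "http://basex.org/rest"


def build_rest_query(xquery: str) -> bytes:
    """Wrap an XQuery string in the <query><text>...</text></query> envelope."""
    # escape() protects &, < and > so the XQuery survives inside XML.
    body = f'<query xmlns="{REST_NS}"><text>{escape(xquery)}</text></query>'
    return body.encode("utf-8")


def time_post(url: str, xquery: str, user: str = "admin",
              password: str = "admin") -> float:
    """POST the query once and return the elapsed wall-clock time in seconds."""
    # Assumption: the server accepts Basic authentication with these credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=build_rest_query(xquery),
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # include the time to receive the full result
    return time.perf_counter() - start


def compare(urls, xquery: str = "1 + 2") -> None:
    """Print the time of one POST per URL; call this against a running server."""
    for url in urls:
        print(f"{url}: {time_post(url, xquery):.3f} s")
```

Calling compare(["http://localhost:8984/rest", 
"http://localhost:8984/rest/RainFnd_6.0.10.0"]) sends the 1 + 2 query to both 
endpoints. Repeating the call a few times shows whether the delay occurs on 
every request or only on the first one.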

This leads me to believe that a lot of the time (>7 seconds) may be spent 
opening the database each time a POST is done? Is there a way to tweak the HTTP 
server to “remember” the connection with the current database for a little 
while? This may be against the REST principles, of course. The database is 
guaranteed to be read-only in my case.

The problem is that this makes the HTTP server inappropriate for interactive 
applications. I can still use the TCP server, where I get the results I need, 
but using the HTTP would be simpler, and have less overhead in terms of code 
needed to communicate with the server.

Please let me know if there is a way to accomplish acceptable performance with 
the HTTP server.


Best Regards

Peter Villadsen
Principal Technical Program Manager
Microsoft Business Applications Group

