Hi Peter,

Can you reproduce the problem with the latest version (11.7) and the zip
distribution of BaseX (e.g., without Docker)?

Best,
Christian



Peter Villadsen <peter.villad...@microsoft.com> wrote on Fri, Mar 7,
2025, 05:00:

> I did some more work to capture the relevant information for the two
> crashes.
>
>
>
> As you recall, I build a container image on top of the official (but old)
> basex one. It copies the database into the right place in the container.
>
>
>
> I added the -c switch to the basexhttp command when the container starts.
> Even when I do this, the container does not have data context set when the
> first operation involving the database happens - If I do a query that
> involves the open database:
>
>
>
> <query>
>
>    <text>count(/ada)</text>
>
> </query>
>
>
>
> I get:
>
>
>
> Improper use? Potential bug? Your feedback is welcome:
>
> Contact: basex-talk@mailman.uni-konstanz.de
>
> Version: BaseX 9.6 RC1
>
> Java: IcedTea, 1.8.0_212
>
> OS: Linux, amd64
>
> Stack Trace:
>
> java.lang.NullPointerException
>
>     at org.basex.data.Data.defaultNs(Data.java: 270)
>
>     at org.basex.query.expr.path.NameTest.noMatches(NameTest.java: 60)
>
>     at org.basex.query.expr.path.Step.optimize(Step.java: 162)
>
>     at org.basex.query.expr.path.Step.optimize(Step.java: 134)
>
>     at org.basex.query.expr.Preds.compile(Preds.java: 59)
>
>     at org.basex.query.expr.path.Path.lambda$compile$0(Path.java: 139)
>
>     at org.basex.query.CompileContext.get(CompileContext.java: 165)
>
>     at org.basex.query.expr.path.Path.compile(Path.java: 134)
>
>     at org.basex.query.expr.Arr.compile(Arr.java: 47)
>
>     at org.basex.query.scope.MainModule.comp(MainModule.java: 81)
>
>     at org.basex.query.QueryCompiler.compile(QueryCompiler.java: 119)
>
>     at org.basex.query.QueryCompiler.compile(QueryCompiler.java: 106)
>
>     at org.basex.query.QueryContext.compile(QueryContext.java: 306)
>
>     at org.basex.query.QueryProcessor.compile(QueryProcessor.java: 79)
>
>     at org.basex.core.cmd.AQuery.query(AQuery.java: 91)
>
>     at org.basex.core.cmd.XQuery.run(XQuery.java: 22)
>
>     at org.basex.core.Command.run(Command.java: 257)
>
>     at org.basex.http.rest.RESTCmd.run(RESTCmd.java: 105)
>
>     at org.basex.http.rest.RESTQuery.query(RESTQuery.java: 69)
>
>     at org.basex.http.rest.RESTQuery.run0(RESTQuery.java: 37)
>
>     at org.basex.http.rest.RESTCmd.run(RESTCmd.java: 70)
>
>     at org.basex.core.Command.run(Command.java: 257)
>
>     at org.basex.core.Command.execute(Command.java: 93)
>
>     at org.basex.core.Command.execute(Command.java: 116)
>
>     at org.basex.http.rest.RESTServlet.run(RESTServlet.java: 32)
>
>     at org.basex.http.BaseXServlet.service(BaseXServlet.java: 65)
>
>
>
> If I remove the -c flag, and just let the container start (still with the
> database copied into place in the container), I get this trace when I try
> to do anything related to the database:
>
>
>
> Improper use? Potential bug? Your feedback is welcome:
>
> Contact: basex-talk@mailman.uni-konstanz.de
>
> Version: BaseX 9.6 RC1
>
> Java: IcedTea, 1.8.0_212
>
> OS: Linux, amd64
>
> Stack Trace:
>
> java.lang.NullPointerException
>
>         at org.basex.data.DiskData.write(DiskData.java:146)
>
>         at org.basex.data.DiskData.close(DiskData.java:160)
>
>         at org.basex.core.Datas.unpin(Datas.java:52)
>
>         at org.basex.core.cmd.Close.close(Close.java:45)
>
>         at org.basex.query.QueryResources.close(QueryResources.java:92)
>
>         at org.basex.query.QueryContext.close(QueryContext.java:515)
>
>         at org.basex.query.QueryProcessor.close(QueryProcessor.java:251)
>
>         at org.basex.core.cmd.AQuery.query(AQuery.java:132)
>
>         at org.basex.core.cmd.XQuery.run(XQuery.java:22)
>
>         at org.basex.core.Command.run(Command.java:257)
>
>         at org.basex.core.Command.execute(Command.java:93)
>
>         at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
>
>         at org.basex.api.client.Session.execute(Session.java:36)
>
>         at org.basex.core.CLI.execute(CLI.java:92)
>
>         at org.basex.core.CLI.execute(CLI.java:76)
>
>         at org.basex.BaseX.console(BaseX.java:177)
>
>         at org.basex.BaseX.<init>(BaseX.java:152)
>
>         at org.basex.BaseX.main(BaseX.java:43)
>
>
>
> I hope this is useful. Right now I am blocked.
>
>
>
> Best Regards
>
>
>
> Peter Villadsen.
>
>
>
>
>
> *From:* Peter Villadsen
> *Sent:* Tuesday, March 4, 2025 12:43 PM
> *To:* Christian Grün <christian.gr...@gmail.com>
> *Cc:* basex-talk@mailman.uni-konstanz.de
> *Subject:* RE: [EXTERNAL] Re: [basex-talk] HTTP server performance seems
> very slow...
>
>
>
> Christian,
>
>
>
> Yes, I have. Thank you for following up - I should have come back earlier.
>
>
>
> I have been experimenting with this for a while now, and the container
> image (the official one and the quodatum, newer 10.3 one) both crash when I
> try to use them, both through HTTP and TCP. I am still looking into it. If
> I do not manage to find out what the issue is, I will upload the stack
> traces.
>
>
>
> In both cases, I built my own container to include the database, so I can
> avoid the volumes and have the container be completely self-contained. The
> database is 19GB, so the container gets pretty big.
>
>
>
> Here is how I start the container:
>
>
>
> docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
>
>
>
> In my humble opinion it is unfortunate that the official container image
> has not been updated for at least 3 years. It would be nice to have the
> newest bits there, supported by BaseX.
>
>
>
> Here is the dockerfile I use:
>
>
>
> # escape=`
>
> # Use the BaseX 10.3 image as the base image
>
> FROM basex/basexhttp
>
>
>
> # Copy the Windows database directory into the container so it is available
>
> # when the container starts, without providing a --volume parameter.
>
> # This is fine since the database is essentially read-only.
>
>
>
> WORKDIR /srv/basex/data
>
> COPY --chown=basex:basex Rainier05042023 "Rainier05042023/"
>
>
>
> # The older versions of BaseX just use admin/admin.
>
> # RUN echo "admin" | /srv/basex/bin/basex -cPASSWORD
>
>
>
> # Modify the CMD command so that the Rainier05042023 database is opened
>
> CMD /usr/local/bin/basexhttp -c "open Rainier05042023"
>
>
>
> LABEL description="Legacy BaseX with Rainier05042023 database"
>
>
>
> # Here is a build command that builds the container with the name
> # Rainier05042023:
>
> #
>
> # cd to the directory containing this Dockerfile and run the command:
>
> # docker build -t rainier05042023 .
>
> #
>
> # When the docker container has been built it can be run with the name
>
> # provided in the build command i.e. rainier05042023. It can be saved
>
> # to a file with the command:
>
> #
>
> # docker save -o Rainier05042023.tar rainier05042023
>
> #
>
> # and loaded with the command:
>
> #
>
> # docker load -i rainier05042023.tar
>
> #
>
> # The container can be run with the command:
>
> # docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
>
> # The database can be accessed at http://localhost:8080/dba/
>
>
>
> Best Regards
>
>
>
> Peter Villadsen
>
>
>
> *From:* Christian Grün <christian.gr...@gmail.com>
> *Sent:* Tuesday, March 4, 2025 6:05 AM
> *To:* Peter Villadsen <peter.villad...@microsoft.com>
> *Cc:* basex-talk@mailman.uni-konstanz.de
> *Subject:* [EXTERNAL] Re: [basex-talk] HTTP server performance seems very
> slow...
>
>
>
> Hi Peter,
>
>
>
> To be sure, could you confirm that you have received my mails?
>
>
>
> Best regards,
>
> Christian
>
>
>
>
>
> On Sat, Feb 22, 2025 at 12:41 PM Christian Grün <christian.gr...@gmail.com>
> wrote:
>
> Hi Peter,
>
>
>
> > This leads me to believe that a lot of the time (>7 seconds) may be
> > spent opening the database each time a POST is done? Is there a way to
> > tweak the HTTP server to “remember” the connection with the current
> > database for a little while? This may be against the REST principles, of
> > course. The database is guaranteed to be read-only in my case.
>
>
>
> One option is to open the database with the initial basexhttp call. It
> will be kept open until the server is shut down:
>
>
>
>   basexhttp -c"open name-of-db"
>
>
>
> Best,
>
> Christian
>
>
>
>
>
> On Sat, Feb 15, 2025 at 8:53 PM Peter Villadsen via BaseX-Talk <
> basex-talk@mailman.uni-konstanz.de> wrote:
>
> All,
>
>
>
> I have been using BaseX for a while, connecting to the TCP endpoint. I
> know the performance I typically get, and it is impressive! However, now I
> wanted to use the HTTP endpoint, and it seems the performance is at least 2
> orders of magnitude worse!
>
>
>
> Here is the query that I am POSTing to
> http://localhost:8984/rest/RainFnd_6.0.10.0
>
>
>
> <query xmlns="http://basex.org/rest">
>
>   <text>/Class[@Package='ApplicationPlatform']/@Name</text>
>
> </query>
>
>
>
> This simple query will generate around 1500 results from the 13GB database
> (RainFnd_6.0.10.0). It takes just over 7 seconds to do this. If I do this
> in the BaseX GUI that is self contained, it takes around 20ms.
>
>
>
> However, it seems that the time spent executing the query against the
> database is negligible. Please consider this query:
>
>
>
> <query xmlns="http://basex.org/rest">
>
>   <text>1 + 2</text>
>
> </query>
>
>
>
> In which there is obviously no database access. It takes almost the
> same amount of time as the query that accesses the database. 7 seconds to
> calculate 1 + 2 is too long.
>
>
>
> If I post the 1 + 2 query to the endpoint without specifying the database
> on the URL:
>
>
>
> http://localhost:8984/rest
>
>
>
> it takes around 7 milliseconds, close to what I expected, certainly within
> expectations for the time spent sending the query over the wire and
> serializing etc.
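
For anyone who wants to repeat this comparison, here is a minimal Python sketch that times the same "1 + 2" query against both URLs. It is only an illustration under assumptions: the helper names (build_rest_query, time_post, compare_endpoints) are made up for this example, and the admin/admin credentials sent via HTTP Basic authentication are an assumption that may not match your setup. The URLs and the query body are the ones quoted in this thread.

```python
import base64
import time
import urllib.request
from xml.sax.saxutils import escape

REST_NS = "http://basex.org/rest"


def build_rest_query(xquery: str) -> str:
    """Wrap an XQuery string in the REST <query> envelope used in this thread."""
    return f'<query xmlns="{REST_NS}"><text>{escape(xquery)}</text></query>'


def time_post(url: str, xquery: str, user: str = "admin", password: str = "admin") -> float:
    """POST the query to url and return the elapsed wall-clock time in seconds.

    The credentials are an assumption; adjust them to your installation.
    """
    body = build_rest_query(xquery).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # read the full result so serialization time is included
    return time.perf_counter() - start


def compare_endpoints(urls, xquery="1 + 2"):
    """Return a dict mapping each URL to the time (in seconds) one POST took."""
    return {url: time_post(url, xquery) for url in urls}
```

With a server running, a call such as compare_endpoints(["http://localhost:8984/rest", "http://localhost:8984/rest/RainFnd_6.0.10.0"]) should show whether the delay is tied to the database path in the URL. A single measurement per URL is noisy, so repeating the call a few times gives a fairer picture.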
>
>
>
> This leads me to believe that a lot of the time (>7 seconds) may be spent
> opening the database each time a POST is done? Is there a way to tweak the
> HTTP server to “remember” the connection with the current database for a
> little while? This may be against the REST principles, of course. The
> database is guaranteed to be read-only in my case.
>
>
>
> The problem is that this makes the HTTP server inappropriate for
> interactive applications. I can still use the TCP server, where I get the
> results I need, but using the HTTP would be simpler, and have less overhead
> in terms of code needed to communicate with the server.
>
>
>
> Please let me know if there is a way to accomplish acceptable performance
> with the HTTP server.
>
>
>
>
>
> Best Regards
>
>
>
> Peter Villadsen
>
> Principal Technical Program Manager
>
> Microsoft Business Applications Group
>
>
>
>
>
>
