Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
I would think so. Let me see if I can create a small example using the BaseX WEB API that recreates the problem.

Geoff Alexander, Ph.D.
Software Engineer, Corporate Tools Development
IBM Corporation
Charlotte, NC

From: "Christian Grün"
To: Geoff Alexander
Cc: BaseX
Date: 07/16/2020 03:10 PM
Subject: [EXTERNAL] Re: Possible bug in database size (bytes) after database optimize from BaseX Database Administration

> We use the BaseX REST API from a Java program to add and update documents in BaseX.

Do you think it’s reproducible for us?
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
> We use the BaseX REST API from a Java program to add and update documents in BaseX.

Do you think it’s reproducible for us?
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
We use the BaseX REST API from a Java program to add and update documents in BaseX.

Geoff Alexander, Ph.D.
Software Engineer, Corporate Tools Development
IBM Corporation
Charlotte, NC

From: "Christian Grün"
To: Geoff Alexander
Cc: BaseX
Date: 07/16/2020 03:07 PM
Subject: [EXTERNAL] Re: Possible bug in database size (bytes) after database optimize from BaseX Database Administration

> (2) Update one or more of the database's entries.

If I replace the document via the DBA, or if I run an update expression via the Queries Panel, the count always reflects the number of resources; it doesn’t change. How did you update the document? Is it an XML document or a binary file you updated?

> Maybe I have a misunderstanding of what the Optimize button on the BaseX Database Administration's Database page actually does.

Feel free to have a look into our documentation [1].

[1] https://docs.basex.org/wiki/Commands#OPTIMIZE
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
> (2) Update one or more of the database's entries.

If I replace the document via the DBA, or if I run an update expression via the Queries Panel, the count always reflects the number of resources; it doesn’t change. How did you update the document? Is it an XML document or a binary file you updated?

> Maybe I have a misunderstanding of what the Optimize button on the BaseX Database Administration's Database page actually does.

Feel free to have a look into our documentation [1].

[1] https://docs.basex.org/wiki/Commands#OPTIMIZE
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
Here are steps to recreate the problem:

(1) Add one or more entries to a new (empty) database. On the BaseX Database Administration's Database page, you'll find that the database's COUNT column shows the number of entries added and that the database's BYTES column shows the database size.

(2) Update one or more of the database's entries. After refreshing the BaseX Database Administration's Database page, you should find that the database's COUNT and BYTES columns have both increased.

(3) On the BaseX Database Administration's Database page, select the database and press the Optimize button. You should find that the database's COUNT column decreases back to the number of entries in the database. However, the database's BYTES column doesn't decrease to reflect a reduction in the database size.

Maybe I have a misunderstanding of what the Optimize button on the BaseX Database Administration's Database page actually does.

Geoff Alexander, Ph.D.
Software Engineer, Corporate Tools Development
IBM Corporation
Charlotte, NC

From: "Christian Grün"
To: Geoff Alexander
Cc: BaseX
Date: 07/16/2020 02:11 PM
Subject: [EXTERNAL] Re: Possible bug in database size (bytes) after database optimize from BaseX Database Administration

It's surprising that the count value changed, as it should represent the number of resources (documents, binary files) in your database, and this value shouldn't change if your data is optimized. Feel free to provide us with a little reproducible example.

The size of the database may stay the same, though. The DBA provides no way to trigger a full optimization, but you can e.g. use the query panel for that.

Geoff Alexander wrote on Thu, 16 July 2020, 20:04:

> On the BaseX Database Administration's Database page at https://localhost:10443/BaseX/dba/databases, I selected a database I knew was unoptimized and pressed the Optimize button. The database's COUNT column decreased to the number of entries in the database as expected. However, the database's BYTES column did not change, even after I logged off and back on to BaseX Database Administration.
>
> Geoff Alexander, Ph.D.
> Software Engineer, Corporate Tools Development
> IBM Corporation
> Charlotte, NC
>
> From: "Christian Grün"
> To: Geoff Alexander
> Cc: BaseX
> Date: 07/16/2020 01:10 PM
> Subject: [EXTERNAL] Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
>
> > Hi Geoff,
> >
> > Did you run OPTIMIZE ALL or db:optimize(..., true()) ? What do you mean by "record count"?
> >
> > Best,
> > Christian
> >
> > Geoff Alexander wrote on Thu, 16 July 2020, 18:06:
> >
> > > I've found that when I perform a database optimize on an unoptimized database, the record count decreases as expected, but the database size (bytes column) stays the same. Is this a bug in reporting the database size (less severe problem) or a bug in the database not reducing on optimize (more severe problem)? I've encountered this with both BaseX 8.6.7 and 9.3.3 running on Windows 10.
> > >
> > > Thanks,
> > > Geoff Alexander, Ph.D.
> > > Software Engineer, Corporate Tools Development
> > > IBM Corporation
> > > Charlotte, NC
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
It's surprising that the count value changed, as it should represent the number of resources (documents, binary files) in your database, and this value shouldn't change if your data is optimized. Feel free to provide us with a little reproducible example.

The size of the database may stay the same, though. The DBA provides no way to trigger a full optimization, but you can e.g. use the query panel for that.

Geoff Alexander wrote on Thu, 16 July 2020, 20:04:

> On the BaseX Database Administration's Database page at https://localhost:10443/BaseX/dba/databases, I selected a database I knew was unoptimized and pressed the Optimize button. The database's COUNT column decreased to the number of entries in the database as expected. However, the database's BYTES column did not change, even after I logged off and back on to BaseX Database Administration.
>
> Geoff Alexander, Ph.D.
> Software Engineer, Corporate Tools Development
> IBM Corporation
> Charlotte, NC
>
> From: "Christian Grün"
> To: Geoff Alexander
> Cc: BaseX
> Date: 07/16/2020 01:10 PM
> Subject: [EXTERNAL] Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
>
> > Hi Geoff,
> >
> > Did you run OPTIMIZE ALL or db:optimize(..., true()) ? What do you mean by "record count"?
> >
> > Best,
> > Christian
> >
> > Geoff Alexander <gd...@us.ibm.com> wrote on Thu, 16 July 2020, 18:06:
> >
> > > I've found that when I perform a database optimize on an unoptimized database, the record count decreases as expected, but the database size (bytes column) stays the same. Is this a bug in reporting the database size (less severe problem) or a bug in the database not reducing on optimize (more severe problem)? I've encountered this with both BaseX 8.6.7 and 9.3.3 running on Windows 10.
> > >
> > > Thanks,
> > > Geoff Alexander, Ph.D.
> > > Software Engineer, Corporate Tools Development
> > > IBM Corporation
> > > Charlotte, NC
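For reference, the full optimization Christian mentions can be entered in the DBA query panel as an XQuery expression. The following is only a minimal sketch: the database name `mydb` is a placeholder and does not come from this thread.

```xquery
(: Assumption: 'mydb' is a placeholder database name.
   Passing true() as the second argument requests a full optimization,
   the XQuery counterpart of the OPTIMIZE ALL command. According to the
   OPTIMIZE documentation linked in this thread, this rebuilds the
   internal database structures, which often reduces the database size. :)
db:optimize('mydb', true())
```

Running `db:info('mydb')` as a separate query before and after the optimization returns the database properties, which is one way to compare what the database itself reports with what the DBA page shows.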
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
On the BaseX Database Administration's Database page at https://localhost:10443/BaseX/dba/databases, I selected a database I knew was unoptimized and pressed the Optimize button. The database's COUNT column decreased to the number of entries in the database as expected. However, the database's BYTES column did not change, even after I logged off and back on to BaseX Database Administration.

Geoff Alexander, Ph.D.
Software Engineer, Corporate Tools Development
IBM Corporation
Charlotte, NC

From: "Christian Grün"
To: Geoff Alexander
Cc: BaseX
Date: 07/16/2020 01:10 PM
Subject: [EXTERNAL] Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration

Hi Geoff,

Did you run OPTIMIZE ALL or db:optimize(..., true()) ? What do you mean by "record count"?

Best,
Christian

Geoff Alexander wrote on Thu, 16 July 2020, 18:06:

> I've found that when I perform a database optimize on an unoptimized database, the record count decreases as expected, but the database size (bytes column) stays the same. Is this a bug in reporting the database size (less severe problem) or a bug in the database not reducing on optimize (more severe problem)? I've encountered this with both BaseX 8.6.7 and 9.3.3 running on Windows 10.
>
> Thanks,
> Geoff Alexander, Ph.D.
> Software Engineer, Corporate Tools Development
> IBM Corporation
> Charlotte, NC
Re: [basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
Hi Geoff,

Did you run OPTIMIZE ALL or db:optimize(..., true()) ? What do you mean by "record count"?

Best,
Christian

Geoff Alexander wrote on Thu, 16 July 2020, 18:06:

> I've found that when I perform a database optimize on an unoptimized database, the record count decreases as expected, but the database size (bytes column) stays the same. Is this a bug in reporting the database size (less severe problem) or a bug in the database not reducing on optimize (more severe problem)? I've encountered this with both BaseX 8.6.7 and 9.3.3 running on Windows 10.
>
> Thanks,
> Geoff Alexander, Ph.D.
> Software Engineer, Corporate Tools Development
> IBM Corporation
> Charlotte, NC
[basex-talk] Possible bug in database size (bytes) after database optimize from BaseX Database Administration
I've found that when I perform a database optimize on an unoptimized database, the record count decreases as expected, but the database size (bytes column) stays the same. Is this a bug in reporting the database size (less severe problem) or a bug in the database not reducing on optimize (more severe problem)? I've encountered this with both BaseX 8.6.7 and 9.3.3 running on Windows 10.

Thanks,
Geoff Alexander, Ph.D.
Software Engineer, Corporate Tools Development
IBM Corporation
Charlotte, NC
Re: [basex-talk] Transaction Support
Hi Marco,

Thanks for answering.

But that means BaseX does NOT support database transactions. Or is it possible to implement real transactions (START TRANS, do something, COMMIT TRANS) with that PUL or something?

Reto

From: BaseX-Talk [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of Marco Lettere
Sent: 16 July 2020 17:46
To: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Transaction Support

Hi Reto,

AFAIK BaseX is transactional in the sense that whenever you start a sequence of commands or an XQuery script, all the "updating operations" that modify the database are always stored in a PUL (a list of pending updates). Only when the script terminates are all the operations on the DB actually committed. There is no explicit rollback operation.

Regards,
Marco.

On 16/07/20 17:40, Reto Peter wrote:

> I am evaluating BaseX for my XML project. I need transaction support like:
>
> - Start transaction
> - Run queries (read, write, update)
> - Commit or rollback transaction
>
> When I look at the documentation, it lists a Transaction Manager. But when I look at the details, I cannot find anything like that. Can anyone explain what the support looks like, or whether there is an add-on or something planned?
>
> Best regards
> Reto, Frauenfeld, Schweiz
Re: [basex-talk] Transaction Support
Hi Reto,

AFAIK BaseX is transactional in the sense that whenever you start a sequence of commands or an XQuery script, all the "updating operations" that modify the database are always stored in a PUL (a list of pending updates). Only when the script terminates are all the operations on the DB actually committed. There is no explicit rollback operation.

Regards,
Marco.

On 16/07/20 17:40, Reto Peter wrote:

> I am evaluating BaseX for my XML project. I need transaction support like:
>
> - Start transaction
> - Run queries (read, write, update)
> - Commit or rollback transaction
>
> When I look at the documentation, it lists a Transaction Manager. But when I look at the details, I cannot find anything like that. Can anyone explain what the support looks like, or whether there is an add-on or something planned?
>
> Best regards
> Reto, Frauenfeld, Schweiz
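Marco's description can be illustrated with a small updating query. This is a sketch under stated assumptions, not a recipe from the thread: the database `mydb` and the element and attribute names are hypothetical, and `db:open` is the function name used in BaseX 9 (the versions discussed on the list at the time). Both updates are collected on the pending update list and applied together when the query finishes; if the error is raised, the query aborts and neither update is applied, which is the closest thing to a rollback.

```xquery
(: Assumptions: 'mydb', <root>, <entry>, <obsolete> and @locked are
   hypothetical names used only for illustration. :)
(
  (: Queued on the pending update list, not applied yet. :)
  insert node <entry id="1"/> into db:open('mydb')/root,
  delete node db:open('mydb')/root/obsolete,
  (: Raising an error aborts the query before the list is applied,
     so neither the insert nor the delete reaches the database. :)
  if (db:open('mydb')/root/@locked = 'true')
  then error(xs:QName('local:abort'), 'Database is locked, nothing is written')
  else ()
)
```

This only covers a single query or command script; it does not provide a START/COMMIT pair that spans several separate requests, which is what Reto asks about.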
Re: [basex-talk] Database file path
Ah no, I'm not talking about changing it at runtime, I'm talking about specifying the path on database creation, for example when the CreateDB command is executed. There shouldn't be concurrency problems at the moment of database creation, correct?

-Vladimir

On Thu, Jul 16, 2020 at 8:41 AM Christian Grün wrote:

> Right, the option is global. As BaseX has been designed to serve concurrent requests, it would introduce unexpected side effects if the path was changed at runtime.
>
> If you are careful, you can try to change the path by assigning a new value to Context.soptions.
>
> Vladimir Churyukin wrote on Thu, 16 July 2020, 17:33:
>
>> Yes, I've seen that option.
>> But there is no way to set it per database, correct?
>> I'm asking because by nature our operations are ad-hoc, we don't really "startup" the instances, we create a database, process the data, then destroy the database.
>> Is there some internal limitation why this option needs to be global?
>>
>> thank you,
>> -Vladimir
>>
>> On Thu, Jul 16, 2020 at 4:36 AM Christian Grün wrote:
>>
>>> Hi Vladimir,
>>>
>>> The DBPATH option is the one you’ll need to assign. As it’s a global option, it should be assigned at startup time [1].
>>>
>>> Best,
>>> Christian
>>>
>>> [1] https://docs.basex.org/wiki/Options
>>>
>>> On Thu, Jul 16, 2020 at 6:11 AM Vladimir Churyukin wrote:
>>>
>>>> Hello,
>>>>
>>>> We have a data transformation pipeline that works with XML files of different sizes, sometimes big (up to several gigabytes).
>>>> We are using BaseX to do the transformations.
>>>> For smaller files we use the MAINMEM option, because the whole database can fit in memory. For some files we can't do that. All databases we create are simply disposable, and we'd like to control where we put them, and destroy them after the processing.
>>>> Is there a special option or any other way to specify where a particular database's files will reside?
>>>> How can we specify that when we call the CreateDB command?
>>>>
>>>> thank you,
>>>> Vladimir
Re: [basex-talk] Database file path
Right, the option is global. As BaseX has been designed to serve concurrent requests, it would introduce unexpected side effects if the path was changed at runtime.

If you are careful, you can try to change the path by assigning a new value to Context.soptions.

Vladimir Churyukin wrote on Thu, 16 July 2020, 17:33:

> Yes, I've seen that option.
> But there is no way to set it per database, correct?
> I'm asking because by nature our operations are ad-hoc, we don't really "startup" the instances, we create a database, process the data, then destroy the database.
> Is there some internal limitation why this option needs to be global?
>
> thank you,
> -Vladimir
>
> On Thu, Jul 16, 2020 at 4:36 AM Christian Grün wrote:
>
>> Hi Vladimir,
>>
>> The DBPATH option is the one you’ll need to assign. As it’s a global option, it should be assigned at startup time [1].
>>
>> Best,
>> Christian
>>
>> [1] https://docs.basex.org/wiki/Options
>>
>> On Thu, Jul 16, 2020 at 6:11 AM Vladimir Churyukin wrote:
>>
>>> Hello,
>>>
>>> We have a data transformation pipeline that works with XML files of different sizes, sometimes big (up to several gigabytes).
>>> We are using BaseX to do the transformations.
>>> For smaller files we use the MAINMEM option, because the whole database can fit in memory. For some files we can't do that. All databases we create are simply disposable, and we'd like to control where we put them, and destroy them after the processing.
>>> Is there a special option or any other way to specify where a particular database's files will reside?
>>> How can we specify that when we call the CreateDB command?
>>>
>>> thank you,
>>> Vladimir
[basex-talk] Transaction Support
I am evaluating BaseX for my XML project. I need transaction support like:

- Start transaction
- Run queries (read, write, update)
- Commit or rollback transaction

When I look at the documentation, it lists a Transaction Manager. But when I look at the details, I cannot find anything like that. Can anyone explain what the support looks like, or whether there is an add-on or something planned?

Best regards
Reto, Frauenfeld, Schweiz
Re: [basex-talk] Database file path
Yes, I've seen that option.
But there is no way to set it per database, correct?
I'm asking because by nature our operations are ad-hoc, we don't really "startup" the instances, we create a database, process the data, then destroy the database.
Is there some internal limitation why this option needs to be global?

thank you,
-Vladimir

On Thu, Jul 16, 2020 at 4:36 AM Christian Grün wrote:

> Hi Vladimir,
>
> The DBPATH option is the one you’ll need to assign. As it’s a global option, it should be assigned at startup time [1].
>
> Best,
> Christian
>
> [1] https://docs.basex.org/wiki/Options
>
> On Thu, Jul 16, 2020 at 6:11 AM Vladimir Churyukin wrote:
>
>> Hello,
>>
>> We have a data transformation pipeline that works with XML files of different sizes, sometimes big (up to several gigabytes).
>> We are using BaseX to do the transformations.
>> For smaller files we use the MAINMEM option, because the whole database can fit in memory. For some files we can't do that. All databases we create are simply disposable, and we'd like to control where we put them, and destroy them after the processing.
>> Is there a special option or any other way to specify where a particular database's files will reside?
>> How can we specify that when we call the CreateDB command?
>>
>> thank you,
>> Vladimir
Re: [basex-talk] Database file path
Hi Vladimir,

The DBPATH option is the one you’ll need to assign. As it’s a global option, it should be assigned at startup time [1].

Best,
Christian

[1] https://docs.basex.org/wiki/Options

On Thu, Jul 16, 2020 at 6:11 AM Vladimir Churyukin wrote:

> Hello,
>
> We have a data transformation pipeline that works with XML files of different sizes, sometimes big (up to several gigabytes).
> We are using BaseX to do the transformations.
> For smaller files we use the MAINMEM option, because the whole database can fit in memory. For some files we can't do that. All databases we create are simply disposable, and we'd like to control where we put them, and destroy them after the processing.
> Is there a special option or any other way to specify where a particular database's files will reside?
> How can we specify that when we call the CreateDB command?
>
> thank you,
> Vladimir
[basex-talk] Corrupt database after update
Hi,

we have a rather strange and hard-to-track problem with corrupted databases.

Our setup is:

* Docker container with a Tomcat that hosts BaseX with some custom RESTXQ services
* BaseX 9.2.4
* Java 14
* Docker runs on a Linux VM

Workflow:

* Create database with RESTXQ service call
* Import JSON document with RESTXQ service call
  o call this multiple times

After some import calls, the import fails and the database is corrupt from this point on.

We first thought that it had something to do with the content of the document, but we found no pattern. Sometimes it works, but sometimes it does not. There is no concurrency involved. There are no other clients that read or write to the database.

We also tried to deactivate the UPDINDEX setting. It had no effect: we could reproduce the error with and without the automatic index update.

The logs in case of errors look like this:

06:58:00.683 172.18.0.2:33728 admin REQUEST [PUT] /c42-core/api/v1/restxq/user/documents/c42-index/metadata%40document
06:58:00.700 172.18.0.2:33728 admin 500 Unexpected error: Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.2.4
Java: Oracle Corporation, 14.0.1
OS: Linux, amd64
Stack Trace:
java.lang.ArrayIndexOutOfBoundsException: Index 4 out of bounds for length 1
  at org.basex.io.random.TableDiskAccess.fpre(TableDiskAccess.java:507)
  at org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:467)
  at org.basex.io.random.TableDiskAccess.read1(TableDiskAccess.java:156)
  at org.basex.data.Data.kind(Data.java:304)
  at org.basex.query.up.DataUpdates.prepare(DataUpdates.java:133)
  at org.basex.query.up.ContextModifier.prepare(ContextModifier.java:90)
  at org.basex.query.up.Updates.prepare(Updates.java:168)
  at org.basex.query.QueryContext.update(QueryContext.java:678)
  at org.basex.query.QueryContext.iter(QueryContext.java:332)
  at org.basex.http.restxq.RestXqResponse.serialize(RestXqResponse.java:73)
  at org.basex.http.web.WebResponse.create(WebResponse.java:63)
  at org.basex.http... 16.59 ms

Our service does not do much. It just calls db:replace().

declare variable $documents:IMPORT_OPTS := map {
  'chop': fn:false(),
  'stripns': fn:false(),
  'intparse': fn:true()
};

declare
  %rest:PUT("{$xml}")
  %rest:consumes("application/xml")
  %rest:produces("application/json")
  %rest:path("/user/documents/{$databaseId}/{$documentId}")
  %updating
function documents:create(
  $databaseId as xs:string,
  $documentId as xs:string,
  $xml as document-node()
) {
  if (db:exists($databaseId)) then (
    update:output(response:empty(204, ())),
    db:replace($databaseId, documents:decode($documentId), $xml, $documents:IMPORT_OPTS)
  ) else (
    update:output(response:json(errors:error('C42UDO002', map {'databaseId': $databaseId}), 404)),
    ()
  )
};

I've attached the example input document.

One addition: We could not reproduce this error running the Docker container on a Windows host.

Any feedback or hints to solve this are greatly appreciated.
Best regards
Johannes

[
  {
    "databaseid": "c42-content",
    "documentid": "doc_26424521995_de-DE",
    "metadata": [
      { "id": "document", "name": "Document ID", "values": [ "doc_26424521995_de-DE" ] },
      { "id": "media", "name": "Media Document ID", "values": [ "media_26424521995_de-DE" ] },
      { "id": "projectId", "name": "Project ID", "values": [ "26424521995" ] },
      { "id": "lang", "name": "Language", "values": [ "de-DE" ] },
      { "id": "sysTitle", "name": "System Title", "values": [ "Kompaktleistungsschalter 3VA mit IEC-Zertifikat" ] },
      { "id": "type", "name": "Type", "values": [ "Gerätehandbuch" ] },
      { "id": "system", "name": "System", "values": [ "SENTRON" ] },
      { "id": "productGroup", "name": "Product Group", "values": [ "Schutzgeräte" ] },
      { "id": "importDate", "name": "Import Date", "values": [ "2020-07-16T06:10:50.405Z" ] }
    ]
  },
  {
    "databaseid": "c42-content",
    "documentid": "doc_26424521995_en-US",
    "metadata": [
      { "id": "document", "name": "Document ID", "values": [ "doc_26424521995_en-US" ] },
      { "id": "media", "name": "Media Document ID", "values": [ "media_26424521995_en-US" ] },
      { "id": "projectId", "name": "Project ID", "values": [ "26424521995" ] },
      { "id": "lang", "name": "Language", "values": [ "en-US" ] },
      { "id":